work_2bo75polmfcszi3xk4pnpihvu4 ---- Thomas Gray, Samuel Taylor Coleridge and geographical information systems: A literary GIS of two Lake District tours

I. Gregory and D. Cooper. International Journal of Humanities and Arts Computing 3 (2009): 61-84. DOI: 10.3366/ijhac.2009.0009. Corpus ID: 36797812. Fields: Computer Science, History. Full text via the publisher at e-space.mmu.ac.uk.

Abstract: There have been growing calls to develop the use of Geographical Information Systems (GIS) across the humanities. For this shift to take place, two things must be demonstrated: first, that it is technically possible to create a useful GIS of textual material, the main medium through which humanities research is conducted; and, secondly, that such a database can be used to enhance our understanding of disciplines within the humanities. This paper reports on a pilot project that created a GIS of…

The record lists 11 figures and tables, the topics "Geographic information system", "Internet" and "Embedded system", and 21 citing works (6 citing background, 1 citing methods), among them:

- Mapping the English Lake District: a literary GIS. D. Cooper, I. Gregory (2011)
- Crossing Boundaries: Using GIS in Literary Studies, History and Beyond. I. Gregory, A. Baron, D. Cooper, Andrew Hardie, Patricia Murrieta-Flores, Paul Rayson (2014)
- Geographic Information Systems and Historical Research: An Appraisal. Luís Espinha da Silveira. Int. J. Humanit. Arts Comput. (2014)
- GIS and Literary History: Advancing Digital Humanities research through the Spatial Analysis of historical travel writing and topographical literature. Patricia Murrieta-Flores, Christopher Donaldson, I. Gregory. Digit. Humanit. Q. (2017)
- Text, images and statistics: Integrating data and approaches using geospatial computing. I. Gregory. 2009 5th IEEE International Conference on E-Science Workshops (2009)
- Towards the Spatial Analysis of Vague and Imaginary Places: Evolving the Spatial Humanities through Medieval Romance. Patricia Murrieta-Flores, Naomi Howell (2018)
- Critical Literary Cartography: Text, Maps and a Coleridge Notebook. D. Cooper (2012)
- Mapping Travelers' Cultural and Environmental Perceptions: Thomas Nuttall and Henry Rowe Schoolcraft in Arkansas, 1818-1819. Andrew J. Milson (2017)
- Exploring Literary Landscapes: From Texts to Spatiotemporal Analysis through Collaborative Work and GIS. Daniel Alves, A. I. Queiroz. Int. J. Humanit. Arts Comput. (2015)
- Employment of Geoinformation Technologies in Historical Researches: Experience of Kazan (Volga Region) Federal University. D. Mustafina, O. Luneva, L. K. Karimova (2015)

Reference shown in the record: Geography, Timing, and Technology: A GIS-Based Analysis of Pennsylvania's Iron Industry, 1825–1875. Anne Knowles, R. Healey. The Journal of Economic History (2006)

work_2d55kfkwzfh3dcbwvg7v5mj4wm ---- Digital Humanities 2010

The Importance of Pedagogy: Towards a Companion to Teaching Digital Humanities

Brett D. Hirsch (brett.hirsch@gmail.com), University of Western Australia
Meagan Timney (mbtimney.etcl@gmail.com), University of Victoria

The need to "encourage digital scholarship" was one of eight key recommendations in Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences (Unsworth et al.). As the report suggested, "if more than a few are to pioneer new digital pathways, more formal venues and opportunities for training and encouragement are needed" (34). In other words, human infrastructure is as crucial as cyberinfrastructure for the future of scholarship in the humanities and social sciences. While the Commission's recommendation pertains to the training of faculty and early career researchers, we argue that the need extends to graduate and undergraduate students. Despite the importance of pedagogy to the development and long-term sustainability of digital humanities, as yet very little critical literature has been published. Both the Companion to Digital Humanities (2004) and the Companion to Digital Literary Studies (2007), seminal reference works in their own right, focus primarily on the theories, principles, and research practices associated with digital humanities, and not on pedagogical issues. There is much work to be done. This poster presentation will begin by contextualizing the need for a critical discussion of pedagogical issues associated with digital humanities. This discussion will be framed by a brief survey of existing undergraduate and graduate programs and courses in digital humanities (or with a digital humanities component), drawing on the "institutional models" outlined by McCarty and Kirschenbaum (2003).
The growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn the sorts of "transferable skills" and "applied computing" that digital humanities offers (Jessop 2005), and the desire of practitioners to consolidate and validate their research and methods. We propose a volume, Teaching Digital Humanities: Principles, Practices, and Politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. We plan to structure the volume according to the four critical questions educators should consider, as emphasized recently by Mary Breunig, namely:

- What knowledge is of most worth?
- By what means shall we determine what we teach?
- In what ways shall we teach it?
- Toward what purpose?

In addition to these questions, we are mindful of Henry A. Giroux's argument that "to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak" (45). Consequently, we will encourage submissions to the volume that address these wider concerns.

References

Breunig, Mary (2006). 'Radical Pedagogy as Praxis'. Radical Pedagogy. http://radicalpedagogy.icaap.org/content/issue8_1/breunig.html

Giroux, Henry A. (1994). 'Rethinking the Boundaries of Educational Discourse: Modernism, Postmodernism, and Feminism'. Margins in the Classroom: Teaching Literature. Myrsiades, Kostas, Myrsiades, Linda S. (eds.). Minneapolis: University of Minnesota Press, pp. 1-51.

Schreibman, Susan, Siemens, Ray, Unsworth, John (eds.) (2004). A Companion to Digital Humanities. Malden: Blackwell.

Jessop, Martyn (2005). 'Teaching, Learning and Research in Final Year Humanities Computing Student Projects'. Literary and Linguistic Computing. 20.3 (2005): 295-311.

McCarty, Willard, Kirschenbaum, Matthew (2003). 'Institutional Models for Humanities Computing'. Literary and Linguistic Computing. 18.4 (2003): 465-89.

Unsworth et al. (2006). Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences. New York: American Council of Learned Societies.

work_27yesv3dzbd3fkwljmlxvksexq ---- Atti del IX Convegno Annuale dell'Associazione per l'Informatica Umanistica e la Cultura Digitale (AIUCD)

LA SVOLTA INEVITABILE: SFIDE E PROSPETTIVE PER L'INFORMATICA UMANISTICA
(The Inevitable Turn: Challenges and Perspectives for Humanities Computing)

15-17 January 2020, Milan, Università Cattolica del Sacro Cuore

Edited by: Cristina Marras, Marco Passarotti, Greta Franzini, Eleonora Litta

ISBN: 978-88-942535-4-2

Copyright © 2020 Associazione per l'Informatica Umanistica e la Cultura Digitale. Copyright of each individual chapter is maintained by the authors. This work is licensed under a Creative Commons Attribution Share-Alike 4.0 International license (CC-BY-SA 4.0). This license allows you to share, copy, distribute and transmit the text; to adapt the text and to make commercial use of the text, providing attribution is made to the authors (but not in any way that suggests that they endorse you or your use of the work).
Attribution should include the following information: Cristina Marras, Marco Passarotti, Greta Franzini, Eleonora Litta (eds.), Atti del IX Convegno Annuale AIUCD. La svolta inevitabile: sfide e prospettive per l'Informatica Umanistica. Available online as a supplement of Umanistica Digitale: https://umanisticadigitale.unibo.it

All links were visited on 29th December 2019, unless otherwise indicated. Every effort has been made to identify and contact copyright holders, and any omission or error will be corrected if notified to the editors.

Preface

The ninth edition of the annual conference of the Associazione per l'Informatica Umanistica e la Cultura Digitale (AIUCD 2020; Milan, 15-17 January 2020) has as its theme "La svolta inevitabile: sfide e prospettive per l'Informatica Umanistica" (The inevitable turn: challenges and perspectives for Humanities Computing), with the specific aim of providing an occasion to reflect on the consequences of the growing spread of the computational approach to the treatment of data connected to the humanities. This volume collects the papers whose contents were presented at the conference. In different ways, they address the proposed theme from a point of view that is at times more theoretical and methodological, at times more empirical and practical, presenting the results of works and projects (completed or in progress) that regard the computational treatment of data as central. The inevitable turn at issue here is therefore to be understood first of all as methodological and, more specifically, computational. Contemporary humanities research witnesses this turn with varying degrees of acceptance, criticism, or even rejection. The computability of empirical data (also) in the humanities is, in fact, the distinctive trait and the true added value that the technological innovations of recent decades have brought to this field. Although over the years the field of the so-called Digital Humanities has chosen to characterize itself, starting with its very name, by insisting more on the digital aspect than on the computational one, the time now seems ripe for the term Computational Humanities, or the too hastily shelved Humanities Computing, to (re)take the place still occupied today by Digital Humanities.1 Digital is, in fact, the format of the data we currently deal with for the most part in our field; but computational is the use that is made of these data, and it is a fact that a large part of the work produced in the area of the Digital Humanities consists in "doing computations" on data.2 Like so many of its predecessors, the digital format too will pass; the method, and the turn it entails, will remain, because it is solidly anchored to the empirical evidence of the data, which is the starting point and therefore the centre of analysis of much humanities research. For this reason, the computational turn in the humanities is first of all methodological: what changes radically is not so much the format of the data as the way we approach them and the use we make of them. One cannot deny a certain reactionary scepticism that part of the world of humanities research, sometimes explicitly, sometimes tacitly, harbours towards the methods and tools that the computational turn has made available to us researchers, who live in the present moment of the history of science.

1 A good synthesis of the question of the field's name, with a solid supporting bibliography, can be found in an article by Leah Henrikson published in 3:AM Magazine (24 October 2019), available at https://www.3ammagazine.com/3am/humanities-computing-digital-humanities-and-computational-humanities-whats-in-a-name/
2 Da, Nan Z. "The computational case against computational literary studies." Critical Inquiry 45.3 (2019): 601-639.
Over the years, this scepticism has fed an unreasonable distinction, and a consequent separation, between "traditional" humanists and "digital" humanists, as if two areas had to be marked out so that the one group would not bother the other too much with its research, overlooking the fact that both deal with the same objects and share the same goal: the production of new knowledge. Such a separation is due to errors attributable to both sides. On the one hand, certain "digital" humanists tend to produce research that risks sliding into superficiality, assuming that the high quantity of the data treated can compensate for their possibly low quality, and thus forgetting that humanities research very rarely works on Big Data and cannot (indeed, does not want to) content itself with percentage trends founded on imprecise data. On the other hand, the "traditionals" are often afflicted by a protectionist conservatism that is incompatible with the very nature of research work, which is in itself progressive and in constant evolution. The result is a broken dialogue between the two sides: the "digitals" are regarded as technicians (in a reductive sense) who brutalize delicate humanities data, while the "traditionals" are written off as shrivelled dinosaurs with nothing new left to say. But the computational turn is neither "digital" nor "traditional". It is, simply, inevitable. Those who make bad use of it, as parts of the "digital" world do, fail to appreciate the force of its scope; those who reject it a priori place themselves outside reality and, by deliberately ignoring the new, wound the very reason for doing research.

The fact remains that the turn is inevitable: it is hard to see why the desk of a humanist in 2020 should not hold, at one and the same time, a printed critical edition and the results of an automatic morphological analyser projected on a computer screen. Both are tools that treat, in different ways, the common object of interest of so much research, namely data. This 2020 edition of the AIUCD conference, however, wishes to address and take on a turn that is not only methodological, aspiring indeed to put into effect a small but substantial organizational turn as well. For the first time, the call for papers of a conference of the Association requested the submission not of abstracts but of full papers, with a maximum length of 4 pages (bibliography excluded). In agreement with the Executive Board of the Association, we decided to move in this direction for two main reasons. First, we believe that, having reached its ninth edition, the annual AIUCD conference is by now mature enough to move to a phase whose objective is to accept into the conference programme proposals which, in the full-paper format, allow the reviewers a complete and more accurate evaluation. This is also connected to the second reason.
Our field, as is well known, is fast-moving: data (and the results based on them) tend to change within a short time. Receiving full papers has allowed us to put the contents of this volume into the hands of the participants (and, more generally, of the whole community) on the first day of the conference, thus providing a realistic snapshot of the state of the art as of January 2020. All the papers selected for presentation at the conference have a place in this volume. This too is a turn: differently from the practice adopted so far, the published papers are no longer the result of a selection made after the conference, but are all those actually appearing in the AIUCD 2020 programme. In this sense, a certain exclusivity promoted at the level of scientific selection becomes inclusivity in terms of publication, and hence of the visibility of the works presented. Every proposal was evaluated by three reviewers; a fourth evaluation was needed only in the case of two proposals on which the three reviewers had expressed opinions that made it difficult to reach a decision on their acceptance or rejection. Concerning differences among reviewers, we observed rather frequent and in some cases sharp divergences between those who come from the area of computational linguistics and those who, in various capacities, are tied to the different sectors of "digital humanities". While computational linguists are traditionally used to evaluating full papers and tend to require that their contents describe motivations, methods, and (preferably replicable) results of ongoing or completed research, reviewers from the digital humanities area are willing to evaluate positively even ideas and proposals that have not yet been embodied in a real application to data. The observation of this diversity is the result of the deliberately inter- and trans-disciplinary composition of the board of reviewers, reflecting the transversal nature of AIUCD and, in turn, of its annual conference. In taking decisions on the proposals, we sought a balance between the attitudes of the two sides, helped by having at our disposal a full level of detail about the work described. The request for full papers had a rather limited impact on the number of proposals submitted, which were 71, of which 67 underwent the review process, while 4 were excluded because they did not meet the criteria required by the call for papers (among them anonymity and originality). At the previous edition of the AIUCD conference (Udine, 23-25 January 2019), 82 proposals had been submitted, of which 75 were reviewed. More substantial consequences were instead seen in the percentage of accepted and rejected proposals. Of the 67 proposals evaluated, 45 were accepted to appear in the conference programme and hence in this volume, while 22 were rejected, resulting in an acceptance rate of 67.16%. At the Udine edition, the rate had stood at around 84%. The contraction in the number of accepted proposals is closely connected to the request for full papers instead of abstracts. The conference programme included two poster sessions. Of the 45 accepted contributions, 21 were judged suitable for presentation as posters.
Against the customs of the field, which tends to relegate the less interesting or more problematic proposals to poster sessions, we decided to assign the poster format not according to quality but rather according to the type of proposal. Thus, as a rule, proposals presenting work that has led to practical results (such as tools, resources, or interfaces) were judged more suitable for presentation as posters, while theoretical, disciplinary, or methodological discussions occupied the oral sessions. The fact remains that there is no difference whatsoever in terms of qualitative selection between a paper whose contents were presented at the conference orally and one presented as a poster, as shown by the fact that the same number of pages was reserved for all the papers in this volume. The contents of the texts collected here in alphabetical order testify to the variety of themes usually treated at AIUCD conferences. They range from general reflections on the research field to the creation of new language resources and data-analysis tools, from works of digital philology and digital publishing to themes connected with the digitization of sources in libraries. Besides the presentation of the contents of the papers in this volume, the conference programme included three invited talks (one for each of its three days), given respectively by Roberto Navigli (Sapienza, Università di Roma), Julianne Nyhan (University College London) and Steven Jones (University of South Florida). Roberto Navigli's contribution, entitled Every time I hire a linguist my performance goes up (or: the quest for multilingual lexical knowledge in a deep (learning) world), is an example of research that speaks to the inescapability of the link, and hopefully of the collaboration, between the scientific and the humanities worlds and, specifically, between the community that recognizes itself in AIUCD and that of computational linguistics. The talks by Julianne Nyhan (Where does the history of the Digital Humanities fit in the longer history of the Humanities? Reflections on the historiography of the 'old' in the work of Fr Roberto Busa S.J.) and Steven Jones (Digging into CAAL: Father Roberto Busa's Center and the Prehistory of the Digital Humanities) are situated within the history of the discipline, reporting in particular on their studies of the activities of Father Roberto Busa. Busa's figure is closely tied to the Università Cattolica del Sacro Cuore in Milan, where, from the end of the 1970s, the Jesuit taught a course in Computational and Mathematical Linguistics and founded a research group that, in 2009, was transformed into a research centre: the CIRCSE, which organized with AIUCD the annual conference of the association whose proceedings this volume collects. In 2010, a year before leaving us, Father Busa chose to donate his personal archive to the Library of the Università Cattolica. A very rich documentation of Busa's work and its diffusion, as well as of his personal and professional relationships (which can be reconstructed through his vast correspondence), the Busa Archive is currently being catalogued and digitized by the University Library.
A selection of material from the Archive was made directly accessible to the participants of the Milan edition of the AIUCD conference in a small exhibition set up in the atrium of the conference hall. The display cases of the exhibition gather worksheets, letters, punched cards, tapes, and newspaper articles dealing with Father Busa's work: a form of thanks that the Università Cattolica, the CIRCSE, and the whole scientific community wish to reserve for one of the pioneers of automatic linguistic analysis. Our thanks go first of all to the President of AIUCD, Francesca Tomasi, and to Fabio Ciotti, who preceded her in that role, for having chosen Milan as the venue of the 2020 edition of the conference. From them came the first, fundamental support for the "organizational turn" we wished to bring about. We likewise thank the Executive Board of the Association, the Programme Committee, and all the reviewers, who worked hard to put us in a position to define the best possible programme. The Milan campus of the Università Cattolica del Sacro Cuore supported us administratively and logistically; we particularly wish to thank the Continuing Education Office, and specifically Elisa Ballerini; the University Library, and specifically Paolo Senna, who made the materials of the Busa Archive available to us; and the Events Office and the Campus Management, which provided the spaces for the conference. Thanks above all to those who submitted proposals, to the speakers, and to all the participants, because they are the essential protagonists of the event. Our hope is that the work done will be useful even before being appreciated, and that its results will be maintained in the editions to come, with the aim of always improving, looking ahead; because knowing how to see the turns and to face them is the very reason for research.

Cristina Marras, Marco Passarotti, Greta Franzini, Eleonora Litta

Chairs and Committees

General Chair
• Cristina Marras

Chair of the Scientific and Programme Committee
• Marco Passarotti

Scientific and Programme Committee
• Maristella Agosti
• Stefano Allegrezza
• Federica Bressan
• Cristiano Chesi
• Fabio Ciracì
• Greta Franzini
• Angelo Mario Del Grosso
• Eleonora Litta
• Pietro Maria Liuzzo
• Federico Meschini
• Johanna Monti
• Federico Nanni
• Marianna Nicolosi
• Dario Rodighiero
• Marco Rospocher
• Chiara Zuanni

Organizing Committee
• Greta Franzini
• Eleonora Litta

Table of Contents

- EcoDigit-Ecosistema Digitale per la fruizione e la valorizzazione dei beni e delle attività culturali del Lazio (Luigi Asprino, Antonio Budano, Marco Canciani, Luisa Carbone, Miguel Ceriani, Ludovica Marinucci, Massimo Mecella, Federico Meschini, Marialuisa Mongelli, Andrea Giovanni Nuzzolese, Valentina Presutti, Marco Puccini, Mauro Saccone)
- Encoding the Critical Apparatus by Domain Specific Languages: The Case of the Hebrew Book of Qohelet (Luigi Bambaci, Federico Boschetti)
- 600 maestri raccontano la loro vita professionale in video: un progetto di (fully searchable) open data (Gianfranco Bandini, Andrea Mangiatordi)
- Ripensare i dati come risorse digitali: un processo difficile? (Nicola Barbuti)
- Verso il riconoscimento delle Digital Humanities come Area Scientifica: il catalogo online condiviso delle pubblicazioni dell'AIUCD (Nicola Barbuti, Maurizio Lana, Vittore Casarosa)
- Il trattamento automatico del linguaggio applicato all'italiano volgare. La redazione di un formario tratto dalle prime dieci Lettere di Alessandra M. Strozzi (Ottavia Bersano, Nadezda Okinina)
- Annotazione semantica e visualizzazione di un corpus di corrispondenze di guerra (Beatrice Dal Bo, Francesca Frontini, Giancarlo Luxardo)
- The Use of Parallel Corpora for a Contrastive (Russian-Italian) Description of Resource Markers: New Instruments Compared to Traditional Lexicography (Anna Bonola, Valentina Noseda)
- PhiloEditor: Simplified HTML Markup for Interpretative Pathways over Literary Collections (Claudia Bonsi, Angelo Di Iorio, Paola Italia, Francesca Tomasi, Fabio Vitali, Ersilia Russo)
- An Empirical Study of Versioning in Digital Scholarly Editions (Martina Bürgermeister)
- ELA: fasi del progetto, bilanci e prospettive (Emmanuela Carbé, Nicola Giannelli)
- Digitized and Digitalized Humanities: Words and Identity (Claire Clivaz)
- La geolinguistica digitale e le sfide lessicografiche nell'era delle digital humanities: l'esempio di VerbaAlpina (Beatrice Colcuc)
- Una proposta di ontologia basata su RDA per il patrimonio culturale di Vincenzo Bellini (Salvatore Cristofaro, Daria Spampinato)
- Biblioteche di conservazione e libera fruizione dei manoscritti digitalizzati: la Veneranda Biblioteca Ambrosiana e la svolta inevitabile grazie a IIIF (Fabio Cusimano)
- Repertori terminologici plurilingui fra normatività e uso nella comunicazione digitale istituzionale e professionale (Klara Dankova, Silvia Calvi)
- The Digital Lexicon Translaticium Latinum: Theoretical and Methodological Issues (Chiara Fedriani, Irene De Felice, William Michael Short)
- Selling Autograph Manuscripts in 19th c. Paris: Digitising the Revue des Autographes (Simon Gabay, Lucie Rondeau du Noyer, Mohamed Khemakhem)
- Enriching a Multilingual Terminology Exploiting Parallel Texts: an Experiment on the Italian Translation of the Babylonian Talmud (Angelo Mario Del Grosso, Emiliano Giovannetti, Simone Marchi)
- Towards a Lexical Standard for the Representation of Etymological Data (Fahad Khan, Jack Bowers)
- Workflows, Digital Data Management and Curation in the RETOPEA Project (Ilenia Eleonor Laudito)
- Il confronto con Wikipedia come occasione di valorizzazione professionale: il case study di Biblioteca digitale BEIC (Lisa Longhi)
- Making a Digital Edition: The Petrarchive Project (Isabella Magni)
- Extending the DSE: LOD Support and TEI/IIIF Integration in EVT (Paolo Monella, Roberto Rosselli Del Turco)
- Mapping as a Contemporary Instrument for Orientation in Conferences (Chloe Ye-Eun Moon, Dario Rodighiero)
- Argumentation Mapping for the History of Philosophical and Scientific Ideas: The TheSu Annotation Scheme and its Application to Plutarch's Aquane an ignis (Daniele Morrone)
- Leitwort Detection, Quantification and Discernment (Racheli Moskowitz, Moriyah Schick, Joshua Waxman)
- From Copies to an Original: The Contribution of Statistical Methods (Amanda Murphy, Raffaella Zardoni, Felicita Mornata)
- FORMAL. Mapping Fountains over Time and Place. Mappare il movimento delle fontane monumentali nel tempo e nello spazio attraverso la geovisualizzazione (Pamela Palomba, Emanuele Garzia, Roberto Montanari)
- Paul is Dead? Differences and Similarities before and after Paul McCartney's Supposed Death. Stylometric Analysis of Transcribed Interviews (Antonio Pascucci, Raffaele Manna, Vincenzo Masucci, Johanna Monti)
- Digital Projects for Music Research and Education from the Center for Music Research and Documentation (CIDoM), Associated Unit of the Spanish National Research Council (Juan José Pastor Comín, Francisco Manuel López Gómez)
- Prospects for Computational Hermeneutics (Michael Piotrowski, Markus Neuwirth)
- EModSar: A Corpus of Early Modern Sardinian Texts (Nicoletta Puddu, Luigi Talamo)
- Shared Emotions in Reading Pirandello. An Experiment with Sentiment Analysis (Simone Rebora)
- DH as an Ideal Educational Environment: The Ethnographic Museum of La Spezia (Letizia Ricci, Francesco Melighetti, Federico Boschetti, Angelo Mario Del Grosso, Enrica Salvatori)
- A Digital Review of Critical Editions: A Case Study on Sophocles, Ajax 1-332 (Camilla Rossini)
- Strategie e metodi per il recupero di dizionari storici (Eva Sassolini, Marco Biffi)
- Encoding Byzantine Seals: SigiDoc (Alessio Sopracasa, Martina Filosa)
- Preliminary Results on Mapping Digital Humanities Research (Gianmarco Spinaci, Giovanni Colavizza, Silvio Peroni)
- Epistolario De Gasperi: National Edition of De Gasperi's Letters in Digital Format (Sara Tonelli, Rachele Sprugnoli, Giovanni Moretti, Stefano Malfatti, Marco Odorizzi)
- Visualizing Romanesco; or, Old Data, New Insights (Gianluca Valenti)
- What is a Last Letter? A Linguistics/Preventive Analysis of Prisoner Letters from the Two World Wars (Giovanni Pietro Vitali)
- L'organizzazione e la descrizione di un fondo nativo digitale: PAD e l'Archivio Franco Buffoni (Paul Gabriele Weston, Primo Baldini, Laura Pusterla)

Digitized and Digitalized Humanities: Words and Identity

Claire Clivaz
Swiss Institute of Bioinformatics
claire.clivaz@sib.swiss

Abstract

English. This paper analyses two closely related but different concepts, digitization and digitalization, first discussed in an encyclopedia article by Brennen and Kreiss in 2016. Digital Humanities mainly uses the first term, whereas business and economics tend to use the second to praise the process of the digitalization of society. But digitalization was coined as a critical concept in 1971 by Wachal and is sometimes used in post-colonial studies. Consequently, humanist scholars are invited to avoid the "path of least resistance" when using digitalization, and to explore its critical potential. The paper concludes by considering the effect of the digitalization perspective and by expressing the author's point of view on the issue.

Italiano. Questo articolo analizza due concetti correlati ma differenti fra loro, "digitization" e "digitalization", discussi la prima volta in una voce di enciclopedia da Brennen e Kreiss nel 2016. Nelle scienze umane digitali si utilizza sostanzialmente il primo termine, mentre in economia si tende a utilizzare il secondo per sottolineare il processo di digitalizzazione della società. Ma il termine "digitalization" era stato creato nel 1971 da Wachal come un concetto critico, ed era stato utilizzato in alcuni studi sul post-colonialismo. Di conseguenza, gli studiosi nelle scienze umane sono invitati a evitare di utilizzare "digitalization" in modo triviale, e ad esplorare il suo potenziale critico. L'articolo termina con alcune considerazioni sugli effetti della prospettiva della digitalizzazione, presentando il punto di vista dell'autore.
1 Introduction: Words and Identity in Digital Humanities

As the 2020 AIUCD conference topic underlines, the identity and definition of the Humanities that have met the computing world are in constant reshaping (Ciotti, 2019).1 The English language has acknowledged the important turn from humanities computing to digital humanities at the beginning of the 21st century (Kirschenbaum, 2010), whereas French-speaking scholarship is wrestling between humanités numériques (Berra, 2012; Doueihi, 2014) and humanités digitales (Le Deuff, 2016; Cormerais–Gilbert, 2016; Clivaz, 2019). Moreover, new words are often tested to express the intensity of what is at stake: if Jones has chosen the term "eversion" to describe the present state of the digital turn (Jones, 2016), the French thinker Bernard Stiegler focuses on "disruption" (Stiegler, 2016). German and Hebrew link the naming of digital humanities with the vocabulary of spirit/mind, whereas the outmoded word humanités has come back in French through the naming of the humanités numériques, recalling the presence of the body (Clivaz, 2017). Inscribed in this linguistic effervescence, one phenomenon has so far not drawn the attention of humanist scholarship: the difference between digitization and digitalization, or between digitized and digitalized Humanities. The present paper explores, as far as possible, the emergence of this dualistic vocabulary, inside and outside digital humanities scholarship, looking for its meanings and implications. It represents only a first overview of the scarce definitions and occasional uses of "digitalization", even if the debate between digitization and digitalization can sometimes inform the discourse implicitly, as we will see in Section 4 (Smithies, 2017). Section 2 first comments on the similarities and differences between the two words, looking for definitions and uses of "digitalization". Section 3 discusses in detail the only definitional article we have so far that debates these two concepts. Section 4 considers the digitalization perspective more broadly and presents the author's point of view on the issue.

2 Looking for "digitalization" definitions and uses

English native speakers would surely first ask whether there really is a difference between "digitization" and "digitalization". "Digitalization" does not benefit from its own entry in Wikipedia or in the Collins Dictionary online.2 However, the Oxford English Dictionary (OED) dates the first use of digitalization as equivalent to digitization to 1959,3 whereas the medical sense appeared in 1876.4 The OED also presents digitalization as meaning "the adoption or increase in use of digital or computer technology by an organization, industry, country, etc."5 In the Wikipedia entry "digital transformation", a similar definition is given for "digitalization": "unlike digitization, digitalization is the 'organizational process' or 'business process' of the technologically-induced change within industries, organizations, markets and branches."6 A most decisive shift in the sense of a difference between the two words can be seen in the International Encyclopedia of Communication Theory and Philosophy, which published an entry on "Digitalization" by J. Scott Brennen and Daniel Kreiss in 2016.

1 Many thanks are due to the reviewers for their remarks, to Andrea Stevens for her English proof-reading, and to Elena Giglia for her translation of the Italian abstract.
2 Entry "digitization" in Wikipedia: https://en.wikipedia.org/wiki/Digitization; entry "digitalize" in the Collins Dictionary online: https://www.collinsdictionary.com/dictionary/english/digitalize. All hyperlinks have been last checked on 30/11/19.
3 Entry "digitalization n.2", OED, https://www.oed.com/view/Entry/242061
4 Entry "digitalization n.1", OED, https://www.oed.com/view/Entry/52616: "the administration of digitalis or any of its constituent cardiac glycosides to a person or animal, esp. in such a way as to achieve and maintain optimum blood levels of the drug. Also: the physiological condition resulting from this".
5 Entry "digitalization n.2" in the Oxford English Dictionary online: https://www.oed.com/view/Entry/24206
6 Entry "digital transformation" in Wikipedia: https://en.wikipedia.org/wiki/Digital_transformation#Digitization_(of_information)
Brennen and Kreiss argue in favour of a distinction from "digitization" (Brennen–Kreiss, 2016). This publication is in itself a quite clear signal, according to our cultural and scholarly habits, that "digitalization" exists with its own meanings, since it has been defined in an encyclopedia. As far as I have been able to determine, it is the only article that tries to define both concepts, and it is discussed in detail in Section 3. As we will see, references to a definition of digitalization are quite scarce. So far, it has not even been possible to give a systematic overview of the concept's theoretical background based on the scholarly literature, because it is simply not discussed there, with the exception of the Brennen–Kreiss article. But if we look at its uses, some aspects clearly emerge. "Digitalization" is mainly used in the business and economic world, and very infrequently in digital humanities. For example, according to Jari Collin in a 2015 Finnish volume of collected essays, digitalization refers to the understanding of "the dualistic role of IT in order to make right strategic decisions on IT priorities and on the budget for the coming years. IT should not be seen only as a cost center function anymore!" (Collin, 2015, 30). Digitalization seems to be "one of the major trends changing society and business. Digitalization causes changes for companies due to the adoption of digital technologies in the organization or in the operation environment" (Parvianien et al., 2017, 63). According to Mäenpää and Korhonen, "from the retail business point of view, the 'digitalization of the consumer' is of essence. People are increasingly able to use digital services and are even beginning to expect them. To a certain extent, this is a generational issue. The younger generations, such as Millennials, are growing up with digitalization and are eagerly in the forefront of adopting new technology and its affordances" (Mäenpää–Korhonen, 2015, 90). In 2018, Toni Ryynäen and Torsti Hyyryläinen, members of the Helsinki Institute of Sustainability Science at the Faculty of Agriculture and Forestry, published an article seeking to fill the gap between the digitalization process and digital humanities by focusing on the concern for "new forms of e-commerce, changing consumer roles and the digital virtual consumption" (Ryynäen – Hyyryläinen, 2018, 1). In this process, the role of digital humanities is described in a way that is quite hard to recognize for DHers, at least for those not involved in digital social sciences: "A challenge for digital humanities research is how to outline the most interesting phenomena from the endless pool of consumption activities and practices. Another challenge is how to define a combination of accessible datasets needed for solving the chosen research tasks" (Ryynäen – Hyyryläinen, 2018, 1). In light of such clear descriptions of what "digitalization" means for business and economy, digital humanities scholarship demonstrates a deafening silence about this notion. The 2004 and 2016 editions of the reference work Companion to Digital Humanities do not mention the word. In the established series Debates in the Digital Humanities, one finds a single occurrence across the five volumes, under the pen of Domenico Fiormonte (2016).
As a third example, the collected essays Text and Genre in Reconstruction: Effects of Digitalization on Ideas, Behaviours, Products and Institutions, edited by Willard McCarty (2010), can only surprise the reader: "digitalization" stands in the title, but the word is then totally absent from the volume. When questioned about this discrepancy, McCarty answered that the publisher had requested to have this word in the title. This request has led to a damaging side effect in terms of Google searches: if one searches for "digitalization" and "digital humanities", one gets several book titles that contain no mention of this word other than a reference to Text and Genre's title. It is also the case in my 2019 book Ecritures digitales. Digital writing, digital Scriptures: the unique occurrence of "digitalization" occurs in my reference to McCarty's collected essays (Clivaz, 2019).
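Occurrence claims of this kind (a single hit across the five Debates volumes, none at all in McCarty's collection) can be checked mechanically once the volumes exist as plain text. The sketch below is purely illustrative and does not come from any of the works discussed; the file names are hypothetical placeholders, and the patterns fold British and American spellings of each term together.

```python
import re
from pathlib import Path

# Hypothetical plain-text copies of the volumes to be searched.
CORPUS = [Path("companion_to_dh_2004.txt"), Path("debates_in_dh_vol2.txt")]

# Whole-word, case-insensitive patterns; [sz] covers both the British
# and the American spelling of each term.
PATTERNS = {
    "digitization": re.compile(r"\bdigiti[sz]ation\b", re.IGNORECASE),
    "digitalization": re.compile(r"\bdigitali[sz]ation\b", re.IGNORECASE),
}

for path in CORPUS:
    text = path.read_text(encoding="utf-8", errors="ignore")
    counts = {term: len(pattern.findall(text)) for term, pattern in PATTERNS.items()}
    print(path.name, counts)
```

Exact whole-word matching is deliberate here: the point of the exercise is precisely that the two terms never collapse into one another.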
Usually in digital humanities scholarship, one speaks about “Humanities digitized” (Shaw, 2012)7, and the mutation to the digital sphere is seen as a pre-step before the processes of interpretation.8 Uses of digitalization and cognate terms remain rare, like Domenico Fiormonte, who is also a non-native English speaker and the only one to use digitalization in the series Debates in Digital Humanities: “In the last ten years, the extended colonization, both material and symbolical, of digital technologies has completely overwhelmed the research and educational world. Digitalization has become not only a vogue or an imperative, but a normality. In this sort of ‘gold rush’, the digital humanities perhaps have been losing their original openness and revolutionary potential” (Fiormonte, 2016, n.p.). Fiormonte compares digitalization to a colonization process: if there is some consciousness of the digitalization vocabulary in humanities, it can be indeed found in research about cultural diversity and colonialism, such as in a 2007 article by Maja van der Velden, “Invisibility and the Ethics of the Digitalization: Designing so as not to Hurt Others.” Van der Velden studies “the designs of Indymedia, an Internet-based alternative media network, and TAMI, an Aboriginal database, [...] informed by the confrontations over different ways of knowing” (2007, 81). She points to the fact that, “if we understand knowledge not as a commodity but as a process of knowing, something produced socially, we must ask about the nature of digitalization itself. As the Aboriginal elders say, ‘Things are not real without their story’” (2007, 82). She documents in this way two examples of non- Western digital projects, in which the diversity of the source codes and standards has led to recurrent negotiations: “the confrontations over issues of privacy and control resulted in different ways of organizing access and information management” (2007, 89). Van der Velden’s article allows one to understand, from a humanist point of view, what is at stake in the concept of digitalization, a perspective that the next section develops. But it should be underlined that, even in this article pointing to cultural and digital control issues, digitalization is not discussed as such. The apparent lack of awareness about this binomial vocabulary and its implication for DH scholarly literature appears to be a real blind spot that section 4 considers. 3 Claiming a Critical Use of Digitalization in Humanities In their overview article, Brennen and Kreiss give a general definition of “digitalization” similar to the one presented in Section 2: “We [...] define digitization as the material process of converting analog streams of information into digital bits. In contrast, we refer to digitalization as the way many domains of social life are restructured around digital communication and media infrastructures” (Brennen–Kreiss, 2016, 1). They usefully remind us that “digitization is a process that has both symbolic and material dimensions” (2016, 2), and that “analog and digital media, [...] all forms of mediation necessarily interpret the world” (2016, 3). The authors also consider that “the first contemporary use of the term ‘digitalization’ in conjunction with computerization appeared in a 1971 essay first published in the North American Review. In it, Robert Wachal discusses the social implications of the ‘digitalization of society’ in the context of considering objections to, and the potential for, computer-assisted humanities research. 
From this beginning, writing about digitalization has grown into a massive literature” (2016, 5). The reference to Wachal’s article is a very interesting one, and it deserves more attention than the co-authors devote to it. Moreover, they omit any reference to Maja van der Velden’s article or to similar approaches in Brennen and Kreiss’s article. The “winners” of their digitalization 7 One can also see uses of digitalization in the humanities in archaeology, notably in conjunction with 3D discussion (Ercek –Viviers –Warzée, 2009). 8 See Earhart – Taylor (2016): “Our White Violence, Black Resistance project merges foundational digital humanities approaches with issues of social justice by engaging students and the community in digitizing and interpreting historical moments of racial conflicts.” 69 definition are scholars from the vein of Manuel Castells, who argues that “technology is society, and society cannot be understood or represented without its technological tools” (Brennen–Kreiss, 2016, 5). To get a deeper understanding of the critical potential of digitalization, it is worth reading Wachal’s 1971 article. He uses digitalization in just one sentence: “The humanist’s fears are not entirely without foundation, and in any case, as a humane man he naturally fears the digitalization of the society. He doesn’t like to be computed. He doesn’t want to be randomly fingered by a credit card company computer” (1971, 30). The entire article is an ironic confrontation between the habits of a humanist scholar and what a programmer and a computer could do for humanities. As a computer programmer teacher himself, Wachal remembers the term coined by Theodor Nelson, “cybercrud”: “putting things over on people [by] saying using computers. When you consider that this includes everything from intimidation (‘Because we are using automatic computers, it is necessary to assign common expiration dates to all subscriptions’) to mal implementation (‘You’re going to have to shorten your name - it doesn’t fit in to the computer’), it may be that cybercrud is one of the most important activities of the computer field” (1971, 30). In other words, computer scholars have a clear awareness about their world, as Nelson and Wachal after him demonstrate. After this captatio benevolentiae, Wachal raises what is for him the main issue with the humanist point of view on computing: “Dare we hope that the day has come when humanists will begin asking some new questions?” (1971, 33), referring also to artificial intelligence (1971, 31). His “personal view”, as announced in the article title, is an open call that is still worth humanist scholars’ attention. The complex elements of the discussion of the digitization/digitized vs digitalization/digitalized divide indicates that it is surely time for DHers to pay attention to this binomial expression, so successfully deployed in business or economy that a publisher can get it in a title of collected essays that does not contain the word digitalization at all. It is time to form an understanding of digitalization that still denounces “cybercrud” when needed, or helps us to pay attention to “the confrontations over issues of privacy and control resulted in different ways of organizing access and information management” (van der Velden, 2007, 89). 
To express it in an electronic vocabulary, Brennen and Kreiss take a "path of least resistance" to the definition of digitalization, according to the phrase describing the behaviour of current in an electronic circuit (open, closed, or not working): electricity follows the "path of least resistance".9 But it is a core skill of the humanities to renounce the paths of least resistance and to wrestle with words, concepts, and realities. In that perspective, the last section will develop some tracks to further the debate.

4 The effect of the "digitalization" perspective

The binomial expression "digitization" versus "digitalization" enters the international debate through the English language. Such a distinction does not exist in French, Italian, or German, for example. But the inquiry of this article demonstrates that this concept is worth exploring in an effort to grasp explicitly what is at stake in the English language. It surely represents one further argument in favour of a multilingual approach to digital epistemology, like the one developed in Digital writing, digital Scriptures (Clivaz, 2019). I first underline how striking it is that even in the few instances where humanist scholars consciously use the term "digitalization" (van der Velden, Fiormonte), it is not discussed per se: a blind spot exists in the scholarly discussion, apart from Brennen and Kreiss's article. After all, the first use of "digitalization" in relation to the computer sphere was by a programmer (Wachal, 1971), but nowadays its use in critical discussion is mainly found under the pen of scholars outside the humanities who make claims about the "essence" of "the 'digitalization of the consumer'" (Mäenpää–Korhonen, 2015, 90; quoted in Section 2). In light of this consumerist perspective, DH scholars are generally confident in the traditional critical impact of their methodologies and knowledge. Alan Liu, for example, writes that "the digital humanities serve as a shadow play for a future form of the humanities that wishes to include what contemporary society values about the digital without losing its soul to other domains of knowledge work that have gone digital to stake their claim to that society" (2013, 410). In the same line, the HERA 2017 call hopes that the humanities, when digitized, will be able "to deepen the theoretical and empirical cultural understanding of public spaces in a European context."10 But it could secondly be argued that the blind spot of the absent discussion about digitization/digitalization demonstrates an overconfidence of the digital humanities in their capacity not to lose the soul of the humanities in digital networks.

9 See "Path of Least Resistance", Wikipedia, https://en.m.wikipedia.org/wiki/Path_of_least_resistance
10 See "HERA Public Spaces", 31.08.17, http://heranet.info/2017/08/31/hera-launches-its-fourth-joint-research-programme-public-spaces/
Other voices are indeed more sensitive to the limitations imposed on humanities research 9 See “Path of Leaf Resistance”, Wikipedia, https://en.m.wikipedia.org/wiki/Path_of_least_resistance 10 See “HERA Public Spaces”, 31.08.17, http://heranet.info/2017/08/31/hera-launches-its-fourth- joint-research-programme-public-spaces/ 70 https://en.m.wikipedia.org/wiki/Path_of_least_resistance http://heranet.info/2017/08/31/hera-launches-its-fourth-joint-research-programme-public-spaces/.70 http://heranet.info/2017/08/31/hera-launches-its-fourth-joint-research-programme-public-spaces/.70 http://heranet.info/2017/08/31/hera-launches-its-fourth-joint-research-programme-public-spaces/.70 https://en.m.wikipedia.org/wiki/Path_of_least_resistance by digital constraints, as we have seen with Maja van der Velden: even if she uses the word “digitalization” without discussing it, her article clearly points to digital control issues in the practice of building a database or a virtual research environment. From a more general and theoretical point of view, James Smithies strongly underlines in his book The Digital Humanities and the Digital Modern the same issues, even if the word digitalization is totally absent in it. He suggests that “our digital infrastructure […] has grown opaque and has extended into areas well outside scholarly or even governmental control” (2017, 11). His discourse becomes overtly political when he affirms the existence of a “point of entanglement between the humanities and neoliberalism, implicating digital humanists and their critics in equal measure” (2017, 218). We are probably reaching here the main root of the silence about the digitization/digitalization challenge in DH debates: this binomial expression points to the political dimension of the digital revolution in humanities, to its economic and institutional implications, something that we prefer to let aside, consciously or unconsciously. This fear is also described by Wachal: “The humanist’s fears are not entirely without foundation, and in any case, as a humane man he naturally fears the digitalization of the society” (1971, 30; quoted in Section 3). Listening to Wachal, and almost fifty years later to Smithies, can begin to lead us beyond the “path of leaf resistance” of Brennen and Kreiss. We should consider digitalization rather as the top of a mountain: it can be reached only through the via ferrata of the debates about cultural and multilingual diversity, about multiple source codes and standards, a multiplicity that preserves, at the end, diversity in human- computing knowledge productions. Moreover, we are probably reaching right now the start of the DH awareness of this linguistic debate. As I end this article, I have opened the debate in the list Humanist Discussion Group and Simon Tanner has signaled his interest in the point, referring to Brennen and Kreiss’ definition: “I have found the difference to be significant enough to seek to define it for my current book and in the past it has been a source of confusion or conflation that has not been helpful. I make it very clear to our students in the Masters of Digital Humanities or the MA Digital Asset and Media Management that they should not use the interchangeably” (Tanner, 2019). Third, since the binomial expression digitization/digitalization is a vehicle for its own impact and meaning within the DH epistemology, is it possible to tie these concepts to the general challenge raised by the AIUCD 2020 call for papers? 
Notably, this discussion raises the following questions: "is it still necessary to talk about (and make) a distinction between 'traditional' humanists and 'digital' humanists? Is the term 'Digital Humanities' still appropriate or should it be replaced with 'Computational Humanities' or 'Humanities Computing'? Is the computational dimension of the research projects typically presented at AIUCD conferences that methodologically distinctive?"11 At the root of these problems stands, of course, an important debate in Italian-speaking DH, present in the very name of the national DH organization, the AIUCD. This name mentions "Humanities Computing" (informatica umanistica) and "digital culture" (cultura digitale): AIUCD - Associazione per l'Informatica Umanistica e la Cultura Digitale.12 But beyond this specific Italian perspective, the importance of collaboration between DHers and other humanist scholars concerns all of us. The dialectic between Humanities Computing and Digital Humanities will in any case remain in the historical memory of DH's development. But I am personally not convinced that a "step back" in the form of a return to Humanities Computing, motivated by a desire to keep all the humanists together under the banner of the informatica umanistica, is viable. Why?

When Harvard Magazine published one of its first articles about the digital humanities, in 2012, it was entitled "Humanities Digitized" (Shaw, 2012). It has always been meaningful for me to think in that direction. As I have argued elsewhere in detail, we could "begin to speak about the digitized humanities, or simply about humanities again, instead of digital humanities. Such an evolution might occur, if one looks at the evolution of the expression 'digital computer' which was in common usage during the fifties, but it has been now replaced by the single latter word 'computer' (Williams, 1984, 310; Dennhardt, 2016). When humanities finally become almost entirely digitized, perhaps it is safe to bet that we will once again speak simply about humanities in English or about humanités in French, thus making this outmoded word again meaningful through the process of cultural digitization" (Clivaz, 2019, 85–86). According to this perspective, the debate between "humanities digitized" or "humanities digitalized", with all its cultural, economic, material, institutional and political dimensions, could signal a third step after Humanities Computing and Digital Humanities. This third step would stand at the crossroads where all humanists could meet up again, in an academic world definitively digitized, but hopefully not totally digitalized. It is up to all of us to decide if, in the third millennium, the humanities will be digitized or digitalized.

11 See "Convegno annuale dell'Associazione per l'Informatica Umanistica e la Cultura Digitale. Call for papers", https://aiucd2020.unicatt.it/aiucd-call-for-papers-1683
12 See AIUCD, www.aiucd.it

References

Aurélien Berra. 2012. Faire des Humanités Numériques. Read/Write Book 2. Pierre Mounier (ed.). Paris, OpenEdition Press, 25–43. http://books.openedition.org/oep/238
J. Scott Brennen and Daniel Kreiss. 2016. Digitalization. International Encyclopedia of Communication Theory and Philosophy 23 October: 1–11. https://onlinelibrary.wiley.com/doi/abs/10.1002/9781118766804.wbiect111
Fabio Ciotti. 2019. Oltre la galassia delle Digital Humanities: per la costituzione di una disciplina di Informatica Umanistica. AIUCD 2019. Book of Abstracts. Teaching and Research in Digital Humanities' Era. Stefano Allegrezza (ed.). Udine, AIUCD, 52–56. http://aiucd2019.uniud.it/wp-content/uploads/2019/01/BoA-2019_PROVV.pdf
Claire Clivaz. 2017. Lost in Translation? The Odyssey of "digital humanities" in French. Studia UBB Digitalia 62/1: 26–41. https://digihubb.centre.ubbcluj.ro/journal/index.php/digitalia/article/view/4
Claire Clivaz. 2019. Ecritures digitales. Digital writing, digital Scriptures. DBS 4. Leiden, Brill.
Jari Collin. 2015. Digitalization and Dualistic IT. IT Leadership in Transition. The Impact of Digitalization on Finnish Organizations. Science + Technology 7. Jari Collin, Kari Hiekkanen, Janne J. Korhonen, Marco Halén, Timo Itälä, and Mika Helenius (eds.). Helsinki, Aalto University Publication Series, 29–34.
Franc Cormerais and Jacques Gilbert. 2016. Introduction. Le texte à venir. Etudes Digitales 1: 11–16.
Robert Dennhardt. 2016. The Term Digital Computer (Stibitz 1942) and the Flip-Flop (Turner 1920). München, Grin Verlag.
Milad Doueihi. 2014. Préface: quête et enquête. Le temps des humanités digitales. Olivier Le Deuff (dir.). Limoges, FyP éditions, 7–10.
Amy E. Earhart and Toniesha L. Taylor. 2016. Pedagogies of Race: Digital Humanities in the Age of Ferguson. Debates in the Digital Humanities, volume 2. Matthew K. Gold and Lauren F. Klein (eds.). https://dhdebates.gc.cuny.edu/read/untitled/section/58ca5d2e-da4b-41cf-abd2-d8f2a68d2914-ch21
Rudy Ercek, Didier Viviers and Nadine Warzée. 2010. 3D Reconstruction and Digitalization of an Archaeological Site, Itanos, Crete. Virtual Archaeology Review 1/1: 81–85. DOI: 10.4995/var.2010.4794
Domenico Fiormonte. 2016. Toward a Cultural Critique of Digital Humanities. Debates in the Digital Humanities, volume 2. Matthew K. Gold and Lauren F. Klein (eds.). https://dhdebates.gc.cuny.edu/read/untitled/section/5cac8409-e521-4349-ab03-f341a5359a34-ch35
Steven E. Jones. 2016. The Emergence of the Digital Humanities (the Network Is Everting). Debates in the Digital Humanities, volume 2. Matthew K. Gold and Lauren F. Klein (eds.). Minneapolis, University of Minnesota Press. http://dhdebates.gc.cuny.edu/debates/text/52
Alan Liu. 2013. The Meaning of the Digital Humanities. PMLA 128/2: 409–423. https://doi.org/10.1632/pmla.2013.128.2.409
Willard McCarty (ed.). 2010. Text and Genre in Reconstruction: Effects of Digitalization on Ideas, Behaviours, Products and Institutions. Cambridge, OpenBook Publishers.
Raimo Mäenpää and Janne J. Korhonen. 2015. Digitalization in Retail: The Impact on Competition. IT Leadership in Transition. The Impact of Digitalization on Finnish Organizations. Science + Technology 7. Jari Collin, Kari Hiekkanen, Janne J. Korhonen, Marco Halén, Timo Itälä, and Mika Helenius (eds.). Helsinki, Aalto University Publication Series, 89–102.
Matthew G. Kirschenbaum. 2010. What Is Digital Humanities and What's It Doing in English Departments? ADE Bulletin 150: 55–61. http://mkirschenbaum.files.wordpress.com/2011/03/ade-final.pdf
Olivier Le Deuff. 2016. Humanités digitales versus humanités numériques, les raisons d'un choix. Le texte à venir. Etudes Digitales 1: 263–264.
Päivi Parviainen, Jukka Kääriäinen, Maarit Tihinen, and Susanna Teppola. 2017. Tackling the digitalization challenge: how to benefit from digitalization in practice. International Journal of Information Systems and Project Management 5/1: 63–76. DOI: 10.12821/ijispm050104
Toni Ryynänen and Torsti Hyyryläinen. 2018. Digitalisation of Consumption and Digital Humanities - Development Trajectories and Challenges for the Future. CEUR Workshop Proceedings Vol-2084: 1–8. http://ceur-ws.org/Vol-2084/short11.pdf
Amelia Sanz. 2013. Digital Humanities or Hypercolonial Studies? E-Prints Complutense. Madrid, University of Madrid. https://eprints.ucm.es/50610/
Jonathan Shaw. 2012. Humanities Digitized. Reconceiving the study of culture. Harvard Magazine May–June: 40–44 and 73–75. http://harvardmag.com/pdf/2012/05-pdfs/0512-40.pdf
James Smithies. 2017. The Digital Humanities and the Digital Modern. Basingstoke, Palgrave Macmillan.
Bernard Stiegler. 2016. Dans la disruption. Comment ne pas devenir fou? Paris, Les liens qui libèrent.
Simon Tanner. 2019. Delivering impact with digital resources: Planning strategy in the attention economy. London, Facet Publishing. Forthcoming. Quotation in Simon Tanner. 2019. Digitization vs digitalization. Humanist Discussion Group 33.390/4. https://dhhumanist.org/volume/33/392/
Maja van der Velden. 2007. Invisibility and the Ethics of the Digitalization: Designing so as not to Hurt Others. Information Technology Ethics: Cultural Perspectives. Soraj Hongladarom and Charles Ess (eds.). Hershey et al., Idea Group Reference, 81–93.
Robert Wachal. 1971. Humanities and Computers: A Personal View. The North American Review 256/1: 30–33. https://www.jstor.org/stable/25117163
Bernard O. Williams. 1984. Computing with Electricity, 1935–1945. PhD Dissertation. Lawrence, University of Kansas.

work_2fjnabij65dknn6bwzrt2slb3m ----

International Journal of Human Factors Modelling and Simulation 3(1), 90-106, 2012.
doi:10.1504/IJHFMS.2012.050078

Applying Cognitive Science to Digital Human Modelling for User Centred Design

Peter Thorvald*, Dan Högberg & Keith Case
Virtual Systems Research Centre, University of Skövde, Sweden
Mechanical and Manufacturing Technology, Loughborough University, UK
Peter.Thorvald@his.se, Dan.Hogberg@his.se, K.Case@lboro.ac.uk
*Corresponding author

Abstract. Building software which, at the press of a button, can tell you what cognition-related hazards there are within an environment or a task is probably well into the future, if it is possible at all. However, incorporating existing tools such as task analysis tools, interface design guidelines and information about general cognitive limitations in humans could allow for greater evaluative options for cognitive ergonomics. The paper discusses previous approaches to the subject and suggests adding design and evaluation guidance to Digital Human Modelling that will help a user with little or no knowledge of cognitive science to design and evaluate a human-product interaction scenario.

Keywords: Digital human modelling, cognition, context, situatedness, ecological interface design, system ergonomics, HTA, usability simulation.

Peter Thorvald is a lecturer at the School of Technology and Society at the University of Skövde. He received his PhD from Loughborough University in 2011 and his MSc in Cognitive Science at University of Skövde in 2006. Peter is a member of the Virtual Systems Research Centre and the User Centred Product/Workplace Design research group at University of Skövde, and his main research focus is information design for manual assembly in the automotive industry.

Dan Högberg is an Associate Professor in the School of Technology and Society at the University of Skövde. His research interests include methods and support systems for designers and engineers to consider human-machine interaction related matters in development processes, for example, the development and integration of digital human modelling. He received a BSc in Product Design Engineering from the University of Skövde in 1998, an MSc in Engineering Design from Loughborough University, UK in 1999 and his PhD from Loughborough University in 2005. He is a member of the User Centred Product/Workplace Design research group at University of Skövde and the Virtual Ergonomics Centre (VEC) in Sweden.

Keith Case is Professor of Computer Aided Engineering in the Mechanical and Manufacturing Engineering Department at Loughborough University, where he leads the Product Realisation Technologies and Innovative Digital Manufacturing Research Groups. Keith has a manufacturing engineering BSc and a PhD from Nottingham University, is a Fellow of the Ergonomics and Human Factors Society, a Fellow of the British Computer Society and a Chartered Engineer. His PhD thesis was entitled 'An Anthropometric and Biomechanical Computer Model of Man' (1975) and was concerned with developing the SAMMIE Digital Human Modelling System. Other research interests include virtual manufacturing, inclusive design and the application of genetic algorithms.

1 Introduction

In Digital Human Modelling (DHM), the term ergonomics usually refers to modelling physical aspects of humans, with the main focus being on anthropometry and physical strain on the body. This is also reflected in the DHM tools that exist on the market, e.g. RAMSIS, JACK, SAMMIE, V5 Human (Case & Porter, 1980; Bubb, 2007); tools that mainly, if not exclusively, model physical ergonomics.
This paper proposes ways of bringing cognition into the equation and of providing users of DHM tools with an aid for evaluating cognitive as well as physical ergonomics.

Computer modelling of human cognition has traditionally been done off-line, in the sense that the cognitive system is viewed as a hardware-independent program, effectively disregarding the surrounding environment and even the importance of a human body. However, in later years there has been an increasing interest in viewing the human as part of a complex system, incorporating the environment and the human body in cognitive modelling. This has led to new theories regarding how humans cognize within the world and has allowed us to regard the body and the context as part of the cognitive system. Human cognition is not an isolated island where we can view our surrounding context as merely a problem space. We are very much dependent on our body and our surroundings to successfully survive in the world.

Previous suggestions on integrating cognition in DHM tools have largely been based on symbol-processing architectures such as ACT-R, Soar etc. (Bernard et al., 2005; Gore, 2006; Carruth et al., 2007); architectures that disregard the embodiment and situatedness of cognition. This paper places the computer manikins used in DHM tools within a context, a context where cognitive offloading and scaffolding onto the environment are supported. The main advantage of using DHM and incorporating the suggested functionality is that it can be used very early in the system development process. It also allows the designer to consider the spatial information that the physical array incorporates. In traditional usability methods this is seldom the case, as design iterations are often done offline in the sense that they incorporate only some (if any) physical properties of the domain where the system is to be implemented.

1.1 Human performance modelling

For as long as experimental psychology has been of interest in science, modelling human performance has also been pursued. Pew (2007) describes three major movements within this field of study: manual control models of human control, task network models that ultimately predict success and performance time of systems, and cognitive architectures that utilize theories of human performance to predict behaviour.

1.1.1 Cognitive Modelling in DHM

During the last decade there have been several attempts at incorporating cognitive modelling in DHM, most of which have focused on using cognitive architectures to predict human performance. A research group at Sandia National Laboratories in New Mexico has created a framework based on a modular and symbol-processing view of human cognition, and others have focused on rule-based systems built on architectures such as ACT-R and Soar (Bernard et al., 2005; Carruth et al., 2007). Though not built on exactly the same architecture, several others have gone about the problem in similar ways, ultimately trying to reach a state where the system can, at the press of a button, perform a cognitive evaluation (Gore, 2006). However, the methodology upon which these architectures are built is challenged by researchers who recommend a more situated view of cognition as a whole. This view, originating in the 1920s with the Russian psychologist Lev Vygotsky, argues that human cognition cannot be viewed separately from its context and body (Clark, 1997).
There is no clear-cut line between what happens in the world and what happens in the head; the mind "leaks" into the world. A view already expressed in the DHM community is the need to stop dividing human factors into "neck up" and "neck down" and instead view the human as a whole (Feyen, 2007). This view finds much support in the work on social embodiment by Lawrence Barsalou and colleagues. They discuss how the embodiment of the self or others can elicit embodied mimicry in the self or others (Barsalou et al., 2003), ultimately arguing for a holistic view of the human where body and mind are both necessary for cognition. While the discussion on embodiment and situatedness is beyond the scope of this paper, it shows how earlier approaches to modelling cognition in DHM are at best insufficient and that a new approach is needed.

This paper discusses two separate approaches to modelling cognition in DHM. The first is a mathematical one, in which a plausible way towards creating a mathematical model of cognitive behaviour is suggested. The second has a much lower technological level, as it tries to consider the human as a system with a physical body, acting within an environment.

2 Existing Mathematical Approaches

Whereas past attempts at incorporating cognitive ergonomics in DHM can be criticized, there are other approaches that deserve mention. These are, more often than not, based on theories aimed at quantifying behaviour and predicting reaction times, body movement etc. (Shannon, 1948; Hick, 1952; Fitts, 1954; Freivalds, 2009). Shalin et al. (1996) described a number of potential approaches and classified them into the following categories:

• Predetermined Motion-Time Systems (PMTS)
  o MOST
  o MTM
• Mathematical models
  o Signal detection theory
  o Information theory
• Symbolic computational models
  o ACT
  o Soar

2.1 PREDETERMINED MOTION-TIME SYSTEMS

Digital Human Modelling systems usually had their origins either in military or manufacturing applications, and SAMMIE (Bonney et al., 1972) is an example of the latter type. At that time the economic efficiency of work was as important as, if not more important than, workplace ergonomics, and so such DHM systems often contained predetermined motion-time systems. In the case of SAMMIE this was a representation of MTM-2, but SAMMIE soon became a purely ergonomics system and the MTM component was embodied in a separate system (AUTOMAT) (Bonney & Schofield, 1971). In PMTS, task performance is predicted by the addition of expected times for sequential motor processes, and predictions are very vulnerable to inaccuracies in the estimation of physical task demands. These demands require elaborate and accurate task performance models that may or may not be present (Shalin et al., 1996). While PMTS, by their computational nature, might seem very suitable for inclusion in DHM software, they do not consider the mental demand involved in performing a task but regard tasks as sequential and rather offline in a cognitive sense. This results in a lack of concern for the distribution of cognition (Hutchins, 1995) and for social, physical and mental context.

2.2 MATHEMATICAL MODELS

Mathematical representation models of cognition, as found in information theory and detection theory, are mainly used to describe mental tasks and predict error rates (Shalin et al., 1996).
The field also offers specific insights into the description and effect of noise (Shannon, 1948), as well as predictions of reaction times as a function of the spatial properties of the task and the number of choices (Hick, 1952; Fitts, 1954). Information theory quantifies information by calculating the entropy (number of options) in a task. For instance, choosing between eight parts in assembly requires three bits ($\log_2 8$) of information. Tables that predict task performance then offer standardized reaction times for a given number of bits in the task. For instance, a task with eight options, i.e. three bits, would give a reaction time (not response time) of 800 milliseconds (Freivalds, 2009).

Detection theory, or signal detection theory, was founded as a method of quantifying results from stimulus detection. Among its fundamentals is the classification of responses shown in figure 1 (Wickens, 2002).

                     Response present   Response absent
Stimulus present     Hit                Miss
Stimulus absent      False alarm        Correct rejection

Figure 1. Matrix of the classification of response and stimulus.

Detection theory also takes into account the biases that the respondent may or may not have, based on the consequences of false alarms or misses (Wickens, 2002). For example, an oncologist examining x-rays for possible tumours may be biased towards finding a tumour where there is none (false alarm), as a failure to recognize a tumour (miss) might have larger repercussions than a false alarm. These outcomes are often referred to as false positives or false negatives.

2.2.1 SYMBOLIC COMPUTATIONAL MODELS

As mentioned, the traditional approach in cognitive science, although heavily challenged in recent years (Searle, 1980; Harnad, 1990), is to view human cognition as a symbol-processing system, effectively reducing the context to a mere problem space. The difficulty of considering multiple task dimensions in the traditional mathematical models has given rise to the use of symbolic models for application to task analysis. The models are based on the belief that intelligence is symbol processing in the brain, much like a formal computer system: it is a matter of following a set of basic rules for manipulating symbols and searching over a set of stored problem-solving operations (Shalin et al., 1996).

3 A Mathematical Model of Cognition in DHM

The aforementioned mathematical approaches all have in common that they are quantifiable. While they might be challenged on their philosophical basis in cognitive science, the ability to compute them gives them an advantage. Such an approach has great potential impact due to the context in which it can be used. An expert evaluation system, such as that discussed in the rest of the paper, will be limited by its need for expert analysts. A mathematical model, however, would consist of a number of parameters and variables that need to be completed for the system to make an accurate computation of cognitive strain, or of whatever else the system is designed to compute. To create such a model, the first task would be to create a way to quantify information. Within information theory, information entropy is measured in bits (Shannon, 1948; Freivalds, 2009), a binary expression of the amount of information required to decide between two equally likely alternatives. It is calculated using:

$H = \log_2 n$

where $H$ is the entropy and $n$ is the number of equally likely alternatives.
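As a brief illustration, this calculation is straightforward to express in code. The sketch below is our own addition rather than part of the paper's tooling, and the function name is hypothetical:

```python
import math

def entropy_bits(n: int) -> float:
    """Information entropy H = log2(n) for n equally likely alternatives."""
    if n < 1:
        raise ValueError("there must be at least one alternative")
    return math.log2(n)

# Example: choosing between eight equally likely assembly parts
print(entropy_bits(8))  # -> 3.0 bits
```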
Using this expression, the entropy of a decision with eight equally likely alternatives is three bits:

$\log_2 8 = 3$

Three bits written in binary range from 000 to 111, which corresponds to 0–7, or eight states. Information entropy can also be calculated for alternatives that are not equally likely, using this formula (Freivalds, 2009):

$H = \sum_{i=1}^{n} p_i \log_2\left(\frac{1}{p_i}\right)$

where $p_i$ is the probability of alternative $i$, and $i$ runs over the alternatives from 1 to $n$.

Once entropy has been calculated, it is merely a matter of assigning each bit a value. In Hick's law (or the Hick-Hyman law), for instance, reaction time is calculated by:

$RT = a + bH$

where $RT$ is the reaction time, $H$ is the information entropy in bits, $a$ is the intercept, and $b$ is the slope, or information processing rate.

The information processing rate, or the bandwidth of information processing if you will (expressed in bits/s), is sensitive to disturbances. Consider the biases described in detection theory, mentioned earlier: it is plausible that a task with a high demand on accuracy would affect the speed of the response. Creating a model which takes the information processing rate into account would require quite a bit of empirical data. What would be sought in such an endeavour is something similar to table 1.

Table 1. Linear model of choice reaction time according to Hick's law.

Choices   Bits   Reaction time (ms)
1         0      150
2         1      300
4         2      450
8         3      600

Finding the parameters for such a table would, as mentioned, require a substantial amount of empirical data, but once gathered it could be applied to DHM and to DHM tools. Simply calculating the entropy of a task would allow us to find the reaction time, as long as the slope-intercept relationship has been established first. Similarly, this methodology can be extended to include not only reaction times but also response times. Fitts' law, for instance (Fitts, 1954; MacKenzie, 1995), is often coupled with Hick's law and performs similar calculations to determine the time required to move into a target area. This might be moving a finger to a button, moving a cursor to an icon etc. The general rule is that the time it takes to move into the target area is a function of the distance to the target and the size of the target.

A model like the one described could potentially have great impact in society. However, it might not be as easy as it sounds to create, as it also needs to be generalizable to a number of problems. It needs to be able to handle choices of buttons on a car's dashboard while at the same time being able to calculate which lever to pull in an overhead crane to raise the hook. The model presented in this section, while full of potential were it to be realized, has inherent problems. Any model intended for use by laymen needs to be simple and generalizable, something this model would have difficulties with. However, it is worth keeping in mind that perhaps it is not the absolute values of, e.g., a reaction time that are most important, but rather how a particular choice holds up against several others; in that context, the model might be very successful. A significantly different way of handling cognition, far from mathematics, is discussed in the remainder of this paper. Instead of focusing on the quantifiable parts of information entropy, softer values of how a human cognizes will be presented.
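Before leaving the mathematical models, they can be gathered into a single computational sketch. This is our own illustration under stated assumptions, not a validated tool: the Hick parameters are taken from Table 1, while the Fitts coefficients and the Shannon formulation of the index of difficulty (MacKenzie, 1995) are illustrative placeholders.

```python
import math

def entropy(probabilities):
    """General entropy: H = sum of p_i * log2(1 / p_i) over all alternatives."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

def hick_reaction_time(n_choices, intercept_ms=150.0, slope_ms_per_bit=150.0):
    """Hick's law, RT = a + b*H, with the Table 1 parameters as defaults."""
    h = math.log2(n_choices)  # equally likely alternatives
    return intercept_ms + slope_ms_per_bit * h

def fitts_movement_time(distance_mm, width_mm, a_ms=50.0, b_ms_per_bit=150.0):
    """Fitts' law in the Shannon formulation: MT = a + b * log2(D/W + 1).
    The coefficients a_ms and b_ms_per_bit are hypothetical placeholders."""
    index_of_difficulty = math.log2(distance_mm / width_mm + 1.0)
    return a_ms + b_ms_per_bit * index_of_difficulty

# Example: one of eight dashboard buttons, 200 mm away and 20 mm wide
print(hick_reaction_time(8))             # 600.0 ms, matching Table 1
print(fitts_movement_time(200.0, 20.0))  # illustrative movement time
print(entropy([0.5, 0.25, 0.25]))        # 1.5 bits for unequal alternatives
```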
4 Cognition as a System

For a system design to become successful, the incorporation of human factors is essential. To a large extent, physical ergonomics is very well accounted for in today's system design practices, but the cognizing human is often neglected. On the one hand, as technology increasingly demands more human processing abilities, the modelling of human cognition becomes more important; the range of human behaviours needs to be known in order to design human-related control systems (Bubb, 2002). On the other hand, improved knowledge of human behaviours must not excuse 'bad design'. It should become more important to design systems that are compatible with how human cognition actually works, in order to make the "entire system" (i.e. including the human component) work in an effective and efficient manner, with the overall objective that the system output shall be high and stable (high productivity and quality). So perhaps instead of calling for raised knowledge of cognitive strengths and limitations, one should focus on developing technologies that comply with the whole human system, physical and mental.

System ergonomics can be used to describe a more or less complex task's mental demands on a human. It does so in three ways (Bubb, 2002).

1. Function
The main consideration of function is what the operator has in view and to what extent the task is supported by the system. It is largely defined by the temporal and spatial properties of the activities to be performed. When and where should the task be performed?

2. Feedback
The feedback allows the user to identify what state the system is in: whether a performed action has had any result, which action was performed, etc. It is very important to allow the operator to recognize whether an action had any effect on the system and also what its result was (Norman, 2002). For example, even if a computing task on a PC takes some time to complete, the operator is informed that the computer is working by a flashing light or an hourglass on the screen.

Figure 2. A seat adjustment control which exhibits excellent natural mapping or matching between the system and the user's mental model.

3. Compatibility
Compatibility is largely about the match between systems, or between the system and the user's mental model of the system. The operator should not be required to put too much effort into translating system signals. Compatibility relates information sources to each other. A very simple and obvious example from the automotive industry is described by Norman (2002) with a seat adjustment control from a car. A similar seat adjustment control can be viewed in figure 2. It is obvious in the figure that the system (the adjustment control) corresponds well to the result of the task of manoeuvring the controls. The control maps very well to the response of the seat and to the user's probable mental model. However, compatibility is not exclusively a psychological issue; a designer also needs to consider the physical compatibility of the user and the system. Controls might, for example, be spatially located beyond the physical reach of the human.

Though these three points are hardly sufficient for a comprehensive design tool, they are of great help in an initial stage of system design and will prove helpful to us in developing a more detailed design aid.

5 Methods for Interface Design and Evaluation

In Human-Computer Interaction (HCI) there are several evaluation methods with great use for certain situations. As the aim of this paper is to present a proposal for a design tool, we shall take a closer look at a few of these methods along with a task analysis tool.
5.1 Task Analysis

All good design processes include some sort of task analysis. To be able to design a system that fits both task and human, we need to know as much as possible about the task. A fairly quick and dirty task analysis which provides a good basis for further development is the Hierarchical Task Analysis (HTA) (Annett, 2003). An HTA is a tree diagram of the task structure and serves several purposes. It gives us a good overview of the system or the task and the subtasks that need to be performed, and assists in achieving common ground within a design group. It can even serve as a task evaluation tool, allowing a designer to find global problems that can be missed when using usability inspection methods such as cognitive walkthrough (Polson et al., 1992), heuristic evaluation (Nielsen, 1993; Nielsen, 1994) etc. Global issues are mainly related to the structure of the task and the relations between subtasks, whereas local issues sit within a subtask with a very limited scope.

Figure 3. A very simple HTA of the process of making a pot of coffee. [The tree: Make Coffee, divided into 1. Add Water (1.1 Fill Pot with Water; 1.2 Pour Water into Coffee Maker), 2. Add Coffee (2.1 Place Filter; 2.2 Add Coffee) and 3. Press Button. Plans: do 1-2 in any order, then do 3; do 1.1-1.2; do 2.1-2.2.]

The creation of an HTA is fairly simple. First, identify the overall task to be performed, which in our very simple example, illustrated in figure 3, is making a pot of coffee. The HTA in figure 3 shows this process; also shown are the plans within which each subtask should be performed. In this example the plans are limited to doing the tasks in order, or doing two subtasks first in any order and then continuing with the third. However, these plans can be very variable and flexible, including elements such as selections (do one but not the other), linear or non-linear ordering, or even specific conditions (if X then do Y, else Z). The finished task analysis is then used as a basis for further inspections and design iterations (a sketch of such a structure in code is given at the end of section 5.3.1).

5.2 Ecological Interface Design

Ecological Interface Design (EID) is spawned from Cognitive Work Analysis (CWA), which was developed as an analytical approach to cognitive engineering by the Risø group in Denmark (Vicente, 1999). CWA was developed to aid in the design of highly critical human-machine systems, such as nuclear power plant control rooms, to make them safer and more reliable. It is an approach that allows the operator to handle situations that the system designers had not anticipated. CWA is made up of five phases of analysis within a system: work domain analysis, control task analysis, strategies analysis, social-organisational analysis and worker competencies analysis (Sanderson, 2003). Having these analyses allows the designer and the operator a better understanding of the system, and this alone enables the operator to respond better to unforeseen events.

The idea behind EID is to create interfaces based on certain principles of CWA. It is very closely related to the principles of ecological psychology and direct perception, concepts developed by J. J. Gibson in the 1970s (Gibson, 1986). Gibson argued that there is enough information in the visual array to directly perceive information and that mental processing of visual information is not necessary. Though this claim is highly challenged, EID is largely built up around these principles in that its goal is to create interfaces containing objects that visually reveal information about their function.
A related goal of EID is to make affordances visible in interface design. Affordances, another concept created by Gibson, are the action possibilities of a specific object (Gibson, 1986; McGrenere & Ho, 2000). The ideas surrounding affordances and EID can also be found in other areas of the scientific literature. In product design, one tends to discuss similar issues in terms of semantics (Monö, 1997).

5.3 Usability Inspections

Usability inspection methods are predictive evaluation methods, usually performed without end-user participation (although this is not a prerequisite). Usability experts simulate the users and inspect the interface, resulting in lists of problems with varying degrees of severity (Nielsen & Mack, 1994).

5.3.1 Cognitive Walkthrough

A cognitive walkthrough is usually performed by usability experts considering, in sequence, all actions incorporated in a predefined task. Its focus is almost exclusively on ease of learning, and the method contains two phases. First comes the preparation phase, where the analyst defines the users, their experience and knowledge, defines the task to be analysed, and identifies the correct sequence of actions to achieve the goal of the task. In the second phase, the analysis phase, the analyst answers and motivates a set of questions for each action within the task (Polson et al., 1992).

1. Will the user try to achieve the right effect? For example, if the task is to fill up the car with petrol and a button first has to be pressed from inside the car to open the gas cap, does the user know that this has to be done?
2. Will the user notice that the correct action is available? Simply pressing the button for the gas cap would not be a problem, but if the button has to be slid or twisted in some way, the user may not think of this.
3. Will the user associate the correct action with the desired effect? Is it clear that this is what the specific control is for? Unambiguous icons and names of controls are important for this aspect.
4. If the correct action is performed, will the user see that progress is being made? The importance of feedback, discussed earlier, comes into play here.

These questions, though applicable to many tasks, are merely guidelines towards conducting a successful cognitive walkthrough. The method's advantage is its focus on detail; it identifies local problems within the task and considers the users' previous knowledge and experiences. However, it rarely catches global problems related to the overarching structure of the task, and it can be viewed as fairly subjective. It also requires a detailed prototype for evaluation, although this would probably not be a problem if it complements a DHM tool, where a virtual prototype is likely to exist already. Also, it is not just about presenting information; it is about how the information is presented. A robot would not have problems with different types of knobs or buttons, as it has no preconceived notions of how they should look and does not expect things to be in a certain way. Humans do, and this is why we have to stick to consistency and standards.
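To show how the HTA of figure 3 could drive such a walkthrough, here is a minimal sketch. It is our own illustration rather than part of any existing DHM tool, and all function and variable names are hypothetical.

```python
# The four cognitive-walkthrough questions (after Polson et al., 1992),
# applied to every leaf action of the Figure 3 HTA.
WALKTHROUGH_QUESTIONS = (
    "Will the user try to achieve the right effect?",
    "Will the user notice that the correct action is available?",
    "Will the user associate the correct action with the desired effect?",
    "If the correct action is performed, will the user see progress?",
)

def task(name, subtasks=(), plan=None):
    """One node of the HTA tree: a task, its subtasks and an ordering plan."""
    return {"name": name, "subtasks": list(subtasks), "plan": plan}

make_coffee = task("Make Coffee", plan="do 1-2 in any order, then 3", subtasks=[
    task("1. Add Water", plan="do 1.1-1.2", subtasks=[
        task("1.1 Fill Pot with Water"),
        task("1.2 Pour Water into Coffee Maker")]),
    task("2. Add Coffee", plan="do 2.1-2.2", subtasks=[
        task("2.1 Place Filter"),
        task("2.2 Add Coffee")]),
    task("3. Press Button"),
])

def walkthrough(node):
    """Yield (action, question) pairs for every leaf action in the HTA."""
    if not node["subtasks"]:
        for question in WALKTHROUGH_QUESTIONS:
            yield node["name"], question
    for sub in node["subtasks"]:
        yield from walkthrough(sub)

for action, question in walkthrough(make_coffee):
    print(f"{action}: {question}")
```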
Examples of Nielsen’s heuristics are • Match between system and the real world o Similar to the matching and mapping concept discussed in system ergonomics, the system should speak the users’ language, matching the real world in terms of terminology and semiotics. • Consistency and standards o Also related to the matching concept is using accepted conventions to avoid making users wonder whether different words, icons or actions mean the same thing in different contexts. • Recognition rather than recall o Options should be made visible to avoid making the user having to remember how or where specific actions should be performed. • Aesthetic and minimalist design o Dialogues and controls should not be littered with irrelevant or seldom used information. Heuristics can be added and subtracted to fit certain tasks before the evaluation commences. The method results in problem lists with motivations and rankings of the severity of the problems found. 6 An Expert Design Guide for DHM The evaluation and design tools discussed in previous sections are developed for interface design in different settings than DHM. However, the design guide proposed in this section is a hybrid of these, adapted for use under the specific conditions that DHM provides. The method strives to take into account global as well as local issues through the use of action based interface inspections and a task analysis focusing on the structure of the task. As stated earlier in this paper and by others (Pheasant & Haslegrave, 2006), every good design process starts with a task analysis. For our purposes, a hierarchical task analysis is suitable as it complements the inspection methods incorporated in this design guide. The HTA serves several purposes; it gives the designer a better understanding of the task and it provides a common understanding of the task within a development group. The task analysis can also be used as an evaluation tool of the task itself. It allows the designer to identify problems in the task structure that could result in problems with automatism (Thorvald et al., 2008), it can identify recurring tasks and give them a higher priority in the interface etc. Complementary to the task analysis, the designer should consider who the users are and what a priori knowledge they have. This resembles the guiding system for utilising traditional DHM tools in development processes suggested by Hanson et al. (2006), where the users’ anthropometry and tasks are defined before the actual analyses or simulations are performed. The sequence-based walkthrough will take its basis in the task analysis performed. For each subtask (box) of the HTA, a set of questions, based on Bubb’s points regarding system ergonomics (Bubb, 2002), will act as guidelines for the design. • Function – When and where should the action be performed? o Will the user identify the action space where the correct action should be performed? What do the physical and geographical properties of each control convey to the user? o Frequency of actions – a frequently recurring action should take precedence in taking up place and intrusiveness in the physical and cognitive envelope. o Importance of action – Safety critical systems should also take precedence in the available information space. o Minimalism of design – avoid taking up space with irrelevant or rarely needed information. Hick’s law: Reaction time is a function of the number of choices in a decision (Hick, 1952). 
In figure 4 there is an example of what a virtual interface modelled in a DHM tool can look like. In this case the picture shows a fighter jet cockpit used for an evaluation in which the pilot needed to locate a "panic button" to bring the aircraft back under control under extreme physical and mental load conditions.

Figure 4. Two views of a cockpit modelled in the DHM tool SAMMIE.

The action spaces that the user has to identify when performing an action are the controls in front of, and to the right and left of, the steering control stick. Preferably, the control for a frequently performed action should be placed on the control stick or directly in front of it, as these are the spaces that best correspond to the physical and cognitive reach of the pilot. Safety systems too, as in the case of the evaluation in figure 4, should be placed so that they are easily accessible to the user. Knowing that certain controls are rarely used, they can be placed to the right and left to avoid having too many options in terms of 'pushable' buttons in the same place. The intrusiveness and affordances of such "high priority controls" should also be accentuated in their design.

• Feedback
  o Will the user understand that a correct or faulty move has been made?
  o Is the system status visible?

Understanding what has been done and what is in the process of happening with the system can prove vital in many cases. Surely we can all relate to a situation where we have pressed the print button more than once, only to find that we have printed several more copies than needed. While this may be a minor problem, one can easily imagine the problems that can arise in more critical domains. What if there were no indications of what gear the car's gearbox was in? The driver would have to test each time to see whether the car is in reverse or drive. In an incident at a hospital, a patient died as a result of being exposed to a massive overdose of radiation during a radiotherapy session. The problem could easily have been avoided had the system provided the treating radiology technician with information about the machine's settings (Casey, 1998).

• Compatibility
  o Does the system match other, similar systems in terms of semantics, semiotics etc.?
  o Does the system match the real world and the plausible mental model of the user?
  o Are demands on consistency and standards of the domain met?
  o Action-effect discrepancies – is it obvious beforehand that a certain action will have a certain effect?

Accurate mapping between systems and mental models is a key concept in the compatibility section. This includes trying to adhere to the consistencies and standards of the organisation and the specific field. There should also be a clear connection between action and effect. Neglecting these consistencies can lead to serious problems, as in the case of an aircraft's rudder settings. The sensitivity of the rudder could be set through a lever placed to the side of the pilot's seat. However, between the simulator for the aircraft and the actual aircraft the lever was reversed, moving in opposite directions for maximum and minimum sensitivity, almost resulting in a crash (Casey, 2006).
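As a minimal sketch of how this guide could be operationalized, the questions above can be stored per category and attached to each HTA subtask. This is our own illustration under stated assumptions: the category names follow Bubb's three points, while the data structure and function names are hypothetical.

```python
# Sketch: the Section 6 design-guide questions, grouped by Bubb's three
# system-ergonomics criteria and applied subtask by subtask.
DESIGN_GUIDE = {
    "Function": [
        "Will the user identify the action space for the correct action?",
        "Is the control's prominence proportional to the action's frequency?",
        "Do safety-critical controls take precedence in the information space?",
        "Is irrelevant or rarely needed information kept out of the design?",
    ],
    "Feedback": [
        "Will the user understand that a correct or faulty move has been made?",
        "Is the system status visible?",
    ],
    "Compatibility": [
        "Does the system match similar systems in semantics and semiotics?",
        "Does the system match the real world and the user's mental model?",
        "Are the domain's demands on consistency and standards met?",
        "Is it obvious beforehand that a certain action has a certain effect?",
    ],
}

def review_form(subtask_name):
    """Return an empty review form for one HTA subtask: one slot per question."""
    return {(subtask_name, category, question): None
            for category, questions in DESIGN_GUIDE.items()
            for question in questions}

form = review_form("3. Press Button")
print(f"{len(form)} questions to answer for this subtask")  # -> 10
```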
7 Conclusions & Future Work

In ergonomics it seems to be common practice to separate human factors into "neck up" and "neck down". Though this approach may make it easier to study ergonomics, it does not portray an entirely accurate picture of the human. The evidence for a tight coupling between mind and body is so overwhelming that, instead of talking about mind and body, perhaps we should be talking about the human system.

The aim of this paper has been to consider past and current approaches to integrating cognition into DHM tools and to outline potential new design guides and models to help designers achieve this integration in a better way. The originality of the design guide lies in the combination of methods from the HCI field and their new application to DHM. The guide is not complete and needs extensive further development and testing. However, it is a pragmatic start towards including functionality for considering cognitive ergonomics in DHM tools. The mathematical model, on the other hand, is merely a theoretical suggestion for what could potentially be realized in the future. It also serves as a contrast to the expert design guide presented later in the paper. Obviously, the conceptual guide for DHM tool development needs to be detailed and tested on real problems to prove its contribution to the field of ergonomics design and evaluation. Small-scale testing and evaluation of the tool will be carried out, and the tool will be compared to other HCI and cognitive ergonomics evaluation tools such as CWA, usability testing, user studies, etc.

Another approach to the objective of considering the 'full' human in DHM tools is the conceptual idea of creating DHM Personas, as presented and discussed in Högberg et al. (2009). This concept is based on the idea of giving the manikins in the DHM tool a degree of personality by describing characteristic 'user types'. The basic approach of portraying user types in terms of narrative texts and images is a widespread design method (Nielsen, 2002; Pruitt & Grudin, 2003; Cross, 2008). The idea of mapping such descriptions onto computer manikins is, however, a newer approach, and resembles the ideas of Högberg and Case (2006) as well as Conradi and Alexander (2007). Figure 5 shows two manikins as DHM Personas, where the descriptions applied to the manikins convey certain capacities and give the manikins personality traits. The attempt is to enrich the DHM tool user's understanding of end-user requirements and of user diversity in the targeted population, related not only to physical and cognitive ergonomics but also to what we may term pleasurable or emotional ergonomics (Jordan, 2002; Siddique, 2004).

Figure 5. Example of DHM Personas.

An interesting possibility would be to integrate DHM Personas into the methods for the consideration of cognitive aspects in DHM suggested in this paper, as an attempt to take the 'whole' human, and the diversity of humans, even further into account in the human-system interface being designed or evaluated. For example, one may imagine the product or workplace designer doing an HTA or CWA with 'different hats on', as described by the DHM Persona, hence further increasing the chance that the 'entire' user and user diversity are considered in the design process.

References

Annett, J. (2003). Hierarchical Task Analysis. In D. Diaper & N. Stanton (eds.). The Handbook of Task Analysis for Human-Computer Interaction. pp. 67-82. Mahwah, New Jersey: Lawrence Erlbaum Associates.
Barsalou, L. W., Niedenthal, P. M., Barbey, A. K. & Ruppert, J. A. (2003). Social Embodiment. In B. H. Ross (ed.). The Psychology of Learning and Motivation. San Diego, CA: Academic Press.
Bernard, M. L., Xavier, P., Wolfenbarger, P., Hart, D., Waymire, R., Glickman, M. & Gardner, M. (2005). Psychologically Plausible Cognitive Models for Simulating Interactive Human Behaviors. In Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting. pp. 1205-1210.
Bonney, M. C., Case, K., Hughes, B. J., Schofield, N. A. & Williams, R. W. (1972). Computer Aided Workplace Design using SAMMIE. In Ergonomics Research Society Annual Conference, Cardiff, April.
Bonney, M. C. & Schofield, N. A. (1971). Computerized work study using the SAMMIE/AUTOMAT system. International Journal of Production Research, 9 (3), 321-336.
Bubb, H. (2002). Computer Aided Tools of Ergonomics and System Design. Human Factors and Ergonomics in Manufacturing, 12 (3), 249-265.
Bubb, H. (2007). Future Applications of DHM in Ergonomic Design. Lecture Notes in Computer Science, 4561, 779-793.
Carruth, D. W., Thomas, M. D., Robbins, B. & Morais, A. (2007). Integrating Perception, Cognition and Action for Digital Human Modeling. In V. G. Duffy (ed.). Digital Human Modeling, HCII 2007. pp. 333-342. Berlin: Springer-Verlag.
Case, K. & Porter, J. M. (1980). SAMMIE - A Computer Aided Ergonomics Design System. Engineering, 220, 21-25.
Casey, S. M. (1998). Set phasers on stun and other true tales of design, technology, and human error. Santa Barbara, CA: Aegean.
Casey, S. M. (2006). The atomic chef: and other true tales of design, technology, and human error. Santa Barbara, CA: Aegean Pub. Co.
Clark, A. (1997). Being There: Putting Brain, Body, and World Together Again. MIT Press.
Conradi, J. & Alexander, T. (2007). Modeling personality traits for digital humans. Society of Automotive Engineers, SAE Technical paper (2007-01-2507).
Cross, N. (2008). Engineering design methods: strategies for product design. Chichester: Wiley.
Feyen, R. (2007). Bridging the Gap: Exploring Interactions Between Digital Human Models and Cognitive Models. In V. G. Duffy (ed.). Digital Human Modeling, HCII 2007. pp. 382-391. Berlin: Springer-Verlag.
Fitts, P. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47 (6), 381-391.
Freivalds, A. (2009). Niebel's methods, standards, and work design. New York: McGraw-Hill Higher Education.
Gibson, J. J. (1986). The ecological approach to visual perception. Hillsdale, NJ: Lawrence Erlbaum Associates.
Gore, B. F. (2006). Human Performance: Evaluating the Cognitive Aspects. In V. G. Duffy (ed.). Handbook of digital human modeling. Mahwah, New Jersey.
Hanson, L., Blomé, M., Dukic, T. & Högberg, D. (2006). Guide and documentation system to support digital human modeling applications. International Journal of Industrial Ergonomics, 36 (1), 17-24.
Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42 (1-3), 335-346.
Hick, W. E. (1952). On the rate of gain of information. The Quarterly Journal of Experimental Psychology, 4 (1), 11-26.
Hutchins, E. (1995). Cognition in the Wild. Cambridge, MA: MIT Press.
Högberg, D. & Case, K. (2006). Manikin characters: user characters in human computer modelling. Contemporary Ergonomics, 499-503.
Högberg, D., Lundstrom, D., Hanson, L. & Warell, M. (2009). Increasing Functionality of DHM Software by Industry Specific Program Features. SAE Technical paper (2009-01-2288).
Jordan, P. (2002). Designing pleasurable products: An introduction to the new human factors. London: Taylor & Francis.
MacKenzie, I. S. (1995). Movement time prediction in human-computer interfaces. In R. M. Baecker, W. A. S. Buxton, J. Grudin & S. Greenberg (eds.). Readings in human-computer interaction. pp. 483-493.
McGrenere, J. & Ho, W. (2000). Affordances: Clarifying and Evolving a Concept. Proceedings of Graphics Interface 2000, 179-186.
Monö, R. (1997). Design for Product Understanding. Skogs Boktryckeri AB.
Nielsen, J. (1992). Finding usability problems through heuristic evaluation. In Proceedings of ACM, Monterey, CA. 373-380.
Nielsen, J. (1993). Usability Engineering. San Francisco, CA: Morgan Kaufmann.
Nielsen, J. (1994). Heuristic evaluation. In J. Nielsen & R. L. Mack (eds.). Usability inspection methods. pp. 25-62. New York: John Wiley & Sons, Inc.
Nielsen, J. & Mack, R. L. (1994). Usability inspection methods. New York: John Wiley & Sons.
Nielsen, L. (2002). From user to character: an investigation into user-descriptions in scenarios. In DIS 2002, Designing Interactive Systems, London. 99-104.
Norman, D. (2002). The design of everyday things. New York: Basic Books.
Pew, R. W. (2007). Some history of human performance modeling. In W. Gray (ed.). Integrated models of cognitive systems. pp. 29-44. New York: Oxford University Press.
Pheasant, S. & Haslegrave, C. M. (2006). Bodyspace: Anthropometry, Ergonomics and the Design of Work. CRC Press.
Polson, P. G., Lewis, C., Rieman, J. & Wharton, C. (1992). Cognitive Walkthroughs: A Method for Theory-Based Evaluation of User Interfaces. International Journal of Man-Machine Studies, 36 (5), 741-773.
Pruitt, J. & Grudin, J. (2003). Personas: practice and theory. In Proceedings of the 2003 conference on designing for user experiences, San Francisco, CA. 1-15.
Sanderson, P. M. (2003). Cognitive work analysis. In J. M. Carroll (ed.). HCI models, theories, and frameworks: Toward an interdisciplinary science. pp. 225-264. San Francisco, USA: Morgan Kaufmann Publishers.
Searle, J. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3 (3), 417-457.
Shalin, V., Prabhu, G. & Helander, M. (1996). A cognitive perspective on manual assembly. Ergonomics, 39 (1), 108-127.
Shannon, C. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379-423, 623-656.
Siddique, Z. (2004). Conceptualizing Emotional Ergonomics and Exploring Ways to Empower Workplace Dynamics. Contemporary Ergonomics, 540-544.
Thorvald, P., Bäckstrand, G., Högberg, D., de Vin, L. J. & Case, K. (2008). Demands on Technology from a Human Automatism Perspective in Manual Assembly. In Proceedings of FAIM2008, Skövde, Sweden, June-July 2008. 632-638.
Vicente, K. J. (1999). Cognitive Work Analysis: Toward Safe, Productive, and Healthy Computer-Based Work. Lawrence Erlbaum Associates.
Wickens, T. (2002). Elementary signal detection theory. Oxford University Press, USA.
work_2jrybysfevbopiecnsyp6jokoe ----

This is a preprint of an Article submitted for consideration in Textual Practice © 2011 copyright Taylor & Francis; Textual Practice is available online at: http://www.informaworld.com/openurl?genre=article&issn=0950-236X&volume=24&issue=6&spage=1095

Review of Alan Liu, Local Transcendence: Essays on Postmodern Historicism and the Database (Chicago: University of Chicago Press) by Martin Paul Eve is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License (http://creativecommons.org/licenses/by-sa/3.0/). Based on a work at www.martineve.com.

Alan Liu, Local Transcendence: Essays on Postmodern Historicism and the Database (Chicago: University of Chicago Press), 392 pp., £14.50 (paper), £37.00 (cloth)

'Keep cool but care' wrote Thomas Pynchon in his first novel, V.; an acknowledgment of the tension between the individual and the masses, an invitation to consider concepts of freedom and control, passion and apathy. It was also an invitation framed by the turbulence of an impending digital era, the 'flip' and 'flop' of these dichotomies corresponding to the zeros and ones of a 'computer's brain'.1 Extending his previous work on Romanticism and – more obviously from this context – 'Cool', Alan Liu's latest collection is a volume that offers an insightful and much-needed digital-era revision of cultural criticism since the 1980s, while also exhibiting a playful side in which the work frequently posits a structural counter-irony to the obverse line of thought. Containing an almost cartographic overview of Liu's work from 1989 to the present – from the New Historicism to the Spruce Goose – Local Transcendence also provides, in its final two essays, 'Transcendental Data' and 'Escaping History', a solid rationale for an extension of such cultural criticism into the Digital Humanities projects of the last decade. In contrast to many essay collections amalgamating such spans of work, Liu's volume amounts, through the combination of cumulative argument and the constant structural plays on immanent transcendence, to more than the sum of its components; while this book has strong reference value, it is in the totality of its trajectory that it truly shines. To begin with this in mind, it is perhaps apt to remark that the subtitle, 'Essays on Postmodern Historicism and the Database', appears, for much of a first reading, somewhat misplaced, for it is only in the final essay, 'Escaping History', that Liu combines historiography with the database.

1 Thomas Pynchon, V. (London: Vintage, 1995), p. 366.
Indeed, the early works in the collection – offering critique of, among others, the New Historicism's angst regarding 'the marginality of literary history' (p. 30) and Welsh colonial discourse in relation to Wordsworth, recusancy, patriotism and the New Historicist subversion/containment dichotomy – leave the reader with a sense of disjointedness and an impression of a telos-less wandering. However, it emerges that this is integral to the very performativity of Liu's writing; can one claim to have thoroughly covered, for example, the topic of anti-methodology in cultural criticism, if one's structural movement does not also query this model? Liu, it would seem, thinks not: his critique of lists takes the form of a list; his damnation of New Historicism's overemphasis on supposedly representative theatrical moments begins with a representative theatrical moment; the unacknowledged dangers of eXtensible Markup Language (XML) encoding for artistic practices are encoded within such a schema. While this could degenerate towards gimmickry, Liu pulls it off as a serious methodological undertaking, for mirrored in the synthetic resolution of these structural dialectics is a parallel to Liu's critique of 'that which the postmodern interpreter champions as subversive', the element which 'sympathizes with ourselves' (p. 62). In proposing, and then undermining, a structurally 'subversive' element, Local Transcendence sidesteps the pitfalls of methodological hypocrisy that lie in wait for such meta-textual performativity and reveals a distinct path.

This debate on form and content – so tired in other spheres – is further revived by Liu in his digital humanities work. In the penultimate piece, 'Transcendental Data', Liu argues that the increasing prevalence of content-transmission-consumption models (pp. 214-215) built upon standards such as XML, which aim to separate content from presentation, poses a threat to artistic modes that rely upon the blurring of this distinction. Yet, is this pushing the implication of these technologies too far? After all, 95% of artists working in a digital medium are not currently exposed to XML, but rather constrained at the level of the user interfaces within which they must operate. The 5% who do encounter this medium will likely have the technical ability to craft a presentation layer – to borrow Marshall McLuhan's phrase – that would transform it to the message. While XML is, indeed, designed for presentational re-construction at the consumer-end, specifying procedures and constraints for this reception – and thereby circumventing the problem of which Liu writes – would be no different to the outcry at the Tate Modern when, in 2008, a Mark Rothko painting was accidentally hung, against the artist's instructions, upside down. Furthermore, it is possible that such a content/form dichotomy, in which each element must be separately considered, could lead to a culture of artistic practice which places a greater emphasis on the self-aware consideration of this distinction; surely a positive turn.

From this mention of auto-consciousness, it is fair to state that Liu's self-aware, self-criticism marks the strongest point in this volume. In knowledge of his earlier complicity with the New Historicism, Liu's work on Romanticism and, in particular, Wordsworth, wastes no time on preliminary synopsis and assumes a familiarity with the field, allowing his analysis of cultural criticism to shine through.
When writing on the digital humanities, however, Liu digresses into lengthy exegeses of what are, to figures in the computer science arena, trivial aspects of database theory (pp. 249-254). This discrepancy somewhat betrays Liu's objective to 'rethink thinking' (p. 181) as regards interdisciplinarity. Such an assumption of familiarity with the literary, and an opposing presumption of ignorance of the technological, seem, at times, to recross the boundary of pragmatism back to a home discipline seeking 'some more absolute validation' (p. 181) in the exotic other.

These relatively minor critiques are outshone, however, by the majority of the book, none more so than in Liu's approach to the academy itself. Building on the premise that 'an adequate discussion of literary history must at some point cite the history of the academy' (p. 202), Local Transcendence is topical and relevant, especially in its dealings with the already touched-upon interdisciplinary studies. Situating this term within the military metaphor aptly applied to disciplinarity, Liu covers the field with focus on the Fishean critique of epistemological boundaries, acquiescing to a degradation of the interdisciplinary to a mere rhetorical trope, yet simultaneously offering a means of redemption. In the recognition that interdisciplinarity is a mode whose quest for knowledge risks a fall towards this rhetorical formation, Liu sees the potential for a counterforce who, re-appropriating Lyotard, would deny the 'consensus of good taste'; a war machine that reverts to a Deleuzian horde, rather than a monolithic entity (pp. 184-185).

Similarly, the discussion on literary history as the management of presentations of literature – while having a Wittgensteinian feel to its motif of 'citation-as-seeing' as opposed to 'citation-as-calling' (p. 196) – also has a role in the practicalities of course design and pedagogy. Although Liu lacks the space, or perhaps inclination, to develop this into a full pragmatic paradigm, it is hinted that while the academy's current mode may permit plurality within its meta-structure, a move to a new literary history would involve – couched by Liu in the terminology of packet switching and, even, patchwork quilting – less credulity towards these meta-narratives.

From this description, one might be tempted to believe that Local Transcendence has scarcely advanced since the heyday of high postmodernism; be assured that this is not the case. While in both structure and content Liu explores the bounds of knowledge that so pervaded this era, this collection is an entirely historicized account of the period which covers, ultimately, the unacknowledged tension between contingency and freedom, between the academy and its objects of study, between digital threat and digital redemption within the discourses of postmodern historiography.

© Copyright 2011 Martin Paul Eve, University of Sussex

work_2m3ubeq33ncujnaa7gcfv62hwy ----

Towards an African Indigenous Model of Communication for Software Development in Digital Humanities
Akin-Otiko Akinmayowa
Augustine Akintunde Farinola

Abstract
Drawing insight from Toyin Falola's call for African scholars to Africanize knowledge, this paper argues for a review of the digital technological tools being used for research in African studies in order to adequately capture and properly process and present African data.
To achieve this, the inadequacies of Digital Humanities for specific areas of African Studies will be highlighted, especially in the deployment of digital humanities tools. The major challenge is the distortion and constraint experienced in processing and presenting research through the use of Digital Humanities' tools of translation and communication. The paper argues that this technological limitation has its root in the incompatibility of the epistemological frameworks within which those digital tools were developed. The paper discusses 'Ojú lòrówà', a theory of communication in an African society, as a model that highlights the importance of African context to African scholars in their exploration of African history, technology, culture, philosophy and tradition. This indigenous theory is an appropriate model for developing digital as well as virtual software for human communication among African scholars. The paper concludes by enjoining scholars in African Studies to ensure that the digital tools employed in African studies are not only able to collect data, but also able to process and present data adequately without losing the original meaning or sense.

Keywords: African Studies, data-presentation, data-processing, 'Ojú lòrówà', software development, digital humanities.

Introduction
In our increasingly digital world, it is expected that scholars in the humanities will embrace computer software and programs designed as tools for research. However, experience has shown that scholars in African Studies have encountered difficulties in using those tools to capture a true representation of African heritage in the light of Africa's indigenous concepts, phenomena, beliefs and worldviews. Instead of discouraging the use of these digital tools, we are developing a conceptual and epistemological framework for developers, as well as identifying the appropriate tools for research in Africa. To achieve this objective, we shall embark on a brief conceptual analysis and discussion of the humanities, digital technologies in the humanities, the idea of communication and digital communication, and the technological tools used in research in the humanities. Then, we shall examine the inadequacies of some digital humanities tools for researchers in African Studies and argue that they were built on a theory of communication within the framework of Western and Oriental knowledge and belief systems. This necessitates the explication of the idea of 'Ojú lòrówà' as a theory of communication to address this epistemic imposition on the technologies of the digital humanities. We will then deploy this theory to address the identified limitations of digital technological tools developed to be used by scholars in the humanities.

The Humanities: Definition, Disciplines, Goal, and Peculiarities
The term 'humanities' comes from the Latin word 'humanus', meaning 'human' (Vito R. Giustiniani, 1985). The humanities are thus considered a loosely defined group of academic subjects united by a commitment to studying aspects of the human condition. These subjects produce reflections and thoughts on human experiences and practices. The disciplines of the humanities include history, anthropology, literature, art, philosophy, and law, political and cultural studies. The study of the humanities helps us to understand human values and how these values translate into knowledge, attitudes, policies and inventions for the advancement of commodious living and the common good (Godwin Sogolo, 1981).
Digital Humanities and Its Technological Tools
The Digital Humanities (DH) currently incorporates both digitized and born-digital materials and combines the methodologies of traditional humanities disciplines (such as history, philosophy, linguistics, literature, art, archaeology, music, and cultural studies). It provides computing tools (like data visualization, information retrieval, data mining, statistics, text mining) and digital publishing tools (Arjun Sabharwal, 2015). Today, scholars in the humanities are using chat rooms, bulletin boards, and social networking websites for academic interactions. These digital technologies help digital information to travel over thousands of miles, thereby making research findings shareable within a global academic community. To be a member of these cyber communities, one simply needs a networked computer, or a computer that is connected to a larger system of other computers (Albert Borgmann, 1999). Furthermore, it is becoming easier than ever for scholars, through the use of technology, to validate, track, and cross-check information (Anne Burdick et al., 2012). The most interesting aspects are the easy access to primary source materials, the help in understanding texts written in different languages, and the preservation of digital resources for the future. Digital humanities technologies have enhanced perception, automated analysis, modelling and simulation, easy searching for books, interactive music scores, dynamically generated maps, and other multimedia and digital resources or repositories.

Most scholars in the humanities currently make an effort to digitize their works. The first step in doing this is the digitization of texts, images, and other data (e.g., survey data, videos, etc.); the next is the delivery of that data via the web (Ibid.). The digitization of text has helped scholarship, as many more people can access virtual libraries, museums, and archives across the world (Ian Foster, 2011). This has aided historians, folklorists, digital humanists, ethnologists, anthropologists, and archivists in the process of collecting, preserving, understanding, interpreting, and retelling the stories of humanity (Douglas A. Boyd and Mary A. Larson, 2014).

It does not require advanced computing or programming skills to benefit from the opportunities offered by digital humanities tools. Media outlets such as YouTube or SoundCloud offer near-instant and free distribution of audio and video oral histories, while digital repository and content management systems like CONTENTdm, Omeka, or even Drupal or WordPress provide powerful infrastructure for housing oral histories in a digital archive or library. Systems such as OHMS (Oral History Metadata Synchronizer) now provide free opportunities to enhance access to oral histories online, connecting a textual search of a transcript or an index to the correlating moment in the online audio or video interview (a minimal sketch of this idea appears at the end of this section). Mobile applications like Curatescape offer enormous opportunities for collecting, curating, and disseminating interviews and projects (Ibid.). It must however be stated that, as they offer benefits, these tools also pose potential threats, such as increased vulnerability of narrators, infrastructure obsolescence, and a host of other ethical issues (Ibid.). Most of these technologies are built on the mandate that we have to be online and connected to a source of power, and this imperative creates a significant sense of dependence.
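The transcript-synchronization idea behind systems like OHMS can be made concrete with a small sketch. The code below is our own minimal illustration, not OHMS's actual implementation: the segment data and function names are invented for the example.

```python
# A minimal sketch of transcript-to-audio synchronization in the spirit of
# OHMS. This is an invented illustration, not OHMS's real code or data model:
# each transcript segment is paired with the second at which it begins, so a
# text search can tell a player exactly where to seek in the recording.

segments = [
    (0.0,   "My grandmother told this story every harvest season."),
    (42.5,  "The praise singer and the drummers answered each other."),
    (118.0, "Greeting the king at the palace gate was never optional."),
]

def find_moments(query, transcript):
    """Return the start time (in seconds) of every segment matching the query."""
    q = query.lower()
    return [start for start, text in transcript if q in text.lower()]

print(find_moments("king", segments))  # [118.0] -> seek the player to 1:58
```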
This dependence is easy to see: with the advent of Apple iCloud, Amazon, Microsoft and other online storage systems, one no longer needs the memory of one's own computer, because everything one writes, photographs and records will be saved in the 'cloud' or on a server somewhere, which one can access anywhere in the world. In other words, cloud storage allows scholars to manage their data in an infinitely more convenient way, so that it is synchronized across our growing collections of information appliances (Domenico Fiormonte et al., 2015).

Communication and Research in the Humanities
Research is a process by which human beings investigate and obtain an understanding of the world. Today, the way in which research is carried out is changing, either for better or for worse. We use our brains along with technological aids so as to enhance our limited biological capabilities (Ian Foster, 2011). In our present-day society, electronic communication plays a vital role within the academic community, such that anyone ignorant of the use of digital tools would become nearly invisible in the global academy. This is not a surprise, since the seeds of modern digital technologies were planted many centuries ago and developed with the research of renowned scholars in the humanities (Albert Borgmann, 1999).

Researchers in the humanities currently use tools such as telephones, cell phones, e-mail, and so on (Ananda Mitra, 2010) for interpersonal communication, while electronic bulletin boards, chat rooms, digital conferencing, and small private digital networks are used for group communication (Albert Borgmann, 1999). These tools have enabled authors or researchers, editors, technicians, publishers, librarians, readers and audiences to interact without even meeting physically. But there have also been observations about the limitations that researchers face in these engagements. Daniel O'Donnell's review of global participation of researchers in digital humanities suggests that digital activity may be correlated with the economic situation of a country, such that countries with high income will most likely witness high participation in digital activities, while countries with average or low income (where African Studies fits) have partial or low participation (O'Donnell, 2012). Beyond the link the Digital Humanities has with economics, there is also the necessary link it has with culture and context, such that Digital Humanities finds it easier to express data within the worldview of the coder and developer than within the worldview of the user and learner, to which many African scholars belong.

Inadequacies of Tools in Digital Humanities
As scholarship moves from the libraries and the lecture halls to digital communication networks in order to deploy Digital Humanities, 'an interdisciplinary academic field that is focussed on the development and use of applications that improve the quality of research and teaching in the humanities' (Babalola, 2014), researchers are faced with new challenges, such as collaborative authoring, multiple versioning, flexible attitudes toward intellectual property, peer contributions, access to multiple and multiplying communities, and an overall pattern of distributed knowledge production, review, and use (Ibid.).
Some of the real problems which the use of digital tools has engendered include immersion in virtual communities rather than human communities; deception; misinformation or vulnerability of information; phishing; sudden loss of data; spamming and unwanted digital communication (Ibid.); open-source knowledge; and the lack of a bridge between academic and social life. Digital humanities scholars like Anne Burdick, Johanna Drucker, Peter Lunenfeld, Todd Presner, Jeffrey Schnapp, and others have expressed their fear that 'as humans and data machines become equal partners in cultural practice, social experience, and humanistic research, the humanities may no longer look like "The Humanities"' (Anne Burdick et al., 2012). They pointed out the negative contributions the tools of digital humanities have had on the tension between those in the humanities who now solely embrace quantitative methods and those who insist on qualitative analysis. This is a tension that has integrated the quantitative wing into the social sciences, while the other wing fights to defend its autonomy and critical stance (Ibid.). Thus, the digital humanities scholars enjoined us, as the next generation of digital experimenters, to contribute to humanities theory by forging digital tools that quite literally embody humanities-centered views regarding the world.

It is not helpful to classify Digital Humanities as simply unhelpful or dangerous to African Studies; research shows that technology is positively impacting researchers and students. Babalola (2014) noted that in 2008 Lawal conducted a survey on the level of computer literacy and the use of the internet for research among the students and staff of computer science and engineering faculties in a Nigerian state university; the result revealed that ninety-four percent of the respondents were computer literate (Lawal et al. 2008). Beyond the positive side is the limitation of Digital Humanities for researchers in African Studies: Digital Humanities does not fully represent the context and meanings that African ideas and worldviews carry when it tries to process and present data. It is this lack of adequacy in DH that this paper points out and attempts to engage. Hence the force of 'the most cutting criticisms of digital humanities: that it constitutes a naively positivist refuge from cultural studies, critical race theory, postcolonial theory, and other scholarly methods designed to surface the concerns of marginalized communities' (Brier, 2012:390).

Research has shown that Digital Humanities lacks, for now, the know-how for detailed processing and presentation of data from African studies: 'African writers are at times forced to relate their worldviews in Western colonial languages which do not often lend themselves easily to expressing African sociocultural reality' (Bandia, 1996:139). There are African ideas that cannot, so far, be completely and properly captured when data is translated or interpreted into the languages or programmes that current Digital Humanities developers have and know. Global languages and representations many times shut out African contexts and particular views, and ideas go missing. For many Africans, words (signs, representations) are never adequate to fully express salient ideas, and there are issues and ideas that are commonly hidden in words and signs such that only a trained person can understand the hidden ideas.
This epistemic framework is contained in sayings such as àbò òrò làá so f'ómolúàbí, tí ó bá dénú rè á di odindin (half a word is spoken to the wise; once heard, it becomes complete). This sets the background for the epistemological framework of 'Ojú lòrówà'. There is so much that is said when eyes meet in communication, and that which is not said can also be understood. These unspoken and unpresented ideas represent a portion of data that so far has no representation or equivalent in the digital space, and this makes it important to develop Digital Humanities in the context of Africans. Existing digital tools cannot adequately process or present African ideas, and now is the best time to begin to make changes, since Digital Humanities, still 'a relative newcomer to the media scholar's toolkit', is 'notoriously difficult to define' (Posner, 2018) and so gives room for the additions and adjustments required for a clearer processing and presentation of data in African Studies. If, as Posner (2018) notes, 'most digital humanities practitioners would agree that the digital humanist works at the intersection of technology and the humanities (which is to say, the loose collection of disciplines comprising literature, art history, the study of music, media studies, languages, and philosophy)', then there should be a corresponding effort to develop tools that can adequately and correctly process and present African ideas with correct interpretation. Raising the issue is not enough; it is important to engage it, because 'digital humanities has very real problems with racial diversity and gender representation in its scholarly community' (Posner, 2018). The concern of this paper is to engage the second and third layers of DH engagement (processing and presentation) with particular reference to African Studies; there seems to be little concern with the 'sourcing' layer, since sourcing happens at all levels of research, from fieldwork to uploading ideas into machines. It is important to engage these issues: Digital Humanities does not constitute a new discipline in itself, but rather a new approach to humanities research that cuts across different existing humanities disciplines, yet its effect is not yet adequate in African Studies, because 'Although many Nigerians have acquired skills that are useful in digital humanities, and though the internet and computers are widely used for research purposes across the country, the integration of digital tools into the educational system is very low' (Babalola, 2014).

Importation and Use of Computer and Digital Technologies in Africa
The use of computer and digital technologies in Africa has led to what can be called 'technological dependence' – a situation in which almost all the technologies that we boast of in Africa have their roots in America and Europe, with indications that we have no total control over them. This poses a huge difficulty in the presentation and processing of ideas, especially where context influences the meaning and representation of ideas. The technological globalization agenda has not achieved the desired result of a global representation of ideas and views. There is a sort of theoretical framework which enables what most scholars would call 'the politics of technology'; this basically reflects the knowledge interests of the major game players, the major scholars in DH.
Attempts to Close the Gap
There is no doubt that 'Technology has certainly made leaps and bounds over the past fifty years, yet it is evident that many conversations about Africa from external perspectives have remained somewhat stunted' (Falola and Sanchez 2016:2). The shortfall in Digital Humanities' politics of ideas has raised up notable African linguists and technologists who have now recognized the need to develop digital humanities technologies built on African indigenous knowledge systems and ontologies. Such individuals include Tunde Adegbola, Tunde Opeibi, Victor Odumuyiwa, Frank Ugiomo, and many others. This shift is not limited to Africa, as Microsoft and Google have been working to incorporate indigenous African languages into their software, which is used by millions of scholars and researchers in Africa. This quest has led to the emergence of African linguists and information technology experts on the scene of the localization of computer technology and the Africanization of cyberspace. While the digital humanities extend well beyond language-based research, textual resources and spoken-language materials play a central role in most humanities disciplines. In the digital humanities, scholars have begun to see an increased emphasis on anthologies, especially for the purposes of annotation and data integration.

Adegbola's approach to computer and digital technology satisfies, to a great extent, the requirement of ordering technology for the good of a society. First, he mastered the principles of technology and gained expertise in programming languages. He then contributed immensely to the development of Human Language Technology (HLT), and this has led to the localization of computer and digital technological tools in Africa. His investigation of African languages from acoustic, information-theoretic and linguistic perspectives led to the development of theories and frameworks for designers and developers of African-based Human Language Technology (HLT).

The concerns of these scholars raised the need to create technology within the ontology and epistemology of any indigenous society, and to avoid foreign ones that could destroy the local language and culture. This argument is built on the assumption that there is no account in history of a people who became great after adopting the culture and language of other people. Such a society will not be able to connect its acts and activities with its behaviour, nor allow technology to respond to its culture. For instance, when most African societies first came in contact with the mobile phone, it was a communal device. It should be made clear that technology cannot consider the ontology of a society by itself; rather, it is the designers that need to be motivated by the questions asked by the culture. It is in the course of technology transfer that one begins to ask which part of our culture it conforms to. Representation of the reality around us can be done either analogically (by creating an analogy between the known and the unknown) or digitally (in which values in the analogical sense are represented by numbers and compared with their equivalents in reality).
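As a minimal sketch of the digital mode of representation just described, consider how a continuous, 'analogical' quantity is replaced by a number that can then be compared with its equivalent in reality. The values and function below are our own invented illustration:

```python
# A minimal sketch (invented for illustration) of digital representation:
# a continuous, "analogical" value is stood in for by a number, and the
# comparison with reality shows what the representation keeps and loses.

def digitize(value, step=0.5):
    """Represent a continuous value by the nearest multiple of `step`."""
    return round(value / step) * step

analog_reading = 3.14159            # a continuous quantity from the world
digital_value = digitize(analog_reading)

print(digital_value)                        # 3.0 -- the number standing in
print(abs(analog_reading - digital_value))  # ~0.1416 -- the detail lost
```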
Writers have begun this attempt, through the use of “a characteristic feature of African creative writing [called] code-switching (CS) and code-mixing (CM) as a writing technique. CS and CM have a social, discursive and referential significance in a text” (Bandia, 1996:139). These reconstructed ideologies must then be incorporated into computer and digital technologies that characterize this milieu. This is in line with what an African Scholar Kofi Awoonor opines that science and technology must be grafted upon African social and cultural realities, without losing sight of the original humanistic impulse of their communal existence (Awoonor, 2006). 10 This is reflected in CS, “In code-switched discourse, the items in question form part of the same speech act. They are tied together prosodically as well as by semantic and syntactic relations equivalent to those that join passages in a single speech act” (Romaine 1989:111). What then is the possibility for the localization of digital humanities technologies on African Studies for better processing and presentation. A typical African setup or research field has Folklores, proverbs and parables are folkmedia and means of information dissemination in Nigerian towns and villages (Nwuneli, 1983; Akpan, 1977; and Otasowie, 1981). Folkmedia are intangible artefact of a culture, made up of customs, traditions, stories, songs, religion, performance arts and superstition and these can pose difficulty to tools used in Digital Humanities. The concept of ‘Ojú lòrówà’, as a theory of communication, addresses some of the limitation in the existing Digital Humanities tools which serves as bedrock for the processing and presentation of research ideas in African Studies. ‘Ojú lòrówà’ is a Yoruba statement which could be literally translated as ‘discussion is in the eye’; that is, ‘communication takes place when we see physically’. The eyes have always been a formidable means of initiating, sustaining and emphasizing details of conversation among the Africans and not just among the Yoruba. According to Nwuneli “In some cultures it is considered sincere and trustworthy when a person looks straight in the face or,...looks [at] you right in the eyes. In other cultures it is rude and impertinent to “catch somebody’s eye” during conversation. In some cultures, people express themselves non-verbally by the mimicry of the face” (1983:148). As a communication framework, ‘Ojú lòrówà’ has five major components: 1. Coding: In this theory, like every communication, there has to be an operative coding system understood by the parties involved. Ojú lòrówà demands eye contact, or even contact through any of the other senses, for there to be communication using known and agreed on code. Holding one’s ear while talking with a child or another person, for the Yoruba, is a sign of warning. This may pass an entirely different communicative meaning in some other cultural contexts across the world. The reality of Digital Humanities can at best capture a sense of warning, but it will find it difficult to process how the holding of ears translates to warning. Beyond the 11 processing, the limitation of the existing tools become visible in its inability to communicate to the listener or reader why the warning is being issued. The general popular warning sign learnt by road users or public space users will not work here in the context of the African. 
The same sign can mean different things; the face (ojú) adds the context. If a mother holds her ears while looking at a child in a friend's house, it simply means 'I am warning you about what we discussed earlier'; here, the facial expression provides the context. But if the same mother does the same thing at home and the mood is happy, the facial expression again provides the context: the mother may hold her ears and still be smiling, and the child understands that the situation is milder and may not require a total halt to whatever is being done. There are no Digital Humanities tools that can fill this expression gap; the fixed tools are too mechanical for many African expressions.

2. Privacy in communication: This theory ensures and protects privacy in communication, on the ground that only the child, or any other person in the know of the code, can decipher what is being communicated. Persons outside the code will not understand it. This applies to codes that are specially developed by a group of people for particular communication, either to exclude others for the sake of privacy or to password their communication. Developed codes serve as gateways to the meanings and contents to be processed and presented. Such passwords are learnt and accessed by people who are trained across the globe, which is part of the aim of Digital Humanities; but Africa is replete with information that is coded and limited to specific groups and contexts. This information is excluded from the tools of Digital Humanities, either because the data is not made available given the nature of Digital Humanities, or because the data belongs to coded groups that have no representation in the present sphere of Digital Humanities. A good example would be the content of different cults in African contexts.

3. Participation: This theory ensures effective communication by making sure that the subject and object of communication are totally immersed in the process through participation. For instance, eye contact leaves no doubt as to whether the message was delivered or received. Communication is usually straight to the point; the message is usually clear and brief. Every sign is given only when there is eye, sound or touch contact between the parties engaged in the communication. Studies have shown that Africans are expressive, and messages are passed swiftly, just as they can be changed using the same code. This has no equivalent in Digital Humanities, where codes are representative of particular messages. The swiftness and flexibility of messages require participation. Communication is like a game where every player is expected to be focused, for success and team play.

4. Concentration: This theory sustains concentration by ensuring that messages are brief and straight to the point. For effective communication there is always the need to concentrate; that is why messages are given only when there is eye contact. Whenever there is no eye contact, a sign (a cough, the tapping of a finger, etc.) is given to draw the attention of the person who is to receive the communication, and once attention is gained, an attempt is made to sustain the concentration of the other party. Data in African Studies are continuous in nature; they are not a once-and-for-all encounter. When information is passed, a relationship is built as well. Persons involved in data sourcing are expected to maintain regular contact with the information for it to remain relevant.
A password only makes information available; the effectiveness of the information is achieved through the proper use of context. For example, the tradition of greeting the king when one is passing by his palace will hold true, but there are certain times when it will not hold.

5. Feedback: This theory ensures feedback, which usually takes the form of a reaction to communication. What is communicated is either understood or not; once understood, a sign is given to indicate that communication has taken place. If the code is not understood, a sign is given to indicate that the message is not understood. It is never presumed; an affirmation or denial of the message is always given.

As the world becomes a global village and 'moves into this increasingly transnational and global age, it is more and more evident that homelands and identities are profound spaces for social, political, cultural, and academic engagement in Africa and beyond' (Falola and Sanchez 2016:1), there is a need for an effective theory of communication that can source data and, beyond that, adequately process and present data from African Studies. Lessons can be learnt from writers: 'When African writers cannot adequately express African sociocultural reality in a European language, they resort to the use of indigenous words and expressions' (Bandia, 1996:141). Digital Humanities experts must also realize that there are ideas that cannot be captured unless the African context and African tools are deployed to facilitate communication among functionaries in the humanities. In the development of those tools (software), it would be necessary for the developers to put these five components into consideration in the course of their brainstorming (a minimal, hypothetical sketch of what this could look like is given below).

8.0 Conclusion
Digital Humanities is still developing, but the reality of African Studies is beyond material representation: there is òrò, which represents the material data that has been sourced; but beyond the material data there is also ojú, which gives the context and further details of the material substance that has been gathered. This, for now, is beyond the developed programs and tools of Digital Humanities. In this article, we have illustrated how digital technological tools can be Africanized using 'Ojú lòrówà' as an indigenous theory suitable for developing digital as well as virtual software for African scholars. We have addressed the causes of the challenges confronting scholars in African Studies in their deployment of digital humanities tools. We have also identified beneficial digital tools that could genuinely promote research in African Studies, and discouraged the usage of digital technological tools such as AntConc, NVivo, e-translators, and the like, on the basis that they distort, or impose a certain framework on, literatures in African history, techne, culture, philosophy and tradition.

Furthermore, we enjoin scholars in African Studies to provide contents and frameworks for appropriate digital technological tools for research in Africa. Effort must be made to develop multimedia documentaries, archives of cultural movies, digital translators for African languages, and virtual galleries that would display African sculptures, arts, artifacts and antiques. In this age of technology, there is a need for scholars in African Studies to extract, from our indigenous practices, the theories and frameworks that would help software developers to create tools appropriate for our knowledge system.
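The sketch below shows one way a developer might begin to encode the five components of 'Ojú lòrówà' in software. Every name, sign and meaning in it is our own hypothetical illustration, not an existing library or a complete implementation of the theory:

```python
# A hypothetical sketch of a message model shaped by 'Ojú lòrówà'.
# All signs, contexts and meanings are invented for illustration.

from dataclasses import dataclass

@dataclass
class Message:
    sign: str                    # Coding: the agreed sign, e.g. holding an ear
    context: str                 # ojú: the situational cue that fixes meaning
    audience: set                # Privacy: only those who share the code decode
    acknowledged: bool = False   # Feedback: receipt must be signalled

# The same sign carries different meanings in different contexts.
MEANINGS = {
    ("hold_ear", "stern_look_in_public"): "serious warning: stop at once",
    ("hold_ear", "smiling_at_home"):      "mild caution: be careful",
}

def decode(msg, reader):
    # Participation and Concentration would gate this step: in the theory,
    # no sign is given until eye (or sound/touch) contact is established.
    if reader not in msg.audience:        # Privacy in communication
        return None                       # outsiders receive nothing
    meaning = MEANINGS.get((msg.sign, msg.context), "unknown sign")
    msg.acknowledged = True               # Feedback: understanding is signalled
    return meaning

msg = Message("hold_ear", "smiling_at_home", audience={"child"})
print(decode(msg, "child"))      # -> "mild caution: be careful"
print(decode(msg, "stranger"))   # -> None
```

The point of the sketch is the lookup key: meaning is a function of sign and context together, which is precisely what tools that encode the sign alone cannot capture.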
On a final note, we advocate a minimum digital literacy for African researchers and scholars in the humanities. This will ensure the availability of their works online, through academic social media and communities such as Academia.edu or ResearchGate, thereby making the research efforts made in Africa available to the global academic community. 'Digital Humanities work has been criticized as empiricist, secular, and reductive of the creativity of human expression to a mathematical elegance that perhaps no longer carries the evocative mysteries of the original object or experience of it' (Hall, 2017). If the existing Digital Humanities tools are not improved upon by including and using African contextual digital representations, meanings will be lost, details will be sacrificed and, fundamentally, ideas will be misrepresented.

About the Authors
Akinmayowa Akin-Otiko currently works at the Institute of African and Diaspora Studies, University of Lagos. Akinmayowa does research in African Traditional Medicine, Religion and Culture. Their current projects are on: i) the use of Traditional Medicine for primary health care; ii) Religion and Culture of the Yoruba in Nigeria.

Augustine Farinola is currently a researcher at the Department of English Literature, University of Birmingham, United Kingdom. His research focuses on the postphenomenological analysis of Digital Humanities (DH) technological tools used for scholarly communication and linguistic analysis.

REFERENCES
Babalola, T. L. (2014). The Digital Humanities and Digital Literacy: Understanding the Digital Culture in Nigeria. Digital Studies/Le Champ Numérique, 5(1).
Bartscherer, Thomas and Coover, Roderick (Eds.). (2011). Switching Codes: Thinking Through Digital Technology in the Humanities and the Arts. Chicago: University of Chicago Press.
Borgmann, Albert. (1999). Holding on to Reality: The Nature of Information at the Turn of the Millennium. Chicago: The University of Chicago Press.
Borgmann, Albert. (2005). Technology. In Dreyfus and Wrathall (Eds.), A Companion to Heidegger (pp. 420-432). Washington: Blackwell Publishing.
Boyd, Douglas A. and Larson, Mary A. (2014). Oral History and Digital Humanities: Voice, Access and Engagement. USA: Palgrave Macmillan.
Burdick, Anne, Schnapp, Jeffrey, Drucker, Johanna and Lunenfeld, Peter. (2012). Digital Humanities. Cambridge, MA: MIT Press.
Das, Apurba. (2010). Digital Communication: Principles and System Modelling. India: Springer.
Davidson, C. N. (2012). Humanities 2.0: Promise, Perils, Predictions. In M. K. Gold (Ed.), Debates in the Digital Humanities. Minneapolis, MN: University of Minnesota Press.
Fiormonte, Domenico, et al. (2015). The Digital Humanist: A Critical Inquiry. Brooklyn, NY: Punctum Books.
Gill, Jim and Vincent, Geoff. (1981). Software Development Handbook. Texas: Texas Instruments.
Meinel, Christoph and Sack, Harald. (2014). Digital Communication: Communication, Multimedia, Security. Berlin, Germany: Springer.
Mitra, Ananda. (2010). Digital Communication: From Email to the Cyber Community. New York: Infobase Publishing.
Rydberg-Cox, Jeffery A. (2006). Digital Libraries and the Challenges of Digital Humanities. Oxford: Chandos Publishing Limited.
Sabharwal, Arjun. (2015). Digital Curation in the Digital Humanities: Preserving and Promoting Archival and Special Collections. USA: Elsevier.
Skolnikoff, B. Eugene. (1993). The Elusive Transformation: Science, Technology, and the Evolution of International Politics. Princeton, NJ: Princeton University Press.

JOURNALS
Borgmann, Albert. (2000). Semiartificial Intelligence. In Mark Wrathall & Jeff Malpas (Eds.), Heidegger, Coping, and Cognitive Science: Essays in Honour of Hubert L. Dreyfus, Vol. 2 (pp. 197-206). Cambridge, MA: The MIT Press.
Reynolds, Dwight F. (2016). From Basmati Rice to the Bani Hilal: Digital Archives and Public Humanities. In Elias Muhanna (Ed.), The Digital Humanities and Islamic & Middle East Studies. Germany: De Gruyter.
Sogolo, Godwin. (1981). Literary Values and the Academic Mind: A Portrait of the Humanistic Studies. Ibadan Journal of Humanistic Studies, No. 2.
The Journal of Epsilon Pi Tau, Vol. 4, No. 1 (Spring 1978), pp. 46-51.
Xia, Weidong and Lee, Gwanhoo. (2010). Toward Agile: An Integrated Analysis of Quantitative and Qualitative Field Data on Software Development Agility. MIS Quarterly, 34(1), 87-114.

work_2m65gksq2rbwvovm24iljmys6q ----

HAA/ENGCMP 0425: Digital Humanity
Syllabus for Spring 2019
University of Pittsburgh
Last Updated: April 6, 2019
Creative Commons License: Attribution – Non-Commercial

Alison Langmead
Department of History of Art and Architecture and School of Computing and Information

Annette Vee
Department of English

Description
How have computational devices affected the way we think about our own humanity? Our relationship to digitality has changed from the mainframe to the smartphone, but throughout, computers have processed huge amounts of data, kept track of our (or our enemies') activities, made our lives more fun or at least more complicated, allowed us to communicate with each other, and archived knowledge on a broad scale. What roles do computers play in our lives, and what role do we play in theirs? What are the borders between humans and computers, or can they be drawn at all? This course prepares students to critically examine the intersections between digital devices and human life. Covering topics such as the relationship between computers and humans, surveillance, big data, and interactivity and games, we question what it means to be human in a space of pervasive digitality. Students will read philosophy, fiction, essays, and book excerpts, and watch movies and play games. Assessment will be based on regular online posts to WordPress, a take-home midterm examination, a reflective synthesis of online posts, and class participation. The course fills the Philosophy and Ethics General Education requirement, is a gateway course for the Digital Narrative and Interactive Design major (https://www.dnid.pitt.edu/), and meets three times per week: twice for lecture, once for recitation/lab.

Learning Outcomes
After successfully completing this course, students will be able to:
1. Demonstrate a more sophisticated understanding of the ways digital technologies affect their lives and the lives of others, including potential effects on the experience and concepts of freedom, security, social relations, cognition, and human and digital consciousness (e.g., artificial intelligence).
2. Articulate ways that digital technologies may be used effectively and ethically in their academic and professional careers.
3. Assess their own work, including its suitability for particular audiences, and their strengths and weaknesses as composers.
4. Use different software packages or web services that process text, still images, moving images, and audio files in order to reflect on readings and discussion about digital technologies.

Assignments

Readings
All readings will be due on the Mondays of the week for which they are listed, and we will discuss them throughout that week. There are no required books for the course, and all readings will be available through WordPress or online.

WordPress Posts
Each week you will be asked to post a critical response to the ideas and concepts brought up in class and in the readings to the course WordPress site. The instructors have provided prompts to guide you in creating your posts to the site, all of which are on the syllabus for their given week. Sometimes you will be asked to respond with text, but sometimes with still images, moving images, or audio. You are also responsible for responding thoughtfully to one other student's WordPress post each week (100+ words).

The address of the WordPress site is: http://pittdigitalhumanity.org/

Your weekly posts and your responses to other posts are due by noon on Wednesdays. In other words, readings are due on Mondays; WordPress posts and responses are due on Wednesdays.

The WordPress site is private and invitation-only and will be accessible only to members of the class. The instructors may share the work on this site with selected faculty members at the University of Pittsburgh for purposes of showcasing the course. Any other public use of these materials will be requested specifically from you. If you have questions about your WordPress posts at any time, feel free to get in touch with one of the instructors for feedback.

Image, audio and video assignments
Some of your weekly WordPress posts as well as your Midterm and Final will be in non-textual format. We ask you to compose in images, audio, and video in order to accomplish some of the course learning goals (see especially #4 above). In particular, we want you to have an opportunity to engage with the material in the course in digital ways, and in ways that you might be less familiar with in your other courses at Pitt. We do not expect perfection in these modes of composition. Some of you might be experienced, others are novices. But we do expect that you compose thoughtfully in these media formats, and that you use your resources at Pitt to help you succeed in these compositions. Some resources you might want to use: visits to the UTA's office hours, Hillman Library's One Button Video Recording Studio (https://www.library.pitt.edu/one-button) and Whisper Sound Recording booth (https://www.library.pitt.edu/whisper-room), open homework hours at the English Department's Digital Media Lab in 435 CL (https://dmap.pitt.edu/hours), and equipment loans through Hillman (https://pitt.libguides.com/equipment/hillmanequipmentcollection).
Graded WordPress Feedback (First and Second)
We will provide graded feedback on your WordPress posts twice during term, and the entire class will also be discussing selected posts periodically in recitation. We will take both the posts and the responses you add to the site into consideration when composing your grade.

Midterm Exam
The midterm exam will be a take-home assignment. We will distribute the exam prompt on Weds, Feb 20 and the exam will be due to WordPress by the end of the day of Fri, Mar 1.

Reflective Synthesis (Final)
The final exam will be a take-home assignment. We will distribute the exam prompt on Mon, Apr 8, and the exam will be due on Apr 19.

Face-to-Face Participation
Your face-to-face participation grade will be based on your attendance and your substantive participation in large and small group work in both lecture and recitation. Once a week, at the beginning of each lecture, you will be asked to write an "Entry Card," on which you offer one thing that you were curious about in the readings and one connection that you can make between the readings and your lived experience. The quality of your work on the Entry Cards will be factored into this face-to-face participation grade. It will be important to attend lectures and recitations, as this latter meeting is the time when we will mull over and extend the information found in the readings for the week as smaller groups. Missing more than two lectures or recitations will negatively affect your face-to-face grade. If you miss more than two weeks' worth of class, it will be very difficult for you to pass the course.

We will be offering two movie nights during the term, and one of them will be mandatory to make up for the later start date for Spring Term 2019 (see below). If you attend both, the second movie night can be used to replace attendance for one missed lecture or recitation. You will also be asked to post an accompanying WordPress piece that summarizes your thoughts on the movie's relationship to the themes of the class. You need to both attend the movie and write the post to get credit.

You have another opportunity to make up one absence: attend a Steiner lecture at CMU's Studio for Creative Inquiry. The lectures are public and relevant to our course--feel free to attend all of them! You can make up one absence by attending a lecture, and posting to the WP site about that lecture and its connections to the course. The schedule is posted here: http://studioforcreativeinquiry.org/events/spring-2019-steiner-lectures-in-creative-inquiry

The other opportunity to make up one absence is to attend a workshop offered by Hillman Library: https://pitt.libcal.com/calendar/today/?cid=2274&t=d&d=0000-00-00&cal=2274&ct=35349

Assessment
First Feedback on WordPress Posts    Week of February 18th               15%
Second Feedback on WordPress Posts   Week of April 8th                   20%
Midterm Exam (take-home)             March 1st (by noon, to WordPress)   20%
Reflective Synthesis (Final)         April 19th (by noon, to Courseweb)  25%
Face-to-Face Participation           all term                            20%

Course Policies

Inclusivity Policy
Your success in this class is important to us. We recognize that everyone learns differently. Although we have designed the course to tap into different learning styles (online participation, f2f participation, videos, text, drawing, handwriting, etc.), we recognize that everyone will find different aspects of the course challenging. Challenge is good!
But if there are aspects of this course that prevent you from learning or exclude you, please let us know as soon as possible. Together we'll develop strategies to meet both your needs and the requirements of the course. We encourage you to visit us in office hours (see above) and also Disability Resources and Services to determine how you could improve your learning as well. Disability Resources and Services is located at 140 William Pitt Union, 412-648-7890 or 412-383-7355 (TTY)--please contact them and us as early as possible in the term. If you need official accommodations, you have a right to have these met. There is also a range of resources on campus, including the Writing Center and the Counseling Center, and the online mental health resource, ULifeline.

Names and Pronouns Policy
Sometimes it's hard to know what to call your instructors: when in doubt, ask them! We go by Prof. Vee and Prof. Langmead and use she/her pronouns. Erin O'Rourke, the undergrad TA for the course, goes by Erin and uses she/her pronouns. If you have preferences about pronouns and naming for the class, please let us know and we will respect your wishes. We also request that you respect the wishes of our classroom community.

Academic Integrity Policy
Cheating or plagiarism on any assignment or exam will not be tolerated. Plagiarism is using someone else's words, research, or ideas as if they are your own. Please see us if you are unclear about this policy, or check out Pitt English's resource on Understanding and Avoiding Plagiarism. If you ever use someone else's text word for word in your own writing, you must enclose those words in quotation marks and cite the source; if you paraphrase from a source, you must cite it as well. If you try to pass off someone else's writing or research as your own in any assignment for the course, you may receive an F for the course, and be reported to the Dean's office for disciplinary action pursuant to the school's academic integrity code (http://www.as.pitt.edu/faculty/policy/integrity.html).

Email Communication Policy
If you do not ordinarily use your Pitt email address, please make sure that the Pitt address is forwarding properly to whatever email address you do use, since we will be sending messages this way. We expect that you will read your email regularly and may communicate about any course changes via email. The University provides an email forwarding service that allows students to read their email via other service providers (e.g. Google, Yahoo), but you do so at your own risk. To forward email sent to your University account, go to http://accounts.pitt.edu, log into your account, click on Edit Forwarding Addresses, and follow the instructions on the page. Be sure to log out of your account when you have finished. (For the full Email Communication Policy, go to http://www.bc.pitt.edu/policies/policy/09/09-10-01.html.)

Before emailing your instructors a question about the course, please check the syllabus to see if your answer is here. The instructors for your course will attempt to answer your emails about the course promptly.

Weekly Chart

Scheduling Note
Because of the newly-instituted, later start date of Pitt's Spring Terms, all courses that meet on Mondays during spring are required to make up a class session outside of the traditionally-scheduled class meetings.
For this additional class session, you will be required to attend a movie night organized by the instructors and then to post an accompanying WordPress piece that summarizes your thoughts on the movie's relationship to the themes of the class. There will, in fact, be two such movie nights offered in total. The exact times will be arranged by the instructors, the undergraduate TA, and the participants in the class, and they will take place before Finals Week, as required.

Week  Subject
1.  January 7/9     Introduction / Creating Computers
2.  January 14/16   Can Machines Think? Automata and Algorithms (AI I)
3.  January 23      Hardcoding Abstractions
4.  January 28/30   Processing Encoded Information
5.  February 4/6    Surveillance Society
6.  February 11/13  Watching Networks Watching You
7.  February 18/20  Computers + Humans: Augmentation or Symbiosis?
8.  February 25/27  Midterm Week
9.  March 4/6       Artificial Intelligence and the Human Mind (AI II)
    Spring Break
10. March 18/20     Human Labor in Computing
11. March 25/27     Old-School Games
12. April 1/3       Computational Creativity
13. April 8/10      Becoming a Digital Citizen
14. April 15/17     Synthesis and Peer Evaluation

Class Schedule
This schedule will change somewhat over the course of the semester–readings cut or swapped out, etc. Please check http://pittdigitalhumanity.org/pages/syllabus for the up-to-date version of this syllabus.

Week 1: What Are Computers? What Are Humans? Creating Computers
Where did computers come from, and what makes them tick? We'll start to look at the history of information processing and digital computing and learn a bit about how contemporary computers and peripherals actually work.

● Martin Campbell-Kelly, William Aspray, Nathan Ensmenger, and Jeffrey Yost, Computer: A History of the Information Machine, Third Edition (Boulder, CO: Westview Press, 2014), 21-40 ("The Mechanical Office"), 65-85 ("Inventing the Computer"), 253-264 ("Broadening the Appeal"). [Available to read online through PittCat]
● Bettina Bair, "Inside Your Computer," TED-Ed, July 1, 2013. https://youtu.be/AkFi90lZmXA
● Daisuke Wakabayashi and Kate Conger, "Uber's Self-Driving Cars Are Set to Return in a Downsized Test," The New York Times, December 5, 2018. https://www.nytimes.com/2018/12/05/technology/uber-self-driving-cars.html
● Wilton L. Virgo, "How Does Your Smartphone Know Your Location?" TED-Ed, January 29, 2015. https://youtu.be/70cDSUI4XKE
● Optional: Hidden Figures, 2016 movie, directed by Theodore Melfi. Available at Hillman's Stark Media Services.
● WordPress Post for Week 1: Accept the invitation to join the WordPress when it arrives in your email inbox--check your Quarantined Messages folder found through http://my.pitt.edu (not your Junk Folder in Outlook, but "Quarantined Messages"), the invitation very likely went there. Then, set up your WordPress account, and write a first post answering two questions: 1) What's most exciting to you about computers and computation? 2) What makes you fearful about computers and computation? Your post should be 200 words or more, and it could cover the future or past of computation, your personal relationship with computers, your chosen major or profession, etc. Feel free to integrate images or links in your post. The best posts will go beyond simple observations. After you post, please read some of your classmates' posts and choose at least one to comment on. Again, we're looking for something more substantive than "hey, interesting idea."
You can make connections to your own post, offer additional information, or links, observations, history, etc. Categorize your post Week 01.

Week 2: Can Machines Think? Automata and Algorithms
We have long hoped that machines could think and learn like humans, from the "Mechanical Turk" that impressed the French queen and Benjamin Franklin in the late 18th century, to the early days of computational artificial intelligence in the 1950s, to our reliance on Google's algorithms to tell us what's important, to the use of algorithms in determining prison sentencing. How successful are these attempts to make machines do our thinking for us? How do these automata and algorithms reinforce or perpetuate human biases, or do they correct these biases? What does our use of these automata and algorithms say about us?
● Alan Turing, "Computing Machinery and Intelligence," Mind: A Quarterly Review of Psychology and Philosophy 59(236):433-460, Oct 1950; reprinted in The New Media Reader, Eds. Montfort and Wardrip-Fruin, MIT Press, 2003. Available as a PDF on Pitt Box.
● "The Box that AI lives in," episode of Secret History of the Future podcast, Sept 10, 2018. https://slate.com/technology/2018/09/secret-history-of-the-future-podcast-intro.html 33min. Covers the original "Mechanical Turk" from the 1780s, Amazon's Mechanical Turk, reCAPTCHA, and connections between AI and human labor. Listen alongside this article: Ella Morton, "Object of Intrigue: The Turk, a Mechanical Chess Player that Unsettled the World," Atlas Obscura, Aug 18, 2015, https://www.atlasobscura.com/articles/object-of-intrigue-the-turk.
● Safiya Umoja Noble, "Challenging the Algorithms of Oppression," talk for the Personal Democracy Forum, published on YouTube, June 15, 2016, 12m18s, https://www.youtube.com/watch?v=iRVZozEEWlE [contains references to pornography and racism]
● Audrey Watters, "Clippy and the History of the Future of Educational Chatbots," Hack Education, September 14, 2016, http://hackeducation.com/2016/09/14/chatbot
● Optional: Black Mirror, "Be Right Back," February 11, 2013. Available on Netflix. [contains sex scenes]
● WordPress Post for Week 2: Tell a story about an algorithm or automaton that has impacted your life. If you're having trouble thinking of something, consider your educational history, your life as a college student, your shopping habits, your transportation routines, and your interactions with law enforcement or government. How did this algorithm or automaton come into your life, what was its intent, how did it work, and what did it do to/with/about you? What was the nature of its "intelligence" and to what extent did you/do you trust it? Your WordPress post should be in audio-only format, approximately 2 minutes. You can upload audio directly to WordPress, but you may alternatively want to use SoundCloud to host your audio clip, then link that to your WordPress post. Please listen to your audio prior to posting to ensure that it is playable, that the volume is appropriate, and that it says what you'd like to say. The best posts will consider their chosen algorithm in light of our readings and be interesting to listen to. Categorize your post Week 02. Remember to do your response comment as well--the comment can be in plain text.

Week 3: Hardcoding Abstractions
You may have heard that computers really only understand 1's and 0's. So, how do the things you type into your computer get translated into language the computer can understand?
● Bell Laboratories, "Incredible Machine," film from 1968, https://youtu.be/iwVu2BWLZqA.
● Jennifer Light, "When Computers Were Women." Technology and Culture 40, no. 3 (1999): 455-483. http://muse.jhu.edu/journals/technology_and_culture/v040/40.3light.html.
● WordPress Post for Week 3: Write a WordPress post, using both text and images, documenting one full day of your interactions with computational devices. Reflect on what you find. Were you surprised by anything? Did anything you learned by doing this exercise change your mind? The best posts this week will use the words and images together creatively to narrate the day and will focus on computational technologies rather than simply electronic devices. Categorize your post Week 03.

Week 4: Processing Encoded Information
One of the reasons humans turn to computers is that they can process a lot more information than our brains can, and much faster. Historically we've used computers to scale up information processing beyond the capacity of the individual human brain: artillery tables, mathematical fractals, reading millions of historical or literary texts, or handling exabytes of data per day from the proposed Square Kilometer Array of radio telescopes. What does it mean to rely on a computer to deal with all the details? How do computers make sense of things that we cannot? Or can they? What are the compromises and assumptions we are making?
● Paul Ford, "What is Code," Businessweek, electronic edition, June 11, 2015, http://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/, Section 1 (all), Section 2-2.4, Section 4 (all), Section 7.5.
● Jacob Gaboury, "A Queer History of Computing: Part Three," Rhizome, April 9, 2013. http://rhizome.org/editorial/2013/apr/9/queer-history-computing-part-three/.
● James Gleick, The Information (New York: Pantheon Books, 2011), 398-426 ("New News Every Day") and 413-426 ("Epilogue"). Available for download here: https://pitt.box.com/s/qnqrgkzrywsp3yvrrim6wnp59b7rydjl.
● "30 for 30 Shorts: The Schedule Makers," directed by Joseph Garner, 12m26s, ESPN, 2013 [https://vimeo.com/75943437]
● WordPress Post for Week 4: Look back over your WordPress post for week 3, and think about what you found carefully. You clearly inhabit a socio-technical system. Where in this system do you see that computers make rules you are subject to? Where can you break those rules to accommodate special situations? Where can you not break those rules and why not? Can you imagine a scenario in which you make rules that the computers are subject to? What would that (does that) look like? Post should be in a text-only format. You're aiming for around 500 words. Categorize your post Week 04.

Week 5: Surveillance Society
This week, we explore one of the major uses for computers' ability to process massive amounts of data: surveillance. As you sit at a screen—your smartphone, an ATM, your Facebook News Feed, or a wall of lighted panels presented for public use—you are watching something. Are you being watched back? Spoiler: yes.
● Gabriel J.X. Dance, Michael LaForgia and Nicholas Confessore, "As Facebook Raised a Privacy Wall, It Carved an Opening for Tech Giants," New York Times, December 18, 2018, https://www.nytimes.com/2018/12/18/technology/facebook-privacy.html
● Rob Kitchin, "No Longer Lost in the Crowd?
How People's Location and Movement Is Being Tracked," The Programmable City, December 3, 2015, http://www.maynoothuniversity.ie/progcity/2015/12/no-longer-lost-in-the-crowd-seven-ways-peoples-location-and-movement-is-being-tracked/.
● Brian Merchant, "Looking Up Symptoms Online? These Companies Are Tracking You," Motherboard, February 23, 2015, http://motherboard.vice.com/read/looking-up-symptoms-online-these-companies-are-collecting-your-data.
● Optional: "Arkangel," episode of Black Mirror, directed by Jodie Foster, Netflix, Dec 29, 2017.
● WordPress Post for Week 5: This week, we want you to notice all of the digital systems that are tracking you. You can focus on one day to remind you of your general activities (like you did in Week 3), but you may find it useful to think about this over a couple of days to get a more general picture of the places and ways in which you're being tracked. Consider surveillance cameras, educational technologies, digital communications, frequent shopper cards or IDs, transportation networks, GPS, and other such technologies. Also, Pitt is tracking you. How? Write up a list of all of the activities in which you think you are tracked, and then write, reflecting on the following questions: Who has this data? In what form does it exist (database, video, logs, etc.)? What permission do they need from you to collect it, if any? How long do you think they keep it and for what purposes? Post should be audio accompanying a series of images, 2 minutes long. Consider the medium's affordances: the best posts will combine voice/audio creatively with images in a sequence that reinforces your narrative. Categorize your post Week 05.

Week 6: Watching Networks Watching You
Digital technology speeds up and complicates surveillance feedback loops. Who or what is recognizing your face or monitoring your activities? To what degree are you in control of this surveillance? Our modes of looking, typing, hearing and speaking interact to create a system where we are not only being watched by devices, corporations and government institutions, but we are also watching each other.
● Michel Foucault, "'Panopticism' from Discipline and Punish," in Ways of Reading, 9th ed., eds. David Bartholomae and Anthony Petrosky (New York: Bedford/St. Martin's, 2010 (orig. 1975)), 282-295 (Abridged). Available on Pitt Box.
● Nathan Jurgenson, "The Facebook Eye," The Atlantic, January 13, 2012, http://www.theatlantic.com/technology/archive/2012/01/the-facebook-eye/251377/
● Kate Losse, "The Male Gazed: Surveillance, Power, and Gender," Model View Culture, January 13, 2014, https://modelviewculture.com/pieces/the-male-gazed
● Drew Harwell, "Fake-porn videos are being weaponized to harass and humiliate women," The Washington Post, December 30, 2018, https://www.washingtonpost.com/technology/2018/12/30/fake-porn-videos-are-being-weaponized-harass-humiliate-women-everybody-is-potential-target/?noredirect=on&utm_term=.13e93edeb5b3
● Optional: John Oliver's Last Week Tonight on Online Harassment (June 21, 2015), or Ashley Judd's TED Talk, How Online Abuse of Women has Spiraled Out of Control (January 18, 2017). Be aware that both of these videos use explicit language and describe some nasty stuff, although it's probably nothing you don't already know.
● WordPress Post for Week 6: Choose one of the surveillance systems or devices that you listed and wrote about for Week 5 or the #SurveillanceScavengerHunt from the recitation makeup in Week 4, and dive more deeply into how it works.
Figure out as much as you can about who owns it, what technological mechanisms and algorithms are behind it, where your data is held and for how long, what it can be used for, etc. You may find it helpful to read privacy policies (you might want to check out the website "Terms of Service; Didn't Read," https://tosdr.org/), search for relevant legal cases, publicity announcements about an amazing new system that will blah blah blah… Write up the interesting stuff you found, as well as what you couldn't find. How does all of this impact you? You may choose any medium you wish to express yourself. If it is time-based, make it around 2 minutes long; if it is text-based, make it around 500 words. Categorize your post Week 06.

Week 7: Computers + Humans: Augmentation or Symbiosis?
How would you describe your relationship to computers? Do they help you do your work or do they actually tell you what work you need to do? Do they augment your abilities, or could you not do what you want to do without them? What are the ideas wrapped up in our dependence on the digital infrastructure that now wraps the globe?
● Doug Engelbart, "Augmenting Human Intellect: A Conceptual Framework," SRI Summary Report AFOSR-3223, Prepared for the Director of Information Sciences, Air Force Office of Scientific Research, Washington DC, Contract AF 49(638)-1024, SRI Project No. 3578, 1962, excerpts. Posted to Box here: https://pitt.box.com/s/npzyzrqhqcu4ktprs41lepvbxt9g63d0.
● Neal Stephenson, Diamond Age (New York: Bantam Books, 1995), 1-21. Posted to Box here: https://pitt.box.com/s/i4a2s3uly851yhrutobyqn7wrk46c2qx.
● J.C.R. Licklider, "Man-Computer Symbiosis," in IRE Transactions on Human Factors in Electronics, vol. HFE-1, no.1 (March 1960): 4-11. http://groups.csail.mit.edu/medg/people/psz/Licklider.html
● Optional: Black Mirror, "San Junipero," directed by Owen Harris, October 21, 2016. Available on Netflix [contains sex scenes].
● WordPress Post for Week 7: Using concepts and readings from the course so far, reinvent yourself as a cyborg. Choose at least three enhancements and changes that reflect somehow who you believe yourself to be. Create a diagrammatic image of what that would look like. Your diagram could be hand-drawn or digitally constructed, but should be posted digitally. In an accompanying textual post, describe your changes and reflect on what they mean about your human relationship to technology and computers, your perceived weaknesses and strengths, and how willing you are to give yourself over to the machine. Categorize your post Week 07.

Week 8: Midterm Week
● No readings this week. Class on Monday will meet as normal, and will be a review for the midterm. Wednesday's class and recitation will be optional office hours with Prof. Vee and Erin. Prof. Langmead will be available 12-1.

Week 9: Artificial Intelligence and the Human Mind (AI II)
Can computers truly become thinking, sentient beings? Can humans become digital computers? Massive and intricate computational systems attached to government defense programs in the 1960s were called "command and control" systems for the way that they allowed centralized control and coordination both of the artillery and the humans who ran them.
● Michael Brennan, "Can Computers Be Racist? Big Data, Inequality, and Discrimination," Ford Foundation's Equals Change Blog, November 28, 2015: https://www.fordfoundation.org/ideas/equals-change-blog/posts/can-computers-be-racist-big-data-inequality-and-discrimination/.
The text of the blog post and the two short video clips are required readings. If you are interested in learning (much) more about this subject, please also feel free to watch the hour-long video recording at the bottom of the post featuring more from Latanya Sweeney and Alvaro Bedoya.
● Meredith Broussard, Artificial Unintelligence: How Computers Misunderstand the World (Cambridge, MA: MIT Press, 2018), "Hello, AI," "Hello, Data Journalism," and "Machine Learning: The DL on ML," 31-47; 87-119. https://ebookcentral.proquest.com/lib/pitt-ebooks/detail.action?docID=5355856 or on Box https://pitt.box.com/s/klxfo6p41g8mv4gpkpcnazac84usdi6x
● Scott Rosenberg (interview with Kate Crawford), "Why AI Is Still Waiting for Its Ethics Transplant," Wired, November 1, 2017. https://www.wired.com/story/why-ai-is-still-waiting-for-its-ethics-transplant/.
● "Unsupervised Learning - Georgia Tech - Machine Learning," Udacity, 4m23s, February 23, 2015, https://youtu.be/1qtfILYSDJY.
● "M[achine] L[earning] in the Google Self-Driving Car," Udacity, 2m12s, October 27, 2014, https://youtu.be/lL16AQItG1g.
● WordPress Post for Week 9: Tell a structured narrative--using only images--about the improbability, possibility, or inevitability (choose one) of computers surpassing humans in intelligence. Use between 5 and 10 images. You can use images you find online, pictures you take yourself, or images you design yourself (say in Photoshop or old-fashioned pen-and-ink). You are welcome to use stock images you find on the Web, but please choose mindfully. The best posts will show something more than just humans being blown up by computers; they will instead have complex, possibly even ambiguous narratives. If you choose a graphic novel style (or even a series of memes/animated GIFs), you are welcome to use text within the frame of the images. Categorize your post Week 09.

Spring Break

Week 10: Human Labor in Computing
Before we created devices in metal and glass to calculate, humans did this work: the infamous "Mechanical Turk" from the late 18th century, women "computers" performing the complex calculations that were needed for warfare and science until the 1940s. Now, there's a lot of talk about how computers will replace humans in certain jobs. This was true for human mathematical calculators in the 1940s, it has been true in some manufacturing contexts, and now computers threaten to replace drivers. But computers introduce new jobs, too--jobs to keep humans from using computers for ill.
● Adrian Chen, "The Laborers Who Keep Dick Pics and Beheadings Out of Your Facebook Feed," Wired.com, October 23, 2014, http://www.wired.com/2014/10/content-moderation/. [Please be forewarned, this article contains references to explicit violent and sexual content on the Internet. If you would like to avoid this content, please read this article instead: Marc Burrows, "They Called it the Worst Job in the World: My Life as a Guardian Moderator," The Guardian.com, April 18, 2016, https://www.theguardian.com/technology/2016/apr/18/welcome-to-the-worst-job-in-the-world-my-life-as-a-guardian-moderator]
● Simon Parkin, "The YouTube stars heading for burnout," The Guardian, September 8, 2018, https://www.theguardian.com/technology/2018/sep/08/youtube-stars-burnout-fun-bleak-stressed
● Thomas Davenport and Julia Kirby, "Beyond Automation," Harvard Business Review (June 2015): https://hbr.org/2015/06/beyond-automation.
● Lisa Nakamura, "Indigenous Circuits (backstory)," Computer History Museum Blog, January 2, 2014: http://www.computerhistory.org/atchm/indigenous-circuits/. (In this blog post, Nakamura describes her research process for uncovering some of the surprising and racialized history of semiconductor manufacturing. *Optional* reading: the article she refers to is available here: Lisa Nakamura, "Indigenous Circuits: Navajo Women and the Racialization of Early Electronic Manufacture," American Quarterly 66 (December 2014): 919-941, https://lnakamur.files.wordpress.com/2011/01/indigenous-circuits-nakamura-aq.pdf.)
● WordPress Post for Week 10: If you think about it, when you are circulating memes and stories online, you are working for the network. Your task this week is to design your own, original viral campaign. It could support a local or global political cause, or it could be satirical or just-for-fun (although please do steer clear of misogyny, hate speech, etc.). Your campaign's central artifact could take the form of a meme-image, a hashtag, a Facebook post, or something else you think could be potentially viral. To accompany this artifact, write a 250-word plan for deploying your viral campaign, considering what makes something go viral, and arguing for why your work will spread. The best posts will catch attention and have a solid plan for distribution and anticipated success. Categorize your post Week 10.

Week 11: Old-School Games
Initially conceived almost entirely as a device for work (and destruction), computers quite quickly became an area for creativity and play. We make music, play games, socialize and challenge ourselves on our computers. This week, we'll look at some of the games and fun of computers in the 1970s and '80s.
● Stewart Brand, "Spacewar: Fanatic Life and Symbolic Death Among the Computer Bums," Rolling Stone (December 7, 1972): 50-58. Available from Box here: https://pitt.box.com/s/3eeywx3jketyo0ws8j26c3x92kbumbd0.
● Leigh Alexander, "The Original Gaming Bug: Centipede Creator Dona Bailey" [Interview], Gamasutra, Aug 27, 2007. http://www.gamasutra.com/view/feature/130082/the_original_gaming_bug_centipede_.php?page=all
● WarGames. Directed by John Badham. Santa Monica, CA: MGM Home Entertainment, 1983.
● WordPress Post for Week 11: Play a game on the Internet Arcade (https://archive.org/details/internetarcade). Write a review of the game (250 words) and excerpt the most interesting game moments in an accompanying video walkthrough that lasts no longer than 2 minutes. In your voiceover to the walkthrough, talk about what "play" means to you in the context of your everyday lived experience and how this game did or did not match up to what your idea of "ideal play" might be. Don't just complain about the controls! What does or does not make something fun? Categorize your post Week 11.
● Note that you may experience some difficulty in getting the game to actually work in your browser. Part of this assignment is about working through these difficulties and thinking about when, how, and why you may want to give up on interacting with a computer when you aren't getting what you need to get out of the experience. [Hint: Your browser has to run an arcade emulator and the game itself, and you'll be using a different set of controls (your keyboard, an external device if you have one) than the arcade machine had. Read the explanation and the comments below the game for some help.
Firefox is the browser they recommend, and if you're running MacOS, you may want to or have to disable some keyboard shortcuts that interfere with the controls.]

Week 12: Computational Creativity
● Michael Mateas and Nick Montfort, "A Box, Darkly: Obfuscation, Weird Languages and Code Aesthetics," in Proceedings of the 6th Digital Arts and Culture Conference, IT University of Copenhagen, 1-3 December 2005, 144-153, http://nickm.com/cis/a_box_darkly.pdf.
● Michael Edwards, "Algorithmic Composition: Computational Thinking in Music," Communications of the ACM, 2011, http://cacm.acm.org/magazines/2011/7/109891-algorithmic-composition/fulltext
● Stephen Ramsay, "Algorithms Are Thoughts, Chainsaws are Tools," 2010 (23-minute video about livecoding, 2010). https://vimeo.com/9790850.
● Check out some examples of computational creativity: Johnny Sun and Hannah Davis, et al.'s The Laughing Room installation: https://shass.mit.edu/news/news-2018-inside-laughing-room; Allison Parrish's portfolio of computational word experiments: http://portfolio.decontextualize.com/ and twitterbots https://twitter.com/aparrish/lists/my-bots/members; Winnie Soon's portfolio: http://siusoon.net/category/creative_works/; OFFAL (Orchestra for Females and Laptops): https://offal.github.io/; Shelly Knotts' algorithmic sound compositions: https://soundcloud.com/shelly-knotts; Primavera DiFillippi's Plantoid, a blockchain-based lifeform: http://okhaos.com/plantoids/
● WordPress Post for Week 12: Give us a glimpse at an algorithm you perform regularly. We do mean algorithm, and not simply routine. If you want a bit of help understanding what an algorithm is, the first 30 seconds of this Khan Academy video might help: https://www.khanacademy.org/computing/computer-science/algorithms/intro-to-algorithms/v/what-are-algorithms. Could your behavioral algorithm be automated? Would you like it to be automated? Why or why not? Your post could be in the form of a screen capture, video, audio, a series of still images, a written post, hybrid image/text, whatever. Please note that while the format for this week is open, you should keep in mind our time/length guidelines for other WordPress posts: keep time-based media to around 2-3min, and text only should be around 500 words, etc. The best posts will consider deeply what it means to perform algorithms as a human and what it means to perform algorithms as a computer, and how those are different and interact within contemporary patterns of automation. Categorize your post Week 12.

Week 13: Becoming a Digital Citizen
How would you describe the relationship humans have with computers? What changes us when we interact with them? How might we all go out into our various workplaces and fields of study and consider this relationship differently? The concluding week will focus on student work and drawing overall conclusions from the discussions that have taken place over the term.
● Bonnie Stewart, "Digital Identities: Six Key Selves of Networked Publics," thetheoryblog, May 6, 2012, http://theory.cribchronicles.com/2012/05/06/digital-identities-six-key-selves/. (You can skip the first part--just skip down to after the video where she notes the six key selves.)
● ICANN's Beginner's Guide to Domain Names: https://www.icann.org/en/system/files/files/domain-names-beginners-guide-06dec10-en.pdf (This is a big document that's not meant to be read cover to cover--just skim through to understand a bit about how domain names work.)
● Troy Hunt, “Going Dark: Online Privacy and Anonymity for Normal People,” http://troyhunt.com, May 17, 2016, https://www.troyhunt.com/going-dark-online-privacy-and- anonymity-for-normal-people/ ● Rebecca Heilweil, "How Close Is An American Right-To-Be-Forgotten?," Forbes, Mar 4, 2018, https://www.forbes.com/sites/rebeccaheilweil1/2018/03/04/how-close-is-an-american-right- to-be-forgotten/ Week 14: Synthesis of the Course and Peer Evaluations of Final Work work_2mmfgozb35ghnkw55ivtgdsuqm ---- So what are you going to do with that?: The promises and pitfalls of massive data sets Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=wcul20 Download by: [University of Michigan] Date: 20 November 2017, At: 11:41 College & Undergraduate Libraries ISSN: 1069-1316 (Print) 1545-2530 (Online) Journal homepage: http://www.tandfonline.com/loi/wcul20 So what are you going to do with that?: The promises and pitfalls of massive data sets Sigrid Anderson Cordell & Melissa Gomis To cite this article: Sigrid Anderson Cordell & Melissa Gomis (2017): So what are you going to do with that?: The promises and pitfalls of massive data sets, College & Undergraduate Libraries, DOI: 10.1080/10691316.2017.1338979 To link to this article: http://dx.doi.org/10.1080/10691316.2017.1338979 Published online: 20 Jul 2017. Submit your article to this journal Article views: 42 View related articles View Crossmark data http://www.tandfonline.com/action/journalInformation?journalCode=wcul20 http://www.tandfonline.com/loi/wcul20 http://www.tandfonline.com/action/showCitFormats?doi=10.1080/10691316.2017.1338979 http://dx.doi.org/10.1080/10691316.2017.1338979 http://www.tandfonline.com/action/authorSubmission?journalCode=wcul20&show=instructions http://www.tandfonline.com/action/authorSubmission?journalCode=wcul20&show=instructions http://www.tandfonline.com/doi/mlt/10.1080/10691316.2017.1338979 http://www.tandfonline.com/doi/mlt/10.1080/10691316.2017.1338979 http://crossmark.crossref.org/dialog/?doi=10.1080/10691316.2017.1338979&domain=pdf&date_stamp=2017-07-20 http://crossmark.crossref.org/dialog/?doi=10.1080/10691316.2017.1338979&domain=pdf&date_stamp=2017-07-20 COLLEGE & UNDERGRADUATE LIBRARIES https://doi.org/./.. So what are you going to do with that?: The promises and pitfalls of massive data sets Sigrid Anderson Cordell a and Melissa Gomis b aHatcher Graduate Library, University of Michigan, Ann Arbor, Michigan, USA; bPerkins Library, Doane University, Crete, Nebraska, USA ARTICLE HISTORY Received  February  Revised  June  Accepted  June  KEYWORDS Data mining; library services; supporting DH across the institution; teaching DH ABSTRACT Thisarticletakesasitscasestudythechallengeofdatasetsfortext mining, sources that offer tremendous promise for DH methodol- ogy but present specific challenges for humanities scholars. These text sets raise a range of issues: What skills do you train humanists to have? What is the library’s role in enabling and supporting use of those materials? How do you allocate staff? Who oversees sus- tainability and data management? By addressing these questions through a specific use case scenario, this article shows how these questions are central to mapping out future directions for a range of library services. 
Introduction
When the first set of texts from the Early English Books Online Text Creation Partnership (EEBO-TCP) was released on January 1, 2015 (Text Creation Partnership [TCP] 2014), there was understandable excitement about the release of 25,000 openly available texts from the Early Modern period (Levelt n.d.). In addition to making these texts available to read, this release also opened up possibilities for text mining the EEBO-TCP data set. However, while there is clear potential for digital humanities research in making a relatively clean data set of texts from the early modern period available, the structure of the data set itself poses considerable challenges for scholars without a background in programming. Most humanities scholars cannot take advantage of a data set like this one—or similar data sets, such as the historical newspapers that ProQuest has recently made available to institutions that have purchased perpetual access—without considerable training and support. The question becomes, who is best positioned to provide that support? For many, the obvious answer to this question is the library because of its position as provider of resources and expertise in navigating them. If the library is to provide this support, however, how can it do so most effectively? The gap between the promise and usability of massive humanities data sets like the EEBO-TCP project presents an opportunity to consider a host of questions facing libraries today as they develop service models and expertise to support traditional and emerging forms of scholarship.

Background
New digital methodologies and sources for humanistic scholarship raise new questions for training humanities scholars, as well as for the roles that libraries can play in supporting emerging scholarly approaches. As many have noted, emerging digital methodologies in humanities scholarship have opened up new ways to analyze texts at scale.
As Heuser, Le-Khac, and Moretti (2011) observe, digital methodologies open up the possibility of asking broader questions of larger corpora to understand texts and underlying social and cultural phenomena at scale. Traditional scholarly methods, in particular the close reading of texts, necessarily limit the scale of analysis, leaving open the question of how authoritative any analysis based on reading a necessarily limited corpus can be. As Heuser, Le-Khac, and Moretti point out, machine reading methods hold promise for allowing us to answer new questions based on a larger, more inclusive corpus: "These emerging methods promise ways to pursue big questions we have always wanted to ask with evidence not from a selection of texts, but from something approaching the entire literary or cultural record. Moreover, the answers produced could have the authoritative backing of empirical data" (79). Alongside the "authoritative backing" that "empirical data" promises, these approaches raise concerns among humanists, especially for disciplines that have long defined themselves in opposition to the sciences. As Heuser, Le-Khac, and Moretti (2011) observe, "By offering an entirely different model of humanities scholarship, the digital humanities raise many questions …. Can we leverage quantitative methods in ways that respect the nuance and complexity we value in the humanities? … Under the flag of interdisciplinarity, are the digital humanities no more than the colonization of the humanities by the sciences?" (79). In conjunction with this lively debate over whether the core values of the humanities are lost by drawing on computational approaches is the question of how best to train humanists to undertake these approaches, as well as a necessary discussion about what might get lost in the process. Some of the resistance to computational training by humanists, Kirschenbaum argues, stems from a misunderstanding of what computer science is about, as well as its relevance to critical thinking: "Many of us in the humanities think our colleagues across the campus in the computer-science department spend most of their time debugging software. This is no more true than the notion that English professors spend most of their time correcting people's grammar and spelling. More significantly, many of us in the humanities miss the extent to which programming is a creative and generative activity" (2009, B10). Scholars like Kirschenbaum (2009) have argued forcefully for rethinking humanities training so as to incorporate programming skills. One way to make space, Kirschenbaum suggests, is to replace the foreign language requirement in PhD programs with programming. These skills are crucial, he argues, because "Computers should not be black boxes but rather understood as engines for creating powerful and persuasive models of the world around us. The world around us (and inside us) is something we in the humanities have been interested in for a very long time. I believe that, increasingly, an appreciation of how complex ideas can be imagined and expressed as a set of formal procedures—rules, models, algorithms—in the virtual space of a computer will be an essential element of a humanities education." As Kirschenbaum argues, humanities scholars cannot explore the "complex ideas" that humanities computing generates without an understanding of the underlying computational systems.
Likewise, scholars connected to the Humanities, Arts, Science, and Technology Alliance and Collaboratory (HASTAC) have devoted considerable energy to advocating for humanists to learn coding. Hunter (2016) describes an anecdote that her advisor told her when she wanted to do DH work but resisted taking a programming class: "'I'll never forget this young scholar who put himself forward as an expert on Chekhov,' he mused. 'I asked if he spoke Russian, and he proudly said he'd never even taken a class. He lost all credibility in that moment. Don't be the Chekhov scholar who didn't take Russian 101.'" As Hunter suggests, scholars need to understand code to design digital projects. While there is some consensus in the scholarship that it is valuable for humanists to learn programming skills, there has been less detailed attention paid to what the best process is for teaching those skills. Antonijevic's (2015) ethnographic study of digital humanists reveals an informal, unstructured mode of learning that is focused on point-of-need, where "learning is linked to immediate scholars' needs, arising from specific research problems, which generally makes this way of learning preferred over organized efforts, such as library workshops, where learning is decontextualized from scholarly practice. This method also successfully makes use of one of the scholars' most scarce resources: their time" (80–81). As Antonijevic (2015) points out, this method has the disadvantage of "depend[ing] on a scholar's social network and its knowledge capacity" (81). The idea of a "social network" as the basis for acquiring programming skills is linked to another solution to the training dilemma offered by the literature on digital scholarship: collaboration. Gibson, Ladd, and Presnell (2015) argue that, "Unlike traditional humanities research, digital humanities scholarship is not a solitary affair. Generally, no single person has all the skills, materials, and knowledge to create a research project. By nature, the digital humanities project, big or small, requires a collaborative team approach with roles for scholars, 'technologists,' and librarians" (4). Liu echoes this sentiment, arguing that DH work requires "a full team of researchers with diverse skills in programming, database design, visualization, text analysis and encoding, statistics, discourse analysis, website design, ethics (including complex 'human subjects' research rules), and so on, to pursue ambitious digital projects at a grant competitive level premised on making a difference in today's world" (2009, 27). Collaboration, however, requires considerable support and advocacy in a disciplinary landscape where it is not the norm. Reid points out that, "Unlike a laboratory, which requires a team of people to operate, the default mode for humanities academic labor has been for a professor to work independently …. It is unusual for humanities scholarship to appear with more than two authors, let alone the long list of authors that will accompany work in the sciences …. While there are certainly examples of notable, long-standing collaborations in the humanities, they are exceptions to the rule" (2012, 356). Although collaboration can be fruitful for scholars in the humanities, it requires both a cultural shift and a rethinking of the workflow for scholarly projects. At this point, collaboration has not been fully embraced by scholars across the disciplines.
In addition to differing disciplinary attitudes that engender resistance to collaboration in the humanities, collaboration can have its own drawbacks, especially when the collaboration is not seen as fully equitable. As Edmond points out, "In the worst cases, teamwork based on an ethos of knowledge sharing can degenerate into the negotiation of uncomfortable tacit hierarchies, where some contributors (regardless of their expertise or seniority) feel like service providers working in the shadow of otherwise autonomous project leaders" (2015, 57). Further, Edmond observes that collaboration doesn't just require bringing people together but also reimagining projects so that all people involved have an intellectual stake. According to Edmond, successful digital humanities collaborations "ensure from the outset that the project objectives propose interesting research questions or otherwise substantive contributions for each discipline or specialty involved" (56). As Reid (2012) explains, "Given that the assemblage operates effectively with a single author, one essentially has to invent new roles for additional participants" (356). Because of their well-established role supporting research, librarians have taken up the question of how to enable fruitful collaborations and how best they can train humanists seeking to create DH projects or learn programming skills. Green asks how libraries can facilitate "scholars' initial skills acquisition in text encoding" (2014, 222). Green recommends a workshop model that does "not simply inculcate scholars with the latest software; rather librarians and scholars work together to facilitate scholars' entry into the communities of practice that make up digital humanities" (222). Pointing to the TEI (Text Encoding Initiative) consortium as a model, she argues that it "presents a strong case study of the role of librarians in building learning environments that enable scholars to become members of its community of practice" (223). One key question is whether it is the role of libraries to offer technical support for digital projects, train researchers in attaining new skills (through workshops, for example), or enable collaboration. Lewis et al. assert that "Organizations most successful at building expertise among faculty, students, and staff tended to share characteristics such as an open and collaborative interdisciplinary culture in which each team member contributes expertise and is respected for it" (2015, 2). Discussions of the library's role in supporting scholars in emerging digital scholarship skills necessarily invite a conversation about staffing in libraries. Should the library provide support staff for digital projects, or should that support staff come from the ranks of graduate students? If graduate students are used as labor for these projects, how can it be organically integrated into graduate training? Lewis et al. (2015) point to both the advantages and disadvantages of this model for graduate students: "Often, digital scholarship projects rely on graduate student assistants. The experience gives students opportunities to build their knowledge and provides inexpensive labor.
But such projects must contend with frequent turnover; as one faculty member put it, 'I get these MA students, I train them, they graduate.' One university that offers degree programs in digital scholarship tries to recruit its own students as staff, but there aren't necessarily enough students to meet the demand, especially with competition from other organizations. Most of their graduates go to industry, since 'they can offer more money. The only people we have are here because of idealism'" (2015, 27). Likewise, sustainability can be an issue when the support model is based on labor by students who necessarily stay only a short period of time. In describing the community of practice support model that has been used by various projects such as TEI, Documenting the American South, and the Victorian Women Writers Project, Green points out, "The labor and craft taught for encoding texts generates a 'shared repertoire' of skills that is continually disseminated and refined through the training of new and established scholars. This shared repertoire is a critical element to the ability of a community of practice to sustain and expand itself" (2014, 228). The community of practice model constantly requires new participants, especially because many graduate students in library and information science programs or schools of information are only pursuing master's degrees and graduate after two years. At the center of the question of library staffing, training, and support for digital scholarship is the debate over whether libraries should establish digital humanities centers. Ithaka's report on supporting DH outlines three "campus models for support": the service model, the lab model, and the network model. In the network model, "there are multiple units whose services have developed over time, in the library and IT departments, but also visualization labs, centers in museums, and instructional technology groups, each of which was formed to meet a specific need" (Maron and Pickle 2014, 34). Maron follows up on the Ithaka report on DH centers by arguing that the service model has been controversial in libraries because of the debate over "the degree to which librarians should envision themselves in a 'service role'" (2015, 33). Nevertheless, this is the most common model, and it is driven by the fact that it "meet[s] faculty and students where they are—to offer courses, training, and some programming support for members of the campus community. This often takes the form of developing a full range of programming, from workshops to courses, and bringing in guest speakers." The library or center following this model seeks to identify and respond to faculty needs rather than "independently identifying a path of innovation" (33). Maron identifies the "path of innovation model" as closer to the lab model. Likewise, digital humanities centers can create a central space for networking and collaboration. As Freistat explains, "Digital humanities centers are key sites for bridging the daunting gap between new technology and humanities scholars, serving as the crosswalks between cyberinfrastructure and users, where scholars learn how to introduce into their research computational methods, encoding practices, and tools and where users of digital resources can be transformed into producers" (2012, 281). While there is much support for the development of digital humanities centers, there are also detractors.
Schaffner and Erway argue that "There are many ways to respond to the needs of digital humanists, and a digital humanities (DH) center is appropriate in relatively few circumstances" (2014, 5). Instead, libraries can draw on a host of other approaches to support DH on their campuses. In this case, Schaffner and Erway assert, "[i]n most settings, the best decision is to observe what the DH academics are already doing and then set out to address gaps" (5). Whether or not libraries build digital humanities centers, there is widespread consensus that libraries are natural partners in supporting digital scholarship. At the same time, there has been much less discussion of the specific challenges raised by complex data sets that are not inherently user-friendly. Libraries offer varying models of support, and there is a robust conversation in the scholarly literature about whether training, direct technical support, or enabling collaboration—or a combination of all three—is the best approach to supporting digital scholarship. As we argue in the next section, the potential and challenges of large data sets provide an opportunity to think through approaches to training, as well as the library's role in supporting teaching and research using these data sets.

Case study: The EEBO-TCP data set
As new digital methodologies emerge, along with new data sets that enable textual analysis at scale, many scholars have sought help from librarians, other researchers (both in and beyond their disciplines), and technology experts as they begin navigating resources and methodologies far outside their traditional training. While there are expected challenges to learning the basic methods of digital scholarship and analysis, a significant additional barrier exists in formatting and preparing the data sets themselves, even beyond the programming skills that are necessary for analysis. For example, while many researchers can operate basic web-based text visualization tools such as Voyant with relative ease, finding and then preparing a corpus for analysis with these tools is often far more daunting. The challenge in this case comes from the complex nature of raw data sets, as well as other factors that work against usability. Creating data sets for analysis often involves individual downloads of plain text files (in the relatively limited cases in which platforms allow that functionality), using R or Python to isolate subsets of larger corpora, or being limited to corpora that are larger than the researcher may need. While it would be unrealistic to suggest that it is possible to eliminate all challenges to creating corpora, putting resources toward facilitating the creation of corpora from raw data sets would offer significant advances in scholars' involvement with digital scholarship. Even data sets that have been produced by libraries pose challenges in usability for researchers. Without a significant infusion of resources aimed at increasing the usability of these data sets by researchers at all levels of technical abilities, the question becomes, who is best positioned to offer researchers and instructors support in using these data sets? Likewise, who is best positioned to communicate the research possibilities, as well as how to determine a fruitful research question, for using these data sets? Preparing a corpus takes time, and there is no guarantee that text analysis will yield usable results.
This article takes the EEBO-TCP data set as a case study to discuss the challenges and potential approaches for libraries to support digital humanities work using these corpora. We draw on the EEBO-TCP data set both because its potential and challenges are representative of other data sets being made available for humanities research and because it is openly available. EEBO-TCP offers considerable potential because it makes transcriptions of early modern texts available for scholars, as well as because it is a clean data set. EEBO-TCP is based on the Early English Books microfilm collection that includes over 130,000 titles from Pollard and Redgrave's Short Title Catalogue (1475–1640), Wing's Short-Title Catalogue (1641–1700), and the Thomason Tracts (1640–1661) (Early English Books Online [EEBO] n.d.). When the microfilm set was originally digitized, the scans appeared as images, and only the metadata was searchable. To make the texts themselves searchable, and because optical character recognition (OCR) software has not yet advanced to handle early modern fonts with any degree of accuracy, the Text Creation Partnership made the ambitious decision to re-key (i.e., transcribe) the texts, as well as to mark them up using XML/SGML encoding. Although the original goal was to make the texts full-text searchable, emerging text mining methodologies have made the existence of clean data sets particularly desirable for researchers. Because the texts have been re-keyed, there are fewer errors in the texts than in those that have been OCR'd. As part of its agreement with ProQuest, which makes the EEBO database commercially available, Phase I of the EEBO-TCP texts, which includes the first 25,000 re-keyed texts, was made publicly available in December 2014. While the data set offers considerable potential for researchers and also makes the texts themselves available, the data set itself is not easy for researchers to use for a variety of reasons. The texts are available either as a full data set on Box and GitHub, or as individual HTML, ePUB, and TEI P5 XML files through the Oxford Text Archive. The files on Box and GitHub are referenced by TCP number, a number that is not available on the ProQuest platform, meaning that researchers who are not interested in working with the corpus as a whole—who, for example, are interested only in texts from a specific time frame or author—have to do considerable extra work to identify the relevant files before they can begin downloading and formatting them for analysis. While researchers who are fluent in programming languages such as R or Python have little trouble accessing these texts, in our experience many researchers in the humanities are understandably daunted when faced with zip files containing 25,000 files, each of which contains XML or SGML markup that they must decide whether (and how) to scrub or retain. There is little documentation on strategies for accessing and cleaning up the text in preparation for mining or information on analysis tools once you have the data.
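To make that "scrub or retain" decision concrete, consider a minimal sketch of the cleanup step for a single file. The sketch below is ours, not drawn from the TCP documentation; it assumes one of the TEI P5 XML files distributed through the Oxford Text Archive, and the filename and the choice to discard all markup are illustrative assumptions rather than recommendations.

```python
# Minimal sketch: extract the readable text from one EEBO-TCP TEI P5 file.
# The filename A00018.xml is illustrative; the TEI namespace below is the
# standard one for P5 documents.
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

def tcp_plain_text(path):
    root = ET.parse(path).getroot()
    # Keep only the text body, skipping the bibliographic header
    body = root.find(f".//{TEI_NS}text")
    # itertext() concatenates the readable text from every element,
    # which is one (deliberately blunt) answer to "scrub or retain"
    return " ".join("".join(body.itertext()).split())

with open("A00018.txt", "w", encoding="utf-8") as out:
    out.write(tcp_plain_text("A00018.xml"))
```

Even these few lines embody interpretive choices: whether to keep front matter, notes, or foreign-language passages, and how to treat markers of illegible characters. That is precisely the kind of decision the missing documentation would need to address.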
Likewise, ProQuest has recently made their historical newspaper collections available (for a fee) to libraries that have already purchased perpetual access to specific titles. When libraries license the full-text data sets of historical papers, they are given access to the marked-up files. The Los Angeles Times, for example, is a collection of 4.5 million files, presented in no particular order and with no metadata in the file names. As in the case of the EEBO-TCP data set, to make use of these files, researchers must begin by pulling down slices of the corpus (such as by year or article type) using R or Python. Unlike the EEBO-TCP files, most LA Times articles are not available one by one as plain text files on a platform for researchers to cobble together a corpus through the search interface (and license agreements generally limit bulk downloads in any case). Once researchers have pulled down a subset of the corpus, they must decide how much of the markup to keep or strip out before they can run it through a text visualization tool (unless they decide to use the text mining package in R or a similar programming language). Leaving aside the technical skills needed to do this, researchers must also decide how to approach the dirty OCR problem because the texts themselves are riddled with errors due to the conversion process from microfilm. While data sets like this offer tremendous potential, it is not feasible for humanities scholars to make use of them without considerable support.
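A comparably small sketch suggests what "pulling down a slice" might look like in Python rather than R. Everything vendor-specific here is hypothetical: the directory name and the <pubdate> element are stand-ins for whatever fields the licensed schema actually uses.

```python
# Minimal sketch: carve a one-year slice out of a large newspaper dump
# and reduce each article to plain text. The <pubdate> element is
# hypothetical -- the real date field depends on the vendor's schema.
import pathlib
import re

source = pathlib.Path("la_times_dump")
slice_dir = pathlib.Path("slice_1969")
slice_dir.mkdir(exist_ok=True)

for xml_file in source.glob("*.xml"):
    raw = xml_file.read_text(encoding="utf-8", errors="replace")
    date = re.search(r"<pubdate>(\d{4})", raw)    # hypothetical field
    if date and date.group(1) == "1969":
        text = re.sub(r"<[^>]+>", " ", raw)       # strip all markup
        text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
        (slice_dir / (xml_file.stem + ".txt")).write_text(text, encoding="utf-8")
```

Even so, a loop over millions of files runs for hours, which is one reason this step is better treated as prepared infrastructure than as an exercise left to each individual scholar.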
As in the case of the ProQuest Historical Newspapers data sets, publishers have responded to requests from researchers by making data sets available; these data sets are usually delivered in large raw text file dumps that are not manageable to the average humanist scholar. Advocacy As a first step in enabling research with these data sets, libraries, as the purchasers and as the supporters of researchers, need to advocate for tools that create bridges between easy-to-use digital tools (like Voyant and AntConc) and the data sets. For example, rather than having either the entire raw data set for EEBO-TCP or the Oxford cut-and-paste formatted version, why not create tools that make it easy to use the platform to designate a corpus (i.e., by doing a search using the parameters on the platform) and then extract plain text files from the search results? In the case of the ProQuest Historical Newspapers example mentioned, it is not consistently possible across the PQHN platform to download plain text files of individual files, although this would make text mining custom corpora much more manageable for researchers without a background in programming or the resources to hire an assis- tant to manage the technical aspects. 10 S. A. CORDELL AND M. GOMIS Creating new tools Leonard recommends that libraries create tools or adopt open source tools to make analysis easier. At the Yale University Library, they adopted the HathiTrust Book- worm tool to analyze a small digital corpus of the Vogue collection. By creating tools that researchers can use to search text in other ways, they also help patrons to analyze their large digital collections (2014). To facilitate work on the EEBO-TCP data set, Washington University in St. Louis created the Early Modern Print (n.d.) project, which is supported by the Humanities Digital Workshop at Washington University. The Early Modern Print project pro- vides exploration tools tailored to the EEBO-TCP data. They describe the tools as an aggregate view of the corpus that enables us to probe English lexical and orthographic history in ways that usefully complement the search capabilities of EEBO-TCP and the Oxford English Dictionary; they also help us to see early modern book culture in a new way, as a structured flow of words. (Early Modern Print n.d.) The developers have created graphical interface tools, such as an EEBO N-GRAM Browser, to facilitate use of the collection by researchers, but users necessarily have less ability to manipulate the corpus when they are using this tool. Until there are more robust tools available to make working with a broad range of data sets easier for scholars, libraries can play a role in supporting emerging research by teaching scholars basic skills. The workshop model: Creating stages for learning In designing workshops to teach skills in digital scholarship, librarians need to be attentive to felt needs in their community and to carefully stage those workshops to make sure that instructors are not spending too much time on technical minu- tiae, such as constructing a corpus or setting up frustration with tools. To do this, workshop facilitators need to draw on the principles of backward design by asking, what is the intellectual outcome that they want to have in the session? Wiggins and McTighe explain backward design as a methodology that conceives of curricular design by thinking at the outset in terms of outcomes rather than lessons: “Given a task to be accomplished, how do we get there? 
… What kinds of lessons and practices are needed to master key performances?" (1998, 8). In just the same way that you might design a classroom exercise to focus narrowly on imparting a specific skill or research strategy, it is useful to isolate the specific technical skill, as well as the possibilities for further exploration, that you hope to impart. This is likely to require more setup in advance by the workshop leaders (for example, creating a specific corpus to work with, or downloading example files to practice on), but it will allow the session to focus on that specific skill rather than on the frustrations of getting ready to learn it. A scenario to avoid is one in which workshop participants try to download software and wind up spending most of the time troubleshooting the download and relatively little time using the tool.

Designing workshops in ways that focus narrowly on outcomes may also require participants to use the same operating system, on computers that have all been set up identically in advance. Creating an equal computing environment is a big challenge, especially when people have different skill levels and different technology vocabularies. As the scholarship on how researchers learn technical skills suggests, if you can give an opening to the possibilities, and offer a framework for follow-up support, interested researchers will take the time to teach themselves, or will request consultations on how to do the technical minutiae. A key goal for a workshop can often be illustrating the possibilities. How can you illustrate the possibilities in the approach so that scholars are motivated to learn the details of downloading and constructing their own corpus? Can you create a session that focuses on a piece of the process, i.e., looking at a predetermined corpus in AntConc? One approach is to make the entry easy so that scholars can decide if they want to do more, then offer resources for them to take the next steps. A significant goal for workshops can be illustrating why researchers would want to learn these approaches.

Workshops can also be augmented by working sessions, such as the Hackfest sponsored by the Bodleian Libraries in 2015 (Oxford University n.d.). This full-day session included researchers as well as robust technical support, as participants had a chance to "pitch ideas and find collaborators, firm up projects and groups, and request (or indeed recruit) technical help as necessary" (Willcox 2015). Key to the success of this model, practiced also by Software Carpentry, whose goal is "teaching basic lab skills for research computing" (Software Carpentry n.d.), is the availability of support from multiple people, rather than one or two workshop leaders trying both to troubleshoot and to lead the session.

Classroom approach

In addition to workshops aimed at researchers at all levels, librarians can offer considerable support for digital scholarship through course-integrated instruction at the undergraduate or graduate level. If integrated thoughtfully into a course's learning goals and assignments, course-integrated instruction can arguably be at least as effective as workshops, because the individual skills to be taught are bound up with the questions raised by a specific course theme. By working with the faculty member leading the course, and by being attentive to the specific learning goals and questions for the course, librarians can design exercises that are targeted toward specific research questions.
Just as in workshops, it is essential that librarians front-load the planning for these instruction sessions to isolate the specific learning goal for the course. While it is not possible, nor realistic (or, really, desirable), to eliminate all possible frustration in working with complex data sets, librarians can anticipate and minimize potential pain points so that the session can focus on the learning goals. For example, in one undergraduate class session at the University of Michigan, the librarian and technology specialist worked closely with the faculty member to design an instruction session that drew on the EEBO-TCP data set in a 300-level course. Because the point of the assignment was not necessarily to teach students how to compile corpora for analysis, but rather to allow students to perform text analysis on a set of relevant texts, they set the session up so that students were creating a limited corpus of only ten texts, based on search criteria that students determined (and determining the search words was part of the goal for the exercise). To minimize frustration with the data set as a whole, they first showed students how to use the EEBO platform to explore texts related to their topics and identify ten potential texts. Once they had identified the ten texts, it was relatively easy for students to find those texts on the Oxford platform and cut and paste the text into plain text files. Although this approach may have glossed over some of the intricacies of the data set and corpus creation, it allowed students to create a minicorpus relatively easily and import it into Voyant, where the bulk of the learning was meant to happen.

The lab approach: ScholarSpace at the University of Michigan Library

ScholarSpace at the Graduate Library at the University of Michigan provides access to technologies for small-scale experimentation and technologies for formal project support, with the understanding that anyone can access them. ScholarSpace supports humanists working on text mining projects by providing access and expertise for digitization, storage, text cleanup, and analysis. We have purchased text mining software that is not available elsewhere on campus, thereby providing access to anyone affiliated with the university. This approach relies on humanists being willing to experiment with librarians and to train each other. Text mining varies greatly by discipline; through creating a community of scholars, we can build a network of experts and draw on experiences and expertise related to text mining in Chinese studies, economics, history, English language and literature, and more.

Staffing models

Across these different models, the question remains as to how best to apportion staffing to support digital scholarship. In a distributed model, where librarians are leading workshops for the campus community and for classes, subject specialists, technology librarians, and undergraduate learning librarians can provide considerable support, especially if they are provided training and if the workshops are a natural extension of their expertise and outreach areas. Depending on the demand on campus, this model can, however, lead to librarians being stretched too thin; thus, creative staffing, such as training students to lead or support workshops, is necessary. Likewise, students can be brought into a project to work on a specific slice, such as OCR-ing PDF files and cleaning up the resulting OCR.
In this case, however, it is important to bring the students into the conversation about the project at some level, so that they understand how their work fits into the larger intellectual work of the project. Otherwise, libraries miss out on the opportunity to mentor students in emerging questions and methodologies of digital scholarship. The bulk of preparing texts for mining and analysis can also be tedious, and it requires careful attention to detail. Librarians or others overseeing students working on DH projects need to be vigilant in keeping the work moving forward and in checking the quality and consistency of the work. Sustainability and scalability are challenges across all staffing models: projects that have dedicated funding may not have enough funding to cover the entire project, and students cycle off projects either because they graduate or because they receive other opportunities such as internships or jobs.

Conclusion

As the preceding discussion of staffing illustrates, challenges remain in thinking through collaborative work in digital scholarship, especially in terms of the necessary, but not as obviously exciting, work of data preparation and cleanup. The need to develop and create digital scholarship projects will continue to grow in the humanities, and at some institutions it will be embedded into the curriculum. Project management, digitization, and analysis are skills humanists will need in the future, and they will learn them through the channels available. These skills can translate easily to a number of positions postgraduation and will be desired by employers, so having graduate students work on digital projects can provide them with excellent opportunities to obtain new skills. Considering that resources are not currently in place to make data sets easier to use in the near future, librarians can advance digital scholarship by helping scholars in incremental ways targeted at the specific challenges and frustrations that data sets pose. Librarians can set the expectation that they will work with students and faculty to explore these new areas together, and work to scaffold the learning experience so that humanists beginning text mining see the possibilities and not just the minutiae. Challenges that persist include developing relationships across campus, continually building skills, and finding partners to collaborate with.

ORCID

Sigrid Anderson Cordell http://orcid.org/0000-0003-3956-0606
Melissa Gomis http://orcid.org/0000-0002-5622-8560

References

Antonijevic, Smiljana. 2015. Amongst Digital Humanists: An Ethnographic Study of Digital Knowledge Production. New York: Palgrave Macmillan.
Early English Books Online (EEBO). n.d. "What Is Early English Books Online?" http://eebo.chadwyck.com/about/about.htm#top
"Early Modern Print: Text Mining Early Printed English." n.d. http://earlyprint.wustl.edu
Edmond, Jennifer. 2015. "Collaboration and Infrastructure." In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 54-65. Chichester, UK: John Wiley & Sons.
Freistat, Neil. 2012. "The Function of Digital Humanities Centers at the Present Time." In Debates in the Digital Humanities, edited by Matthew Gold, 281-91. Minneapolis: University of Minnesota Press.
Gibson, Katie, Marcus Ladd, and Jenny Presnell. 2015. "Traversing the Gap: Subject Specialists Connecting Humanities Researchers and Digital Scholarship Centers." In Digital Humanities in the Library: Challenges and Opportunities for Subject Specialists, edited by Arianne Harsell-Gundy, Laura Braunstein, and Liorah Golomb, 3-17. Chicago: Association of College and Research Libraries.
Green, Harriett E. 2014. "Facilitating Communities of Practice in Digital Humanities: Librarian Collaborations for Research and Training in Text Encoding." The Library Quarterly 84(2): 219-34.
Heuser, Ryan, Long Le-Khac, and Franco Moretti. 2011. "Learning to Read Data: Bringing out the Humanistic in the Digital Humanities." Victorian Studies: An Interdisciplinary Journal of Social, Political, and Cultural Studies 54(1): 79-86.
Hunter, Elizabeth. 2016. "Must Humanists Learn to Code? Or: Should I Replace My Own Carburetor?" HASTAC (blog), December 7. https://www.hastac.org/blogs/shakespeare-games/2016/12/07/must-humanists-learn-code-or-should-i-replace-my-own-carburetor
Kirschenbaum, Matthew. 2009. "Hello Worlds: Why Humanities Students Should Learn to Program." The Chronicle Review 55(20): B10.
Leonard, Peter. 2014. "Mining Large Datasets for the Humanities." IFLA Library. http://library.ifla.org/930/1/119-leonard-en.pdf
Levelt, Sjoerd. n.d. "#EEBOLiberationDay." https://storify.com/SjoerdLevelt/eeboliberationday
Lewis, Vivian, Lisa Spiro, Xuemao Wang, and Jon E. Cawthorne. 2015. Building Expertise to Support Digital Scholarship: A Global Perspective. Washington, DC: Council on Library and Information Resources.
Liu, Alan. 2009. "Digital Humanities and Academic Change." English Language Notes 47(1): 17-35.
Maron, Nancy. 2015. "The Digital Humanities Are Alive and Well and Blooming: Now What?" Educause Review. http://er.educause.edu/~/media/files/articles/2015/8/erm1552.pdf
Maron, Nancy, and Sarah Pickle. 2014. "Sustaining the Digital Humanities: Host Institution Support beyond the Start-Up Phase." Ithaka S+R. http://www.sr.ithaka.org/wp-content/mig/SR_Supporting_Digital_Humanities_20140618f.pdf
Oxford University. n.d. "Text Creation Partnership: EEBO, ECCO and Evans Texts." http://ota.ox.ac.uk/tcp/
Reid, Alexander. 2012. "Graduate Education and the Ethics of the Digital Humanities." In Debates in the Digital Humanities, edited by Matthew Gold, 350-67. Minneapolis: University of Minnesota Press.
Schaffner, J., and R. Erway. 2014. "Does Every Research Library Need a Digital Humanities Center?" OCLC Research Report. http://www.oclc.org/content/dam/research/publications/library/2014/oclcresearch-digital-humanities-center-2014.pdf
Software Carpentry. n.d. "Software Carpentry: Teaching Basic Lab Skills for Research Computing." https://software-carpentry.org
Text Creation Partnership (TCP). 2014. "EEBO-TCP Phase I Public Release: What to Expect on January 1." http://www.textcreationpartnership.org/2014/12/24/eebo-tcp-phase-i-public-release-what-to-expect-on-january-1/
Wiggins, Grant P., and Jay McTighe. 1998. Understanding by Design. Alexandria, VA: Association for Supervision and Curriculum Development.
Willcox, Pip. 2015.
"Early English Books Hackfest." Bodleian Libraries (blog), April 22. http://blogs.bodleian.ox.ac.uk/digital/2015/04/22/early-english-books-hackfest/

work_2noa3kjwirhltenxswqjd3ghw4 ----

Picture archives and the emergence of visual history of education. ISCHE 40 pre-conference workshop. 3rd workshop "Pictura Paedagogica Online: educational knowledge in images"

Digital Resources and Tools in Historical Research

Lars Wieneke and Gerben Zaagsma
Luxembourg Centre for Contemporary and Digital History
https://www.c2dh.uni.lu

Abstract. This proposal will discuss the use of digital picture archives and associated tools in historical research from the perspective of digital history, with a focus on resources for the history of education. Our starting point will be threefold:

• digital picture archives need to be seen as part of a wide range of digital resources that are currently available for historical research; while certain methodological, epistemological and technical questions are specific to picture archives, many pertain to digital resources in general;
• the broader question of what prerequisites should be considered for digital archives more generally; and
• addressing the question of whether or not such general prerequisites can be formulated at all, given the wide range of research questions and use cases researchers bring to the table.

With this broader contextualisation in mind we will focus on the possibilities and limitations of digital picture archives for the history of education through a brief discussion of the following points:

• what are the characteristics of digital picture archives, technical and otherwise, and within that context, latter or not?
• what layers of information are currently embedded in digital picture archives for the history of education (taking Gerhard Paul's differentiation as a starting point); how can we improve the design, annotation

Introduction

Our paper will have to start with a confession: neither of us is a historian of education. Both of us, however, have a keen interest and experience in what is called "digital history" and hence in the use and potential of digital resources in historical research. This workshop presents us with an excellent opportunity to consider a particular case study, the visual history of education, and think through how, and in what ways, digital resources are already being used. In our paper we will discuss the use of digital picture archives and associated tools in historical research with a focus on the possibilities and limitations of digital picture archives for the history of education.
Before doing so, some contextual parameters need to be established.

First of all: digital picture archives need to be seen as part of a wide range of digital resources that are currently available for historical research; while certain methodological, epistemological and technical questions are specific to picture archives, many pertain to digital resources in general.

Secondly: discussing the prerequisites for a picture archive on educational history raises the broader question of what prerequisites should be considered for digital archives more generally; and addressing the question of whether or not such general prerequisites can be formulated at all, given the wide range of research questions and use cases researchers bring to the table.

The call for papers for this Pre-Conference workshop announced a focus on "the impact of the discipline on developing and maintaining of a picture archive" and listed a number of pertinent questions. For the purposes of our talk we have reframed these questions as three major topics to be addressed, and added some of our own concerns:

Prerequisites: what prerequisites, if any, are there for a picture archive on educational history? Are there common basic requirements?

Existing data archives: in how far can existing data archives meet the needs? What do they offer and how are they used? What is their strength and weakness in regard to the analytical possibilities they offer?

Potential: can existing data archives meet the demands of the visual history of education? Is there a need for another solution? And what potential do new technological developments, such as Computer Vision, offer?

The visual turn in the digital humanities

Before addressing the issues above, a very brief word on the visual turn in the digital humanities. The Digital Humanities are traditionally text-based, and the engagement with images, as more than digitised artefacts, is only recent. As Patrik Svensson wrote as recently as 2009: "The so-called 'visual turn' or research on multimodal representation does not seem to have had a large impact on humanities computing."[1] Many digitisation projects of the 1990s focused on textual materials and text editions. Large-scale digitisation of images is a development of roughly the past 15 years. Gerhard Paul talks about "the technological quantum leap of the world wide web", as a result of which "historians have had completely new possibilities of image research at their disposal for merely the last ten years."[2]

The question is of course how that potential has been used and to what extent it has been realised. For our purposes a further question is whether digital picture archives are merely used as repositories of visual material and images, now easily accessible and in much larger quantities than before, or if new digital methods are used to actually analyse them.

[1] Patrik Svensson, Humanities Computing as Digital Humanities. DHQ: Digital Humanities Quarterly 3/3 (2009).
[2] Gerhard Paul, 'Visual History', Version: 1.0, in: Docupedia-Zeitgeschichte, 11.02.2010. URL: http://docupedia.de/zg/paul_visual_history_v1_de_2010. DOI: http://dx.doi.org/10.14765/zzf.dok.2.557.v1
Using neural networks and other techniques, rapid advances are being made in visual pattern discovery.[3] As a result, it is now possible to analyse large image data sets and categorise them, to a certain extent.[4] If all that sounds somewhat daunting to many historians, a creative use of metadata can already yield interesting research results for those willing to invest in some technical expertise.

[3] See for a succinct overview of recent developments the DH2017 workshop proposal on Computer Vision in Digital Humanities: https://dh2017.adho.org/abstracts/639/639.pdf
[4] See for a description of a very recent example this DHBenelux 2018 abstract: Seeing History: Analyzing Large-Scale Historical Visual Datasets Using Deep Neural Networks. http://2018.dhbenelux.org/wp-content/uploads/sites/8/2018/05/Wevers_Smits_Seeing_History_DHBenelux2018.pdf

Prerequisites

Let's now turn our attention to the impact of the discipline on developing and maintaining a picture archive. The first question is what prerequisites, if any, exist for a digital picture archive on educational history? This is a problematic that relates to the more fundamental question of what the demands of the visual history of education should look like, and it can be approached from either a content or a technical perspective.

As to content, researchers and heritage institutions should jointly decide digitisation priorities, ensuring that what is being digitised represents a broad spectrum of relevant topics for potential future research (avoiding the pitfall that what is digitised merely reflects master narratives). From a technical perspective, other factors come into play, related to content. First of all, the size of the collection affords different approaches when dealing with hundreds, thousands or millions of images. Furthermore, the technical provenance of the images (raster images, photographs, etc.) comes into play, while the resolution and file size of the scanned images require different strategies depending on the size of the collection. Research questions and topics also require certain technical possibilities, for example certain types of metadata. In the case of images, high-quality and consistent metadata are of crucial importance as they provide, at least so far, the only way to find relevant non-textual materials (as compared to being able to perform a full-text search in OCR'ed textual materials). Conversely, new technological possibilities can help to open up new avenues of research and generate new research questions. Here one can think about interlinking materials from various repositories, for example through the so-called International Image Interoperability Framework.[5]

State of the Art

Let's consider a couple of examples of relevant digital picture archives for the history of education. Pictura Paedagogica Online is the BBF's digital picture archive; Historywallcharts is a collaborative project offering history wallcharts from Germany, The Netherlands and Denmark; and DigiPorta is a digital portrait archive. These archives differ considerably when it comes to search and browsing options, extent and quality of metadata, possibilities to save and/or export found objects and their metadata, etc. As image repositories they function well, but there is much room for improvement, especially where search options and quality of metadata are concerned.
One factor to keep in mind here is that the migration of data from legacy websites to newer, more state-of-the-art content and/or asset management systems is costly. In many cases, the question then is how existing databases can be improved until funding is secured for entirely new solutions.

[5] See https://iiif.io

Potential

Can existing digital picture archives meet the demands of the visual history of education? This, of course, all depends on how one formulates these demands. To provide an example: suppose we wanted to conduct a comparative wallchart analysis of the depiction of World War II in Germany, Denmark and The Netherlands based upon the collection in http://historywallcharts.eu/. This can certainly be done, yet it requires quite some time, as there is no advanced keyword search that would allow us to retrieve all relevant images at once and/or per country; moreover, one needs to search using multiple languages to obtain all possibly relevant results. In this particular example the main point to address would be the quality and consistency of metadata. A different approach would be to use IIIF to interlink the original databases the wallcharts come from, obviating the need for a new application that brings them together in a new database.

Websites like DigiPorta allow users to export metadata, but only for individual records. If this could be done for all relevant records that a search yields, the options for analysis would be much greater. Consider the following example, a description by Dutch historian Martijn Kleppe of his research into iconic images used in Dutch history textbooks:

"This presentation will focus on the methods applied to establish which photos can be called iconic. One of the characteristics of iconic photos is the repetitive publication of the same image. We therefore made an inventory of all photos that were published in Dutch High School History textbooks during 1970-2000. A total of 412 books have been analysed and photos were digitised and added into a database, using software package Fotostation Pro. A total of 42 variables containing information about the photo and the textbook were written down for each photo. This enabled the researcher not only to 'read' the information in different types of photo-editing and viewing software, but we could also export the data into statistical software like SPSS, enabling us to calculate which photos were used most often, resulting in a list of most used photos."[6]

[6] Martijn Kleppe, 'Photographic Icons - Building and researching large-scale photo collections', Brainstorm Meeting - e-Humanities: Innovating Scholarship (29 March 2011, NIAS Wassenaar). URL: https://www.ehumanities.nl/v02/beheer/wp-content/uploads/2011/04/Booklet-e-Humanities-Meeting1.pdf

If we move to the realm of computer vision technologies, we have other options. Apart from using, for example, deep learning approaches to determine the type of image we are looking at (a drawing, photo, engraving, etc.), we could look for all kinds of categories of interest, such as depictions of war, cities, cars, animals, etc.
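To give a concrete flavour of the mechanics involved, the short Python sketch below runs an off-the-shelf image classifier over a folder of scans and prints the three most likely labels for each file. The folder name is hypothetical, and a generic ImageNet-trained model would of course need fine-tuning on historical material before its labels matched categories such as those named above; the sketch only illustrates the kind of pipeline such an analysis rests on.

import torch
from pathlib import Path
from PIL import Image
from torchvision import models

# Tag each image in a (hypothetical) folder of scans with the three most
# likely labels from a pretrained ImageNet classifier.
weights = models.ResNet50_Weights.IMAGENET1K_V2
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()

for path in sorted(Path("wallchart_scans").glob("*.jpg")):
    batch = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        probs = model(batch).squeeze(0).softmax(dim=0)
    top = probs.topk(3)
    labels = [weights.meta["categories"][i] for i in top.indices]
    scores = [f"{float(p):.2f}" for p in top.values]
    print(path.name, list(zip(labels, scores)))

The output of such a run is only a starting point; as the Kleppe example shows for manually assigned metadata, the analytically interesting step is exporting the resulting labels alongside existing metadata for counting and comparison.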
Concluding remarks

The above was only a very short exploration of digital picture archives in historical research and the visual history of education. What is clear to us is that a multilayered strategy is necessary to realise the potential of digital picture archives for the visual history of education more fully. Enriching and improving the quality and consistency of metadata of existing repositories is one important approach, as is exploring what improvements in browsing and advanced search options could be implemented. As to new solutions, migration to more modern systems is costly but of course preferred. Interlinking repositories, through IIIF and Linked Open Data, should be part of such an effort. For research, a way to export the metadata of search results is crucial to open up more possibilities for digital historical analysis with some of the existing repositories. Nonetheless, whereas the design and technical possibilities embedded in digital archives obviously shape and constrain what researchers can do with the materials located within them, a researcher's creativity, imagination and willingness to experiment are equally important. In the end, though, we have to return to the question posed in the beginning: what are the demands of the visual history of education from the perspective of its researchers? Only by formulating these can we hope to build corpuses that meet researchers' requirements.

work_2ojqsvuzizdklmgcf5o7knkreu ----

Published by: #EWAVirtual Conference Organisers, Maynooth University Arts and Humanities Institute
Edited by Sharon Healy, Michael Kurzmeier, Helena La Pina and Patricia Duffe
DOI: http://doi.org/10.5281/zenodo.4058013
This work is licensed under a Creative Commons Attribution 4.0 International License https://creativecommons.org/licenses/by/4.0/

Program Committee:

Co-Chairs
Sharon Healy, PhD Candidate and IRC Scholar in Digital Humanities, Maynooth University
Michael Kurzmeier, PhD Candidate and IRC Scholar in Digital Humanities/Media Studies, Maynooth University

#EWAVirtual Coordinators
Rebecca O'Neill, MA Historical Archives, Maynooth University
Helena La Pina, MA Historical Archives, Maynooth University

Programme Coordinator
Maria Ryan, Web archivist at the National Library of Ireland (NLI Web Archive)

Treasurer
Dr Joseph Timoney, Head of Department of Computer Science, Maynooth University

PR/Outreach
Julian Carr, MA Geography (Urban Studies), Maynooth University

Committee
Dr Martin Maguire, History/Digital Humanities, Dundalk Institute of Technology
Dr Thomas Lysaght, Deputy Head of Department of Computer Science, Maynooth University
Gavin MacAllister, Historian in Residence, Irish Military War Museum
Bernadette McKevitt, MA International Peace Studies, Trinity College Dublin

Table of Contents

Introduction
Welcome from Sharon Healy and Michael Kurzmeier, Conference Co-Chairs
#EWAVirtual Keynotes
#EWAVirtual Programme
#EWAVirtual Abstracts
Session 1: Archiving Initiatives
Session 2: Collaborations
Session 3: Archiving Initiatives (Lightning Round)
Session 4: Research Engagement & Access
Session 5: Archiving Initiatives
Session 6: Social Science & Politics
Session 7: Collaborations & Teaching
Session 8: Research of Web Archives
Session 9: Research Approaches
Session 10: Culture & Sports
Session 11: Research (Lightning Round)
Session 12: Youth & Family
Session 13: Source Code and App Histories
Session 14: AI and Infrastructures
Session 15: WARC and OAIS
Session 16: Web Archives as Scholarly Dataset
Session 17: An Irish Tale / Scéal Éireannach

Introduction

Engaging with Web Archives: 'Opportunities, Challenges and Potentialities' (#EWAVirtual), 21-22 September 2020, Maynooth University Arts and Humanities Institute, Co. Kildare, Ireland.

Maynooth University Arts and Humanities Institute is delighted to be hosting the first international EWA conference, which aims to:

● Raise awareness of the use of web archives and the archived web for research and education across a broad range of disciplines and professions in the Arts, Humanities, Social Sciences, Political Science, Media Studies, Information Science, Computer Science and more;
● Foster collaborations between web archiving initiatives, researchers, educators and IT professionals;
● Highlight how the development of the internet and the web is intricately linked to the history of the 1990s.

What is Web Archiving?

Web archiving was pioneered by the efforts of the Internet Archive in 1996; national libraries and cultural heritage organisations quickly realised the need to preserve information and content that was born on the web. It was this awareness that gave rise to the technologies, specifically web crawler programmes, used for web archiving. According to the International Internet Preservation Consortium, 'Web archiving is the process of collecting portions of the World Wide Web, preserving the collections in an archival format, and then serving the archives for access and use.' Due to serious concerns about the loss of web-born heritage, there has been a continuous growth of web archiving initiatives across the globe.

Why should we care?

For example, in Ireland: the first connection to the Internet as we know it (via TCP/IP) went live in Trinity College Dublin in June 1991. The first web server and website in Ireland can be traced back to 1991/92 in University College Cork (the CURIA project), and other websites followed in 1993 from IONA Technologies, TCD Maths, IEunet, and the University of Limerick. The growth of Irish websites was slow at first, but this changed by the end of 1995 due to international developments in browser technology and the growth of internet service providers in Ireland (see TechArchives, How the internet came to Ireland; David Malone, Early Irish Web Stuff).
THERE ARE SIMILAR SCENARIOS AROUND THE WORLD

As researchers begin to negotiate and write the history of their countries for the 1990s, whether social, cultural, political or even economic, it seems inevitable that they will also need to consider their histories of IT, in terms of how the introduction of the internet and the WWW began to infiltrate the fabric of life, work and play. The archived web is now an object of study in many countries, and a lot of work has already been done to build research infrastructures and networks. But more needs to be done to promote awareness of the availability of web archives, and of how they can be utilised as resources for research going into the future. And certainly, much more needs to be done in the realms of how web archives can be incorporated as resources in education, and how the use of web archives can be taught. International literature using web archives for research and historical inquiry is growing; yet the question of how to effectively use the archived web for qualitative and quantitative research still remains open, and how to integrate the use of web archives into teaching is a path yet to be explored. Furthermore, existing web archiving efforts find it hard to exchange knowledge and take on larger projects, partially due to the lack of opportunities for exchange between the disciplines and educators.

The EWA organisers would also like to extend their sincerest thanks and appreciation to the following organisations and institutions for their kind support and efforts to make this conference event possible:

● Maynooth University Arts and Humanities Institute
● Maynooth University, Department of Sociology
● Maynooth University, Department of Media Studies
● Maynooth University, Department of Computer Science
● Maynooth University, Department of History
● National Library of Ireland, Web Archive
● TechArchives, Ireland
● University College Cork, Digital Arts & Humanities
● University College Dublin, School of History
● AGREXIS AG

If you require more information or have any questions please feel free to email us: ewaconference@gmail.com

Follow us on Twitter:
● @EWAConf
● @MU_AHI
● #EWAVirtual

Welcome from Sharon Healy and Michael Kurzmeier
#EWAVirtual 2020 Conference Co-Chairs

On behalf of the organising committee of the first international Engaging with Web Archives conference, we would like to welcome all delegates to Maynooth University Arts and Humanities Institute for what we hope will be a stimulating event within the realms of engaging with web archives and web archiving activities.
We are proud to announce that this is the first web archive conference of its nature ever to be held in Ireland, and the first virtual conference to be held at Maynooth University in 2020. The programme contains 35 paper presentations and two distinguished keynote speakers, and we are delighted to extend a warm welcome to those keynote speakers: Prof. Niels Brügger of Aarhus University, Denmark, and Prof. Jane Winters of the School of Advanced Study, University of London, UK.

#EWAVirtual brings together speakers who are historians, digital humanists, media scholars, social scientists, information and IT professionals, computer scientists, data consultants, librarians and archivists from Ireland, the United Kingdom, Europe, Canada, and the United States. To all the speakers, we appreciate your kindness, support and patience when the initial conference, scheduled in the Spring of 2020, was postponed, and your continued enthusiasm, cooperation and collaboration when we announced it would become a virtual event.

We are also indebted to the Chairs of each session, every one of whom volunteered their services enthusiastically to assure the smooth running of the conference. Our gratitude is extended to the tireless efforts of the organising committee: its dedication, from the reviewing of papers to the logistical components of organising the first physical conference, and then finding the motivation and spirit to reorganise the event as a virtual conference, is greatly appreciated. To all at Maynooth University and the band of volunteers, we appreciate your time, talent, and storyboard of ideas; without your support and dedication, this conference would not be possible. A special shoutout goes to Professor Thomas O'Connor and Ann Donoghue from Maynooth University Arts and Humanities Institute, whose unfailing support, advice and kind assistance were invaluable throughout the entire process of planning both EWA conferences (from the physical to the virtual).

Also, to all our sponsors and supporters, we appreciate all your encouragement, sound advice and uplifting messages. In particular, we are grateful for the year-long encouragement and support of the committed staff at the National Library of Ireland. To all the speakers, guests, volunteers, chairs and attendees, we thank you. Together we have all played a part in the transformation of #EWA20 into #EWAVirtual.

All the Best
Sharon & Michael

#EWAVirtual KEYNOTES

Professor Niels Brügger
The variety of European web archives - potential effects for future humanities research

The aim of this keynote is to open up a discussion of how the great variety of European web archives may affect future humanities research based on the archived web as a source. The keynote is divided into two main sections. First, the different forms of web archiving in Europe are briefly mapped, with a focus on which countries have a web archive, their archiving strategies, and their access conditions. Second, it is discussed how this state of affairs may affect transnational research projects spanning several web archives. The case of the national Danish web domain is used as a stepping stone to evaluate to what extent such a study can be replicated in other European countries, thus enabling transnational comparisons.

-------------------------------------------------------------------------------------------

Niels Brügger is a Professor in Media Studies, Head of NetLab, part of the Danish Digital Humanities Lab, and head of the Centre for Internet Studies at Aarhus University in Denmark.
He is a Coordinator of the European network RESAW, a Research Infrastructure for the Study of Archived Web Materials, and the managing editor of the international journal Internet Histories: Digital Technology, Culture and Society. Professor Brügger has initiated the research projects "Probing a Nation's Web Domain - the Historical Development of the Danish Web" (2014-) and "The history of dr.dk, 1996-2006" (2007-), and co-initiated the research infrastructure project NetLab (2012-17) within the Digital Humanities Lab. His research interests are the history of the Internet as a means of communication, and Digital Humanities, including archiving the Internet as well as the use of digital research tools. Other interests include media theory, the Internet, and the relation between the two, with a view to (re)evaluating the status and relevance of existing media theories and methods.

Recent publications include:

● The Historical Web and Digital Humanities, eds. N. Brügger and D. Laursen (Routledge, 2019)
● The SAGE Handbook of Web History, eds. N. Brügger and I. Milligan (SAGE, 2019)
● The Archived Web: Doing History in the Digital Age (MIT Press, 2018)
● Web 25: Histories from the First 25 Years of the World Wide Web, ed. N. Brügger (New York: Peter Lang, 2017)

Professor Jane Winters
Web archives as sites of collaboration

Openness to collaboration has been one of the defining characteristics of web archiving and web archive studies from the outset. The challenges posed by the archiving and preservation of born-digital data, including web archives, are simply too great to be solved by individuals or single organisations. This keynote will present some of the partnerships which have moved the field forward in the past decade, suggest some new avenues for collaboration in the future, and consider how the required knowledge and skills can be developed within universities and the cultural heritage sector to ensure that current web archiving initiatives are sustainable.

-------------------------------------------------------------------------------------------

Jane Winters is a Professor of Digital Humanities and Pro-Dean for Libraries in the School of Advanced Study at the University of London. She is responsible for developing digital humanities in the School and has led or co-directed a range of digital projects, including most recently Big UK Domain Data for the Arts and Humanities; Digging into Linked Parliamentary Metadata; Traces through Time: Prosopography in Practice across Big Data; the Thesaurus of British and Irish History as SKOS; and Born Digital Big Data and Approaches for History and the Humanities. Professor Winters is a Fellow and Councillor of the Royal Historical Society, and a member of RESAW (Research Infrastructure for the Study of the Archived Web), the Academic Steering & Advocacy Committee of the Open Library of Humanities, the Advisory Board of the European Holocaust Research Infrastructure, the Advisory Board of Cambridge Digital Humanities, and the UK UNESCO Memory of the World Committee.
Jane's research interests include digital history, born-digital archives (particularly the archived web), big data for humanities research, peer review in the digital environment, text editing and open access publishing.

Recent publications include:

● 'Giving with one hand, taking with the other: e-legal deposit, web archives and researcher access', in Electronic Legal Deposit: Shaping the Library Collections of the Future, ed. Paul Gooding and Melissa Terras (London: Facet Publishing, 2019)
● 'Negotiating the born digital: a problem of search', Archives and Manuscripts, 47:4 (2019)
● 'Negotiating the archives of UK web space', in The Historical Web and Digital Humanities: the Case of National Web Domains, ed. Niels Brügger and Ditte Laursen (London: Routledge, 2019)
● 'Web archives and (digital) history: a troubled past and a promising future?', in The SAGE Handbook of Web History, ed. Niels Brügger and Ian Milligan (SAGE Publications Ltd., 2019)

#EWAVirtual Programme

DAY ONE: 21 September 2020

9.45 (IRE) / 10.45 (CEST) WELCOME
Professor Tom O'Connor, Director of Maynooth University Arts and Humanities Institute
Michael Kurzmeier, #EWAVirtual Co-Chair (Maynooth University)

10.00 (IRE) / 11.00 (CEST) KEYNOTE
Chair: Joanna Finegan (National Library of Ireland)
Professor Niels Brügger, Aarhus University: The variety of European web archives - potential effects for future humanities research

11.00 (IRE) / 12.00 (CEST) Session 1: Archiving Initiatives
Chair: Jason Webber (UK Web Archive, British Library)
● Maria Ryan (National Library of Ireland): The National Library of Ireland's Web Archive: preserving Ireland's online life for tomorrow
● Sara Day Thomson (University of Edinburgh): Developing a Web Archiving Strategy for the Covid-19 Collecting Initiative at the University of Edinburgh
● Dr. Kees Teszelszky (KB - National Library of the Netherlands): Internet for everyone: the selection and harvest of the homepages of the oldest Dutch provider XS4ALL (1993-2001)

12.00 (IRE) / 13.00 (CEST) Session 2: Collaborations
Chair: Patricia Duffe (Maynooth University)
● Dr. Brendan Power (The Library of Trinity College Dublin): Leveraging the UK Web Archive in an Irish context: Challenges and Opportunities
● Sarah Haylett & Patricia Falcao (Tate): Creating a web archive at Tate: an opportunity for ongoing collaboration

12.40 (IRE) / 13.40 (CEST) Session 3: Archiving Initiatives (lightning round)
Chair: Rebecca O'Neill (Maynooth University)
● Rosita Murchan (Public Record Office of Northern Ireland): PRONI Web Archive: A Collaborative Approach
● Inge Rudomino & Marta Matijević (Croatian Web Archive, National and University Library in Zagreb - NSK): An overview of 15 years of experience in archiving the Croatian web
● Robert McNicol (Kenneth Ritchie Wimbledon Library): The UK Web Archive and Wimbledon: A Winning Combination

14.00 (IRE) / 15.00 (CEST) Session 4: Research Engagement & Access
Chair: Chris Beausang (Maynooth University)
● Dr.
Peter Mechant; Sally Chambers; Eveline Vlassenroot (Ghent University); Friedel Geeraert (KBR - Royal Library and the State Archives of Belgium): Piloting access to the Belgian web-archive for scientific research: a methodological exploration
● Sharon Healy (Maynooth University): Awareness and Engagement with Web Archives in Irish Academic Institutions

14.40 (IRE) / 15.40 (CEST) / 09:40 (EDT) Session 5: Archiving Initiatives
Chair: Sara Day Thomson (University of Edinburgh)
● Anisa Hawes (Independent Curatorial Researcher): Archiving 1418-Now using Rhizome's Webrecorder: observations and reflections
● Nicole Greenhouse (New York University Libraries): Managing the Lifecycle of Web Archiving at a Large Private University

15.30 (IRE) / 16.30 (CEST) Session 6: Social Science & Politics
Chair: Dr. Claire McGinn (Institute of Art, Design and Technology, Dún Laoghaire)
● Benedikt Adelmann MSc & Dr. Lina Franken (University of Hamburg): Thematic web crawling and scraping as a way to form focussed web archives
● Andrea Prokopová (Webarchiv, National Library of the Czech Republic): Metadata for social science research
● Dr. Derek Greene (University College Dublin): Exploring Web Archive Networks: The Case of the 2018 Irish Presidential Election

16.10 (IRE) / 17.10 (CEST) / 11:10 (EDT) Session 7: Collaborations & Teaching
Chair: Dr. Joseph Timoney (Maynooth University)
● Olga Holownia (International Internet Preservation Consortium): IIPC: training, research, and outreach activities
● Dr. Juan-José Boté (Universitat de Barcelona): Using web archives to teach and opportunities on the information science field

16.50 (IRE) / 17.50 (CEST) / 10:50 (CST) Session 8: Research of Web Archives
Chair: Sally Chambers (Ghent Centre for Digital Humanities, Ghent University)
● Bartłomiej Konopa (State Archives in Bydgoszcz; Nicolaus Copernicus University): Web archiving - professionals and amateurs
● Prof. Lynne M. Rudasill & Dr. Steven W. Witt (University of Illinois at Urbana-Champaign): Opportunities for Use, Challenges for Collections: Exploring Archive-It for Sites and Synergies

DAY TWO: 22 September 2020

9.45 (IRE) / 10.45 (CEST) WELCOME
Michael Kurzmeier, EWA Co-Chair (Maynooth University)

10.00 (IRE) / 11.00 (CEST) KEYNOTE
Chair: Maria Ryan (National Library of Ireland)
Professor Jane Winters, School of Advanced Study, University of London: Web archives as sites of collaboration

11.00 (IRE) / 12.00 (CEST) Session 9: Research Approaches
Chair: Jason Webber (UK Web Archive, British Library)
● Dr. Peter Webster (Independent Scholar, Historian and Consultant): Digital archaeology in the web of links: reconstructing a late-90s web sphere
● Michael Kurzmeier (Maynooth University): Web defacements and takeovers and their role in web archiving

11.40 (IRE) / 12.40 (CEST) Session 10: Culture & Sport
Chair: Gavin Mac Allister (Irish Military War Museum)
● Dr. Philipp Budka (University of Vienna; Free University Berlin): MyKnet.org: Traces of Digital Decoloniality in an Indigenous Web-Based Environment
● Helena Byrne (British Library): From the sidelines to the archived web: What are the most annoying football phrases in the UK?

12.30 (IRE) / 13.30 (CEST) Session 11: Research (lightning round)
Chair: Dr Julie Brooks (School of History, University College Dublin)
● Caio de Castro Mello Santos & Daniela Cotta de Azevedo Major (School of Advanced Study, University of London): Tracking and Analysing Media Events through Web Archives
● Dr.
Eamonn Bell (Trinity College Dublin): Reanimating the CDLink platform: A challenge for the preservation of mid-1990s Web-based interactive media and net.art
● Hannah Connell (King's College London; British Library): Curating culturally themed collections online: The Russia in the UK Special Collection, UK Web Archive

14.00 (IRE) / 15.00 (CEST) / 9.00 (EST) Session 12: Youth & Family
Chair: Dr. Lina Franken (University of Hamburg)
● Katie Mackinnon (University of Toronto): DELETE MY ACCOUNT: Ethical Approaches to Researching Youth Cultures in Historical Web Archives
● Dr. Susan Aasman (University of Groningen): Changing platforms of ritualized memory practices. Assessing the value of family websites

14.40 (IRE) / 15.40 (CEST) Session 13: Source code and app histories
Chair: Prof. David Malone (Hamilton Institute, Maynooth University)
● Dr. Anne Helmond (University of Amsterdam) & Fernando van der Vlist (Utrecht University): Platform and app histories: Assessing source availability in web archives and app repositories
● Dr. Janne Nielsen (Aarhus University): Exploring archived source code: computational approaches to historical studies of web tracking

15.30 (IRE) / 16.30 (CEST) / 10.30 (EST) Session 14: AI and Infrastructures
Chair: Dr. Juan-José Boté (Universitat de Barcelona)
● Mark Bell; Tom Storrar; Dr. Eirini Goudarouli; Pip Willcox (The National Archives, UK); David Beavan; Dr. Barbara McGillivray; Dr. Federico Nanni (The Alan Turing Institute): Cross-sector interdisciplinary collaboration to discover topics and trends in the UK Government Web Archive: a reflection on process
● Dr. Jessica Ogden (University of Southampton) & Emily Maemura (University of Toronto): A tale of two web archives: Challenges of engaging web archival infrastructures for research

16.10 (IRE) / 17.10 (CEST) Session 15: WARC and OAIS
Chair: Kieran O'Leary (National Library of Ireland)
● Consultative Committee for Space Data Systems (CCSDS), Data Archive Interoperability (DAI) Working Group; Michael W. Kearney III; David Giaretta; John Garrett; Steve Hughes: What's missing from WARC?

16.45 (IRE) / 17.45 (CEST) / 08:45 (PDT) Session 16: Web Archives as Scholarly Dataset
Chair: Michael Kurzmeier (Maynooth University)
● Dr. Helge Holzmann & Mr. Jefferson Bailey (Internet Archive): Web Archives as Scholarly Dataset to Study the Web

17.15 (IRE) / 18.15 (CEST) An Irish Tale / Scéal Éireannach

17.45 (IRE) / 18.45 (CEST) The Future of EWA
Sharon Healy & Michael Kurzmeier (Maynooth University)

#EWAVirtual Abstracts

Session 1: Archiving Initiatives

The National Library of Ireland's Web Archive: preserving Ireland's online life for tomorrow
Maria Ryan (National Library of Ireland)

Keywords: Collection development, national domains, web archives, research, datasets

ABSTRACT

The National Library of Ireland (NLI) was founded in 1877 and its mission remains the same today: to collect, protect and make available the memory of Ireland. The library cares for a collection of over ten million physical items, with collections including manuscripts, photographs, prints and drawings and an extensive ephemera collection. In the 21st century, the NLI is working towards meeting the challenges of the digital world: collecting, preserving and providing access to a born-digital record of Irish life. This presentation aims to examine the NLI web archive and highlight its importance to the documentation of Irish society and culture.
In 2011, the general and presidential elections provided the catalyst for a pilot web-archiving project. Following the success of this project, the NLI focused on establishing the web-archiving programme by archiving political, cultural and social websites, capturing a record of elections, budgets, the decade of commemorations and historic events such as the 2015 marriage referendum. In 2016, the NLI received its first full-time web archivist and launched a significant promotional drive around the 2016 commemorative project 'Remembering 1916, Recording 2016'. In 2017, the NLI also undertook a domain crawl of the Irish web, allowing for the capture of a wider range of websites and greater amounts of data when compared with the selective web archive. The 2017 crawl encompassed all of the Irish top-level domain and other relevant websites that could be recognised as being hosted in Ireland but outside the .ie domain. It also used language detection software to identify Irish-language websites outside the national domain. The crawl amounted to almost 40 TB of unique data, which is preserved in the NLI. However, due to legislative restrictions, this data cannot be made available to researchers.

In the past nine years, the NLI web archive has grown and developed into what is now an established collecting strand in the NLI. Workflow development and a comprehensive collecting strategy have seen the web archive grow and mature, and the NLI has embarked upon new opportunities for collaboration and research. Collaboration is at the heart of the values of the NLI, and it has helped us broaden our collections and provide datasets to new researchers. The future of research lies largely in born-digital archives. The social, political and historical researchers of the future will require a record of the 21st century in Ireland; in other words, they will need web archives. This presentation will explore how the NLI is dedicated to building an Irish web archive that will document Irish life for decades to come.

Biography: Maria Ryan is an assistant keeper and web archivist at the National Library of Ireland. A qualified archivist, she is co-chair of the IIPC training working group and a member of the NLI's diversity and inclusion committee.

Developing a Web Archiving Strategy for the Covid-19 Collecting Initiative at the University of Edinburgh
Sara Day Thomson (Digital Archivist, Centre for Research Collections, University of Edinburgh)

Keywords: Covid-19, web archiving strategy, challenges, opportunities for collaboration, web archive collections

ABSTRACT

In this talk, the Digital Archivist at the University of Edinburgh will discuss the process (so far) for developing a strategy for capturing and preserving web-based submissions to their Collecting Covid-19 Initiative. She will also present plans for using this process as a springboard to develop a wider institutional programme(s) of web archiving.

In April, the Centre for Research Collections (CRC) put out an open call for members of the university community to submit materials that document their experiences of the Covid-19 pandemic and lockdown [1]. Depositors are invited to submit their digital records using a web form embedded on the university website [2]. At the time of the open call, the CRC did not have an established web archiving programme. Therefore, a new strategy had to be developed in response to the influx of web-based submissions (and other relevant web pages identified by the collecting team).
This strategy also had to address the identified concerns of the Initiative: speedy deployment, but also handling sensitive material, understanding potential research uses, and balancing metadata requirements with low-barrier submission requirements. The project team is now in the early stages of a partnership with the UK Web Archive through the National Library of Scotland. The CRC team will curate a special collection for the Collecting Covid-19 Initiative using the UKWA’s infrastructure and guidance. Recognising some of the limitations of this approach, the Digital Archivist will supplement the Collecting Covid-19 collection with manual captures using open-source tools, such as Conifer / Webrecorder Desktop and TAGS.

In order to make the most of this strategy, the Digital Archivist has invited the project team to view these steps as a pilot study for wider web archiving programmes. This pilot will include an evaluation of methods for:
● gathering and analysing user needs and requirements
● choosing an approach, either collaboration with the UKWA or open-source tools
● training, both staff and researchers, to capture web content as part of their work
● outreach to the wider university community to raise awareness of web archiving and of available archived web resources

Currently, the focus is on finding a robust and reliable way to capture, curate, and preserve web-based submissions to the Covid-19 Collecting Initiative. However, in the coming months, the Digital Archivist hopes to lay the groundwork for next steps. First and foremost, she aims to host a series of focus groups (potentially virtually) with key researchers, in collaboration with the Research Data Support team, to better gather information about research needs and to raise the profile of available archived web content.

References:
[1] University of Edinburgh, Staff News, ‘Covid-19 experiences to be documented’, https://www.ed.ac.uk/news/students/2020/covid-19-experiences-to-be-documented
[2] University of Edinburgh, Collecting Covid-19 Initiative, https://www.ed.ac.uk/information-services/library-museum-gallery/crc/collecting-covid-19-initiative

Biography: Sara Day Thomson is Digital Archivist at the University of Edinburgh, where she looks after the management and preservation of digital materials across collections. She joined the University from the Digital Preservation Coalition, where she was Research Officer, supporting the development of new methods and technologies to ensure long-term access to digital data. She reconvened the DPC’s Web Archiving and Preservation Working Group, a forum for organisations to share experiences in archiving web content. She also contributed to the development of the IIPC & DPC Beginner Web Archiving Training materials and is the author of Preserving Social Media, a DPC Technology Watch Report.

Internet for everyone: the selection and harvest of the homepages of the oldest Dutch provider XS4ALL (1993-2001)
Dr. Kees Teszelszky (Koninklijke Bibliotheek - National Library of the Netherlands)
Keywords: web archiving, web archaeology, web incunables, homepages, early web

ABSTRACT
“Web incunables” can be defined as those websites which were published in the first stage of the world wide web, between 1990 and 1998. The early sites of the nineties were created at the very start of publishing texts on the web and mark the frontier between analogue prints on paper and digital publications on the web.
The first Dutch homepage and web incunable was put online in 1993, the same year that one of the oldest Dutch internet providers, XS4ALL (“Access for All”), started to offer its services to customers. The provider was founded by hackers and techno-anarchists that same year. After its launch in May 1993 it attracted a large group of creative Dutch internet pioneers, who built at least 10,000 homepages between 1993 and 2001, a large part of which is still online in some form.

We can consider the remaining homepages the most interesting born-digital Dutch heritage collection still online and waiting to be studied. As XS4ALL was promoting and facilitating the building of these sites, the early web designers, artists, activists, writers and scientists were eagerly experimenting with the possibilities of the new medium in content, design and functionality. Because XS4ALL was seen not so much as a company but as a community, many customers have remained faithful to this provider until now. As a result, a large number of homepages of the early Dutch web can still be found at this provider.

This heritage is, however, in danger. The Dutch telephone company KPN took over XS4ALL in 1998 and announced in January 2019 that it would end the brand in the near future. This is why Koninklijke Bibliotheek - National Library of the Netherlands (KB-NL) started a web archiving project the same year to identify and rescue as many web incunables and early homepages as possible that are still hosted by this provider. The project was generously sponsored by SIDN-fonds and Stichting Internet4ALL.

This paper describes the method and first results of KB-NL's ongoing pilot research project on internet archaeology and web incunables. It concerns web archiving a selection of web incunables published on the Dutch web before 2001 which mirror the development of Dutch online culture on the web. I will describe the methods and sum up the experiences of selecting and harvesting homepages and of mapping Dutch digital culture online through link analysis of this collection. I will also discuss the characteristics of web materials and archived web materials, among others the first Dutch interactive 3D house, a virtual metro line for the digital city of Amsterdam, the “Stone Age Computer” and the first Dutch online literature magazine. I will also explain the use of these various materials (harvested websites, metadata link clouds, context information) for future research on the history of the Dutch web.

Biography: Kees Teszelszky (1972) is a historian and curator of the digital collections at the Koninklijke Bibliotheek - National Library of The Netherlands. He graduated from the University of Leiden (Political Science, 1999) and the University of Amsterdam (East European Studies, 1998) and obtained his PhD at the University of Groningen (Cultural History, 2006). He has been involved in research on web archiving and born-digital sources since 2012. His present research field covers the selection, harvest and presentation of born-digital sources at the KB. He is currently involved in projects on internet archaeology in the Netherlands, mapping the Frisian and Dutch national web domain, online news and the historic sources of our post-truth era.
Session 2: Collaborations

Leveraging the UK Web Archive in an Irish context: Challenges and Opportunities
Dr Brendan Power (The Library of Trinity College Dublin)
Keywords: web archives, collaboration, legal deposit, 1916 Easter Rising

ABSTRACT
This paper will discuss a project to curate an archive of websites undertaken by The Library of Trinity College Dublin. The context for this project was the UK legal deposit environment, in which the six Legal Deposit Libraries (LDLs) work together to help preserve the UK’s knowledge and memory. In 2013 the legal deposit remit was extended to include non-print, electronically published material, which means the LDLs may now capture and archive any freely available websites that are published or hosted in the UK. This happens in the Legal Deposit UK Web Archive, with the British Library providing the technical and curatorial infrastructure, and all LDLs contributing both at the strategic and planning level and through curating themed collections. In this paper I will present a case study which demonstrates how The Library of Trinity College Dublin has explored the challenges and opportunities of utilising the research potential of this vast new resource.

The 1916 Easter Rising collection was a collaborative project in 2015/2016 between The Library of Trinity College Dublin (University of Dublin), the Bodleian Libraries (University of Oxford), and the British Library. The project aimed to identify, collect, and preserve websites that contribute to an understanding of the 1916 Easter Rising, enabling critical reflection on both the Rising itself and how it was commemorated in 2016. The project was a test case for effective collaboration between libraries in multiple jurisdictions, helping to explore how themed, curated web archive collections can promote the potential of web archives to a wider audience. The presentation will review the project and outline the challenges and opportunities that emerged as it progressed. In particular, it will highlight the challenges that arose from working across multiple jurisdictions, and the implications of different legislative frameworks for archive curation and collection building.

Biography: Brendan Power is Digital Preservation Librarian at The Library of Trinity College Dublin. He holds a BA from Dublin City University, an MPhil and PhD in History from Trinity College, the University of Dublin, and an MLIS from University College Dublin. A former Postdoctoral Research Fellow at Trinity College Dublin, he acted as the Web Archive Project Officer on the 1916 Easter Rising Web Archive and has previously published on this project.

Creating a web archive at Tate: an opportunity for ongoing collaboration
Sarah Haylett (Tate)
Patricia Falcao (Tate)
Keywords: web archives, net art, digital preservation, web-based art, archives

ABSTRACT
In the year 2000, Tate commissioned the first of fifteen net artworks for the then newly launched Tate website, Tate Online, which was devised as the fifth gallery. The commissioned artworks were meant to attract and challenge visitors to this still new online space. Initially these works were closely entwined with the main website and were highlighted on its front page, but as the number of works grew and Tate Online changed focus, the works were grouped together under the Intermedia Art microsite alongside contextualising texts, a programme of events and podcasts.
The Intermedia website still exists online, but it has not been updated since 2012 and sits on a server that is now outdated and will eventually have to be decommissioned. Tate does not archive its own website; as Tate is a public body, this is carried out by The National Archives UK Government Web Archive. That archive has a significant number of captures of the Intermedia website, but it is not consistent in capturing its interactive content, which was a key feature of several of the commissioned artworks. Therefore, due to these gaps and missing contextual information, no representative or effective archived version of the Intermedia website or the artworks is available.

As part of the Andrew W. Mellon Foundation funded project Reshaping the Collectible: When Artworks Live in the Museum, a team of interdisciplinary researchers is looking at the history of the Net Art commissioning programme and the strategies to preserve the artworks and website, as well as looking to build Tate’s capacity to collect internet art. The project is also an opportunity to go beyond the artwork collection and consider the same set of issues from the perspective of institutional records and the Tate Archive.

Developments in digital preservation, web archiving and, more specifically, small-scale web recording and emulation mean that this was the perfect moment to undertake extensive captures and documentation of the Intermedia Art website and individual artworks as they exist now. This has included extensive discussion with the artists, who continue to host the works on their own servers.

This paper will present the different but complementary perspectives of Tate’s archive and Time-Based Media Conservation as they have worked together to understand the intricacies of documenting, conserving and maintaining the integrity and accessibility of web-based art and its online records in the contemporary art museum. It will discuss the tools and methodology used to archive the website and the plans to make it available as Tate’s first website archived as a public record.

Biographies: Patricia Falcao is a Time-based Media Conservator with a broad interest in the preservation of the digital components of contemporary artworks. She has worked at Tate since 2008 and currently works on the acquisition of media-based artworks into the Collection. She collaborates with Tate’s Research Department on the Reshaping the Collectible project, looking at the preservation of websites in Tate’s context, as well as working with Tate’s Technology team to continue to develop Tate's strategy for the preservation of high-value digital assets. Patricia completed her MA at the University of the Arts in Bern with a thesis on risk assessment for software-based artworks. She continues to develop research in this field in her role as a Doctoral Researcher in the AHRC-funded Collaborative Doctoral Program between Tate Research and the Computing Department at Goldsmiths College, University of London. The subject of her research is the practices of software-based art preservation in collections, by artists and in the gaming industry.

Sarah Haylett is a professional archivist; she received her MA in Archives and Records Management from UCL in 2014. She joined Tate in June 2018, having previously worked at Zaha Hadid Architects, The Photographers’ Gallery and with private collectors.
As part of the Reshaping the Collectible: When Artworks Live in the Museum project team, her research interests are rooted in the relationship between archival and curatorial theory and how, beyond a culture of compliance, Tate’s record keeping can be more intuitive to research and collecting practice. She is very interested in sites of archival creation and intention, and how these are represented in artistic practice and the contemporary art museum.

Session 3: Archiving Initiatives (Lightning Round)

PRONI Web Archive: A collaborative approach
Rosita Murchan (Public Record Office of Northern Ireland - PRONI)
Keywords: Collaborations, challenges, resources, permissions, partnerships

ABSTRACT
The Public Record Office of Northern Ireland (PRONI) web archive has been building its collection of websites for almost ten years, focusing initially on capturing the websites of our local councils and Government departments and those deemed historically or culturally important to Northern Ireland. However, unlike the UK and Ireland, Northern Ireland does not have legal deposit status, and as a result we are sometimes limited in what we can capture. As the web archive has grown and evolved organically over the years, with more and more requests for websites to be archived, PRONI has had to look at the issue of gaining permissions (and capturing sites without any legal deposit legislation) and at how we can continue to grow our collection with the limited resources available to us. One of the ways in which we are able to expand the scope of the collection is through collaboration: not only with other institutions such as the British Library, which allows us to capture sites that would usually be outside our remit, but also by working in partnership with the other sections within our organisation. The aim of this short presentation will be to look in more depth at PRONI’s work on the web archive, the strategies we have used to build it, our collaborative projects, and the challenges and obstacles we face as we continue to grow.

Biography: Rosita Murchan has worked with the Public Record Office for two years and has been working solely on the web archive for one year.

An overview of 15 years of experience in archiving the Croatian web
Inge Rudomino (Croatian Web Archive, National and University Library in Zagreb – NSK)
Marta Matijević (Croatian Web Archive, National and University Library in Zagreb – NSK)
Keywords: legal deposit, Croatian Web Archive, web archiving, open access, online publication

ABSTRACT
The National and University Library in Zagreb (NSK) began archiving the Croatian web in 2004, in collaboration with the University of Zagreb University Computing Centre (SRCE), when the Croatian Web Archive (HAW) was established. The legal basis for archiving the web was the Law on Libraries (1997), which subjected online publications to legal deposit. To harvest the web, HAW uses three different approaches: selective harvesting, .hr domain harvesting and thematic harvesting. In the period from 2004 to 2010, HAW was based only on the concept of selective harvesting, in which each resource is selected for archiving according to established Selection Criteria. Each title has a full level of bibliographic description and is retrievable in the library's online catalogue, providing the end user with a high-quality archived copy. Special care is given to news portals, which are archived daily. A URN:NBN identifier is assigned to each title and archived copy to ensure permanent access, which is of great importance for future citations.
Since 2010, HAW has conducted annual crawls of the .hr domain and periodically harvests websites related to topics and events of national importance. HAW's primary task is to ensure that harvested resources are preserved in their entirety, in their original format and with all the accompanying functionality. The majority of harvested content is in open access.

The poster will present the fifteen years of experience of the National and University Library in Zagreb (NSK) in managing web resources, with emphasis on selective, domain and thematic harvesting, as well as the new website design with new functionalities.

Biographies: Inge Rudomino is a senior librarian at the Croatian Web Archive, National and University Library in Zagreb (Croatia). She graduated in Information Sciences (Librarianship) at the Faculty of Philosophy, University of Zagreb. From 2001 to 2007 she worked as a cataloguer in the Department for Cataloguing Foreign Publications at the National and University Library in Zagreb. Since 2007 she has worked at the Croatian Web Archive on tasks which include identification, selection, cataloguing, archiving, maintaining the Croatian Web Archive, communication with publishers, and promotion. She publishes articles in Croatian and in conference proceedings in the field of web archiving.

Marta Matijević, MA, is a librarian at the Croatian Web Archive, National and University Library in Zagreb. She graduated in Library and Information Science at the Faculty of Humanities and Social Sciences in Osijek in 2016. From 2016 to 2018 she worked in academic and school libraries. Since 2019 she has worked at the Croatian Web Archive on identification, selection, cataloguing, archiving, maintaining the Archive, communication with publishers and promotion. Her interests are web archiving and information theories, and she has published papers in these fields.

The UK Web Archive and Wimbledon: A Winning Combination
Robert McNicol (Kenneth Ritchie Wimbledon Library, Wimbledon Lawn Tennis Museum)
Keywords: Tennis, Sport, Collaboration, Heritage, Preservation

ABSTRACT
Since January 2019, the Kenneth Ritchie Wimbledon Library, the world's largest tennis library, has been collaborating with the British Library on a web archiving project. The Wimbledon Library is curating the Tennis subsection of the UK Web Archive Sports Collection. The UK Web Archive aims to collect every UK website at least once per year, and it also works with subject specialists to curate collections of websites on specific subjects. The ultimate aim is for the Tennis collection to contain all UK-based tennis-related websites. This will include websites relating to tournaments, clubs, players and governing bodies. It will also include social media feeds of individuals or organisations involved with tennis in the UK. Already we have collected the Twitter feeds of all male and female British players with a world ranking. We have also archived Wimbledon’s own digital presence, including the award-winning Wimbledon.com, which celebrates its 25th anniversary in 2020. In addition, we have archived Wimbledon’s social media accounts, including those belonging to the Museum and the Wimbledon Foundation, and its international digital presence in the form of the Wimbledon page on Weibo, a Chinese social media site. This falls within the scope of the project because, although the site is not an English-language one, it is based in the UK.

The collaboration is mutually beneficial. For a small, specialist library such as ours, there are many advantages to having a partnership with the British Library.
Equally, the UK Web Archive benefits from our specialist expertise in curating its Tennis collection. In many ways, a project like this one is perfect for Wimbledon. Although our history and heritage are at the heart of everything we do, we’re always innovating and striving to improve as well. That’s why this project, which involves using the latest technology to preserve tennis history, is so exciting for us. This presentation will explain why the Kenneth Ritchie Wimbledon Library wanted to get involved in web archiving, how the collaboration with the UK Web Archive came about, and what has been collected so far.

Biography: Since March 2016 I have worked as the Librarian of the Kenneth Ritchie Wimbledon Library, which is part of the Wimbledon Lawn Tennis Museum. Prior to this, I had a long career as a media librarian, mostly working in sport. From 2008 to 2016 I was Sport Media Manager at BBC Scotland in Glasgow. Before that, I worked for the BBC in London and Aberdeen, and briefly for ITV Sport and Sky Sports. I studied History at the University of Glasgow and Information and Library Studies at the University of Strathclyde.

Session 4: Research Engagement & Access

Piloting access to the Belgian web-archive for scientific research: a methodological exploration
Dr. Peter Mechant (Ghent University)
Sally Chambers (Ghent University)
Eveline Vlassenroot (Ghent University)
Friedel Geeraert (KBR - Royal Library and the State Archives of Belgium)
Keywords: research use of web archives, web-archiving, digital humanities, born-digital collections, digital research labs

ABSTRACT
The web is fraught with contradiction. On the one hand, the web has become a central means of information in everyday life and therefore holds the primary sources of our history, created by a large variety of people (Milligan, 2016; Winters, 2017). Yet much less importance is attached to its preservation, meaning that potentially interesting sources for future (humanities) research are lost. Web archiving is therefore a direct result of the computational turn and has a role to play in knowledge production and dissemination, as demonstrated by a number of publications (e.g. Brügger & Schroeder, 2017) and research initiatives related to the research use of web archives (e.g. https://resaw.eu/). However, conducting research and answering research questions based on web archives, in short ‘using web archives as a data resource for digital scholars’ (Vlassenroot et al., 2019), demonstrates that this so-called computational turn in the humanities and social sciences (i.e. the increased incorporation of advanced computational research methods and large datasets into disciplines which have traditionally dealt with considerably more limited collections of evidence) indeed requires new skills and new software.

In December 2016, a pilot web-archiving project called PROMISE (PReserving Online Multiple Information: towards a Belgian StratEgy) was funded. The aim of the project was to (i) identify current best practices in web-archiving and apply them to the Belgian context, (ii) pilot Belgian web-archiving, (iii) pilot access to (and use of) the pilot Belgian web archive for scientific research, and (iv) make recommendations for a sustainable web-archiving service for Belgium. Now that the project is moving towards its final stages, the project team is focusing on the third objective, namely how to pilot access to the Belgian web archive for scientific research.
The aim of this presentation is to discuss how the PROMISE team approached piloting access to the Belgian web-archive for scientific research, including: a) reviewing how existing web-archives provide access to their collections for research; b) assessing the needs of researchers based on a range of initiatives focussing on research use of web-archives (e.g. RESAW, BUDDAH, WARCnet, the IIPC Research Working Group, etc.); and c) exploring how the five personas created as part of the French National Library's Corpus project (Moiraghi, 2018) could help us understand the different types of academic researchers that might use web archives in their research. Finally, we will introduce the emerging Digital Research Lab at the Royal Library of Belgium (KBR), part of a long-term collaboration with the Ghent Centre for Digital Humanities (GhentCDH), which aims to facilitate data-level access to KBR's digitised and born-digital collections and could potentially provide the solution for offering research access to the Belgian web-archive.

Bibliography
Brügger, N. & Schroeder, R. (Eds.). (2017). The web as history: Using web archives to understand the past and present. London: UCL Press.
Milligan, I. (2016). Lost in the infinite archive: the promise and pitfalls of web archives. International Journal of Humanities and Arts Computing, 10(1), 78-94. DOI: 10.3366/ijhac.2016.0161.
Moiraghi, E. (2018). Le projet Corpus et ses publics potentiels: Une étude prospective sur les besoins et les attentes des futurs usagers. [Rapport de recherche] Bibliothèque nationale de France. ⟨hal-01739730⟩
Vlassenroot, E., Chambers, S., Di Pretoro, E., Geeraert, F., Haesendonck, G., Michel, A., & Mechant, P. (2019). Web archives as a data resource for digital scholars. International Journal of Digital Humanities, 1(1), 85-111. https://doi.org/10.1007/s42803-019-00007-7
Winters, J. (2017). Breaking into the mainstream: demonstrating the value of internet (and web) histories. Internet Histories, 1(1-2), 173-179. https://doi.org/10.1080/24701475.2017.1305713.

Biographies: Dr Peter Mechant holds a PhD in Communication Sciences from Ghent University (2012). After joining the research group mict (www.mict.be), Peter has mainly been working on research projects related to e-gov (open and linked data), smart cities, online communities and web archiving. As a senior researcher, he is currently involved in managing projects and project proposals at European, national and regional levels.

Sally Chambers is Digital Humanities Research Coordinator at the Ghent Centre for Digital Humanities, Ghent University, Belgium and National Coordinator for DARIAH in Belgium. She is one of the instigators of an emerging Digital Research Lab at KBR, Royal Library of Belgium, as part of a long-term collaboration with the Ghent Centre for Digital Humanities. This lab will facilitate data-level access to KBR’s digitised and born-digital collections for digital humanities research. Her role in PROMISE relates to research access and use of Belgium’s web-archive.

Eveline Vlassenroot holds a Bachelor's degree in Communication Sciences (Ghent University) and graduated in 2016 as a Master in Communication Sciences with a specialisation in New Media and Society (Ghent University). After completing additional courses in Information Management & Security at Thomas More Mechelen (KU Leuven), she joined imec-mict-Ghent University in September 2017.
She participates in the PROMISE project (PReserving Online Multiple Information: towards a Belgian StratEgy), where she is researching international best practices for preserving and archiving online information. She is also involved in several projects with the Flemish government regarding data standards, the governance of interoperability standards and linked open data.

Friedel Geeraert is a researcher at KBR (Royal Library) and the State Archives of Belgium, where she works on the PROMISE project that focuses on the development of a Belgian web archive at the federal level. Her role in the project includes comparing and analysing best practices regarding the selection of, and provision of access to, the information and data to be archived, and making recommendations for the development of a long-term and sustainable web archiving service in Belgium.

Reimagining Web Archiving as a Realtime Global Open Research Platform: The GDELT Project
Dr. Kalev Hannes Leetaru (The GDELT Project)
Keywords: GDELT Project; realtime; research-first web archive; news homepages

ABSTRACT
The GDELT Project (https://www.gdeltproject.org/) is a realization of the vision I laid out at the opening of the 2012 IIPC General Assembly for the transformation of web archives into open research platforms. Today GDELT is one of the world's largest global open research datasets for understanding human society, spanning 200 years in 152 languages across almost every country on earth. Its datasets span text, imagery, spoken word and video, enabling fundamentally new kinds of multimodal analyses, and reach deeply into local sources to reflect the richly diverse global landscape of events, narratives and emotions.

At its core, GDELT in the web era is essentially a realtime production research-centered web archive focused on global news (defined as sources used to inform societies, both professional and citizen-generated). It continually maps the global digital news landscape in realtime across countries, languages and narrative communities, acting both as archival facilitator (providing a live stream of every URL it discovers to organizations including the Internet Archive for permanent preservation) and as research platform. In contrast to the traditional post-analytic workflow most commonly associated with web archival research, in which archives are queried, sampled and analyzed after creation, GDELT focuses on realtime analysis, processing every single piece of content it encounters through an ever-growing array of standing datasets and APIs spanning rules-based, statistical and neural methodologies. Native analysis of 152 languages is supported, while machine translation is used to live-translate everything it monitors in 65 languages, enabling language-independent search and analysis. Twin global crawler and computational fleets are distributed across 24 data centers in 17 countries, leveraging Google Cloud's Compute Engine and Cloud Storage infrastructures, coupled with its ever-growing array of AI services and APIs, underpinning regional ElasticSearch and bespoke database and analytic clusters, all feeding into petascale analytic platforms like BigQuery and the Inference API for at-scale analyses.
This massive global-scale system must operate entirely autonomously, scale to support enormous sudden loads (such as during breaking disasters) and function within an environment in which both the structure (rendering and transport technologies) and semantics (evolving language use) are in a state of perpetual and rapid change.

Traditional web archives are not always well-aligned with the research questions of news analysis, which often require fixed time guarantees and a greater emphasis on areas like change detection and agenda setting. Thus, GDELT includes numerous specialized news-centric structural datasets, including the Global Frontpage Graph, which catalogs more than 50,000 major news homepages every hour on the hour, totaling nearly a quarter trillion links over the last two years, to support agenda-setting research. The Global Difference Graph recrawls every article after 24 hours and after one week, with fixed time guarantees, to generate a 152-language realtime news editing dataset cataloging stealth editing and silent deletions. Structural markup is examined and embedded social media posts are cataloged as part of its Global Knowledge Graph. A vast distributed processing pipeline performs everything from entity extraction and emotional coding to SOTA language modeling and claims and relationship mapping. Images are extracted from each article and analyzed by Cloud Vision, enabling analysis of the visual landscape of the web. Datasets from quotations to geography to relationships to emotions to entailment and dependency extracts are all computed and output in realtime, operating on either native or translated content.

In essence, GDELT doesn't just crawl the open web, it processes everything it sees in realtime to create a vast archive of rich realtime research datasets. This firehose of data feeds into downloadable datasets and APIs to enable realtime interactive analyses, while BigQuery enables at-scale explorations of limitless complexity, including one-line terascale graph construction and geographic analysis and full integration with the latest neural modeling approaches. Full integration with GCE, GCS and BigQuery couples realtime analysis of GDELT's rich standing annotations with the ability to interactively apply new analyses, including arbitrarily complex neural modeling, at scale. This means that GDELT is able both to provide a standing set of realtime annotations over everything it encounters and to support traditional post-facto analysis at the effectively infinite scale of the public cloud. From mapping global conflict and modeling global narratives to providing the data behind one of the earliest alerts of the COVID-19 pandemic, GDELT showcases what a research-first web archive is capable of and how to leverage the full power of the modern cloud in transforming web archives from cold storage into realtime open research platforms.
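To give a concrete sense of the API-level, realtime access described above, the following minimal sketch queries GDELT's publicly documented DOC 2.0 full-text search API for recent news coverage of a topic. The endpoint, parameters and response fields follow the public API documentation as far as I know it; the query string and the helper function name are illustrative assumptions, not anything specified in the abstract, and this is a sketch rather than a definitive client.

```python
import requests

# GDELT DOC 2.0 API endpoint (per the public API documentation).
DOC_API = "https://api.gdeltproject.org/api/v2/doc/doc"

def recent_articles(query: str, max_records: int = 20):
    """Return recent articles matching `query` (illustrative helper)."""
    params = {
        "query": query,            # keyword or quoted phrase
        "mode": "artlist",         # article-list mode
        "maxrecords": max_records, # capped by the API itself
        "format": "json",
    }
    resp = requests.get(DOC_API, params=params, timeout=30)
    resp.raise_for_status()
    # The JSON response carries an "articles" array; each entry includes
    # fields such as url, title, seendate, language and sourcecountry.
    return resp.json().get("articles", [])

if __name__ == "__main__":
    # Example query chosen for this document's theme; purely illustrative.
    for art in recent_articles('"web archiving"'):
        print(art.get("seendate"), art.get("sourcecountry"), art.get("title"))
```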
Biography: Dr. Kalev Hannes Leetaru, one of Foreign Policy Magazine's Top 100 Global Thinkers of 2013, founded the open data GDELT Project. From 2013 to 2014 he was the Yahoo! Fellow in Residence of International Values, Communications Technology & the Global Internet at Georgetown University's Edmund A. Walsh School of Foreign Service, where he was also an Adjunct Assistant Professor, as well as a Council Member of the World Economic Forum's Global Agenda Council on the Future of Government. His work has been profiled in the presses of more than 100 nations, and in 2011 The Economist selected his Culturomics 2.0 study as one of just five science discoveries deemed the most significant developments of 2011. Kalev’s work focuses on how innovative applications of the world's largest datasets, computing platforms, algorithms and mind-sets can reimagine the way we understand and interact with our global world. More on his latest projects can be found on his website at https://www.kalevleetaru.com/ or https://blog.gdeltproject.org.

Session 5: Archiving Initiatives

Archiving 1418-Now using Rhizome’s Webrecorder: observations and reflections
Anisa Hawes (Independent Curatorial Researcher and Web Archivist)
Keywords: web archiving tools, social media, curation, process, Webrecorder

ABSTRACT
This paper explores the challenges of archiving https://www.1418now.org.uk/ and its associated social media profiles (Twitter, Instagram, and YouTube) using Rhizome’s Webrecorder. These web collections form an integral part of the Imperial War Museum’s record of the 14-18Now WW1 Centenary Art Commissions programme and represent a recognition that essential facets of many of the Commissions would otherwise be absent from the archive.

Immediate public responses to Jeremy Deller’s modern memorial event We're Here Because We're Here, for example, played out in the contemporary context of Web 2.0. Many people who encountered the memorial directly were moved to share their reflections on social media. Many others encountered the event indirectly: via messages, images, and videos which circulated on social networking platforms. In this way, the online sphere became an expanded site of public participation and experience. Meanwhile, imprinted engagement metrics and appended comment threads provided unprecedented curatorial insight into the artwork's impact and reach.

Webrecorder is a free, open-source web archiving tool developed by Rhizome. It enables high-fidelity capture of complex, interactive web pages, including social media sites. Written from the point of view of a curatorial researcher, this paper includes insights into the web archiving process and workflow. Combining work-in-progress screenshots and reflections extracted from my log notes, I’ll explain how I have utilised Webrecorder’s automation features and scripted behaviours alongside manual, action-by-action capture to build a rich collection, tackling the challenge of archiving both in detail and at scale.

Biography: Anisa Hawes is an independent curatorial researcher and web archivist based in London, UK. As an embedded researcher at the Victoria and Albert Museum (2015-18), her work investigated how digital tools and software environments have altered design practice, and how the web and social media have produced new, participatory poster forms, such as memes which are appropriated as they circulate. Collaborating with Rhizome and the British Library/UK Web Archive, she tested web archiving technologies to capture digital objects in the context of the platforms where they are created and encountered, whilst developing a framework of curatorial principles to support digital collecting.
Managing the Lifecycle of Web Archiving at a Large Private University
Nicole Greenhouse (New York University Libraries)
Keywords: workflows, accessioning, description, quality assurance, context

ABSTRACT
New York University Libraries has been archiving websites since 2007. The collection, developed using the service Archive-It, consists of websites related to Labor and Left movements, the New York City downtown arts scene, contemporary composers, and university websites, totaling approximately 5,000 websites and 13 terabytes of data. In 2018, I was hired as the first permanent archivist whose role is solely to manage the web archiving program. During this first year, it was important to the Archival Collections Management department in the NYU Libraries to incorporate web archiving into the greater workflows of the department, as well as to manage the day-to-day work that comes with web archiving, including capture, website submissions, quality assurance, and access and description. This presentation will discuss how we have developed a database to manage capture and quality assurance, as well as the ongoing project to accession recently added websites and create consistent description across all of the archived websites. The database allows us to track the lifecycle of each archived website and to take advantage of the scoping and quality assurance tools provided by Archive-It while working around the service's limitations. The presentation will conclude with an overview of descriptive practices: creating accession records that track why curators and archivists add websites to the collection, and updating finding aids to provide a greater amount of contextual description that goes beyond Dublin Core, in line with the department's descriptive policies, to create transparent and standards-compliant description in the context of the Special Collections' analog collections. By creating records that put the web archives in the context of the rest of the collections, NYU is able to promote the use of the archived websites.

Biography: Nicole Greenhouse is the Web Archivist in the Archival Collections Management department at New York University Libraries. Nicole received her MA in Archives and Public History at NYU. She has previously worked at the Winthrop Group, the Center for Jewish History, and the Jewish Theological Seminary on a variety of analog and digital archives projects. She is currently the Communications Manager for the Web Archiving Section of the Society of American Archivists.

Session 6: Social Science & Politics

Thematic web crawling and scraping as a way to form focussed web archives
Benedikt Adelmann MSc (University of Hamburg)
Dr. Lina Franken (University of Hamburg)
Keywords: web crawling, scraping, thematic focussed web archives, discourse analysis

ABSTRACT
For humanities and social science research on the contemporary, the web and web archives are growing in relevance. Yet not much is available when it comes to thematically based collections of websites. In order to find out about ongoing online discussions, web crawling and scraping are needed as soon as a larger collection is to be generated as a corpus for further exploration. In the study presented here, we focus on the acceptance of telemedicine and its challenges. In the discourse analysis conducted (Keller 2005), the concept of telemedicine is often discussed within the broader field of digital health systems, while there are only a few statements of relevance within single texts.
Therefore, a large corpus is needed to identify relevant stakeholders and discourse positions and to go into the details of text passages: big data turns into small data and has to be filtered (see Koch/Franken 2019). Thematic web crawling and scraping (Barbaresi 2019: 30) is a major facilitator of these steps.

Web crawling has to start from a list of so-called seed URLs, which in our case refer to the main pages of websites of organizations (e.g. health insurance companies, doctors' or patients' associations) known to be involved in the topic of interest. From these seed URLs, our crawl explores the network structure expressed by the (hyper)links between webpages in a breadth-first manner (see Barbaresi 2015: 120ff. for an overview of web crawling practices). It is able to handle content with the MIME types text/html, application/pdf, application/x-pdf and text/plain. Content text is extracted and linguistically pre-processed: tokenization, part-of-speech tagging, lemmatization (reduction of word forms to their basic forms). If the lemmatized text contains at least one of a set of pre-defined keywords (see Adelmann et al. 2019 for this semantic-field-based approach), the original content of the webpage (HTML, PDF etc.) is saved, as well as the results of the linguistic pre-processing. (Hyper)links from HTML pages are followed if they refer to (other) URLs of the same host. If, and only if, the HTML page is a match, links are also followed if their host is different. We employ some heuristics to correct malformed URLs and to handle a variety of non-trivial equivalences when testing whether a URL has already been visited by the crawler. For saved pages, the crawler records the accessed URL, date and time of access, and other metadata, including the matched keywords. URLs that are only visited (but not saved) are recorded without metadata; links found between them are recorded as well. The script is published as hermA-Crawler (Adelmann 2019). A simplified sketch of the crawl logic is given below.

When using focussed web archives formed in this way, it is easy to apply approaches such as topic modelling (Blei 2012) or sentiment analysis (D'Andrea et al. 2015) on a larger basis in order to support discourse analysis with digital humanities methods.
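As an illustration only, the following minimal Python sketch reproduces the core crawl logic described above: breadth-first traversal, same-host links always followed, cross-host links followed only from keyword-matching pages. It is not the published hermA-Crawler; the linguistic pre-processing (tokenization, POS tagging, lemmatization), PDF/plain-text handling and URL-equivalence heuristics are omitted, and raw substring matching against an invented keyword list stands in for the semantic-field-based keyword match.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

# Illustrative stand-ins; the real project matches lemmatized German
# vocabulary from a semantic field around telemedicine.
KEYWORDS = {"telemedizin", "telemedicine", "e-health"}

def focused_crawl(seed_urls, max_pages=500):
    """Breadth-first thematic crawl: same-host links are always followed,
    cross-host links only from pages that matched a keyword."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    matches = []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            resp = requests.get(url, timeout=10)
        except requests.RequestException:
            continue
        fetched += 1
        if "text/html" not in resp.headers.get("Content-Type", ""):
            continue  # the real crawler also handles PDF and plain text
        soup = BeautifulSoup(resp.text, "html.parser")
        text = soup.get_text(" ").lower()
        # hermA-Crawler matches on lemmatized tokens after POS tagging;
        # raw substring matching is a simplification for this sketch.
        is_match = any(kw in text for kw in KEYWORDS)
        if is_match:
            matches.append(url)  # the real crawler saves page + metadata
        host = urlparse(url).netloc
        for anchor in soup.find_all("a", href=True):
            link = urljoin(url, anchor["href"]).split("#")[0]
            if not link.startswith("http") or link in seen:
                continue
            # same-host links are always followed; external links
            # only when the current page matched a keyword
            if urlparse(link).netloc == host or is_match:
                seen.add(link)
                queue.append(link)
    return matches
```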
References:
Adelmann, Benedikt; Andresen, Melanie; Begerow, Anke; Franken, Lina; Gius, Evelyn; Vauth, Michael: Evaluation of a Semantic Field-Based Approach to Identifying Text Sections about Specific Topics. In: Book of Abstracts DH2019. https://dh2019.adho.org/wp-content/uploads/2019/04/Short-Papers_23032019.pdf.
Adelmann, Benedikt: hermA-Crawler. https://github.com/benadelm/hermA-Crawler.
Barbaresi, Adrien: Ad hoc and general-purpose corpus construction from web sources. Doctoral dissertation, Lyon, 2015.
Barbaresi, Adrien: The Vast and the Focused: On the need for thematic web and blog corpora. In: Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7), Cardiff, 2019. DOI: https://doi.org/10.14618/ids-pub-9025
Blei, David M.: Probabilistic topic models. Surveying a suite of algorithms that offer a solution to managing large document archives. In: Communications of the ACM 55 (2012), pp. 77–84.
D'Andrea, Alessia; Ferri, Fernando; Grifoni, Patrizia; Guzzo, Tiziana: Approaches, Tools and Applications for Sentiment Analysis Implementation. In: International Journal of Computer Applications 125 (2015), pp. 26–33. DOI: 10.5120/ijca2015905866.
Keller, Reiner: Analysing Discourse. An Approach from the Sociology of Knowledge. In: Forum: Qualitative Social Research 6, No. 3, Art. 32 (2005). DOI: http://dx.doi.org/10.17169/fqs-6.3.19
Koch, Gertraud; Franken, Lina: Automatisierungspotenziale in der qualitativen Diskursanalyse. Das Prinzip des „Filterns“. In: Sahle, Patrick (ed.): 6. Tagung des Verbands Digital Humanities im deutschsprachigen Raum e.V. (DHd 2019). Digital Humanities: multimedial & multimodal. Universitäten zu Mainz und Frankfurt, March 25 to 29, 2019. Book of Abstracts, pp. 89–91. DOI: 10.5281/zenodo.2596095

Biographies: Benedikt Adelmann is a computer scientist at the University of Hamburg. Lina Franken is a cultural anthropologist at the University of Hamburg. Together, they are working within the collaborative research project “Automated modelling of hermeneutic processes – The use of annotation in social research and the humanities for analyses on health (hermA)”. See https://www.herma.uni-hamburg.de/en.html.

Metadata for social science research
Andrea Prokopová (Webarchiv, National Library of the Czech Republic)
Keywords: web archiving, metadata, big data, social sciences, data mining

ABSTRACT
The Czech web archive of the National Library of the Czech Republic (Webarchiv) is one of the oldest in Europe, operating since 2000. It is therefore able to provide methodological support to new web archives and also holds a large amount of harvested data. However, the data itself cannot be provided due to copyright; there is at least the opportunity to use the metadata of harvested web resources. Two years ago, sociologists from the Academy of Sciences of the Czech Republic showed interest in the data for their research. This started their cooperation with the Czech web archive and with the Technical University in Pilsen. These three institutions are currently working together on the Development of the Centralized Interface for the Web Content and Social Networks Data Mining. The datasets that researchers prepare on their own using the interface can be used for various data analyses and for the interpretation of social trends and changes in the Internet environment.

In the first phase of the project, a basic analysis of the content of the web archive took place. This revealed that the web archive contains nearly nine and a half billion unique digital objects. These can be text, image, audio and video objects, or other digital objects (software, scripts, etc.). The analysis provided accurate information on how many objects the Webarchiv holds and on its current size. The next phase was the programming work itself. There is already a prototype of the search engine, which is undergoing internal testing.

Bibliography:
BRÜGGER, Niels, Niels Ole FINNEMANN, 2013. The Web and digital humanities: Theoretical and methodological concerns. Journal of Broadcasting & Electronic Media [online]. 2013, pp. 66-80. ISSN 1550-6878. Available at: http://thelecturn.com/wp-content/uploads/2013/07/The-web-and-digital-humanities-Theoretical-and-Methodological-Concerns.pdf
KVASNICA, Jaroslav, Marie HAŠKOVCOVÁ and Monika HOLOUBKOVÁ. Jak velký je Webarchiv? E-zpravodaj Národní knihovny ČR [online]. Praha: Národní knihovna ČR, 2018, 5(5), 6-7 [cit. 2020-01-22]. Available at: http://text.nkp.cz/o-knihovne/zakladni-informace/vydane-publikace/soubory/ostatni/ez_2018_5.pdf
KVASNICA, Jaroslav, Andrea PROKOPOVÁ, Zdenko VOZÁR and Zuzana KVAŠOVÁ. Analýza českého webového archivu: Provenience, autenticita a technické parametry. ProInflow [online]. 2019, 11(1) [cit. 2020-01-22]. DOI: 10.5817/ProIn2019-1-2. ISSN 1804-2406.
Available at: http://www.phil.muni.cz/journals/index.php/proinflow/article/view/2019-1-2
Webarchiv: O Webarchivu [online]. Praha, 2015 [cit. 2020-01-22]. Available at: https://www.webarchiv.cz/cs/o-webarchivu

Biography: I work as a data analyst at the Czech Webarchiv and on a project called the Centralized Interface for the Web Content and Social Networks Data Mining. Our goal is to provide datasets of metadata to scientists in the humanities, especially sociologists, for their future research and data analyses. Webarchiv is part of the National Library of the Czech Republic; we harvest and archive all web sources in the Czech domain. I study Library and Information Science at Masaryk University, so I currently work in my field. I am a typical bookworm with a creative soul and a passion for photography.

Exploring Web Archive Networks: The Case of the 2018 Irish Presidential Election
Dr. Derek Greene (University College Dublin)
Keywords: web archives, network analysis, data analysis, case study

ABSTRACT
The hyperlink structure of the Web can be used not only for search, but also to analyse the associations between websites. By representing large collections of web pages as a link network, researchers can apply existing methodologies from the field of network analysis. For web archives, we can use these methods to explore their content, potentially identifying meaningful historical trends. In recent years the National Library of Ireland (NLI) has selectively archived web content covering a variety of political and cultural events of public interest. In this work, we analyse an archive of websites pertaining to the 2018 Irish Presidential Election. The original archive consists of a total of 57,065 HTML pages retrieved in 2018. From this data we extracted all links appearing in these pages and mapped each link to a pair of domains. For our case study, we focus only on pairs of domains for which the source and target are distinct, yielding 28,555 relevant domain pairs. Next, we created a directed weighted network representation. In this network, each node is a unique domain. Each edge from node A to node B indicates that there are one or more links in the pages on domain A pointing to domain B. Each edge also has a weight, indicating the number of links between the two domains. This yielded a network with 263 nodes and 284 weighted directed edges. Using network diagrams generated from this data, we can visualise the link structure around the sites used to promote each presidential candidate, and how they relate to one another. This work highlights the potential insights which can be gained by using network analysis to explore web archives. These include the possible impact on collection development in the NLI selective web archive and the further study of the archived Irish web.
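A minimal sketch of how such a domain-level link network could be assembled, under stated assumptions: the archived pages are available as local HTML files, and a caller-supplied page_domain helper (hypothetical, since it depends on how the archive stores capture metadata) returns the domain each page was captured from. The NLI workflow itself is not published here; this simply mirrors the steps the abstract describes: extract links, map them to source-target domain pairs, keep distinct pairs, and weight each edge by its link count.

```python
import pathlib
from collections import Counter
from urllib.parse import urlparse

import networkx as nx
from bs4 import BeautifulSoup

def build_domain_graph(html_dir, page_domain):
    """Build a weighted, directed domain-level link network from a
    directory of archived HTML files. `page_domain` is a hypothetical
    helper mapping a file path to the domain it was captured from."""
    weights = Counter()
    for path in pathlib.Path(html_dir).glob("**/*.html"):
        source = page_domain(path)
        soup = BeautifulSoup(path.read_text(errors="ignore"), "html.parser")
        for anchor in soup.find_all("a", href=True):
            target = urlparse(anchor["href"]).netloc
            # keep only pairs where source and target domains differ
            if target and target != source:
                weights[(source, target)] += 1
    graph = nx.DiGraph()
    for (source, target), count in weights.items():
        graph.add_edge(source, target, weight=count)  # weight = link count
    return graph

# Illustrative use, assuming pages are stored under per-domain folders:
# g = build_domain_graph("archive/", lambda p: p.parent.name)
# print(g.number_of_nodes(), g.number_of_edges())
```

The resulting graph can then be handed to standard network-analysis and visualisation tooling to produce diagrams of the kind the abstract mentions.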
Biography: Dr. Derek Greene is Assistant Professor at the UCD School of Computer Science and a Research Investigator at the SFI Insight Centre for Data Analytics. He has over 15 years' experience in AI and machine learning, with a PhD in Computer Science from Trinity College Dublin. He is involved in a range of interdisciplinary projects which involve applying machine learning methods in fields such as digital humanities, smart agriculture, and political science.

Session 7: Collaborations & Teaching

IIPC: training, collecting, research, and outreach activities
Dr. Olga Holownia (International Internet Preservation Consortium / British Library)
Keywords: web archiving, web archiving training, collaborative collections, Covid-19 web archive collections, web archiving resources

ABSTRACT
The basis for founding the International Internet Preservation Consortium (IIPC) in 2003 was the acknowledgement of “the importance of international collaboration for preserving Internet content for future generations”. Over the years, the IIPC members have worked together on multiple technical, curatorial, and educational activities. They have developed standards and supported open-source web archiving tools and software. The annual General Assembly (GA) and Web Archiving Conference (WAC) have provided a forum for exchanging knowledge and forging new collaborations, not only within the IIPC but also within the wider web archiving community and beyond. This talk will give an update on the most recent activities, including the IIPC-funded projects as well as initiatives led by the working groups (training, collecting, and research), all of which fall under membership engagement and outreach overseen by the IIPC Portfolios.

One of the key initiatives this year has been the “Novel Coronavirus (Covid-19) outbreak” transnational collection, coordinated by the IIPC Content Development Group and organised in partnership with the Internet Archive. Over 9,000 sites from over 140 countries and over 160 top-level domains were made available through Archive-It seven months after the collection was launched in February 2020. We have also been publishing blog posts documenting the IIPC members' efforts at capturing and archiving web content related to the pandemic within their national domains.

This year also saw the publication of training materials designed and produced by the IIPC Training Working Group in partnership with the Digital Preservation Coalition. The first module, comprising eight sessions, is aimed at curators, policy makers and managers, or those who would like to learn about the basics of web archiving, including what web archives are, how they work, and how web archive collections are curated. The programme helps participants acquire basic skills in capturing web archive content, as well as in planning and implementing a web archiving programme.

In terms of research activities, alongside the repository of web archiving resources at the University of North Texas (UNT) Digital Library and enhancing the metadata in the Zotero bibliography, we have been promoting the outcomes of the IIPC-funded projects through a series of webinars organised by the Research Working Group. Among the funded projects are a set of introductory Jupyter Notebooks developed by Tim Sherratt, the creator of the GLAM Workbench, and LinkGate, a tool for graph visualisation of web archives aided by an inventory of use cases. The former project was led by the UK Web Archive, based at the British Library, in partnership with the Australian and New Zealand web archives; the latter is a collaboration between Bibliotheca Alexandrina and the National Library of New Zealand.
References
About IIPC: https://netpreserve.org/about-us
IIPC Working Groups: https://netpreserve.org/about-us/working-groups
IIPC Projects: https://netpreserve.org/projects
IIPC General Assembly and Web Archiving Conference: https://netpreserve.org/general-assembly
IIPC collections in the UNT Digital Library: https://digital.library.unt.edu/explore/partners/IIPC
IIPC members' COVID-19 collections: https://netpreserveblog.wordpress.com/tag/covid-19-collection
“Novel Coronavirus (Covid-19) outbreak” collaborative collection: https://archive-it.org/collections/13529

Biography: Olga Holownia is Programme and Communications Officer based at the British Library. She manages the communications and provides support to the programmes of the International Internet Preservation Consortium (netpreserve.org). Her key projects include the organisation of the annual IIPC General Assembly and Web Archiving Conference as well as associated training and events. She is a co-chair of the IIPC Research Working Group.

Using Web Archives to Teach and Opportunities in the Information Science Field
Dr. Juan-José Boté (Universitat de Barcelona)
Keywords: digital preservation, teaching, web archives, emulator, archiving software

ABSTRACT
Web archives are a useful tool for teaching different subjects to students, not only history but also courses such as digital preservation, information architecture, or metadata structures. The digital preservation of web archives offers a unique set of challenges when teaching students about information science.

The first challenge is teaching search strategies. Web archives have specific search tools, and it is necessary to develop search strategies before beginning any search. For instance, one of the main challenges for students is learning how to look for information across collections or to locate a precise website.

Secondly, in addition to search strategies, students need to learn how to find and use old software to run images, videos, or other informational content. Part of the search process includes checking whether the archived software was commercial and whether it can be used for free with some limitations. To run old software downloaded from web archives, it is sometimes also necessary to use emulators. Emulators are not always found in web archives and may not be available, so students must add a further step in order to run old software.

In addition, when students set up archiving software, it is useful for them to know how it works. Testing the possibilities of archiving software is often kept to small scenarios because of the limitations of the course. Exposure to archiving software permits students to learn the process of building small collections or creating new datasets of archived websites.

In this paper I explore different uses of web archives as a resource for teaching in the information science field, which is especially helpful in a digital preservation course.
Biography:

Juan-José Boté is Assistant Professor at Universitat de Barcelona, where he is also the coordinator of the Postgraduate Program on Social Media Content. His research is focused on digital preservation and cultural heritage.

Session 8: Research of Web Archives

Web archiving - professionals and amateurs

Bartłomiej Konopa (State Archives in Bydgoszcz; Nicolaus Copernicus University)

Keywords: web archives, professional web archiving, amateurish web archiving, ArchiveTeam, comparative study

ABSTRACT

Web archiving can be defined as "any form of deliberate and purposive preserving of web material" (Brügger, 2011). That broad definition allows us to divide web archiving on numerous levels and distinguish many types of it. One of the possible distinctions is between professional and amateur archiving. As professional archives one can treat big projects led mainly by national libraries, which employ experts and have strict regulations, for example the UK Web Archive and the Danish Netarkivet. They are interested in national webs and mainly preserve resources from one ccTLD in routine and repeatable crawls. Sometimes these archives build special collections, but very often these are predictable and related to "real world" events, for instance national elections. On the other side, as amateur, one can recognize initiatives like ArchiveTeam, which are open to Internet users and do not have rigorous rules. They react to what is happening on the Web, observe endangered websites and services, and try to preserve them. Their actions are spontaneous and disposable, but precisely aimed at the resources that would otherwise be lost.

Both sides are trying to preserve web resources because they consider them digital heritage, which needs to be saved for future generations. However, despite the mutual goal, professional and amateur archives visibly differ in the way they function and the materials they are interested in, as described above. The paper will search for these differences and analyse their influence on how and what will be archived, and then available for those who want to experience and research the past Web. To reach this goal the author will compare the UK Web Archive and Netarkivet with ArchiveTeam. The main sources of information about these projects will be papers, news and their websites. The most important elements of these studies will be selection policy and criteria, scope, frequency and methods of archiving, and access rules. It will show differences in thinking about the Web, its borders, and ways of preserving and sharing this digital heritage. These factors will also have an impact on what resources will be available for later studies.

Biography:

Bartłomiej obtained his master's degree in archival science in 2007; currently he is a senior archivist at the State Archives in Bydgoszcz and a PhD student at the Nicolaus Copernicus University in Toruń (Poland). He is preparing a doctoral dissertation on Web archives, which are his main research interest. He collaborated with the web archiving lab "webArch", a pioneering project to popularize this issue in Poland.

Session 9: Research Approaches

Digital archaeology in the web of links: reconstructing a late-90s web sphere

Dr Peter Webster (Independent Scholar, Historian and Consultant)

Keywords: web spheres, method, link graphs, link analysis, reconstruction

ABSTRACT

As interest in Web history has grown, so has the understanding of the archived Web as an object of study.
But there is more to the Web than individual objects and sites. This paper is an exercise in understanding a particular 'web sphere'. Niels Brügger defines a web sphere as 'web material … related to a topic, a theme, an event or a geographic area' (Brügger 2018). I posit a distinction between 'hard' and 'soft' web spheres, defined in terms of the ease with which their boundaries may be drawn, and the rate at which those boundaries move over time. Examples of hard web spheres are organisations that have clear forms of membership or association: e.g. the websites of the individual members of the European Parliament. The study of 'soft' web spheres tends to present additional difficulties, since the definition of topics or themes is more difficult if not expressed in institutional terms. The definition of 'European politics' may be contested in ways that 'membership of the European Parliament' may not.

I present a method of reconstructing just such a soft web sphere, much of which is lost from the live web and exists only in the Internet Archive: the web estate of conservative Christian campaign groups in the UK in the 1990s and early 2000s. The historian of the late 1990s has a problem. The vast bulk of content from the period is no longer on the live web, and there are few, if any, indications of what has been lost – no inventory of the 1990s Web against which to check. Of the content that was captured by the Internet Archive, only a superficial layer is exposed to full-text search, and the bulk may only be retrieved by a search for the URL. We do not know what was never archived, and in the archive it is difficult to find what we might want, since there is no means of knowing the URL of a lost resource.

We need, then, to understand the archived Web using only the technical data about itself that it can be made to disclose. This method of web sphere reconstruction is based not on page content but on the relationships between sites, i.e. the web of hyperlinks. The method is iterative, involving the computational interrogation of large datasets from the British Library and the close examination of individual archived pages, along with the use of printed and other non-digital sources. It builds upon recent studies which explore the primary sources available from outside the Web, from which lost web spheres may be reconstructed (Nanni 2017; Teszelszky 2019; Ben-David 2016; Ben-David 2019). It develops my earlier work in which the method was applied to smaller, less complex spheres (Webster 2017; Webster 2019).

References:

Ben-David, Anat. 2016. What does the Web remember of its deleted past? An archival reconstruction of the former Yugoslav top-level domain. New Media and Society 18, 1103–1119. https://doi.org/10.1177/1461444816643790
Ben-David, Anat. 2019. National web histories at the fringe of the Web: Palestine, Kosovo and the quest for online self-determination. In: The Historical Web and Digital Humanities: the Case of National Web domains, eds Niels Brügger & Ditte Laursen, 89–109. London: Routledge.
Brügger, Niels. 2018. The archived Web: Doing history in the digital age. Cambridge, MA: MIT Press.
Nanni, Federico. 2017. Reconstructing a website's lost past: methodological issues concerning the history of Unibo.it. Digital Humanities Quarterly 11. http://www.digitalhumanities.org/dhq/vol/11/2/000292/000292.html
Teszelszky, Kees. 2019. Web archaeology in The Netherlands: the selection and harvest of the Dutch web incunables of provider Euronet (1994–2000).
Internet Histories 3, 180–194. DOI: 10.1080/24701475.2019.1603951
Webster, Peter. 2017. Religious discourse in the archived web: Rowan Williams, archbishop of Canterbury, and the sharia law controversy of 2008. In: The Web as History, eds Niels Brügger & Ralph Schroeder, 190–203. London: UCL Press.
Webster, Peter. 2019. Lessons from cross-border religion in the Northern Irish web sphere: understanding the limitations of the ccTLD as a proxy for the national web. In: The Historical Web and Digital Humanities: the Case of National Web domains, eds Niels Brügger & Ditte Laursen, 110–23. London: Routledge.

Biography:

Dr Peter Webster is an independent scholar and consultant, and founder and managing director of Webster Research and Consulting (UK). He has published widely on the use of Web archives for contemporary history.

Web defacements and takeovers and their role in web archiving

Michael Kurzmeier (Maynooth University)

Keywords: defaced websites; hacktivism; cybercrime archives; Geocities; web archives

ABSTRACT

This paper will provide insight into the archiving and utilization of defaced websites as ephemeral, non-traditional web resources. Web defacements as a form of hacktivism are rarely archived and thus mostly lost to systematic study. When they do find their way into web archives, it is often more as a by-product of a larger web archiving effort than as the result of a targeted effort. Aside from large collections such as Geocities, which during a crawl might pick up a few hacked pages, there also exists a small scene of community-maintained cybercrime archives that preserve hacked websites, some of which were hacked in a hacktivist context. By examining sample cases of cybercrime archives, the paper will show the ephemerality of their content and introduce a framework for analysis.

As more and more of our daily communication happens digitally, marginalized and counter-public groups have often used new media to overcome real-world limitations. This phenomenon can be traced back to the early days of the Web. This paper will provide an overview of defacements on the web and show the role web archives play in understanding these phenomena. Web defacements are ephemeral content and as such especially prone to link rot and deletion. They can provide not only information on the history of a single web page; they can also be seen as artifacts of a struggle for attention. Contextualized with metadata and the original page, defacements can help restore such lost histories. At present, however, only a small number of collections are still online, with only one collection still accepting new material and none in a condition to be used for academic research. Finding relevant defacements in collections like those mentioned is a challenge, especially since there is little conformity in content, language and layout among the people hacking websites. The paper will introduce different methodological approaches to identifying defacements and related pages.

Biography:

Michael Kurzmeier is a fourth-year PhD candidate in Digital Humanities and recipient of the Irish Research Council Postgraduate Scholarship. His research interest is the intersection between technology and society. His PhD thesis investigates the use of hacktivism as a tool of political expression.
The research is grounded in an understanding of a contested materiality of communication, in which hacktivism is one method to occupy contested space. Michael is working with Kylie Jarrett (MU Media Studies) and Orla Murphy (UCC Digital Humanities). ORCID: https://orcid.org/0000-0003-4925-5197

Session 10: Culture & Sport

MyKnet.org: Traces of Digital Decoloniality in an Indigenous Web-Based Environment

Dr. Philipp Budka (University of Vienna; Free University Berlin)

Keywords: MyKnet.org, indigenous web-based environment, digital decoloniality, internet history, anthropology

ABSTRACT

This paper discusses traces of digital decoloniality (e.g., Deem 2019) by exploring the history of the indigenous web-based environment MyKnet.org. By considering the cultural and techno-social contexts of First Nations' everyday life in Northwestern Ontario, Canada, and by drawing from ethnographic fieldwork (e.g., Budka 2015, 2019), it critically reviews theoretical accounts and conceptualizations of change and continuity that have been developed in an anthropology of media and technology (e.g., Postill 2017). In so doing, it examines how techno-social change and cultural continuity can be conceptualized in relation to each other and in the context of (historical) processes of digital decoloniality.

In 1994, the tribal council Keewaytinook Okimakanak (KO) established the Kuh-ke-nah Network (KO-KNET) to connect indigenous people in Northwestern Ontario's remote communities through and to the internet. At that time, a local telecommunication infrastructure was almost non-existent. KO-KNET started with a simple bulletin board system that developed into a community-controlled ICT infrastructure, which today includes landline and satellite broadband internet as well as internet-based mobile phone communication. Moreover, KO-KNET established services that became widely popular among the local indigenous communities, such as the web-based environment MyKnet.org. MyKnet.org was set up in 1998 exclusively for First Nations people to create and maintain personal homepages within a cost- and commercial-free space on the web. Particularly between 2004 and 2008, MyKnet.org was extremely popular, mainly for two reasons. First, MyKnet.org enabled people to establish and maintain social relationships across spatial distance in an infrastructurally disadvantaged region. They communicated through their homepages' communication boxes and linked their homepages to the pages of family members and friends, thus creating a "digital directory" of indigenous people in Northwestern Ontario. Second, MyKnet.org contributed to different forms of cultural representation and identity construction. Homepage producers utilized the service to represent and negotiate their everyday lives by displaying and sharing pictures, music, texts, website layouts, and artwork.

During fieldwork in Northwestern Ontario (2006–2008), many people told me stories about their first MyKnet.org websites in the early 2000s and how they evolved. People vividly described how their homepages were designed and structured and to which other websites they were linked. To deepen my interpretation and understanding of these stories, I used the Internet Archive's Wayback Machine to recover archived versions of these websites whenever possible.
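A minimal sketch of this kind of snapshot lookup (in Python, using only the standard library and the Internet Archive's publicly documented availability endpoint; the URL and date below are illustrative, not taken from the fieldwork):

    import json
    from urllib.request import urlopen

    # Ask the Wayback Machine for the capture closest to a given date.
    api = ("https://archive.org/wayback/available"
           "?url=myknet.org&timestamp=20060101")
    with urlopen(api) as resp:
        data = json.load(resp)

    closest = data.get("archived_snapshots", {}).get("closest")
    if closest:
        print(closest["timestamp"], closest["url"])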
Thus, the Wayback Machine became an important methodological tool for my research into the decolonial history of MyKnet.org and related practices and processes of techno-social change and cultural continuity.

References:

Budka, P. (2019). Indigenous media technologies in "the digital age": Cultural articulation, digital practices, and sociopolitical concepts. In S. S. Yu & M. D. Matsaganis (Eds.), Ethnic media in the digital age (pp. 162–172). New York: Routledge.
Budka, P. (2015). From marginalization to self-determined participation: Indigenous digital infrastructures and technology appropriation in Northwestern Ontario's remote communities. Journal des Anthropologues, 142-143(3), 127–153.
Deem, A. (2019). Mediated intersections of environmental and decolonial politics in the No Dakota Access Pipeline movement. Theory, Culture & Society, 36(5), 113–131.
Postill, J. (2017). The diachronic ethnography of media: From social changing to actual social changes. Moment. Journal of Cultural Studies, 4(1), 19–43.

Biography:

Philipp Budka is a Lecturer in the Department of Social and Cultural Anthropology, University of Vienna, and the M.A. program Visual and Media Anthropology at the Free University Berlin. His research areas include digital anthropology and ethnography, the anthropology of media and technology, as well as visual culture and communication. He is the co-editor of Ritualisierung – Mediatisierung – Performance (Vienna University Press, 2019) and Theorising Media and Conflict (Berghahn Books, in press). His research has also been published in journals and books such as Journal des Anthropologues, Canadian Journal of Communication and Ethnic Media in the Digital Age (Routledge, 2019).

From the sidelines to the archived web: What are the most annoying football phrases in the UK?

Helena Byrne (British Library)

Keywords: Football, Annoying Football Phrases, Shine, UK Web Archive, Web Archive Case Study

ABSTRACT

As news and TV coverage of football has increased in recent years, there has been growing interest in the type of language and phrases used to describe the game. Online, there have been numerous news articles, blog posts and lists on public internet forums about the most annoying football clichés. However, all these lists focus on the men's game, and finding a similar list for women's football online was very challenging. Only by posting a tweet with a survey asking the public "What do you think are the most annoying phrases to describe women's football?" was I able to collate an appropriate sample to work through. The lack of any such list in a similar format highlights the issue of gender inequality online, which is a reflection of wider society.

I filtered a sample of the phrases from men's and women's football to find the top five most annoying phrases. I then ran these phrases through the UK Web Archive Shine interface to determine their popularity on the archived web. The UK Web Archive Shine interface was first developed in 2015, as part of the Big UK Domain Data for the Arts and Humanities project. This presentation will assess how useful the Trends function of the Shine interface is for determining the popularity of a sample of selected football phrases on the UK web from 1996 to 2013. The Shine interface searches across 3,520,628,647 distinct records from the .uk domain, captured from January 1996 to 6 April 2013.
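Because the volume of crawled material grows over time, raw hit counts alone can mislead, so a trend is better expressed as the share of records per year that match a phrase. A minimal sketch of that normalisation (in Python; all counts below are made-up placeholders, not real Shine figures):

    # Per-year match counts for one phrase, and per-year totals of
    # records crawled; both are placeholder numbers for illustration.
    yearly_matches = {2010: 1200, 2011: 1850, 2012: 2400}
    yearly_totals = {2010: 9.1e8, 2011: 9.8e8, 2012: 1.1e9}

    for year in sorted(yearly_matches):
        share = 100 * yearly_matches[year] / yearly_totals[year]
        print(f"{year}: {share:.6f}% of records match the phrase")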
This paper works through the challenges of using the Shine interface to determine what the most annoying football phrases on the archived UK web are. In using this example, it highlights how working with this resource differs from working with digitised publications, and what strategies can be employed to gain meaningful answers to research questions. It is hoped that the findings from this study will be of interest to the footballing world but, more importantly, that they will encourage further research in sports and linguistics using the UK Web Archive.

References:

Helena Byrne. (2018). What do you think are the most annoying phrases to describe women's football? https://footballcollective.org.uk/2018/05/18/what-do-you-think-are-the-most-annoying-phrases-to-describe-womens-football/ (Accessed August 26, 2018)
Andrew Jackson. (2016). Introducing SHINE 2.0 – A Historical Search Engine. Retrieved from: http://blogs.bl.uk/webarchive/2016/02/updating-our-historical-search-service.html (Accessed August 26, 2018)

Biography:

Helena Byrne is the Curator of Web Archives at the British Library. She was the Lead Curator on the IIPC CDG 2018 and 2016 Olympic and Paralympic collections. Helena completed her Master's in Library and Information Studies at UCD in 2015. Previously she worked as an English language teacher in Turkey, South Korea and Ireland.

Session 11: Research (Lightning Round)

Tracking and Analysing Media Events through Web Archives

Caio de Castro Mello Santos (School of Advanced Study, University of London)
Daniela Cotta de Azevedo Major (School of Advanced Study, University of London)

Keywords: Digital Humanities; Media Events; Web Archives; Discourse Analysis

ABSTRACT

Throughout the last two decades, media outlets have grown more reliant on online platforms to spread news and ideas. Web archives are a valuable tool for analysing the recent past as well as the present social and political context. However, using web archives to conduct research can be challenging due to the amount of data and its access limits. This project aims to develop mechanisms to extract, process and analyse data in order to provide scholars with a model for exploring the impact of massive media events over the last couple of decades. Two events have been taken as case studies: the London 2012 and Rio 2016 Olympics, and the European Parliamentary Elections from 2004 to 2019.

Regarding the Olympics, we aim to understand how online media have described the legacies of the London 2012 and Rio 2016 Olympics, and how the choices made by the gatekeepers (news editors, journalists) influence the narrative about the consequences of both events. The study of the media coverage of the European elections, meanwhile, can shed light on how political concepts such as nationalism and integration shape European public opinion and its attitudes towards European institutions. Given the geographical and temporal range of these projects, we will focus on different yet complementary web archive initiatives such as the Internet Archive, the UK Web Archive and Arquivo.pt.

This project is being developed as part of the Cleopatra Training Network under a PhD in Digital Humanities. The research therefore combines traditional methods such as discourse analysis, through qualitative close reading, with quantitative computational methods, through distant reading. This approach aims to provide examples of how to apply this type of data to the interpretative methodologies of the Social Sciences.
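To make the distant-reading side concrete, here is a minimal sketch (in Python, standard library only) of counting mentions of a theme across a dated corpus of texts extracted from archived news pages; the corpus contents are placeholders, not project data:

    from collections import Counter

    # Hypothetical corpus of (year, extracted article text) pairs.
    corpus = [
        (2012, "the olympic legacy promised lasting regeneration ..."),
        (2016, "doubts over the games' legacy persisted in rio ..."),
    ]

    mentions = Counter()
    for year, text in corpus:
        mentions[year] += text.lower().count("legacy")

    for year in sorted(mentions):
        print(year, mentions[year])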
Biographies:

Daniela Major: Early Stage Researcher at the School of Advanced Study. Her doctoral project is on the media coverage of the European Elections 2004–2019. She holds a Master of Letters in Intellectual History from the University of St Andrews and is a former research fellow at Arquivo.pt.

Caio Mello: Early Stage Researcher at the School of Advanced Study, University of London. Journalist with a master's in communication (UFPE – Brazil). Former research fellow at the Center for Advanced Internet Studies (CAIS – Germany).

Reanimating the CDLink platform: A challenge for the preservation of mid-1990s Web-based interactive media and net.art

Dr. Eamonn Bell (Trinity College Dublin)

Keywords: compact disc, Web, preservation, music, interactive multimedia

ABSTRACT

The Voyager Company realised the creative and commercial potential of mixed-mode CD-ROMs as the platform par excellence for interactive multimedia. Starting in 1989, with the release of a HyperCard-based interactive listening guide for Beethoven's Symphony No. 9, Voyager tightly integrated rich multimedia, hyperlinked text, and high-quality audiovisual recordings into over 50 software releases for Mac and PC well into the late 1990s. Consolidating their expertise in computer-controlled optical media with Laserdiscs, Voyager developed AudioStack: a set of extensions for the HyperCard environment that allowed fine-grained software control of high-fidelity audio stored on conventional optical media. AudioStack led to a cross-platform technology designed for use on the web called CDLink, comprising CD-ROM controller drivers, extensions for Macromedia Shockwave, and the plain-text Voyager CDLink Control Language. CDLink enabled and inspired commercial ventures and amateur productions alike, such as Sony Music's short-lived ConnecteD experiment, the small but dedicated community of fan-sites that published time-synced lyric pages alongside hyperlinked commentaries for popular records, and even experimental sonic net.art in Mark Kolmar's Chaotic Entertainment (1996). As Volker Straebel (1997) has pointed out, Kolmar's work used CDLink files to probabilistically remix and loop the contents of the user's own CD collection in code, evincing tactics of creation similar to those of contemporary experimental musicians and sound artists. Owing to the mostly obsolete hardware and software dependencies of the CDLink platform and the challenges posed by the fading born-digital traces of the mid-1990s Web, CDLink-dependent artifacts create difficulties for preservation and access. I summarise the above-mentioned developments that culminated in CDLink and describe the challenges of preserving Kolmar's artwork and making it available to future audiences, as well as those of the larger so-called "extended CD" ecosystem, which flourished during this decade.

Biography:

Eamonn Bell is a Research Fellow at the Department of Music, Trinity College Dublin. His current research focus is on the cultural history of the digital Audio CD format, told from a viewpoint between musicology and media studies. In 2019, Eamonn was awarded a Government of Ireland Postdoctoral Fellowship in support of this two-year project, 'Opening the "Red Book"'. He holds a doctorate in music theory from Columbia University (2019), where he wrote a dissertation on the early history of computing in the analysis of musical scores. He also holds a bachelor's degree in music and mathematics from Trinity College Dublin (2013).
His research engages the history of digital technology as it relates to musical production, consumption, and criticism in the twentieth century.

Curating culturally themed collections online: The 'Russia in the UK' Special Collection, UK Web Archive

Hannah Connell (King's College London; British Library)

Keywords: Curatorship, diaspora, media, community, web archiving

ABSTRACT

The researcher-curated special collection, Russia in the UK, is part of the UK Web Archive, hosted by the British Library. This collection comprises a selection of websites created for and by the Russian-speaking population in the UK. This paper will explore the challenges of creating and maintaining web archival collections. I will discuss difficulties in determining the parameters of this special collection. Alongside the impact
The early web communities of GeoCities that are available on the Internet Archive are a unique and incredibly fruitful resource for studying youth participation in the early web (Milligan, 2017) in a way that gives youth voices autonomy and agency. New challenges emerge when applying computational methodologies and tools to youth cultures in historical web archives at scale. This paper considers the challenges in: 1) researching and writing about the phenomenon of young people divulging personal details about their lives without the possibility of informed consent; 2) accurately contextualizing web pages within wider online communities and; 3) engaging with socio-political climates that young people were experiencing and exploring the Web that focuses on the intersections of race, gender, sexuality, class, geography, and cultural and social pressures. The EU’s “Right to be Forgotten” (2014) and GDPR (2018) call into question the regularity with which young people become “data subjects” through their proximity to social networking sites, either through family, friends or themselves. Young people’s data is subject to commodification, surveillance, and archiving without consent. Researchers engaging with historical web material have a responsibility to develop better practices of care. This paper further develops frameworks 59 to ethically research young people’s historical web content in digital archives that accounts for the sensitive nature of web materials (Adair, 2018; Eichhorn, 2019), lack of consent protocols available to historical web researchers (Aoir IRE 3.0, 2019), and the ways in which computational methods and big data research attempts often fail to anonymize data (Brügger & Milligan, 2018). Web history research puts living human subjects at the forefront of historical research, which is something that historians are not particularly well-versed in. This paper surveys ethical approaches to internet and web archive research (Lomborg, 2018; Schäfer & Van Es, 2017; Whiteman, 2012; Weltevrede, 2016), identifies gaps in studying historical web youth cultures and suggests next steps. Works Cited: Adair, Cassius. (2019). “Delete Yr Account: Speculations on Trans Digital Lives and the Anti- Archival.” Digital Research Ethics Collaboratory. http://www.drecollab.org/ Brugger, Niels and Ian Milligan. (2018). The SAGE Handbook of Web History. London: Sage. Bruckman, Amy, Kurt Luther, and Casey Fiesler. 2015. “When Should We Use Real Names in Published Accounts of Internet Research?,” in Eszter Hargittai and Christian Sandvig (eds) Digital research confidential: the secrets of studying behavior online. Cambridge, Mass: MIT Press. DiMaggio, P., E. Hargittai, C. Celeste and S. Shafer. (2004). “Digital inequality: From unequal access to differentiated use.” In Social Inequality, ed. K. Neckerman. Russel Sage Foundation. Eichhorn, Kate (2019). The end of forgetting: growing up with social media. Cambridge, Mass: Harvard University Press. franzke, a.s., Bechmann, A., Zimmer, M. & Ess, C.M. (2019) Internet Research: Ethical Guidelines 3.0, Association of Internet Researchers, www.aoir.org/ethics. Ito et al. (2010). Hanging Out, Messing Around, and Geeking Out: Kids Living and Learning with New Media. MIT Press. Jenkins, H., M. Ito, and d. boyd. (2016). Participatory Culture in a Networked Era: A Conversation on Youth, Learning, Commerce, and Politics. Polity. Kearney, M. C. (2006). Girls Make Media. Routledge. Kearney, M. C. (2007). 
“Productive spaces girls’ bedrooms as sites of cultural production spaces.” Journal of Children and Media, 1, 126-141. Lincoln, S. (2013). “I’ve Stamped My Personality All Over It”: The Meaning of Objects in Teenage Bedroom Space.” Space and Culture, 17(3), 266–279. 60 Lomborg, Stine. (2018). “Ethical Considerations for Web Archives and Web History Research,” in SAGE Handbook of Web History, eds. Niels Brügger and Ian Milligan. Milligan, Ian. (2017). “Pages by Kids, For Kids”: Unlocking Childhood and Youth History through Web Archived Big Data,” in The Web as History, eds. Niels Brügger and Ralph Schroeder, UCL Press. Schäfer, Mirko Tobias, and Karin Van Es. (2017). The datafied society: studying culture through data. Amsterdam University Press. Scheidt, L. A. (2006.) “Adolescent diary weblogs and the unseen audience,” in Digital Generations: Children, Young People, and New Media, ed. D. Buckingham and R. Willet. Erlbaum. Skelton T. and Valentine G. (1998). Cool Places: Geographies of Youth Cultures. Routledge. Turkle, Sherry. (1995). Life on the Screen: Identity in the Age of the Internet, Simon and Schuster. van Dijck, José, Thomas Poell, and Martijn de Waal. (2018). The Platform Society; Public Values in a Connective World. New York: Oxford University Press. Vickery, J. R. (2017). Worried about the wrong things: Youth, risk, and opportunity in the digital world. Cambridge, MA: MIT Press. Watkins, S. C. et. al. (2018). The Digital Edge: How Black and Latino Youth Navigate Digital Inequality. NYU Press. Weltevrede. Esther. (2016). Repurposing digital methods. The research affordances of platforms and engines. PhD Dissertation, University of Amsterdam Whiteman, Natasha. (2012). “Ethical Stances in (Internet) Research,” in Undoing Ethics, by Natasha Whiteman, 1–23. Boston, MA: Springer US, 2012. Winters, Jane. (2017) “Breaking in to the mainstream: demonstrating the value of internet (and web) histories,” Internet Histories, 1:1-2, 173-179. Biography: Katherine (Katie) Mackinnon is a Ph.D. candidate at the University of Toronto in the Faculty of Information. She researches web histories, including early uses of the internet by young people in the 1990s through a case study of the popular website, ‘GeoCities’. She is particularly interested in using web archives to conduct historical work, focusing on youth expressions of identity and community within their specific socio-political contexts. 61 Changing platforms of ritualized memory practices. Assessing the value of family websites Dr. Susan Aasman (University of Groningen) Keywords: web archives, vernacular culture, amateur media, web archaeology, technologies of memory ABSTRACT In this presentation I want to introduce research on current personal digital archival practices, as they have shifted from private spaces to more public platforms. I would especially like to discuss the value of concrete everyday practices of storing and sharing multimodal family records on late nineties/early 21st century family web sites. In addition, I will address the vulnerability of these archival practices, introducing a casus of a particular family web site hosted by the famous Dutch provider XS4all who will close its service permanently. Although the National Library of the Netherlands (KB) started to collect XS4all websites, when it comes to selecting and preserving online personal archives, there is still a need to raise awareness about these deeply meaningful memory practices. 
For one, these type of practices of memory staging do have a history that is much older that the history of the web suggests; they belong to a long durée history of technologies of memory production and distribution. At the same time, understanding these family oriented websites as designed in the nineties and early 200s gives us an excellent opportunity to understand the specificities of the shift from private to public, and from analogue to digital. This research is part of larger agenda that addresses the urgent issue of long-term preservation of amateur media and how technological, political, social and cultural factors influence how we appraise and archive the often ephemeral nature of amateur media expressions. In particular, digital material poses multiple challenges, one of them the sustainability of many forms and formats of amateur media. The challenge is a shared task of public cultural heritage institutions, commercial, scholars and individuals alike. The archival strategies and the choices of what to keep and what to delete may resonate for decades to come. The presentation will argue that the complexities and contradictions that characterize present-day amateur media culture are mirrored by and reproduced in the complexities and contractions of archiving digital memories. There are no simple solutions and there are no simple guidelines, as amateur media archives – whether personal or collective or 62 whether they are analogue or digital - have been caught up in ethical, emotional, commercial, political contested areas and bear the burden of being technological, material, and personal. Biography: Dr. Susan Aasman is associate professor at the Centre for Media and Journalism Studies and Director of the Centre of Digital Humanities at the University of Groningen (NL). Her field of expertise is in media history, with a particular interest in amateur film and documentaries, digital cultures and digital archives, web history and digital history. She was a senior researcher in the research project ‘Changing Platforms of Ritualised Memory Practices: The Cultural Dynamics of Home Movie Making’. Together with Annamaria Motrescu-Mayes, she is the co-author of Amateur Media and Participatory Culture: Film, Video and Digital Media (Routledge 2019). Recently she started working on web archival and web historical projects. She co-edited – together with Kees Teszelszky and Tjarda de Haan - a special issue on Web Archaeology for the journal TMG/Journal for Media History (https://www.tmgonline.nl/). https://www.tmgonline.nl/ 63 Session 13: Source Code and App Histories Platform and app histories: Assessing source availability in web archives and app repositories Dr. Anne Helmond (University of Amsterdam) Fernando van der Vlist (Utrecht University) Keywords: platforms, apps, web historiography, web archiving, app archiving ABSTRACT In this presentation, we discuss the research opportunities for historical studies of apps and platforms by focusing on their distinctive characteristics and material traces. We demonstrate the value and explore the utility and breadth of web archives and software repositories for building corpora of archived platform and app sources. Platforms and apps notoriously resist archiving due to their ephemerality and continuous updates. As a result of rapid release cycles that enable developers to develop and deploy their code very quickly, large web platforms such as Facebook and YouTube change continuously, overwriting their material presence with each new deployment. 
Similarly, the pace of mobile app development and deployment is only growing, with each new software update overwriting the previous version. As a consequence, their histories are being overwritten with each update, rather than written and preserved. In this presentation, we consider how one might write the histories of these new digital objects, despite such challenges. When thinking of how platforms and apps are archived today, we contend that we need to consider their specific materiality. With the term materiality, we refer to the material form of those digital objects themselves as well as the material circumstances of those objects that leave material traces behind, including developer resources and reference documentation, business tools and product pages, and help and support pages. We understand these contextual materials as important primary sources through which digital objects such as platforms and apps write their own histories with web archives and software repositories. We present a method to assess the availability of these archived web materials for social media platforms and apps across the leading web archives and app repositories. Additionally, we conduct a comparative source set availability analysis to establish how, and how well, various source sets 64 are represented across web archives. Our preliminary results indicate that despite the challenges of social media and app archiving, many material traces of platforms and apps are in fact well preserved. The method is not just useful for building corpora of historical platform or app sources but also potentially valuable for determining significant omissions in web archives and for guiding future archiving practices. We showcase how researchers can use web archives and repositories to reconstruct platform and app histories, and narrate the drama of changes, updates, and versions. Biographies: Anne Helmond is an assistant professor of New Media and Digital Culture at the University of Amsterdam. Her research interests include software studies, platform studies, app studies, digital methods, and web history. Fernando van der Vlist is a PhD candidate at Utrecht University and a research associate with the Collaborative Research Centre “Media of Cooperation” at the University of Siegen. His research interests include software studies, digital methods, social media and platform studies, app studies, and critical data studies. Exploring archived source code: computational approaches to historical studies of web tracking Dr. Janne Nielsen (Aarhus University) Keywords: archived source code: computational approaches; historical studies; web tracking ABSTRACT This paper presents different ways of examining archived source code to find traces of tracking technologies in web archives. Several studies have shown a prolific use of tracking technologies used to collect data about web users and their behavior on the web (e.g. Altaweel, Good & Hoofnagle, 2015; Roesner, Kohno & Wetherall, 2012; Ayenson, Wambach, Soltani, Good & Hoofnagle, 2011; see also the review of existing tracking methods in Bujlow, Carela-Espanol, Lee & Barlet-Ros, 2017). Tracking is used for a multitude of purposes from authorisation and personalisation over web analytics and optimisation to targeted advertising and social profiling. The extent of web tracking and the magnitude of data collected by powerful companies like 65 Facebook and Google have caused concerns about privacy and consent. 
To better understand the spread of tracking and the possible implications of the practices involved, it is important to study the development leading up to today. Most studies of web tracking study the current web but to study the historical development of tracking, we can turn to web archives. The distinctive nature of archived web as "reborn digital" (Brügger, 2018) means that a study using archived web must always address the specific characteristics of this source and the associated methodological issues (Brügger, 2018; Masanès, 2006; Schneider & Foot, 2004) but a study of tracking technologies in the archived web poses additional, new methodological challenges. Tracking technologies are part of what could be called the environment of a website (cf. Helmond, 2017) but it is not part of what is usually considered the 'content', which the web archives aim to collect and preserve (Rogers, 2013). Tracking can also depend on technologies that are often difficult to archive (e.g. content based on JavaScript, Flash or similar). None the less, it is still possible to find traces of tracking technologies in web archives. One approach, inspired by the work of Helmond (2017), is to study the archived source code of websites. This paper presents a study of tracking technologies on the Danish web from 2006 to 2015 as it has been archived in the Danish national web archive Netarkivet. The study experiments with computational methods to map the development of different tracking technologies (e.g. http cookies and web beacons). The paper discusses the main methodological challenges of the study and shows how a profound knowledge of the specific archive and the changes in archiving strategies and settings over time is necessary for such a study. References: Altaweel, I., Good, N., & Hoofnagle, C. J. (2015). “Web Privacy Census”. Technology Science. Ayenson, M. D., Wambach, D. J., Soltani, A., Good, N., & Hoofnagle, C. J. 2011. “Flash Cookies and Privacy II: Now with Html5 and Etag Respawning.” Ssrn.com. July 29. Bujlow, T., Carela-Espanol, V., Lee, B.-R., & Barlet-Ros, P. 2017. “A Survey on Web Tracking: Mechanisms, Implications, and Defenses”. Proceedings of the IEEE, 105(8), 1476–1510. Brügger, N. 2018. The Archived Web: Doing History in the Digital Age. Cambridge: MIT Press. Helmond, A. 2017. Historical website ecology: Analyzing past states of the web using archived source code. In N. Brügger (Ed.), Web 25: histories from the first 25 years of the World Wide Web (pp. 139–155). New York: Peter Lang. Masanès, J. 2006. Web Archiving: Issues and Methods. In J. Masanes (Ed.), Web Archiving (pp. 1–53). Springer. 66 Roesner, F., Kohno, T., & Wetherall, D. 2012. “Detecting and Defending Against Third-Party Tracking on the Web”. Presented at the 9th USENIX Symposium on Networked Systems Design. Rogers, R. 2013. Digital methods. Cambridge: MIT Press. Schneider, S. M. & Foot, K. A. 2004. “The Web as an Object of Study”. New Media & Society, 6(1), 114–122. Biography: Janne Nielsen is an Assistant Professor, PhD, in Media Studies and a board member of the Centre for Internet Studies at Aarhus University. She is part of DIGHUMLAB, where she is head of LARM.fm (a community and research infrastructure for the study of audio and visual materials) and part of NetLab (a community and research infrastructure for the study of internet materials). Her research interests include media history, cross media, web historiography, web archiving, web tracking, privacy and consent. 
67 Session 14: AI and Infrastructures Cross-sector interdisciplinary collaboration to discover topics and trends in the UK Government Web Archive: a reflection on process Mark Bell (The National Archives, UK) Tom Storrar (The National Archives, UK) David Beavan (The Alan Turing Institute) Dr. Eirini Goudarouli (The National Archives, UK) Dr. Barbara McGillivray (The Alan Turing Institute) Dr. Federico Nanni (The Alan Turing Institute) Pip Willcox (The National Archives, UK) Keywords: Discovery, Machine Learning, Collaboration, Machine Assisted Exploration, Scale ABSTRACT This paper proposes a discussion of a collaboration between The National Archives and The Alan Turing Institute to use artificial intelligence technologies to enable the navigation and comprehension of the UK Government Web Archive (UKGWA) at scale. The National Archives are the official archive of UK government holding over 1000 years of history. Since 1996 The National Archives have been archiving UK government websites and social media output that are publicly accessible through the UKGWA. Users of the UKGWA can browse sites or use the very effective full text search service to find content in over 350 million documents (and counting). Search relies on keyword matching and is most effective when combined with domain knowledge, but most of our users don’t have this. There is currently no way to view the UKGWA as a whole or to group similar material together. Research into UKGWA users indicates they expect an “intuitive” search experience, allowing them to navigate this massive dataset, with search results surfacing relevant results. That type of search experience requires resource intensive data engineering and natural language processing methods that handle a high volume of queries, neither of which is currently available. 68 With The Alan Turing Institute, the national institute for data science and AI, we proposed a Data Study Group (DSG) to bring together experts from across and beyond academia to work on a data challenge for a week. Held in December 2019, the challenge focuses on discoverability of the UKGWA, applying advanced machine learning and natural language processing approaches to tasks such as creating a subject matter overview of the archive, machine assisted exploration, and identifying the emergence, growth, and decay of topics over time. This talk will explain the challenges that we face when it comes to explore, understand, analyse and interpret the UKGWA; will focus on the collaboration between The National Archives and The Alan Turing Institute; and will present the work of selection and preparation of data prior to the challenge, as well as the process and outcomes of the challenge week itself – what went well, what didn’t, what surprised us. We will also discuss next steps and how we will seek to implement the outcomes of this collaboration. This will include the challenges of turning a complex research prototype developed in a technical environment into something that can be practically integrated into the UKGWA interface to meet the needs of, and be understood by, our users. We would welcome the thoughts of conference participants on this work to date, including on how it can be made useful to researchers, web archives, and their users. Biographies: Mark Bell is Senior Digital Researcher at The National Archives. 
He has worked as researcher on the AHRC funded project Traces Through Time on which he developed statistical methodologies for record linkage, and on the ESPRC funded ARCHANGEL which explored the use of Distributed Ledger Technology to provide trust in archived born-digital material. Mark’s research interests cover a broad range of areas including Handwritten Text Recognition, Crowdsourcing, applications of Machine Learning to archival processes, and of course the challenges of working with large scale web archives. Tom Storrar is the Head of Web Archiving at The National Archives. He has led the Web Archive team for over 10 years, transforming the way that web archiving is performed. Tom has spoken at a number of international conferences about the challenges of web archiving. As well as the day to day challenges of maintaining the archive, he has also defined collection policies around web pages, social media accounts, and even code repositories, as well as managing the migration to cloud based archiving. David Beavan is Senior Research Software Engineer – Digital Humanities in the Research Engineering Group (also known as Hut 23) in The Alan Turing Institute. He has been working in the Digital Humanities (DH) for over 15 years, working collaboratively, applying cutting edge computational methods to explore new humanities challenges. He is Co-Investigator for two Arts and Humanities Research Council (AHRC) funded projects: Living with Machines and Chronotopic Cartographies, is Co-organiser of the Humanities and Data Science Turing Interest 69 Group and is Research Engineering's challenge lead for Data Science for Science (and also humanities) and Urban Analytics. Eirini Goudarouli is a member of the Research Team at The National Archives. Her current research interests include digital humanities and digital archives. She is particularly interested in bringing together methods and theories from a range of disciplines that could essentially contribute to the rethinking of digital, archival and collection-based research. Eirini is the Co-Investigator of the International Research Collaboration Network in Computational Archival Science (IRCN- CAS), funded by the Arts and Humanities Research Council. Barbara McGillivray is Turing Research Fellow at The Alan Turing Institute and the University of Cambridge. She has always been passionate about how Sciences and Humanities can meet. She completed a PhD in Computational Linguistics from the University of Pisa in 2010 after a degree in Mathematics and one in Classics from the University of Florence (Italy). Before joining the Turing, she was language technologist in the Dictionary division of Oxford University Press and data scientist in the Open Research Group of Springer Nature. Federico Nanni is a Research Data Scientist at The Alan Turing Institute, working as part of the Research Engineering Group, and a visiting fellow at the School of Advanced Study, University of London. He completed a PhD in History of Technology and Digital Humanities at the University of Bologna focusing on the use of web archives in historical research and has been a post-doc in Computational Social Science at the Data and Web Science Group of the University of Mannheim. He also spent time as a visiting researcher at the Foundation Bruno Kessler and the University of New Hampshire, working on Natural Language Processing and Information Retrieval. Pip Willcox is Head of Research at The National Archives. 
She has a background in digital editing and book history, focussing first on encoding medieval manuscripts and later on early modern printed books. More recently she has worked on projects linking collections and semantic web technologies, and social machines. She has developed a framework for an experimental humanities, using digital simulation to close-read and explicate interpretation of the archive. Her focus for the past several years has been on multidisciplinary engagement with collections, enabling digital research and innovation. A tale of two web archives: Challenges of engaging web archival infrastructures for research Jessica Ogden (University of Southampton) Emily Maemura (University of Toronto) Keywords: national web archives, researcher engagement, infrastructure studies ABSTRACT Web archives (WAs) are a key source for historical web research, and recent anthologies provide examples of their use by scholars from a range of disciplines (Brügger, 2017; Brügger 2018; 70 Brügger & Schroeder, 2017). Much of this work has drawn on large-scale collections, with a particular focus on the use of national web domain collections (Brügger & Laursen, 2019; Hockx- Yu, 2016). This previous work demonstrates how WAs afford new scholarship opportunities, yet little work has addressed how researcher engagement is impacted by the complexity of WA collection and curation. Further research has begun to address the impact of specific organizational settings where the technical constraints interact with policy frameworks and the limitations of resources and labour (Dougherty & Meyer, 2014; Hockx-Yu, 2014; Maemura et al. 2018; Ogden et al., 2017). Here, we extend this work to consider how these factors influence subsequent engagement, to investigate the very real barriers researchers face when using WAs as a source for research. This paper explores the challenges of researcher engagement from the vantage point of two national WAs: the UK Web Archive at the British Library, and Netarkivet at the Royal Danish Library. We compare and contrast our experiences of undertaking WA research at these institutions. Our personal interactions with the collections are supplemented by observations of practice and interviews with staff, in an effort to investigate the circumstances that shape the ways that researchers use WAs. We compare these two national WAs along several dimensions, including: the legal mandates for collection; the ontological decisions that drive practices; the affordances of tools and technical standards; everyday infrastructural maintenance and labour; and the ways in which all of the above constructs the interfaces through which WAs are researched. Our approach explores the materiality of WAs data across these two sites to acknowledge the generative capabilities of web archiving and reinforce an understanding that these data are not given or ‘natural’ (Gitelman, 2013). We highlight how the sociotechnical infrastructure of web archiving shapes researcher access, the types of questions asked, and the methods used. Here, access is conceived of not only in terms of ‘open’ versus ‘closed’ data, but rather as a spectrum of possibilities that orientates researchers to particular ways of working with data, whilst often decontextualising them from the circumstances of their creation. 
We question which kinds of digital research are afforded by national WAs, particularly when the scoping of collection boundaries on ccTLDs (top level domains) creates ‘artificial geographic boundaries’ (Winters, in press). Through this process we recognise and centre the assumptions about collection and use that are embedded in these research infrastructures, to facilitate a discussion of how they both enable and foreclose on particular forms of engagement with the Web’s past. 71 Bibliography: Brügger, N. (2018). The Archived Web: Doing History in the Digital Age. Cambridge, MA: MIT Press. Brügger, N. (Ed.). (2017). Web 25: histories from the first 25 years of the World Wide Web. New York: Peter Lang. Brügger, N., & Laursen, D. (Eds.). (2019). The historical web and digital humanities: The case of national web domains. Abingdon: Routledge. Brügger, N., & Schroeder, R. (Eds.). (2017). The Web as History: Using Web Archives to Understand the Past and the Present. London: UCL Press. Retrieved from http://oapen.org/download?type=document&docid=625768 Dougherty, M., & Meyer, E. T. (2014). Community, tools, and practices in web archiving: The state-of-the-art in relation to social science and humanities research needs. Journal of the Association for Information Science and Technology, 65(11), 2195–2209. https://doi.org/10.1002/asi.23099 Gitelman, L. (Ed.). (2013). “Raw data” is an oxymoron. Cambridge, Massachusetts; London, England: The MIT Press. Hockx-Yu, H. (2014). Access and Scholarly Use of Web Archives. Alexandria: The Journal of National and International Library and Information Issues, 25(1), 113–127. https://doi.org/10.7227/ALX.0023 Hockx-Yu, H. (2016). Web Archiving at National Libraries Findings of Stakeholders’ Consultation by the Internet Archive. Internet Archive. Retrieved from https://archive.org/details/InternetArchiveStakeholdersConsultationFindingsPublic Maemura, E., Worby, N., Milligan, I., & Becker, C. (2018). If These Crawls Could Talk: Studying and Documenting Web Archives Provenance. Journal of the Association for Information Science and Technology, 69(10), 1223–1233. https://doi.org/10.1002/asi.24048 Ogden, J., Halford, S., & Carr, L. (2017). Observing Web Archives: The Case for an Ethnographic Study of Web Archiving. In Proceedings of the 2017 ACM on Web Science Conference (pp. 299–308). Troy, New York, USA: ACM Press. https://doi.org/10.1145/3091478.3091506 Winters, J. (in press, 2019). Giving with one hand, taking with the other: E-legal deposit, web archives and researcher access. In P. Gooding & M. Terras (Eds.), Electronic Legal Deposit: Shaping the library collections of the future. London: Facet Publishing. Biography: Jessica Ogden, University of Southampton; jessica.ogden@soton.ac.uk Jessica Ogden is a PhD Candidate based in Sociology and the Web Science Centre for Doctoral Training at the University of Southampton. Jessica’s research focuses on the politics of data, web archiving and digital data scholarship. 72 Emily Maemura, University of Toronto; e.maemura@mail.utoronto.ca Emily Maemura is a PhD candidate at the University of Toronto’s Faculty of Information (iSchool). Her research focus is on web archiving, including approaches and methods for working with web archives data and research collections, and capturing diverse perspectives of the internet as an object and/or site of study. 73 Session 15: WARC and OAIS What’s missing from WARC? (Consultative Committee for Space Data Systems (CCSDS), Data Archive Interoperability (DAI) Working Group) Mr. Michael W. 
Mr. Michael W. Kearney III, sponsored by Google, Huntsville, Alabama, USA; Mr. John Garrett, Garrett Software, Columbia, Maryland, USA; Mr. David Giaretta, PTAB Ltd, Dorset, UK; Mr. Steve Hughes, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California, USA
Keywords: OAIS; WARC; CCSDS; HTML; MIME
ABSTRACT
This presentation will explain why the WARC format, by itself, is not adequate to preserve websites. As a brief justification of the claim: it is well known that a WARC file essentially captures the information sent from a website. However, by itself, this is not enough for long-term preservation, for the following reasons. Right now there are suitable, readily available web browsers which can deal with current websites, supporting HTML standards but often making guesses about how to display important yet badly constructed web pages. In the future, such browsers will not necessarily be available. More importantly, websites not only display pages but also download files. The WARC file may show a MIME type of "application/vnd.ms-excel", which is a hint to the web browser to use MS Excel to show a spreadsheet. But what do the columns mean? For example, a column labelled "speed" may seem easy to understand, but a speed of 10 mm/hour is very different from a speed of 10 miles/second. The WARC file does not provide enough information. The presentation will also explain what can be done to supplement WARC to fix these problems, utilizing the long-term preservation practices of OAIS.
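To make the MIME-type point concrete, here is a minimal sketch of how the hint looks from the data side. It assumes the open-source Python library warcio and a hypothetical capture file name; it is an illustration only, not part of the presenters' material.

    from collections import Counter
    from warcio.archiveiterator import ArchiveIterator

    def mime_type_census(warc_path):
        """Count the Content-Type hints stored in a WARC file's response records."""
        counts = Counter()
        with open(warc_path, 'rb') as stream:
            for record in ArchiveIterator(stream):
                if record.rec_type != 'response' or record.http_headers is None:
                    continue
                # The Content-Type (e.g. 'application/vnd.ms-excel') tells a browser
                # which tool to launch, but says nothing about what the columns,
                # labels or units in the payload actually mean.
                mime = record.http_headers.get_header('Content-Type', 'unknown')
                counts[mime.split(';')[0].strip()] += 1
        return counts

    print(mime_type_census('example-crawl.warc.gz'))  # hypothetical file name

Everything such a census can report is a format label; the semantic information needed for long-term preservation has to be supplied from outside the WARC file, which is where OAIS representation information comes in.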
Biographies:
Mike Kearney is an engineering graduate of the University of Kentucky. He worked for NASA for 34 years in systems engineering and technology positions, including chairmanship of the international standards body CCSDS, until retiring from NASA in 2015. He is now working with the non-profit Space Infrastructure Foundation and volunteers time for Google, which sponsors his attendance at digital preservation forums.
David Giaretta has led developments of standards in digital preservation (ISO 14721), in particular audit and certification of repositories (ISO 16363 and 16919), and has developed practical and coherent solutions and services that will help repositories seeking ISO certification while adding value to their holdings.
Steve Hughes is a Principal Computer Scientist at the National Aeronautics and Space Administration (NASA) Jet Propulsion Laboratory. He has three decades of experience with NASA's official archive for Solar System exploration science data, the Planetary Data System, and is chief architect for the archive's information architecture, which is based on principles from the Open Archival Information System (OAIS) Reference Model (ISO 14721) and the ISO/IEC 11179 Metadata Registry (MDR) standard. He is a member of the Primary Trusted Digital Repository Accreditation Board (PTAB) and an associate member of the Jet Propulsion Laboratory's Center for Data Science and Technology, a virtual center for research, development and operations of data-intensive and data-driven science systems. He was awarded the NASA Exceptional Public Service Medal for exceptional service to NASA science missions and data archives, architecting and implementing data-intensive systems, information models, and ontologies for three decades.
John Garrett is an engineering graduate from the Missouri University of Science and Technology and a Computer Science graduate of Johns Hopkins University. He spent 25 years working as a contractor for NASA's National Space Science Data Archive, including many years representing its needs and interests while developing digital preservation standards. He was instrumental in developing the OAIS Reference Model and continues to help lead the CCSDS DAI efforts developing OAIS-related standards and standards for certifying Trustworthy Digital Repositories.
Background on the CCSDS DAI Working Group:
CCSDS is the Consultative Committee for Space Data Systems. It started in 1982, developing data and communications interoperability standards for the data systems (flight and ground) that are used in space missions. While CCSDS is organized by space agencies, it is inclusive of other non-space organizations, industry and academia. CCSDS consists of about 22 working groups, one of which is the Data Archive Interoperability WG. The DAI WG is focused on long-term digital preservation archives. With extensive support from non-space-industry organizations (national archives and libraries from various countries, academia, other industry domains, etc.), the DAI WG developed the Reference Model for OAIS. Due to its wide applicability, OAIS became broadly adopted outside of the space industry. CCSDS and DAI standards are procedurally adopted and published by ISO (as CCSDS functions as ISO TC20/SC13). The DAI has published many standards that support OAIS and that are applicable to some space-related archives as well as other "generic" preservation archives globally.
Session 16: Web Archives as Scholarly Dataset
Web Archives as Scholarly Dataset to Study the Web
Dr. Helge Holzmann (Internet Archive), Jefferson Bailey (Internet Archive)
Keywords: data processing, extraction, derivation, access, research
ABSTRACT
The Internet Archive (IA) has been archiving broad portions of the global web for over 20 years. This historical dataset, currently totaling over 20 petabytes of data, offers unparalleled insight into how the web has evolved over time. Part of this collecting effort has included the ability to support large-scale computational research analyzing the collection. This presentation will provide an update on efforts within IA to support computational use of its web archive, approaching the topic through a description of both program and technical development efforts.
Web archives give us the opportunity to process the web as if it were a dataset, which can be searched, analyzed and studied, temporally as well as retrospectively. However, web data features some very specific traits that raise new challenges when providing services based on the contained information. Our Web Data Engineering efforts tackle these challenges in order to discover, identify, extract and transform archival web data into meaningful information for our users and partners, hiding the complexity and abstracting away technical details. Engineering has traditionally been the systematic application and combination of existing methods to build a desired system or thing. Data Engineering differs in that engineering here does not refer to creating something, but to transforming the data so that it is more useful for what should be achieved. As part of this, new tools and processes are developed to accomplish this transformation more effectively as well as efficiently in terms of resources and time.
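As one concrete illustration of computational access to this dataset (a sketch under stated assumptions, not an excerpt from the talk): the Wayback Machine's public CDX server lets a researcher enumerate captures of a URL as rows of metadata before extracting anything. The query below uses documented CDX parameters; the target URL and the field selection are arbitrary choices.

    import requests

    def list_captures(url, limit=50):
        """Query the Internet Archive's CDX server for captures of a URL."""
        params = {
            'url': url,
            'output': 'json',  # rows of field values; first row holds the field names
            'fl': 'timestamp,original,mimetype,statuscode',
            'limit': limit,
        }
        resp = requests.get('http://web.archive.org/cdx/search/cdx',
                            params=params, timeout=30)
        resp.raise_for_status()
        rows = resp.json()
        if not rows:
            return []
        header, data = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in data]

    for capture in list_captures('example.com')[:5]:
        print(capture['timestamp'], capture['mimetype'], capture['original'])

Starting from an index of captures rather than from raw WARC files is typical of the derivation-first workflow described in the abstract: select, then extract, then analyze.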
The talk will outline different computational research services for historical web archive data, along with technical challenges, novel developments and opportunities, as well as considerations to make when working with this unique dataset, including:
● Researcher support scenarios
● Data limitations, affordances, and complexities
● Extraction, derivation, and access methods
● Infrastructure requirements
● Relevant tools and technologies
● Collection development and augmentation
In covering these topics through the lens of specific collaborations between IA and computational researchers performing large-scale analysis of web archives, this presentation will illuminate issues and approaches that can inform the implementation of similar programs at other web archiving institutions, and also help researchers interested in data mining web collections better understand the possibilities of studying web archives and the types of services they can expect to encounter when pursuing this work. This overview is meant to showcase the latest achievements and upcoming data services from the Internet Archive's web archiving and data services group. Details about the way we and our systems work will be presented, together with APIs and programming libraries that are ready to use, as well as new features that are to be expected soon.
Biographies:
Helge Holzmann is a Web Data Engineer at the Internet Archive. Helge started working for the Archive in August 2018. Before that, he earned his Master of Computer Science and worked as a researcher in Germany, striving for his PhD on efficient access methods for web archives, which resulted in publications at different conferences and journals, including TPDL, JCDL, BigData, SIGIR and WWW, as well as the International Journal on Digital Libraries. He is passionate about big data, especially if there's a temporal aspect to it, and is glad to contribute to a non-profit organization that holds one of the biggest collections of free data in the world. In addition to creating innovative services by deriving new value from this unique dataset, Helge is happy to support libraries and institutions interested in accessing the data as a consultant located in Europe.
Jefferson Bailey is Director of Web Archiving & Data Services at the Internet Archive. Jefferson joined the Internet Archive in summer 2014 and manages its web archiving services, including Archive-It, used by over 650 institutions to preserve the web, as well as domain-scale and contract harvesting and indexing services. He works closely with partner institutions on collaborative technology development, computational research support, and data services. He is PI on multiple grants focused on systems interoperability, data-driven research use of web archives, and digital preservation initiatives. He was Chair of the Steering Committee of the International Internet Preservation Consortium (IIPC) until 2019.
Session 17: An Irish Tale / Scéal Éireannach
Born-digital displaced records: The disappearance of the GAA websites
Helena La Pina (Maynooth University)
Keywords: Irish culture; GAA; archived websites; born-digital displaced records
ABSTRACT
This year, the author completed an MA in Historical Archives at Maynooth University and produced a thesis titled 'Displaced archives, and the core components in the debates surrounding repatriation'.
The thesis utilises secondary literature in archival science, information/records management, and interdisciplinary scholarship to investigate the dilemmas associated with displaced archives. During the thesis research process, the author discovered that there was a limited amount of scholarship dealing with the displacement of electronic records, and a scarcity of scholarship regarding the displacement of born-digital records. This presentation aims to open a discussion on how archived websites might also be understood as displaced born-digital records. In doing so, the author discusses a research study which explores the presence of the Gaelic Athletic Association (GAA) web heritage in the Internet Archive's Wayback Machine.
Danielson (cited in Winn, 2015) offers an interpretation of displaced archives as 'archival materials that have been lost, seized, requisitioned, confiscated, purchased under duress, or otherwise gone astray'. Inkster (1983) proffers that a displaced or misplaced document comes under three definitions: the document is missing, the document is estray (the legal term for a document not in the possession of its owner), or the document is fugitive. The Society of American Archivists (SAA) defines 'fugitive' as connoting 'materials that are not held by the designated archives or library charged with their preservation.' Displaced archives are also referred to as misplaced archives, expatriated archives, seized archives, archives in exile, and migrated archives (Inkster, 1983; Garaba, 2011; Winn, 2015). However, as Garaba argues, whatever term is used to describe displaced records, and for whatever reason, the fundamental fact remains: they are not where they should be.
In this presentation, the author provides an analysis of the official GAA website as archived in the Wayback Machine within a certain timeframe. It also covers, on the periphery, other 'unofficial' GAA archived websites. While chronicling the important role the GAA has played in Irish society, the author observes what dates were used for capturing, and why the randomness of captures is not calibrated with end-of-season competitions like the All-Ireland final. The author discusses how the disappearance of GAA websites from the live web fits the description of a missing cultural record. The author also highlights how the capture of GAA websites in the Wayback Machine offers an interpretation of a born-digital displaced record, insofar as the record is not where it should be.
References:
Garaba, Francis (2010) An investigation into the management of the records and archives of former liberation movements in east and southern Africa held by national and private archival institutions (PhD dissertation, University of KwaZulu-Natal, South Africa, 2010) (https://researchspace.ukzn.ac.za/xmlui/handle/10413/1495)
Inkster, Carole M. (1983) Geographically misplaced archives and manuscripts: problems and arguments associated with their restitution, Archives and Manuscripts, 11(2), pp 113-124 (https://publications.archivists.org.au/index.php/asa/article/view/7559)
Winn, Samantha R. (2015) Ethics of access in displaced archives, Provenance, Journal of the Society of Georgia Archivists, 33(1), pp 6-13 (http://digitalcommons.kennesaw.edu/provenance/vol33/iss1/5)
Society of American Archivists, Dictionary of archival terminology (https://dictionary.archivists.org/entry/fugitive.html)
Biography:
Helena La Pina recently completed an MA in Historical Archives at Maynooth University.
Titled 'Displaced archives, and the core components in the debates surrounding repatriation', her thesis investigates the dilemmas associated with displaced archives within the context of archival practices, and the justifications, rationales, and challenges for repatriation.
Recording Ireland's technology heritage: Lessons learned
John Sterne (TechArchives project, Ireland)
Keywords: IT histories; technology heritage
ABSTRACT
At its public launch in June 2016 the TechArchives project reached out to people with experience of past generations of information technology in Ireland and asked them to record personal testimonies. This work is continuing. As the project evolved, however, it became more concerned about the limited quantity and quality of historic material. It is therefore developing processes and methods to locate, catalogue and preserve digital evidence of significant actions and events.
Biography:
John Sterne is the founder of the TechArchives project. In the past he worked as a researcher, author, reporter and editor.
Table of Contents
Introduction
Welcome from Sharon Healy and Michael Kurzmeier
#EWAVirtual KEYNOTES
#EWAVirtual Programme
#EWAVirtual Abstracts
Session 1: Archiving Initiatives
The National Library of Ireland's Web Archive: preserving Ireland's online life for tomorrow
Developing a Web Archiving Strategy for the Covid-19 Collecting Initiative at the University of Edinburgh
Internet for everyone: the selection and harvest of the homepages of the oldest Dutch provider XS4ALL (1993-2001)
Session 2: Collaborations
Leveraging the UK Web Archive in an Irish context: Challenges and Opportunities
Creating a web archive at Tate: an opportunity for ongoing collaboration
Session 3: Archiving Initiatives (Lightning Round)
PRONI Web Archive: A collaborative approach
An overview of 15 years of experience in archiving the Croatian web
The UK Web Archive and Wimbledon: A Winning Combination
Session 4: Research Engagement & Access
Piloting access to the Belgian web-archive for scientific research: a methodological exploration
Reimagining Web Archiving as a Realtime Global Open Research Platform: The GDELT Project
Session 5: Archiving Initiatives
Archiving 1418-Now using Rhizome's Webrecorder: observations and reflections
Managing the Lifecycle of Web Archiving at a Large Private University
Session 6: Social Science & Politics
Thematic web crawling and scraping as a way to form focussed web archives
Metadata for social science research
Exploring Web Archive Networks: The Case of the 2018 Irish Presidential Election
Session 7: Collaborations & Teaching
IIPC: training, collecting, research, and outreach activities
Using Web Archives to Teach and Opportunities in the Information Science Field
Session 8: Research of Web Archives
Web archiving - professionals and amateurs
Session 9: Research Approaches
Digital archaeology in the web of links: reconstructing a late-90s web sphere
Web defacements and takeovers and their role in web archiving
Session 10: Culture & Sport
MyKnet.org: Traces of Digital Decoloniality in an Indigenous Web-Based Environment
From the sidelines to the archived web: What are the most annoying football phrases in the UK?
Session 11: Research (Lightning Round)
Tracking and Analysing Media Events through Web Archives
Reanimating the CDLink platform: A challenge for the preservation of mid-1990s Web-based interactive media and net.art
Curating culturally themed collections online: The 'Russia in the UK' Special Collection, UK Web Archive
Session 12: Youth & Family
DELETE MY ACCOUNT: Ethical Approaches to Researching Youth Cultures in Historical Web Archives
Changing platforms of ritualized memory practices. Assessing the value of family websites
Session 13: Source Code and App Histories
Platform and app histories: Assessing source availability in web archives and app repositories
Exploring archived source code: computational approaches to historical studies of web tracking
Session 14: AI and Infrastructures
Cross-sector interdisciplinary collaboration to discover topics and trends in the UK Government Web Archive: a reflection on process
A tale of two web archives: Challenges of engaging web archival infrastructures for research
Session 15: WARC and OAIS
What's missing from WARC?
Session 16: Web Archives as Scholarly Dataset
Web Archives as Scholarly Dataset to Study the Web
Session 17: An Irish Tale / Scéal Éireannach
Born-digital displaced records: The disappearance of the GAA websites
Recording Ireland's technology heritage: Lessons learned

work_2pvrb5nxdjgufi5waihbgv3fmy ----
This may be the author's version of a work that was submitted/accepted for publication in the following source: Paul, Gunther & Wischniewski, Sascha (2012) Standardisation of digital human models. Ergonomics, 55(9), pp. 1115-1118. This file was downloaded from https://eprints.qut.edu.au/50094/. © Copyright 2012 Taylor & Francis. This is a preprint of an article submitted for consideration in Ergonomics, Vol. 55, Issue 9, DOI: 10.1080/00140139.2012.690454 (copyright Taylor & Francis); Ergonomics is available online at http://www.tandfonline.com/doi/abs/10.1080/00140139.2012.690454. Notice: Please note that this document may not be the Version of Record (i.e. published version) of the work. Author manuscript versions (as submitted for peer review or as accepted for publication after peer review) can be identified by an absence of publisher branding and/or typeset appearance. If there is any doubt, please refer to the published source.

SHORT COMMUNICATION
Standardization of Digital Human Models
Gunther Paul (a,*) and Sascha Wischniewski (b)
(a) School of Public Health and Social Work, Queensland University of Technology, Victoria Park Road, Kelvin Grove QLD 4059, Australia. Tel: +61 7 313 85795, Fax: +61 7 313 83369, email: gunther.paul@qut.edu.au
(b) Federal Institute for Occupational Safety and Health (BAuA), Friedrich-Henkel-Weg 1-25, 44149 Dortmund, Germany. Tel: +49 231 9071 2249, Fax: +49 231 9071 2294, email: wischniewski.sascha@baua.bund.de
* Corresponding author.

Abstract. Digital human models (DHM) have evolved as useful tools for ergonomic workplace design and product development, and are found in various industries and education.
DHM systems which dominate the market were developed for specific purposes and differ significantly, which is reflected not only in non-compatible results of DHM simulations, but also in misunderstandings of how DHM simulations relate to real-world problems. While DHM developers are restricted by uncertainty about user needs and a lack of standards related to model data, users are confined to one specific product and cannot exchange results or upgrade to another DHM system, as their previous results would be rendered worthless. Furthermore, the origin and validity of anthropometric and biomechanical data are not transparent to the user. The lack of standardization in DHM systems has become a major roadblock to further system development, affecting all stakeholders in the DHM industry. Evidently, a framework for standardizing digital human models is necessary to overcome current obstructions.
Keywords: Digital Human Model, standardization, computer manikin, body template, virtual human
Practitioner Summary. This short communication addresses a standardization issue for digital human models which has been taken up by the International Ergonomics Association Technical Committee for Human Simulation and Virtual Environments. It is the outcome of a workshop at the DHM 2011 symposium in Lyon, which identified the steps towards DHM standardization that need to be taken.

1. Introduction
Digital human models are of great importance in research and industry: they enable scientists to carry out computer-aided studies on human postures and motions, with the option to easily vary anthropometric as well as biomechanical parameters of virtual surrogates (Chaffin 2005).
Such  may be due to the implementation of different algorithms, body and kinematic models, an‐ thropometric assumptions or location of reference points. Furthermore, this makes it difficult  to transfer validated research concepts into commercially distributed DHM software systems,  which would broaden their basis for different usage.  This lack of standardization in DHM systems has become a major roadblock in further  system development, equally affecting all stakeholders in the DHM industry. It is evident that a  framework for standardizing digital human models is necessary to overcome current obstruc‐ tions. Therefore the IEA Technical Committee on Human Simulation and Virtual Environments  has formed a sub‐committee for DHM standardization (WG S) in July 2011.  This paper summarizes the outcome of the first meeting of the WG S, including previ‐ ous work, existing standards and guidelines, further requirements towards DHM standardiza‐ tion which were identified and the structure of required future activities.  2.  Previous work  The sub‐committee refers to previous, unpublished work done under the Society of Automo‐ tive Engineers (SAE) G‐13 committee (Human Modeling and Technology). The SAE G‐13 com‐ mittee defined human modeling technology purpose as to improve design quality in relation to  Human Factors, support definition of design requirements, demonstrate physical interaction  between human and system, and identify risks and cost associated with man‐in‐the‐loop (SAE  International 2012). Although it was recommended to expand on this work, the SAE G‐13  committee had difficulty defining standards for digital human models, as they concluded that  doing so would impact a supplier’s proprietary approach to building a manikin. Hence the G‐13  workgroup went no further than a project comparing the anthropometric accuracy of various  man models.   However, there is an apparent broad need in the wider digital human modeling com‐ munity to better understand the model assumptions of specific DHM manikins, exchange or  transfer information between DHM systems or DHM users, interpret DHM study results, justify  DHM system selection or investment and support DHM development.  3. Standards and guidelines  The International Organization for Standardization (ISO) provides a basic standard for com‐ puter manikins including joint degrees of freedom in ISO 15536, as well as the detailed stan‐ dards ISO 7250, ISO 15535 and ISO 20685 relating to human body measurements and their  storage in databases. ISO/IEC 19774 "specifies a systematic method for representing human‐ oids in a network‐enabled 3D graphics and multimedia environment". Besides, ISO  TC108/SC4/WG14 (posture related to whole‐body vibration) is drafting standard ISO TR 10687  on “Mechanical vibration – Description and determination of seated postures with reference  to whole‐body vibration” in the related domain of biodynamic modelling, with reference to  coherent measurement and modelling.     Apart from international standards, the International Society of Biomechanics devel‐ oped standards for the human body coordinate system (Wu and Cavanagh 1995) as well as for  joint coordinate systems (Wu et al. 2002, Wu et al. 2005).   Furthermore, file formats are important for the exchange of data. Different quasi  standards exist like for instance ASF/AMC (Acclaim 1994), BVH (Meredith and Maddock 2001),  GMS (Luciani et al. 
2006), C3D (Motion Lab Systems 2008), COLLADA (Collada Working Group  2006) or X3D (ISO 19775, ISO 19776, ISO 19777), which are in parts driven by the gaming and  movie industry in conjunction with motion capturing.   Currently the German Engineering Association (VDI) is working on a comprehensive standard  on human representation in the digital factory to provide an overview on current DHM practi‐ cal and theoretical issues to be published as part 4 of the VDI Guideline 4499 (Zuelch 2012).  4.  DHM standard considerations  From the above presented review of existing standards and guidelines, it becomes obvious  that many approaches already exist to build upon for a DHM standard. Despite this advantage,  proper integration of those existing standards and guidelines into a new DHM standard re‐ quires thorough consideration. The WG S raised additional questions, which were transferred  into the following course of action:  4.1. Review of past and current efforts  Before starting a new DHM standardization process, it is important to evaluate past as well as  current efforts. Looking at past initiatives and their results builds the basis and enables a les‐ sons learnt process.   Considering current efforts helps to avoid duplication of work. Examples for past and  current efforts collected and structured so far, are briefly summarized in section 2 and 3.  4.2. Establish current needs of users and vendors  Once the review has been completed, current needs of users as well as software vendors have  to be analyzed and established. In view of associated developmental work and future imple‐ mentation of the standard, a categorization into fundamental, important and useful issues  should be pursued.  User needs should be divided into academic and practical needs. Scientists may re‐ quire different standardization features than product and process engineers when using DHM  systems.  Identifying software vendor needs is another challenging task, since the question may  arise if standardization supports a software vendor’s business model.  4.3. Scope of standardization  The most critical question to answer remains the standardization targets. Fundamentally im‐ portant is a standard human anatomic structure, with defined global and local coordinate sys‐ tems, consistent naming and numbering of limb segments or joints, with their corresponding  uniform degrees of freedom.   Further on, consideration is required for a DHM standard procedure and parametric  model of linking anthropometric databases (ISO 15535) to a defined DHM human structure, in  order to create proportions representative for a selected population. Moreover, it has to be  assured that available anthropometric data (ISO 7250) can be used to calibrate the digital hu‐ man model to be used as either an individual or boundary manikin.   Additionally, a standard data format would significantly facilitate the exchange of re‐ search results. A DHM standard data model should encompass an input section, containing  information of model structure description, parameterization of the structure components  (e. g. limb size, range of motion), anthropometric assumptions, hard points and kinematic  drivers; as well as an output section, documenting the nature and results of simulations per‐ formed by a DHM.  Beyond these intrinsic parameters of a DHM standard, further extrinsic parameters  encompassing all interaction between the user and one or several DHM have to be considered.   
Thus a DHM standard should present an exemplary procedure on how to integrate the  virtual ergonomic process into today’s product and production design processes, in order to  assure ergonomically valid results. A future standard has to define classes of accuracy for digi‐ tal human models: How closely does a manikin need to match the human to produce adequate  analysis for the application pursued? Finally, a standard test/ protocol should be worked out to  allow comparisons between DHM performance and their compliance with the standard.  Out of scope for the working group but crucial to the success of DHM systems in terms  of validity are the collection, processing and accurate usage of anthropometric, physiologic  and biomechanical data (Van Sint Jan 20005). Current DHM systems contain different data sets  and algorithms which are of limited transparency to the user. ISO standards (7250, 15535,  20685) and their further enhancements target to ensure consistent data collection in terms of  methodology, sample size as well as data management and analysis. Their consideration needs  to be mandatory for a DHM standard.  5. Future work  The presented aspects have been clustered and assigned to small working groups, which will  develop drafts to be discussed at the regular plenary meetings. WG S plenary meetings are  held in conjunction with TC HS & VE annual meetings.   The WG S is structured as a self‐organized network within the TC HS & VE. The TC uses  a LinkedIn social network platform as the main communication channel for exchanging ideas.  WG S working sites have been established under the Standards Australia hub and an informa‐ tion share system managed by the German Federal Institute for Occupational Safety and  Health (BAuA) provided by the German Federal Office of Administration. The IEA TC HS & VE  and its WG S sub‐committee are open for new members to join and invite participation beyond  IEA membership.  6. References  Acclaim Advanced Technologies Group, 1994. Internal Technical Memo #39 [online]. Avail‐ able from  http://www.darwin3d.com/gamedev/acclaim.zip [Accessed 21 Mar 2012].  Bellet, T., et al., 2011. A computational model for car drivers Situation Awareness simula‐ tion: COSMODRIVE. In: Proceedings of the First International Symposium on Digital Human  Modeling, 14.06.‐16.06.2011, Lyon, France.  Collada Working Group, 2006. Collada  TM  Digital Asset and FX Exchange Schema [online].  Available from: https://collada.org/mediawiki/index.php/COLLADA [Accessed 21 Mar  2012].  Chaffin, D.B., 2005. Improving digital human modelling for proactive ergonomics in design.  Ergonomics, 48 (5), 478‐491.  Griffin, M.J., 2001. The validation of biodynamic models. Clin Biomech, 16 (Suppl 1), 81‐92.  International Organization for Standardization, ISO 7250: Basic human body measurements  for technological design.  International Organization for Standardization, ISO TR 10687: Mechanical vibration – De‐ scription and determination of seated postures with reference to whole‐body vibration.  Under development.  International Organization for Standardization, ISO 15535: General requirements for estab‐ lishing anthropometric databases.  International Organization for Standardization, ISO 15536: Ergonomics – Computer mani‐ kins and body templates.  International Organization for Standardization, ISO/IEC 19775 – Information technology ‐  Computer graphics and image processing – Extensible 3D (X3D).  
International Organization for Standardization, ISO/IEC 19776 – Information technology –  Computer graphics, image processing and environmental data representation – Extensible  3D (X3D) encodings.  International Organization for Standardization, ISO/IEC 19777 ‐ Information technology –  Computer graphics and image processing – Extensible 3D (X3D) language bindings.  International Organization for Standardization, ISO 20685: 3‐D scanning methodologies for  internationally compatible anthropometric databases.  International Organization for Standardization, ISO/IEC 19774: Humanoid Animation (H‐ Anim).  Luciani, A., et al., 2006. A Basic Gesture and Motion Format for Virtual Reality Multisensory  Applications. In Proceedings of the 1st International Conference on Computer Graphics  Theory and Applications, Setubal (Portugal), March 2006.    Meredith, M. and Maddock, S., 2001. Motion Capture File Formats Explained, Department  of Computer Science Technical Report CS‐01‐11 [online]. Available from:  http://www.dcs.shef.ac.uk/intranet/research/resmes/CS0111.pdf [Accessed 21 Mar 2012].  Motion Lab Systems, 2008. The C3D File Format User Guide [online]. Available from:  http://www.c3d.org/pdf/c3dformat_ug.pdf [Accessed 21 Mar 2012].   SAE International, 2012. G‐13 Human Modeling Technology [online]. Available from:  http://www.sae.org/standardsdev/aerospace/g13.htm    [Accessed 21 Mar 2012].  Siefert, A., et al., 2008. Virtual optimisation of car passenger seats: Simulation of static and  dynamic effects on driver’s seating comfort. Int J Ind Ergon, 38 (5/6), 410‐424.  Christensen, S.T., Siebertz, K., Damsgaard, M., de Zee, M., Rasmussen, J. and Paul, G., 2003.  Human seat modeling using inverse dynamic musculo‐skeletal models. Digital Human Mod‐ eling for Design and Engineering, Society of Automotive Engineers, Montreal, Canada, 16‐ 19 June, 2003.   Van Sint Jan, S., 2005. Introducing Anatomical and Physiological Accuracy in Computerized  Anthropometry for Increasing the Clinical Usefulness of Modeling Systems. Critical Reviews  in Physical and Rehabilitation Medicine, 17 (4), 249–274.   Wu, G. and Cavanagh, P.R., 1995. ISB recommendation for standardization in the reporting  of kinematic data. J. Biomech, 28 (10), 1257‐1261.  Wu, G., et al., 2002. Standardization Committee of the International Society of Biomechan‐ ics, ISB recommendation on definitions of joint coordinate system of various joints for the  reporting of human joint motion – part 1: ankle, hip, and spine. J. Biomech, 35 (4), 543‐548.  Wu, G., et al., 2005. International Society of Biomechanics, ISB recommendation on defini‐ tions of joint coordinate system of various joints for the reporting of human joint motion –  part 2: shoulder, elbow, wrist and hand. J. Biomech, 38 (5), 981‐992.  Zhuang, Z., Benson, S. and Viscusi, D., 2010. Digital 3‐D headforms with facial features rep‐ resentative of the current US workforce. Ergonomics, 53(5), 661‐671.  Zuelch, G., 2012. Features and limitations of digital human models – a new German guide‐ line. Work: A Journal of Prevention, Assessment and Rehabilitation, 41 (Suppl 1), 2253‐ 2259.    work_2r4jqloa6zg6jescqvapjpgamu ---- For your Partnered #DH project Critique essay, you were asked to use Shannon Mattern’s criteria for evaluating Multimodal Student Work and the peer review criteria from Galey & Ruecker’s “How a Prototype Argues”, (2010) to evaluate selected digital humanities projects. 
This is an important exercise, especially for those new to #dh, as it helps you think about what does and doesn't work, about the usefulness of various genres of projects, and about how current projects might be altered to become more useful, more user-friendly, and/or more academically rigorous. As we critique existing projects, we come to a better grasp of how we might develop and manifest our own projects. An important next gesture, then, is for us to establish assessment (read: grading) criteria for our own final transmedia projects. We'll start where we've already started: we'll look at Mattern's comprehensive criteria and choose those that best suit our needs. Then we'll add more, delete the unnecessaries, and edit those we want to tweak. Here I've posted my annotated copy of her list. These criteria are now posted on Rap Genius. Your assignment, to be completed within the next two weeks, is to add at least four annotations to that Rap Genius page expressing your ideas and opinions about these criteria. You can add criteria or vote for a deletion. You can suggest edits or request justification for why I've highlighted certain sections. The criteria I feel most important to our project are highlighted and annotated below. I welcome all comments. Go!
[Gould annotations, pages 1-3: images of the instructor's annotated copy of Mattern's criteria]
To these, let's add:
A consideration of Digital Preservation: Let's think about how we might best preserve our content as a whole. And let's consider best practices for saving our own personal work. Remember, it is always best to write your web content using a saveable (and backup-able) document source, like Word or Google Docs, that you can save, store, and archive. Should our site go down or become compromised, you'll want to have a backup copy of your hard work. Make sure you download a copy of your media element if possible and/or store an extra copy in the cloud. Let's think too about zombie links and dead sites.
asg

work_2seveqjymjgtpf7bowr6kyyiaa ----
Microsoft Word - 12-유희천.doc
Journal of the Ergonomics Society of Korea, Vol. 29, No.
3, pp. 383-391, June 2010. DOI:10.5143/JESK.2010.29.3.383

Ergonomic Evaluation of a Control Room Design of Radioactive Waste Facility using Digital Human Simulation
Baekhee Lee (1), Yoon Chang (1), Kihyo Jung (2), Ilho Jung (3), Heecheon You (1)
(1) Department of Industrial and Management Engineering, POSTECH, Pohang, 790-784
(2) Department of Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, PA, USA 16802
(3) Department of Nuclear, Power & Energy Plant Division, Hyundai Engineering, Seoul, 158-050

ABSTRACT
The present study evaluated a preliminary control room (CR) design of a radioactive waste facility using the JACK® human simulation system. Four digital humanoids (5th, 50th, 95th, and 99th percentiles) were used in the ergonomic evaluation. The first three were selected to represent 90% of the target population (Korean males aged 20 to 50 years) and the last to reflect the secular trend of stature over the next 20 years in South Korea. The preliminary CR design was assessed by checking its compliance with the ergonomic guidelines specified in NUREG-0700 and by conducting an in-depth ergonomic analysis, with a digital prototype of the CR design and the digital humanoids, in terms of postural comfort, reachability, visibility, and clearance. For identified design problems, proper design changes and their validity were examined using JACK. The revised CR design suggested in the present study would contribute to effective and safe operation of the CR as well as operators' health in the workplace.
Keywords: Digital human simulation, Control room, Radioactive waste facility

1. Introduction
A radioactive waste disposal facility (hereafter, RWDF) processes the low- and intermediate-level radioactive waste generated by nuclear power plants (NPPs). Low- and intermediate-level radioactive waste includes fuel used in NPPs as well as work clothes, gloves, and replaced equipment parts used in radiation-controlled areas, and it is legally designated to be managed safely (Korea Radioactive Waste Management Corporation, 2009a). Korea has been storing low- and intermediate-level radioactive waste in temporary storage facilities at NPPs; in anticipation of the saturation of these temporary facilities, however, an RWDF is planned for construction in Gyeongju, Gyeongsangbuk-do by 2012 (Korea Radioactive Waste Management Corporation, 2009b).
The main control room (MCR) of the RWDF needs ergonomic consideration from the early stages of design, both for effective operator monitoring and for reduced development cost. Hwang et al. (2009) performed an ergonomic evaluation of the MCR of an advanced NPP prior to full-scale operation and identified major usability improvements in three respects: MCR equipment layout, display and control interfaces, and the usability of work procedures. Koo et al. (2007), as part of a periodic safety review, evaluated the MCRs of NPPs operating in Korea (Kori 1-4, Yeonggwang 1 and 2) with an evaluation checklist and analyzed the items requiring improvement. Such ergonomic evaluations of MCRs that have already been built are useful for identifying design improvements, but improving an already developed MCR takes comparatively large cost and time. Effective MCR design and development therefore requires that ergonomic evaluation be applied from the earliest design stages.
Digital human simulation (DHS) using digital humanoids is usefully applied to the ergonomic design of workplaces. Lee et al. (2005) and Park et al. (2008) performed ergonomic evaluations using DHS for the design of an overhead crane cab and of the Korean utility helicopter cockpit, identified the design elements requiring improvement, and proposed design revisions (see Figure 1). Ergonomic design and evaluation using DHS makes it possible to conduct ergonomic evaluations with a virtual mockup from the early stages of product development, and it is recommended as a useful method for effectively reducing development time and cost (You, 2007; Chaffin, 2005).
The present study evaluated a preliminary design of the RWDF main control room using DHS and analyzed the design elements requiring improvement. First, a three-dimensional virtual prototype was developed for the preliminary MCR design produced by company H. The DHS evaluation was conducted with Jack®, and four digital humanoids (5th, 50th, 95th, and 99th percentiles) were generated based on the 2004 Size Korea anthropometric data and the secular trend of stature over the next 20 years. The MCR design was evaluated against four ergonomic criteria (postural comfort, reachability, visibility, and clearance), and the design elements requiring improvement and the directions for improvement were analyzed on the basis of the evaluation results.
* This work was supported by Korea Electric Power Corporation in 2009. Corresponding author: Heecheon You. Address: San 31 Hyoja-dong, Nam-gu, Pohang 790-784. Tel: 054-279-2210. E-mail: hcyou@postech.ac.kr
2. Evaluation Methods

2.1 Digital humanoids
The representative human models for the DHS evaluation were selected as the four shown in Table 1, considering both accommodation of 90% of the target population (5th-95th percentiles) and the secular trend of stature over the next 20 years. The target population was set as males aged 20-50 in consideration of the RWDF operator staffing plan, and three representative models (5th, 50th, and 95th percentiles) were generated so as to accommodate 90% of the 2004 Korean anthropometric survey data (n = 1,992; Size Korea, 2004). In addition, to reflect the secular trend in the evaluation, the present study additionally generated a 99th percentile human model (the 95th percentile 20 years hence) based on three considerations: the stature growth of Korean males, information on stature growth in other countries, and a conservative estimate.

Table 1. Body dimensions of the representative human models for the RWDF main control room evaluation (units: cm, kg)
No. / Body dimension / 5th / 50th / 95th / 99th percentile
1. Stature: 160.5 / 170.2 / 180.2 / 184.4
2. Waist depth: 17.8 / 22.0 / 27.3 / 29.4
3. Ankle height: 7.3 / 8.5 / 9.6 / 10.2
4. Shoulder height: 129.1 / 137.8 / 146.8 / 150.0
5. Lower arm length: 70.3 / 75.8 / 81.6 / 83.9
6. Shoulder breadth: 35.8 / 39.7 / 43.1 / 44.8
7. Breadth between upper arms: 42.7 / 46.6 / 50.6 / 52.7
8. Buttock-knee length, sitting: 52.8 / 56.7 / 61.2 / 63.4
9. Elbow height, sitting: 22.2 / 26.0 / 29.9 / 31.3
10. Elbow-fingertip length (elbow bent): 41.5 / 44.8 / 48.0 / 49.3
11. Foot breadth: 9.3 / 10.1 / 10.9 / 11.3
12. Foot length: 23.5 / 25.3 / 27.1 / 28.0
13. Hand length: 17.2 / 18.5 / 19.9 / 20.5
14. Head breadth: 13.0 / 14.8 / 18.1 / 18.7
15. Head height: 17.7 / 20.8 / 25.7 / 26.6
16. Hip breadth: 30.5 / 33.0 / 35.8 / 36.9
17. Interpupillary breadth: 5.5 / 6.3 / 7.1 / 8.0
18. Upper arm length: 30.7 / 33.2 / 36.0 / 37.0
19. Shoulder height, sitting: 55.3 / 59.6 / 64.1 / 65.5
20. Eye height, sitting: 75.4 / 80.5 / 85.7 / 87.6
21. Sitting height: 86.8 / 92.1 / 97.4 / 99.5
22. Knee height, sitting: 46.9 / 50.7 / 54.8 / 56.7
23. Thigh height, sitting: 12.7 / 15.2 / 17.8 / 19.4
24. Body weight: 55.6 / 70.1 / 87.1 / 95.3

Figure 1. Examples of ergonomic evaluation using DHS: (a) overhead crane evaluation; (b) helicopter cockpit evaluation

The stature of Korean males aged 20-50 increased by 4.4 cm over the past 25 years (1979-2004), as shown in Figure 2. Secular growth is known to differ across countries according to economic growth and nutritional characteristics, as shown in Table 2 (Roche, 1995). For example, the secular growth of stature in Japan, whose GNP (US$42,657 million) is about 4.5 times that of Korea (US$9,287 million), was 1.32 cm over the most recent 10 years, whereas Korea grew by 1.65 cm over the same period. Lastly, based on the domestic and foreign stature growth characteristics, a conservative estimate of secular growth was applied so that the RWDF main control room would satisfy the target accommodation rate as far as possible even 20 years from now.

Table 2. Secular trend of stature for different populations
Population / Age / Gender / Secular trend per decade (cm) / Reference
Italian / N.S. / N.S. / 0.97 / Arcaleni (2006)
American / N.S. / M / 1.00 / NASA (2006)
Portuguese / 18 / M / 0.99 / Padez and Johnston (1999)
Pole / 19 / M / 2.10 / Bielicki and Szklarska (1999)
Korean / 20-49 / M & F / 1.65 / Size Korea (2004)
Japanese / 20-49 / M & F / 1.32 / AIST (2006)
(N.S.: not specified; M: male, F: female)

Figure 2. Change in stature of Korean males aged 20-50 (Size Korea, 2004)
Figure 3. Digital humanoids generated using Jack®

The digital humanoids were generated by entering the body dimensions of the defined representative models into Jack, as shown in Figure 3. Generating a humanoid in Jack requires values for 27 body dimensions, but the Korean anthropometric data provide values for only 24 of them. Therefore, the dimensions of the 24 body variables available in the Korean data were entered directly, and for the remaining 3 variables (hand breadth, head length, and thumb-tip reach) Jack's estimates based on the entered values were used.

2.2 Reference posture for evaluation
The present study defined the operator monitoring posture for the DHS evaluation as shown in Table 3, with reference to previous studies on computer work postures. Previous studies have proposed recommended postures for computer work based on observation and posture analysis of such work. Since the monitoring task in the RWDF main control room is similar in character to computer work, the reference posture for the evaluation (see Figure 4) was defined using the recommended postures from those studies. Taking Table 3 as an example, the recommended range for shoulder flexion in computer work is 0-25°, and the mid-value of that range, 13°, was selected as the reference posture for the evaluation.

Table 3. Computer work postures and the reference posture for the RWDF main control room evaluation
Body part / Joint motion / Recommended postures (deg) and references / Recommended range (deg) / Reference posture (deg)
Neck (*) / ventral flexion (+), dorsal flexion (-) / 34-65 (Grandjean et al., 1983); 24.5-65 (Kim et al., 1991) / 24.5-65 / 35
Shoulder / flexion (+), extension (-) / 0-25 (Chaffin and Andersson, 1984); 0 (ANSI/HFES, 2007); 13 (Grandjean, 1987); 23 (Salvendy, 1987) / 0-25 / 13
Shoulder / abduction (+), adduction (-) / 0-25 (Chaffin and Andersson, 1984); 8-23 (Salvendy, 1987) / 0-25 / 13
Elbow / flexion (+) / 70-135 (Cushman, 1984; Grandjean et al., 1983; Miller and Suther, 1981; Weber et al., 1984); 90 (ANSI/HFES, 2007); 99 (Salvendy, 1987); 75-125 (Grandjean et al., 1983) / 70-135 / 80
Wrist / flexion (+), extension (-) / -10 to 30 (Hedge et al., 1995; Keir et al., 1995; Rempel and Horie, 1994; Weiss et al., 1995) / -10 to 30 / 10
Trunk (**) / flexion (+) / 90 (Chaffin and Andersson, 1984); 104 (Grandjean, 1987); 100-110 (Salvendy, 1987); 90 (ANSI/HFES, 2007) / 90-110 / 95
Hip (**) / flexion (+) / 0 (ANSI/HFES, 2007) / 0 / 0
Knee (**) / flexion (+) / 90 (ANSI/HFES, 2007) / 90 / 90
(*) Angle at the cervicale between the vertical line through the cervicale and the line from the cervicale to the tragion.
(**) Angle between the transverse plane and the body segment.

Figure 4. Reference postures for the RWDF main control room evaluation: (a) seated posture, side view; (b) seated posture, front view

2.3 Ergonomic evaluation criteria
Four ergonomic evaluation criteria, described in Table 4, were considered in the evaluation, and the criteria applied to each main control room design element were set as shown in Table 5, considering the characteristics of that element. The ergonomic criteria were postural comfort, reachability, visibility, and clearance, as used in previous DHS evaluation studies (Park et al., 2008; Bowman, 2001; Nelson, 2001). The selected criteria were applied selectively according to the characteristics of the design element under evaluation. For example, Table 5 shows that the console at which the operator sits is evaluated in terms of posture and clearance, whereas the LDP (large display panel), which presents facility-related information, is analyzed in terms of postural comfort and visibility.

Table 4. Ergonomic evaluation criteria
Postural comfort: the degree to which the operator maintains a comfortable posture while performing monitoring tasks
Reachability: the degree to which the operator can easily reach the main control room design elements
Visibility: the degree to which the operator can comfortably see the main control room design elements
Clearance: the amount of space between the operator's body and the design elements

Table 5. Examples of the relationship between design elements and ergonomic evaluation criteria (o: applied, x: not applied)
No. / Design element / Postural comfort / Reachability / Visibility / Clearance
1. Console: o / x / o / o
2. LDP: o / x / o / x
3. LCD: o / x / o / x
4. Security access control sub-console: o / o / x / x
5. CCTV master control rack: o / o / x / x
6. Main fire control panel: o / o / x / x
7. Printers: o / o / x / x

The design adequacy of the RWDF main control room was judged by analyzing whether the ergonomic evaluation results comply with the NUREG-0700 design guidelines (O'Hara et al., 2002). As shown in Table 6, NUREG-0700 provides ergonomic design guidelines for the various design elements used in nuclear power plants.
Table 6. NUREG-0700 design guidelines applied in the evaluation
Design element / Criterion / Guideline / Percentiles applied
Console / Clearance / Should provide adequate height, depth, and knee clearance for the 5th to 95th percentile adults (p. 426, 11.1.5-4) / 95th & 99th
LDP / Visibility / Permit operators at the consoles a full view of all display panels (p. 459, 12.1.1.3-1); be able to view information from multiple locations (p. 327, 6.3.1-1); horizontal viewing angle requirement: acceptable limit is within 30° from the centerline of each display (p. 329, 6.3.2-4, 6.3.2-5) / 5th-99th
LDP / Location / Centrally located in the control room (p. 311); minimum viewing distance: not closer to any observer than half the display width or height, whichever is greater (p. 329, 6.3.2-3); maximum viewing distance: able to resolve all important display detail at the maximum viewing position (p. 329, 6.3.2-2) / 5th-99th
LDP / Character size / Character height (cm) = 6.283 x D x MA / 21600 (p. 47, 1.3.1-4), where D is the viewing distance and MA the visual angle in minutes of arc; minimum MA: 16'; recommended MA: 20'-22'; character height-to-width ratio should be between 1:0.7 and 1:0.9 (p. 47, 1.3.1-5) / 5th-99th
LCD / Visibility / Vertical viewing angle requirement: not more than 20° above and 40° below the user's horizontal line of sight (p. 419, 11.1.2-6); viewing distance: 33-80 cm, with 46-61 cm preferred (p. 420, 11.1.2-8) / 5th-99th

According to the NUREG-0700 guidelines illustrated in Table 6, the console clearance should be designed to provide adequate legroom for seated 5th and 95th percentile operators; the location of the LDP should be determined such that all operators working at various positions can see the screen; and the LCD should be installed so that its height falls within a viewing range of -40 to 20°. The present study evaluated the design adequacy of the RWDF main control room by applying the design guidelines provided in NUREG-0700.
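The character-height rule in Table 6 is plain arc-length arithmetic: a character subtending MA minutes of arc at viewing distance D must be about 2 x pi x D x MA / 21600 cm tall, and 2 x pi = 6.283 gives the constant in the guideline. A small helper (an illustration only) makes the computation explicit:

    import math

    def character_height_cm(viewing_distance_cm, minutes_of_arc=20.0):
        """NUREG-0700 character height: h = 6.283 * D * MA / 21600 (Table 6)."""
        return 2 * math.pi * viewing_distance_cm * minutes_of_arc / 21600

    # At the preferred 61 cm LCD viewing distance, a 20-arcmin character
    # comes out to about 0.35 cm tall.
    print(round(character_height_cm(61), 2))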
3. Evaluation Cases
This section presents ergonomic adequacy evaluation cases for three representative design elements of the RWDF main control room: the console, the LDP, and the LCD.
First, the minimum leg clearance of the console was analyzed to be 1.6-6 cm across the four digital humanoids, satisfying the NUREG-0700 design criterion. The minimum clearance was calculated as the shortest distance between the legs and the console, and it decreases as body size increases. For example, Figure 5 shows that the minimum clearance is 3.5 cm for the 95th percentile humanoid and 1.6 cm for the 99th percentile humanoid.
Figure 5. Leg clearance analysis of the console
The LCD vertical viewing angles were found to satisfy the NUREG-0700 design criterion. The LCD vertical viewing range was calculated as the angles at which a humanoid in the reference posture views the LCD, as shown in Figure 6. Taking Figure 6 as an example, the LCD vertical viewing range was -29 to 1° for the 5th percentile humanoid and -34 to -4° for the 95th percentile humanoid, satisfying the NUREG-0700 design criterion of -40 to 20°.
Figure 6. LCD vertical viewing ranges: (a) 5th percentile; (b) 50th percentile; (c) 95th percentile; (d) 99th percentile
The vertical line of sight to the LDP was found to lie above the horizontal line of sight, as shown in Figure 7, so the posture could become uncomfortable during prolonged monitoring. The LDP vertical viewing range was calculated as the angles at which the LDP, installed at a height of 125 cm, can be viewed over the LCD. Taking Figure 7(a) as an example, the LDP vertical viewing angles of the 5th percentile humanoid are 2-23°, showing that the 5th percentile humanoid can adequately view the LDP over the LCD.
Figure 7. LDP vertical viewing ranges: (a) 5th percentile; (b) 50th percentile; (c) 95th percentile; (d) 99th percentile
The LDP vertical viewing range across the humanoids was found to be -1 to 23°, which satisfies the NUREG-0700 criterion that the entire LDP screen must be visible. However, the LDP viewing angles lie above the horizontal line of sight and outside the display viewing ranges recommended in the literature (Grandjean et al., 1983: -26 to -2°; Kim et al., 1991: -56 to -1°; O'Hara et al., 2002: -40 to 20°), so prolonged LDP monitoring could cause physical discomfort and fatigue.
It was found that lowering the LDP to improve the vertical viewing angles also requires lowering the LCD. The LDP vertical viewing angles can be improved by lowering the LDP; however, as shown in Figure 8, lowering the LDP in the current design would cause the LCD to obstruct the view.
Figure 8. Visibility relationship between the LDP and the LCD
This visual interference by the LCD can be efficiently eliminated by providing a recess in the console into which the lower part of the LCD is inserted, as shown in Figure 9.
Figure 9. LCD installation recess in the console: (a) the recess; (b) the LCD installed
When the LDP height is lowered to 115 cm with a 10 cm deep LCD installation recess, the LDP vertical viewing range improves to -3 to 19°, as shown in Figure 10. The improved LDP viewing range is lower than the original range (-1 to 23°). For example, the LDP viewing range for the 5th percentile humanoid improved from 2-23° to 0-19°. With the LCD installation recess in place, the LCD vertical viewing range was -31 to -2.5°, satisfying the NUREG-0700 design criterion (-40 to 20°).
Figure 10. Improved LDP vertical viewing ranges: (a) 5th percentile; (b) 50th percentile; (c) 95th percentile; (d) 99th percentile
The LDP horizontal viewing range satisfied the NUREG-0700 guideline of within 30° horizontally from the centerline of the LDP. The RWDF main control room is planned to be run by one operator (responsible for the seven consoles on the left) and one supervisor (responsible for the three consoles on the right), as shown in Figure 11. For both the operator and the supervisor, the LDP horizontal viewing range was calculated as the angle between the line of sight to the center of the LDP and the lines of sight to the left and right edges of the LDP. The operator's horizontal viewing range was 12-27° depending on the console seating position, and the supervisor's was 14-26°.
Figure 11. LDP horizontal viewing ranges: (a) operator; (b) supervisor
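The viewing ranges reported above reduce to simple trigonometry: the elevation of a display edge relative to the seated eye point. The sketch below illustrates the computation; the dimensions in the example are placeholders, not the actual design values.

    import math

    def elevation_deg(eye_height_cm, eye_to_display_cm, target_height_cm):
        """Elevation of a display point relative to the horizontal line of sight.
        Positive means above the line of sight, negative means below."""
        rise = target_height_cm - eye_height_cm
        return math.degrees(math.atan2(rise, eye_to_display_cm))

    # Placeholder numbers: a 120 cm seated eye height, a panel 150 cm away,
    # with its lower and upper edges at 125 cm and 185 cm.
    low = elevation_deg(120, 150, 125)    # about +1.9 deg
    high = elevation_deg(120, 150, 185)   # about +23.4 deg
    print('vertical viewing range: %.1f to %.1f deg' % (low, high))

A range that sits entirely above 0° is exactly the situation flagged for the LDP: the operator must look up throughout the monitoring task.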
This study set the reference evaluation postures based on the computer workstation postures proposed in previous studies. However, the disposal facility main control room examined here is equipped with two displays (an LCD and an LDP), so its working postures may differ from those at a computer workstation. A more suitable evaluation of the disposal facility main control room therefore requires reference postures that reflect the characteristics of tasks performed using both the LCD and the LDP.

References

구진영, 장통일, 이중근, 이용희, A comparative review of ergonomic design criteria for the main control rooms of domestic nuclear power plants. Proceedings of the 2006 Fall Conference of the Ergonomics Society of Korea, 2006.
김철중, 이남식, 김진호, 박세진, 박수찬, 박재회, 조현모, 이윤우, 이회윤, A study on ergonomic design and evaluation technology for VDT workstations. Korea Research Institute of Standards and Science, KRISS-93-061-IR, 1991.
박장운, 정기효, 이원섭, 강병길, 이정효, 엄주호, 박세권, 유희천, Development of an ergonomic evaluation method for helicopter cockpit design using digital human simulation. Proceedings of the 2008 Spring Conference of the Ergonomics Society of Korea, 2008.
유희천, Digital human model simulation for ergonomic design of tangible products and workplaces. Proceedings of the 2007 Fall Conference of the Ergonomics Society of Korea, 2007.
이상기, 이민정, 조영석, 권오채, 박정철, 유희천, 한성호, Ergonomic design and evaluation of an overhead crane using digital human simulation. Proceedings of the 2005 Spring Joint Conference of the Ergonomics Society of Korea and the Korean Society for Emotion and Sensibility, and the 8th Korea-Japan Joint Ergonomics Symposium, 57-60, 2005.
정기효, 이원섭, 박장운, 강병길, 엄주호, 박세권, 유희천, Ergonomic design and evaluation of a Korean helicopter cockpit. Proceedings of the 16th Ground Weapon Systems Conference, 2008.
한국방사성폐기물관리공단 (Korea Radioactive Waste Management Corporation), Definition of low- and intermediate-level radioactive waste. Retrieved August 21, 2009 from http://www.krmc.or.kr, 2009a.
한국방사성폐기물관리공단 (Korea Radioactive Waste Management Corporation), Overview of the Wolsong low- and intermediate-level radioactive waste disposal center project. Retrieved August 21, 2009 from http://www.krmc.or.kr, 2009b.
ANSI/HFES, Human Factors Engineering of Computer Workstations. California, USA: Human Factors and Ergonomics Society, 2007.
Arcaleni, E., Secular trend and regional differences in the stature of Italians, 1854-1980. Economics and Human Biology, 4, 24-38, 2006.
Bielicki, A. and Szklarska, A., Secular trends in stature in Poland: national and social class-specific. Annals of Human Biology, 26(3), 251-258, 1999.
Bowman, D., Using digital human modeling in a virtual heavy vehicle development environment. In Chaffin, D. B. (Ed.), Digital Human Modeling for Vehicle and Workplace Design. Warrendale, PA: SAE International, 2001.
Chaffin, D. B., Improving digital human modeling for proactive ergonomics in design. Ergonomics, 48(5), 478-491, 2005.
Chaffin, D. B., Digital Human Modeling for Vehicle and Workplace Design. Pennsylvania, USA: SAE, 2001.
Chaffin, D. B. and Andersson, G., Occupational Biomechanics (2nd ed.). New York, USA: Wiley-Interscience, 1984.
Gordon, C. C., Bradtmiller, B., Churchill, T., Clauser, C., McConville, J., Tebbetts, I. and Walker, R., 1988 Anthropometric Survey of US Army Personnel: Methods and Summary Statistics (Technical Report NATICK/TR-89/044). US Army Natick Research Center: Natick, MA, 1988.
Grandjean, E., Ergonomics in Computerized Offices. Philadelphia, USA: Taylor & Francis, 1987.
Grandjean, E., Hunting, W. and Pidermann, M., VDT workstation design: Preferred settings and their effects. Human Factors, 25, 161-175, 1983.
Hedge, A. and Powers, J. A., Wrist postures while keyboarding: Effects of a negative slope keyboard system and full motion forearm supports. Ergonomics, 38, 508-517, 1995.
Hwang, S.-L., Liang, S.-F. M., Liu, T.-Y. Y., Yang, Y.-J., Chen, P.-Y. and Chuang, C.-F., Evaluation of human factors in interface design in main control rooms. Nuclear Engineering and Design, 239, 3069-3075, 2009.
NASA, Man-system integration standards. Retrieved September 22, 2009 from http://msis.jsc.nasa.gov/Volume1.htm, 2006.
National Institute of Advanced Industrial Science and Technology (AIST), Secular change in Japan. Retrieved January 11, 2009 from http://www.dh.aist.go.jp/research/centered/anthropometry/secular.php.en, 2006.
Nelson, C., Anthropometric Analyses of Crew Interfaces and Component Accessibility for the International Space Station. In Chaffin, D. B. (Ed.), Digital Human Modeling for Vehicle and Workplace Design.
Warrendale, PA: SAE International, 2001.
O'Hara, J. M., Brown, W. S., Lewis, P. M. and Persensky, J. J., Human-System Interface Design Review Guidelines (NUREG-0700, Rev. 2). U.S. Nuclear Regulatory Commission, Office of Nuclear Regulatory Research, Washington, DC, 2002.
Padez, C. and Johnston, F., Secular trends in male adult height 1904-1996 in relation to place of residence and parent's educational level in Portugal. Annals of Human Biology, 26(3), 287-298, 1999.
Roche, A. F., Executive Summary of Workshop to Consider Secular Trends and Possible Pooling of Data in Relation to the Revision of the NCHS Growth Charts. Division of Health Examination Statistics, National Center for Health Statistics, Hyattsville, Maryland, 1995.
Size Korea, Korean anthropometric statistics. Retrieved September 26, 2009 from http://sizekorea.kats.go.kr, 2004.

About the Authors

이백희 (x200won@postech.ac.kr): B.S., Department of Industrial Engineering, Inha University; currently an M.S. student in the Department of Industrial and Management Engineering, POSTECH. Research interests: ergonomic product design and evaluation in digital environments, automotive ergonomics, and user interface design.

장윤 (thursday@postech.ac.kr): B.S., School of Industrial Information Design, Handong Global University; currently an M.S. student in the Department of Industrial and Management Engineering, POSTECH. Research interests: ergonomic product design, 3D visualization, and information design.

정기효 (khjung@postech.ac.kr): Ph.D., Department of Industrial and Management Engineering, POSTECH; currently a post-doctoral researcher in the Department of Industrial Engineering, Pennsylvania State University. Research interests: ergonomic product design and evaluation in digital environments, user-centered product design, usability evaluation, and prevention and control of work-related musculoskeletal disorders.

정일호 (wgo15ugo@hec.co.kr): M.S., Department of Multimedia Systems Engineering, Kyung Hee University; currently a researcher in the Nuclear Power Division, Hyundai Engineering. Research interests: main control room design, MCB design, and nuclear ergonomics.

유희천 (hcyou@postech.ac.kr): Ph.D., Department of Industrial Engineering, Pennsylvania State University; currently an associate professor in the Department of Industrial and Management Engineering, POSTECH. Research interests: ergonomic product design technology, user-centered product design, ergonomic product design and evaluation in virtual environments, usability engineering, and prevention and control of musculoskeletal disorders.

Date Received: November 5, 2009; Date Revised: March 19, 2010; Date Accepted: April 30, 2010.

work_2sfmwbbrdfflznayz2scykzqqq ---- HowToDoDPoS_Preprint

This is a preprint of an article whose final and definitive form is published in Philosophy of Science. Please quote only the published version of the paper.

How to do digital philosophy of science

Charles H. Pence, Department of Philosophy and Religious Studies, Louisiana State University, Baton Rouge, LA, USA. charles@charlespence.net, https://charlespence.net

Grant Ramsey, Institute of Philosophy, KU Leuven, Leuven, Belgium. grant@theramseylab.org, http://www.theramseylab.org

Abstract

Philosophy of science is beginning to be expanded via the introduction of new digital resources—both data and tools for its analysis. The data comprise digitized published books and journal articles, as well as heretofore unpublished and recently digitized material, such as images, archival text, notebooks, meeting notes, and programs. This growing bounty of data would be of little use, however, without quality tools with which to analyze it. Fortunately, the growth in available data is matched by the extensive development of automated analysis tools. For the beginner, this wide variety of data sources and tools can be overwhelming. In this essay, we survey the state of digital work in the philosophy of science, showing what kinds of questions can be answered and how one can go about answering them.

1. Introduction.
These technological changes are laying the foundation for new types of problems and solutions in the philosophy of science. The purpose of this article is to provide an overview and guide to some of the novel capabilities of digital philosophy of science. To best understand the reasons why digital philosophy of science lets us ask a new class of questions, let’s consider how it differs from more traditional approaches. For example, consider how we might draw conclusions about articles in the journal Nature. It has published over 360,000 articles since its founding in 1869, meaning that one would have to read ten articles a day for one hundred years to work through the complete archives of this journal alone. Of course, the standard response in the philosophy of science is to favor depth over breadth, and closely read a much smaller number of articles. While there is certainly much we can learn about science in this way, some broad questions about the nature and history of science—questions, for example, about how theories arise and become established in the literature as a whole—would remain unanswerable without a way to glean information from hundreds of thousands or even millions of journal articles. Much the same argument holds for scientific images, or information about the collaboration, communication, training, or citation connections between researchers. The question, then, is to what degree we can learn from the vast scientific literature without having to read every article closely—to instead do what is called distant reading (Moretti 2013). With distant reading, we input a large body of literature into a computer, and use it to do the “reading” for us, extracting large-scale patterns that would be invisible or impractical to find otherwise. In the philosophy of science in particular, this process has been aided by a This is a preprint of an article whose final and definitive form is published in Philosophy of Science. Please quote only the published version of the paper. 3 number of large digitization efforts targeted at the outputs of the scientific process. One crowning achievement of these efforts is the nearly complete digitization of the academic journal literature. This content is thus now accessible in ways that it has never been before. Digital approaches to the philosophy of science contrast with traditional methods involving close reading—intently reading a narrow body of literature within a focal area. With close reading, a philosopher will have an impressive command over a limited domain. He or she closely reads a select set of documents from the scientific literature, or analyzes the experimental, training, or collaborative records of a small group of researchers to attempt to extract the structure of a scientific theory, or to understand the meaning of its terms. We should stress that the close reading-based traditional philosophy of science and distant reading-based digital philosophy of science are not in competition. Instead, they are complementary. If, for example, a researcher wants to know how the meaning of a particular term has changed over time, he or she could use automated textual analysis tools to locate instances of the term, find hot spots in which the term is used frequently, quickly see which words it is associated with, and how these word associations have changed over time. In conjunction with digital analysis, performing close reading of key texts will be invaluable. The close reading may then spur further digital inquiries, and so on. 
Thus, traditional and digital philosophy of science work in tandem, each supporting the other. The remainder of this article will canvass a number of significant issues that must be dealt with in order to develop a digital philosophy of science research program. We hope that this overview will be helpful to researchers who are interested in moving forward with digital tools but are not certain where or how to begin.

2. Getting started.

Because digital philosophy of science is a relatively new field, not only is there no set of standard tools, it is often unclear what sorts of questions can be answered by the extant tools. Thus, let's begin by considering some of the new kinds of questions one can address.

One of the most significant advantages of distant reading comes from the ability to engage with corpora significantly larger than those usually treated by philosophers and historians of science. For example, Murdock, Allen, and DeDeo (2017) were able to analyze large-scale patterns in Darwin's reading by accessing the full text of every book that we know him to have read over a period of decades. These kinds of analyses simply would not be possible without the aid of technology. Answering research questions that leverage broad (yet still circumscribed; see section 4) sets of data is thus likely to be a fruitful use of digital tools. For example, one could track concepts over the entire print run of a journal, the collections of books published in the Biodiversity Heritage Library (Gwinn and Rinaldo 2009), or the PubMed Open Access Subset of contemporary biomedical journal articles (Roberts 2001). These kinds of investigations allow us to explore the conceptual landscape of a field through distant reading, by offering (at least in some cases) an exhaustive analysis of an area.

Another advantage comes from the ability of analytical algorithms to parse texts in ways that even well-trained close readers cannot. For example, fine-grained patterns of language usage, such as the shift in a term from a noun use to a verb use, or a shift from referring to science as a one-person activity to a group activity, could be traced in the literature with a level of exhaustiveness, objectivity, and care that would simply be impossible for a single reader. Automated tools can analyze sentence structure, word order, or parts-of-speech usage in a way that would try the patience of any scholar (Manning et al. 2014).
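To make this kind of automated parsing concrete, here is a minimal sketch of our own (not from the original article), using the open-source NLTK library as a stand-in for the Stanford CoreNLP toolkit cited above. Part-of-speech tags are exactly what one needs to trace, say, noun versus verb uses of a term:

```python
# A minimal sketch (ours, not the authors') of automated part-of-speech
# tagging with NLTK, an open-source alternative to Stanford CoreNLP.
# Requires: pip install nltk, plus the two model downloads below.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentence = "The species evolves quickly, and we measure that evolution."
tokens = nltk.word_tokenize(sentence)

# pos_tag yields pairs such as ('evolves', 'VBZ') and ('evolution', 'NN'),
# distinguishing verb from noun uses of the same root.
for word, tag in nltk.pos_tag(tokens):
    if word.lower().startswith("evol"):
        print(word, tag)
```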
The ability of digital tools to increase the breadth of a research question is also important. If one has a hypothesis drawn from a particular domain (maximization or optimality inferences in biology, for example), this hypothesis could be tested in other, separate domains (economics, psychology, sociology) with only a modest further investment of resources.

While digital tools can aid in answering existing research questions, these tools also open the possibility of framing new questions without a clear analogue in the pre-digital world. For instance, work by Manfred Laubichler and colleagues applies dynamic network analysis to our understanding of scientific conceptual development (Miller et al. 2015). The questions they ask arise in conjunction with the digital tools, and in dialogue with digital humanities researchers in other disciplines.

3. Choosing the right tools.

Now that we have a sense of the advantages of digital analysis, let's consider the currently available tools and corpora of data relevant to the philosophy of science. To begin, we should draw attention to the central repository of digital humanities tools, known as the DiRT Directory (for more on its construction and predecessors, see Dombrowski 2014). There are nearly as many digital humanities tools as there are digital humanities researchers, and the landscape of contemporary software changes rapidly. For nearly any kind of analysis, the directory will include some tool that performs it—the most important question will be whether the data available can efficiently be converted into the format required by that tool.

3.1. Basic tools.

There are a number of tools that may be used immediately by researchers, as they do not require that one collate a set of documents of interest in advance. Perhaps the most famous of these is the Google Ngram corpus (Brants and Franz 2006; Michel et al. 2011), accessible through the Google Books Ngram Viewer. This corpus contains the entirety of the scanned Google Books project, current as of 2012, with frequency data for single words as well as pairs and longer sequences (so-called bigrams, trigrams, and, more generally, n-grams). Obviously, the Ngrams project does not exclusively contain scientific or philosophical content, and hence a number of queries that might interest philosophers of science will simply not be meaningful when queried against the Ngram Viewer. For example, the scientific usage of the term "evolution" will be completely masked by the broader cultural use of the term, and hence philosophers interested in the use of this term are unlikely to be able to uncover interesting data. There are also a number of worries about the statistical representativeness of the Google Ngram corpus, even when judged as a measure of broader cultural usage or popularity (Morse-Gagné 2011; Pechenick, Danforth, and Dodds 2015).

Much more precise search and analysis may be performed by using JSTOR's Data for Research project (Burns et al. 2009). This tool allows users to perform searches and analyses against the entire corpus of JSTOR journals. Researchers may search for articles by journal, publication date, author, subject, and more, allowing for careful control over the set of articles to be analyzed. These articles may then be queried for word frequencies (and n-gram frequencies), as well as automatically extracted "key terms," which are words common in the selected articles but uncommon in the corpus as a whole (computed using the tf-idf score; a minimal illustration follows below). The frequency scores from JSTOR DFR may also be used as an input to a variety of the tools described below.
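The tf-idf scoring behind such "key terms" is easy to illustrate. The following toy calculation is our own sketch (JSTOR DFR's internal implementation may differ in detail); a word scores highly when it is frequent in the target document but rare across the rest of the corpus:

```python
# A toy tf-idf calculation (our illustration; DFR's internals may differ).
import math
from collections import Counter

corpus = {
    "doc1": "selection acts on heritable variation in a population".split(),
    "doc2": "the population of the town grew after the war".split(),
    "doc3": "variation in selection pressure shapes a population".split(),
}

def tf_idf(term, doc_id):
    words = corpus[doc_id]
    tf = Counter(words)[term] / len(words)           # term frequency
    n_docs_with_term = sum(term in ws for ws in corpus.values())
    idf = math.log(len(corpus) / n_docs_with_term)   # inverse document frequency
    return tf * idf

# "selection" scores higher for doc1 than the ubiquitous "population",
# so it would be extracted as a key term.
print(tf_idf("selection", "doc1"), tf_idf("population", "doc1"))
```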
3.2. Gathering a corpus.

The more advanced tools are set apart primarily by not coming with a pre-loaded corpus of material to study. This means that the challenge of obtaining data falls to individual researchers. As mentioned above, we find ourselves in a particularly fertile period for data availability in the philosophy of science. Much of the journal literature, in some cases back into the nineteenth century, is available online in PDF or HTML form. Comprehensive online projects are available that focus on the works, life, and correspondence of figures like Darwin (Secord 1974; van Wyhe 2002), Newton (Iliffe and Mandelbrote 1998), Poincaré (Walter, Nabonnand, and Rollet 2002), Einstein (Mendelsson 2003), and others (Pouyllau et al. 2005; Beccaloni 2008; Mills 2011). A number of discipline-specific archives have also been constructed, such as the Embryo Project Encyclopedia, an open-access digital repository covering the history of embryology and developmental biology (Maienschein et al. 2007). To this may be added the digital collections now increasingly available from a wide variety of museums and libraries. With an appropriate collection of data obtained for a researcher's private use, it becomes possible to leverage a much wider variety of analytical tools. (These data must also be carefully curated and safely preserved; we will return to the question of data management in the next section.)

A researcher gathering a corpus must consider how and to what extent the data should be annotated. Minimal annotation—for example, leaving content as plain text with only bibliographic data for tracking—allows for the rapid creation of a large corpus and lowers the future burden of maintaining and updating the annotations. But more significant annotation—such as marking up textual data in a format like that described by the Text Encoding Initiative (Ide and Véronis 1995)—allows for more complex, fine-grained, and accurate analyses. This annotation can take a variety of forms. For textual data, TEI allows users to indicate the locations of various parts of the document (pages, paragraphs, chapters, indexes, figures, or tables), or the various kinds of references made by a piece of text (dates, citations, abbreviations, names of persons or institutions, etc.).

This process of cross-referencing documents may be aided by the use of external ontologies—in the sense (not the one common in philosophy) of collections of standardized verbs and concepts that allow for the same term to refer unambiguously across multiple documents. In philosophy, the Indiana Philosophy Ontology project, or InPhO (Buckner, Niepert, and Allen 2007), allows standardized reference to concepts such as "sociobiology," or to particular philosophers. A number of such ontologies also appear in other areas of the sciences, and a document may be marked up with multiple ontologies to add further semantic richness. With a heavily annotated document, significantly more complex analysis may be applied, as the computer now "knows" where particular concepts are mentioned, how they are used, and how they relate to other ideas.
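To give a flavor of what such markup enables, here is a small sketch of our own: an invented TEI fragment (real TEI files are far richer) and a few lines of Python that pull out every tagged person name and date, the raw material for building, say, a correspondence network:

```python
# A minimal sketch of working with TEI-encoded text. The fragment below is
# invented for illustration; only the TEI namespace and element names
# (persName, date/@when) are standard.
import xml.etree.ElementTree as ET

tei = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p>
    <persName>Charles Darwin</persName> wrote to
    <persName>Asa Gray</persName> on <date when="1857-09-05">5 Sept. 1857</date>.
  </p></body></text>
</TEI>"""

ns = {"tei": "http://www.tei-c.org/ns/1.0"}
root = ET.fromstring(tei)

# Every person mentioned, wherever the tag occurs in the document.
people = [el.text for el in root.iter("{http://www.tei-c.org/ns/1.0}persName")]
dates = [el.get("when") for el in root.findall(".//tei:date", ns)]
print(people, dates)
```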
While the use of such methods is relatively untested in philosophy, the biomedical field has made significant strides in this direction in recent years—for example, analysis of the usage of gene and chemical concepts in the scientific literature has actually enabled the extraction of novel relationships (previously unpublished by researchers, but discernible from the body of literature as a whole), and even the generation of novel hypotheses about future drug development (A. M. Cohen and Hersh 2005).

The question of the representativeness of one's sample of data is also a significant one with which researchers must engage. As we noted above, even in the largest corpora, such as Google's Ngram collection, there are still problems with the statistical significance of the sample (Morse-Gagné 2011), with biases in the temporal availability of data (more data tends to be available closer to the present, as the relevant outputs were "born digital"; Michel et al. 2011), and with systematic sources of error such as that introduced by optical character recognition (Hoover 2012). These concerns are somewhat alleviated when using a curated corpus known to be complete (such as databases of historical correspondence), but even in these instances, researchers must remain constantly vigilant against statistical bias.

3.3. Advanced tools.

With a corpus in place, there is a variety of options for users interested in performing analyses impossible with the basic tools described above. First, there are a number of tools designed to aid researchers in presenting their material as an easily navigable, searchable, categorized public resource—a public digital archive or museum. The most popular of these is Omeka (D. Cohen 2008), a free, open-source software product that allows users to construct online archives and museum exhibitions, to add catalog information and metadata to digital items, and to attractively present all of this material to the public at large. Deploying a website such as this is a nice way to garner some immediate, public-facing payoff from the difficult work of obtaining and curating a digital collection.

One alluring feature of large digital data sets is the possibility of analyzing the networks found within them—whether these are networks of collaboration drawn from experimental archives or lab notebooks, networks of correspondence drawn from digitized letters, or citation networks extracted from the journal literature. Such network analysis can often allow us to see patterns in the overall structure of a field that would be otherwise difficult to discern. One of the most user-friendly network analysis tools available is Gephi (Bastian, Heymann, and Jacomy 2009). Gephi allows users to import graphs in a number of formats (including ones as simple as CSV spreadsheet data) and to perform a variety of analyses and visualizations. The network may be broken into clusters (using a standard measure known as modularity; Blondel et al. 2008), the degree of connectivity of individual nodes may be easily explored, and the results can then be rendered graphically for presentation.
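For readers who prefer a scriptable route to the same kind of analysis, here is a short sketch of ours using the Python networkx library on a toy collaboration network. Greedy modularity maximization stands in here for the Louvain method (Blondel et al. 2008) that Gephi implements; the names are illustrative only:

```python
# A toy network analysis in Python (our sketch; Gephi offers equivalent
# operations through a graphical interface).
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Edges might come from co-authorship, correspondence, or citations.
edges = [("Weldon", "Pearson"), ("Pearson", "Galton"), ("Weldon", "Galton"),
         ("Bateson", "Punnett"), ("Bateson", "Saunders"), ("Pearson", "Bateson")]
G = nx.Graph()
G.add_edges_from(edges)

# Cluster the network by modularity and inspect node connectivity.
communities = greedy_modularity_communities(G)
print("communities:", [sorted(c) for c in communities])
print("degrees:", dict(G.degree()))
```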
If the data to be analyzed is text, a popular choice is Voyant Tools (Sinclair and Rockwell 2016), available at http://voyant-tools.org/. Once a corpus of text is uploaded to Voyant, the user is immediately presented with a wide variety of options: a word cloud, a cross-corpus reader, a tool for tracking word trends through the text, and a short snippet concordance are among the immediately available tools, and a variety of other, more complex analyses and attractive visualizations may be performed using plugins. Voyant may also be used to save online corpora for future use, which facilitates classroom usage of textual analysis.

Another challenging problem likely to be faced by philosophers of science interested in the scientific literature is the analysis of a large number of journal articles, a kind of analysis not often performed in traditional digital humanities, which often focuses on book-length source material. To solve these problems, one of us has created a software package, RLetters (Pence 2016). (One public installation of this software, containing a corpus of journals in evolutionary biology, is described in Ramsey and Pence 2016.) This is a web application, backed by a search engine and database, which may be deployed by anyone wishing to analyze a corpus of academic journal articles. It includes a variety of analysis methods (sharing many of those described for Voyant), including an especially powerful word frequency analyzer.

Finally, should all of these tools fall short, the statistical computing language R (R Core Team 2017, available at https://www.R-project.org) has become a very popular base for constructing novel analyses in the digital humanities (Jockers 2014). R combines a comprehensive set of standard statistical analyses (such as principal component analysis and dendrogram or tree clustering) with an extensive collection of user-contributed packages that may be utilized to perform complex tasks such as querying Google Scholar or Web of Science. This power comes at the cost of significant complexity, however, as R operates like a programming language rather than a graphical application.
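As a small taste of this kind of analysis, the sketch below computes a principal component analysis of toy word-frequency vectors from scratch. It is rendered in Python for continuity with the earlier sketches; in R the same computation is a one-line call to prcomp(). The data are invented placeholders:

```python
# Principal component analysis "by hand" on toy word-frequency data
# (our sketch; in R, prcomp() does this directly).
import numpy as np

# Rows are documents, columns are relative frequencies of four words.
X = np.array([[0.12, 0.01, 0.05, 0.02],
              [0.11, 0.02, 0.04, 0.02],
              [0.01, 0.10, 0.01, 0.09],
              [0.02, 0.11, 0.02, 0.08]])

Xc = X - X.mean(axis=0)                          # center each column
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T                           # project onto first two PCs

# The two groups of similar documents separate on the first component.
print(scores)
```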
3.4. Copyright issues.

One of the most common pitfalls that users are likely to encounter when building corpora of digital data is copyright and licensing issues. While much material pertaining to figures like Newton or Darwin is available in the public domain, a confusing legal landscape besets all work created after 1923 (the date of "public domain" for published works in the United States). A number of recent court decisions (most significantly Authors Guild v. HathiTrust; Bayer 2012) have begun to clear the legal landscape in the United States, indicating that scholarly textual analysis and other sorts of digital-humanities work are likely to fall under the U.S. "fair use" provision. This, however, does nothing to simplify obtaining copyrighted materials, nor does it help scholars in other countries, many of which lack an analogue to fair use. It also may well be cold comfort to litigation-sensitive universities.

Increasingly, however, publishers are recognizing the demand for digital analyses of their materials. Elsevier has deployed a text and data mining policy that applies to all of their journals and will allow researchers to access and analyze articles as part of any institutional subscription (Elsevier 2014). Under the auspices of JSTOR's DFR project, researchers may request access to full-text articles if their university subscribes to the appropriate JSTOR collections. We also have had some degree of personal success negotiating access contracts for closed-access journal articles with their publishers, including with Nature Publishing Group, who were very receptive to the possibilities opened by digital analyses. We anticipate that this trend toward increased ease of access will only continue.

4. Data.

The academic process relies on the ability of other researchers to access, verify, and reproduce the results of analyses such as these. We will next consider how to publish and archive data, and how to make public the tools and techniques used to achieve the results.

4.1. Data management.

Philosophers are not, as a rule, accustomed to producing large amounts of data as part of our research. When using digital tools, we find ourselves faced with many of the same questions our scientific colleagues have dealt with for some time—how do we document, store, and preserve the data that our research generates? We cannot offer comprehensive answers to these questions here; we raise them only to emphasize that problems of metadata, documentation, and archiving have been discussed extensively in other contexts and should not be neglected. Early engagement with these resources will prevent significant problems from arising in the long term (York 2009; Michener 2015).

4.2. Reproducibility.

If digital analyses are to serve as elements of the permanent research record along with journal articles, then we must take care to make those analyses reproducible in the future. This is a multifaceted problem that has, in recent years, received significant attention from the scientific community (Munafò et al. 2017). For most digital philosophy projects, there are three key components to reproducibility.

First, software must be reproducible—that is, easily installed and run by those with the relevant technical expertise. To that end, the development and use of open-source software is laudable, as is using a readily accessible distribution platform such as GitHub.

Second, corpora must be reproducible. This can be a difficult challenge, particularly if one has negotiated access to a body of copyrighted materials for analysis. It is often possible to negotiate access not just for an individual researcher or research team, but also for any researchers accessing a public resource (Ramsey and Pence 2016 successfully negotiated such contracts for evoText). We encourage researchers to think very seriously about this challenge as they develop corpora.

Finally, the original forms of data must be—and remain—available. Open data repositories such as figshare (figshare Team 2012; Kraker et al. 2015) or Zenodo (CERN 2013) will accept raw data and make it citable. Researchers should also take care to upload data into these repositories in formats that are likely to remain readable indefinitely into the future, such as comma-separated value (CSV) format for spreadsheets, or plain Unicode text or XML for textual data.

5. Integrating digital results into philosophy of science.
The digital tools are powerful, and they have great potential for the philosophy of science. But digital results do not automatically translate into philosophical results. We therefore must consider how to integrate them with broader answers to philosophical questions.

5.1. Justifying digital results.

A recurring problem with digital humanities results consists in how we can be certain that we have obtained genuine information supporting the conclusions we hope to draw. We can in part resolve this by proceeding in an "hypothesis-first" manner—forming clear hypotheses prior to performing analyses. All datasets are apt to contain chance patterns, and we should not be led astray by basing our conclusions upon them. And when we formulate hypotheses, we should attempt to be open to a range of possible conclusions, since approaching a statistical analysis system with an answer to one's question already in mind tends to result in the cherry-picking of tools and methods to produce the desired result (Ioannidis 2005).

That said, it can be difficult, even having carefully formulated and tested an hypothesis, to be certain that one has in fact demonstrated it conclusively. Many analyses in the digital humanities lack statistical validation and have only a history of successful use as evidence in their favor (see, e.g., the discussion of validation in Koppel, Schler, and Argamon 2009). Others require collaboration between experts in philosophy and statistics, computer science, or even electrical engineering (Miller et al. 2015). An important step in developing a digital research program, therefore, is to consider how to assess whether a project has succeeded or failed. This may involve validating the methods, producing standard kinds of analysis outputs, or, as we now consider, using digital research methods only as a first step in a broader program of philosophical research.

5.2. Digital humanities as research generator.

Because digital tools give us significantly increased breadth and depth, we have found that they are useful not just as research tools in and of themselves, but as a compass, directing us toward questions that would be answered by traditional methods in philosophy of science. For example, Pence has recently combined existing work on an episode in the history of biology (Pence 2011) with digital tools (Ramsey and Pence 2016) to produce a more general hypothesis about debates over paradigm change, which is now ripe for a non-digital analysis (Pence in preparation).

We anticipate that this workflow will, in fact, be quite common. As a digital tool shows us a provocative but not fully theorized result, this can provide us with an excellent working hypothesis, case study, or set of sample data for developing a philosophical thesis.

6. Conclusion.

As scholars interested in studying the natural sciences, we cannot ignore the availability of digital data that might assist us in our research. It was once the case that the body of scientific literature was modest in size and represented only a narrow distillation of and reflection upon the world.
Now the literature has become so massive, complex, and diverse that it constitutes a world unto itself, one poised for scientific and philosophical analysis. Adding to this all of the digital traces of work not heretofore published—archival images, notebooks, and so on—we are confronted with an overwhelming, but incredibly rich, world of information. Philosophers are beginning to see how this information can bear on questions in the philosophy of science, and can inspire new ones. But the profusion of sources and formats of data, on top of the assortment of available tools, some of which require considerable technical savvy, provides a barrier to the philosopher. In this essay, we have attempted to provide a window into digital philosophy of science, with both an overview of what is possible and some guidance in seeking data and analysis tools. We are excited about the prospects for future work in this field, and hope that this article will help to spread our excitement.

References

Bastian, Mathieu, Sebastian Heymann, and Mathieu Jacomy. 2009. "Gephi: An Open Source Software for Exploring and Manipulating Networks." In Third International AAAI Conference on Weblogs and Social Media, 361–62. AAAI Publications.
Bayer, Harold, Jr. 2012. The Authors Guild, Inc., et al., v. HathiTrust, et al., 11 CV 6351 (HB). United States District Court, Southern District of New York.
Beccaloni, George. 2008. "The Alfred Russel Wallace Correspondence Project." http://wallaceletters.info.
Blondel, Vincent D., Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. 2008. "Fast Unfolding of Communities in Large Networks." Journal of Statistical Mechanics: Theory and Experiment 2008 (10): P10008. doi:10.1088/1742-5468/2008/10/P10008.
Brants, Thorsten, and Alex Franz. 2006. The Google Web 1T 5-Gram Corpus Version 1.1 (LDC2006T13). Philadelphia, PA: Linguistic Data Consortium.
Buckner, Cameron, Mathias Niepert, and Colin Allen. 2007. "InPhO: The Indiana Philosophy Ontology." APA Newsletter 7 (1): 26–28.
Burns, John, Alan Brenner, Keith Kiser, Michael Krot, Clare Llewellyn, and Ronald Snyder. 2009. "JSTOR - Data for Research." In Research and Advanced Technology for Digital Libraries, 416–19. Lecture Notes in Computer Science 5714. Berlin: Springer.
CERN. 2013. Zenodo. Geneva. https://zenodo.org/.
Cohen, Aaron M., and William R. Hersh. 2005. "A Survey of Current Work in Biomedical Text Mining." Briefings in Bioinformatics 6 (1): 57–71.
Cohen, Dan. 2008. "Introducing Omeka." http://hdl.handle.net/1920/6089.
Dombrowski, Quinn. 2014. "What Ever Happened to Project Bamboo?" Literary and Linguistic Computing 29 (3): 326–39. doi:10.1093/llc/fqu026.
Elsevier. 2014. "Text and Data Mining." https://www.elsevier.com/about/our-business/policies/text-and-data-mining.
figshare Team. 2012. Figshare. London. https://figshare.com/.
Gwinn, Nancy E., and Constance Rinaldo. 2009. "The Biodiversity Heritage Library: Sharing Biodiversity Literature with the World." IFLA Journal 35 (1): 25–34. doi:10.1177/0340035208102032.
Hoover, David L. 2012. "Textual Analysis." In Literary Studies in the Digital Age: An Evolving Anthology. Modern Language Association. http://dlsanthology.commons.mla.org/textual-analysis/.
Ide, Nancy, and Jean Véronis, eds. 1995. Text Encoding Initiative: Background and Context. Dordrecht: Kluwer.
Iliffe, Rob, and Scott Mandelbrote. 1998. "The Newton Project." http://www.newtonproject.sussex.ac.uk/.
Ioannidis, John P. A. 2005. "Why Most Published Research Findings Are False." PLoS Medicine 2 (8): e124. doi:10.1371/journal.pmed.0020124.
Jockers, Matthew. 2014. Text Analysis with R for Students of Literature. Cham, Switzerland: Springer.
Koppel, Moshe, Jonathan Schler, and Shlomo Argamon. 2009. "Computational Methods in Authorship Attribution." Journal of the American Society for Information Science and Technology 60 (1): 9–26. doi:10.1002/asi.20961.
Kraker, Peter, Elisabeth Lex, Juan Gorraiz, Christian Gumpenberger, and Isabella Peters. 2015. "Research Data Explored II: The Anatomy and Reception of Figshare." http://arxiv.org/abs/1503.01298.
Maienschein, Jane, Manfred D. Laubichler, Jessica Ranney, Kate MacCord, Steve Elliott, and Federica Turriziani Colonna. 2007. "The Embryo Project Encyclopedia." https://embryo.asu.edu.
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. "The Stanford CoreNLP Natural Language Processing Toolkit." In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 55–60. Baltimore, MD: Association for Computational Linguistics.
Mendelsson, Dalia. 2003. "Einstein Archives Online." http://www.alberteinstein.info.
Michel, Jean-Baptiste, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray, Joseph P. Pickett, Dale Hoiberg, et al. 2011. "Quantitative Analysis of Culture Using Millions of Digitized Books." Science 331 (6014): 176–82. doi:10.1126/science.1199644.
Michener, William K. 2015. "Ten Simple Rules for Creating a Good Data Management Plan." PLoS Computational Biology 11 (10): e1004525. doi:10.1371/journal.pcbi.1004525.
Miller, B. A., M. S. Beard, M. D. Laubichler, and N. T. Bliss. 2015. "Temporal and Multi-Source Fusion for Detection of Innovation in Collaboration Networks." In 2015 18th International Conference on Information Fusion (Fusion), 659–65.
Mills, Virginia. 2011. "The Joseph Dalton Hooker Project." http://www.sussex.ac.uk/cweh/research/josephhooker.
Moretti, Franco. 2013. Distant Reading. London: Verso.
Morse-Gagné, Elise E. 2011. "Culturomics: Statistical Traps Muddy the Data." Science 332: 35–36.
Munafò, Marcus R., Brian A. Nosek, Dorothy V. M. Bishop, Katherine S. Button, Christopher D. Chambers, Nathalie Percie du Sert, Uri Simonsohn, Eric-Jan Wagenmakers, Jennifer J. Ware, and John P. A. Ioannidis. 2017. "A Manifesto for Reproducible Science." Nature Human Behaviour 1: 21. doi:10.1038/s41562-016-0021.
Murdock, Jaimie, Colin Allen, and Simon DeDeo. 2017. "Exploration and Exploitation of Victorian Science in Darwin's Reading Notebooks." Cognition 159: 117–26. doi:10.1016/j.cognition.2016.11.012.
Pechenick, Eitan Adam, Christopher M. Danforth, and Peter Sheridan Dodds. 2015. "Characterizing the Google Books Corpus: Strong Limits to Inferences of Socio-Cultural and Linguistic Evolution." PLOS ONE 10 (10): e0137041. doi:10.1371/journal.pone.0137041.
Pence, Charles H. in preparation.
“How Not to Fight about Theory: The Debate between Biometry and Mendelism in Nature, 1890–1915.” In The Evolution of Science, edited by Andreas De Block and Grant Ramsey. ———. 2011. “‘Describing Our Whole Experience’: The Statistical Philosophies of W. F. R. Weldon and Karl Pearson.” Studies in History and Philosophy of Biological and Biomedical Sciences 42 (4): 475–485. doi:10.1016/j.shpsc.2011.07.011. ———. 2016. “RLetters: A Web-Based Application for Text Analysis of Journal Articles.” PLoS ONE 11 (1): e0146004. doi:10.1371/journal.pone.0146004. Pouyllau, Stephane, Christine Blondel, Marie-Helene Wronecki, Bertrand Wolff, and Delphine Usal. 2005. “Ampère et l’histoire de l’électricité.” http://www.ampere.cnrs.fr. This is a preprint of an article whose final and definitive form is published in Philosophy of Science. Please quote only the published version of the paper. 20 R Core Team. 2017. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org. Ramsey, Grant, and Charles H. Pence. 2016. “evoText: A New Tool for Analyzing the Biological Sciences.” Studies in History and Philosophy of Biological and Biomedical Sciences 57: 83–87. doi:10.1016/j.shpsc.2016.04.003. Roberts, Richard J. 2001. “PubMed Central: The GenBank of the Published Literature.” Proceedings of the National Academy of Sciences 98 (2): 381–82. doi:10.1073/pnas.98.2.381. Secord, James. 1974. “The Darwin Correspondence Project.” http://www.darwinproject.ac.uk. Sinclair, Stéfan, and Geoffrey Rockwell. 2016. Voyant Tools. http://voyant-tools.org/. Walter, S. A., Ph. Nabonnand, and L. Rollet. 2002. “Henri Poincaré papers.” http://henripoincarepapers.univ-nantes.fr. Whye, John van. 2002. “The Complete Work of Charles Darwin Online.” http://darwin- online.org.uk/. York, Jeremy. 2009. “This Library Never Forgets: Preservation, Cooperation, and the Making of HathiTrust Digital Library.” Archiving Conference 2009 (1): 5–10. work_2shljhr72nftlgqf5gozy53zfi ---- White Paper Report ID: 115657 Application Number: HD-228961-15 Project Director: Ted Sicker Institution: WGBH Educational Foundation Reporting Period: 6/1/2015-12/31/2015 Report Due: 3/31/2016 Date Submitted: 3/23/2016 WHITE PAPER NEH Grant #HD-228961-15 Digital Humanities for Lifelong Learners Project Director: Michael Mayo WGBH Educational Foundation March 2016 2 The WGBH Educational Foundation received a Level-I Start-Up Grant award from the National Endowment for the Humanities to research a cost-effective solution that allows public media organizations and other humanities libraries to deliver online, media-based experiences to seniors throughout the U.S., built around the materials in our collective archives.” Toward such ends, WGBH convened leading thinkers in the fields of lifelong learning and humanities education together with archivists and technologists in a series of in-person and virtual meetings, administered an online survey, and conducted additional research between July and December 2015. This report reviews project activities and summarizes major findings and recommendations. BACKGROUND American Experience. Columbus and the Age of Discovery. God in America. The Machine That Changed the World. Vietnam: A Television History. War and Peace in the Nuclear Age. These are just a few examples of the humanities-based television programs produced by WGBH over the last four decades, much of it originally created with the financial support of the NEH. 
Such premier programming can be of particular interest to seniors, and numerous studies have shown that lifelong learning enhances social inclusion, self-confidence, and active citizenship. Additionally, ten thousand Baby Boomers (people born between 1946 and 1964) have turned 65 every day since 2011 — a pace that will continue until 2029, making this a rapidly increasing demographic. Seniors are underserved by most educational outreach programs, but this population often has the time and enthusiasm to benefit from the intellectual stimulation and social engagement offered by media-based resources, and as a group, is increasingly comfortable with digital technology. The Digital Humanities For Lifelong Learners project was proposed to determine how best to use public media's archive of humanities programming to create a robust library of cross-disciplinary humanities modules for this eager audience of lifelong learners.

Some of the best programming produced by public television, however, is sitting unused on archive shelves. Perhaps even more problematic, at a time when the proliferation of digital technologies provides new ways for such humanities materials to be used, most of the rights to distribute or display the programs have either lapsed or were never cleared for new media usage. The cost of renewing third-party and/or performance rights is, for the most part, prohibitive.

WGBH has extensive experience in negotiating such rights clearances, and for the past 15 years has been successfully applying that knowledge to the development of a digital library featuring its archival assets specifically for educational purposes. Through an initiative originally called Teachers Domain and currently presented as PBS LearningMedia (http://www.pbslearningmedia.org/), public media assets from across the system have been reversioned, annotated, and organized to meet the specific needs of K-12 teachers and students. This service now includes over 100,000 educational resources and has a registered user base of 1.7 million, providing a potentially useful model for the proposed new initiative targeting lifelong learning by older populations. The NEH proposal was written in recognition of the need to generate additional research on such possibilities, however, as the interests and capacities of lifelong learners differ from those of K-12 teachers and students.

LANDSCAPE

An environmental scan reveals that the demand for lifelong learning among seniors is already high and increasing steadily as the Baby Boomers move into retirement. In Massachusetts alone there are more than 15 institutes offering seminars and workshops to this population, and this does not include growing numbers of programs at museums, cultural institutions, universities, and libraries. Nationwide, there are programs such as One Day University, which hosts events throughout the country featuring live lectures by university professors, and the well-known Osher Learning Institutes located at more than 100 college campuses across the U.S. The first formal "learning-in-retirement" program in the United States was launched at the New School for Social Research in 1962, with dozens of similar programs to follow, including the Fromm Institute, Road Scholar, and Elderhostels — all inspired by reports and articles on "the greying of America." More recently, some organizations have taken an additional step, providing on-campus housing and assisted care for seniors who wished to make a commitment to lifelong learning.
One example is Lasell Village, on the campus of Lasell College in Massachusetts. Senior residents can purchase a condo in the campus village and pay a service fee each month that covers assistance and education. Residents must commit to taking 450 hours of classes each year. At Lasell Village, where posh senior residences are located near student dormitories, seniors can pay $600,000 to $1 million for a condominium. In addition, they can pay from $3,000 to $5,000 per month for services, which range from house-cleaning and daily dining to the classes on campus.

Another source of lifelong learning opportunities is the exploding number of online offerings, including Massive Open Online Courses (MOOCs), which are available to students of any age. EdX, Udacity, and Coursera are the leading purveyors of such opportunities, presenting courses run by faculty at the world's most prestigious universities. Numerous alumni associations, such as Princeton's, offer online courses behind a pay wall, and other organizations such as Academic Earth and University of the Third Age also provide such digital learning opportunities for seniors. These are all well-established initiatives with paid staff and college professors who are accustomed to designing classes, and many are handsomely funded. The fairly recent rise of the open education movement has prompted an increasing amount of online educational programming, but these MOOCs and OpenCourseWare offerings (i.e., the online sharing of searchable college course content through programs such as Tufts OpenCourseware, NotreDameOpenCourseWare, OpenMichigan, and so on) and non-affiliated online education sites such as Khan Academy are based on formal college-level classwork and often are not tailored to the needs or interests of lifelong learners.
Professors, self/Pay wall (seniors live near campus) Fromm Institute Osher Lifelong Learning Institutes University, college campuses Lasell Village Professors/steep fees (seniors live on campus) Campus Continuum Professors/fees UBRCs (ubased retirement communities) Professors/fees Online Courses, Videos University of the Third Age Self, in England/free Pioneer Network (for caregivers) Self, webinars/free Alumni Studies Self, prof/pay wall Academic Earth Self/ free MOOC Prof, self/free Great Courses* Self/fee Leading Age Self/fee AARP Tek website Self-guiding/free MMlearn.org (for caregivers & seniors) Self/free Mather Institute on Aging (for caregivers) Self for caregivers/free Misc locations, in person Road Scholar/Elderhostel Trained staff/fee One Day University Guest lecturers/fee 5 Senior Centers Senior Planet, OATS Volunteers, staff/free National Institute of Senior Centers (NCOA) Volunteers, staff/free Café Plus (Mather Lifeways) Staff, self SeniorNet, computer classes Staff Oasis Connections, computer classes Staff Libraries Next Chapter, NY Public Library Librarians/free Senior Moments, Brooklyn Public Library The Free Library of Philadelphia Assisted Living Institutions Numerous, including Brookdale Activity Directors Senior Living Residences (11 in MA) Five Star Quality Care Home Home Health Care (HHC) (VNSYN, e.g.) Caregivers Community rooms of HUD senior housing Self, activity leaders DIGITAL HUMANITIES FOR LIFELONG LEARNING Conducted from July-December 2015, this planning grant from NEH featured extensive research, both live and virtual meetings with experts, and an online survey of potential users. In preparation for the proposal, WGBH met with representatives from the Boston Public Library, Hebrew Senior Life, and Osher Lifelong Learning Institute for an initial discussion about an archival initiative for lifelong learners. Excerpts of the 1983 series Vietnam: A Television History were shown along with a brief excerpt from a poetry series featuring Professor Lisa New of Harvard University. The goal of the meeting was to gauge interest in this initiative and to solicit reactions to a range of excerpted archival material. In-house staff then conducted an environmental scan, assembling information about existing lifelong learning programs and, specifically, media-based approaches to serving senior citizens through these programs. Phone calls were also made to multiple external consultants recruited for the project, to flesh out the data on existing efforts and assist in defining the focus for future activities. Three examples of presentation formats were assembled as “strawmen” for consideration by the full group of advisors to be convened in the Fall. 6 A full-day Launch Meeting was conducted at WGBH in early September (agenda attached as Appendix A). Invited participants included representatives of senior care facilities, lifelong learning organizations, public media archives, and academia, as well as highly interested potential consumers of the proposed resources (“enthusiasts”). Three other PBS stations were also represented, bringing the perspective of archivists more focused on local and/or state-based humanities content, complementing the national orientation of WGBH. A list of participants and their affiliations is attached as Appendix B). The meeting was designed to generate data on each of the three central challenges to be addressed in our research: • WHO: Who is our audience and what is the optimal target age range, where can they be reached? 
What are potential access issues (e.g., physical limitations, comfort with computers), and how do target users prefer their information to be packaged (e.g., short/long format, edited/unedited, curated collections, free- standing videos), delivered (e.g., with interactivity, contextualization), and consumed (e.g., self-study, facilitator-led, follow lectures, community interaction)? • WHAT: What content and expert networks do we have to work with and where are they located? What issues complicate use (e.g., rights or storage format) and how can they be accommodated? • HOW: How can we best use technology to reach our audience and ameliorate logistical issues? What delivery modes should be available (e.g., mobile)? What are the technical challenges/opportunities and how should they be addressed? Launch Meeting discussions informed the development of an online survey (see Appendix C) which we administered using Survey Monkey instrument. To solicit responses, we sent email to potential senior users of the proposed service as identified by meeting participants. 160 responses were recorded (see Appendix D), addressing questions regarding desired topics, formats, length, and a wide range of additional related subjects. FINDINGS Presented below are the project’s major findings, organized by the three central challenges, shorthanded as: Audience, Content, and Design/Delivery. Audience The current generation of seniors is aging with vigor. Eighty is the new 40! For decades, Americans have been told that the population is “greying.” One in every eight Americans is a senior, which is often defined as 65 and older. What’s new is the speed at which the population is greying. According to a recently released Census Bureau report cited in The New York Times, the number of Americans 65 and older is expected to nearly double by the middle of the century when they will make up more than a fifth of the nation’s population. That’s more than 2 in every 10 Americans, or 1.6 in every 8. By 2050, 83.7 million Americans will be 65 or older, compared with 43.1 million in 2012. 7 Another significant development is how seniors are aging. On average, seniors are living longer and healthier. According to a study by Harvard University and the National Bureau of Economic Research, “Evidence for Significant Compression of Morbidity in the Elderly U.S. Population,” people who were 65 between 1991 and 1993 averaged 17.5 more years of life, with 8.8 of those years being disability-free and 8.7 years being spent with some disabling health conditions. By the 2003-2005 period, average life expectancy only increased to 18.2 years beyond age 65, but the healthy-unhealthy split had shifted to 10.4 disability-free years and 7.8 disabled years, according to the study. While “lifelong learning” implies the full spectrum of ages beyond formal schooling years, and the term “senior citizen” can include everyone older than 55, our discussions concluded that the optimal target range for our proposed digital education program is between 65 and 80. In the main, these are individuals who are at least close to or in partial retirement (and thus with unoccupied time) but still with the energy as well as mental and physical capacities to engage both productively and enthusiastically with such a digital offering. In terms of accessibility issues, however, our advisors stressed the importance of addressing visual and hearing challenges, using sufficiently large typeface and bright colors, and captioning all video-based information. 
We also recognize that both older and younger audiences would be well served by such a project, offering curated access to public media archives, and that individuals beyond the core target range could (and likely will) exploit the availability of the services provided through such a project.

We found that the target population might embrace the proposed services in any number of locations, from their own homes to retirement/assisted living communities, libraries, senior centers, and lifelong learning programs, but that the greatest traction might be in settings where face-to-face interaction with peers is available. While multiple formats of presentation were advocated, curated collections received the most interest, and survey respondents indicated a preference for long-form videos (e.g., complete and/or unedited chapters of broadcast programs) over shorter segments reversioned specifically for this population. In the main, however, the general consensus was that different formats would suit different audiences, and that in developing this concept WGBH should consider multiple types of resources (e.g., free-standing videos as well as resources with wrap-around information to enhance context as well as understanding of specific media segments). Respondents were split between wanting this archival material for self-study and wanting it delivered as part of a group experience, perhaps facilitated by an expert and/or accompanied by a lecture on the underlying topic. Others advocated for including curated sets of materials within a formal course structure. Interestingly, we discovered that many existing educational programs for seniors, particularly those in lifelong learning centers, are led by experts who already have established "curricula," some of which include media segments, making these a less likely target for our work.

Representatives of other public broadcasting stations expressed a particular desire for templates to structure the presentation of video materials from their own archives, responding to the viability of the Interactive Lesson tool developed by WGBH for younger audiences. This platform is an innovative means for creating a customized sequence of screens containing media, text, and user-engagement activities in a seamless, visually attractive presentation on PBS LearningMedia. Making such tools available would help expand the scope, reach, and impact of the proposed resources for seniors, adding local and state-based programming to the mix of available content.

Content

Confirming preliminary research findings, the online survey revealed senior interest in a broad array of subject matter, with the highest concentration in programming tied to history and the arts, followed by health and science. Not surprisingly, "lifestyle" programs targeting such topics as travel and cooking also ranked relatively high. Documentaries scored as the most favored format. The archives of public media producers are replete with programming in all of these subject areas, including such national broadcast brands as American Experience, Frontline, Nature, and NOVA, as well as innumerable lifestyle programs and locally produced shows. Not all of this content is immediately accessible, however, as copyright issues are both legion and complicated, and the paucity of legal documentation for older content hinders the determination of ownership.
Media segments may also include materials owned by third parties or may present liability issues, leading to further complications in clearing rights for their use in the kind of education service now being considered for seniors. In addition, the focal content of these resources can change over time, a risk especially relevant in the sciences, resulting in the need for time-consuming and expensive updates on a regular basis.

Digital distribution potentially offers new ways to overcome these obstacles, however, as WGBH has demonstrated in the development of PBS LearningMedia. For example, we can identify shorter segments (with manageable rights clearances and costs) that can still convey critical information without compromising the quality of the presentation. In order to provide appropriate contextual storytelling and pedagogical cues, these video elements can be packaged in online modules, with text and graphics, adding new material, such as "wraparound" segments videotaped with scholars and/or footage from the original interviews, to create a compelling learning experience for the target audience.

Almost certainly a less complex strategy for overcoming the rights challenges, however, is to build the proposed digital education service onto archive systems already in place, such as WGBH's Open Vault (http://openvault.wgbh.org/), a program that has received funding from NEH (among others) to catalog digital materials and curate an online digital archive. And Open Vault is already part of the American Archive of Public Broadcasting (AAPB), an unprecedented initiative to preserve and make accessible significant historical content created by public broadcasting and to preserve at-risk public media before its content is lost to posterity. In 2013, the Corporation for Public Broadcasting selected WGBH and the Library of Congress as the permanent stewards of the AAPB collection. To date, approximately 40,000 hours, comprising 68,000 items of historic public television and radio content contributed by more than 100 public media stations and archives across the United States, have been digitized for long-term preservation. In October 2015, WGBH and the Library launched the AAPB Online Reading Room, providing online access to nearly 12,000 items of the digitized content for research, educational, and informational purposes. The entire collection of 40,000 hours is available for research viewing and listening at WGBH and the Library of Congress.

This extraordinary material includes national and local news and public affairs programs, local history productions that document the heritage of our varied regions and communities, and programs dealing with education, social issues, politics, environmental issues, music, art, literature, dance, poetry, religion, and even filmmaking on a local level. This archive also includes the full interviews from which segments were pulled for inclusion in broadcast programs, many of which can be easily accessed. The AAPB ensures that this valuable source of American social, cultural, and political history and creativity will be saved and made accessible for current and future generations, providing a potential anchor for the proposed new digital education service for seniors. There have been robust efforts to develop a set of AAPB rights protocols and permissions, reviewing legal and copyright questions to inform a comprehensive strategy that allows access to materials in accordance with third-party rights and fair use.
In addition to allowing unlimited access to materials on-site, WGBH has established an Online Reading Room ("ORR") where, just as in the reading room of a physical archive, visitors are able to access and view materials but not "check them out." The ORR makes materials accessible for educational, research, and not-for-profit purposes to anyone with internet access. The materials must be viewed within the environment of the AAPB website, however, and these resources cannot be downloaded. In addition, while continuing to gather information about rights and working with stations and independent producers to provide rights clearances, WGBH has made initial broad categorical decisions about fair use. The volume of content in the AAPB is so great that fully cataloging the materials and making detailed access determinations on an item-by-item basis would take decades given current staffing capabilities. With this in mind, the AAPB team decided to determine access to the content in the archive based on a review of material at the level of categories of content. AAPB can then transfer individual items in and out of the ORR based on the subsequent acquisition of more specific rights information.

Design/Delivery

Once the 'who' and the 'what' have been determined, the most critical questions, the 'how', come into play. Following are recommendations derived from the technologists participating in the project, addressing variables related to the digital technology and system architecture needed to best realize the mission. The long-term success of this project requires the definition of a "future-proof," flexible, open-source technical architecture that will remain applicable as technology progresses.

Fortunately, over the last few years, the industry has converged around a few dominant technology platforms. As a result, we can make judicious decisions about technically feasible, efficient, and operationally viable components for each part of the architecture, and hence for the architecture as a whole. Outlined below are four important considerations in building and implementing a robust platform to support lifelong learners:

• Defining the target user personas and their use cases;
• Selecting the appropriate delivery mechanism(s) for the customer base;
• Defining a suitable back-end platform for efficient and measurable management of the content and of the community; and
• Content production and workflow: outlining methods for getting the content into the system, also referred to as ingestion.

A visual representation of the recommended architecture decisions is appended to this report (Appendix E).

The target customers for lifelong learning are, by definition, older: they are no longer enrolled in a traditional (K-12 or university) education program, are not digital natives, and are certainly not smartphone natives. This target community can be divided into three segments:

• 'Digital immigrants,' who began their professional lives in the analog world and later embraced the digital world;
• 'Mixed-signal' users, who use digital technologies, albeit reluctantly; and
• 'Analogs,' who still do not use digital technologies and are very difficult to reach through them.

Segmenting the consumer base into these groups, and understanding how each group is likely to access the platform, allows us to make key decisions about the implementation of the platform that will maximize reach among desired customers. These decisions are outlined in the following sections.
The choice of content delivery mechanism is critical to the platform, as it most directly defines the consumer experience. At the same time, while it is important to provide a platform that is accessible to more than 90% of the target customer base, it is also beneficial to minimize the number of required variants (for example, across operating systems or screen sizes) in order to limit both up-front development effort and ongoing support and maintenance efforts. The following recommendations are designed to minimize the number of variants that need to be developed while maximizing reach and compatibility.

We recommend designing the platform to be compatible with smartphones and tablets first. Content designed for these devices is also accessible on desktop Macs/PCs, and mobile devices are increasingly used among Americans, including among the target demographic of this project (non-digital natives). The majority of Google web searches in the U.S. already originate from smartphones and tablets, and going forward content must be optimized for the mobile user experience.

Bandwidth: While bandwidth is an important consideration, it is safe to assume, for these purposes, that users will have access to sufficient bandwidth (in most cases over LTE cellular or WiFi).

Screen and Browser Resolutions: Regarding display sizes and visual real estate, the market has largely converged on three dominant screen sizes for smartphones (4.7 inches, 5.1 inches, and 5.7 inches) and three for tablets (8 inches, 10 inches, and, increasingly, 12 inches). For all these screen sizes, it is sufficient to design visual content for a 1080p resolution, which is widely supported and is sufficient for the human eye, even on a 12-inch screen.

Asset types and production methodologies: We recommend using vector graphics for illustrations, so that they can be scaled/resized with no loss of quality. Video and photographic image production and conversion standards will be established to ensure future compatibility and to meet SEO and accessibility requirements.

Experience Design and User Interaction: Designing for "scalability" and "mobile first" will force design solutions that work equally well in desktop browsers and on native iOS/Android devices.

This approach to delivery mechanisms, taking into account all four types of considerations, addresses the needs of at least 90% of the target audience, minimizing the need to develop and provide ongoing support for designs for other screen sizes, operating systems, and resolution constraints.

To properly implement content delivery, a back-end platform architecture must be chosen that facilitates easy site administration, production development, measurement and analysis, 100% uptime, and efficient platform maintenance. The back end is the lower part of the iceberg: the technical, "behind-the-scenes" foundation on which the user interface rests. A cloud-based content and community management platform is recommended; it greatly reduces the processing power requirements of the users' devices, allowing even devices with relatively low capabilities (namely, those with only enough processing power and memory for a single video stream) to access the platform.

The back-end platform must support two key elements: content and community. The broad definition of content includes not only the video or audio content itself, but also any production of assets and associated metadata (tags, rights, etc.) to increase search reach and value for the end user.
Specifically, the cloud-based back end should include support for at least four buckets of data:

• Core video content itself, with intuitive navigation controls and bandwidth optimization (e.g., a video lecture);
• Complementary and commentary content, which includes associated textual annotations or voiceover audio clips that enhance the core content (e.g., on-screen definitions or illustrations of key concepts introduced in the lecture);
• Metadata to characterize and categorize the content (e.g., tags to facilitate access through direct searches, rights management, playlists, and linked themes to encourage exploration); and
• Accessibility and personalization of the user experience, so that users can configure their viewing settings as needed (e.g., allowing users to adjust resolution based on their internet quality).

The platform should also enable three key roles for community management:

• Content Contributors contribute the (raw) core video content, and typically are public media organizations (e.g., uploading the content directly from archives);
• Content Creators contribute the complementary content, such as video or text, to complete the video offering (e.g., proposing suggested readings to enhance the learning experience); and
• Content Curators build the content community by curating, editing, organizing, and linking metadata to content entries (e.g., associating different videos with each other to help users navigate across related lessons).

The software development community has converged on a set of well-supported open source platforms to implement web, content, and community platforms. The following combination of well-established and well-supported distributed open source platforms is recommended. Common examples include:

• Django, a web development framework: http://www.djangoproject.com/
• Drupal, a content management framework: http://www.drupal.org/
• Moodle, a learning environment platform and course management system: http://www.moodle.org/
• ffmpeg, a platform-agnostic solution for recording, converting, and streaming multimedia: http://www.ffmpeg.org/

Using open source tools enables the efficient creation of a cloud-based content management and community management platform that is flexible, yet robust.

Content Production and Ingestion. The process of uploading content into the system requires defined workflow protocol standards for successful adoption and production efficiency. The content for the Lifelong Learning platform is expected to come from public media and/or from other humanities libraries that have video or audio content. Much of this content may need to be digitized; it is important to digitize content in a format that is lossless and widely supported, such that it is digitized once for all platforms: for example, lossless JPEG 2000 for video content and AAC for audio content. Critical to getting any platform off the ground is ingesting a substantial initial batch of content. It is recommended that we:

• Help encourage contributors by minimizing the resources required from them to get their existing content digitized and uploaded;
• Use a process that decouples the encoding of the video stream from the process of uploading to the cloud, to increase efficiency; and
• Rely on dedicated machines for optimal processing speed; if these are not available, use a compute farm that harnesses idle processing (CPU/GPU) time on existing machines within a defined network.

Finally, work with contributors to progressively digitize content from the archives using open source software. Store the native digital instances, replicated for redundancy, in low-cost cloud storage. Once content is saved in the cloud, very low-cost cloud computing solutions exist that can algorithmically extract metadata from the content and parse the content into modules, summaries, and excerpts.
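To make the recommended decoupled ingestion workflow more concrete, the following minimal Python sketch separates local encoding from cloud upload. It is illustrative only: the directory layout, the upload_to_cloud stub, and the minimal metadata record are our assumptions rather than part of the report, and the exact ffmpeg codec flags will depend on the local ffmpeg build (the codec names follow the report's JPEG 2000 / AAC recommendation).

import subprocess
from pathlib import Path

ARCHIVE_DIR = Path("archive_masters")   # hypothetical source directory
ENCODED_DIR = Path("encoded")           # local staging area

def encode(src: Path) -> Path:
    """Step 1 of 2: transcode one master file to the preservation format."""
    dst = ENCODED_DIR / (src.stem + ".mkv")
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(src),
         "-c:v", "jpeg2000",   # video codec per report recommendation
         "-c:a", "aac",        # audio codec per report recommendation
         str(dst)],
        check=True,
    )
    return dst

def upload_to_cloud(encoded: Path, metadata: dict) -> None:
    """Step 2 of 2: upload the staged file plus its metadata record.
    A stub: in practice this would call the chosen cloud provider's SDK."""
    print(f"uploading {encoded.name} with metadata {metadata}")

if __name__ == "__main__":
    ENCODED_DIR.mkdir(exist_ok=True)
    for master in sorted(ARCHIVE_DIR.glob("*.mov")):
        staged = encode(master)            # can run on a dedicated machine
        record = {"title": master.stem,    # minimal metadata 'bucket'
                  "rights": "pending review",
                  "tags": []}
        upload_to_cloud(staged, record)    # can run later / elsewhere

Because encode() writes to a local staging directory and upload_to_cloud() reads from it, the two steps can be scheduled independently (e.g., encoding overnight on a dedicated machine, uploading when bandwidth is cheap), which is the decoupling the report recommends.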
CONCLUSIONS AND NEXT STEPS

Given the diversity of need and complexity of variables, WGBH has concluded that there is no single 'right' approach to the challenge of getting public media archival resources into the hands and heads of senior citizens. Rather, we advocate for the development of a range of strategies for addressing the 'who,' 'what,' and 'how' questions targeted in the planning grant research conducted to date. In sum, we recommend three (3) types of development/dissemination activity:

1) a resource base of free-standing resources, a curated library of media segments accessible to anyone but with particular relevance for service providers who seek content for existing programs;
2) packaged activities that feature those media segments with wrap-around contextualizing information, for use as free-standing modules by individuals and/or through libraries and other public settings; and
3) highly produced modules that are sufficiently comprehensive to serve as mini-courses on designated topics for home use or self-study.

We have also concluded that the most viable means to address rights-related issues is to concentrate on media assets already included in the American Archive of Public Broadcasting (AAPB), at least in initial development efforts. This content features full broadcast programs (which often have fewer rights restrictions than segmented programs), and much of it is already curated and available for ready access through the Online Reading Room. The AAPB site may also be the destination through which seniors browsing the Web access these lifelong learning offerings. The ORR currently includes the following four exhibits featuring items of topical and historical significance, and additional exhibits on presidential campaigns, children's educational programming, and women's issues are currently in development:

• Documenting and Celebrating Public Broadcasting Station Histories
• Voices from the Southern Civil Rights Movement
• Climate Change Conversations: Cause, Impacts, Solutions
• Voices of Democracy: Public Media and Presidential Elections (http://americanarchive.org/exhibits/presidential-elections)

To maximize breadth, reach, and access, we recommend that other PBS stations be actively involved in both development and dissemination activities, both to include the widest array of local as well as national programming and to enhance widespread buy-in to the proposed service. Toward such ends, our research has concluded that the production and distribution of templates to facilitate the packaging of local content will increase the likelihood of station involvement as well as the quality of the resulting products. The Interactive Lesson Platform (ILP) recently developed by WGBH holds significant promise for these purposes. This tool provides a means to sequence media resources and related material within a learning module.
Built using open source technology, the ILP features a content management system that allows producers to create and preview content in a range of design and layout templates, upload media, and embed and author an array of tools for user engagement, such as puzzles, quizzes, note-taking, and commenting, all without having to know Web coding. We have also concluded that while formal lifelong learning programs (like Osher) might benefit from the availability of media-based resources of this type, libraries, senior centers, and retirement/assisted living communities might be better targets for this type of service, offering not only eager and readily accessible participants, but also national networks of prospectively interested partner organizations, many of which are already digitally connected.

Next Steps

Clearly, additional research and rights assessments are needed to clarify need, expand answers to the who, what, and how questions, and illuminate the most viable solutions to this challenge, but sufficient information has been gathered through this planning effort to justify the pursuit of support to continue exploration of possibilities. Next steps will feature such continued research as well as the production and distribution of a set of prototypes to establish proof of concept. These prototypes will include both stand-alone and packaged resources of varying comprehensiveness, each pilot-tested to assess appeal and use. Toward such ends we propose to focus especially on librarians and activity directors in retirement/assisted living communities and senior centers as both sources of additional information and potential users of the prototypes developed. Alliances with existing networks should also be explored, including national organizations like the American Library Association, AARP, and Senior Planet as well as local and regional providers of lifelong learning services for seniors. Although WGBH will explore various means to support the continued development of this concept, we currently plan to apply to the Digital Projects for the Public competition through the National Endowment for the Humanities in Summer/Fall 2016 and will also consider approaching the National Archives for the support of further research and development.

Appendices
A. Launch Meeting Agenda
B. Launch Meeting Participants
C. Online Survey, Summary of Results
D. Online Survey, Questions and Full Results
E. Architectural Variables

DIGITAL HUMANITIES FOR LIFELONG LEARNING
NEH/LLL Launch Meeting
Agenda, 9/8/15

Pre-Meeting with PBS Representatives. General discussion to surface and review issues specifically related to the use of PBS archives for the core content of possible LLL modules (topics, availability, rights, etc.): Louisiana Public Broadcasting, Blue Ridge PBS, and Arkansas Educational Television Network.

11:00  Welcome. General overview of the project, review of the purposes/agenda for the day, and discussion of the mission statement: To research a cost-effective solution that allows public media organizations and other humanities libraries to deliver online, media-based experiences to seniors throughout the U.S., built around the materials in our collective archives.

12:00  Lunch/Presentations. Three (3) sample resource modules presented to fuel afternoon discussions: Interactive Lesson (Gulf of Tonkin), Poetry in America, and Invitation to World Literature (Gilgamesh).

1:00  Content Challenges. Discussion responding to the following question(s):
• What media materials and expert networks do we have to work with? What logistical issues will affect our work?

2:00  Audience Challenges. Discussion responding to the following question:

• Who is our audience, where do their interests and passions in the Humanities lie, and how do they like their information packaged and delivered?

3:00  Design Challenges. Discussion responding to the following question:

• How can we best use technology to reach our audience and ameliorate logistical issues?

4:00  Next Steps. Identification of topics for additional research and respondents to potential survey(s), e.g., lifelong learners, service providers.

4:30  Adjourn

Digital Humanities for Lifelong Learning Launch Meeting, September 8, 2015
Participants

Julia Anderson, Digital Marketing Specialist, Education, WGBH
Avi Bernstein-Nahar, Director, Osher Lifelong Learning Institute, Brandeis University
Leslie Bourgeois, Archivist, Louisiana Public Television
Karen Cariani, Director, Media Library and Archives, WGBH
Kristi Chadwick, Advisor for Small Libraries, Massachusetts Library System
Steve Cohen, Adjunct Lecturer, Tufts University, Department of History
Amy Crownover, Project Coordinator, Arkansas Educational Television Network
Michael Davies, Senior Lecturer, Engineering Systems Division, MIT
Casey Davis, Media Library and Archives, WGBH
Carol Jennings, Production, Blue Ridge PBS, Virginia
Evie Kintzer, Executive Director, Strategy and Business Development, WGBH
Joanne LaPlante, Center Communities of Brookline, Hebrew Senior Life
Thomas Lerra, Research & Development Prototype Manager, WGBH
Kali Lightfoot, Director, National Resource Center, Osher Lifelong Learning Institute, University of Southern Maine
Mike Mayo, Director, Research and Development, Education, WGBH
Elisa New, Professor of English, Harvard University
Wichian Rojanawon, Director, Osher Lifelong Learning Institute, University of Massachusetts, Boston
Roberta Sheehan, Lifelong learning enthusiast
Ted Sicker, Executive Producer, Education, WGBH
Marian Weissman, Lifelong learning enthusiast

Digital Humanities for Lifelong Learners
Survey Results, Summary

Most popular topics:
• History 72%
• Fine Arts 65%
• Travel 59%
• Earth and Environment 58%
• Health 58%
• Science 54%
• Cooking 50%
• Music 50%

Content formats:
• Documentaries 90%
• Drama 60%
• Interviews 56%
• News 50%
• News (Arts & Culture) 50%

Content length:
• Full programs 71%
• Chapters (10-15 min.) 18%

Accessing content:
• Independently at home 80%
• View in a group setting 15%

Discussion online or face to face:
• Neither 56%
• Face to face 22%
• Online 21%

Viewing video content:
• Laptop 66%
• PC/desktop 53%
• iPad 42%
• iPhone 38%

Time spent viewing video content:
• 1-2 hours/day 32%
• 2-4 hours/week 32%
• 34 respondents chose "other" because they watch video content less than 1-2 hours/day.

Social media platforms:
• Facebook 74%
• Google+ 37%
• LinkedIn 29%
• Pinterest 23%

Sharing content:
• Email 56%
• Word of mouth 30%

Adding commentary:
• Maybe 65%
• Likely 18%
• Never 16%

Accessibility:
• Closed captions 53%
• Earphones 46%

Age range:
• 66-75: 51%
• 56-65: 23%

Respondents' location: The overwhelming majority are from MA.

Bonus question:
• 26 responded to "What kinds of questions might interest you?"
• 28 responded to "How might you want to make use of a module like this?"
• See the attached spreadsheet for comments.
Q1 Which topics are you interested in learning more about? (Check as many as you'd like.) Answered: 150, Skipped: 10.
• Agriculture 18.00% (27)
• Animals 38.00% (57)
• Children's programming 10.67% (16)
• Consumer affairs and advocacy 32.67% (49)
• Cooking 50.00% (75)
• Crafts 27.33% (41)
• Dance 22.67% (34)
• Economics 30.00% (45)
• Education 40.00% (60)
• Energy 28.67% (43)
• Earth and Environment 58.67% (88)
• Fine Arts 65.33% (98)
• Gardening 48.00% (72)
• Government 27.33% (41)
• Global Affairs 47.33% (71)
• Health 58.67% (88)
• History 72.00% (108)
• Humor 39.33% (59)
• LGBTQ 8.00% (12)
• Literature 47.33% (71)
• Local Communities 21.33% (32)
• Music 50.67% (76)
• Nature 46.00% (69)
• Parenting 8.00% (12)
• Philosophy 27.33% (41)
• Politics 32.67% (49)
• Public Affairs 32.67% (49)
• Race and Ethnicity 18.67% (28)
• Religion 24.00% (36)
• Science 54.00% (81)
• Poverty 21.33% (32)
• Crime 19.33% (29)
• Immigration 29.33% (44)
• Classism 14.67% (22)
• Gun Control 24.67% (37)
• Obesity 14.67% (22)
• Hunger 16.00% (24)
• Social Movements 34.67% (52)
• Activism 18.67% (28)
• Sports 13.33% (20)
• Technology 32.00% (48)
• Theater and Acting 35.33% (53)
• Travel 59.33% (89)
• War 22.00% (33)
• Women 36.00% (54)
Total respondents: 150.

Q2 Which content formats are you interested in seeing? (Please choose all that apply.) Answered: 150, Skipped: 10.
• Audio and podcasts 38.67% (58)
• Call-in shows (radio) 18.00% (27)
• Debates 18.00% (27)
• Documentaries 90.00% (135)
• Drama 60.67% (91)
• Event Coverage 45.33% (68)
• How-to shows 45.33% (68)
• Interviews 54.67% (82)
• News 50.00% (75)
• Magazine (Arts and Culture) 50.00% (75)
• Magazine (News) 30.67% (46)
• Talk Show 24.00% (36)
Total respondents: 150.

Q3 Which content length interests you the most? (Please choose up to two responses.) Answered: 148, Skipped: 12.
• Full programs (30-60 minutes) 71.62% (106)
• Chapters of programs (10-15 minutes) 18.92% (28)
• Excerpts of programs (5-10 minutes) 6.08% (9)
• Short-form content (1-5 minutes) 3.38% (5)
Total: 148.

Q4 How would you like to access this content: independently from home or a library, or a periodic meet-up with a group? (Please choose one response.)
Answered: 147, Skipped: 13.
• Independently at home 80.95% (119)
• Independently at a library 2.04% (3)
• View in a group setting 15.65% (23)
• Other (please specify) 1.36% (2)
Total: 147.

Q5 After viewing the content would you want to take part in an online discussion or chat, or would you prefer to do this face-to-face? (Please choose one response.) Answered: 145, Skipped: 15.
• Online or video discussion 21.38% (31)
• Face-to-face 22.07% (32)
• None of the above 56.55% (82)
Total: 145.

Q6 Are you comfortable using technology? If yes, how do you currently view video content? (Check all that apply.) Answered: 145, Skipped: 15.
• PC/desktop computer 53.79% (78)
• Laptop computer 66.90% (97)
• iPad 42.76% (62)
• Other tablet 14.48% (21)
• iPhone 38.62% (56)
• Android 14.48% (21)
• Other mobile device 1.38% (2)
Total respondents: 145.

Q7 Approximately how much time do you spend viewing video content on your preferred device? Answered: 144, Skipped: 16.
• 1-2 hours/day 31.94% (46)
• 2-4 hours/week 32.64% (47)
• 4-8+ hours/week 11.81% (17)
• Other (please specify) 23.61% (34)
Total: 144.

Q8 What social platforms do you use, if any? (Please check all that apply.) Answered: 110, Skipped: 50.
• Facebook 74.55% (82)
• Twitter 18.18% (20)
• Google+ 37.27% (41)
• LinkedIn 29.09% (32)
• Pinterest 23.64% (26)
• Instagram 14.55% (16)
• Snapchat 0.91% (1)
Total respondents: 110.

Q9 Would you use a social platform to share content you like, or would you share via email or word of mouth? (Please choose one response.) Answered: 142, Skipped: 18.
• Social platform 12.68% (18)
• Email 56.34% (80)
• Word of mouth 30.99% (44)
Total: 142.

Q10 How likely would you be to add commentary to content if given an easy method to do so? Answered: 149, Skipped: 11.
• Likely 18.12% (27)
• Maybe 65.77% (98)
• Never 16.11% (24)
Total: 149.

Q11 What kind of accessibility features do you use, if any? (Please check all that apply.) Answered: 47, Skipped: 113.
• Closed captions 53.19% (25)
• Earphones 46.81% (22)
• Braille 2.13% (1)
• Haptic device 2.13% (1)
• Other (please specify) 12.77% (6)
Total respondents: 47.
Q12 BONUS QUESTIONS! If you have time, we created a short prototype that you can view here: http://goo.gl/qqdkvP After you have viewed it, please let us know: Answered: 33, Skipped: 127.
• What kinds of questions might interest you? 78.79% (26)
• How might you want to make use of a module like this? 84.85% (28)

Q13 Helpful information: Answered: 139, Skipped: 21.
• City/Town 91.37% (127)
• State/Province 92.09% (128)
• ZIP/Postal Code 92.81% (129)
• Name, Company, Address, Address 2, Country, Phone Number: 0.00% (0)

Q14 More helpful information! What is your age range? Answered: 154, Skipped: 6.
• 25-45: 5.19% (8)
• 46-55: 7.14% (11)
• 56-65: 23.38% (36)
• 66-75: 51.95% (80)
• 76-85: 11.04% (17)
• 86-100: 1.30% (2)

Q15 Please let us know any additional thoughts you have about public media video resources for lifelong learning. Answered: 24, Skipped: 136.

Q16 Optional! If you would like to receive the results of this survey, please let us know your name and email address. Answered: 45, Skipped: 115.
• Name 95.56% (43)
• Email 97.78% (44)

[Appendix E, Architectural Variables, is a diagram in the original report. It maps user segments (Smartphone Natives, Digital Natives, Digital Immigrants, 'Mixed-Signal' users, Analogs) and screen sizes (~4.7", ~5.1", ~5.7", ~8", ~10", ~12", at 1080p resolution) onto a distributed, cloud-based, open-source content and community management system (Django, Drupal, Moodle, ffmpeg) supporting 'modularisation', excerpting (*index to digital version), rights management, and automated algorithmic accessibility. It also shows the Contributor (contributes video content), Creator (creates complementary content), and Curator (curates, edits, organizes content) roles, core video and complementary content, style templates, metadata (tags, rights), community, and the digitization of archives by public media organisations and other humanities libraries.]

work_2tcncpfqmfcrrcx33e6z4msogi ----

Looking for Textual Evidence: Digital Humanities, Middling-Class Morality, and the Eighteenth-Century English Novel
Ralf Schneider, Marcus Hartner, Anne Lappert

1. Introduction

In our contribution to this edited volume we present a discussion of an attempt to identify and locate literary manifestations of the idea of the "virtuous social middle"1 in a large corpus of eighteenth-century English novels with the help of methods and tools from Digital Humanities (DH). This attempt was situated within the larger context of a research project on comparative practices in the eighteenth-century novel as part of the Collaborative Research Center (CRC) 1288 "Practices of Comparing" funded by the German Research Foundation (DFG). Our project started from three assumptions. The first was the traditional assumption held in literary history about the close connection between socio-historical developments and the "rise of the novel"2 from "the status of a parvenu in the literary genres to a place of dominance" during the eighteenth century.3 Second, we assumed that the cultural construction of "the middle order of mankind"4 and its concomitant claims about a supposedly heightened sense of 'middle-class' morality

1 Wahrman, Dror, Imagining the Middle Class: The Political Representation of Class in Britain, c.
1780-1840, Cambridge: Cambridge University Press, 1995, 64.
2 Watt, Ian, The Rise of the Novel: Studies in Defoe, Richardson, and Fielding, Berkeley: University of California Press, 1957.
3 Rogers, Pat, Social Structure, Class, and Gender, 1660-1770, in: J. A. Downie (ed.), The Oxford Handbook of the Eighteenth-Century Novel, Oxford: Oxford University Press, 2016, 39.
4 Goldsmith, Oliver, The Vicar of Wakefield, Oxford: Oxford University Press, 2006 [1766], 87.

was accompanied by a range of social processes of comparing.5 After all, constructions of social identity tend to rely heavily on processes of othering, and comparing plays a vital role in the construction of self and other. Third, we assumed that in the emerging medium of the novel in the period under investigation, concepts of middle-class social identity were negotiated through particular literary strategies of comparing, whose textual manifestations can be found specifically in textual representation of characters and character constellations. Ultimately, the underlying value system concerning class identity in a novel ought to manifest itself also in the way that the behavior or dispositions of characters are described and evaluated in comparison as either desirable and adequate, or as despicable and inappropriate. As part of our strategy of substantiating those three assumptions, our project aimed at providing a more extensive review of the textual representations of social virtues and vices in the eighteenth-century English novel than available in traditional scholarly accounts of the topic so far.

In order to achieve this aim, we decided to turn to the methods of DH. We planned to identify, with the help of different types of word searches (see below), recurrent expressions that refer to social behavior in either positive or negative terms. We expected a diachronic development to be visible across the corpus, e.g., similar to the way concepts of gentility changed their semantics during the period under consideration.6 None of our expectations were met, however, as we will demonstrate below. This prompted a reconsideration of our search strategies and ultimately led to the insight that practices of comparing and social-identity construction may be more implicit in

5 In the following, we will employ the term 'middle-class' as a synonymous stylistic variation to expressions such as 'middle order', 'middle rank', the 'middling sorts', etc. We are aware that the application of the terminology of class to discussions of eighteenth-century society is contested and comes with certain conceptual problems. For introductions to the term and concept of class in early modern Britain, see Corfield, Penelope J., Class by Name and Number in Eighteenth-Century England, in: History 72 (1987) and Cannadine, David, Class in Britain, London: Penguin, 2000, 27, 31.
6 The concept of the gentleman, for example, changed from the narrow denotation of a man of noble birth to the more widely applicable notion of a man displaying a set of 'genteel' (moral) qualities and behaviours. During this "social peregrination" of the term, it lost "its oldest connotations of 'gentle' birth and 'idle' living, so that, in the later eighteenth century, individual vintners, tanners, scavengers, potters, theatre managers, and professors of Divinity could all claim the status, publicly and without irony" (P. J. Corfield, Class by Name and Number, 41).
literature than in other discourses and function in different ways. In what follows, we will first sketch the socio-cultural context of our corpus, in which the novels contribute to the negotiation of middle-class morality. We will then briefly engage with the question of the applicability of DH methods in the analysis and interpretation of literature, before we document some of our text searches and discuss the results.

2. Inventing the superiority of the middling classes

On the opening pages of Daniel Defoe's The Life and Adventures of Robinson Crusoe (1719) the title character's elderly father lectures the youthful protagonist on his place in the social fabric of eighteenth-century Britain. In his attempt to dissuade the restless and adventure-seeking Robinson from "[going] abroad upon Adventures", he emphasizes his son's birth into "the middle State" of society.7 This he declares to be "the best State in the World, the most suited to human Happiness" as it is neither "exposed to the Miseries and Hardships, the Labour and Sufferings of the mechanick Part of Mankind", nor is it "embarrass'd with the Pride, Luxury, Ambition and Envy of the Upper Part of Mankind".8 While those remonstrations unsurprisingly fail to convince the young Robinson Crusoe, they articulate a sentiment of 'middle-class' complacency found with increasing frequency in literary and philosophical writings over the course of the eighteenth century. Defoe's fictional character constitutes only one voice in an increasingly audible choir within the cultural discourse of the period that promotes the idea of the 'middle order' as possessing a distinct and superior quality. Though this idea was neither new nor universally acknowledged,9 it became increasingly

7 Defoe, Daniel, Robinson Crusoe, ed. Michael Shinagel, New York: Norton, 1994 [1719], 5.
8 Ibid.
9 On competing models of the social structure of the period, such as the notion of a bipolar "crowd-gentry reciprocity" (Thompson, E. P., Customs in Common: Studies in Traditional Popular Culture, New York: New Press, 1993, 71) and the persistent traditional belief in a providentially ordained, universal and hierarchical order of social layers (e.g. Tillyard, E. M. W., The Elizabethan World Picture: A Study of the Idea of Order in the Age of Shakespeare, London: Chatto & Windus, 1967 [1942]), see the discussion in D. Cannadine, Class in Britain, 24-56. See also French, who argues that the aristocracy and gentry retain their dominant economic and political power in Britain throughout the eighteenth century and beyond (French, Henry, Gentlemen: Remaking the English Ruling Class, in: Keith Wrightson (ed.), A Social History of England: 1500-1750, Cambridge: Cambridge University Press, 2017, 269, 280). See also Muldrew, Craig, The 'Middling Sort': An Emergent Cultural Identity, in: Keith Wrightson (ed.), A Social History of England: 1500-1750, Cambridge: Cambridge University Press, 2017 on the emergence of the "Middling Sort" as a cultural identity during the early modern period.

attractive to those who saw themselves as belonging to this particular segment of society.10 Building on the notion of a "virtuous social middle"11 initially developed in Aristotle's Politics,12 they actively engaged in the discursive construction of the middle order as a distinct social group not only by discussing its political and economic importance for the nation,13 but also by emphatically emphasizing its moral value.14 David Hume, for example, thought that the upper classes were too immersed in the pursuit of pleasure to heed the voices of reason and morality, while "the Poor" found themselves entirely caught up in the daily struggle for survival.15 As a result, in his view, only the "middle Station" affords

"[…] the fullest Security for Virtue; and I may also add, that it gives Opportunity for the most ample Exercise of it […].
Those who are plac'd among the lower Rank of Men, have little Opportunity of exerting any other Virtue, besides those of Patience, Resignation, Industry and Integrity. Those who are advanc'd into the higher Stations, have full Employment for their Generosity, Humanity, Affability and Charity. When a Man lyes betwixt these two Extremes, he can exert the former Virtues towards his Superiors, and the latter towards his Inferiors. Every moral Quality, which the human Soul is susceptible of, may have its Turn, and be called up to Action: And a Man may, after this Manner, be much more certain of his Progress in Virtue, than where his good Qualities lye dormant, and without Employment."16

10 D. Cannadine, Class in Britain, 32-33.
11 D. Wahrman, Imagining, 64.
12 Aristotle, The Politics, trans. Carnes Lord, Chicago: University of Chicago Press, 1984, IV.11.
13 D. Cannadine, Class in Britain, 42.
14 The protagonist Charles Primrose in Oliver Goldsmith's The Vicar of Wakefield, for example, sees "the middle order of mankind" as the social sphere that is home to "all the arts, wisdom, and virtues of society" (87-88).
15 Hume, David, Of the Middle Station of Life, in: Thomas H. Green/Thomas H. Grose (eds.), David Hume, The Philosophical Works, Aalen: Scientia Verlag, 1964 [1742], 4:376.
16 Ibid., 4:376-4:377.

The passage indicates that Hume sees the middling class's superior virtue as the result of a sociological process. By being exposed to a wider and more complex range of social life, individuals from the middle ranks are forced to develop greater moral sensitivity and power of judgement. While he thus attempts a philosophical explanation,17 other contemporary authors champion middle-class virtue in a more simplistic fashion by rhetorically foregrounding the idea of a stark contrast between the "generous Disposition and publick Spirit" of members of the middling ranks and the "Depravity and Selfishness of those in a higher Class".18 It is important to note once more that such arguments about the (moral, economic, political, etc.)
superiority of a distinct middle order or class, were less "an objective description of the social order" in Britain than "a way of constructing and proclaiming favourable ideological and sociological stereotypes" of those who found themselves hierarchically situated between the poor and the powerful.19 In this context, the development of the eighteenth-century novel as a distinct literary genre on the fast-growing market for printed material can be seen as instrumental in the emergence of the (self-)image of the middle class as an economically relevant and culturally powerful social group.20 Written by (predominantly) middle-class authors for a (predominantly) middle-class audience,21 the novel played an important role in the invention and promotion of this group's social identity, especially by contributing to the illustration and dissemination of the concept of middle-class morality.22 As a result, a preoccupation with the figure of the individual forced to navigate morally complex situations, together with the frequent vilification of characters from aristocracy and gentry, as well as a complacent middle-class contentedness with being placed in the 'best' social stratum, set the tone for much eighteenth-century prose writing.23 However, while the general connection between "the emergence of a bourgeois public sphere" and "the rise of novel writing and -reading" has long been treated as "a standard feature" of the period's literary and cultural history,24 the aesthetic and narratological dimensions of the "invention" of middle-class superiority25 still remain a productive field of study. For this reason, our research project within the CRC 1288 "Practices of Comparing" set out to investigate the novel's contribution to the eighteenth-century negotiation of social identity and morality by focusing on the play of narrative and stylistic strategies that constitute an important aspect of this contribution.

17 For a discussion of Hume's position in relation to that of Aristotle, see Yenor, Scott, David Hume's Humanity: The Philosophy of Common Life and Its Limits, Basingstoke: Palgrave, 2016, 114-119.
18 Thornton, William, The Counterpoise: Being Thoughts on a Militia and a Standing Army, London: Printed for M. Cooper, 1752. Quoted from the unpaginated preface.
19 D. Cannadine, Class in Britain, 32.
20 The connection between the "rise of the novel" and the emerging middle class was first discussed in I. Watt, Rise of the Novel, and Habermas, Jürgen, The Structural Transformation of the Public Sphere: An Inquiry into a Category of Bourgeois Society, trans. Thomas Burger, Cambridge: Polity, 2015 [1962]. For a survey of perspectives after those authors, see Cowan, Brian, Making Publics and Making Novels: Post-Habermasian Perspectives, in: J. A. Downie (ed.), The Oxford Handbook of the Eighteenth-Century Novel, Oxford: Oxford University Press, 2016.
21 Hunter points out that the readership of the novel was never restricted to one specific group only. In contrast to the argument presented here, he holds that "the characteristic feature of novel readership was its social range […] and the way it spanned the social classes and traditional divisions of readers" (Hunter, Paul J., The Novel and Social/Cultural History, in: John Richetti (ed.), The Cambridge Companion to the Eighteenth-Century Novel, Cambridge: Cambridge University Press, 1996, 19).
As we are traditionally trained literary scholars, the methodological thrust of our project lay in the informed manual analysis of an ambitious, yet manageable corpus of some twenty carefully selected novels from the period. We specifically decided to focus on classical narratological analyses of aspects such as narrative situation, focalization, and perspective structure26 as well as on the representation of fictional characters.27 Our individual (close) readings indeed produced results that hermeneutically seem to confirm our assumptions of a middling-class preoccupation with social identity. Nevertheless, we remained painfully aware of the limited scope of our project design regarding the number of texts that we were able to incorporate into our investigation. And we wondered if we could complement the traditional literary analyses of our research by turning to DH in the attempt to engage with at least some aspects of our research on a digital and somewhat broader textual basis.

3. Between close and distant reading: using DH methods for literary analysis and interpretation

While the tentative origins of DH reach back into the first half of the twentieth century,28 most of its methods and research questions fully emerged only during the past few decades. One branch of the wider field of DH has concerned itself with literary texts; and its exploration of the relationship between literature and the computer has taken many shapes. One major issue is the production and increasing availability of electronic (and scholarly) editions of primary and secondary works. This development has significantly widened access to literary texts and now plays a vital role in the preservation of books and other textual materials;29 it has forced libraries and academic institutions to develop new data policies and technological solutions for storing and providing access to primary and secondary literature. Also, not only have computer-related genres such as literary hypertexts emerged,30 but digital technologies have dramatically changed both the publishing and book industry, so that many literary and scholarly texts nowadays are read not from the printed page but from the displays of e-book devices.

In the context of those developments, the impact of the digital revolution on the academic infrastructure of the humanities is without question. But while computers have long found their place even in the offices of the most technophobic academics, and while even the rear-guard of traditional literary scholars use digital information retrieval systems such as electronic library catalogues and databanks, there is still widespread resistance to some other applications of digital methods in literary research. And indeed, in the realm of literary analysis and interpretation things look a bit complicated. On the one hand, textual analysis can very well apply digitized methods, in ways comparable to the strategies of computational and corpus linguistics. In the wide field of stylometrics, for instance, large corpora of texts can be scanned for the co-occurrence of particular textual features, which can then help trace historical developments in literary language, attribute authorship, or define genres.31 Also, the themes that dominate a text can be extracted by topic modeling.32 On the other hand, when it comes to the interpretation of literary works, there is some skepticism as to the ability of computer programs to support human readers in tasks of that complexity. Although textual analysis is always the basis for interpretation, interpretation is usually performed, after all, by highly educated, well-informed academic readers with a hermeneutic interest in exploring the meaning, or meanings, of a text. The main interest in interpretation lies in investigating a text's combination of thematic, aesthetic and rhetorical features, which are understood to be culturally embedded in complex ways. Both ample contextual research and the close scrutiny of textual features are therefore generally considered prerequisites of literary interpretation.

22 Nünning, Vera, From 'honour' to 'honest'. The Invention of the (Superiority of) the Middling Ranks in Eighteenth Century England, in: Journal for the Study of British Cultures 2 (1994).
23 For more detailed surveys of the eighteenth-century novel and its contexts, see Nünning, Ansgar, Der englische Roman des 18. Jahrhunderts aus kulturwissenschaftlicher Sicht. Themenselektion, Erzählformen, Romangenres und Mentalitäten, in: Ansgar Nünning (ed.), Eine andere Geschichte der englischen Literatur. Epochen, Gattungen und Teilgebiete im Überblick, Trier: WVT, 1996, and the contributions in Richetti, John (ed.), The Cambridge Companion to the Eighteenth-Century Novel, Cambridge: Cambridge University Press, 1996 and Downie, J. A. (ed.), The Oxford Handbook of the Eighteenth-Century Novel, Oxford: Oxford University Press, 2016.
24 P. Rogers, Social Structure, 47.
25 V. Nünning, From 'honour' to 'honest'.
26 Fludernik, Monika, An Introduction to Narratology, London: Routledge, 2009, Wenzel, Peter (ed.), Einführung in die Erzähltextanalyse: Kategorien, Modelle, Probleme, Trier: WVT, 2004, Nünning, Ansgar, Grundzüge eines kommunikationstheoretischen Modells der erzählerischen Vermittlung: Die Funktion der Erzählinstanz in den Romanen George Eliots, Trier: WVT, 1989.
27 Margolin, Uri, Character, in: David Herman (ed.), The Cambridge Companion to Narrative, Cambridge: Cambridge University Press, 2007, Eder, Jens/Jannidis, Fotis/Schneider, Ralf (eds.), Characters in Fictional Worlds: Understanding Imaginary Beings in Literature, Film, and Other Media, New York: de Gruyter, 2010.
28 Thaller, Manfred, Geschichte der Digital Humanities, in: Fotis Jannidis/Hubertus Kohle/Malte Rehbein (eds.), Digital Humanities. Eine Einführung, Stuttgart: J. B. Metzler, 2017, 3-4.
29 Shillingsburg, Peter L., From Gutenberg to Google: Electronic Representations of Literary Texts, Cambridge: Cambridge University Press, 2006.
30 Ryan, Marie-Laure, Avatars of Story, Minneapolis: University of Minnesota Press, 2006, Ensslin, Astrid, Hypertextuality, in: Marie-Laure Ryan/Lori Emerson/Benjamin J. Robertson (eds.), The Johns Hopkins Guide to Digital Media, Baltimore: Johns Hopkins University Press, 2014.
31 Burrows, John, Delta: A Measure for Stylistic Difference and a Guide to Likely Authorship, in: Literary and Linguistic Computing 17 (2002), Jannidis, Fotis/Lauer, Gerhard, Burrows's Delta and Its Use in German Literary History, in: Matt Erlin/Lynne Tatlock (eds.), Distant Readings: Topologies of German Culture in the Long Nineteenth Century, Rochester: Camden House, 2014. See also the extensive introduction and survey by Juola, Patrick, Authorship Attribution, in: Foundations and Trends in Information Retrieval 3 (2006), 233-334, http://dx.doi.org/10.1561/1500000005.
32 Jannidis, Fotis, Quantitative Analyse literarischer Texte am Beispiel des Topic Modelings, in: Der Deutschunterricht 5 (2016).
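Since topic modeling is invoked above only in passing, a minimal sketch may help to illustrate what the extraction of dominant themes involves in practice. The following sketch uses the Python library gensim; the corpus directory, the deliberately naive tokenisation, and the number of topics are our assumptions for illustration, not part of the original study.

from pathlib import Path
from gensim import corpora, models

# Hypothetical corpus: one plain-text novel per file.
texts = []
for novel in sorted(Path("corpus").glob("*.txt")):
    words = novel.read_text(encoding="utf-8").lower().split()
    # Naive tokenisation; real projects would lemmatise and
    # filter stop words as well as very rare terms.
    texts.append([w.strip('.,;:!?"()') for w in words if len(w) > 3])

dictionary = corpora.Dictionary(texts)               # maps words to ids
bow_corpus = [dictionary.doc2bow(t) for t in texts]  # bag-of-words vectors

# Train an LDA model; ten topics is an arbitrary illustrative choice.
lda = models.LdaModel(bow_corpus, id2word=dictionary,
                      num_topics=10, passes=5, random_state=42)

for topic_id, topic in lda.print_topics(num_words=8):
    print(topic_id, topic)   # most heavily weighted words per topic

Each topic emerges as a weighted list of co-occurring words, which the researcher must then interpret; whether such clusters correspond to anything like middling-class morality is precisely the hermeneutic question at issue in this chapter.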
On the other hand, when it comes to the interpretation of literary works, there is some skepticism as to the ability of computer programs to support human readers in tasks of that complexity. Although textual analysis is always the basis for interpretation, interpretation is usually performed, after all, by highly educated, well-informed academic readers with a hermeneutic interest in exploring the meaning – or meanings – of a text. The main interest in interpretation lies in investigating a text's combination of thematic, aesthetic and rhetorical features which are understood to be culturally embedded in complex ways. Both ample contextual research and the close scrutiny of textual features are therefore generally considered prerequisites of literary interpretation.

'Distant' reading, i.e., the computerized analysis of textual patterns that DH has introduced to literary scholarship, thus looks fairly incompatible at first sight with the close reading and interpretation strategies practiced by the scholar trained in literary hermeneutics. Franco Moretti famously spoke of distant reading as "a little pact with the devil: we know how to read texts, now let's learn how not to read them".33 But the advantage of distant reading is that it allows scholars to detect features across a number of texts that could only with difficulty and considerable use of resources be tackled by individual close readings. While the computer may lack the ability to detect 'qualitative' differences, it is its promise of a seemingly boundless quantitative analytical scope that turns it into a potentially powerful analytic tool. Moreover, DH not only offers the opportunity to extend existing research strategies in a quantitative fashion, but the playful exploration of digital tools may also lead to unexpected results and even contribute to the emergence of new research strategies. Emphasizing the productive power of playfulness and creativity, Stephen Ramsay advocates, in an influential paper, an informal "Hermeneutics of Screwing Around" as a valid computer-based research strategy for the Digital Age.34 Concerned with the limited scope of the hermeneutical (close) readings in our project, we were intrigued both by this lure of quantitative analysis and by the emergence of the "somewhat informal branch of text interpretation delightfully termed screwmeneutics" after Ramsay.35 Therefore, we decided to embark on a complementary investigation of the textual manifestations of some concepts of middle-class virtue in the eighteenth-century novel with the help of DH.

33 Moretti, Franco, Distant Reading, London: Verso, 2013, 48.
34 Ramsay, Stephen, The Hermeneutics of Screwing Around; or What You Do with a Million Books, in: Kevin Kee (ed.), Pastplay: Teaching and Learning History with Technology, Ann Arbor: University of Michigan Press, 2014 [2010].
35 McCurdy, Nina et al., Poemage: Visualizing the Sonic Topology of a Poem, in: IEEE Transactions on Visualization and Computer Graphics 22 (2016), 447.

4. From search to research: some examples

Our approach to using DH was unusual in so far as we did not take the more common route from distant to close reading but proceeded vice versa. Since we had already invested considerable effort in the (close) reading and analysis of our original corpus of ca. twenty eighteenth-century novels, we began our journey into the field of DH equipped with a solid set of expectations about the literary negotiation of social identity during the period under investigation. Starting from the hermeneutical findings of our investigation, we then attempted to corroborate our results by taking our research into the realm of computing, more precisely, by expanding the corpus of novels under investigation and developing ideas on how DH tools could help us to support our arguments.
Our first step in this process was to expand our text base by creating a digital corpus of 55 novels (see the list in the appendix to this article), thus more than doubling the number of texts. We decided to look at some of the most well-known novels of the eighteenth century as well as to include some lesser-known works that were, however, well received during the period in question. Further, we intentionally included works from different genres such as sentimental novels, gothic novels, coming-of-age stories and adventure novels, in order to do some justice to the considerable variety and diversity in eighteenth-century literary production.36

Already during the process of compiling and preparing the corpus, however, we encountered the first methodological challenges. While DH offers a great variety of tools and approaches, digitized texts are only ever suitable for a research purpose if they are prepared accordingly. In other words, if we were to look for complex sentence structures, or even narrative patterns conveying middle-class ideology, these structures would have to be tagged beforehand in each text. This means that passages that we consider good examples of such patterns would have to be identified and electronically annotated accordingly in the hidden plane of text information, the markup. Not only did we need digital copies of all novels, but a lot of tagging by hand would have been necessary. The reason is that no program can automatically mark up more complex structural features such as comparisons between characters that are not made explicit on the textual level, but are evoked through characters acting differently in comparable situations, a strategy frequently used in prose fiction. To tag the texts for such features would be a very time-consuming process that presupposes an answer to our original question, namely what role practices of comparing play on a structural level in the textual constructions of social identity. This question would need to be answered before the markup could begin, since these structures would have to be analyzed before they could then be tagged in all texts of our corpus. We would further risk exacerbating the danger of confirmation bias that is structurally inherent to our approach anyway, as we would run the danger of finding exactly what we placed there during the tagging process. The sheer number of working hours that would have to be put into creating new digital versions with tags made this type of digital research impractical for a first, tentative and playful exploration of our expanded corpus of eighteenth-century novels.

As a consequence of these first challenges we moved away from the idea of investigating complex syntactic and narrative structures, and turned to word and phrase searches as a feasible alternative, for which an array of DH tools are available, and for which simple text files suffice.37 In this context, our assumption was that key terms denoting middle-class virtues and vices would be detectable in abundance across the novels of our corpus.

36 See J. Richetti, The Cambridge Companion, Nünning, Ansgar and Vera, Englische Literatur des 18. Jahrhunderts, Stuttgart: Klett, 1998, and Backscheider, Paula R./Ingrassia, Catherine (eds.), A Companion to the Eighteenth-Century English Novel and Culture, Chichester: Wiley-Blackwell, 2006.
37 Project Gutenberg is the most easily accessible online text collection for such purposes. Although random checks of Project Gutenberg texts against the printed scholarly editions we had read suggested that the former are not always entirely reliable, we decided that for the first stage of word searches, the results were unlikely to be heavily distorted.
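For readers who want to retrace this step, such a corpus of simple text files can be assembled with a few lines of Python. The sketch below rests on our own assumptions, not on the authors' actual setup: filenames begin with the year of publication (so that sorting them yields chronological order), and the texts carry Project Gutenberg-style "*** START/END ***" boilerplate markers.

```python
import re
from pathlib import Path

def load_corpus(folder):
    """Read plain-text novels named like '1740_Pamela.txt' into a dict.
    Strips Project Gutenberg-style boilerplate where the markers are
    found; texts without such markers are kept as they are."""
    corpus = {}
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        start = re.search(r"\*\*\* ?START OF.+", text)
        end = re.search(r"\*\*\* ?END OF.+", text)
        if start and end and start.end() < end.start():
            text = text[start.end():end.start()]
        corpus[path.stem] = text
    return corpus

# corpus = load_corpus("novels/")
# list(corpus)[:2] -> e.g. ['1688_Oroonoko', '1713_Bosvil_and_Galesia']
```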
While programs such as AntConc are especially promising when zooming in on individual texts, Voyant proved to be more efficient when searching larger collections of texts. Generally speaking, it is interesting to look at the frequency of words within one text and within a corpus, since words that occur very frequently (except for function words such as conjunctions or articles, which we excluded from all searches) are likely to hint at the thematic focus of a text. Sometimes, however, the opposite of an expected word frequency may be revealing, too, as was the case in the searches we document below. Since we were also interested in diachronic developments, we began by using Voyant, which offered a direct comparison of word frequencies and the context of their appearance across the corpus as a whole. We added the year of publication to the title in order to have the novels appear in chronological order of their publication, so that any diachronic changes would be immediately visible.

Since the larger framework of our project was the study of the forms and functions of practices of comparing, our very first tentative approach was to run searches for words and particles that explicitly produce comparisons (such as more/less than, and words containing comparatives or superlatives ending in -er and -est). The result was that comparative words and particles did indeed occur frequently in our corpus ("more" = 14,196 times, "less" = 2,531, "than" = 12,161, "like" = 4,555). However, looking closer at our results it became apparent that words such as more were not always used to create an explicit comparison, but in many cases appeared in other contexts, such as to emphasize the expressed meaning ('still the more') or to indicate temporality ('once more'), in phrases like 'little more than', 'still the more', 'many more' and 'once more' (see fig. 1). Hence, the context search called the word-frequency result into question and provided a first indication that comparing in prose fiction might work in less explicit ways than in some other discourses.

Fig. 1: Word search for "more" and immediate contexts
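A context check of the kind shown in fig. 1 can be approximated outside Voyant as well. The sketch below, our own illustration rather than a description of Voyant's internals, simply counts the words immediately to the left and right of each hit of a keyword.

```python
from collections import Counter
import re

def neighbours(text, keyword="more", span=1):
    """Count the words immediately before and after each occurrence
    of `keyword` (a crude stand-in for Voyant's context view)."""
    toks = re.findall(r"[a-z']+", text.lower())
    left, right = Counter(), Counter()
    for i, tok in enumerate(toks):
        if tok == keyword:
            left.update(toks[max(0, i - span):i])
            right.update(toks[i + 1:i + 1 + span])
    return left, right

# left, right = neighbours(corpus["1740_Pamela"])
# left.most_common(5) may well surface 'once', 'no' or 'any' rather
# than comparative constructions -- the pattern described above.
```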
We then turned to other word searches. Collecting results from our (close) reading of the selected texts from our original corpus, and in the playful spirit of "screwmeneutics",38 we developed a list of terms that describe behavior and dispositions in negative and positive ways that we considered to be important for the negotiation of social identity in eighteenth-century English novels. In particular, we decided to look for positively and negatively connotated adjectives, but also noun phrases used in characterization by narrators and other characters, or in self-characterization. With this we aimed to make apparent the contrast between what were considered desirable or undesirable character traits and actions and how these conceptions changed throughout eighteenth-century literature.

38 N. McCurdy et al., Poemage, 447.

For this purpose, we created two lists of adjectives we came across in our close reading process and in our reading of secondary literature on the construction of social identity in the eighteenth century.39 In the group of positive terms, we had collected such words as "gentle", "gallantry" and "virtuous"; the negative ones included "foppish", "conceited", "impertinence", etc. We then added other terms from these and related semantic fields and complemented the adjectives and adverbs with the pertinent noun phrases in an attempt not to overlook relevant textual manifestations. This gave us a list that included the words "gentle" and "gentleman", "gallant" and "gallantry", "grace", "graceful", "gracious" and "graciousness", "polite" and "politeness", "virtuous" and "virtue" for the positively connotated behaviors and attitudes; the negatively connotated ones included "fool" and "foolish", "fop", "foppish" and "foppery", "disagreeable", "conceited" and "conceitedness", "vulgar" and "vulgarity", "impertinent" and "impertinence", "impetuous" and "impetuosity", as well as "negligent" and "negligence". For efficient text searching, the truncated forms of these words were used.40 Our list then had, for instance, gentle*, tender*, grac*, gallant*, polit*, sweet*, virtu*, modest*, moderat* on the positive side, and fop*, fool*, disagreeab*, conceited*, vulgar*, impertinen*, impetuo*, negligen* on the negative.

39 V. Nünning, From 'honour' to 'honest', D. Wahrman, Imagining.
40 Using truncated forms, i.e., a word stem closed by an asterisk, allows the system to find instances of the stem in all variations and word classes; for example, gentl* would not only include the results for "gentle", but also for "gently", "gentleman", "gentlemanly", etc.

Figure 2 shows the frequency of both the negative and the positive search terms across our corpus. Since the corpus was organized chronologically, the graphic ought to show whether certain terms were used more or less frequently in later publications than in earlier ones. As we see in the diagram, usage did vary considerably, but this variation shows no indication of being related to diachronic changes during the time period. Frequencies rather vary from text to text. In fact, while individual texts may deviate from the median in a significant fashion, the overall frequency of the terms under investigation seems to remain more or less consistent over the entire eighteenth century as far as our corpus is concerned. The underlying assumption guiding our approach was that the social changes in the understanding of the virtues and vices listed above would somehow be reflected by changing word frequencies. Especially for gentle* did we expect to find a significant diachronic development, as notions of gentility changed from a rather narrow denotation of gentle birth to an understanding of polite behavior by the end of the century that made it possible for men from a significantly wider range of society to claim the status of a "Gentleman" (see FN 6). Contrary to our expectations, however, we were unable to discern significant developments in our search results. While gentle* indicated at least a slight discernible decrease of usage (see fig. 2), none of the other terms offered a visible indication of a diachronic development.
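Translated into code, a truncated search of this kind amounts to a prefix match. The following sketch computes, for one text, the rate of positive and negative stem hits per 10,000 words; the normalisation scale and the function itself are our own choices, made for illustration.

```python
import re

POSITIVE = ["gentle", "tender", "grac", "gallant", "polit", "sweet",
            "virtu", "modest", "moderat"]
NEGATIVE = ["fop", "fool", "disagreeab", "conceited", "vulgar",
            "impertinen", "impetuo", "negligen"]

def stem_rate(text, stems):
    """Occurrences of any token beginning with one of `stems`,
    normalised per 10,000 tokens."""
    toks = re.findall(r"[a-z]+", text.lower())
    pattern = re.compile("^(?:" + "|".join(stems) + ")")
    hits = sum(1 for t in toks if pattern.match(t))
    return 10_000 * hits / len(toks)

# for title in sorted(corpus):  # chronological, thanks to the year prefix
#     print(title, round(stem_rate(corpus[title], POSITIVE), 1),
#           round(stem_rate(corpus[title], NEGATIVE), 1))
```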
Put differently, word frequencies did not hint at the emerging construction of a middle-class identity during the period, as described by eighteenth-century social history. One possible explanation for this may be that frequency cannot capture what a term means: while narrators and characters in late eighteenth-century novels may use all variations of the words "gentle", "gentleman", etc. as frequently as those in the early phase, they may simply mean different things by those terms.

With this possible explanation in mind, we decided to turn away from questions of diachronic development within the eighteenth century. Our next step was to look at the total word frequency of our search terms in the entire corpus. In order to corroborate our assumption that these terms play a significant role in the topics of the novels, we checked their position in the list of the most frequently appearing words within the body of novels under consideration. However, we were once more disappointed. The word count showed that, of the words we were looking for, most were situated in the lower ranks of the count, whereas words such as 'said', 'Mr', 'time' and 'little' came up at the top of the list (see fig. 3). From our search list, only gentleman managed to enter the top 100 at position 81, followed by virtue at 239. The results for our negative terms proved to be even less impressive with, for instance, fool reaching only the top 2000 of the most frequently used words in the corpus. Our positive terms generally ranked higher than our negative terms, with virtue at position 239 (1370 occurrences), agreeable at 440 (892 occurrences), sweet at 450 (881 occurrences), and tender at 299 (1163 occurrences). None of the negative terms made it above fool at position 1420 and with 329 occurrences. With all our negative terms ranking rather low and quite a number of our positive terms ranking comparatively higher, and with a look at the most frequent words (especially "dear", "great", and "good"), one may speculate whether character traits might have been negotiated more in terms of stating an ideal during the period. This would mean that texts rather state what should be aimed for, while at the same time only implicitly hinting at negative traits and behaviors and hence at what to avoid. On the other hand, our experience with words and particles that explicitly produce comparison showed that word frequency tells us little about the contexts of use, and hence little about the diverse meanings individual words can take on in different contexts. Such a bold claim would therefore need more data via context searches or a more elaborate analysis via close reading.

Fig. 2: Frequency of negative and positive terms across the corpus
Fig. 3: Top 40 most frequent words in the corpus
Fig. 4: Position of the word "conduct" in the word count

After none of our search terms had turned out to feature prominently among the most frequent words in the corpus, our next step was to turn to what we could find on the list of most frequent words itself (figs. 3 and 4). For this we went through the list looking for terms we felt to exhibit some kind of relationship to contemporary discussions of social identity.
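The ranking exercise just described can be reproduced by sorting the corpus-wide word count. A sketch, again under our own assumptions about tokenisation and stopword removal:

```python
from collections import Counter
import re

def corpus_ranks(corpus, stopwords=frozenset()):
    """Rank every word in the whole corpus by raw frequency
    (rank 1 = most frequent), optionally excluding function words."""
    counts = Counter()
    for text in corpus.values():
        counts.update(w for w in re.findall(r"[a-z]+", text.lower())
                      if w not in stopwords)
    ranks = {w: r for r, (w, _) in enumerate(counts.most_common(), start=1)}
    return ranks, counts

# ranks, counts = corpus_ranks(corpus)
# ranks.get("gentleman"), ranks.get("virtue")
# would yield positions of the kind reported above (81 and 239).
```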
In this way, we found that comparably frequently used in our corpus were "honour" (no. 54 in the word-frequency list), "poor" (no. 47), "character" (no. 141) and "conduct" (no. 187; see fig. 4), with the word "conduct" referring to the overall comportment of a person. Of those results, we considered "conduct" to be particularly interesting. The term, appearing most frequently in Wollstonecraft's Maria, Or, The Wrongs of Woman (1798) and least frequently in Fielding's Shamela (1741), is not only eponymous to the eighteenth-century genres of the conduct book and the conduct novel, but generally constitutes a key concept of the literary and cultural movement of sensibility.41,42 For this reason, we decided to play around some more and searched for the word "conduct" in the sentimental novels of our corpus separately.43 Once more, we received a fairly inconclusive diagram (fig. 5): between the middle and the end of the eighteenth century, sentimental novels feature the term "conduct" in varying ways. While interesting for the formulation of new research questions,44 this did not help us in terms of our thesis on the literary negotiations of social identity. In fact, the visualization suggested that a diachronic change in the usage of particular words is rather difficult to argue for on the basis of the type of distant reading we engaged in when working with Voyant.

Fig. 5: Occurrence of "conduct" in selected sentimental novels

41 V. Nünning, From 'honour' to 'honest'.
42 The low result for Shamela could be interpreted in different ways. On the one hand, it could mean that this text, being a parody of one of the most influential of the early sentimental novels, wanted to avoid the term by way of taking a critical stance on the genre of the sentimental novel, which was heavily influenced by the conduct book. On the other hand, Fielding may simply have counted on the reader to realise that both the original and the parody deal with conduct, without having to make that explicit.
43 The sentimental novels or parodies thereof in our corpus are, in chronological order: Samuel Richardson's Pamela (1740), Henry Fielding's Shamela (1741) and Amelia (1751), Laurence Sterne's Tristram Shandy (1759), Oliver Goldsmith's The Vicar of Wakefield (1766), Henry Mackenzie's The Man of Feeling (1771), Tobias Smollett's Humphrey Clinker (1771), Frances Burney's Evelina (1778), Maria Edgeworth's Castle Rackrent (1800) and Jane Austen's Sense and Sensibility (1811).
44 Such as: Is a separation of a genre and its parodies necessary, and if so, how can such a distinction be upheld? To what extent does the illustration present a visualization of genre negotiations by means of comparison? Or is this wave movement merely coincidental, an effect of the particular novels in the corpus?

Another visualization tool in Voyant offers users the opportunity to look for the context and the co-occurrence of individual terms in a corpus. Here it became apparent that "conduct", while mainly appearing as a noun in connection with adjectives that qualify it, also appears as a verb, and does so most frequently in our Gothic novels (figs. 6 and 7). With their tendency to set the action in regions both temporally and spatially remote from eighteenth-century England, the Gothic novels can comment on contemporary English society at best by implication, so that the latter finding pointed once more at the need for further close reading and interpretation.

Fig. 6: Examples of sentences containing "conduct" (1)
Fig. 7: Examples of sentences containing "conduct" (2)
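Searching a subgenre separately only requires restricting the corpus to the relevant titles. A sketch, reusing the corpus dict from the loading example above; the filenames are, again, our own illustrative assumption.

```python
import re

SENTIMENTAL = ["1740_Pamela", "1741_Shamela", "1751_Amelia",
               "1759_Tristram_Shandy", "1766_Vicar_of_Wakefield",
               "1771_Man_of_Feeling", "1771_Humphrey_Clinker",
               "1778_Evelina", "1800_Castle_Rackrent",
               "1811_Sense_and_Sensibility"]

def term_rate(text, term="conduct"):
    """Relative frequency of one exact word form, per 10,000 tokens."""
    toks = re.findall(r"[a-z]+", text.lower())
    return 10_000 * toks.count(term) / len(toks)

# for title in SENTIMENTAL:
#     print(title, round(term_rate(corpus[title]), 2))
```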
5. Discussion

The results of our investigations with Voyant were unexpected, to say the least. They are not only at odds with important voices in the secondary literature,45 but they also contradict our own close reading experiences, which confirm the conceptual relevance of the listed virtues and vices in the portrayal of characters in the eighteenth-century novel in general. We expected the words used to describe presumably middle-class virtues and flaws to display a diachronic development, namely an increase in frequency towards the end of the eighteenth century. We based our expectations on the assumption that the social identity of the 'middling' classes began to be constructed in negotiations in and beyond literature during this time period.46 By use of visualization tools we expected to be able to localize the moment these negotiations entered literature on a word level, but instead the results indicate that, as far as our search terms and our corpus are concerned, no such change is traceable.

Confronted with these findings, we naturally began to question our search strategies, including the list of terms we had thought to be so prominent in eighteenth-century discussions of virtues and vices. But we also wondered whether the infrequent appearance of those terms and the lack of clearly discernible diachronic developments in their application could also be explained differently, for example, by considering the traditional distinction made in literary studies between telling and showing.47 Thus, we speculated that our findings may indicate a tendency to show virtues by means of the description of behaviors rather than by naming them explicitly. However, such a claim can only be upheld by a closer analysis in terms of close reading as a complementary method to the usage of DH tools.

Further, it seemed that when working with computational techniques, there is the danger that significant differences between texts belonging to the various subgenres of the novel that constitute the overall corpus may disappear from view. While a scholar has certain background information on literary and cultural history available in close reading, a computer is rather ignorant of contextual details in its application of distant reading to a text. This bears problems as well as promises. But we wondered whether these subgenre-specific groups, such as Gothic novels or sentimental novels, had not better be analyzed by searching them separately. The justification for dealing with these groups of novels belonging to different subgenres separately lies in literary-historical conventions and definitions of, e.g., the sentimental novel or the Gothic novel. The very fact that DH overlooks such conventions and definitions in the production of data makes us aware of their potential relevance for analysis and interpretation.

45 E.g., A. Nünning, Der englische Roman.
46 D. Wahrman, Imagining, Schwarz, L. D., Social Class and Social Geography: The Middle Classes in London at the End of the Eighteenth Century, in: Social History 7 (1982).
47 Herman, David, Story Logic: Problems and Possibilities of Narratology, Lincoln: University of Nebraska Press, 2002, 171–172.
In the words of McCarty, DH forces us to "ask in the context of computing what can (and must) be known of our artifacts, how we know what we know about them and how new knowledge is made".48 Just as in the case of words and particles which explicitly produce comparisons, and with the different rankings of positive and negative terms, our usage of DH tools challenged us to acknowledge that computing can only ever give us information on texts in the form of data. How we read and interpret these numbers and results foregrounds the responsibility of informed research. It is easy to jump quickly to false conclusions if the numbers seem to support the desired argument. But especially when we combine traditional research with DH methodology, taking into consideration all the different aspects that influence the results (e.g., the corpus, the genre, the scope of each text, the relation to other literary works of the same time period, etc.) becomes a difficult yet important task for every scholar in the humanities.

48 McCarty, Willard, Encyclopedia of Library and Information Science, New York: Dekker, 2003, 1231.

For our usage of Voyant this meant we had to realize that even when we received results that seemed to corroborate our assumptions, this did not really mean direct support for our argument in terms of numbers. It only meant that we needed to question these results again in order to avoid running the risk of prematurely interpreting unanticipated quantitative data in the light of our underlying argument. In the case of the term conduct, for example, we seemed to have found a frequently used word that could support our argument about the negotiation of social identity in terms of morals in the eighteenth-century English novel. Instead, further testing via other visualization tools offered by Voyant made clear that this seemingly simple link between word frequency and research question offered a false security (figs. 2, 4, and 5). Looking at the context of the usage of the word "conduct", we could not support our argument but had to face the fact that the various contexts in which "conduct" occurs varied significantly in meaning. This means that only in a few of the many occurrences did "conduct" actually appear in contexts that we had in mind and that supported our argument (figs. 6 and 7).

Another example of the need to treat numeric results with caution was the result for the search term fool. The 1741 novel Shamela by Henry Fielding, which was written as a parody of Samuel Richardson's highly influential sentimental novel Pamela, showed a peak in the frequency of the word "fool" (551 occurrences) in comparison to the other novels. Strikingly, the second highest frequency of the word "fool" was actually found in Richardson's Pamela, with 175 occurrences. The temptation to construct some intertextual correlation between both texts with regard to their top positions in the word count for "fool" was great: Fielding might have picked up an inherently significant feature of Richardson's novel and exaggerated it for the purposes of satire. However, when we took into account the overall length of the two texts, this argument collapsed: while 551 occurrences of fool seem noteworthy in the relatively short novel Shamela (14,456 words), there is nothing significant about the term's appearance in Pamela given the total length of this work. With 227,407 words, Richardson's novel is over fifteen times longer than that of Fielding. Thus, given the massive text of Pamela, the count of 175 occurrences of fool dwindles into comparative insignificance.
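The correction applied here is a simple normalisation of raw counts by text length. Using the figures reported above:

```python
def per_thousand(count, total_words):
    """Normalise a raw hit count to occurrences per 1,000 words."""
    return 1_000 * count / total_words

shamela = per_thousand(551, 14_456)    # ~38.1 per 1,000 words
pamela = per_thousand(175, 227_407)    # ~0.77 per 1,000 words
# The raw counts differ by a factor of ~3, the normalised rates by a
# factor of ~50: the comparison only becomes meaningful once text
# length is taken into account.
```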
We were left with the paradox that, while being considered more precise and accurate in terms of quantitative and statistical occurrences than traditional methods of close reading, DH actually seemed to blur any assumption of a precise answer to questions of literary analysis. In our case, DH appeared to be more suitable for finding new questions than for offering or supporting conclusive answers to interpretative assumptions. Voyant was able to give us exact word frequencies, to tell us which word appeared how often in which novel, and even allowed us to compare these frequencies across the corpus directly, while permitting us at the same time to look for the specific contexts of the words. All of this was very helpful, but mainly to question our own approach and its underlying categories. We set out to look for literary negotiations of social identity and how these were influenced by practices of comparison, only to be faced with the problem that comparison was already included in every aspect of our own approach. Instead of making clear distinctions more apparent, DH made us question these distinctions from the start. If this were the end of it, we would come out of this experiment quite disillusioned. Instead, we are inspired by what seems to offer a new methodology for approaching literary texts. While the usage of computing in literary studies is often feared to turn literary analysis into a mere equation whose solution would render all further examination of a text vain and shallow, the opposite seems to be true. DH offers a chance to engage in a more playful, more open-minded, yet at the same time equally critical approach to literature and its study that eventually draws research back to the text and to the question of how texts are embedded in various discourses.

6. Conclusion

At first sight, our engagement with text search and visualization tools for the analysis of a corpus of eighteenth-century English novels could be summarized in terms of discouragement and frustration – an experience that appears to be shared by scholars in other DH projects but that is apparently rarely admitted in DH. According to Jasmine Kirby, "[w]e don't talk enough about failure in the digital humanities".49 Our failure to corroborate some of our assumptions with numerical data, and the necessity to proceed from the observation of word counts to the wider contexts of our findings, in fact triggered two insights. First, if the actions and dispositions of humans in social interaction that the eighteenth-century novel negotiates as desirable or undesirable are much less explicitly mentioned than expected, the novel must have other ways of presenting them. Second, the practices of comparing, too, appear to be situated on other levels than that of the text surface, at least in the corpus under scrutiny in our project. The lack of simple numerical proof garnered from distant reading was, in our case, a productive 'failure', because it helped us formulate the hypothesis that literary practices of comparing involve the structural juxtaposition of characters in comparable settings and plot segments.

49 Kirby, Jasmine S., How NOT to Create a Digital Media Scholarship Platform: The History of the Sophie 2.0 Project, in: IASSIST Quarterly 42 (2019), https://doi.org/10.29173/iq926.
As Nina McCurdy and her colleagues have demonstrated, there is some irony in the fact that the more precision a DH tool offers, the more it makes sense to 'screw around' with it to generate new, interesting and exciting research questions. Narrow research questions in DH often yield a variety of open, inconclusive results, while 'screwing around' seems to lead to unexpected, innovative questions.50 None of this narrows literary research down to a question of software engineering and mathematical bean counting; rather, computational techniques in the form of tools offer a playful exchange between the traditionally trained scholar and DH to find ever new ways of reading texts together in the midst of the "beautiful mess" that is literature.51

What our search for textual evidence also appeared to show was that available strategies of tagging the words and passages of a text – the production of markup – could much profit from taking into account the research questions of literary scholarship. Existing markup algorithms performed autonomously by computer programs may be helpful and time-saving, and they certainly have improved much in recent years; still, they rarely capture any of the more content-related questions pertaining to literary analysis, let alone interpretation. What, in the case of our project, really would have helped would have been the automatic isolation and tagging of passages that contain comparisons; this, however, is nowhere in sight.

50 N. McCurdy et al., Poemage, 447.
51 Ibid., 445.
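To see why such automatic tagging is hard, consider how far a naive heuristic gets. The sketch below is entirely our own illustration: it catches a few explicit surface cues of comparison and badly over-generates, while the implicit, situational comparisons between characters described above remain invisible to it.

```python
import re

# Explicit surface cues only; note that 'rather than' and 'other than'
# also match the second alternative -- the heuristic over-generates.
COMPARISON_CUES = re.compile(
    r"\b(?:more|less)\s+\w+\s+than\b"   # 'more virtuous than'
    r"|\b\w+er\s+than\b"                # 'gentler than'
    r"|\bas\s+\w+\s+as\b"               # 'as gentle as'
)

def candidate_passages(text, window=200):
    """Yield rough snippets around explicit comparison cues, as a
    starting point for manual inspection, not as finished markup."""
    lowered = text.lower()
    for m in COMPARISON_CUES.finditer(lowered):
        yield text[max(0, m.start() - window):m.end() + window]
```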
We also encountered problems in the visualization of results, even though our corpus was, in DH terms, very small. How could meaningful illustrations be produced if hundreds, or even thousands, of books were subjected to data-mining? Visualization tools will also have to be further developed to match the research designs of the humanities better. After our venture into DH, we still believe that no computer can 'find out' anything about the meaning of a text on its own. Therefore, while the scholar's limitations are quantitative, those of computer programs appear to lie in the quality of their findings. Nor will a text be 'readable' to a computer at all if it has not been previously read, processed and digitized by humans, increasingly automatized programs of parsing and tagging notwithstanding. The solution to the apparent incompatibility of close and distant reading lies, unsurprisingly, in the fact that the two strategies can, and ought to be, regarded as complementary rather than competitive, as Stephen Ramsay, among others, has argued.52 As we have shown, making use of DH methods can help literary scholars to focus and re-formulate their questions and research strategies, and to reconsider their assumptions about what literary texts do and how they do it.

52 S. Ramsay, The Hermeneutics.

Appendix 1: Extended corpus of English eighteenth-century novels

1688 Oroonoko (Aphra Behn)
1713 The Amours of Bosvil and Galesia (Jane Barker)
1714 Exilius (Jane Barker)
1719 Robinson Crusoe (Daniel Defoe)
1720 Memoirs of a Cavalier (Daniel Defoe)
1722 Moll Flanders (Daniel Defoe)
1723 The Lining of the Patch Work Screen (Jane Barker)
1724 John Sheppard (Daniel Defoe)
1726 Gulliver's Travels (Jonathan Swift)
1740 Pamela (Samuel Richardson)
1741 Shamela (Henry Fielding)
1743 Jonathan Wild (Henry Fielding)
1748 Roderick Random (Tobias Smollett)
1749 Fanny Hill (John Cleland)
1749 Tom Jones (Henry Fielding)
1750 Harriot Stuart (Charlotte Lennox)
1751 Amelia (Henry Fielding)
1751 Betsy Thoughtless (Eliza Fowler Haywood)
1751 Peter Wilkins (Robert Paltock)
1752 The Female Quixote (Charlotte Lennox)
1759 Rasselas (Samuel Johnson)
1759 Tristram Shandy (Laurence Sterne)
1760 The Adventures of Sir Launcelot Greaves (Tobias Smollett)
1762 Millenium Hall (Sarah Scott)
1764 Castle of Otranto (Horace Walpole)
1766 The Vicar of Wakefield (Oliver Goldsmith)
1769 Emily Montague (Frances Brooke)
1771 Humphrey Clinker (Tobias Smollett)
1771 The Man of Feeling (Henry Mackenzie)
1778 Evelina (Frances Burney)
1778 The Old English Baron (Clara Reeve)
1782 Cecilia (Fanny Burney)
1784 Imogen (William Godwin)
1786 The Heroine (Eaton Stannard Barrett)
1786 Vathek – An Arabic Tale (William Beckford)
1788 Mary – A Fiction (Mary Wollstonecraft)
1789 Castles of Athlin and Dunbayne (Ann Radcliffe)
1790 A Sicilian Romance (Ann Radcliffe)
1791 A Simple Story (Elizabeth Inchbald)
1791 Charlotte Temple (Susanna Rowson)
1791 Romance of the Forest (Ann Radcliffe)
1793 The Castle of Wolfenbach (Eliza Parsons)
1794 Caleb Williams (William Godwin)
1794 The Mysteries of Udolpho (Ann Radcliffe)
1796 Memoirs of Emma Courtney (Mary Hays)
1796 The Monk (Matthew Lewis)
1798 Maria; Or, The Wrongs of Woman (Mary Wollstonecraft)
1798 Wieland (Charles Brockden Brown)
1799 St Leon (William Godwin)
1800 Castle Rackrent (Maria Edgeworth)
1806 Leonora (Maria Edgeworth)
1806 Wild Irish Girl (Sydney Owenson)
1806 Zofloya (Charlotte Dacre)
1811 Sense and Sensibility (Jane Austen)
1812 The Absentee (Maria Edgeworth)

Bibliography

Aristotle, The Politics, trans. Carnes Lord, Chicago: University of Chicago Press, 1984.
Backscheider, Paula R./Ingrassia, Catherine (eds.), A Companion to the Eighteenth-Century English Novel and Culture, Chichester: Wiley-Blackwell, 2006.
Burrows, John, Delta: A Measure for Stylistic Difference and a Guide to Likely Authorship, in: Literary and Linguistic Computing 17 (2002), 267–287.
Cannadine, David, Class in Britain, London: Penguin, 2000.
Corfield, Penelope J., Class by Name and Number in Eighteenth-Century England, in: History 72 (1987), 38–61.
Cowan, Brian, Making Publics and Making Novels: Post-Habermasian Perspectives, in: J. A. Downie (ed.), The Oxford Handbook of the Eighteenth-Century Novel, Oxford: Oxford University Press, 2016, 55–70.
Defoe, Daniel, Robinson Crusoe, ed. Michael Shinagel, New York: Norton, 1994 [1719].
Downie, J. A. (ed.), The Oxford Handbook of the Eighteenth-Century Novel, Oxford: Oxford University Press, 2016.
Eder, Jens/Jannidis, Fotis/Schneider, Ralf (eds.), Characters in Fictional Worlds: Understanding Imaginary Beings in Literature, Film, and Other Media, New York: de Gruyter, 2010.
Ensslin, Astrid, Hypertextuality, in: Marie-Laure Ryan/Lori Emerson/Benjamin J.
Robertson (eds.), The Johns Hopkins Guide to Digital Media, Baltimore: Johns Hopkins University Press, 2014, 258–265.
Fludernik, Monika, An Introduction to Narratology, London: Routledge, 2009.
French, Henry, Gentlemen: Remaking the English Ruling Class, in: Keith Wrightson (ed.), A Social History of England: 1500–1750, Cambridge: Cambridge University Press, 2017, 269–89.
Goldsmith, Oliver, The Vicar of Wakefield, Oxford: Oxford University Press, 2006 [1766].
Habermas, Jürgen, The Structural Transformation of the Public Sphere: An Inquiry into a Category of Bourgeois Society, trans. Thomas Burger, Cambridge: Polity, 2015 [1962].
Herman, David, Story Logic: Problems and Possibilities of Narratology, Lincoln: University of Nebraska Press, 2002.
Hume, David, Of the Middle Station of Life, in: Thomas H. Green/Thomas H. Grose (eds.), David Hume, The Philosophical Works, Aalen: Scientia Verlag, 1964 [1742], 375–380.
Hunter, Paul J., The Novel and Social/Cultural History, in: John Richetti (ed.), The Cambridge Companion to the Eighteenth-Century Novel, Cambridge: Cambridge University Press, 1996, 9–40.
Jannidis, Fotis, Quantitative Analyse literarischer Texte am Beispiel des Topic Modelings, in: Der Deutschunterricht 5 (2016), 24–35.
Jannidis, Fotis/Lauer, Gerhard, Burrows's Delta and Its Use in German Literary History, in: Matt Erlin/Lynne Tatlock (eds.), Distant Readings: Topologies of German Culture in the Long Nineteenth Century, Rochester: Camden House, 2014, 29–54.
Juola, Patrick, Authorship Attribution, in: Foundations and Trends in Information Retrieval 3 (2006), 233–334, http://dx.doi.org/10.1561/1500000005.
Kirby, Jasmine S., How NOT to Create a Digital Media Scholarship Platform: The History of the Sophie 2.0 Project, in: IASSIST Quarterly 42 (2019), https://doi.org/10.29173/iq926.
Margolin, Uri, Character, in: David Herman (ed.), The Cambridge Companion to Narrative, Cambridge: Cambridge University Press, 2007, 66–79.
McCarty, Willard, Encyclopedia of Library and Information Science, New York: Dekker, 2003.
McCurdy, Nina et al., Poemage: Visualizing the Sonic Topology of a Poem, in: IEEE Transactions on Visualization and Computer Graphics 22 (2016), 439–448.
Moretti, Franco, Distant Reading, London: Verso, 2013.
Muldrew, Craig, The 'Middling Sort': An Emergent Cultural Identity, in: Keith Wrightson (ed.), A Social History of England: 1500–1750, Cambridge: Cambridge University Press, 2017, 290–309.
Nünning, Ansgar and Vera, Englische Literatur des 18. Jahrhunderts, Stuttgart: Klett, 1998.
Nünning, Ansgar, Der englische Roman des 18. Jahrhunderts aus kulturwissenschaftlicher Sicht. Themenselektion, Erzählformen, Romangenres und Mentalitäten, in: Ansgar Nünning (ed.), Eine andere Geschichte der englischen Literatur. Epochen, Gattungen und Teilgebiete im Überblick, Trier: WVT, 1996, 77–106.
Nünning, Ansgar, Grundzüge eines kommunikationstheoretischen Modells der erzählerischen Vermittlung: Die Funktion der Erzählinstanz in den Romanen George Eliots, Trier: WVT, 1989.
Nünning, Vera, From 'honour' to 'honest'. The Invention of the (Superiority of) the Middling Ranks in Eighteenth Century England, in: Journal for the Study of British Cultures 2 (1994), 19–41.
Ramsay, Stephen, The Hermeneutics of Screwing Around; or What You Do with a Million Books, in: Kevin Kee (ed.), Pastplay: Teaching and Learning History with Technology, Ann Arbor: University of Michigan Press, 2014 [2010], 111–120.
Richetti, John (ed.), The Cambridge Companion to the Eighteenth-Century Novel, Cambridge: Cambridge University Press, 1996.
Rogers, Pat, Social Structure, Class, and Gender, 1660–1770, in: J. A. Downie (ed.), The Oxford Handbook of the Eighteenth-Century Novel, Oxford: Oxford University Press, 2016, 39–54.
Ryan, Marie-Laure, Avatars of Story, Minneapolis: University of Minneapolis Press, 2006.
Schwarz, L. D., Social Class and Social Geography: The Middle Classes in London at the End of the Eighteenth Century, in: Social History 7 (1982), 167–185.
Shillingsburg, Peter L., From Gutenberg to Google: Electronic Representations of Literary Texts, Cambridge: Cambridge University Press, 2006.
Thaller, Manfred, Geschichte der Digital Humanities, in: Fotis Jannidis/Hubertus Kohle/Malte Rehbein (eds.), Digital Humanities. Eine Einführung, Stuttgart: J. B. Metzler, 2017, 3–12.
Thompson, E. P., Customs in Common: Studies in Traditional Popular Culture, New York: New Press, 1993.
Thornton, William, The Counterpoise: Being Thoughts on a Militia and a Standing Army, London: Printed for M. Cooper, 1752.
Tillyard, E. M. W., The Elizabethan World Picture: A Study of the Idea of Order in the Age of Shakespeare, London: Chatto & Windus, 1967 [1942].
Wahrman, Dror, Imagining the Middle Class: The Political Representation of Class in Britain, c. 1780–1840, Cambridge: Cambridge University Press, 1995.
Watt, Ian, The Rise of the Novel: Studies in Defoe, Richardson, and Fielding, Berkeley: University of California Press, 1957.
Wenzel, Peter (ed.), Einführung in die Erzähltextanalyse: Kategorien, Modelle, Probleme, Trier: WVT, 2004.
Yenor, Scott, David Hume's Humanity: The Philosophy of Common Life and Its Limits, Basingstoke: Palgrave, 2016.

work_2unm4tvuyzbnbjzhy4o532imue ----

Publications 2019, 7, 65; doi:10.3390/publications7040065 www.mdpi.com/journal/publications

Article

Open Science in the Humanities, or: Open Humanities?

Marcel Knöchelmann

Department of Information Studies, University College London, Foster Court, Gower Street, London WC1E 6BT, UK; marcel.knochelmann.15@ucl.ac.uk

Received: 9 October 2019; Accepted: 13 November 2019; Published: 19 November 2019

Abstract: Open science refers to both the practices and norms of more open and transparent communication and research in scientific disciplines and the discourse on these practices and norms. There is no such discourse dedicated to the humanities. Though the humanities appear to be less coherent as a cluster of scholarship than the sciences are, they do share unique characteristics which lead to distinct scholarly communication and research practices. A discourse on making these practices more open and transparent needs to take account of these characteristics. The prevalent scientific perspective in the discourse on more open practices does not do so, which confirms that the discourse's name, open science, indeed excludes the humanities so that talking about open science in the humanities is incoherent. In this paper, I argue that there needs to be a dedicated discourse for more open research and communication practices in the humanities, one that integrates several elements currently fragmented into smaller, unconnected discourses (such as on open access, preprints, or peer review).
I discuss three essential elements of open science—preprints, open peer review practices, and liberal open licences—in the realm of the humanities to demonstrate why a dedicated open humanities discourse is required.

Keywords: open humanities; open science; digital humanities; scholarly communication; peer review

1. Introduction

There is a long history of sorting disciplines into clusters, primarily the sciences and humanities [1–3]. These clusters are, at times, extended to a triad with the social sciences in between. Contrary to the impression this clustering conjures, though, no exact distinction can be drawn between the sciences and the humanities (or the social sciences in between). Not one binary opposition, nor a combination of several ones, can describe the differences in a way that would suffice for a clear-cut separation of disciplines: understanding or explaining, idiographic or nomothetic, qualitative or quantitative, meaning or theory—all fall short of describing more than a few temporal or field-specific regularities. Any such distinction can at best approximate what unites and separates disciplines so that, in the end, it is a question of purpose or necessity on the basis of which the exercise of unifying or separating disciplines is to be undertaken.

One such necessity arises when norms and practices of open and transparent research and communication are to be debated. Disciplines share characteristics of research and communication practices, and a discourse on these practices needs to sort disciplines into clusters so as to sufficiently address these characteristics. There is a specific discourse dedicated to open practices for disciplines of the sciences: open science. This discourse regularly takes it for granted that it speaks for scholarly communication as a whole. However, already its name—open science—indicates that this discourse is not concerned with the humanities but with a cluster of scientific disciplines. There is no such discourse dedicated to the humanities. This is the case although research and communication practices in disciplines commonly clustered into the humanities do share characteristics that could—and, I argue, need to—be addressed by a coherent discourse as well. The coherence of this discourse builds upon the unifying characteristics of its disciplines. Without such unifying grounding, contributions to a potential discourse would only be concerned with elements of some disciplines—in the humanities, for instance: philology, philosophy, history, theology, among others—instead of the humanities as a whole. The result would be a fragmentation into several smaller discourses such as open philology, open philosophy, etc.

Characteristics on the basis of which disciplines can be sorted into the humanities are, for instance, an emphasis on perspectivity (as opposed to objectivity in the sciences), verbality (as opposed to reliance on models), or historicity (as opposed to systemic integration) of contributions to discourses in these disciplines [4]. These characteristics are expressive of the research paradigms and epistemologies employed in what is commonly termed the humanities: the importance of hermeneutics, source criticism, and nuanced, contextual meaning [5–7].
These more abstract characteristics lead to distinct practices such as the reliance on long-form publications (primarily the monograph), qualitative arguments, slower, editorially-heavy publishing processes, recursivity of discourses, critique, and the qualitative embedding of references [8–10].1 Moreover, the humanities live on a culture of debate, with the analysis and dialectic of interpretative understanding at its core [11]. Scholars in the humanities focus on "interpretation and critical evaluation, primarily in terms of the individual response and with an ineliminable element of subjectivity" [12] (p. 23). The resulting discourses are based on the power of arguments so that the "overall cogency of a substantial piece of work seems more closely bound up with the individual voice of its author" [13] (p. 75). Dissonance is essential, and there is no need for agreement in a discourse for it to be successful in scholarship. Scholars may never reach consensus; their arguments of disagreement are essential bits of knowledge production. All this makes critiquing and reinterpreting existing contributions to a discourse—thus, continually and recursively coming back to previous work—an integral part of the humanities.

A prerequisite for this is that scholarly communication practices enable such a culture of debate to flourish. A liberal understanding of scholarly practices enabled by free access to contributions, diversity of argument, and the intention of the author to contribute to discourses—as opposed to the intention to publish for the sake of authorship and reputation—can be supportive of such flourishing (as is discussed for the characteristics of the sciences in open science). There is, however, no discourse concerned with such elements that is dedicated to the here-offered characteristics of the humanities. In other words: though there is no one field of scholarly communication—but at least one for each cluster of scholarship—there is currently only one dedicated discourse on open research and scholarship, and this is open science.

1 There are variances in these, of course. As mentioned in paragraph one, such unifications are approximations at best, and so it needs to be stated that, for philosophy, for instance, the monograph has become less important.

2. Open Science, Open Humanities, and Digital Humanities

2.1. The Meaning of Discourse

There is a long and strong thread of—mostly scholarly—discourses on topics of openness in all forms: open source, open access, and open science in particular, and the digital means and discourse conventions enabling openness of scholarly communication in general. Moreover, discourse is a term with a difficult genealogy. Its meaning varies depending on the disciplinary and temporally-situated context. This necessitates differentiating what I mean by my claim for a dedicated discourse. Discourse here refers to a debate that includes various forms of textual or oral communication as contributions to a specific enquiry and body of knowledge. Such a discourse is both the intellectual construction of the object of enquiry and serves as a reference to the practices it is dedicated to (in discursive form). This notion does not refer to a Foucauldian conception of discourse that includes practices, but comes closer to the early Habermasian tradition that posits as discourses a rational exchange of communicative action. The conception of discourse in this article comes closest to that offered by Hyland [8].
It is the engagement of experts—or novices who aspire to become experts by contributing to discourses—in communication that is governed by conventions fixed temporally only by the very members of that discourse. Contributions to a discourse are accepted and negotiated by those who already partake in the discourse. On the one hand, this is based on peer review or editorial decision making and, on the other hand, on less standardised forms of communication, for instance, in blogs, at conferences, or on social media. The discourse, therefore, is constituted by the contributions to it, which are either accepted formally through selection or included informally through references to them. Crucially, such a notion of discourse links the communicative action happening within the discourse to the individuals who contributed to it. They form discourse communities and, as Hyland summarises, the ways the members of these communities "understand knowledge, what they take to be true, and how they believe such truths are arrived at, are all instantiated in a community's discourse conventions" [8] (p. 5).

A new discourse is rarely established by an artificial gathering of contributions but is formed organically by the need of enquiry or the lack of a coherent body of knowledge. Hence, proposing a discourse means demanding more enquiry to form a more coherent body of knowledge. Contributions to a later-formed discourse may already exist in disparate forms. As I will show in Sections 2.2 to 2.4, there are few instances where the object in question—open practices in the humanities—is already touched on. These instances are not linked, though, because of the lack of a dedicated discursive realm. Similarly, the discourse on open science was not opened as a discursive realm dedicated to open science. Its origins are widely spread, and contributions to it may stem from a wide array of other discourses that existed before open science was established (see Section 2.2 for a discussion of open science). A discourse is, thus, not merely the dedication of a new terminology. The term open humanities has been used before, but this does not mean that there is an open humanities discourse. Demanding such a discourse does not dismiss existing explorations of open practices in humanities disciplines but calls for a dedicated communicative realm where such enquiries have space to be taken on in an integrated and focussed manner.

Moreover, the members of a discourse community may have worked on practical implementations, thus, conversions of knowledge into practices, or on experimentation to induce practical knowledge. That these practices are concerned with the knowledge in action that may be part of a discourse does not mean that these practices are themselves components of that discourse. Only by means of textual reporting do the experiences of implementing practices or applying knowledge feed back into discourses, especially discourses governed by scholarship such as open science, digital humanities, or a potential open humanities. This distinction is essential to the ensuing discussion in this article.
Though there are or have been instantiations of practices in a form of open humanities—irrespective of these being called open humanities—these instantiations do not as such contribute to an open humanities discourse; they either form minor contributions to the digital humanities discourse (see Section 2.4), or they contribute to other, disparate discourses which are not linked to each other (as in the case of some of the elements such as preprints or liberal licensing discussed in Section 3). Already the existence of such practices, which are linked to each other through their integrated nature in the field, requires that the enquiries into these practices, their textual representations, and the knowledge of these practices are linked as well.

Thus, what a discourse is cannot be defined before its existence, as only the content of those elements of communication that contribute to the discourse determines its boundaries. With this in mind, I do not aim to define what open humanities needs to be in general or in detail, but demand that there is a discursive realm where potential elements can be linked and debated. These elements will then render the realm and define its boundaries. Open practices in the humanities could be an element of either open science or digital humanities discourses. There is no definition or guard preventing the uptake of this direction within the existing discourses. However, as I will demonstrate with the relevant literature in Section 2 and the discussion in Section 3, existing contributions to these discourses show that they are either conceptually unqualified (open science) or lack coherence and dedication (digital humanities).

2.2. Open Science

The term open science refers to the historical and contemporary practices and norms of open research and communication in disciplines of the sciences as well as to the discourse on these practices and norms. David [14] finds historical origins of open science practices in early developments of the then still less formal conduct of natural philosophical enquiry in the late sixteenth century—a time when there was not even a separation of clusters of scholarship into sciences and humanities. Vicente-Saez and Martinez-Fuentes [15] aim to determine an integrated definition for the proclaimed "disruptive phenomenon" that open science is and arrive at: "[o]pen [s]cience is transparent and accessible knowledge that is shared and developed through collaborative networks". In addition to having a primarily static definition of open science, they remain diffuse on what knowledge is. Madsen simply sees open science as a movement that "seeks to promote openness, integrity and reproducibility in [scientific] research" [16]. Fecher and Friesike [17] have a more wide-ranging approach to defining open science, including processes, infrastructures, measurement, and society outside institutionalised academia. Though they state in their introduction that open science is concerned with the "future of knowledge creation and dissemination", making no distinction here between clusters of disciplines, they only refer to scientists (next to politicians, citizens, or platform providers) when they discuss their open science schools of thought. They either presume that, due to epistemological distinctions, the humanities disciplines are not to be found in the realm of knowledge production, or they locate the humanities disciplines outside of any of their five schools of thought of open science.
Such a science-centred perspective is further reinforced by Friesike et al. in another study on the emergent field of research on contemporary openness in research [18]. A different approach to defining open science comes from a review conducted by Peters [19]. After reviewing the dimensions and some historical origins of thought about open science, Peters offers concluding remarks about the nature of thought that underscores open science and the broader philosophy of openness. What he does not do, though, is examine how this applies in specific research, which could have illuminated possible differences in the development of an open culture between disciplines of the humanities and the sciences. Similar approaches and shortcomings can be found in other articles concerned with open science (see [20] or [21]). Thus, it can be confirmed that open science is taken literally—science-related.

Open science is a concept for scientific research; the broader terminology also encompassing humanities and social science disciplines may be open scholarship, which, in short, means “opening the process of scholarship”, irrespective of discipline [22]. Thus, by definition, open scholarship includes all scholarship—irrespective of disciplinary specifics. But the above-mentioned characteristics of the humanities necessitate such differentiation. And with the humanities being in the process of opening communication and research practices as well, a dedicated space is required for debating these processes and emerging practices—one that complements open science but does not dissolve into the overly abstract open scholarship. Such a dedicated discourse should not be read as a demand to separate sciences and humanities or to reinforce a dichotomous perspective on scholarship, but as a reference to the unifying aspects of disciplines and their practices that allow for such a clustering. Moreover, what this overview of the definitions of open science shows is that it has shortcomings in addressing the humanities. It is not my objective to reduce open science to the scientific disciplines; it is, however, as the literature shows, the case that open science does not address the unique characteristics of the humanities, both terminologically and conceptually, making an open humanities discourse necessary.

2.3. The Necessity of a Discourse on Open Humanities

Arguments for the necessity of establishing such a dedicated discourse can be made in manifold ways: the humanities are lagging behind the sciences in the transformation towards openness; the humanities are but a by-product of open science due to the lack of a discourse of their own; the fragmentation of discourses about open practices in the humanities requires an integration of these smaller discourses into a single discourse (for instance, the connection of preprints and peer review as discussed in Section 3); there lies strength in the focussed, single voice of a discourse community such as (a potential) open humanities with which to address issues of policy and funding that are increasingly concerned with openness. The most pressing argument, however, comes from within scholarship of the contemporary humanities: the inadequacies of the current practices of scholarly communication require a systemic approach to finding new solutions. Dedicating a discursive realm termed open humanities to this quest does not necessarily mean that openness is to be taken as the only solution to these shortcomings.
But without a discourse analogous to that of open science, there is no realm within which the potential of openness as a solution can be determined for the humanities. Among the inadequacies of communication practices in the contemporary humanities are, for instance, the detrimental ways “peer reviewers criticize one another” [23]; the “great many unnecessary and inadequate publications” resulting from wrong incentives and evaluation mechanisms [13]; the fear of subjectivity that is inherent in judgements of quality, alongside the denial of subjectivity in quantitative measurement [24]; the funding and financial support structures that are unfit for a quicker uptake of open access in the humanities [25]; the debate around the “problem of value, transparency, and distributed financing of disciplinary activities” that arises because of the reluctance of learned societies to engage in more open processes [26]; and the poisoned paradigm of productivity and excellence, because of which scholarly communication is increasingly alienated [13,24,27,28]. Some of these issues are equally applicable to all disciplines, but by means of their discourse on openness, these issues are regularly addressed for the sciences (though not for the humanities).

The latest example of this is the recently published Ten Hot Topics around Scholarly Publishing by Tennant et al. [29]. Tennant et al.’s review article provides a useful guide to current debates in scholarly communication in the sciences; it is framed, however, as a review of scholarly publishing as a whole. This framing would make it necessary to include perspectives on the humanities, which it obviously lacks.2 Of the ten hot topics, only three are not focussed on journal articles, and only four are not primarily concerned with scientific literature. Especially those topics that take on issues of research quality, judgement, and objectivity do not register the profound differences between the communication practices of scientific and humanities disciplines. This makes both the choice of and the approach to the disputed topics a perpetuation of debates rooted in scientifically minded open scholarship practices—in short: open science. What, then, are hot topics in (open) humanities publishing?

A point can be made that the pressure towards openness is higher in scientific disciplines because of more extensive policy work. Indeed, open policy is a key element of open science as outlined in some of the conceptual frameworks such as Foster Open Science. This is less so the case in the humanities. The demand for openness in the humanities seems to be rooted in scholarship itself, whereas it is rooted in both scholarship and policies in the sciences. While this may affect the pace of implementing open practices in the humanities, it does not affect the necessity of having a discourse on these practices.

2.4. Open Access, Open Humanities, and Digital Humanities

Open access is one of the hot topics found in both the sciences and the humanities. But as opposed to issues such as peer review, preprints, or licences, open access in the humanities is well-established as a discourse—or rather: within the discourse on open access, distinctions are made between the sciences and the humanities [30–35].

2 As the authors state in the article, the choice of topics arose by means of a somewhat democratic process through a discussion on social media. The demos in this process, however, may have been unrepresentative of the humanities, resulting in these science-focussed ten topics.
Though the early uptake of open access took place in non-humanities disciplines, the early declarations on open access already included the humanities, with the Berlin Declaration explicitly being issued as the “Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities” [36] (own emphasis). Hence, unlike for other, complementary elements of open science, the humanities developed their own discourse on open access, often focussing on the technicalities of implementing more open forms of publishing. However, whereas open access to scientific literature is embedded in an established discourse on open science, open access to scholarly literature in the humanities often remains in quarantine, without a broader discursive framework—such as open humanities—within which it could be embedded. Even Kleineberg and Kaden, who enquire into the need for a concept of open humanities, cling to open access as the key issue already in their title (though they do refer to open research data and open review as well in their article) [37]. Kleineberg and Kaden’s contribution shows that, theoretically, a discourse on open humanities is in place. In practice, however, it seems to be mostly invisible, especially in comparison with open science and digital humanities, and in view of the fundamental changes that potential open practices in the humanities would mean for scholarship.

The slow uptake of open humanities may be due to an outsourcing of digitality in the humanities that has no equivalent in the scientific disciplines. Whereas digital methods are integral to the disciplines of the life, natural, or applied sciences, the humanities developed digital humanities to devise new methods empowered by digitality. In other words: while digitality is part of each scientific discipline and open science is the dedicated realm for debating open practices in the sciences, humanities scholars refer to digital humanities for digitality in humanities disciplines but lack a corresponding discourse of reference for open practices. The digital humanities appear with a twofold mission: applying digitality to support or help answer questions in traditional humanities disciplines, and exploring what it means to be human in an increasingly digital environment [38]. The former mission is transdisciplinary in nature, bridging the humanities with digital specialist realms. Because of this nature, though, it advances only what is already being enquired into in the humanities and may not lead to new epistemological efforts; or, as Gibbs and Owens argue, “[d]espite significant investment in digital humanities tool development, most tools have remained a fringe element in humanities scholarship” [39]. Digitality remains an add-on to the humanities, a set of tools and approaches that requires more focus and aspiration for genuine integration into the traditional disciplines [40]. The latter mission seems philosophical, or even sociological, in nature, touching on questions of digital hermeneutics and ontology. It may well be argued that this is a genuine task for philosophers or sociologists which only appears in the guise of a different discourse, termed digital humanities. Overall, this intermediary discipline may have the potential to gather scholars to form a community that drives a discourse on open humanities—only termed digital instead of open humanities.
This is questionable, though. Much of the discourse within digital humanities focuses on digital practices, applied methods, or projects, rather than on open practices comparable to open science.3 There are exceptions to this: authors who directly or indirectly analyse or discuss elements considered to be part of open science (or open humanities, respectively). Examples include Borgman, who discusses “publication practices, data, research methods, collaboration, incentives, and learning” for the “future of digital scholarship in the humanities” [41] (p. 5), thus describing essential prerequisites for open humanities. Bianco emphasises the “tremendous opportunities” that digital humanities have “to open up research modes, methods, practices, objects, narratives, locations of expertise, learning and teaching” [42], but fails to connect this to any of the larger contexts of open science/humanities or open scholarship. Borrelli looks at the distinction between digital practices and digitalised practices [43], touching on open access and peer review. Pritchard looks at an early, concrete version of open scholarly communication in the humanities, from which he deduces more general findings about pre-/post-prints, open access, and a digital, potentially open infrastructure in classical studies [44]. Kuhn and Hagenhoff analyse the requirements of digital monograph publishing and conclude with a decisively progressive, open potential for an outdated publishing model [45]. Fitzpatrick contributes comprehensive discussions of inadequate scholarly communication practices, for which openness is offered as the best way forward [46–48]. Cohen discusses digital processes as a possible solution to one of the fundamental problems scholarly communication in the humanities exhibits: the social contract that is actualised by traditional, institutional publishing [49]. Again, this article makes no reference to open humanities (or open science/open scholarship, respectively).

These contributions are starting points for debates. But their glaring shortcoming is that they are not integrated into a dedicated discourse; instead, they remain disintegrated and a minor concern next to the many contributions on methods and projects that appear in the digital humanities discourses. Integration here means bringing together the different ideas and enquiries to debate their interconnectedness and implementation in practice. Combined, the various open practices amount to a fundamental change of scholarship in the humanities, and to get scholars to engage with the shaping of these practices, there needs to be a single discourse of reference—one that is centred around open humanities and not mixed up with the various debates on methods presented in the digital humanities. In conclusion, then, the digital humanities have a topical agenda that is only to a lesser degree concerned with what would constitute an open humanities discourse. In the following section, I will look at some of the elements that such a discourse would be required to take on and reconfigure.

3 See, for instance, key journals: Digital Humanities Quarterly, Debates in the Digital Humanities, Digital Scholarship in the Humanities, ZfdG - Zeitschrift für digitale Geisteswissenschaften, Digital Studies / Le champ numérique, or the International Journal of Humanities and Arts Computing.
3. Discussion of Practices of Open Science and Their Applicability in the Humanities

The Foster Open Science taxonomy names a variety of elements, from open access and open data to open reproducible research and open science policies, tools, or guidelines, as first-order elements of open science [50]. Most of these elements branch further into second- or third-order elements, among which are some of the often-debated topics such as open metrics and impact or open peer review. This taxonomy comprises many terms explicitly connected to the sciences. As is apparent from the terminology, these elements are largely indifferent to the humanities: they are intended to advance, or improve, scholarship specifically in scientific disciplines by making it more open. I will discuss three key elements of the open science taxonomy with respect to their meaning and to how the distinction of practices requires a different understanding of these terms in the humanities.

3.1. Preprints in the Humanities

A preprint is a manuscript of an article, book, or chapter that is published in a dedicated online repository before it is formally published in a journal or as a book [51,52]. This quite general definition can be discussed in more detail with respect to its elements: the manuscript is usually one authored for the purpose of being published in a peer-reviewed journal or book; publication online in a repository means that the manuscript is published rather informally—i.e., not necessarily formatted according to a publisher’s guidelines—on an online server that functions as a repository for such manuscripts, optionally one specific to a discipline. This online publication is freely accessible, typically under a Creative Commons licence; the manuscript is likely to be formally published in a similar, but potentially revised, version later on. The ostensibly outdated term preprint derives from the idea that the manuscript is available for debate before its formal imprimatur in a journal or book, thus before it is printed—irrespective of this being done with ink or digitally.

This also clarifies the purpose of using preprints: they accelerate scholarship without compromising authorship and enable early debates [53,54]. The formal publishing process usually takes time, especially because of the standard pre-publication peer review [55]. During the time of this peer review, potential readers can already evaluate and judge the research, as it is publicly available as a preprint. At the same time, the author(s) can take advantage of the preprint because they have an early citable and timestamped manuscript while it is in review for a journal or book. These characteristics have a potentially positive impact on scientific scholarship, as Tennant et al. [29], Ginsparg [56], or Taubes [57] suggest. What these authors do not touch on is that this may be different in humanities disciplines, where preprints have as yet a much smaller presence than in scientific disciplines. There are far fewer preprint servers for humanities disciplines than for scientific disciplines [58], which can be explained by the much smaller output of publications in the humanities, their (sometimes) geographically limited importance, and perhaps the reluctance of humanities scholars to engage in progressive, digital publishing procedures in general.
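To make the notion of a citable, timestamped preprint concrete, the following minimal sketch shows the kind of metadata record a preprint server might expose. It is purely illustrative: the field names, the manuscript, and the identifier are assumptions for the sake of the example, not the schema of any actual repository.

    # Illustrative only: a minimal metadata record for a preprint,
    # showing the properties discussed above (early availability,
    # a citable identifier, a timestamp, and a licence).
    preprint_record = {
        "title": "Open Practices in the Humanities",   # hypothetical manuscript
        "authors": ["A. Author"],
        "identifier": "doi:10.0000/example.0001",      # assumed, citable DOI
        "version": 1,                                  # later revisions become v2, v3, ...
        "posted": "2019-10-08T12:00:00Z",              # timestamp establishing priority
        "licence": "CC BY 4.0",                        # freely accessible reuse terms
        "status": "under review at a journal",         # formal publication may follow
    }

    # The identifier and timestamp are what make the manuscript citable
    # and establish priority while formal peer review is still running.
    print(preprint_record["identifier"], preprint_record["posted"])

The versioning field in particular matters for the discussion that follows: once a revised version is formally published, at least two versions of the text remain available.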
Kleineberg and Kaden [37] discuss how there is no established culture of preprints in the humanities, even though publication processes there are particularly protracted. Laporte [59] identifies several challenges that stand in the way of a more established preprint culture in the humanities. One of these is that, unlike most discourses in the sciences, those of the humanities are still highly (geographically) regionally rooted and make use of a variety of languages and discursive norms instead of resorting to English as a lingua franca. This goes along with the reach of journals and book publishers, who have more individual, language-dependent audiences, which makes it much harder to have a single space for an international discourse. Another challenge is posed by the high share of monographs, as well as popular articles, in the overall output, which further divides the already considerably smaller body of humanities publications. These obstacles may well be overcome by highly specialised or well-indexed preprint servers; but beyond the conception in theory, these characteristics are one reason why it is difficult to find a critical mass for the uptake of active preprinting in the humanities.

Geltner, the founder of a preprint server for medieval studies [60], argues strongly in favour of preprints [61]. One of his key arguments is to encourage more curation, with editors actively reaching out to authors of compelling new works to convince them to publish with their journals. If new work were uploaded to preprint servers, the argument goes, this whole process of active curation would be strengthened. Preprints in the humanities are, for Geltner, not about “‘accelerating research,’ but rather protecting research as a curiosity-driven endeavor [sic]” [62]. It remains questionable whether such a form of curation is incentive enough for scholars to engage in authorship and preprinting, only then to be targeted by journal acquisition editors with publication advertisements. If such targeting led to publications in more prestigious journals, most authors would be encouraged to make the effort. But it may just as well be the rather less prestigious journals that would aim for more targeting—a scenario no discourse or author may wish for (irrespective of the fact that the concept of journal prestige is highly contested in the first place). Moreover, the purpose of preprints is to enable discussion for the time a manuscript is in review, not before it is submitted to a journal for review.

Another, as yet undebated, point can be made with reference to the nature of discourses particularly in the humanities, and to whether scholars in the humanities may be better served with more rapid formal publication, rather than formal publication remaining the same and preprints being instituted as a temporal placeholder. In disciplines of the sciences, especially in those that have a high uptake of preprints such as physics, formal publication is generally conceived to be, literally, the formal last step. At that point in time, the content of the manuscript is generally already known within the discourse community and has gone through debate [63]—just what preprints aim to facilitate. Other scholars have had the chance to incorporate and work with the knowledge drawn from the research reported in the preprinted manuscript. The purpose is clear—a sensible procedural step to accelerate scholarship during the time in which the manuscript is in review.
The downside is that this step reinforces the impression that both the formality of the publication and its closed review process are required. This impression may be falsely conjured, as there may not need to be a closed review process at all, in which case preprints would either not be necessary or the preprint would be the final publication itself (making the practice of preprinting redundant).4

Another characteristic may be even more imperative as an argument against preprinting in the humanities. It is based on the importance of the historicity of publications. This is different in the sciences, where the most recent version of a publication—if approved by peer review—is what matters: the up-to-date, reliable, and reproducible research reported in the publication counts, not the historicity and versioning of the publication. This is not the case in the humanities, which can be illustrated by an example. Consider that, by means of peer review, there is a change in the content from the preprinted manuscript to the final publication, for instance, in the line of argument or the references included. This change will have to be reflected in case a future author aims to discuss this publication. This future author may no longer merely write: author A claims that X; she will instead be required to write: author A claims that X and reviewer R adds that XY. If authors in the humanities adopted preprinting, there would always be at least two versions of an article available (provided the article got accepted after revision). This may sound trivial. But for disciplines that emphasise the importance of hermeneutics and source criticism, and where editorial history is a key concern, such details require attention. It is, therefore, necessary to debate whether a broader uptake of preprints in the humanities is desirable.

Counterarguments may be that both the author’s initial intention and the reviewer’s request for change may be reasonable, so that preprint and final publication should reflect such a process of change. Moreover, one may argue that the attribution of authorship of any publication is entangled already today; that is, reviewers can just as well be seen as co-authors—they help with the creation of the work, usually less rather than more substantially—only that this is opaque today. Preprints will merely make visible the difference between manuscript and publication; they will not change the fact that there is a difference. This, however, connects to our understanding of peer review, as it is this process, rather than preprints, that makes such co-creation and change opaque today. This leads directly to the next realm that requires attention for open humanities: open peer review.

4 For a popular example of a preprint as the only—thus final—publication, see [64–66].

3.2. Open Peer Review in the Humanities

Peer review refers to the practice whereby fellow scholars evaluate each other’s work. The resulting evaluation may be used by editors to provide guidance to authors so that they can improve the manuscript, and to make an informed decision about whether or not to publish. The practice’s institutionalisation originates in “debates over grant funding” in the 1970s “and has since been extended to cover a variety of processes by which academics formally evaluate each other’s work” [67] (p. 11). However, the process of peer refereeing is much older: it can be traced back to learned societies and their journals in the eighteenth century.
It developed as “distinctive editorial practices of learned societies [which] arose from the desire to create forms of collective editorial responsibility for publications which appeared under institutional auspices” [68] (p. 4). Since then, peer review has developed into an institutionalised practice, and it is a systemic gatekeeper today. Especially the systemic nature of this practice, captured by a paradigm of excellence, has led to the acceptance of peer review as a threshold of quality and authenticity, and to the assumption that, merely by organising peer review, publishing companies add value to the published material [67]. Some authors argue that peer review is essential to scholarship and its quality, as, for instance, Babor et al. do: “[t]he most important criterion for quality and integrity is the peer-review process, as overseen by a qualified journal editor and the journal’s editorial board” [69] (p. 51). Gatekeeping is seen as a matter of merit and scholarly obligation [70], one that is supposed to be—but often enough fails to be—a democratic process [8]. Finch argues that peer review contributes to the effectiveness of high-quality channels within the current communication system, in which researchers have “effective and high-quality channels through which they can publish and disseminate their findings, and that they perform to the best standards by subjecting their published findings to rigorous peer review” [25] (p. 17).

Such statements seem ignorant of the limitations of peer review in terms of quality. When refereeing is applied as an entry threshold to communication, it is a sorting mechanism, indeed a procedure of selection that filters content into diffusely constructed classes of quality; but it is not a criterion of quality as such. Moreover, the effectiveness of a practice can be questioned if that practice, on the one hand, withholds research for a long period in which it is inaccessible to other scholars while, on the other hand, only a selection of one to three fellow scholars is deemed worthy of judging the value of a manuscript for a discourse. It takes, on average, 17 weeks for eventually accepted papers to get through peer review; this period is longer than average in the social sciences and humanities, at 22–23 weeks [71]. Many journals in the humanities have “single figure acceptance rates” [8] (p. 168), meaning that the bulk of research is excluded from the discourse to which its authors wished to contribute it. Because of the hidden process, the reasoning behind both exclusion and inclusion is opaque to fellow scholars.

It is for these and other reasons that peer review is a contested practice across disciplines. Tennant et al. conclude that “debates surrounding the efficacy and implementation of peer review are becoming increasingly heated, and it is now not uncommon to hear claims that it is ‘broken’ or ‘dysfunctional’” [72]. Especially for disciplines in the sciences, this practice of gatekeeping and verification judgement is under close scrutiny [73–77]. On the other hand, some authors claim evidence of value in peer review [78], and in double-blind procedures in particular [79], for disciplines of the sciences. Others raise concerns and enquire into the options for opening this practice, claiming increased accountability and transparency [80] or proposing entirely new models such as a preprint-connected, collaborative open peer review [81].
Ross-Hellauer [82] discusses a variety of problems with peer review, most of which affect the realms of quality and credibility fundamentally. He notes inconsistent and “weak levels of agreement” among referees, questions the authority of their role as gatekeepers, and identifies the “‘black-box’ nature of traditional peer review” as a “[l]ack of accountability and risk [...] of subversion”. Most of all, the social component of peer reviewing—with biases based on gender, nationality, institutional affiliation, or language—is set against reviewers being “idealized as impartial, objective assessors”. Backed with (peer-reviewed) studies as evidence, Ross-Hellauer’s review arrives at a devastating conclusion for this practice. Early experiments such as Peters and Ceci’s [83] only add to the impression that modern peer review has long established itself as a contested gatekeeping practice instead of a process of collaborative improvement of research. This criticism calls into question how peer review really manages to democratise scholarship, or to conform to an objective enterprise—two of Robert Merton’s key principles in the sociology of science [84]—resulting in an inevitable debate about making this practice more open and transparent.

A concerted and concentrated debate is established within open science, and it is likewise necessary for the humanities, where there are differences in the practice. Being published in the humanities is much more connected to editorship, where peer reviewers provide the editor with a subjective understanding of the work. Decisions of acceptance or rejection are much more connected to interpretation and argument than to objectified principles. The name of the editor is closely tied to the value of the journal and the discourse it serves. Editors are “cultural intermediaries who bridge two worlds, insiders-outsiders with a foot in each camp” [10] (p. 45). It is no news, however, that scholars, especially in the humanities, judge and argue in manifold subjective ways [85], making it much harder to compare reviews so as to arrive at a desirable compromise in the process of gatekeeping. Being published under a publishing brand—be it a journal or a book series—is, thus, more than a question of abstract, objectified quality. Statements of quality are much harder to make in the humanities than in the sciences; it is, here, rather a question of consensus and agreement of reviewers or editors on a particular level of intelligibility. But if agreement is not the objective of humanities discourses, why should inclusion in a discourse be based on agreement? Making the peer review process more open by, for instance, publishing the reviews would not necessarily disturb the mechanism of gatekeeping, but it would make the terms of inclusion more transparent. These terms of inclusion may not be the decisive elements, though; the terms of exclusion are. And the culture of debate may require scholars to know about these terms as well. Moving the position of review from pre- to post-publication would profoundly change the purpose of reviewing from gatekeeping to improving, shifting the emphasis of this practice away from the journal towards the discourse.

Fitzpatrick argues in a similar manner, demanding more progress in the practice of and discourse on open peer review [47,48]. However, her approach is profoundly shaped by notions of the digital and digital humanities.
While she indeed writes about the subjectivity and qualitative representation of humanities scholarship, these arguments are not taken up by a larger discourse outside of digital humanities. The discourse on opening peer review should not be left to either the sciences or the digital humanities. It needs to be approached from within the humanities, where connections are drawn to other practices such as preprinting and making statements of value and judgement about scholarship.

3.3. Liberal Copyright Licences in the Humanities

Another discussion that stands in line with the elements discussed above concerns the applicability of liberal copyright licences. Licences are fundamental to open scholarship, as they are guiding principles for the practice of any form of open publishing as well as for the policy work that underpins these practices. Licences are not just crucial for potential readers, indicating how the published material can be accessed and used without consulting author or publisher. They are also responsible for a progressive understanding of authorship in which the author is not required to sign over copyright to the publisher.

CC BY5 is the licence most favoured by open access advocates and organisations (see, for instance, the discussion by Frosio [86] (p. 98); or [87,88]). The reasoning here is that reducing the restrictions offered by the Creative Commons licences (by means of NC, ND, SA)6 to BY means limiting the limitation on reuse of the publication simply to the attribution of authorship. In other words: “anything less introduces a barrier to the open progress of science” [30] (§19). While it is true that CC BY, as the “most liberal” licence, “imposes no limits on the use and reuse of material so long as the original source is acknowledged” [31], it is also true that there is a debate about whether such liberalism serves discourses in the humanities. A strong position in this matter is held by Mandler, who has repeatedly voiced his concern about overly liberal licences—though supporting open access in general [34]. In an interview he claims that “‘reuse’ under CC BY authorises practices that we call plagiarism in academic life. I know advocates of CC BY dislike the use of this word, but it is a good word to describe the practice of copying and altering words without specifying how they are altered” [89]. This aligns with what Morrison discusses, criticising that a “poor translation could have a negative impact on a scholar’s reputation, whether through the quality of the writing or through other scholars misquoting an inaccurate translation” [90] (p. 54). This is essentially a problem of the humanities: not only because the humanities live on a variety of languages, making translations a regular necessity, but even more so because it is in the humanities that the nuances of argument and expression matter. Both these issues are much less present in disciplines of the sciences. Therefore, the claim made by Morrison and Mandler seems legitimate. In the daily conduct of discourses, however, I think it is not the licence that is decisive for this form of misrepresentation. It is rather good or bad scholarly convention, and thus practice, that is responsible. Translations are made by other scholars, as are discussions and reviews of works cited. The basis of such scholarly practices should be the willingness not to misrepresent a fellow scholar and her argument; if this fails, no amount of altering licences will help maintain scholarly integrity.
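For readers less familiar with the licence variants under discussion, the following small sketch enumerates the standard Creative Commons licence combinations built from BY plus the optional NC, ND, and SA modules. It is purely illustrative and encodes only the well-known constraint that ND and SA cannot be combined, since ShareAlike governs derivatives, which NoDerivatives rules out entirely.

    # Illustrative sketch: enumerate the six standard Creative Commons
    # licences built from BY plus the optional NC, ND, and SA modules.
    from itertools import combinations

    MODULES = ["NC", "ND", "SA"]  # NonCommercial, NoDerivatives, ShareAlike

    def standard_cc_licences():
        licences = []
        for r in range(len(MODULES) + 1):
            for combo in combinations(MODULES, r):
                # ND and SA are mutually exclusive: ShareAlike applies to
                # derivatives, which NoDerivatives forbids in the first place.
                if "ND" in combo and "SA" in combo:
                    continue
                licences.append("CC BY" + "".join(f"-{m}" for m in combo))
        return licences

    print(standard_cc_licences())
    # ['CC BY', 'CC BY-NC', 'CC BY-ND', 'CC BY-SA', 'CC BY-NC-ND', 'CC BY-NC-SA']

The debate discussed above turns on where, within this spectrum, publications in the humanities should sit: whether the most liberal end (CC BY) or a more restrictive combination better serves their discourses.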
One may wish to publish only under an ND licence to pre-emptively avoid misrepresentation and still be wrongly represented in discussions and reviews of the published work, or even incorrectly connected to a different argument altogether by means of inattentive referencing. If the integrity of the discourse practices in scholarship is not held high, any pre-emptive steering through licences is futile. If that integrity is in place, however, steering through licences is not necessary in the first place, so that the more liberal licence CC BY may well be suitable for the humanities, too. The suitability of this practice needs to be integrated into a larger discourse that connects it to other elements of open humanities, especially open access and open data (infrastructures), enabling policy workers to draw on it to make informed decisions. Without such an integrated discourse, policy workers can draw on only minor, fragmented opinions about this matter, or must resort to open science altogether.

5 Creative Commons Attribution only licence.
6 Creative Commons Attribution licence with the added restrictions (possible in combinations): NC—NonCommercial; ND—NoDerivatives; SA—ShareAlike.

4. Conclusion

It seems to be harder for the humanities to speak with a single voice than it is for the sciences. But this must not be taken to mean that the humanities do not need a discourse on opening their research and communication practices. The three elements discussed indicate the necessity of having this discourse; the lack of a coherent, focussed version of it may harm the progress of scholarship in the humanities in an—arguably inevitable—digital future. Unlike in the sciences, where the dialectic between the open science discourse and the advancement of more open science practices results in positive progress, the issues in question in the humanities are dispersed into often isolated, disintegrated niche discourses. Especially the close connection between preprinting and open peer review indicates the need for an integrated, rather than a fragmented, discourse. Similar claims of interconnectedness can be made, for instance, for open data and open reproducible research, open evaluation and open metrics and impact, or open access and popular humanities communication.7

Another form of interconnectedness concerns book reviews and review articles, which may well be called forms of post-publication peer review. In this sense, practices in the humanities include forms of openness already. However, those review publications are at times excluded from debates on open access since, according to the principles of open science, it is primarily original research that is to be published open access. Yet reviewing, and thus debating, scholarship is integral to discourses in the humanities. To sufficiently debate and find solutions for such an issue, humanities scholars and scholarly communication experts need an integrated discourse that brings these elements of open access and open post-publication peer review together. In its current, disintegrated form, the issues will always run into errors of implementation and remain unrelatable to scholars in the traditional humanities disciplines. Open access alone—likely the most prominent element of open science—is well discussed for publishing practices in the humanities.
And there are other practices that, out of the requirements of particular scholarship in the humanities, drive individual, smaller discourses that showcase the need for understanding and advancing practice in this field. Opening access to the data of humanities scholarship and building sustainable infrastructures is one such practice. Similar to many other elements of open science that are assumed to be applicable to the humanities, however, it must be understood that “[t]he concept of research data comes from the sciences, and can only be transferred to the traditional scholarly methods of the humanities to a limited degree” [91]. Though questions of publishing data, their infrastructures, and degrees of openness are partly integrated into discourses of the digital humanities [41,92,93], they are not connected paradigmatically to a broader open humanities. But facing a digital future, the humanities need to be open to interdisciplinary knowledge transfer in this realm [94]. This can also be seen in the social sciences, where Herb, for instance, examined various elements of open science and their development in sociology, eventually concluding that “[t]he open knowledge culture is not widespread in sociology” [95] (p. 419; translated by the author).8 As stated before, some disciplines and research projects of the social sciences may well be in the realm of the sciences and can, thus, be captured by open science. For those closer to the humanities, this cannot be assumed, especially because their scholarly communication practices are distinct.

Most of all, it can be stated with certainty that the crisis of scholarly communication is not simply one of the sciences. Even though the crisis may have a similar origin, the discipline-specific developments and their potential solutions may not be the same. The origin of this crisis may be seen in the paradigm of false productivity, excellence, and pressure to publish. Due to the differences in practices, this has led to dissimilar problems, which, for the humanities, are concisely summarised by Rosa:

I am firmly convinced that, at least in the social sciences and the humanities, there is, at present, hardly a common deliberation about the convincing force for better arguments, but rather a non-controllable, mad run rush for more publications, conferences and research-projects the success of which is based on network-structures rather than on argumentational force [96] (p. 55).

More transparency and openness that rid authorship of its trappings of formality and reputation may serve as a solution.

7 Humanities communication does not exist as a term. This is another curious instance where there is a term science communication without a comparable counterpart for the humanities. This is the case either because, or although, practices such as publishing in popular media are integral to humanities scholarship [13]. It can be argued that there is no need for such a term because of the integral nature of such publishing practice. The fact that there is no such term and discourse, however, should not lead to the assumption that there is no associated practice. The sciences only seem to be more communicative due to their discourse on and practice of science communication; the humanities conduct this communication integratively.

8 Original: ‘Die Kultur des offenen Wissens ist in der Soziologie nicht verbreitet.’
But the discursive space in which scholars debate the specifics of this solution must not be fragmented and dispersed into niche contributions. There may well be individual contributions on what openness means in the humanities or why it may be beneficial. This is especially true in the digital humanities discourse. But, as discussed above, the digital humanities are focussed on methods much more than on open practices. Yet scholars in the humanities need a voice to shape their digital, open future. This needs to be a transdisciplinary space, just as digital humanities is one; only that it needs to be driven from within the humanities (where digital humanities seems to be driven rather by technology), and to be dedicated and focussed on the opening of practices (where the digital humanities are primarily concerned with methods and projects). This process may start with disciplinary discourses in individual humanities disciplines, for instance, at conferences or in special issues of dedicated journals. The importance of an open humanities discourse will be to bring these threads together and to serve as a dedicated reference for what open practices mean to humanities scholars and what the best practices, their problems, and their implementations are. Open science achieves this for the sciences. There is nothing comparable for the humanities.

Notwithstanding, such a space should not be taken as a discourse in which openness is taken for granted. As the discussion above shows, not everything that can be opened necessarily works in favour of the knowledge production of its disciplines. It may just as well be the case that the ends of practices, instead of the practices themselves, are ripe for change, so that merely reconfiguring processes may not lead to positive progress. Controversial as this may seem, there needs to be a dedicated discourse on it that brings together the variety of currently disconnected endeavours and proficiencies. Open humanities may serve as a namespace for this.

Funding: The author received funding from the Arts & Humanities Research Council through the London Arts & Humanities Partnership.

Conflicts of Interest: The author declares no conflict of interest.

References

1. Abbott, A. Chaos of Disciplines; University of Chicago Press: Chicago, IL, USA, 2007.
2. Kagan, J. The Three Cultures. Natural Sciences, Social Sciences, and the Humanities in the 21st Century; Cambridge University Press: Cambridge, UK, 2009.
3. Snow, C.P. The Two Cultures and the Scientific Revolution; Cambridge University Press: Cambridge, UK, 1960.
4. Beiner, M. Humanities. Was Geisteswissenschaft macht. Und was sie ausmacht; Berlin University Press: Berlin, Germany, 2009.
5. Bod, R. A New History of the Humanities; Oxford University Press: Oxford, UK, 2013.
6. Daston, L. Objectivity and Impartiality: Epistemic Virtues in the Humanities. In The Making of the Humanities, Volume 2, From Early Modern to Modern Disciplines; Bod, R., Maat, J., Weststeijn, T., Eds.; Amsterdam University Press: Amsterdam, The Netherlands, 2012; pp. 27–41.
7. Hamann, J. Die Bildung der Geisteswissenschaften. Zur Genese einer sozialen Konstruktion zwischen Diskurs und Feld; Herbert von Halem Verlag: Köln, Germany, 2014.
8. Hyland, K. Academic Publishing. Issues and Challenges in the Construction of Knowledge; Oxford Applied Linguistics; Oxford University Press: Oxford, UK, 2015.
9. Steiner, F. Dargestellte Autorschaft. Autorkonzept und Autorsubjekt in wissenschaftlichen Texten; Reihe Germanistische Linguistik 282; Niemeyer: Tübingen, Germany, 2009.
10. Thompson, J.B. Books in the Digital Age. The Transformation of Academic and Higher Education Publishing in Britain and the United States; Polity: Cambridge, UK, 2005.
11. Hösle, V. Kritik der verstehenden Vernunft. Eine Grundlegung der Geisteswissenschaften; C.H. Beck: Munich, Germany, 2018.
12. Small, H. The Value of the Humanities; Oxford University Press: Oxford, UK, 2013.
13. Collini, S. What are Universities for?; Penguin: London, UK, 2012.
14. David, P.A. The Historical Origins of 'Open Science': An Essay on Patronage, Reputation and Common Agency Contracting in the Scientific Revolution. Capital. Soc. 2008, 3, 5.
15. Vicente-Saez, R.; Martinez-Fuentes, C. Open Science now: A Systematic Literature Review for an Integrated Definition. J. Bus. Res. 2018, 88, 428–436.
16. Madsen, R.R. Scientific Impact and the Quest for Visibility. FEBS J. 2019, doi:10.1111/febs.15043.
17. Fecher, B.; Friesike, S. Open Science: One Term, Five Schools of Thought. In Web 2.0 for Scientists and Science 2.0; Springer: Vienna, Austria, 2013; pp. 17–47.
18. Friesike, S.; Widenmayer, B.; Gassmann, O.; Schildhauer, T. Opening Science: Towards an agenda of Open Science in academia and industry. J. Technol. Transf. 2015, 40, 581–601.
19. Peters, M.A. Openness, Web 2.0 Technology, and Open Science. Policy Futures Educ. 2010, 8, 567–574.
20. Lahti, L.; da Silva, F.; Laine, M.; Lähteenoja, V.; Tolonen, M. Alchemy & algorithms: Perspectives on the philosophy and history of open science. RIO 2017, 3, doi:10.3897/rio.3.e13593.
21. McKiernan, E.C.; Bourne, P.E.; Brown, C.T.; Buck, S.; Kenall, A.; Lin, J.; McDougall, D.; Nosek, B.A.; Ram, K.; Soderberg, C.K.; et al. How Open Science helps researchers succeed. eLife 2016, 5, doi:10.7554/eLife.16800.
22. Katz, D.S.; Allen, G.; Barba, L.A.; Berg, D.R.; Bik, H.; Boettiger, C.; Borgman, C.L.; Brown, C.T.; Buck, S.; Burd, R.; et al. The Principles of Tomorrow's University. F1000Research 2018, 7, 1926, doi:10.12688/f1000research.17425.1.
23. Crane, T. The philosopher’s tone. The Times Literary Supplement. 2018. Available online: https://www.the-tls.co.uk/articles/public/philosophy-journals-review/ (accessed on 8 October 2019).
24. Brink, C. The Soul of a University. Why Excellence is not enough, 1st ed.; Bristol University Press: Bristol, UK, 2018.
25. Finch, J. Accessibility, Sustainability, Excellence: How to Expand Access to Research Publications. 2013, doi:10.2436/20.1501.01.187.
26. Eve, M.P. Learned Societies, Open Access and Budgetary Cross-Subsidy. Available online: https://eve.gd/2019/09/17/learned-societies-open-access-and-budgetary-cross-subsidy/ (accessed on 27 September 2019).
27. Sperlinger, T.; McLellan, J.; Pettigrew, R. Who are Universities for? Re-making Higher Education, 1st ed.; Bristol University Press: Bristol, UK, 2018.
28. Moore, S.; Neylon, C.; Eve, M.P.; O’Donnell, D.P.; Pattinson, D. “Excellence R Us”: University Research and the Fetishisation of Excellence. Palgrave Commun. 2016, 3, 16105.
29. Tennant, J.P.; Crane, H.; Crick, T.; Davila, J.; Enkhbayar, A.; Havemann, J.; Kramer, B.; Martin, R.; Masuzzo, P.; Nobes, A.; et al. Ten Hot Topics around Scholarly Publishing. Publications 2019, 7, 34, doi:10.3390/publications7020034.
30. Moore, S. A Genealogy of Open Access: Negotiations between Openness and Access to Research. Rev. Fr. Sci. Inf. Commun. 2017, doi:10.4000/rfsic.3220.
31. Crossick, G. Monographs and open access. Insights: UKSG J. 2016, 29, doi:10.1629/uksg.280.
32. Eve, M.P. Open Access and the Humanities; Cambridge University Press: Cambridge, UK, 2014.
33. Jubb, M. Academic Books and their Futures: A Report to the AHRC and the British Library; AHRC/British Library: London, UK, 2017.
34. Mandler, P. Open Access for the Humanities: Not for Funders, Scientists or Publishers. J. Vic. Cult. 2013, 18, 551–557.
35. Mandler, P. Open Access: A Perspective from the Humanities. Insights: UKSG J. 2014, 27, 166–170, doi:10.1629/2048-7754.89.
36. Berlin Declaration. Available online: https://openaccess.mpg.de/Berlin-Declaration (accessed on 8 October 2019).
37. Kleineberg, M.; Kaden, B. Open Humanities? ExpertInnenmeinungen über Open Access in den Geisteswissenschaften. LIBREAS. Libr. Ideas 2017. Available online: https://libreas.eu/ausgabe32/kleineberg/ (accessed on 8 October 2019).
38. Gardiner, E.; Musto, R.G. The Digital Humanities. A Primer for Students and Scholars; Cambridge University Press: Cambridge, UK, 2015.
39. Gibbs, F.; Owens, T. Building Better Digital Humanities Tools: Toward Broader Audiences and User-Centered Designs. Digit. Humanit. Q. 2012. Available online: http://www.digitalhumanities.org/dhq/vol/6/2/000136/000136.html (accessed on 8 October 2019).
40. Bod, R. Who’s Afraid of Patterns?: The Particular versus the Universal and the Meaning of Humanities 3.0. BMGN—Low Ctries Hist. Rev. 2013, 128, 171–180.
41. Borgman, C.L. The Digital Future is Now: A Call to Action for the Humanities. Digit. Humanit. Q. 2010. Available online: http://digitalhumanities.org/dhq/vol/3/4/000077/000077.html (accessed on 8 October 2019).
42. Bianco, J. This Digital Humanities Which Is Not One. In Debates in the Digital Humanities; Gold, M.K., Ed.; University of Minnesota Press: Minneapolis, MN, USA, 2012.
43. Borrelli, A. Wissenschaftsgeschichte zwischen Digitalität und Digitalisierung. Z. Digit. Geisteswiss. 2018, doi:10.17175/sb003_001.
44. Pritchard, D. Working Papers, Open Access, and Cyber-infrastructure in Classical Studies. Lit. Linguist. Comput. 2008, 23, 149–162.
45. Kuhn, A.; Hagenhoff, S. Nicht geeignet oder nur unzureichend gestaltet? Digitale Monographien in den Geisteswissenschaften. Z. Digit. Geisteswiss. 2019, doi:10.17175/2019_002.
46. Fitzpatrick, K. Planned Obsolescence. Publishing, Technology, and the Future of the Academy; New York University Press: New York, NY, USA, 2011.
47. Fitzpatrick, K. Peer Review, Judgment, and Reading. Profession 2011, 2011, 196–201.
48. Fitzpatrick, K. Beyond Metrics: Community Authorization and Open Peer Review. In Debates in the Digital Humanities; Gold, M.K., Ed.; University of Minnesota Press: Minneapolis, MN, USA, 2012.
49. Cohen, D.J. The Social Contract of Scholarly Publishing. In Debates in the Digital Humanities; Gold, M.K., Ed.; University of Minnesota Press: Minneapolis, MN, USA, 2012.
50. Foster Open Science Taxonomy. Available online: https://www.fosteropenscience.eu/foster (accessed on 18 September 2019).
51. Neylon, C.; Pattinson, D.; Bilder, G.; Lin, J. On the origin of nonequivalent states: How we can talk about preprints. F1000Research 2017, 6, doi:10.12688/f1000research.11408.1.
52. Tennant, J.; Bauin, S.; James, S.; Kant, J. The Evolving Preprint Landscape: Introductory Report for the Knowledge Exchange Working Group on Preprints. MetaArXiv 2018, doi:10.31222/osf.io/796tu.
53. Crick, T.; Hall, B.; Ishtiaq, S. Reproducibility in Research: Systems, Infrastructure, Culture. J. Open Res. Softw. 2017, 5, 32.
54. Vale, R.D.; Hyman, A.A. Priority of discovery in the Life Sciences. eLife 2016, 5, doi:10.7554/eLife.16931.
55. Powell, K. Does it take too long to publish research? Nat. News 2016, 530, 148.
56. Ginsparg, P. Preprint Déjà Vu. EMBO J. 2016, 35, 2620–2625.
57. Taubes, G. Electronic Preprints Point the Way to ‘Author Empowerment’. Science 1996, 271, 767.
58. OSF. Preprint Archive Search on Open Science Framework. Available online: https://osf.io/preprints/discover (accessed on 8 October 2019).
59. Laporte, S. Preprint for the Humanities—Fiction or a real possibility? SocArXiv 2016, doi:10.31235/osf.io/jebhy.
60. Anonymous. BodoArXiv Preprints: Open Repository for Medieval Studies. Available online: https://osf.io/preprints/bodoarxiv/ (accessed on 25 May 2019).
61. Geltner, G. Long live the curator! Available online: https://www.scienceguide.nl/2018/12/long-live-the-curator/ (accessed on 8 October 2019).
62. Geltner, G. Why Arts & Humanities Scholars Should Care About Preprints. Available online: http://www.guygeltner.net/blog/372018why-arts-humanities-scholars-should-care-about-preprints (accessed on 8 October 2019).
63. Delfanti, A. Beams of Particles and Papers: How Digital Preprint Archives Shape Authorship and Credit. Soc. Stud. Sci. 2016, 46, 629–645.
64. Perelman, G. The entropy formula for the Ricci flow and its geometric applications. arXiv 2002. Available online: https://arxiv.org/abs/math/0211159 (accessed on 8 October 2019).
65. Perelman, G. Ricci flow with surgery on three-manifolds. arXiv 2003. Available online: https://arxiv.org/abs/math/0303109 (accessed on 8 October 2019).
66. Perelman, G. Finite extinction time for the solutions to the Ricci flow on certain three-manifolds. arXiv 2003. Available online: https://arxiv.org/abs/math/0307245 (accessed on 8 October 2019).
67. Fyfe, A.; Coate, K.; Curry, S.; Lawson, S.; Moxham, N.; Røstvik, C.M. Untangling Academic Publishing: A History of the Relationship between Commercial Interests, Academic Prestige and the Circulation of Research, 2017. Available online: https://zenodo.org/record/546100/files/UntanglingAcPub.pdf (accessed on 8 October 2019).
68. Moxham, N.; Fyfe, A. The Royal Society and the Prehistory of Peer Review, 1665–1965 (accepted manuscript/author version). Hist. J. 2018, 61, doi:10.1017/S0018246X17000334.
69. Babor, T.F.; Stenius, K.; Pates, R.; Miovský, M.; O’Reilly, J.; Candon, P. Publishing Addiction Science. A Guide for the Perplexed; Ubiquity Press: London, UK, 2017.
70. Caputo, R.K. Peer Review: A Vital Gatekeeping Function and Obligation of Professional Scholarly Practice. Fam. Soc. 2019, 100, 6–16, doi:10.1177/1044389418808155.
71. Huisman, J.; Smits, J. Duration and Quality of the Peer Review Process: The Author’s Perspective. Scientometrics 2017, 113, 633–650, doi:10.1007/s11192-017-2310-5.
72. Tennant, J.P.; Dugan, J.M.; Graziotin, D.; Jacques, D.C.; Waldner, F.; Mietchen, D.; Elkhatib, Y.; Collister, L.B.; Pikas, C.K.; Crick, T.; et al. A Multi-Disciplinary Perspective on emergent and future innovations in Peer Review. F1000Research 2017, 6, doi:10.12688/f1000research.12037.3.
73. Crane, H.; Ryan, M. In peer review we (don't) trust: How peer review's filtering poses a systemic risk to science. RESEARCHERS.ONE 2018. Available online: https://www.researchers.one/article/2018-09-17 (accessed on 8 October 2019).
74. Ferguson, C.; Marcus, A.; Oransky, I. Publishing: The Peer-Review Scam. Nature 2014, 515, 480–482.
75. Smith, R. Peer Review: A Flawed Process at the Heart of Science and Journals. J. R. Soc. Med. 2006, 99, 178–182.
76. Stephan, P.; Veugelers, R.; Wang, J. Reviewers are blinkered by bibliometrics. Nat. News 2017, 544, 411.
77. Tennant, J.P. The state of the art in peer review. FEMS Microbiol. Lett. 2018, 365, doi:10.1093/femsle/fny204.
78. Siler, K.; Lee, K.; Bero, L. Measuring the effectiveness of scientific gatekeeping. Proc. Natl. Acad. Sci. U.S.A. 2015, 112, 360–365.
79. Tomkins, A.; Zhang, M.; Heavlin, W.D. Reviewer bias in single- versus double-blind peer review. Proc. Natl. Acad. Sci. U.S.A. 2017, 114, 12708–12713.
80. van Rooyen, S.; Godlee, F.; Evans, S.; Black, N.; Smith, R. Effect of open peer review on quality of reviews and on reviewers' recommendations: A randomised trial. BMJ (Clinical Research ed.) 1999, 318, 23–27.
81. Perakakis, P.; Taylor, M.; Mazza, M.; Trachana, V. Natural selection of academic papers. Scientometrics 2010, 85, 553–559, doi:10.1007/s11192-010-0253-1.
82. Ross-Hellauer, T. What is Open Peer Review? A Systematic Review. F1000Research 2017, 6, doi:10.12688/f1000research.11369.2.
83. Peters, D.P.; Ceci, S.J. Peer-review Practices of Psychological Journals: The fate of published articles, submitted again. Behav. Brain Sci. 1982, 5, 187–195.
84. Merton, R.K. The Normative Structure of Science. In The Sociology of Science: Theoretical and Empirical Investigations; Merton, R.K., Ed.; University of Chicago Press: Chicago, IL, USA, 1973.
85. Lamont, M. How Professors Think: Inside the Curious World of Academic Judgment; Harvard University Press: Cambridge, MA, USA, 2009.
86. Frosio, G. Open Access Publishing: A Literature Review; Center for Copyright and New Business Models (CREATe): Glasgow, UK, 2014.
87. Neylon, C. Open Access must enable open use. Nature 2012, 492, 348–349.
88. Suber, P. Strong and weak OA. Available online: http://legacy.earlham.edu/~peters/fos/2008/04/strong-and-weak-oa.html (accessed on 23 September 2019).
89. Poynder, R. The OA Interviews: Peter Mandler. Available online: https://poynder.blogspot.com/2018/12/the-oa-interviews-peter-mandler.html (accessed on 23 September 2019).
90. Morrison, H.G. Freedom for scholarship in the internet age. Available online: http://summit.sfu.ca/item/12537 (accessed on 23 September 2019).
91. Cremer, F.; Klaffki, L.; Steyer, T. Der Chimäre auf der Spur: Forschungsdaten in den Geisteswissenschaften. o-bib. Das offene Bibliotheksjournal 2018, 5, 142–162.
92. Brehm, E.; Neumann, J. Anforderungen an Open-Access-Publikation von Forschungsdaten—Empfehlungen für einen offenen Umgang mit Forschungsdaten. o-bib. Das offene Bibliotheksjournal 2018, 5, 1–16.
93. Lemaire, M. Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften. o-bib. Das offene Bibliotheksjournal 2018, 5, 237–247.
94. Arnold, T.; Tilton, L. New Data? The Role of Statistics in DH. In Debates in the Digital Humanities 2019; Gold, M.K., Klein, L.F., Eds.; University of Minnesota Press: Minneapolis, MN, USA, 2019.
95. Herb, U. Open Science in der Soziologie. Eine interdisziplinäre Bestandsaufnahme zur offenen Wissenschaft und eine Untersuchung ihrer Verbreitung in der Soziologie; Schriften zur Informationswissenschaft 67; Hülsbusch: Glückstadt, Germany, 2015.
96. Rosa, H. Alienation and Acceleration. Towards a Critical Theory of Late-Modern Temporality; NSU Press: Malmö, Sweden, 2010.

© 2019 by the author. Licensee MDPI, Basel, Switzerland.
work_2upg4spnnvcebki5mzn4y5git4 ----

Honing, H. (2007). The role of ICT in music research: a bridge too far? International Journal of Humanities and Arts Computing, 1(1), 61-69. https://doi.org/10.3366/E1753854808000104

THE ROLE OF ICT IN MUSIC RESEARCH: A BRIDGE TOO FAR?

HENKJAN HONING

Introduction

While the widespread availability of the computer and Internet has undoubtedly had a major influence on our society, it is less clear what its impact has been on research in the humanities. Critics blame humanities scholars for conservatism,1 preferring paper, pen and handwork over novel technological gadgets. Others see the use of computer technologies in the humanities as mainly restricted to, but well put to use in, applications like the digital library.2 Although the latter is an important example of information and communication technology (ICT), it is unclear what actual impact ICT has had on the research methods used and the research questions posed. Given the observation that for most humanities scholars the use of ICT has not progressed beyond word-processing, using email, and browsing the web,3 one could argue that, apparently, there is no real need for more advanced uses of ICT, and hence that its impact on humanities research might well be negligible. However, in some specific areas of the humanities, including archeology, linguistics, media studies and music, ICT has allowed new research questions and new methodologies to emerge. In this paper, I will focus on the role of ICT in music research, especially the influence it has had on the development of the fields of empirical and cognitive musicology.

musicology - and beyond

Musicology is a relatively young discipline, with its current architecture largely shaped by Guido Adler in the 19th century.4 He divided musicology into two major fields of historical and systematic musicology.
As the naming suggests, historical musicology is concerned with the history of music, arranged by epochs, peoples, empires, countries, provinces, towns, schools, and individual artists, using historiographic methods. Systematic musicology is concerned with the investigation of the chief laws applicable to the various branches of music, aesthetics, the psychology of music, music education, and the comparative study of ethnography and folklore. The latter being the category 'miscellaneous', one could say. While in Adler's time the study of music was restricted to a small elite of music experts, nowadays scholars and scientists from psychology,5 sociology,6 cognitive science,7 cultural theory,8 and even archeology9 also consider music an interesting and important domain to investigate. And they cannot be blamed. Music is at least as multi-faceted as language. However, language as a research topic has attracted considerably more research than music, and one can seriously question why the field of musicology did not grow as much as linguistics did in the last fifty years. Different possible explanations come to mind. One could be that musicology is indeed what some of its critics say: a relatively conservative discipline that studies the cultural and historical aspects of music using familiar descriptive and critical methods, hence leaving out all topics Adler labeled as the systematic field. Another, more attractive explanation could be that musicologists are simply not (yet) equipped with the appropriate knowledge and tools to study music in a truly systematic way. Although most musicologists base their work on texts and scores (using paleographic and philological methods), alternative methods are needed for music with no notation or score (such as the larger proportion of music around the world) or for music in which the actual sound is a more relevant source of information (e.g., electronic music genres ranging from musique concrète to drum & bass). Ethnomusicologists were confronted with this situation early on,10 and it prompted the adoption of methodologies from disciplines like physics (e.g., measurement), psychology (e.g., controlled experiments), sociology (e.g., interview techniques) or anthropology (e.g., participant observation). It is the use of methods from other disciplines that, in my opinion, might have been the cause of some delay in the development of musicology as a field, since mastering this wide variety of methodologies, most of which are never really touched upon in the curriculum of the humanities, is not an easy task. Fortunately, in the last two decades it became clear what the methodological toolbox for musicologists could be.11 In the next section two recent strands of musicological research will be discussed – empirical and cognitive musicology – that can serve as an example of the growing role of ICT, measurement, and experimental method in musicology. Both perspectives will be illustrated with an example of recent research.
the role of observation: empirical musicology

Empirical musicology, or 'new empiricism' as the musicologist David Huron calls it,12 grew out of a desire to ground theories on empirical observation and to construct theories on the basis of the analysis and interpretation of such observations.13 It came with the revival of scientific method promoting the pursuit of evidence and rigorous method, after a period of considerable criticism of scientific method in the postmodern literature.14 The arrival of new technologies, most notably that of MIDI15 and of the personal computer, was instrumental to the considerable increase in the number of empirically oriented investigations into music.16 This increase in empirical research is also apparent in the founding of several new scientifically oriented journals, including Psychology of Music (1973), Empirical Studies in the Arts (1982), Music Perception (1983), Musicae Scientiae (1997), and most recently Empirical Music Review (2006).

A seminal example of this development is a study by Nicholas Cook on the well-known conductor Wilhelm Furtwängler (1886–1954).17 This study was prompted by a longstanding disagreement between two music scholars: Paul Henry Lang, who was a record critic for High Fidelity magazine in the late 1960s, and Peter Pirie, a musicologist and author of Furtwängler and the art of conducting (1980).18 According to Lang, Furtwängler was a 'dyed-in-the-wool romantic, favoring arbitrary and highly subjective procedures in tempo, dynamics and phrasing', with the word 'arbitrary' referring to Furtwängler's inability to keep a steady tempo.19 Peter Pirie could not disagree more with Lang's characterization of Furtwängler's conducting. For Pirie, the way Furtwängler performed Beethoven was anything but arbitrary: he considered Furtwängler's 'flexible declamation' a fundamental aspect of his conducting style.20 Such an argument is a typical example and result of a critical approach to the study of art, an approach that often results in unresolved differences in interpretation, even when, at least for some research questions, this is not needed at all.

Cook tried, in his 1995 study, to objectively answer the question of whether Furtwängler could (or could not) keep a steady tempo. For this he chose a straightforward, yet for musicologists relatively novel, empirical approach by simply measuring the tempo fluctuations in a variety of commercially available recordings (using off-the-shelf ICT hard- and software).21 A fragment of these measurements is presented in Figure 1. For the two historic live recordings shown there, most interpretative details were kept the same by Furtwängler, revealing very similar slowing-down and speeding-up patterns at characteristic structural points in the musical score. While some detail in the use of timing and tempo was changed, overall Furtwängler had a clear, decided-upon idea of how the tempo for this composition had to be conducted, and he was able to stick to this interpretation even in a concert recorded two years later. Using these relatively simple measurements of tempo, Cook could decide the longstanding argument in favor of Pirie.

Figure 1. Tempo measurements of Furtwängler's 1951 and 1953 live recordings of Beethoven's Ninth Symphony (coda). The numbers on the x-axis refer to the bar numbers in the musical score, the numbers on the y-axis refer to the measured tempo (the higher the faster).22
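Note 21 below spells out the measurement procedure behind Figure 1: tapping along with the bar onsets and converting the recorded inter-bar intervals (IBIs) to tempo as 1/IBI. As a rough illustration of that conversion step only – a sketch with hypothetical tap times, not Cook's actual code, which the paper does not publish – one could write:

    # Convert tap timestamps (in seconds, one tap per bar onset) into a
    # per-bar tempo curve, following the 1/IBI idea described in note 21.
    tap_times = [0.0, 2.1, 4.3, 6.2, 8.6, 10.7]  # hypothetical taps

    # Inter-bar intervals: time elapsed between successive bar onsets.
    ibis = [t2 - t1 for t1, t2 in zip(tap_times, tap_times[1:])]

    # Tempo in bars per minute: 60 seconds divided by the interval length.
    tempo_curve = [60.0 / ibi for ibi in ibis]

    for bar, tempo in enumerate(tempo_curve, start=1):
        print(f'bar {bar}: {tempo:.1f} bars/min')

Plotting such a curve for two recordings against the bar numbers gives exactly the kind of comparison shown in Figure 1.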
the role of controlled experiments: cognitive musicology

As discussed above, empirical musicology became relatively successful in the 1980s, giving a new boost to music performance studies. It proved a convincing alternative to the idea that music performance is too subjective to study scientifically. However, empirical results solely based on the method of measurement resolve only part of the research questions relevant to music research. For example, one has to keep in mind the possible discrepancy between what one measures (cf. Figure 1) and what a listener is actually aware of or perceives in a musical situation (cf. Figure 2). While 'new musicology'23 invoked the frame of subjectivity (in fact declaring it impossible to study the arts scientifically), the advocates of cognitive science approached it in a more constructive way by refining scientific tools that allow one to study subjective experience. The application of these methods and techniques to music gave rise to the domain of cognitive musicology (or music cognition), an area of scientific inquiry that materialized in the margins of psychology, computer science, and musicology.24

An example of this line of research is a study on the use of timing and tempo in piano music.25 It combines techniques from ICT and computer science with methods from experimental and cognitive psychology, aiming to answer questions on the commonalities and diversities found in music performance: what is shared among music performances and what changes in each interpretation? More specifically, the study addresses the question of whether an interpretation changes when only the overall tempo of the performance is changed. Instead of measuring performances (as in the Furtwängler example discussed above), in this study the question was operationalized: can listeners hear the difference between an original recording (by one pianist) and a manipulated, tempo-transformed recording (by another pianist)? The tempo-transformed recording was originally recorded at a different tempo but was made similar in tempo to the other performance using an advanced time-scale modification algorithm. The task was to judge which of the two performances – now both in the same overall tempo – was an original recording, while focusing on the use of expressive timing. (See Figure 2 for a fragment of the user interface of the online listening experiment.)26

What can we expect the results to be? One hypothesis, based on the psychological literature, suggests that listeners cannot hear the difference (the 'relational invariance' hypothesis). Since the timing variations of the pianist are scaled proportionally, both versions will sound equally natural, so that the participants in the listening experiment will consider both versions musically plausible performances and, consequently, just guess which is an original recording. An alternative hypothesis is that listeners can hear the difference (the 'tempo-specific timing' hypothesis). It is based on the idea that timing in music performance is intrinsically related to global tempo. When the timing variations are simply scaled to another tempo (i.e., slowing it down or speeding it up proportionally), this may make the performance sound awkward or unusual, and hence easier to identify as a tempo-transformed version. The results of this study are summarized in Figure 3.
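The relational invariance hypothesis can be stated very concretely: under proportional scaling, every interval in the performance is multiplied by the same factor when the global tempo changes, so the relative timing pattern is preserved. A minimal sketch of such a transformation on note onset times (our illustration only; the study itself used a more advanced time-scale modification algorithm operating on the audio signal):

    # Proportionally rescale note onset times to a new global tempo.
    # A factor > 1 speeds the performance up, < 1 slows it down; the
    # *relative* timing pattern is left untouched either way.
    def rescale(onsets, tempo_factor):
        return [t / tempo_factor for t in onsets]

    original = [0.0, 0.48, 1.05, 1.50, 2.20]  # hypothetical onsets (s)
    faster = rescale(original, 1.2)           # 20% faster overall
    print(faster)

The tempo-specific timing hypothesis predicts that precisely this kind of uniformly scaled version is what listeners will pick out as sounding unusual.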
Figure 2. Fragment of the internet user interface showing the presentation of the audio fragments that had to be compared (see http://www.hum.uva.nl/mmm/exp/).

The majority (on average, 70%) of the 162 participants (primarily students of the University of Amsterdam and Northwestern University) could correctly identify an original recording by focusing solely on the timing used by the pianist (since both fragments had the same tempo). This result was taken as support for the tempo-specific timing hypothesis – which predicts that a tempo-transformed performance will sound awkward as compared to an original performance – and as counterevidence for the relationally invariant timing hypothesis, which predicts that a tempo-transformed performance will sound equally musical or natural. As such, this result cleared up a longstanding argument about whether performers do or do not adapt their timing to the tempo chosen, and, if so, whether listeners are sensitive to this.27

This study is just a small example of how methods from cognitive science – choosing an experimental design that allows one to use real music (i.e., CD recordings instead of MIDI performances, or even clicks or simple sine tones) and subjective judgments by a panel of experienced listeners – allow scientific inquiry of music perception and performance. The example also hints at the further potential of ICT for empirical studies in the humanities. The technology used here (for a more elaborate description, see Honing, 2006) combines widely available ICT with well-understood methods from the social sciences. Together, they form a powerful toolkit for the modern musicologist and open up a whole new area of cognitive research in the arts and humanities.

Figure 3. Results of a listening experiment (162 participants) using compositions from the classical and romantic piano repertoire as recorded by pianists such as Glenn Gould, Vladimir Horowitz and Rosalyn Tureck (a quote mark indicates a tempo-transformed recording; statistical significance levels are indicated with asterisks: * p < 0.05; ** p < 0.01; *** p < 0.001).28

conclusion

The past two decades have witnessed a significant increase in scientifically inspired music research, in which ICT, measurement, and experiment became influential methods that contributed to a further understanding of music as a process in which the performer, the listener, and music as sound play a central role. These developments not only enriched musicological research itself, but also influenced the main issues addressed in other areas of research like psychology and (neuro)cognition, slowly diminishing the 'trade deficit' that musicology built up over its existence as a discipline. For example, while music was for years only a minor topic in the psychology textbooks, hidden away in a section on pitch perception, in recent years several disciplines, ranging from cultural theory to archeology and psychology to computer science, have shown a growing interest in the scientific study of music. This puts music in the center of attention and research activity – next to language, where it belongs.

end notes

1 See, e.g., W. Bijker & B. Peperkamp (eds.), Geëngageerde geesteswetenschappen – Perspectieven op cultuurveranderingen in een digitaliserend tijdperk, The Hague 2002; and, for a critical commentary, G. de Vries, 'Dienstbare alfa's', De Academische Boekengids, 4/4 (2004).
2 See, e.g., E. Viskil.
Een digitale bibliotheek voor de geesteswetenschappen. Aanzet tot een programma voor investering in een landelijke kennisinfrastructuur voor geesteswetenschappen en cultuur, NWO-Gebiedsbestuur Geesteswetenschappen, The Hague 1999.
3 D. Robey, J. Unsworth & G. Rockwell, 'National Support for Humanities Computing: Different Achievements, Needs and Prospects', Proceedings of the ACH/ALLC Conference, University of Victoria, 2005.
4 G. Adler, 'Umfang, Methode und Ziel der Musikwissenschaft', in: Vierteljahresschrift für Musikwissenschaft 1 (1885).
5 D. Deutsch (ed.), Psychology of Music (2nd edition), New York 1999.
6 S. Frith, Music for pleasure: essays in the sociology of pop, New York 1988.
7 H. C. Longuet-Higgins, Mental Processes. Studies in Cognitive Science, Cambridge, MA 1987.
8 E. W. Said, Musical Elaborations, London 1992.
9 S. Mithen, The Singing Neanderthals: The Origins of Language, Music, Body and Mind, London 2005.
10 C. Seeger, 'Systematic musicology: Viewpoints, Orientations, and Methods', in: Journal of the American Musicological Society 4 (1951), pp. 240-248.
11 See, for example, T. DeNora, 'Musical Practice and Social Structure: a Toolkit', in: E. F. Clarke & N. Cook (eds.), Empirical musicology: Aims, methods and prospects, Oxford 2004, p. 37.
12 D. Huron, 'The New Empiricism: Systematic Musicology in a Postmodern Age', Berkeley, University of California 1999, http://www.music-cog.ohio-state.edu/Music220/Bloch.lectures/3.Methodology.html, p. 2.
13 J. Rink (ed.), The Practice of Performance: Studies in Musical Interpretation, Cambridge 1995; E. F. Clarke & N. Cook (eds.), Empirical musicology: Aims, methods and prospects, Oxford 2004.
14 For an overview of this discussion see D. Huron, 'The New Empiricism'.
15 Commercial standard for the exchange of information between electronic instruments and computers.
16 See for an overview, e.g., E. F. Clarke, 'Rhythm and timing in music', in: D. Deutsch (ed.), Psychology of Music (2nd edition), New York 1999, pp. 473-500; A. Gabrielsson, 'The performance of music', in: D. Deutsch (ed.), Psychology of Music, New York 1999, pp. 501-602.
17 N. Cook, 'The conductor and the theorist: Furtwängler, Schenker, and the first movement of Beethoven's Ninth Symphony', in: J. Rink (ed.), The Practice of Performance, Cambridge: Cambridge University Press, pp. 105-125.
18 P. Pirie, Furtwängler and the Art of Conducting, London 1980.
19 P. H. Lang, 'The Symphonies', in: The Recordings of Beethoven as Viewed by the Critics from High Fidelity, Westport, Connecticut 1978.
20 P. Pirie, Furtwängler and the Art of Conducting.
21 Cook used a technique that involved playing the CD in the CD-ROM drive of a computer and tapping the space bar of the computer keyboard in synchrony with the onset of each bar, the inter-bar intervals (IBIs) being recorded and converted to a measure of tempo (1/IBI).
22 Adapted from Cook, 'The conductor and the theorist'.
23 New Musicology: a branch of music scholarship that is guided by a recognition of the limits of human understanding, an awareness of the social milieu in which scholarship is pursued, and the realization of the political area in which the fruits of scholarship are used and abused.
24 H. Honing, 'The comeback of systematic musicology: new empiricism and the cognitive revolution', Dutch Journal of Music Theory 9/3 (2004), pp. 241-244.
25 See for more details H. Honing, 'Evidence for tempo-specific timing in music using a web-based experimental setup',
Journal of Experimental Psychology: Human Perception and Performance, 32(3) (2006); H. Honing, 'Timing is tempo-specific', Proceedings of the International Computer Music Conference, Barcelona (2005), pp. 359-362; H. Honing, 'Is expressive timing relational invariant under tempo transformation?', Psychology of Music (in press).
26 To realize this technology in an academic environment is an interesting topic on its own (Google the word 'Vulpennenbeheer' to get a rough idea).
27 B. H. Repp, 'Relational invariance of expressive microstructure across global tempo changes in music performance: An exploratory study', Psychological Research 56 (1994), pp. 269-284.
28 Adapted from Honing, 'Evidence for tempo-specific timing in music using a web-based experimental setup'.

work_2vlgw55jzrhj3dzmd4n4ngf7ui ----

Naukkarinen, O., & Bragge, J. (2016). Aesthetics in the age of digital humanities. Journal of Aesthetics and Culture, 8:1, 30072. https://doi.org/10.3402/jac.v8.30072
Aesthetics in the age of digital humanities

Ossi Naukkarinen (Department of Art, Aalto University School of Arts, Design and Architecture, Helsinki, Finland) and Johanna Bragge (Department of Information and Service Economy, Aalto University School of Business, Helsinki, Finland)

Abstract

One of the most difficult yet unavoidable tasks for every academic field is to define its own nature and demarcate its area. This article addresses the question of how current computational text-mining approaches can be used as tools for clarifying what aesthetics is when such approaches are combined with philosophical analyses of the field. We suggest that conjoining the two points of view leads to a fuller picture than excluding one or the other, and that such a picture is useful for the self-understanding of the discipline. Our analysis suggests that text-mining tools can find sources, relations, and trends in a new way, but it also reveals that the databases that such tools use are presently seriously limited. However, computational approaches that are still in their infancy in aesthetics will most likely gradually affect our understanding of the ontological status of the discipline and its instantiations.

Ossi Naukkarinen, PhD, is Head of Research and Vice Dean at the Aalto University School of Art, Design and Architecture, Finland. He has published books and articles on various themes in aesthetics, including environmental art, everyday aesthetics, and mobile aesthetics, in journals such as Contemporary Aesthetics, Aisthesis, and Nordic Journal of Aesthetics. His publications have also been translated into Spanish, Slovenian, Italian and Chinese.

Johanna Bragge holds a PhD in Management Science from the Helsinki School of Economics and works as Senior University Lecturer of Information Systems Science at Aalto University School of Business. Her research interests include research profiling with text-mining tools, e-collaboration, service co-creation, and crowdsourcing. Her research has been published, among others, in the Journal of the AIS, IEEE Transactions on Professional Communication, Expert Systems with Applications, Futures, Group Decision and Negotiation, and Journal of Business Research.

Keywords: aesthetics; bibliometrics; computing; digital humanities; text-mining; Web of Science

Traditionally, well-informed conceptions about the field of aesthetics have been formed by studying it for a long time and carefully: by reading and writing books and articles, teaching and following lectures, and taking part in academic discussions in conferences and learned societies.
This is still quite a normal and reasonable approach, and knowledge attained through it cannot be achieved in any other way. The more one studies, the broader and more detailed a picture one has. However, there is no universally accepted definition of aesthetics. We can probably agree that aesthetics has something to do with the arts, beauty, and other aesthetic values, as well as with art criticism in the broadest possible sense. As soon as one goes any further, philosophical ponderings and disagreements arise. What kinds of studies of the arts actually belong to the field of aesthetics, and what kinds are outside it? Where are the differences between art history and aesthetics? Is aesthetics always a philosophical discipline, and what does that mean? Should we include non-academic publications such as memoirs or exhibition reviews in the field if they deal with the same themes as academic aesthetics papers? What are the latest trends and which themes are fading away? Such questions are acutely relevant when one designs, for example, an introductory book or course for undergraduate students.

Forming a comprehensive interpretation of any academic field is becoming more and more demanding all the time, because the number of publications, scholars, and institutions becomes higher each year. One simply cannot master all the different languages and traditions in which aesthetic issues are addressed, and a single scholar can never get hold of everything published in his or her field. In fact, he or she cannot even access a tiny fraction of it, since in general the growth rate of traditional scientific publishing has been increasing for the last 50 years, and the number of publications using new channels such as open-access journals is growing rapidly.1 The latest studies show that the growth of global scientific publication output has been exponential between 1980 and 2012.2 The same trend can also be seen in the research of aesthetics, as presented in Figure 1. In the data set we analyzed from the Web of Science (WoS) for this article, the rate of growth has been steady and surging since the turn of the millennium.

Figure 1. Increase in research articles in aesthetics in the Web of Science.

How can we make sense of aesthetics in this situation? Well-informed understanding of one's field is still expected of professionals, after all; one is not supposed to focus on some narrow area only, without the ability to link one's specialty to a wider field. Like in many other contexts, that is the reason why it is reasonable to make use of the computational tools that we have nowadays.
So far, aestheticians have not been very active in using these for clarifying the nature of their own field.3 Our aim is to open up some possibilities and thus take aesthetics closer to the so-called Digital Humanities.4 Furthermore, we want to show that using such tools is not as easy and straightforward as one could assume, but requires specialized skills.

THE BIG PICTURE

Computational analyses always need data to be analyzed with the help of algorithms that define what the computational processes will do and present to us. For this essay, we have used the publication data provided by WoS. Thomson Reuters' WoS database is the 'gold standard' by which many governments in countries such as the USA, the UK, and Australia evaluate their national R&D performance.5 It was also the first database that started indexing the cited references of publications, as early as the 1960s, thus allowing various (co-)citation analyses to be conducted, based on Eugene Garfield's original idea from 1955.6 WoS is also used as standard by researchers for bibliometric studies, as the publications it indexes are stringently pre-inspected for quality, and the data it provides is consistently organized in the database. To summarize, as WoS is one of the best known, largest, and most influential academic databases, it is interesting to see first what kind of image it offers of aesthetics.

It is a known issue that arts and humanities (A&H) research is not as well covered in WoS as the natural sciences – the indexing of A&H started much later, in 1975 – although the situation has been improving lately.7 We will return to some of the problems related to WoS and other similar databases, such as Elsevier's Scopus, soon. In any case, as WoS is arguably one of the most important academic databases, aestheticians cannot afford to ignore it. At least, we have to understand how it functions. If the picture seems to be distorted, we have to understand why, and try to find better tools and databases. In the present situation, where such databases have a huge impact on our academic communities, we cannot just omit them.

The data we consult does not tell us anything as such, and we cannot even start searching for relevant information without making active selections. When we created a picture of aesthetics using WoS, we had to narrow down our approach, as will be explained soon. In addition, we chose three software tools to represent and organize the core results: VantagePoint, VOSviewer, and Leximancer.8 VantagePoint is a professional text-mining tool for discovering and organizing information in search results from literature or patent databases. Besides advanced data cleaning functions, it makes it possible to quickly find answers to the questions of who, what, when, and where, helping the researcher to clarify relationships and find patterns. The second tool, VOSviewer, also analyzes bibliometric literature data, but its core focus is on visualizing bibliometric networks, composed, for example, of journals, authors or key terms appearing in abstracts, based on co-citation, bibliometric coupling or co-occurrence relations. The third tool, Leximancer, is an automated content analysis tool for detecting major themes and concepts in full texts and other textual data (see the case study below).
We used it to analyze the full texts of the British Journal of Aesthetics in 5-year periods.

The time span we analyzed was 1975-2014. The A&H citation index starts at 1975, and at the time of conducting the study, we were halfway through 2015. In addition, the span covers exactly 40 years, and thus allows long-term trend analyses to be conducted, for example, by 10-year periods.

If one simply types 'aesthetics' in the basic search field of WoS, which searches for the term in titles, abstracts, and keywords, the search results (22,957 publications as of August 4, 2015) largely, at around 55%, feature publications other than A&H ones, such as life sciences and biomedicine from the other citation indexes. Figure 2 shows the division by scientific domain, as well as by more detailed research area in the A&H domain.9 This, in fact, is interesting as such: unlike we philosophers of aesthetics might believe, a large group of people addressing aesthetic issues seems to be operating outside our circles, even if our and their discourses seldom meet. If this is the case in academic contexts such as WoS and other similar databases, what is the situation outside academia? We will briefly return to this question at the end of this essay, but, all in all, the issue of how 'our' and 'their' aesthetics are related would actually deserve a study of its own.

Figure 2. Aesthetics publications by scientific domain (in capital letters; A&H = Arts & Humanities) and research area in the Web of Science.

This time, however, we wanted to keep the focus closer to what we think is the humanistic tradition of aesthetics. For this, we restricted the search to only the A&H citation index. Even that database initially seemed too large, as the same 'aesthetics' search brought up many seemingly irrelevant research areas, such as radiology, nuclear medicine, and medical imaging. However, we looked into some of those and found that they can actually include interesting publications. For instance, they showed that the radiologic aesthetics of human body parts or organs have inspired many artists to create works of art, indicating that radiology is perhaps becoming a more common approach in the field of contemporary art. Thus, we decided to include all results from the A&H index that had 'aesthetics' in the title, abstract or keywords. In addition, we included all publications from journals that are specific to aesthetics. The aesthetics journals that are indexed in A&H by WoS include the following: British Journal of Aesthetics (BJA), Journal of Aesthetics and Art Criticism (JAAC), Journal of Aesthetic Education (JAE), International Review of the Aesthetics and Sociology of Music (IRASM), Revue d'esthétique, Psychology of Aesthetics Creativity and the Arts, Estetika - the Central European Journal of Aesthetics, and Zeitschrift für Ästhetik und allgemeine Kunstwissenschaft.
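To make the search strategy explicit, it can be approximated in WoS advanced-search syntax, where TS= searches titles, abstracts, and keywords, and SO= searches source titles. The following is our hedged reconstruction composed in Python, not a query string reported anywhere in the study, and the journal list is abbreviated:

    # Approximate reconstruction of the search strategy described above,
    # composed as a WoS advanced-search string.
    journals = [
        'British Journal of Aesthetics',
        'Journal of Aesthetics and Art Criticism',
        'Journal of Aesthetic Education',
        # ... plus the five other aesthetics journals listed above
    ]
    sources = ' OR '.join(f'"{j}"' for j in journals)
    query = f'TS=(aesthetics) OR SO=({sources})'
    print(query)
    # Run against the Arts & Humanities Citation Index only, limited to
    # 1975-2014, and refine the hits to full-length article types.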
Had we chosen to focus on more specifically defined research areas in A&H, it would have required more active definition, and there is no single solution for that. This means that computing is necessarily combined with a philosophical analysis of what aesthetics is. For example, it is quite reasonable to state that aesthetic issues are most probably dealt with in publications listed under research areas such as art, literature, and philosophy, because aesthetics is often related to the themes of art, criticism, and beauty and is emphatically philosophical in nature. On the other hand, if one chose some other set of fields, the search results would be somewhat different. If one assumes a more Baumgartian stance, understanding aesthetics as something close to 'a science of sensitive knowing' (scientia cognitionis sensitivae), one would probably include more publications and fields closer to psychology; and emphasizing evolutionary, neuroscientific, or environmental branches of aesthetics could lead to including more fields of the natural sciences. This means that one's pre-understanding necessarily guides what one finds in the data that is available. It is evident that there is no single, objective, and neutral way of selecting the relevant fields when doing a more focused analysis.

The aesthetics search in the A&H index, including the eight domain journals mentioned above, resulted in 21,919 publications (as of June 18, 2015). As our purpose is to illustrate especially academic research in aesthetics, we refined the results to include only full-length journal and conference articles, thus excluding, for example, book reviews, letters, and notes. This choice was guided by the category options WoS offers, and our final search result was 11,814 articles.

The results based on our selections show, first, that even if there are some self-evident forums of aesthetics, such as BJA, JAAC, and JAE, issues related to aesthetics are addressed in surprisingly many sources, some of which were previously unknown to us. In total, there were altogether 1,517 different journals or other sources listed as publishing aesthetics articles. This means that we might need to broaden our own understanding of the field, of its publication channels, and of who is actually working in it. Of course, this data analysis only suggests some possibilities and opens questions, and we have to study the phenomenon better by other means, including plenty of good old-fashioned reading. We have to find out whether the publications based on our search really are relevant to aesthetics, and whether the text-mining tools produce truthful results when making more detailed analyses. In any case, the point is that we would not have seen the new possibilities in the same way without the data analysis, and at least some of the new sources will probably turn out to be important.

On the other hand, it is striking that many journals that we think are relevant and interesting for the field are missing (not indexed) in WoS: Journal of Aesthetics & Culture, the Italian Aisthesis, the US-based Contemporary Aesthetics, Journal of Aesthetics and Phenomenology, and The Nordic Journal of Aesthetics, for example, not forgetting some of the perhaps lesser-known publications, such as The Journal of Aesthetics and Protest, Aisthesis - International Journal of Art and Aesthetics in Management and Organisational Life, and Korean Journal of Aesthetics. This is due to the very strict indexing principles of WoS. It is evident that one cannot blindly trust the computed results, but one needs to be aware of the database restrictions.

The data also shows that 93% of the articles are single-authored, and it reveals who are the most active and prominent scholars in the field. There are no big surprises.
The top authors who have published most articles are all internationally familiar names. The top 10 are, respectively, Noël Carroll, Richard Shusterman, Peter Kivy, Robert Stecker, Stephen Davies, Jerrold Levinson, O. Naukkarinen & J. Bragge 4 (page number not for citation purpose) Harold Osborne, Stanislav Tuksar, Malcolm Bud, and Joseph Margolis (all men!)*the only surprise perhaps being Stanislav Tuksar, the Croatian music scholar. We had more or less assumed a list of this kind, but now we have evidence for our belief, and we can also see in more detail how much and where these scholars have actually published, and how many citations they have received for the articles (see Table 1). This, in turn, gives others a reference point: if someone wants to be active and visible in aesthetics, where and how often should one present one’s ideas? In this data set, Carrol has 47 articles and Margolis 20, the other top authors something between this, and by far the most important publication forums are the Journal of Aesthetic Education, Journal of Aesthetics and Art Criticism and British Journal of Aesthetics � except for Tuksar, who has mostly published in the International Review of the Aesthetics and Sociology of Music, for which he is editor-in-chief. So, it might be a good idea to aim at these journals and publish at least some 20 articles, which is naturally not that easy. The list of top cited authors, which is collected from the reference lists of our final sample of 11,814 articles, looks a little different, due to the fact that classics of philosophy, such as Immanuel Kant, are still commonly cited in the field. However, all but one of the top-10 authors also appear among the top-60 cited authors. Table 2 presents the top 50 most cited authors, based on the number of publications in which they have been cited. 10 The table also divides the number of citing publications temporally into four decades. It is interesting to see that most of the top cited authors have an ascending trend in citations, but there are also some whose curve is descending. The top authors appearing in Table 1 have been shown in bold in Table 2 for easier detection; Robert Stecker and Harold Osborne are not shown as they are at places 57 and 60, respectively. In addition, Stani- slav Tuksar’s rank is 1558, with 15 sample pub- lications in which he is cited. Bibliometric studies typically analyze and vis- ualize author networks via their co-authorship relations, revealing ‘‘scholarly communities.’’ How- ever, in the case of aesthetics and in the humanities in general, co-authorship analyses are not sensible, as our data shows that 93% of the articles are single-authored. To discover relations, one can instead conduct other types of network analyses, for example by cross-correlating authors with the help of commonly used title words or through the authors they refer to in their articles. Figures 3 and 4 illustrate two examples of such cross- correlation analyses. The most prolific authors are placed on the map based on the authors they cite Table 1. Top-10 authors Rank Author Number of articles Percentage published in BJA, JAE or JAAC Total cites for the articles Avg. 
Table 1. Top-10 authors

Rank | Author | Number of articles | % published in BJA, JAE or JAAC | Total cites for the articles | Avg. cites for the articles | Author's h-index for the articles (a) | Rank in top cited authors list
1 | Carroll, Noël | 47 | 85 | 368 | 7.83 | 12 | 13
2 | Shusterman, Richard | 39 | 79 | 185 | 4.74 | 7 | 27
3 | Kivy, Peter | 31 | 90 | 88 | 2.84 | 5 | 23
4 | Stecker, Robert | 31 | 97 | 145 | 4.68 | 7 | 57
5 | Davies, Stephen | 29 | 93 | 164 | 5.65 | 8 | 34
6 | Levinson, Jerrold | 28 | 93 | 249 | 8.89 | 8 | 11
7 | Osborne, Harold | 27 | 100 | 63 | 2.33 | 5 | 60
8 | Tuksar, Stanislav | 24 | 0 | 14 | 0.58 | 3 | 1558
9 | Budd, Malcolm | 20 | 90 | 105 | 5.25 | 6 | 45
10 | Margolis, Joseph | 20 | 95 | 79 | 3.95 | 6 | 41

(a) Hirsch's h-index: an author has index h if h of his Np papers have at least h citations each, and the other (Np - h) papers have no more than h citations each.

Table 2. Top 50 most cited authors in the 11,814 aesthetics articles, by decade (number of citing publications)

Rank | Citing publications | Author | 1975-1984 | 1985-1994 | 1995-2004 | 2005-2014
1 | 996 | Kant Immanuel | 118 | 182 | 283 | 413
2 | 617 | Adorno Theodor W. | 65 | 127 | 131 | 294
3 | 548 | Benjamin Walter | 32 | 92 | 132 | 292
4 | 547 | Goodman Nelson | 126 | 151 | 144 | 126
5 | 512 | Danto Arthur | 54 | 129 | 157 | 172
6 | 466 | Hegel G. W. F. | 82 | 115 | 109 | 160
7 | 448 | Barthes Roland | 45 | 90 | 113 | 200
8 | 448 | Beardsley Monroe C. | 129 | 128 | 97 | 94
9 | 448 | Foucault Michel | 23 | 70 | 115 | 240
10 | 433 | Derrida Jacques | 25 | 90 | 104 | 214
11 | 424 | Levinson Jerrold | 4 | 59 | 152 | 209
12 | 420 | Wittgenstein Ludwig | 75 | 98 | 124 | 123
13 | 405 | Carroll Noël | - | 38 | 140 | 227
14 | 401 | Walton Kendall L. | 30 | 66 | 115 | 190
15 | 378 | Dewey John | 48 | 74 | 92 | 164
16 | 377 | Heidegger Martin | 39 | 86 | 99 | 153
17 | 373 | Nietzsche Friedrich | 29 | 85 | 112 | 147
18 | 372 | Gombrich Ernst | 91 | 98 | 93 | 90
19 | 363 | Wollheim Richard | 63 | 93 | 106 | 101
20 | 357 | Aristotle | 65 | 70 | 85 | 137
21 | 347 | Dickie George | 73 | 95 | 94 | 85
22 | 341 | Deleuze Gilles | 5 | 29 | 96 | 211
23 | 329 | Kivy Peter | 17 | 72 | 110 | 130
24 | 328 | Bourdieu Pierre | 12 | 30 | 85 | 201
25 | 315 | Scruton Roger | 40 | 70 | 84 | 121
26 | 311 | Freud Sigmund | 29 | 63 | 75 | 144
27 | 282 | Plato | 48 | 58 | 74 | 102
28 | 269 | Hume David | 37 | 42 | 84 | 106
29 | 258 | Langer Suzanne | 87 | 66 | 45 | 60
30 | 257 | Schiller Friedrich | 41 | 49 | 67 | 100
31 | 249 | Jameson Fredric | 13 | 32 | 54 | 150
32 | 243 | Lyotard Jean-François | 3 | 53 | 71 | 116
33 | 242 | Gadamer Hans-Georg | 38 | 58 | 59 | 87
34 | 235 | Davies Stephen | 4 | 21 | 71 | 139
35 | 235 | Eagleton Terry | 10 | 40 | 62 | 123
36 | 226 | Arnheim Rudolph | 39 | 64 | 44 | 79
37 | 217 | Eco Umberto | 23 | 53 | 53 | 88
38 | 215 | Merleau-Ponty Maurice | 38 | 36 | 55 | 86
39 | 212 | Marx Karl | 57 | 39 | 39 | 77
40 | 207 | Dahlhaus Carl | 27 | 59 | 56 | 65
41 | 202 | Margolis Joseph | 51 | 66 | 44 | 41
42 | 200 | Goethe Johann Wolfgang von | 38 | 43 | 52 | 67
43 | 199 | Collingwood Robin | 52 | 54 | 52 | 41
44 | 198 | Sartre Jean-Paul | 43 | 40 | 45 | 70
45 | 196 | Budd Malcolm | 1 | 22 | 72 | 101
46 | 193 | Habermas Jürgen | 19 | 55 | 57 | 62
47 | 192 | Shusterman Richard | 5 | 29 | 55 | 103
48 | 188 | Eliot Thomas S. | - | 7 | 51 | 120 (Currie Gregory; see note)
49 | 178 | Currie Gregory | - | 7 | 51 | 120
50 | 177 | Baudelaire Charles | 29 | 24 | 54 | 70

Top authors from Table 1 are indicated in bold font.
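The h-index defined in the note to Table 1 is straightforward to compute from a list of per-article citation counts; a minimal sketch with hypothetical numbers:

    # h-index as defined in the note to Table 1: the largest h such that
    # h of the author's papers have at least h citations each.
    def h_index(citations):
        h = 0
        for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
            if cites >= rank:
                h = rank
            else:
                break
        return h

    print(h_index([25, 8, 5, 4, 3, 1]))  # prints 4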
The correlations are shown as links between author nodes: the thicker the link lines, the greater the correlation between any two authors (see the legends in the upper left-hand corners).11 One can also study the basis of the correlation using the tool online: when hovering the mouse above any author node, the tool will present information showing the values for the cross-correlated field. To take a rather easy example, the tool shows that Joseph Margolis most often cites (besides his own works) Nelson Goodman, Arthur C. Danto, Jacques Derrida, Willard Van Orman Quine, and Donald Davidson; for a knowledgeable reader, this kind of information immediately says something about his approach.

Figure 3. Cross-correlation map of top-54 authors vs. cited authors (top 1072). Jerrold Levinson's label appears underneath that of S. Davies, and Nick Zangwill's label underneath M. Budd.
Figure 4. Cross-correlation map of top-54 authors vs. title phrases (all; processed with Natural Language Processing, NLP). Ivo Supicic's label appears underneath that of Philip Alperson.

If the data shows that two or more authors are closely related and we had not realized that before, we now have a reason to examine how they are related. This, again, requires consultation of the actual publications, but the text-mining tool has given us a reason to do that, as it gives an indication of the nature of the relationships. Without the tool, we would never have detected all such relations. Robert Stecker, for example, seems to be very well connected in many directions; how exactly, and what this indicates, is a matter for further analysis. On the other hand, it is interesting that the pictures do not show a stronger relation between authors such as Arnold Berleant and Yuriko Saito, even if we know from other sources that they have often addressed related topics and closely co-operated in other ways, for example in the e-journal Contemporary Aesthetics; again, the results must be read critically.

One interesting result is the heat map (Figure 5) of the most common themes, as seen through the frequency and co-occurrence of the terms used in titles and abstracts (with stop words such as 'and', 'it', etc. excluded).12 The warmer the color, and the larger the font size, the more often the terms appear in the sample. For example, the term 'politic' appears in the hot red area and in medium-large font, and the data behind it indicates that the term appears in the title or abstract of 758 publications (counted only once if it appears in both). The proximity of terms indicates that they often appear in the same titles or abstracts. The map helps us to quickly see the most usual themes or issues addressed in aesthetics. Again, the map requires interpretation and further study.
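The counting behind such a co-occurrence map is simple in principle: each record contributes at most one occurrence per term, and two terms co-occur when they appear in the same title or abstract. A toy sketch of this document-level counting (our illustration, not VOSviewer's actual implementation; the stop-word list and records are made up):

    from collections import Counter
    from itertools import combinations

    STOPWORDS = {'and', 'it', 'the', 'of', 'a'}  # illustrative list only

    records = [
        'aesthetics and politics of art',
        'music aesthetics',
        'politics of music',
    ]

    term_freq = Counter()
    pair_freq = Counter()
    for text in records:
        # Count each term once per record, as in the map described above.
        terms = {w for w in text.lower().split() if w not in STOPWORDS}
        term_freq.update(terms)
        pair_freq.update(combinations(sorted(terms), 2))

    print(term_freq['politics'])             # 2
    print(pair_freq[('music', 'politics')])  # 1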
As the map shows that, for example, 'politic' is a frequently used term in the field, this might mean that if one wants to be a credible aesthetician, one has to pay close attention to it (and its variations political, politics, etc.), even if one had not been very interested in it before. Without data analysis, one would not have as good an idea of how common it is, and one would not have an equally good reason to study what kinds of issues are addressed and who is active under its umbrella. Its 758 hits can be compared with the other large topics appearing on the map: music 1,579, philosophy 1,091, beauty 590, poetry 519, and performance 423 hits. Furthermore, the map indicates how widespread interest is in the sub-fields in which I or someone else is specialized. This helps in relating sub-fields to each other, and provides one approach to the question of how to make sense of the relative weight of sub-fields within the whole field.

Figure 5. Co-occurrence map of terms in titles and abstracts (all publication types included).

It is interesting that some relatively new but possibly trending sub-fields, such as 'everyday aesthetics', do not (yet) manifest in the previous maps and analyses at all. This might be related to the fact that databases include plenty of old materials, and new themes are necessarily less visible in comparison. However, the tools enable searches for such topics of interest within various data fields, such as authors' keywords, title words/phrases, or abstract words/phrases. One can see that title phrases related to everyday aesthetics currently occupy the following places in the ranked title word list covering the whole time range of 1975-2014: everyday life (rank 260, with 16 publications), everyday aesthetics (rank 1173, with 4 publications), and everyday aesthetic experience (rank 9658, with 1 publication from 2014).

Figure 6 presents a co-citation analysis of journals, as visualized using VOSviewer. Two journals are said to be co-cited if there is a third journal that cites both journals. The larger the number of journals by which two journals are co-cited, the stronger the co-citation relation between the two journals is.13 For Figure 6, all journals with at least 30 citations (650 in total) are included in the analysis, even if, for reasons of clarity, only some of the journal titles are visible. One can see three 'hot' areas on the heat map, illustrated by warmer red and yellow colors. The largest concentration is around the core of aesthetics, and this is featured by citations to JAAC, BJA, and JAE. The second center, on the left, is about communication research, and the third relates to publications on psychological issues.

It is also interesting to see how the field has changed over time. The bubble chart produced using VantagePoint (Figure 7) shows the temporal development of the top-15 words or phrases derived via statistical Natural Language Processing (NLP)14 from the titles of the publications, presented in alphabetical order. Note that the search word 'aesthetics' was removed from the figure, as it appears in most titles. From the figure, we can immediately see that aesthetics articles are most often related to the arts (in general) and then to music. Moreover, political and ethical topics have visibly increased their prominence during the last few years.
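Returning to Figure 6 for a moment: the co-citation relation defined above is equally mechanical to count. Reduce every citing record to the set of journals appearing in its reference list, and credit each unordered pair of journals cited together. A toy sketch (again our own illustration, not the actual VOSviewer code; journal names are hypothetical placeholders):

    from collections import Counter
    from itertools import combinations

    # Each citing record reduced to the set of journals in its references.
    cited_journals_per_record = [
        {'BJA', 'JAAC', 'JAE'},
        {'BJA', 'JAAC'},
        {'JAAC', 'Poetics'},
    ]

    cocitations = Counter()
    for journals in cited_journals_per_record:
        # Every pair of journals cited together in one record counts once.
        cocitations.update(combinations(sorted(journals), 2))

    print(cocitations[('BJA', 'JAAC')])  # 2: co-cited by two records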
With these types of figures, we can also easily detect when certain terms first emerged in the titles during the 40-year sample period, especially regarding the less common and so-called emerging terms (not shown in the figure). It is possible to analyze the temporal development in time sequences longer than a year, too. Table 3 presents the same top-15 title phrases in numerical format across four decades.

All such general results are worth paying attention to when trying to figure out what aesthetics is and how it has changed. Of course, one must know the field rather well already in advance, because otherwise one cannot focus one's search and pay attention to relevant further questions, which are often more or less philosophical in nature. For example, if the analysis suggests a relation between two authors, it is by no means simple and straightforward to say what kind of relation that is. Only if one has enough understanding of the field can one ponder different alternatives. In addition, while such tools represent the results as frequency lists, figures, and temporal matrices, as soon as one learns to understand them, they are a very effective way of conveying information; one can see at a single glance much more than by reading a longish text. To our minds, information graphics in the form of science maps and research landscapes have been an under-used possibility in aesthetics. However, it is fairly easy to produce very informative images that could also be used in introductory books and other presentations.

Figure 6. Co-citation analysis of journals (all publication types included).

Figure 7. Bubble chart of top-15 title words or phrases.

Table 3. Temporal development of top-15 title words/phrases (records in total for decade: 1,734 / 2,393 / 2,855 / 4,832)

Rank | # Records | Title word or phrase | 1975-1984 | 1985-1994 | 1995-2004 | 2005-2014
1 | 896 | Art | 179 | 220 | 226 | 271
2 | 422 | Music | 76 | 117 | 110 | 119
3 | 220 | Politics | 23 | 39 | 40 | 118
4 | 197 | Ethics | 10 | 21 | 51 | 115
5 | 175 | Philosophy | 21 | 43 | 54 | 57
6 | 158 | Beauty | 11 | 24 | 41 | 82
7 | 136 | History | 14 | 31 | 33 | 58
8 | 135 | Literature | 32 | 31 | 26 | 46
9 | 129 | Nature | 21 | 25 | 46 | 37
10 | 116 | Poetry | 22 | 28 | 26 | 40
11 | 95 | Image | 13 | 19 | 25 | 38
12 | 92 | Role | 9 | 29 | 23 | 31
13 | 88 | Aesthetic experience | 17 | 13 | 26 | 32
14 | 87 | Criticism | 30 | 22 | 17 | 18
15 | 87 | Painting | 12 | 24 | 21 | 30

Note: 'Aesthetics' as the search word is removed from the table, as are common research words such as 'note', 'reply', and 'reflections'.
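The decade columns of Table 3 come from nothing more exotic than bucketing publication years into 10-year slices before counting; a compact sketch of the bucketing step, with hypothetical (year, term) records:

    from collections import Counter

    records = [(1978, 'Art'), (1992, 'Music'), (1996, 'Art'), (2011, 'Politics')]

    def decade(year):
        # Map 1975-1984 to '1975-1984', 1985-1994 to '1985-1994', etc.
        start = 1975 + ((year - 1975) // 10) * 10
        return f'{start}-{start + 9}'

    counts = Counter((decade(year), term) for year, term in records)
    print(counts[('1975-1984', 'Art')])  # 1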
Computers can do their tricks quickly, and as we have the titles, abstracts, keywords, and other bibliometric data ready at hand, the text-mining tools and algorithms can reveal patterns, trends, relationships, and emerging topics in the data. The advanced text-mining tools are, in practice, analogous to statistical software designed for numerical data. Using VOSviewer, we can show what BJA looks like: the map in Figure 8 is based on the title words from all BJA publications; words appearing at least five times are included, although not all of them are visible. BJA has its own profile compared to the field at large.

Figure 8. Co-occurrence map of BJA title words.

In addition, as the data set for BJA is smaller, we can drill deeper and use automated content analysis tools such as Leximancer to detect major themes and concepts based on the full texts, not only on titles, abstracts, and keywords. In Figure 9, we illustrate the full-text analyses for three separate time periods (1996–1999, 2005–2009, 2010–2014). We had access to PDF documents from 1996 onwards, but the maps do not cover the period of 2000–2004 and the January 2005 issue, as those PDFs were secured and not readable by the text-mining tool. In the maps, each concept (grey node) is defined by a list of statistically weighted words from the full texts, the comparison of which enables the depiction of associations (closeness and links) between the concepts.[15] Node size indicates the frequency of a concept's appearance. To aid interpretation, the concepts cluster into higher-level themes (colored circles) when the map is generated, and the themes are automatically named according to the largest concept node they include. Colors are heat-mapped to indicate importance, with the most prominent cluster appearing in red, the next most prominent in brownish orange, and so on, according to the color organization system that Leximancer deploys. Note that the figures portray only the most prominent node names, for reasons of visibility. It is easy to see that there are some themes, such as "art" and "work," that persist over decades, but others, such as "poetry," gain more interest in certain periods. On the other hand, themes such as "fashion" and "man" seem somewhat dubious and force one to dig deeper to see in what sense and way the concepts have been used. The tool enables the analyst to drill down to all text excerpts in which a certain word or word pair appears, to aid in the interpretation. In principle, it would be fairly easy to make comparisons using Leximancer, or the other tools used here, between BJA and other journals, such as the Journal of Aesthetics and Art Criticism (or any other digital data set). This would take some time, but the basic principles would not change.

PROBLEMS TO SOLVE

Analyzing aesthetics through WoS and BJA offers some useful insights, as we have seen, but there are limitations as well. We already mentioned that many important sources are missing from WoS. The missing sources include journals, too, but the most evident lack is monographs, which are still very important in aesthetics, as well as in other fields of the humanities. This data does not tell us what the most referred-to books are, what themes those books address, and how they form groups. Most probably, such data sets will gradually be provided,
while more and more books are being digitized, but, for the time being, they are not common.

Figure 9. BJA full-text analyses. (a) 1996–1999: the themes, in order of importance, are art, object, aesthetic, work, sense, trust, text, and man. (b) 2005–2009: the themes, in order of importance, are aesthetic, art, work, account, different, judgment, actual, and fashion. (c) 2010–2014: the themes, in order of importance, are aesthetic, art, philosophy, fact, pleasure, poetry, and picture.

Of course, even now, normal library databases have some information on books (titles, authors, publishers, short descriptions, keywords), but that is far from a potential set of full-text databases offering cross-referential information. Thus, the present-day possibilities offered by WoS and other similar databases for analyzing the field of aesthetics are seriously limited. For example, authors such as Arthur C. Danto and Yuriko Saito have important articles, but their books are probably at least as influential, which cannot be seen very easily through WoS. One possibility is to look at the cited-reference information of the article data downloaded from WoS. However, that data is utterly messy, as the information is not uniformly entered into the database (meaning that the same book might have several instantiations with slightly differing indexing), and it is much more challenging and time-consuming to clean that data than the core bibliometric data of the main articles. Nevertheless, the analyst can gain preliminary insights even from the messy data, although reporting any strict statistics would be highly questionable.

WoS is also dominated by publications and authors writing in English. In the data set that we analyzed, more than 73% of the publications are written in English, 12% in French, and the remaining 15% in 24 other languages. BJA, naturally, is all in English. However, it is not reasonable to think that aesthetic issues would only be addressed in English, especially because many of them are highly dependent on culture and language. In the future, we need digital databases that better cover several languages. There are active communities of aesthetics using German, Polish, Slovenian, Finnish, Swedish, Japanese, Chinese, Spanish, Turkish, and several other languages. Finnish, for example, does not appear in the data set at all. How can we make different languages and cultures visible and comparable? At the moment, there are no good databases for that.

Moreover, some of the typical bibliometric analyses are clearly designed for the natural sciences, where many practices differ from those in the humanities. For example, the tools offer co-authorship views, because it is typical in the sciences to publish in groups. In the humanities, in turn, it is still common to publish alone. As mentioned, in our data set some 93% of the publications are single-authored, which is 10 times more than in many fields of the sciences and 2.5 times more than in the social sciences in general.[16] Thomson Reuters' ScienceWatch presents interesting field-specific statistics on single-authorship and how it has consistently decreased from 1981 to 2012: from 33% to 11%, considering all scientific articles indexed by WoS.
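Returning to the messy cited-reference strings mentioned above, the usual first-aid is to collapse variant spellings onto a crude matching key before counting. The reference variants below are invented for illustration, and a key this crude will both over- and under-merge; it is a sketch of the cleaning problem, not a solution to it.

```python
import re
from collections import Counter

# Invented variants of one monograph, imitating the inconsistent
# cited-reference indexing described above.
refs = [
    "DANTO A, 1981, TRANSFIGURATION COMM",
    "Danto A.C., 1981, The Transfiguration of the Commonplace",
    "DANTO AC, 1981, TRANSFIGURATION OF THE COMMONPLACE",
]

STOP = {"THE", "A", "OF"}

def key(ref):
    # Crude matching key: surname + year + first substantive title word.
    parts = [p.strip() for p in re.sub(r"[^\w,]", " ", ref).upper().split(",")]
    surname = parts[0].split()[0]
    year = next((p for p in parts if p.isdigit()), "????")
    words = [w for w in parts[-1].split() if w not in STOP]
    return f"{surname}-{year}-{words[0] if words else ''}"

print(Counter(key(r) for r in refs))  # the three variants collapse into one
```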
The number of single-authored articles has, as such, remained rather stable, at around 140,000 per year, during those 30 years, but the number of multi-authored papers has exploded at the same time, from 440,000 to 1.3 million. For aesthetics, it might also be interesting to analyze pictures and sounds, but these text-mining tools cannot handle them; they are completely language-based. There are computational tools in domains other than text mining that can be used to analyze pictures and sounds, but space does not allow us to present them here.[17]

Another issue, related to the visual communication of text-mining results, is that many of the available tools simply provide certain standard visualization options without much explanation of why they are the way they are. Studies in information graphics, however, have shown again and again that there are no neutral ways of visualizing data, that different choices of colors, columns, links, lines, arrows, and other visual means lead to completely different understandings of the questions addressed, and that there are numerous alternatives that could be developed.[18] This is why the visual options provided should be explicated in detail, which is not always the case. In the context of aesthetics, the aesthetic quality of visual presentations would of course also be a theme worth explicating, but in this article we simply wanted to give examples of the means available, not to take a stand on their aesthetic worth. Figures 3 and 4, for example, would probably benefit from better graphic design, both aesthetically and otherwise. All in all, data visualization is a promising option for aesthetics as well, but it must be developed much further than the level exemplified in this article.

Yet another kind of problem is that WoS and other academic databases are not free; only affiliated academics have easy access to them. This is not an open and democratic situation. Moreover, there are license restrictions even for licensed users: systematic downloading of bibliometric data or full texts in large quantities is not allowed. Some journals have also used a secured PDF format in some years, so that it is practically impossible to make full-text analyses of those materials. At best, one has to ask for special permission for that.

The most difficult nut to crack is deciding what should count as data for making sense of aesthetics at large. Which sources should be included? WoS clearly does not cover everything, even though it is a very big data set, and neither do other databases. Moreover, even when aesthetic issues are dealt with in the sources analyzed, in WoS or elsewhere, the word "aesthetics" is not always used. How can we find such cases, then? What are the best search terms, and what do they actually bring up? The word "art," for example, can lead us to sociological and economic studies of the arts, as well as to essays on the "art of war" – such sources potentially being irrelevant to aesthetics. In addition, in the case of BJA, can we really trust that everything in it represents aesthetics? This, in particular, requires philosophical clarity: how do we interpret terms, concepts, and categories, as well as their limits, borders, and changes?
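One crude way to probe what a search term "actually brings up" is to sample its hits and grow an exclusion list as noise is discovered. The titles and phrases below are invented; the exclusion-list approach is only a heuristic and cannot replace the philosophical judgment just described.

```python
# Invented titles; a real probe would sample hits from the database.
titles = [
    "The art of war in early modern Europe",
    "Art and aesthetic experience",
    "State of the art methods in acoustics",
]

EXCLUDE = ("art of war", "state of the art")  # grows as noise is found

hits = [t for t in titles
        if "art" in t.lower()
        and not any(phrase in t.lower() for phrase in EXCLUDE)]
print(hits)  # ['Art and aesthetic experience']
```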
It is far from self-evident which expressions can refer to the field of "aesthetic issues" and how – something that should have been clear at least since Frank Sibley's classical analyses of aesthetic concepts. That is exactly why there is no automatic and simple way of using and analyzing databases; search processes must be combinations of advanced computational methods and a deep philosophical understanding of the field in question. Answers will eventually get better as we become more experienced. In the end, we will end up discussing the ontology of aesthetics: how does it exist? As books and articles, for sure. But also in other ways? Does it have non-linguistic manifestations, and how can we detect them? At least they do not exist in databases such as WoS, which leads us to say something about other possibilities related to computational approaches.

FURTHER POSSIBILITIES

Standard academic databases are limited in many ways, as we saw. Another option for making sense of academic aesthetics is to use online resources. Space does not allow us to explore this in more detail here, but the options available include Google Scholar and the Google Books Ngram Viewer, as well as Wikipedia and its categories, which are gradually being formed by its users.[19] In so-called altmetrics, all in all, the goal is to find alternative metrics for understanding academic activities.[20] Altmetrics is a subset of scientometrics, and it denotes "the study and use of scholarly impact measures based on activity in online tools and environments."[21] Although traditional scientometrics is heavily focused on citations for recording the impact of academic research, the remarkable rise of social media has opened several new channels for tracking impact.[22] Altmetrics is an interesting development currently taking its early steps, as it illuminates the impact of scholarly studies on the general public rather than just the academic community.[23] These metrics can be categorized in five general classes, listed in increasing order of importance: viewed, downloaded/saved, discussed, recommended, and cited.[24] Altmetrics utilizes, for example, microblogs, online reference managers such as Mendeley, blogs, social networking platforms, repositories like GitHub, domain-specific data from arXiv, access measures on publishers' sites like PLoS, and user ratings on books, for example from Goodreads.[25] Although alternative metrics currently constitute one of the most popular research topics in scientometrics,[26] they also have some problems, as listed by John Mingers and Loet Leydesdorff: "1) Altmetrics can be gamed by 'buying' likes or tweets; 2) there is little by way of theory about how and why altmetrics are generated (this is also true of traditional citations); 3) a high score may not mean that the paper is especially good, just on a controversial or fashionable topic; and 4) because social media is relatively new it will under-represent older papers."[27]

If we operate in altmetrics, we have to ask – again – which "hits" are actually cases of aesthetics, which are only somehow (loosely) related, and which are something else. We have to consider our search principles very carefully when navigating the whole open internet. What kinds of terms will bring up relevant data?
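As for the altmetric signals themselves, the five classes listed above can be turned into a simple per-publication profile. The event counts below are invented, and no standard weighting of the classes exists, so the weights are placeholders that merely render "increasing order of importance" numerically.

```python
# Invented event counts for one publication.
classes = ["viewed", "downloaded/saved", "discussed", "recommended", "cited"]
events = {"viewed": 950, "downloaded/saved": 120, "discussed": 34,
          "recommended": 8, "cited": 5}

# Increasing importance rendered as increasing placeholder weights 1..5.
weights = {c: i + 1 for i, c in enumerate(classes)}
profile = {c: events[c] * weights[c] for c in classes}
print(profile, "total:", sum(profile.values()))
```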
Are we looking for philosophical texts that are close to academic aesthetics but for some reason excluded from traditional academic publications, such as blog posts on Tom Leddy's Aesthetics Today (aestheticstoday.blogspot.com) and video presentations, or perhaps tweets? Do we include pictures? Artists' activities? Or networks and groups of things rather than individual cases? Whatever we are looking for, we need suitable tools, and in the present situation the tools are more and more often computational. At the moment, there are no dominant, well-established tools in altmetrics, but a buzz of competing and developing ones. Still, aestheticians should follow what happens in that area. New tools appear all the time, and one can find several articles that review their features.[28]

It is possible that the computational digital world is changing our way of seeing what counts as a "work," "piece," or "case" of aesthetics. We are not necessarily focusing on clear-cut cases, objects, events, or authors, but on relational networks or "clouds" of phenomena, even if this might not be so evident to us. The situation is probably more or less parallel to the one that David Joselit describes in his book After Art, which addresses the situation of contemporary visual arts and architecture. According to him, it is not easy or even possible to see a clear difference between original artworks and all kinds of digital derivatives of and references to them; the internet and the dominant search procedures guide us to see the network they form together. In his words: "As I have argued, what now matters is not the production of new content but its retrieval, in intelligible patterns through acts of reframing, capturing, reiterating, and documenting. What counts, in other words, is how widely and easily images connect: not only to messages, but to other social currencies like capital, real estate, politics, and so on."[29] Likewise in aesthetics, there might be more or less clear cases, related ones, derivatives, and so on; and what may count on many occasions is how they interact and form bigger, ever-changing wholes. In such wholes, some nodes tend to attract more attention than others. However, on the internet, even such nodes are not single, clear-cut objects but relational networks within larger networks supporting them. It is not a single article or book that becomes visible alone, but everything that is attached to it in the digital network or cloud. By this logic, the article or author that attracts the most connections (references) easily seems to be the most important. And in fact, such articles and authors often are very important, because connections and relations are based on the fact that readers or other users find them useful and want to tell others about them.[30]

One aspect of this situation is the importance of searchability, that is, how easily something can be searched for and found on the digital net. Computational tools can only search and find objects and relations that are "visible" to them, which, again, is defined by the algorithms they are programmed to follow. Often, such tools do not find single, clear-cut cases, even ones that are very interesting and important in other ways. Very often, too, users of such tools do not really have to understand in detail how the tools function. We can use them without knowing exactly what they do and do not do.
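The "most connected seems most important" logic described above is, at bottom, in-degree counting: the intuition behind citation counts and, with recursion added, behind algorithms such as PageRank. A toy version, with an invented link structure standing in for references, hyperlinks, or shares:

```python
from collections import Counter

# Invented (source, target) links between items in the network.
links = [("b", "a"), ("c", "a"), ("d", "a"), ("d", "b"), ("e", "b"), ("e", "c")]

indegree = Counter(target for _, target in links)
print(indegree.most_common())  # [('a', 3), ('b', 2), ('c', 1)]
```

Note how the count says nothing about why the connections exist, which is precisely the caution raised in the surrounding discussion.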
This, however, must make us extra careful when assessing what they actually find and show us, and why. Strong visibility in such searches does not necessarily mean that a scholar or a book is automatically better, more valuable, or more important than something that has a lower "searchability rate" (in this particular data set) and that has not yet been found. The value of scholars and publications is something we still have to evaluate through more complex, peer-review processes, too.

We could also leave the academic world behind and try to see what aesthetics is elsewhere. Then we have an even more complex field to navigate. The simple test of googling "aesthetics" and comparing the image search with the text search shows that the former relates "aesthetics" to beautiful (white) women and body-builder men, the latter to philosophical definitions of the term "aesthetics," among literally millions of other things. How are these two interrelated? In any case, non-academic cases of aesthetics, on the internet and elsewhere, by far outnumber anything academic aestheticians can ever even imagine producing. Aesthetic values and issues are actively noticed and dealt with by various actors and in numerous ways, and academic, philosophical approaches are a tiny minority in the broad field. The top 10 actors in academic aesthetics found in WoS are unknown to the wider public. It is healthy to remember this. This theme, of course, would require a study of its own.

CONCLUSIONS

When one nowadays wishes to understand one's own discipline, aesthetics or otherwise, it is wise to make use of the latest computational tools and combine them with a more traditional understanding of the field. There are several tools available, and the best, most comprehensive results will be achieved if one does not settle for one or two, but compares several points of view with each other. All of these provide a slightly different picture of aesthetics. This, in itself, is an interesting result and worth presenting to students and readers of introductory books, for example. And it becomes even more interesting when one tries to argue which of them are more accurate, and which less. Why am I for some of them? If the field is this big, why do I tend to focus on some of its parts? The full picture can never be achieved, but making use of computational tools is one current route that we simply must follow,[31] even if there are many problems to solve. They will not substitute for philosophical analyses, but will complement them and actually make them even more necessary. Computational approaches also force us to consider the present ontological status of the field. Where and how does it exist? A short answer is, we think, that aesthetics is a social information network that is constantly growing and changing. What this means, in more detail, must be answered in another article.[32]

Notes

1. Peder Olesen Larsen and Markus von Ins, "The Rate of Growth in Scientific Publication and the Decline in Coverage Provided by Science Citation Index," Scientometrics 84 (2010): 573–603.
2. Lutz Bornmann and Rüdiger Mutz, "Growth Rates of Modern Science: A Bibliometric Analysis Based on the Number of Publications and Cited References," Journal of the Association for Information Science and Technology 66 (2015): 2215–22.
3. However, aesthetics and computational approaches have been combined in other ways.
For example, computational methods have been used for analyzing and even creating art works and other aesthetically interesting objects, and aesthetic features of such computational procedures have also been studied. In both areas, the MIT Press and the journal Leonardo have been active for a long time. See, for example, Paul A. Fishwick, ed., Aesthetic Computing (Cambridge, MA: The MIT Press, 2008). A recent, more specific example is an article on using computer vision to find beauty in low-attention photos stored on Flickr: Rossano Schifanella, Miriam Redi, and Luca Aiello, "An Image is Worth More than a Thousand Favorites: Surfacing the Hidden Beauty of Flickr Pictures" (Proceedings of the Ninth International AAAI Conference on Web and Social Media ICWSM, Oxford, UK, May 26–29, 2015).
4. A many-sided description of the fast-growing field of the digital humanities is Debates in the Digital Humanities, both as a book edited by Matthew K. Gold (Minneapolis: University of Minnesota Press, 2012), a book series starting in 2016, and as an open-access online platform at http://dhdebates.gc.cuny.edu/about (accessed September 29, 2015). For a bibliometric review, see Loet Leydesdorff and Alkim Salah, "Maps on the Basis of the Arts & Humanities Citation Index: The Journals Leonardo and Art Journal versus 'digital humanities' as a topic," Journal of the American Society for Information Science and Technology 61 (2010): 787–801.
5. Alan Porter and Scott Cunningham, Tech Mining: Exploiting New Technologies for Competitive Advantage (Hoboken, NJ: Wiley, 2005), 357.
6. Eugene Garfield, "Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas," Science 122 (1955): 108–11.
7. Loet Leydesdorff, Björn Hammarfelt, and Alkim Salah, "The Structure of the Arts & Humanities Citation Index: A Mapping on the Basis of Aggregated Citations among 1,157 Journals," Journal of the American Society for Information Science and Technology 62 (2011): 2414–26; and Björn Hammarfelt, "Using Altmetrics for Assessing Research Impact in the Humanities," Scientometrics 101 (2014): 1419–30.
8. More information on the software is provided on their websites: VantagePoint http://www.thevantagepoint.com, VOSviewer http://www.vosviewer.com and Leximancer http://www.leximancer.com. See also Porter and Cunningham, Tech Mining; Nees Jan Van Eck and Ludo Waltman, "Visualizing Bibliometric Networks," in Measuring Scholarly Impact: Methods and Practice, eds. Ying Ding, Ronald Rousseau, and Dietmar Wolfram (Cham: Springer, 2014), 285–320; and David Thomas, "Searching for Significance in Unstructured Data: Text Mining with Leximancer," European Educational Research Journal 13 (2014): 235–56.
9. The categorization of research areas into scientific domains is done based on information provided by WoS at https://images.webofknowledge.com/WOKRS511B5/help/WOS/hp_research_areas_easca.html (accessed June 23, 2015).
10. The list includes only one female author, Suzanne Langer. This gender imbalance and its reasons would deserve a study of its own.
11. A Multi-Dimensional Scaling algorithm proprietary to VantagePoint determines the location of each author on the map. The x- and y-axes of the maps have no specific meaning. The algorithm simply tries to reduce an N-dimensional representation to two dimensions, seeking to maintain authors with a high degree of similarity (correlation) in close proximity to each other.
Generally speaking, authors who are close to each other are more similar than those that are farther apart. However, the presence or absence of a line (and the thickness of the line) between any two authors is a more appropriate measure of proximity, since it implies a relatively high correlation between them.
12. Note that this term map and subsequent VOSviewer maps are constructed with the larger search data set of 21,919 texts, including articles and all other publication types from WoS A&HCI. This is because the refining of publication types took place only in VantagePoint after the raw data was downloaded from WoS. With VOSviewer, it is not possible to refine the raw data like that. A threshold of 25 was used when constructing the map for Figure 5. This means that terms that appear in at least 25 titles or abstracts are included in the co-occurrence map, and only some of the approx. 950 qualified terms are visible, to avoid clutter.
13. Eck and Waltman, "Visualizing Bibliometric Networks."
14. On NLP, see Christopher D. Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing (Cambridge, MA: The MIT Press, 1999).
15. Thomas, "Searching for Significance in Unstructured Data."
16. Christopher King, "Single-Author Papers: A Waning Share of Output, But Still Providing the Tools for Progress," ScienceWatch, Thomson Reuters, September 2013, http://sciencewatch.com/articles/single-author-papers-waning-share-output-still-providing-tools-progress (accessed August 17, 2015).
17. See, for example, Schifanella, Redi, and Aiello, "An Image is Worth More than a Thousand Favorites." See also the software Culture Cam at culturecam.eu.
18. For example, Katy Börner, Atlas of Science: Visualizing What We Know (Cambridge, MA: The MIT Press, 2010); Sandra Rendgen et al., Information Graphics (Cologne: Taschen, 2012); Edward R. Tufte, The Visual Display of Quantitative Information, 2nd ed. (Cheshire, CT: Graphics Press, 2001).
19. Anne-Wil Harzing, "A Preliminary Test of Google Scholar as a Source for Citation Data: A Longitudinal Study of Nobel Prize Winners," Scientometrics 94 (2013): 1057–75; and Anne-Wil Harzing, Publish or Perish, 2007, software available at http://www.harzing.com/pop.htm (accessed September 29, 2015).
20. Lutz Bornmann, "Alternative Metrics in Scientometrics: A Meta-Analysis of Research into Three Altmetrics," Scientometrics 103 (2015): 1123–44.
21. Jason Priem, "Altmetrics," in Beyond Bibliometrics: Harnessing Multidimensional Indicators of Scholarly Impact, eds. Blaise Cronin and Cassidy R. Sugimoto (London: MIT Press, 2014), 263–88, specifically on 266.
22. John Mingers and Loet Leydesdorff, "A Review of Theory and Practice in Scientometrics," European Journal of Operational Research 246 (2015): 1–19.
23. Ibid.
24. Jennifer Lin and Martin Fenner, "Altmetrics in Evolution: Defining and Redefining the Ontology of Article-Level Metrics," Information Standards Quarterly 25 (2013): 21–26.
25. Bornmann, "Alternative Metrics in Scientometrics."
26. Ibid.
27. Mingers and Leydesdorff, "A Review of Theory and Practice in Scientometrics," 15.
28. Manuel Jesus Cobo et al., "Science Mapping Software Tools: Review, Analysis and Cooperative Study among Tools," Journal of the American Society for Information Science and Technology 62 (2011): 1382–402; YunYun Yang et al., "Text Mining and Visualization Tools – Impressions of Emerging Capabilities," World Patent Information 30 (2008): 280–93; Stefanie Haustein et al., "Coverage and Adoption of Altmetrics Sources in the Bibliometric Community," Scientometrics 101 (2014): 1145–63; and Katrin Weller, "Social Media and Altmetrics: An Overview of Current Alternative Approaches to Measuring Scholarly Impact," in Incentives and Performance: Governance of Knowledge-Intensive Organizations, eds. Isabell M. Welpe et al. (Cham: Springer, 2015), 261–76.
29. David Joselit, After Art (Princeton, NJ: Princeton University Press, 2013), 55–6.
30. More generally speaking, Joselit's analysis is related to various network points of view that have been developed in philosophy and the social sciences over the last decades by theorists as various as Michel Foucault, Gilles Deleuze, Bruno Latour, Luc Boltanski, Manuel Castells, Duncan Watts, and Mark Granovetter, among many others.
31. Ivan Zupic and Tomaz Cater, "Bibliometric Methods in Management and Organization," Organizational Research Methods 18 (2015): 429–72.
32. Casey Haskins offers one interpretation of this network in his article "Aesthetics as an Intellectual Network," The Journal of Aesthetics and Art Criticism 69 (2011): 297–308.

work_2wrmmvxfdbggtgz4fx6f2c2sjm ---- Artificial imagination, imagine: new developments in digital scholarly editing

EDITORIAL
Artificial imagination, imagine: new developments in digital scholarly editing

Dirk Van Hulle, Department of Literature, University of Antwerp, Antwerp, Belgium (dirk.vanhulle@uantwerpen.be)
International Journal of Digital Humanities (2019) 1:137–140. https://doi.org/10.1007/s42803-019-00020-w
Published online: 25 April 2019. © Springer Nature Switzerland AG 2019

This special issue on Digital Scholarly Editing introduces several new developments, both in terms of the theory and practice of textual scholarship, that are taking shape in the digital medium. But instead of enumerating or summarizing them, this introduction is meant as a reflection on some of the possible ways in which our work in digital scholarly editing could be useful to the broader field of digital humanities. After all, we tend to work on a microscale compared to the forms of macroanalysis and 'distant' reading that are currently dominant in digital literary studies. This raises the question: how can our research be relevant to these other sub-disciplines in digital humanities?

The title of this introduction is inspired by a short text by Samuel Beckett, called Imagination Dead Imagine, which can be read as a literary investigation into the workings of the human imagination. At first sight, the link with scholarly editing may seem far-fetched, but what many literary editorial projects have in common is a fascination with the creative process. After all, the human mind is at the core of 'humanities' research and scholarship or Geisteswissenschaft. Within the framework of this bigger picture, the issues discussed in this volume present us not only with challenges but also with opportunities. If editions are "machines of knowledge" (see the contribution by Susan Schreibman and Costas Papadopoulos), and if they are "machines of simulation" (McGann 2014, 124; qtd. in the contribution by Julia Flanders, Ray Siemens et al.), this knowledge and simulation might be combined to simulate not just a product, such as a handwritten document – as in a digital facsimile (see the contribution by Mats Dahlström) – but also a process, such as the creative and imaginative process of a literary work.

Developments in AI and computational linguistics enable us to create writing bots that facilitate a writer's creative process, as a recent experiment in collaboration with the Dutch writer Ronald Giphart illustrates (see Manjavacas et al. 2017).[1] Giphart wrote a story, making use of 'Asibot', a writing bot that offered him the possibility to continue a sentence (adding a passage of ca. 100 characters) at any time in any one of eight different styles. These styles were based on the works of a few Dutch and Flemish writers (such as Gerard Reve and Kristien Hemmerechts), on the Dutch translations of Isaac Asimov's writings and, most interestingly, on the published works of Ronald Giphart himself. The programme also included keystroke logging, which enabled us to trace the entire creative process. Making use of this software, Giphart started writing, and in the middle of his first sentence he activated the style of Gerard Reve: the bot offered a syntactically correct continuation consisting of 100 characters, after which Giphart finished the sentence. In the middle of the second sentence he activated his 'own' style, asking the bot to suggest a 100-character continuation in the Giphart style.
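Asibot's actual architecture is described in Manjavacas et al. (2017); as a deliberately simple stand-in, a character-level Markov model trained on an author's texts already captures the basic mechanics of style-conditioned continuation. The tiny corpus below is invented, and a toy like this only recombines observed material, which is exactly the limitation discussed next.

```python
import random
from collections import defaultdict

def train(text, order=4):
    # Map every 4-character context to the characters observed after it.
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def suggest(model, seed, n=100, order=4):
    # Propose an n-character continuation, akin to the bot's suggestions.
    out = seed
    for _ in range(n):
        followers = model.get(out[-order:])
        if not followers:
            break
        out += random.choice(followers)
    return out

# Invented placeholder; a real run would train on the author's works.
style_corpus = "the sea was grey and the sky was grey and the rain fell on the grey sea " * 3
model = train(style_corpus)
print(suggest(model, "the sky "))
```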
This is not the place to analyse the result of this particular writing experiment, but the point is that this basic form of artificial imagination is a form of imitatio. The bot is very good at imitating or simulating the style of a particular writer, based on his or her texts published so far. So long as the bot only offers recombinations of words used in the texts that have already been published, it serves merely as a supplier of words, comparable to, say, the notes in Joyce's Finnegans Wake notebooks, filled with verbal pillage plundered from hundreds of source texts. But what if we can teach the bot that writing is actually to a large extent re-writing and revising: not just a recombining of words the writer has already used in previous works, but a complex dialectic of composition and decomposition, writing, undoing and rephrasing? If the bot were able to simulate this process, it would be able not merely to imitate but to emulate the writer. This aemulatio would be a step in the direction of artificial imagination.

To start making this step, we need training data, which is what we are to some extent already producing in digital scholarly editions today. All the encoded transcriptions of a digital edition such as the Charles Harpur Critical Archive (discussed by Desmond Schmidt and Paul Eggert in this issue), the Charles Chesnutt Digital Archive (discussed by Stephanie P. Browner and Kenneth M. Price) or the Beckett Digital Manuscript Project (www.beckettarchive.org) – including all the tagged deletions, additions and substitutions – can be used as data to train the algorithm to simulate a particular writer's creative process. By means of state-of-the-art techniques of natural language processing and sentiment analysis it should be possible to analyse and visualise the differences in tone between versions of the same work, thanks to the detailed transcriptions of each separate version. These encoded transcriptions already contain quite a bit of information that can help us detect patterns of textual change. Thus, the encoded transcripts provide information on deletions and substitutions. This kind of information can be modelled and charted in plots showing the percentages of added, deleted, modified and unchanged words. The result can serve as a tool to detect patterns in terms of an author's poetics. In the case of a writer such as Samuel Beckett, the overall pattern (www.beckettarchive.org/statistics) corresponds with the author's self-proclaimed poetics of "less is more". The statistics indeed show that Beckett cut more than he added, with a relatively stable ratio of one added word for every three deleted words on average – both on the level of the separate work and on the level of the oeuvre as a whole (Beckett 2018). We may dismiss this kind of indirect reading as merely confirming what we already knew, or as too "unambitious"[2] in scope to be relevant. But we can also see it as a move, no matter how modest, in the direction of more complex (macro)analyses.

[1] The experiment was a collaboration between the Meertens Institute (KNAW, Amsterdam) and ACDC (the Antwerp Centre for Digital humanities and literary Criticism, University of Antwerp), involving Folgert Karsdorp, Mike Kestemont, Dirk Van Hulle, Enrique Manjavacas, Benjamin Burtenshaw, Vincent Neyt and Wouter Haverals.
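The revision statistics described above fall out of the tagged transcriptions almost mechanically. The fragment below is fabricated and simplified (TEI namespaces omitted; real project transcriptions are far richer), with the deletions deliberately chosen to echo the one-to-three ratio just mentioned; the point is only how <del> and <add> tags yield countable data.

```python
import xml.etree.ElementTree as ET

# Fabricated TEI-style fragment with tagged deletions and additions.
tei = ("<p>imagine <del>all this</del> dead imagine <add>again</add> "
       "<del>softly</del> a place alone</p>")

root = ET.fromstring(tei)
added = sum(len(el.text.split()) for el in root.iter("add") if el.text)
deleted = sum(len(el.text.split()) for el in root.iter("del") if el.text)

print(f"added: {added}, deleted: {deleted}, "
      f"ratio 1:{deleted / added:.0f} added to deleted")
```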
If combined with other techniques such as part-of-speech tagging, sentiment analysis and computational semantics, experiments such as these and the ones discussed in the present volume do suggest new ways in which digital scholarly editing can contribute to forms of distant reading. So far, distant reading is usually applied to one version of a text. What digital scholarly editing can offer is a way to enable distant reading across versions, which would be a necessary step in the development of artificial imagination in our discipline. Such a panoramic form of genetic reading enables readers to examine not only a work in progress, but also an oeuvre in progress, and, as more and more digital genetic editions become available, possibly even literary periods in progress, including macroanalyses across versions.

As any scholarly editor knows, literary imagination is not only a matter of individual mental power, but often an interaction between an intelligent agent and his or her material and cultural environment. This includes a writer's interaction with her library, with her editor, with her friends, with social media, with her laptop, with old files, with websites, with her own earlier drafts and with the physical space of notebooks. Scholarly editors are in an optimal position to analyse especially the creative potential of the interaction with what in writing studies is called the 'text produced so far' (TPSF). If we manage to find suitable ways to digitally map this interaction, digital scholarly editions may serve as valuable sources of information providing training data for research into artificial imagination.

This special issue of Digital Scholar, therefore, investigates the state of the art in digital scholarly editing by raising questions such as: How should we frame concepts such as 'copy' and 'facsimile' in the age of digital reproduction? How does a digital scholarly edition differ from print editions? Can we further develop the notion of the hybrid edition? How do we conceive of the scholarly edition in 3D? Can we combine the notions of a digital archive and a digital edition? How can we reappraise textual collation in a digital paradigm? How do we adjust or rethink editorial theory to cope with born-digital works of literature?

According to Matthew Jockers, "we have reached a tipping point, an event horizon where enough text and literature have been encoded to both allow and, indeed, force us to ask an entirely new set of questions about literature and the literary record" (2013: 4). I believe that in digital scholarly editing we may not have reached that tipping point yet, and that it may still take a while before panoramic reading of entire periods in progress and macro-analyses across versions will be operative. But this is precisely why we need to keep investing in the genetic microanalyses of drafts, typescripts and other versions, marking up variants as a necessary step in the direction of artificial imagination and new ways of macroanalysis applied to more than one version.

[2] According to Franco Moretti, 'the ambition is now directly proportional to the distance from the text: the more ambitious the project, the greater must the distance be' (Moretti 2013, 48).

References

Beckett, S. (2018). In D. van Hulle, S. Weller, & V. Neyt (Eds.), Fin de partie / Endgame: A digital genetic edition. Brussels: University Press Antwerp.
Retrieved from www.beckettarchive.org. Accessed 26 June 2018.
Jockers, M. L. (2013). Macroanalysis: Digital methods and literary history. Urbana/Chicago/Springfield: University of Illinois Press.
Manjavacas, E., Karsdorp, F., Burtenshaw, B., & Kestemont, M. (2017). Synthetic literature: Writing science fiction in a co-creative process. Proceedings of the Workshop on Computational Creativity in Natural Language Generation (CC-NLG 2017), Santiago de Compostela, 4 September 2017. Association for Computational Linguistics, pp. 29–37. Retrieved from http://aclweb.org/anthology/W17-3904. Accessed 26 June 2018.
McGann, J. (2014). A new republic of letters: Memory and scholarship in the age of digital reproduction. Cambridge: Harvard University Press.
Moretti, F. (2013). Distant reading. London/New York: Verso.

work_2x7eubduorbqbdrhgmyblfqr4y ---- GameSound, Quantitative Games Analysis, and the Digital Humanities

RESEARCH

GameSound, Quantitative Games Analysis, and the Digital Humanities

Michael Iantorno, Concordia University, CA (michael.iantorno@gmail.com)
How to cite: Iantorno, Michael. 2020. "GameSound, Quantitative Games Analysis, and the Digital Humanities." Digital Studies/Le champ numérique 10(1): 2, pp. 1–17. DOI: https://doi.org/10.16995/dscn.319. Published: 23 January 2020.

This article relates to the 2018 CSDH/SCHN conference proceedings. This paper outlines Michael Iantorno's and Melissa Mony's experiences with quantitative game analysis by summarizing the first year of development of the prototype ludomusicological database GameSound. To further the discussion, this article also summarizes and analyzes the work of fellow digital humanities scholar Jason Bradshaw, who applied intriguing types of tool-based analysis to BioShock Infinite. To conclude, the paper hypothesizes where this type of research could lead in the future: both for GameSound and for other projects using similar methods and methodologies.

Keywords: game studies; ludomusicology; digital humanities; quantitative research; databases

This article presents Michael Iantorno's and Melissa Mony's experiences with quantitative game analysis, summarizing the first year of development of the prototype ludomusicological database GameSound. To further the discussion, this article also summarizes and analyzes the work of Jason Bradshaw and of Dr.
Adrienne Shaw, who employ intriguing types of quantitative and qualitative game analysis in their respective projects, "BioShock Infinite and Feminist Theory: A Technical Approach" and The LGBTQ Video Game Archive. To conclude, this article offers a hypothesis concerning the future of this kind of research: not only for GameSound but also for other projects that make use of similar methods and methodologies.

1. Introduction

For digital humanities scholars, breaking down a videogame into its component parts may seem like an obvious strategy for better understanding it. Many of these researchers are players themselves, after all, and the act of play implicitly invites this sort of systematic deconstruction. As a player engages with a videogame, they accumulate rules knowledge, acquire in-game resources, and develop a tacit understanding of what the game asks of them, slowly inching toward a deepened comprehension of the entire play experience. Only through this accumulation of resources and expertise can they improve their performance within a videogame, eventually completing objectives and perhaps even beating a title in its entirety.

From an academic perspective, this deconstructive learning process serves a purpose other than game mastery. Instead of forging a better understanding of a game through play (as intended by the developer), technically-minded scholars can separate a videogame into its component parts, which can then be arranged and rearranged to facilitate the needs of their research. This process can be as simple as using save files or cheat codes to access parts of a game strategically, a process that often manifests as targeted or repeated playthroughs, or can involve delving into a game's code and assets in an attempt to peek behind the curtains—scrutinizing the numerous individual elements that make up a videogame title.

It is this latter form of analysis that we are interested in exploring within this paper. By accessing game assets directly, rather than solely through play, we believe that scholars have the opportunity to analyze videogames in new ways—not just as a series of set pieces or vignettes pre-determined by the original developer. When broken into parts and viewed as a collection of diverse assets, rather than a homogenous whole, new angles of research become possible through the adoption of both established and emerging forms of quantitative analysis.

This paper discusses one potential avenue for this type of videogame analysis by documenting the creation, functionality, and potential applications of GameSound, a digital humanities database project developed between 2017 and 2018.[1] We developed GameSound as a prototype ludomusicological database, with the intent of providing users with easy access to the music and sound effects present within videogames. By making these audio files accessible through a web-based interface, and supplementing them with technical and contextual data, our hope is that GameSound could be used to facilitate new types of academic research.
[1] GameSound was developed by Michael Iantorno and Melissa Mony. Although Melissa did not contribute to the authorship of this article, she shared equal research responsibilities and provided much of the musicological expertise required to bring the project to fruition.

Using Civilization IV as a case study and HEURIST to build the database, GameSound currently provides access to over 2000 music and sound effect files. This paper begins with a brief overview of quantitative game analysis, both as it is defined for this particular digital humanities project and how it has been used in other projects, specifically Jason Bradshaw's "BioShock Infinite and Feminist Theory: A Technical Approach." We then discuss the functionality of the database itself, while also outlining the technical, legal, and theoretical challenges that arose while designing it. Finally, we hypothesize where this research could lead in the future: both for GameSound and other similar projects.

However, we would be remiss to begin these discussions without first touching on the academic field that stands behind GameSound: ludomusicology. An emerging sub-discipline of musicology, ludomusicology focuses on the academic study of the audio present in videogames. Primarily concerned with the direct study of a videogame's music and sound effects, ludomusicology also interrogates how we study audio within the context of digital software. Since the neologism was coined in 2007, the field has expanded to include music games, fan cultural music practices, live concerts, and the impact that game music has had on other musical genres (Dudley 2018). Ludomusicological scholars may discuss how the idea of diegesis is complicated by the interactivity of videogames (Kamp 2016), the impact of music games such as Taiko: Drum Master and Bloom (Kassabian and Jarman 2016), or the tension between art and entertainment that arises when videogames remix classical music (Gibbons 2016). Like many topics associated with the digital humanities, ludomusicology traverses disciplines: fostering collaborations with computer science, film and media studies, and communications. GameSound is our first foray into this type of research, originally conceived as a digital humanities class assignment at McGill University, and we believe the database has potential academic, professional, and hobbyist applications.

2. Quantitative videogame analysis

We describe GameSound as a tool for enabling new types of quantitative videogame analysis, but we also acknowledge that the term "quantitative" can be vague and requires further elaboration. In the context of GameSound, it reflects an approach that is comprehensive (collecting all of the audio in a given game), measurable (defining collected game audio using quantities and simple identifiers), and statistical (enabling the comparison of data using tables, charts, faceted searches, and visualizations). In contrast to qualitative videogame analysis methods, which commonly rely on written logs constructed through repeated playthroughs (Consalvo and Dutton 2006), quantitative analysis is much more tool-oriented and focuses on parameters that can be measured or counted. Although a somewhat new approach, quantitative videogame analysis can be facilitated by adopting existing digital humanities tools and established methods that have their origins in fields such as literature, history, and philosophy.
The practice of isolating and extracting specific words and phrases has enabled new types of distant reading in literature, for example, and many of the methods we use to analyze books can be co-opted for analyzing various aspects of a videogame, from text to media assets. Jason Bradshaw provided an excellent example of this type of analysis at the Congress of the Social Sciences and Humanities conference in 2018, with his presentation "BioShock Infinite and Feminist Theory: A Technical Approach." Bradshaw collaborated with fan communities to acquire a complete written script (featuring nearly 250,000 words) from the videogame BioShock: Infinite, which then served as the corpus for his project. Influenced by a close reading of the game by Catlyn Origitano, who analyzed the representation of female protagonist Elizabeth (Origitano 2015), Bradshaw wanted to demonstrate how distant reading could corroborate Origitano's qualitative analysis. He also had a strong desire to expand traditional distant readings of text to include videogames: "Why stop at the traditional textual mediums historically studied in the humanities? New types of digital analysis can also lend themselves to mediums born of the digital age" (Bradshaw 2018).
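The core of this kind of distant reading is counting terms and noting where they fall in a script. The scrap of dialogue below is a fabricated stand-in for Bradshaw's fan-assembled corpus, and the sketch is only an illustration of the operation, not a reconstruction of his Voyant workflow, which is described next.

```python
# Fabricated scrap of dialogue standing in for a full game script.
script = ("bring us the girl and wipe away the debt "
          "the child must be watched she is not your baby "
          "Elizabeth opened the door the girl is gone")

tokens = script.lower().split()
for term in ("girl", "child", "baby"):
    # Relative position approximates where in the "timeline" a term occurs.
    spots = [i / len(tokens) for i, t in enumerate(tokens) if t == term]
    print(f"{term}: {len(spots)} hits at relative positions "
          + ", ".join(f"{s:.2f}" for s in spots))
```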
Game studies scholars, who may be interested in tracking the prevalence and use of certain sound effects within a videogame (such as gunshots or dialogue snippets) could sift through the database using filters or a keyword search. Being able to point to the quantity of sound effects recorded in a game, as well as how they are triggered, could serve as a valuable point of analysis for speculating on a developer’s priorities when designing a game. II. As GameSound allows users to quickly survey the parameters of the audio found in a particular game, professional videogame developers may wish to use the database to better guide their own efforts. By sort- ing, counting, and listening to audio files by type (sound effects, music, etc), developers could use existing games as a blueprint for determining how many audio recordings they need for their own projects, how long each recording should be, and how many variations of the same sound may be required. III. Sound scholars, perhaps inspired by Jonathan Sterne’s The Death and Life of Digital Audio, could use GameSound to analyze how both data com- pression and audio compression have been applied to videogame sound. Sterne has discussed the effects of compression through his documenta- tion of the music industry’s loudness wars (Sterne 2006, 345), but little of this research has been carried over to the videogame industry. Although GameSound primarily focuses on quantitative research opportunities, an additional advantage of the database is its ability to expedite various types of listening exercises. GameSound allows scholars to listen directly to sound effects or music from a videogame, enabling approaches such as Michel Chion’s conception of reduced listening—a mode of listening that focuses on repetition and the removal of visual context (Chion 2012). This type of listening would be difficult, if not entirely impossible, through normal playthroughs where sounds cannot be divorced from their accompanying visuals, are usually layered with a multitude of other audio tracks, and can only be triggered when certain objectives or criteria are met. Iantorno: GameSound, Quantitative Games Analysis, and the Digital Humanities Art. 2, page 7 of 18 3. Constructing and using GameSound GameSound was created using HEURIST, a free platform for scholars in the digital humanities that enables online database construction, mixed-media assets, and dynamic data visualization. Like many digital humanities tools, HEURIST pursues a certain level of accessibility for its user base: HEURIST’s research-driven data management system puts the user in charge, allowing them to design, create, manage, analyse and publish their own richly-structured database(s) within hours, through a simple web interface, without the need for programmers or consultants. (Sydney University 2018) After some experimentation with other tools, we chose HEURIST for this project primarily due to its ease-of-use. While we may be studying a subject that is steeped in code, we are certainly not experienced computer engineers or web developers (nor did we want to place this expectation on our collaborators). Thus, it was necessary to find a database tool that would take care of most of the heavy lifting for us while providing a very shallow learning curve for additional participants. 
In a way, this embodies one of the key design sensibilities for GameSound: a desire to create a useful resource that, at the same time, is easily accessible to game scholars, ludomusicologists, and independent researchers.

The ultimate goal for the database is to include videogames from different platforms and eras, but the initial dataset focuses entirely on the sounds found within the 2005 computer game Civilization IV. Civilization IV was chosen for the prototype for three main reasons. First, it possesses an incredibly open programming architecture, in which the developers have enabled transparent access to the game's assets—making both data extraction and interpretation simpler than in comparable titles. Secondly, Civilization IV was the first computer game to be nominated for (and win) a Grammy, granting it a special place in the history of game studies while affirming a certain level of cultural significance. Finally, the title's availability across various platforms and marketplaces ensured that we could acquire the game without having to seek out additional hardware. In contrast, videogames that are exclusive to a specific era or console would have presented severe challenges in both acquisition and access. Super NES games, as an example, were released on proprietary cartridges and stored their music in heavily compressed formats that are difficult to access.

GameSound currently facilitates access to Civilization IV's audio in two ways: faceted searches (Figure 1) and reports (Figure 2). GameSound's faceted search functionality allows users to explore Civilization IV's audio files through a web browser, by activating and deactivating filters presented within a column on the left side of the screen. These filters range from technical parameters (such as file type and sample rate) to ludomusicological ones (such as IEZA classification and sound type). Additionally, users can search for a specific piece of audio by filename, or simply sift through all of the game's audio in a linear fashion by scrolling through the list-view. Clicking on a single entry in the list-view brings forth additional information about an audio file, such as file size and duration, and loads an audio player that plays back the selected sound. Some entries also contain a screenshot or video link, both of which document one of the many possible situations in which the sound can be triggered in-game.

Figure 1: A screenshot of GameSound's faceted search.

Through the use of HEURIST's back-end tools, GameSound users are able to export customized reports that contain entries determined by parameters that exist within the database. Seen in Figure 3 are two excerpts from an enormous report that contains every single GameSound entry that possesses both a screenshot and a video link. Reports are quite flexible in both content and layout—as this report is embedded on GameSound's homepage, it has been customized to mimic the font styles from the website's CSS files. Able to query any type of data that is present in the database, reports are a versatile way to share targeted sets of data with other researchers.
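The filter-and-report pattern described above can be emulated outside of HEURIST as well. In the following illustrative sketch, a faceted query is expressed as a set of field/value pairs and the matching entries are exported as a small CSV report; all records and field names are invented for the example.

```python
import csv

entries = [  # hypothetical GameSound-style records
    {"file": "war_drums.wav", "file_type": "wav", "ieza": "Affect", "screenshot": "yes"},
    {"file": "hut_open.wav", "file_type": "wav", "ieza": "Effect", "screenshot": "no"},
    {"file": "coast_waves.mp3", "file_type": "mp3", "ieza": "Zone", "screenshot": "yes"},
]

def facet_filter(records, **facets):
    """Keep records whose fields match every active facet."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]

report = facet_filter(entries, file_type="wav", screenshot="yes")

with open("report.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["file", "file_type", "ieza", "screenshot"])
    writer.writeheader()
    writer.writerows(report)
print(f"{len(report)} entries written to report.csv")
```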
In addition to the database itself, GameSound also exists as a web resource for those who are interested in ludomusicology and quantitative game analysis. The website currently hosts documentation outlining the project's methods and explaining how many aspects of the research were conceived. Similar to how the database provides transparent access to videogame audio, we hope that the website provides insight into how GameSound was developed.

Figure 2: A screenshot of a single database entry.

4. Challenges in designing the database

Ludomusicology can be a complicated endeavour for researchers, scholars, and digital humanists. Conventional qualitative methodologies prevalent in ludomusicology, such as analytical play (Summers and Hannigan 2016, 52), can lack the capacity to adequately access and isolate audio or investigate the role of interactivity—the layers "between the operations of a machine and the instructions given to it by an operator" (Burdick 2012, 53). Commercial soundtracks and other official releases of game audio can be somewhat unreliable sources, as composers may alter the recordings from their original presentation (while completely divorcing them from their in-game context). Thus, one of the key challenges in creating a ludomusicological database is gaining access to videogame audio directly without dismissing its role within gameplay. This challenge is exacerbated by the opacity of videogame file structures—much of a game's audio assets are obfuscated through layers of code, file compression, and technical protection measures. There is a notable lack of academic tools that can penetrate these layers, and most of the existing software used to extract audio and game code falls squarely into the realm of modding or hobbyism. Thus, researchers must seek out games with open file structures (as we did with Civilization IV) or use independently developed tools that have very little documentation or support.

Figure 3: Two entries from a custom GameSound report.

In addition to the technical challenges of acquiring audio and determining its purpose within the framework of the game, one of the biggest hurdles faced when designing GameSound's initial prototype was deciding which data types we would include in the database. As GameSound was not created with a specific research project in mind—focusing more on potential applications—much of the initial data gathering was speculative in nature. As a result, we experimented with an extremely broad range of technical and ludomusicological data throughout the database's development. This, admittedly, may have led to some arbitrary decisions regarding data types, but gave us permission to contemplate an enormous variety of potential applications for the database. Code snippets and written descriptors were both strongly considered during the prototyping process before we whittled down the selection to its current state, which focuses more on measurable technical parameters, game media, and ludomusicological identifiers. These data types were selected for practical reasons: much of their extraction could be automated, and they were perceived as useful for digital humanities scholars.
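As noted above, much of the technical metadata can be extracted automatically. The sketch below uses only Python's standard library to read basic parameters from uncompressed .wav files; the folder name is hypothetical, and compressed or proprietary formats would require third-party decoders.

```python
import wave
from pathlib import Path

def wav_metadata(path):
    """Read basic technical parameters from an uncompressed WAV file."""
    with wave.open(str(path), "rb") as w:
        frames, rate = w.getnframes(), w.getframerate()
        return {
            "file": path.name,
            "sample_rate_hz": rate,
            "channels": w.getnchannels(),
            "bit_depth": w.getsampwidth() * 8,
            "duration_s": round(frames / rate, 2),
        }

# "Assets/Sounds" is a hypothetical extraction folder.
for wav in sorted(Path("Assets/Sounds").glob("*.wav")):
    print(wav_metadata(wav))
```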
In an effort to better place GameSound within current ludomusicological discourses, instead of developing our own identification system we decided to adopt the IEZA framework—a two-dimensional method for describing sound in computer games. Designed by Sander Huiberts and Richard van Tol at the Utrecht School of the Arts, the IEZA framework (Figure 4) provides an effective vocabulary for audio classification. The vertical dimension in the framework makes a distinction between audio originating from inside the fictional game world (diegetic), such as the footsteps of a game character, and sound coming from outside the fictional game world (non-diegetic), such as the title's musical score. The horizontal dimension separates sounds that result from direct player action (activity), such as those triggered by clicking buttons within a game's interface, from sounds that are ambient (setting), such as atmospheric and music tracks. Four domains are formed across these two planes of comparison: Interface, Effect, Zone, and Affect (Huiberts and Van Tol 2008), with the authors providing some key examples of sounds that fall within these domains:

• Effect: sounds rooted in the game world that are triggered directly by player action, such as dialogue, footsteps, and gunshots.
• Zone: sounds rooted in the game world that are not directly triggered by player action, such as rain, wind, or city noise.
• Interface: sounds that exist outside the game's fictional setting that are triggered directly by player action, such as beeps and clicks emitted by a game's menu or HUD.
• Affect: sounds that exist outside the game's fictional setting that are not triggered directly by player action, such as a game's musical score or ominous drones in a horror game.

Figure 4: IEZA Framework (Sander Huiberts and Richard van Tol).

These categorizations help to establish the importance of interactivity in videogame audio—the simple fact that "the body cannot be removed from the experience of videogame play" (Collins 2013, 3). Players are not just the receivers of a sound signal—as with radio, television, and film—but also the transmitters. In-game actions may trigger dialogue, sound effects, music, and ambient sounds directly (by clicking a button in the interface) or indirectly (through timed events or algorithms). Thus, by adopting the IEZA classification system, we simultaneously acknowledge the uniqueness of game audio while also encapsulating it within a simple set of database parameters. When used in conjunction with the existing technical categories found within the database, these domains provide valuable context to the audio files found within GameSound.
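Because the two IEZA dimensions are binary, assigning a domain to a sound can be automated once those two attributes are recorded for each file. The following sketch is our own illustration of the framework's logic, not code by Huiberts and Van Tol:

```python
def ieza_domain(diegetic: bool, player_triggered: bool) -> str:
    """Map the two IEZA dimensions onto one of its four domains."""
    if diegetic:
        return "Effect" if player_triggered else "Zone"
    return "Interface" if player_triggered else "Affect"

# Examples drawn from Huiberts and Van Tol's descriptions:
print(ieza_domain(True, True))    # footsteps, gunshots -> Effect
print(ieza_domain(True, False))   # rain, city noise    -> Zone
print(ieza_domain(False, True))   # menu clicks         -> Interface
print(ieza_domain(False, False))  # musical score       -> Affect
```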
In addition to ludomusicological challenges and database design conundrums, Canadian copyright law poses two intriguing hurdles for GameSound. First, the music and sound effect files in the database are, naturally, the intellectual property of the original game developer and are used without explicit permission or licensing. Although there is a strong argument that GameSound's acquisition and use of these files falls under the educational aspects of fair dealing—"a user's right in copyright law permitting use of, or 'dealing' with, a copyright protected work without permission or payment of copyright royalties" (Simon Fraser University 2018)—copyright law is difficult for researchers to pin down, and academic institutions are often wary of projects that engage with it. Despite its current availability as a public resource, access restrictions may need to be introduced as the database expands, in order to mitigate legal risks or to appease university ethics departments. Secondly, the open file structures present in Civilization IV can be considered an outlier in a media industry that is shifting toward tighter control of videogames and their assets. Publisher-mandated terms of use and security measures, such as End User License Agreements and Digital Rights Management software, often create legal and technical barriers to accessing game data. This not only complicates the idea of fair dealing but creates additional challenges for the research team as they attempt to access and recover game assets that may be encrypted or hidden behind software restrictions.

5. Conclusion

An obvious question for GameSound is: "what are the next steps for the project?" After receiving feedback on the working prototype from scholars at the McGill Music Graduate Students' Society Symposium and the Congress of the Humanities and Social Sciences, expanding the dataset to include additional videogames seems to be the obvious path forward. We are currently canvassing the game studies community to seek out collaborations with those who may find GameSound's unique toolset useful for their own research projects. This expansion is important because, although the development process has been enlightening, a digital humanities project such as this cannot be truly evaluated until it has moved beyond speculative use and has been tested in multiple academic research projects.

As with the prototype, any potential collaboration would likely revolve around the addition of a single videogame. Much as we did with Civilization IV, database updates will focus on the acquisition of audio assets from a game, followed by a sorting process (based on the researcher's needs as well as the existing categories present in the database). This fresh infusion of data will offer new opportunities for introspection and revision, and the database may be altered or completely rebuilt to include new categories, data types, or search functionality (such as the ability to quickly navigate between titles). Essentially, our goal is to allow new collaborators to sift through the videogame of their choice with ease, providing them with access to the audio elements of a game without relying on repeated playthroughs or unreliable secondary sources. As with any addition to the database—which, barring copyright concerns, will always be publicly available—additional data will also provide scholars from across the game studies and digital humanities communities with an opportunity to reflect upon the technical parameters of game audio and to experiment with quantitative game analysis tools and applications. As the database moves toward a multitude of videogames, rather than a single one, we will be presented with further opportunities to measure its value as an ongoing digital humanities project.

GameSound is an iterative work. Beyond the research possibilities that it provides, it is meant to explore the ongoing technical challenges ludomusicologists face, such as data accessibility, intellectual property concerns, and the lack of established standards.
Over time, we hope to collaborate with scholars across the world to expand both the scope and functionality of the database, while constantly interrogating its efficacy as a research tool. Just as we hope to learn more about videogames by breaking them down into their component parts, it is our belief that by deconstructing and reconstructing our own work we can unearth valuable insights for scholars across various disciplines.

Competing Interests
The author has no competing interests to declare.

Editorial Contributions
Section/Copy editor, text: Darcy Tamayose, University of Lethbridge Journal Incubator. Copy editor, bibliography: Shahina Parvin, University of Lethbridge Journal Incubator.

References
Bradshaw, Jason. 2018. "BioShock Infinite and Feminist Theory: A Technical Approach." Paper presented at the Congress of the Social Sciences and Humanities, Regina, SK, May 26–28.
Burdick, Anne, Johanna Drucker, Peter Lunenfeld, Todd Presner, and Jeffrey Schnapp. 2012. Digital_Humanities. Cambridge: MIT Press.
Chion, Michel. 2012. "The Three Listening Modes." In The Sound Studies Reader, edited by Jonathan Sterne, 48–53. New York: Routledge.
Collins, Karen. 2013. Playing with Sound: A Theory of Interacting with Sound and Music in Video Games. Cambridge, MA: MIT Press. DOI: https://doi.org/10.7551/mitpress/9442.001.0001
Consalvo, Mia, and Nathan Dutton. 2006. "Game Analysis: Developing a Methodological Toolkit for the Qualitative Study of Games." Game Studies 6(1). Accessed November 15, 2019. http://gamestudies.org/06010601/articles/consalvo_dutton
Dudley, Sam. 2018. "Ludomusicology: An Interview with Dr. Melanie Fritsch." The Sound Architect. Accessed November 15, 2019. https://www.thesoundarchitect.co.uk/ludomusicology-melanie-fritsch/
Gibbons, William. 2016. "Remixed Metaphors: Manipulating Classical Music and Its Meanings in Video Games." In Ludomusicology: Approaches to Video Game Music, edited by Michiel Kamp, Tim Summers, and Mark Sweeney, 198–222. Sheffield, UK: Equinox Publishing.
Huiberts, Sander, and Richard van Tol. 2008. "IEZA: A Framework for Game Audio." Gamasutra. Accessed November 15, 2019. https://gamasutra.com/view/feature/131915/ieza_a_framework_for_game_audio.php
Kamp, Michiel. 2016. "Suture and Peritexts: Music Beyond Gameplay and Diegesis." In Ludomusicology: Approaches to Video Game Music, edited by Michiel Kamp, Tim Summers, and Mark Sweeney, 73–91. Sheffield, UK: Equinox Publishing.
Kassabian, Anahid, and Freya Jarman. 2016. "Game and Play in Music Video Games." In Ludomusicology: Approaches to Video Game Music, edited by Michiel Kamp, Tim Summers, and Mark Sweeney, 116–132. Sheffield, UK: Equinox Publishing.
Origitano, Catlyn. 2015. "The Cage is Somber: A Feminist Understanding of Elizabeth." In The Philosophy of BioShock, edited by Luke Cuddy, 38–48. John Wiley and Sons. DOI: https://doi.org/10.1002/9781118915899.ch4
Simon Fraser University. 2018. "What is Fair Dealing and How Does it Relate to Copyright?" Accessed November 15, 2019.
https://www.lib.sfu.ca/help/academic-integrity/copyright/fair-dealing
Sterne, Jonathan. 2006. "The Death and Life of Digital Audio." Interdisciplinary Science Reviews 31(4): 338–348. DOI: https://doi.org/10.1179/030801806X143277
Summers, Tim, and James Hannigan. 2016. Understanding Video Game Music. Cambridge, England: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781316337851
Sydney University. 2018. "Home, Heurist Network." Accessed November 15, 2019. https://heurist.sydney.edu.au/

How to cite this article: Iantorno, Michael. 2020. "GameSound, Quantitative Games Analysis, and the Digital Humanities." Digital Studies/Le champ numérique 10(1): 2, pp. 1–18. DOI: https://doi.org/10.16995/dscn.319
Submitted: 19 September 2018. Accepted: 01 October 2019. Published: 23 January 2020.
Copyright: © 2020 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/. Digital Studies/Le champ numérique is a peer-reviewed open access journal published by Open Library of Humanities.

work_2xab3fqtfjh7dhk52oxbqchiay ----

Towards an Uncertainty-Aware Visualization in the Digital Humanities †

Roberto Therón Sánchez *, Alejandro Benito Santos, Rodrigo Santamaría Vicente and Antonio Losada Gómez
Visual Analytics Group (VisUSAL), Department of Computer Science and Automation, University of Salamanca, 37008 Salamanca, Spain
* Correspondence: theron@usal.es; Tel.: +34-923-294-500 (ext. 6090)
† This paper is an extended version of our paper published in TEEM'18, Salamanca, Spain, 24–26 October 2018.

Informatics 2019, 6, 31; doi:10.3390/informatics6030031
Received: 3 June 2019; Accepted: 2 August 2019; Published: 10 August 2019

Abstract: As visualization becomes widespread in a broad range of cross-disciplinary academic domains, such as the digital humanities (DH), critical voices have been raised on the perils of neglecting the uncertain character of data in the visualization design process. Visualizations that, purposely or not, obscure or remove uncertainty in its different forms from the scholars' vision may negatively affect the manner in which humanities scholars regard computational methods as useful tools in their daily work.
In this paper, we address the issue of uncertainty representation in the context of the humanities from a theoretical perspective, in an attempt to provide the foundations of a framework that allows for the construction of ecological interface designs which are able to expose the computational power of the algorithms at play while, at the same time, respecting the particularities and needs of humanistic research. To this end, we review past uncertainty taxonomies in other domains typically related to the humanities and visualization, such as cartography and GIScience. From this review, we select an uncertainty taxonomy related to the humanities that we link to recent research in visualization for the DH. Finally, we bring a novel analytics method developed by other authors (Progressive Visual Analytics) into question, which we argue can be a good candidate to resolve the aforementioned difficulties in DH practice.

Keywords: progressive visual analytics; uncertainty taxonomies; digital humanities

1. Introduction

The importance of computational tools in the work of researchers in the humanities has been continuously increasing, and the definition of the digital humanities (DH) has been reformulated accordingly, as DH research must be integrated with practices within and beyond academia [1]. Both research and practice have been adopting new methodologies and resources which render definitions obsolete quite rapidly. In our work, we adhere to the characterization of DH as "the application and/or development of digital tools and resources to enable researchers to address questions and perform new types of analyses in the humanities disciplines" [2]. This symbiosis means that the application of humanities methods to research into digital objects or phenomena [1] is another way to look at DH research. At any rate, the computational methods that are available to humanities scholars are very rich and may intervene at different stages of the life cycle of a project. Some examples of computational methods applied in DH research are the analysis of large data sets and digitized sources, data visualization, text mining, and statistical analysis of humanities data. We are aware that the diversity of fields that fall under the broad outline of what constitutes DH research brings many different and valid goals, methods, and measurements into the picture and, so, there is no general set of procedures that must be conducted to qualify as DH research. However, any intervention of computational tools in research is bound to deal with data, which will go through several processes and modifications throughout the life cycle of the project, even in cases where the research itself is not data-driven. From the inception of the project to the generation of knowledge, the intervention of computational tools transforms data by means of processes that may increase the uncertainty of the final results.
Furthermore, during the life cycle of the project, there are many situations in which the scholars and/or stakeholders need to make decisions to advance the research, based on incomplete or uncertain data [2]. This will, in turn, yield another level of uncertainty inherently associated with a particular software or computational method. The motivation of this paper is to examine when such decision-making under uncertainty occurs in DH projects where data transformations are performed. This work is part of the PROVIDEDH (PROgressive VIsual DEcision Making for Digital Humanities) research project, which aims to enhance the design process of visual interactive tools that convey the degree of uncertainty of humanistic data sets and the related computational models used. Visualization designs, in this manner, are expected to progressively adapt to incorporate newer, more complete (or more accurate) data as the research effort develops.

The rest of the paper is organized as follows: In Section 2, we introduce the types of uncertainty as defined in reliability theory, as this provides a mature and sound body of work upon which to build our research. In Section 3, we examine DH research and practice in a first attempt to characterize the sources of uncertainty in DH. Section 4 is devoted to discussing how the management and processing of data in DH research and practice is subject to uncertainty. Section 5 presents a progressive visual analysis proposal that approaches DH projects or experiences in which uncertainty and decision-making play a big role, with the intention of providing some hints on how to mitigate the impact of uncertainty on the results. Finally, in Section 6, we outline the main conclusions of our work, which can be used to scaffold the support of decision-making under uncertainty in DH.

2. Uncertainty Taxonomies

The characterization of uncertainties has been thoroughly investigated in the literature, with major emphasis in areas such as risk analysis, risk management, reliability engineering [3–6], and decision-making and planning [7], with contributions from many other fields: operational research [8], software engineering [9], management [10], ecology [11], environmental modelling [12], health care [13], organizational behavior [14], and uncertainty quantification [15], to name a few. In order to design effective systems to help humanists make decisions under conditions of uncertainty, it is key to reflect on the notion and implications of uncertainty itself. Identifying the stages of the analysis pipeline is of vital importance for the conception of data structures, algorithms, and other mechanisms that allow the final representation in a user interface. We mentioned how the categorization and assessment of uncertainty have produced many academic contributions from different areas of human knowledge, ranging from statistics and logic to philosophy and computer science, to name a few. Drawing from its parent body of research, cartography and GEOVisualization/GIScience scholars have typically developed a special interest in providing taxonomies for uncertainty in all its forms. Carefully presenting uncertain information in digital maps has been identified as key for analysts to make more-informed decisions on critical tasks for the well-being of society, such as storm and flood control, census operations, and the categorization of soil and crops.
Given that, to the best of our knowledge, an uncertainty taxonomy for visualization in the humanities is yet to be proposed, in this section we review past approaches to uncertainty taxonomies proposed in the visualization community. First, we review the GIScience body of literature, because it is closely related to visualization and the humanities, mainly due to the works on visual semiotics theory by prominent cartographers such as Bertin, MacEachren, or Fisher, which we comment on below. Furthermore, we also describe past attempts to categorize uncertainty in the scientific visualization realm, which we argue are more closely related to modern data analysis pipelines.

2.1. Uncertainty in GIScience

The notable contributions by MacEachren [16] and Fisher [17] represented a great breakthrough in the conceptualization of spatial uncertainty in informational systems, and have been progressively adapted to other bodies of research in recent times. For example, MacEachren's first taxonomy of uncertainty revolved around the juxtaposition of the concepts of quality and uncertainty. MacEachren reflected, in his study, on the different manners in which uncertainty could be introduced into the data analysis pipeline (e.g., data collection and binning) and presented concepts like accuracy (the "exactness" of data) and precision ("the degree of refinement with which an operation is performed or a measurement taken"), which have been regularly linked to uncertainty in more recent research, up to the present day. Another important contribution of this author was to provide visual guidelines for depicting uncertainty, based on previous work by the world-renowned French cartographer and theorist Jacques Bertin, mostly known for his work on visual semiotics in the 1960s. As a result, MacEachren presented different traits that could be used to depict uncertainty in numerical or nominal information. Among these traits, he pointed out the use of color saturation (color purity) to indicate the presence of uncertainty, a semiotic that is widely accepted nowadays. Finally, the author introduced other notions on how and when to present uncertainty in the visualizations, and on the value of providing such uncertainty information in an analytic process. Regarding the former, the uncertainty can be presented in three ways: side-by-side, in a sequential manner, or employing bi-variate maps. In the first approach, two different (and possibly co-ordinated) views are put side-by-side, one depicting the actual information that is the subject of study while the other presents the uncertainty values linked to each of the data points in the first. In the sequential approach, the two views described in the previous case are presented alternately. Finally, bi-variate maps represent data and the associated uncertainty within the same view. For the evaluation of uncertainty visualization, the author stressed the difficulty in assessing uncertainty depictions in purely exploratory approaches, when the initial message to communicate is unknown to the designer and, therefore, communication effectiveness standards are rendered inadequate. In order to solve the question, in a rather practical vision, he appeals to the evaluation of the utility that this depiction has in "decision-making, pattern-recognition, hypothesis generation or policy decisions". This is in line with many of the dictates of user-centered design, in which the identification of concrete needs and subjective emotions in the final users is considered a key element of the design process [18].
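MacEachren's saturation guideline for bi-variate maps is straightforward to prototype. The sketch below, a toy example on synthetic data, encodes the data value as hue and reduces color saturation where uncertainty is high:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import hsv_to_rgb

# Synthetic data: a 2-D field of values and an uncertainty field (both in [0, 1]).
rng = np.random.default_rng(0)
values = rng.random((20, 20))
uncertainty = rng.random((20, 20))

# Bivariate encoding: hue carries the value, saturation drops as uncertainty grows.
hsv = np.dstack([
    0.7 * values,                # hue (restricted range to avoid wrap-around)
    1.0 - uncertainty,           # saturation: fully grey where uncertainty is maximal
    np.full_like(values, 0.9),   # fixed brightness
])
plt.imshow(hsv_to_rgb(hsv))
plt.title("Value as hue, certainty as saturation")
plt.show()
```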
Uncertainty has various interpretations in different fields and, in our research, we refer to uncertainty as "a complex characterization about data or predictions made from data that may include several concepts, including error, accuracy, validity, quality, noise, and confidence and reliability" [19]. According to Dubois [20], knowledge can be classified, depending on its type and sources, as generic (repeated observations), singular (situations like test results or measurements), or coming from beliefs (unobserved singular events). Uncertainty is often classified [21–23] into two categories: aleatoric and epistemic uncertainty.

2.1.1. Aleatoric Uncertainty

This uncertainty exists due to the random nature of physical events. This type of uncertainty refers to the inherent uncertainty due to probabilistic variability and, thus, is modeled by probability theory. It is also known as statistical uncertainty, stochastic uncertainty, type A uncertainty, irreducible uncertainty, variability uncertainty, and objective uncertainty. It mainly appears in scientific domains and is usually associated with objective knowledge coming from generic knowledge or singular observations. The main characteristic of aleatory uncertainty is that it is considered to be irreducible [24]. In our adaptation of Fisher's taxonomy to the digital humanities, we identify aleatoric uncertainty as algorithmic uncertainty, which is introduced by, for example, the probabilistic nature of the algorithms at play and therefore cannot be reduced. This concept is further explained in Section 3.1.

2.1.2. Epistemic Uncertainty

This type of uncertainty results from a lack of knowledge or its imprecise character and is associated with the user performing the analysis. It is also known as systematic uncertainty, subjective uncertainty, type B uncertainty, reducible uncertainty, or state of knowledge. It is mainly found with subjective data based on beliefs and can be modeled with the belief function theory, as introduced by Arthur P. Dempster [25]. This kind of uncertainty is specifically related to decision-making processes and, as such, may be found both in scientific (usually associated with hypothesis testing) and humanities (associated with disputed theories or events) research. The main characteristic of epistemic uncertainty is that it is considered to be reducible, due to the fact that new information can reduce or eliminate it.
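Dempster's belief function theory, mentioned above as a model for epistemic uncertainty, can be made concrete with a small example. The sketch below implements Dempster's rule of combination over a two-hypothesis frame of discernment; the scenario and the mass values are invented for illustration:

```python
from itertools import product

FRAME = frozenset({"authentic", "forgery"})

def combine(m1, m2):
    """Dempster's rule: fuse two mass functions defined over subsets of FRAME."""
    fused, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + x * y
        else:
            conflict += x * y  # mass assigned to contradictory evidence
    # Assumes conflict < 1, i.e., the two bodies of evidence are not fully opposed.
    return {s: v / (1.0 - conflict) for s, v in fused.items()}

# Two scholars' (hypothetical) judgments about a disputed manuscript:
expert1 = {frozenset({"authentic"}): 0.6, FRAME: 0.4}  # 0.4 = "don't know"
expert2 = {frozenset({"authentic"}): 0.3,
           frozenset({"forgery"}): 0.4, FRAME: 0.3}

for subset, mass in combine(expert1, expert2).items():
    print(set(subset), round(mass, 3))
```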
Also emerging from GIScience, Fisher presented, in 1999, three types of uncertainty in his proposal: error, vagueness, and ambiguity, which he framed in relation to the problem of definition. The difficulty resides in defining the class of object under examination and the individual components of such a class. Fisher argued that the problem of defining uncertainty was one of this kind and provided a taxonomy that depends on whether the class of objects and the objects are initially well or poorly defined. If the class of objects and its participants are well-defined, then the uncertainty is probabilistic (or aleatoric). Aleatoric uncertainty is inherent to the physical properties of the world and is irreducible. The correct way to tackle probabilistic uncertainty is to provide a probability distribution which characterizes it, and this solution can be found in the mathematical and statistical literature. On the other hand, the class and the individuals may not be well-defined, in what is called vagueness or ambiguity. Vagueness is a manifestation of epistemic uncertainty, which is considered to be reducible if the information on the subject is completed, and is the kind of uncertainty that is addressed by analytics and decision-making support systems. Vagueness has been addressed many times in the past and is usually modeled using fuzzy set theory, among other approaches [26]. Yet another problem might arise in the assignment of individuals to the different classes of the same universe, in what is called ambiguity. More concretely, whenever an individual may belong to two or more classes, it is a problem of discord. If the assignment to one class or another is open to interpretation, the authors refer to it as non-specificity. These two categorizations are presented at the bottom of Fisher's taxonomy of uncertainty, which is reproduced in Figure 1.

[Figure 1. Fisher's taxonomy of uncertainty [17], adapted by [22]: uncertainty divides into aleatory and epistemic branches, with imprecision, ignorance, credibility, and incompleteness as epistemic notions.]

2.2. Sources of Uncertainty in Data Analysis

Concurrently with the works presented in the previous section, contributions by authors from other fields of computing started to appear. In the case of scientific/information visualization, the contributions by Pang et al. [27] are worthy of mention. In their paper, the authors surveyed different visualization techniques which addressed the issue of uncertainty at various levels. Concretely, they proposed the use of glyphs, animations, and other traits to make users aware of the varying degrees and locations of uncertainty in the data. The taxonomy that they employed was derived from a standard definition given at the time of writing (NIST standards report '93). The report classified uncertainty into four well-defined categories: statistical (mean or standard deviation), error (a difference between measures), range (intervals in which the correct value must reside), and scientific judgment (uncertainty arising from expert knowledge, formed out of the other three). While the latter was not considered in their study, they incorporated the first three into a data analysis pipeline that is shown in Figure 2.

[Figure 2. Sources of uncertainty in the data analysis pipeline [27].]

• Uncertainty in acquisition: All data sets are, by definition, uncertain due to their bounded variability. The source of this variability can be introduced by the lack of precision of the electronic devices capturing the information (e.g., a telescope), emerge from a numerical calculation performed according to a model (e.g., the limited precision of computers in representing very large numbers), or be induced by human factors; for example, due to differences in perception of the individuals reporting the information through direct observation.
• Uncertainty in transformation: Appears due to the conversions applied to the data in order to produce meaningful knowledge. This could be related to the imprecise calculation of new attributes when applying clustering, quantization, or resampling techniques.
• Uncertainty in visualization: The process of presenting the information to the final user is also subject to introducing uncertainty. The rendering, rasterization, and interpolation algorithms at play that produce the graphical displays of information are also prone to errors. Furthermore, there is usually a performance/accuracy trade-off present at this stage: the more reliable and accurate a visualization is, the more computational resources it will employ and, almost always, the performance times will decay substantially. As has been noted by some authors, this has a negative effect on the way humans grasp the information contained in the data and can even invalidate the whole approach to data analysis [28–30]. Recent research has shown that the black-box approach, which is followed in many current visual analytics systems, has serious implications on decision-making and should be avoided at all costs [31]. The veracity of the visualizations should not be spontaneously assumed by users and visualization designers, and must be addressed with state-of-the-art techniques which are able to maintain an adequate balance between performance, accuracy, and interactivity in the visualizations.

As we discuss in the following sections, we identify progressive visual analytics (PVA) as a potential candidate to present uncertainty in a data analysis pipeline and resolve these issues. Regarding the effect of uncertainty on the analysis task, in a more recent work [32], the authors commented on the approach to uncertainty and offered a more updated model of uncertainty, which can be better related to the modern big data analytics paradigm. These authors introduced, in this model, the notion of lineage or provenance, which refers to the chain of trust that is associated with any sort of data. The purpose of the lineage is to capture the uncertainty introduced by the source of information, especially when the acquisition is performed by human individuals (credibility). Humans are not only subject to cognitive bias and complex heuristics when the decision-making involves risk [33,34], but also have the ability to lie and deceive (intentionally or not) under a variety of circumstances. The authors of this paper argue that this uncertain information reported by human factors should be bound to the data as a base value of uncertainty. This information should serve as the base value for other types of uncertainty introduced at later stages of analysis (for example, every time the data are transformed). The authors also commented on the effect of time delays between the occurrence of an event and the information acquisition related to that event. The longer the time in between these two, the more uncertainty is added, due to different factors such as changes in memory or the inability to decide on the recency of a set of similar reports. Finally, the authors also provided a concise description of the analyst's goals in decision-making under uncertainty, which is "to minimize the effects of uncertainties on decisions and conclusions that arise from the available information". In order to ensure this effect, it is key to "identify and account for the uncertainty and ensure that the analyst understands the impacts of uncertainty". In this process, two key tasks, according to the authors, are "to find corroborating information from multiple sources with different types of uncertainty" and "to make use of stated assumptions and models of the situation". The latter case refers to the ability to model the data, in order to allow the discovery of patterns, gaps, and missing information, a transformation that can also introduce more kinds of uncertainty.
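The notions of lineage and time decay discussed above admit a toy operationalization. The sketch below is our own illustration (not a model taken from [32]): each report carries a base uncertainty derived from its source's credibility, which is then inflated by the delay between the event and its acquisition:

```python
from dataclasses import dataclass

@dataclass
class Report:
    source_credibility: float  # 0 = untrusted, 1 = fully trusted
    delay_days: float          # time between event and acquisition

def report_uncertainty(r: Report, decay_per_day: float = 0.01) -> float:
    """Toy model: base uncertainty from the source, inflated by time delay."""
    base = 1.0 - r.source_credibility
    inflated = base + (1.0 - base) * min(1.0, decay_per_day * r.delay_days)
    return round(inflated, 3)

# A fresh report from a reliable archive vs. a decades-old second-hand account.
print(report_uncertainty(Report(source_credibility=0.9, delay_days=2)))
print(report_uncertainty(Report(source_credibility=0.5, delay_days=365 * 30)))
```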
2.3. Implications for Decision-Making in the Digital Humanities

As explained in the introduction, our research is focused on investigating opportunities to support decision-making in DH research and practice by means of interactive visualization tools. Given the exposed dual nature of uncertainty, the second type of uncertainty (epistemic) offers an opportunity to enhance DH research and support stakeholders in assessing the level of uncertainty of a project at any given moment. Moreover, aleatoric uncertainty, which we pose as algorithmic uncertainty in a typical data analysis pipeline (Figure 2), should also be communicated to enhance the comprehensibility of methods and results. On the one hand, epistemic uncertainty can be modeled with belief function theory, which defines a theory of evidence that can be seen as a general framework for reasoning with uncertainty. On the other hand, recent efforts can be found in the literature that have focused on the adaptation and proposal of data provenance models for DH ecosystems [35,36], which are often used to record the chain of production of digital research results, in order to increase transparency in research and make such results reproducible [37]. These models can also be enhanced, in order to convey the level of uncertainty at any link in the chain. This would provide an opportunity to make decisions related to a change in the research direction, if, for instance, at some point, the conclusion is incompatible with what the humanist feels to be solid ground epistemically, or new information is introduced that mitigates a given uncertainty level.

3. Modeling Uncertainty in the Digital Humanities

Although, to the best of our knowledge, a taxonomy of sources of uncertainty in DH has not yet been proposed, there is no doubt that, in this realm, there are multiple sources of uncertainty to be found. It is our aim to contribute to paving the way towards a taxonomy of uncertainty sources in DH by identifying and discussing some instances of sources of uncertainty related to data in DH research and practice. To this end, building upon Fisher's taxonomy presented in the previous section, we identify four notions as sources of epistemic uncertainty that we have detected in a great majority of DH works: imprecision (inability to express an exact value of a measure), ignorance (inability to express knowledge), incompleteness (when not all situations are covered), and credibility (the weight an agent can attach to its judgment). A proposal of a general uncertainty taxonomy for the DH can be built on top of these categories or notions (Figure 1), which are described in greater detail in the following. Also, to complete the description of Fisher's notions, we provide examples of each category in the context of four different DH projects: uncertainty in GIScience [38], a data set of French medieval texts [39], information related to early Holocaust data [40], and an approach to the presence of uncertainties in visual analysis [41].

3.1. Aleatoric Uncertainty

According to the definition of aleatoric uncertainty provided in the previous sections, this kind of uncertainty is irreducible and, therefore, we can reformulate it and link it to the different sources of uncertainty identified by Pang et al. Namely, aleatoric uncertainty becomes algorithmic uncertainty in our proposal, and is related to the probabilistic nature of the computational techniques at play. Take, for example, the set of language/topic models, such as word2vec or Latent Dirichlet Allocation (LDA), which have recently become popular among DH practitioners [42]. These algorithms are inherently probabilistic, which means their output is given as a probability density function (PDF). Therefore, it would make no sense to try to reduce this uncertainty; rather, the analytics system should be responsible for communicating it to the user in the most realistic possible manner.
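The probabilistic character of these models is easy to observe in practice. The sketch below trains gensim's LDA implementation on a toy corpus (texts, topic count, and parameters are invented for illustration); note that each document receives a probability distribution over topics rather than a single label:

```python
from gensim import corpora, models

# Toy corpus: four tokenized "documents" (invented for illustration).
texts = [
    ["castle", "siege", "knight", "castle"],
    ["siege", "knight", "battle"],
    ["harvest", "grain", "market"],
    ["market", "grain", "trade", "harvest"],
]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                      random_state=0, passes=10)

# The output is a distribution over topics, not a single label:
for i, bow in enumerate(corpus):
    dist = lda.get_document_topics(bow, minimum_probability=0.0)
    print(f"doc {i}: " + ", ".join(f"topic {t}: {p:.2f}" for t, p in dist))
```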
3.2. Epistemic Uncertainty

Epistemic uncertainty occurs in poorly-defined objects, as explained by Fisher. This uncertainty can be reduced through, for example, research on a data set and, under our approach, it is subject to individual interpretation. For example, a scholar might decide he or she is not confident working with a certain primary source, either because he or she is unfamiliar with the topic or simply because the source is excessively deteriorated, or similar. We argue that it is important to capture these partial interpretations and fixate them to the research object (e.g., a data set) such that the same researcher or others can, for example, follow a reasoning chain when trying to replicate an experiment. Below, we present the categories of epistemic uncertainty, as described by Fisher, and corroborate their theoretical applicability in the context of real DH scenarios.

3.2.1. Imprecision

Imprecision refers to the inability to express the definitive accurate value of a measure, or to a lack of information allowing us to precisely obtain an exact value. Ideally, we would be able to study and research the topic we are dealing with while working with a data set, in order to sort out any uncertainties and remove them from it, but, in most cases, we will find barriers that will prevent that. In three of the cited DH projects [38–40], imprecision is present in different forms. One instance of the presence of uncertainty due to imprecision is that related to time and dates, such as the dating of the medieval texts introduced in [39]. Not every one of the texts had this problem but, in multiple instances, a concrete date on which they were written was not available. Instead, dates were represented in idiosyncratic ways (e.g., between 1095–1291, first half of the 14th century, before 1453, and so on), making for a very strong presence of uncertainty to assess.

3.2.2. Ignorance

Ignorance can be partial or total, and is related to the fact that information could have been incorrectly assessed by the person gathering or organizing the data. It is also possible that people, not fully sure about how to deal with data and feeling insecure about it, ignore some information and generate uncertainty during the evaluation and decision processes. Mostly due to the passage of time (in the scope of DH) and the fact that new knowledge becomes available with new experiences and research projects being completed and becoming available, we are able to find information that makes that which we had at the inception of our projects outdated or misread/misunderstood at the time. Interpretation issues can also be considered in this category or notion, given that not everybody may have the same perspective on the same data, depending on its context, which can affect its certainty.
In iterative research projects, unexpected results may also be reached. In this scenario, if the person analyzing the data is insecure and his or her expectations are not on par with what was generated, it is possible that some uncertainty is generated. This uncertainty can turn into the ignorance of the result, leading to a new data set being wrongly assessed. This issue was tackled by Seipp et al. [41], in relation to the presence of uncertainties in visual analytics. One of the main issues in visual analysis is the possibility of misinterpretation and, in order to avoid it, the data quality needs to be appropriately represented. Even with that, the results can be misleading, and the analyst may not be able to interpret them correctly and become encouraged to ignore them, potentially introducing uncertainty into further iterations if the perceived values differ from the real values conveyed by the visualization.

3.2.3. Credibility/Discord

Probably one of the strongest sources of uncertainty is the credibility of any data set or person involved in its assessment, which can be crucial to the presence (or lack) of uncertainty. This concept can be linked to that of biased opinions, which are related to personal visions of the landscape and can make for wild variations between different groups and individuals, given their backgrounds. Moreover, this also refers to the level of presence of experts that take charge of the preparation or gathering of data, its usage, research on it, and so on. The more weight an agent bears, the less (in principle) unpredictability is expected to be present in the data. This notion is also important when working on open projects with studies that allow external agents to contribute in different ways, given that their knowledge of the matter at hand could be very different from that of others; this must be taken into consideration when dealing with their input, as they could potentially introduce other types of uncertainty into the project and alter the results of the research. This last type of research can be related to that carried out by Binder et al. for the GeoBib project [40], given its open nature, in which people could contribute new information or modify readily available data. As each individual joins the system with a different background, experience, and knowledge, the information entered in the database can be related to the same record but may be completely different, depending on who introduces it. It is the researchers' work to assess how credible each input is, depending on where it comes from.

3.2.4. Incompleteness

Finally, the notion of incomplete data is a type of uncertainty that can be related to that of imprecise values. We can never be totally sure of anything, and that mostly has to do with the lack of knowledge (imprecision) that comes from the impossibility of knowing every possible option available. When dealing with a data set comprising logs of visitors of a library in Dublin [38], the authors found records that included names of places that are no longer existing or traceable, due to their renaming or simply because the person recording the instance used a name bound to his or her own knowledge. This makes it impossible to geo-localize those places, making for an ultimately incomplete (and, also, imprecise if wrong coordinates are assigned instead of leaving blank fields) data set.
4. Data and Uncertainty in Digital Humanities

It is assumed that science advances on a foundation of trusted discoveries [43], and the scientific community has traditionally pursued the reproducibility of experiments, with transparency as a key factor to grant the scrutiny and validation of results. Recently, the importance of disclosing information on data handling and computational methods used in the experiments has been recognized, since access to the computational steps taken to process data and generate findings is as important as access to the data themselves [44]. On the contrary, humanities research has a different relationship with data. Given the nature of this research, data are continuously under assessment and different interpretative perspectives. Edmond and Nugent [45] argued that "An agent encountering an object or its representation perceives and draws upon the data layer they apprehend to create their own narratives", understanding by narrative "the story we tell about data". The collaboration between humanities and computer science has opened new ways of doing research, but has also brought many challenges to overcome. Related to our research, we focus on the role of data in DH, as humanities data are both massive and diverse, and provide enormous analytical challenges for humanities scholars [46]. In [46], the authors identified four humanities challenges related to the ways in which perspectives, context, structure, and narration can be understood. Those challenges open up many opportunities to collect, store, analyze, and enrich the multi-modal data used in the research. Among the research opportunities identified in the paper, two are especially relevant to our discussion: (a) understanding changes in meaning and perspective, and (b) representing uncertainty in data sources and knowledge claims; both being inherently related to the notion of uncertain data. On one hand, humanities research is subject to changes in the data over time and across groups or scholars. When new sources or documents are discovered, new interpretations are elaborated, and the understanding of the research objects is highly dependent on the particular theoretical positions of the scholars. On the other hand, those changes in meaning and perspective arise from the availability of sources and reference material, so it is highly important for the scholars to be able to assess the nature of the data related to what may be missing, ambiguous, contradictory, and so on. This, as expected, generates uncertainty in how the data is ultimately handled and analyzed, depending on the data processing procedures and associated provenance.

5. Managing Uncertainty Through Progressive Visual Analytics

The usefulness and suitability of visually-supported computer techniques are a proven fact, and one can refer to the growing number of publications, papers, dissertations, and talks touching upon the subject in recent years. However, many of these proposals are still regarded with a skeptical eye by prominent authors in the field and are considered by some "a kind of intellectual Trojan horse" that can be harmful to the purposes of humanistic research [47].
These critiques appeal to the inability of these techniques to present categories in qualitative information as subject to interpretation, "riven with ambiguity and uncertainty", and they call for "imaginative action and intellectual engagement with the challenge of rethinking digital tools for visualization on basic principles of the humanities". These claims point to a major issue in DH: on one hand, humanities scholars are keen to employ computational methods to assist them in their research but, on the other hand, such computational methods are often too complex to be understood in full and adequately applied. In turn, acquiring this knowledge would generally require an investment of time and effort that most scholars are reluctant to commit to, and would invalidate the need for any kind of multidisciplinary co-operation. As a consequence, algorithms and other computational processes are seen as black boxes that produce results in an opaque manner, a key fact that we identify as one of the main causes of the controversy and whose motivations are rooted at the very foundations of HCI. However, in the same way that users are not expected to understand the particularities of the HTTP and 4G protocols in order to access an online resource using their mobile phones, algorithmic mastery should not be an entry-level requirement for DH visual analytics either. In a similar vein, such analytics systems should not purposely conceal information from the user under the mistaken assumptions that (a) the user is completely illiterate on these subjects and/or, maybe even with more harmful consequences, (b) the user is unable to learn. For example, Ghani and Deshpande [48], in their research dating from 1994, identified the sense of control over one's environment as a major factor affecting the experience of flow. We argue that it is precisely the lack of control over the algorithms driving the visualizations that might be frustrating DH practitioners. In Section 2, we commented on the different sources of uncertainty that can be identified in the data analysis pipeline, as presented by [27]. Therefore, it is key that a DH analyst is able to identify this uncertainty at these stages, in order to be able to make informed decisions. Furthermore, we have seen how algorithms, models, and computations can introduce uncertainty in the analysis task which, rather than being neglected, should be appropriately presented to the user at all times. For these reasons, a hypothetical visual analytics pipeline should expose this uncertainty at all times in an effective manner, regardless of the size of the data being analyzed. On the other hand, this goal can be difficult to achieve if the inclusion of this uncertainty in the pipeline results in greater latency times that may diminish the analytic capabilities of the system. In the context of this problem, we frame our proposal of an exploration paradigm for the DH, which aims to bring scientific rigor and reproducibility into the field without impeding intellectual work as intended by humanities scholars. As was presented in previous sections, the tasks of categorization, assessment, and display of uncertainty, in all its forms, play a key role in the solving of the aforementioned issues.
To address this challenge, we draw on recent research by authors in the CS field to construct a theoretical framework in which the management of uncertainty is streamlined in all phases of the data analysis pipeline: Progressive Visual Analytics (PVA). PVA is a computational paradigm [31,49] that refers to the ability of information systems to deliver their results in a progressive fashion. As opposed to sequential systems, which are limited by the intrinsic latency of the algorithms in action, PVA systems, by definition, are always able to offer partial results of the computation. The inclusion of this feature is of major importance to avoid the well-known issues of exploratory analysis related to human perception, such as continuity, flow, and attention preservation, among others [29], and it enhances the end user's sense of direct manipulation of abstract data [50]. This paradigm also brings important advantages related to the ability to break with the black-box vision of the algorithms commented upon earlier in this text [31]: there are many examples online and in the literature that illustrate how, by observing the visual results of the execution of an algorithm, users are able to understand better how it works [51]. Not only is this useful in an educational sense, but also in a practical one: progressive analytics often produces steerable computations, allowing users to intervene in the ongoing execution of an algorithm and make more informed decisions during the exploration task [31]. Figure 3 depicts PVA and the concept of steerable computation, as envisioned by Stolper et al. [49]. In our case, this would allow a fast re-computation of results according to a well-defined series of beliefs or certainties about the data, with important benefits related to the problems presented in [47].

Therefore, the challenge lies in re-implementing the typical DH workflows and algorithms in a progressive manner, allowing for a fast re-evaluation of beliefs to spark critical thinking and intellectual work under conditions of uncertainty. Good first candidates for this conversion are the typical graph layout and force-directed methods, as (a) they have typically been implemented in a progressive manner [52] and (b) they have been considered important for enabling research in the humanities [46]. Other good candidates fall into the categories of dimensionality reduction (t-SNE [53]), pattern mining (SPAM [49]), or classification (k-means [54]); although, in principle, any algorithm is amenable to conversion, following the guidelines explained in [31]. For example, a complete list of relevant methods for the humanities could be compiled from the contributions by Wyatt and Millen [46].

In Figure 4, we show a modification of the progressive visualization workflow proposed by Stolper et al. [49], in which we treat the data set as a first-class research object that can be labeled, versioned, stored, and retrieved by employing a data repository. Our proposal also draws on the ideas of Fekete and Primet [54], and we model uncertainty as a parameter U_p of the progressive computation F_p defined by the authors.

Figure 3. Progressive Visual Analytics (PVA) model proposed by Stolper et al. [49].
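As an illustration of a progressive, steerable computation parameterized by per-table uncertainty, consider the following toy sketch. It is our own illustration under stated assumptions: the function names and the weighting scheme are hypothetical and do not reproduce the F_p/U_p formalism of Fekete and Primet [54]. Partial results are yielded continuously, and the user may revise the uncertainty parameters mid-run to steer the remainder of the computation:

```python
# Illustrative sketch only: a progressive, steerable computation in the spirit
# of F_p and U_p above. All names are hypothetical, not from the cited papers.
import random
from dataclasses import dataclass

@dataclass
class Perspective:
    """A user perspective P: one subjective uncertainty weight per data table."""
    weights: dict  # table id -> uncertainty weight U_p in [0, 1]

def progressive_weighted_mean(tables, perspective, chunk_size=100):
    """Progressively compute an uncertainty-weighted mean over many data tables.

    Yields (progress, partial_result) pairs so a visualization can render
    intermediate states; the caller may mutate `perspective.weights` between
    yields, which steers the remainder of the computation.
    """
    total, weight_sum, seen = 0.0, 0.0, 0
    n = sum(len(t) for t in tables.values())
    for table_id, rows in tables.items():
        for i in range(0, len(rows), chunk_size):
            w = 1.0 - perspective.weights.get(table_id, 0.0)  # discount uncertain tables
            chunk = rows[i:i + chunk_size]
            total += w * sum(chunk)
            weight_sum += w * len(chunk)
            seen += len(chunk)
            yield seen / n, (total / weight_sum if weight_sum else None)

# Usage: steer the run by raising the uncertainty of a table mid-computation.
tables = {"D1": [random.gauss(0, 1) for _ in range(500)],
          "D2": [random.gauss(5, 1) for _ in range(500)]}
p = Perspective(weights={"D1": 0.1, "D2": 0.2})
for progress, partial in progressive_weighted_mean(tables, p):
    if progress > 0.5:
        p.weights["D2"] = 0.9  # new belief: D2 is much less trustworthy
```

The key design point is that steering happens through the same uncertainty parameters that the visualization exposes, so revising a belief immediately redirects the ongoing computation rather than requiring a full batch re-run.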
Figure 4. An uncertainty-aware progressive visualization workflow model for the Digital Humanities, proposed by the authors and based on the contributions of Stolper [49] and Fekete [54]. (The diagram shows the flow: select dataset, select uncertainty parameters, run analytic, visualize and interpret partial results, then visualize and interpret complete results; data tables D_1...D_z with their uncertainty parameters U_p are stored in and retrieved from a data repository holding versioned datasets A, B, ..., N.)

Initially, a data set A is loaded, which consists of a series of data tables, each one associated with a concrete uncertainty parameter, which may or may not exist yet and which, if it exists, was assigned in a previous session by the same or another user. At the beginning of the session, the user may choose to modify the subjective uncertainty parameters (from Fisher's taxonomy, Figure 1) according to their experience or newer research, or leave them as they are. We call this the initial user perspective P, which is a series of uncertainty parameters U_1...U_z related to each of the data tables D_1...D_z. As the workflow progresses, the user will modify this perspective, subsequently obtaining P', P'', and so on. Once the workflow is finished, the data set A_r, along with the final user perspective P_r, is stored in the data repository for later use and becomes a research object that can be referenced, reused, and reproduced in a transparent fashion.

6. Conclusions

In this paper, we reviewed past taxonomies related to uncertainty visualization in an attempt to adapt them to the DH domain. Although the DH represent an exciting new field of collaboration between practitioners with substantially different backgrounds, there are still major issues that need to be addressed as promptly as possible in order to achieve better results. To overcome these challenges, we draw on a relatively new data visualization paradigm that breaks with the black-box perception of algorithms, which we argue is blocking collaboration in many research areas. The progressive workflow model in our proposal is a first approach to the problem of uncertainty in the DH analysis pipeline. We have seen a great surge of progressive analytics in the CS and visualization communities in recent years, but its applicability in a DH context is yet to be proven with adequate use cases and evaluations.

Author Contributions: Conceptualization, R.T.S.; formal analysis, R.T.S., A.B.S. and R.S.V.; investigation, R.T.S., A.B.S., R.S.V. and A.L.G.; writing—original draft preparation, R.T.S., A.B.S. and A.L.G.; writing—review and editing, R.T.S. and A.B.S.; supervision, R.T.S.; project administration, R.T.S.; funding acquisition, R.T.S.

Funding: This work has received funding within the CHIST-ERA programme under national grant agreement PCIN-2017-064 (MINECO, Spain).

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:
DH Digital Humanities
PVA Progressive Visual Analytics
CS Computer Science

References
1. Warwick, C.; Terras, M.; Nyhan, J. Digital Humanities in Practice; Facet Publishing: London, UK, 2012.
2.
Anne, K.; Carlisle, T.; Dombrowski, Q.; Glass, E.; Gniady, T.; Jones, J.; Lippincott, J.; MacDermott, J.; Meredith-Lobay, M.; Rockenbach, B.; et al. Building Capacity for Digital Humanities: A Framework for Institutional Planning; ECAR Working Group Paper; ECAR: Louisville, CO, USA, 2017.
3. Hoffman, F.O.; Hammonds, J.S. Propagation of uncertainty in risk assessments: The need to distinguish between uncertainty due to lack of knowledge and uncertainty due to variability. Risk Anal. 1994, 14, 707–712. [CrossRef] [PubMed]
4. Ferson, S.; Ginzburg, L.R. Different methods are needed to propagate ignorance and variability. Reliab. Eng. Syst. Saf. 1996, 54, 133–144. [CrossRef]
5. Helton, J.C. Uncertainty and sensitivity analysis in the presence of stochastic and subjective uncertainty. J. Stat. Comput. Simul. 1997, 57, 3–76. [CrossRef]
6. Riesch, H. Levels of uncertainty. In Essentials of Risk Theory; Springer: Dordrecht, The Netherlands, 2013; pp. 29–56.
7. Lovell, B. A Taxonomy of Types of Uncertainty. Ph.D. Thesis, Portland State University, Portland, OR, USA, 1995.
8. Zimmermann, H.J. An application-oriented view of modeling uncertainty. Eur. J. Oper. Res. 2000, 122, 190–198. [CrossRef]
9. Ramirez, A.J.; Jensen, A.C.; Cheng, B.H. A taxonomy of uncertainty for dynamically adaptive systems. In Proceedings of the 7th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, Zurich, Switzerland, 4–5 June 2012; pp. 99–108.
10. Priem, R.L.; Love, L.G.; Shaffer, M.A. Executives' perceptions of uncertainty sources: A numerical taxonomy and underlying dimensions. J. Manag. 2002, 28, 725–746. [CrossRef]
11. Regan, H.M.; Colyvan, M.; Burgman, M.A. A taxonomy and treatment of uncertainty for ecology and conservation biology. Ecol. Appl. 2002, 12, 618–628. [CrossRef]
12. Refsgaard, J.C.; van der Sluijs, J.P.; Højberg, A.L.; Vanrolleghem, P.A. Uncertainty in the environmental modelling process—A framework and guidance. Environ. Model. Softw. 2007, 22, 1543–1556. [CrossRef]
13. Han, P.K.; Klein, W.M.; Arora, N.K. Varieties of uncertainty in health care: A conceptual taxonomy. Med. Decis. Mak. 2011, 31, 828–838. [CrossRef]
14. Howell, W.C.; Burnett, S.A. Uncertainty measurement: A cognitive taxonomy. Organ. Behav. Hum. Perform. 1978, 22, 45–68. [CrossRef]
15. Potter, K.; Rosen, P.; Johnson, C.R. From Quantification to Visualization: A Taxonomy of Uncertainty Visualization Approaches. In Uncertainty Quantification in Scientific Computing; Dienstfrey, A.M., Boisvert, R.F., Eds.; IFIP Advances in Information and Communication Technology; Springer: Berlin/Heidelberg, Germany, 2012; pp. 226–249.
16. MacEachren, A.M. Visualizing Uncertain Information. Cartogr. Perspect. 1992, 13, 10–19. [CrossRef]
17. Fisher, P.F. Models of uncertainty in spatial data. Geogr. Inf. Syst. 1999, 1, 191–205.
18. Cooley, M. Human-Centered Design. In Information Design; MIT Press: Cambridge, MA, USA, 2000; pp. 59–81.
19. Nusrat, E. A Framework of Descriptive Decision-Making under Uncertainty Using Dempster-Shafer Theory and Prospect Theory. Ph.D. Thesis, Nagaoka University of Technology, Niigata, Japan, 2013.
20. Dubois, D. Representation, propagation, and decision issues in risk analysis under incomplete probabilistic information. Risk Anal.
2010, 30, 361–368. [CrossRef] [PubMed]
21. Der Kiureghian, A.; Ditlevsen, O. Aleatory or epistemic? Does it matter? Struct. Saf. 2009, 31, 105–112. [CrossRef]
22. Simon, C. Data Uncertainty and Important Measures; ISTE Ltd/John Wiley and Sons Inc.: Hoboken, NJ, USA, 2017.
23. Matthies, H.G. Quantifying uncertainty: Modern computational representation of probability and applications. In Extreme Man-Made and Natural Hazards in Dynamics of Structures; Springer: Dordrecht, The Netherlands, 2007; pp. 105–135.
24. Bae, H.R.; Grandhi, R.V.; Canfield, R.A. An approximation approach for uncertainty quantification using evidence theory. Reliab. Eng. Syst. Saf. 2004, 86, 215–225. [CrossRef]
25. Dempster, A.P. Upper and lower probabilities induced by a multivalued mapping. Ann. Math. Stat. 1967, 38, 325–339. [CrossRef]
26. Gonzalez-Perez, C. (Ed.) Vagueness. In Information Modelling for Archaeology and Anthropology: Software Engineering Principles for Cultural Heritage; Springer International Publishing: Cham, Switzerland, 2018; pp. 129–141. [CrossRef]
27. Pang, A.T.; Wittenbrink, C.M.; Lodha, S.K. Approaches to Uncertainty Visualization. Vis. Comput. 1997, 13, 370–390. [CrossRef]
28. Miller, R.B. Response Time in Man-Computer Conversational Transactions. In Proceedings of the December 9–11, 1968, Fall Joint Computer Conference, Part I; AFIPS '68 (Fall, Part I); ACM: New York, NY, USA, 1968; pp. 267–277. [CrossRef]
29. Nielsen, J. Response Time Limits. 2010. Available online: http://www.nngroup.com/articles/response-times-3-important-limits (accessed on 3 June 2019).
30. Shneiderman, B. Response Time and Display Rate in Human Performance with Computers. ACM Comput. Surv. 1984, 16, 265–285. [CrossRef]
31. Mühlbacher, T.; Piringer, H.; Gratzl, S.; Sedlmair, M.; Streit, M. Opening the black box: Strategies for increased user involvement in existing algorithm implementations. IEEE Trans. Vis. Comput. Graph. 2014, 20, 1643–1652. [CrossRef] [PubMed]
32. Thomson, J.; Hetzler, E.; MacEachren, A.; Gahegan, M.; Pavel, M. A Typology for Visualizing Uncertainty. Proc. SPIE 2005, 5669, 146–158. [CrossRef]
33. Kahneman, D.; Tversky, A. Prospect Theory: An Analysis of Decision under Risk. Econometrica 1979, 47, 263–291. [CrossRef]
34. Tversky, A.; Kahneman, D. Judgment under uncertainty: Heuristics and biases. Science 1974, 185, 1124–1131. [CrossRef] [PubMed]
35. Küster, M.W.; Ludwig, C.; Al-Hajj, Y.; Selig, T. TextGrid provenance tools for digital humanities ecosystems.
In Proceedings of the 5th IEEE International Conference on Digital Ecosystems and Technologies (DEST 2011), Daejeon, Korea, 31 May–3 June 2011; pp. 317–323.
36. Burgess, L.C. Provenance in Digital Libraries: Source, Context, Value and Trust. In Building Trust in Information; Springer: Cham, Switzerland, 2016; pp. 81–91.
37. Walkowski, N.O. Evaluating Research Practices in the Digital Humanities by Means of User Activity Analysis. In Proceedings of the Digital Humanities, DH2017, Montreal, QC, Canada, 8–11 August 2017; pp. 1–3.
38. Sanchez, L.M.; Bertolotto, M. Uncertainty in Historical GIS. In Proceedings of the 1st International Conference on GeoComputation, Leeds, UK, 4–7 September 2017.
39. Jänicke, S.; Wrisley, D.J. Visualizing uncertainty: How to use the fuzzy data of 550 medieval texts. In Proceedings of the Digital Humanities, Lincoln, NE, USA, 16–19 July 2013.
40. Binder, F.; Entrup, B.; Schiller, I.; Lobin, H. Uncertain about Uncertainty: Different ways of processing fuzziness in digital humanities data. In Proceedings of the Digital Humanities, Lausanne, Switzerland, 7–12 July 2014.
41. Seipp, K.; Ochoa, X.; Gutiérrez, F.; Verbert, K. A research agenda for managing uncertainty in visual analytics. In Proceedings of the Mensch und Computer 2016—Workshopband, Aachen, Germany, 4–7 September 2016.
42. Meeks, E.; Weingart, S.B. The Digital Humanities Contribution to Topic Modeling. J. Digit. Humanit. 2012, 2, 1–6.
43. McNutt, M. Reproducibility; American Association for the Advancement of Science: Washington, DC, USA, 2014.
44. Stodden, V.; McNutt, M.; Bailey, D.H.; Deelman, E.; Gil, Y.; Hanson, B.; Heroux, M.A.; Ioannidis, J.P.; Taufer, M. Enhancing reproducibility for computational methods. Science 2016, 354, 1240–1241. [CrossRef] [PubMed]
45. Edmond, J.; Folan, G.N. Data, Metadata, Narrative. Barriers to the Reuse of Cultural Sources. In Research Conference on Metadata and Semantics Research; Springer: Cham, Switzerland, 2017; pp. 253–260.
46. Wyatt, S.; Millen, D. Meaning and Perspective in the Digital Humanities; A White Paper for the Establishment of a Center for Humanities and Technology (CHAT); Royal Netherlands Academy of Arts & Sciences (KNAW): Amsterdam, The Netherlands, 2014.
47. Drucker, J. Humanities Approaches to Graphical Display. Digit. Humanit. Q. 2011, 5, 1–21.
48. Ghani, J.A.; Deshpande, S.P. Task Characteristics and the Experience of Optimal Flow in Human-Computer Interaction. J. Psychol. 1994, 128, 381–391. [CrossRef]
49. Stolper, C.D.; Perer, A.; Gotz, D. Progressive Visual Analytics: User-Driven Visual Exploration of In-Progress Analytics. IEEE Trans. Vis. Comput. Graph. 2014, 20, 1653–1662. [CrossRef] [PubMed]
50. Shneiderman, B. Direct manipulation: A step beyond programming languages. Computer 1983, 16, 57–69. [CrossRef]
51. Bostock, M. Visualizing Algorithms. 2014. Available online: http://bost.ocks.org/mike/algorithms (accessed on 3 June 2019).
52. Bostock, M.; Ogievetsky, V.; Heer, J. D3 Data-Driven Documents. IEEE Trans. Vis. Comput. Graph. 2011, 17, 2301–2309. [CrossRef] [PubMed]
53. Pezzotti, N.; Lelieveldt, B.P.F.; van der Maaten, L.; Höllt, T.; Eisemann, E.; Vilanova, A. Approximated and User Steerable tSNE for Progressive Visual Analytics. IEEE Trans. Vis. Comput. Graph. 2017, 23, 1739–1752. [CrossRef]
54. Fekete, J.D.; Primet, R. Progressive Analytics: A Computation Paradigm for Exploratory Data Analysis. arXiv 2016, arXiv:1607.05162.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

work_2zsoxi6hfjby7gdvtrp4rtwf5y ---- Umanistica Digitale - ISSN:2532-8816 - n.4, 2019

L. Brazzo, K.J. Rodriguez – Introduction. Data Sharing, Holocaust Documentation and the Digital Humanities: Best Practices, Case Studies and Benefits
DOI: http://doi.org/10.6092/issn.2532-8816/9035

Introduction
Data Sharing, Holocaust Documentation and the Digital Humanities: Best Practices, Case Studies and Benefits
Laura Brazzo and Kepa J. Rodriguez

Abstract. This issue of Umanistica Digitale is dedicated to the workshop "Data Sharing, Holocaust Documentation and the Digital Humanities: Best Practices, Case Studies and Benefits", which took place at the Università Cà Foscari in Venice on June 29–30, 2017, in the framework of the EHRI (European Holocaust Research Infrastructure) Project. The workshop was organized by the Centro di Documentazione Ebraica Contemporanea (CDEC), with the support of the Cà Foscari University's Master in Digital Humanities Program. The workshop coincided with the 4th edition of the LODLAM (Linked Open Data in Libraries, Archives and Museums) Summit. The aim of the workshop was to present the state of the art of data sharing practices and technologies, starting from the experiences and results obtained at the EHRI project; to discuss the usability and potential of data sharing in the Humanities; to investigate the possible connections between the EHRI project and other research infrastructures and digital humanities projects.

This issue takes its title from the workshop "Data Sharing, Holocaust Documentation and the Digital Humanities: Best Practices, Case Studies and Benefits", held within the framework of the EHRI project on June 29–30, 2017. The workshop was organized by the Fondazione Centro di Documentazione Ebraica Contemporanea (CDEC), with the support of the Master in Digital Humanities Program of the Università Cà Foscari in Venice. The aim of the workshop was to present the state of the art of practices and technologies for data sharing on the specific subject of the history of the Shoah.
The opportunities and benefits deriving from data sharing were discussed, starting from the experiences and results obtained within the EHRI project and from the possible connections between EHRI and other research infrastructures and Digital Humanities projects. The articles we present in this special issue of Umanistica Digitale reflect, and in some cases complement, the works presented at the workshop. The issue also includes the three presentations that opened the working days, namely the papers dedicated to the project for the digitization of Holocaust documents held by the State Archives of Venice and the general introduction to the EHRI project. The editors wish to thank the journal Umanistica Digitale and its editorial board for accepting this publication proposal; particular thanks go to Marilena D'Aquino for her prompt and constant support in editing the texts.

Preface

The workshop "Data Sharing, Holocaust Documentation and the Digital Humanities: Best Practices, Case Studies and Benefits" took place at the Università Cà Foscari in Venice on June 29–30, 2017, in the framework of the EHRI (European Holocaust Research Infrastructure) Project.1 The workshop was organized by the Centro di Documentazione Ebraica Contemporanea (CDEC), with the support of the Cà Foscari University's Master in Digital Humanities Program. The workshop coincided with the 4th edition of the LODLAM (Linked Open Data in Libraries, Archives and Museums) Summit.2

The aim of the workshop was to present the state of the art of data sharing practices and technologies, starting from the experiences and results obtained at the EHRI project; to discuss the usability and potential of data sharing in the Humanities; and to investigate the possible connections between the EHRI project and other research infrastructures and digital humanities projects.

The Program Committee of the workshop brought together four experts in fields related to data sharing and digital humanities:
• Laura Brazzo (PhD), Chairperson, Centro di Documentazione Ebraica Contemporanea (CDEC),3 Milan, Italy
• Vladimir Alexiev (PhD), Ontotext Corp,4 Sofia, Bulgaria
• Silvia Mazzini (Dr), regesta.exe,5 Rome, Italy
• Kepa J. Rodriguez (PhD), Yad Vashem,6 Jerusalem, Israel

Footnotes:
1. http://www.ehri-project.eu
2. https://summit2017.lodlam.net/
3. http://www.cdec.it
4. https://ontotext.com/
5. https://www.regesta.com
6. http://www.yadvashem.org

21 submissions were collected, from which the Program Committee accepted 14 for presentation and discussion at the workshop. Eleven of the presented submissions are represented by articles in this journal. The selection process revealed a very heterogeneous landscape in conceptions of, and degrees of development concerning, data sharing. In some cases the focus of the projects was the online presentation of the content of their databases; in other cases the focus was on strategies to share data among different agents more efficiently, or on strategies to integrate shared data into their own infrastructures.

The workshop was organized in four work sessions and opened with a general overview of the EHRI Project (Veerle Vanden Daelen) as well as a presentation of the recent State Archive of
Venice's project to digitize Holocaust materials (Raffaele Santoro and Andrea Pelizza). The four sessions progressed from presentations of projects based on data integration (with potential for data sharing) to projects with an advanced data-sharing framework.

The three presentations scheduled for Session I focused on different cases of data integration at the most relevant institutions for Holocaust materials: the United States Holocaust Memorial Museum in Washington (Michael Levy and Megan Lewis) and Yad Vashem in Jerusalem (Olga Tolokonsky).

Session II was devoted to oral testimonies and audio-visual materials. Case studies on the integration of materials and on strategies for sharing were provided by representatives of the Fortunoff Video Archive for Holocaust Testimonies at Yale University (Stephen Naron and Kevin Glick), the Center for Digital Systems at Freie Universität Berlin (Cord Pagenstecher) and the USC Shoah Foundation's Visual History Archive (Martha Stroud).

Session III showcased two ongoing digital humanities projects. The first project presented (Paris Papamichos Chronakis and Giorgos Antoniou) is based on the Digital Archive of the Greek Shoah and aims to reconstruct the social networks of Greek Holocaust survivors; the second presentation concerned the Nuremberg Trials Project, a digital document migration case study carried out at Harvard University (Lidia Santarelli).

In Session IV the scheduled presentations showed five advanced cases of data sharing and of the use of semantic approaches and technologies: the EHRI Project (Vladimir Alexiev, Ivelina Nikolova and Neli Hateva); the JudaicaLink Project (Kai Eckert and Maral Dadvar); the Holocaust and WW2 LOD developments in the Netherlands (Annelies Van Nispen and Lizzy Jongma); and the LOD Navigator (Giovanni Moretti, Rachele Sprugnoli and Sara Tonelli). A presentation of some innovative ways of working with archival standards (Laurent Romary and Charles Riondet) was also included in this final session.

Given the heterogeneity of the participants and their common interest in advanced methodologies enabling efficient data sharing, we included in the program a special panel devoted to Linked Open Data (LOD). Contributions were provided by two experts coming from the LODLAM Summit: Alessio Melandri (Synapta) provided an overview of LOD and SPARQL queries; Simon Cobb (Leeds University) introduced Wikidata and its potential for the integration and indexing of cultural heritage information.

Organizers and Program Committee

Organizing committee: Laura Brazzo, Fondazione CDEC, Milan; Matteo Perissinotto, University of Trieste – Fondazione CDEC, Milan; Simon Levis Sullam, Università Cà Foscari, Venice

Program Committee: Laura Brazzo (PhD), Chairperson, Fondazione CDEC, Milan; Vladimir Alexiev (PhD), Ontotext Corp, Sofia; Silvia Mazzini (Dr), regesta.exe, Rome; Kepa J. Rodriguez (PhD), Yad Vashem, Jerusalem.
Program

Thursday, 29th June 2017

9.00 Registration
9.30 Greetings: Simon Levis Sullam (Università Cà Foscari)

Introduction
9.45 Raffaele Santoro - Andrea Pelizza (State Archive of Venice): The archival series of the Venetian Prefettura and Questura – Archivio di Stato di Venezia
10.15 Veerle Vanden Daelen (Kazerne Dossin): Data Sharing, Holocaust Documentation and the Digital Humanities: Introducing the European Holocaust Research Infrastructure (EHRI)
11.15 Coffee break

Session I
11.30 Michael Levy (United States Holocaust Memorial Museum - USHMM): Sharing Collections Data: One Perspective on the Practice at USHMM
12.00 Megan Lewis (United States Holocaust Memorial Museum - USHMM): Using Names Lists for Social Network Analysis
12.30 Olga Tolokonsky (Yad Vashem): Integration of Heterogeneous Shared Data: Yad Vashem's Perspective as a Data Aggregator
13.00 Light lunch

Session II
14.00 Stephen Naron (Fortunoff Video Archive for Holocaust Testimonies, Yale University Library, New Haven): Fortunoff Video Archive for Holocaust Survivors Testimonies
14.30 Cord Pagenstecher (Center for Digital Systems, Freie Universität Berlin): Audiovisual Testimony Collections. Digital Archiving and Retrieval
15.00 Martha Stroud (USC Shoah Foundation Center for Advanced Genocide Research):
The “Nuremberg Trials Project”: A Case Study in Digital Document Migration Discussion: Holocaust Research Questions and Use Cases Enabled by Data Sharing 12.45 Light Lunch Session IV 14.00 Vladimir Alexiev (Ontotext) Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure 14.30 Charles Riondet (INRIA Research Center of Paris) Towards Multiscale Archival Digital Data 15.00 Coffee break 15.15 Kai Eckert (Stuttgart Media University) Judaica Link: A Knowledge Base for Jewish Culture and History 15.45 Annelies Van Nispen - Lizzy Jongma (Institute for War, Holocaust and Genocide Studies - NIOD) Holocaust and WW2 LOD Developments in the Netherlands v Umanistica Digitale - ISSN:2532-8816 - n.4, 2019 16.15 Rachele Sprugnoli (Bruno Kessler Foundation) LOD Navigator: Tracing Movements of Italian Shoah Victims Discussion: Global Archive and Data Integration - Enlarging EHRI Data Connections to other Datasets and Projects 16.45 Conclusions of the Workshop Table of Contents Preface Invited Talk Veerle Vanden Daelen, Data Sharing, Holocaust Documentation and the Digital Humanities: Introducing the European Holocaust Research Infrastructure (EHRI) Raffaele Santoro, The Archival Series Concerning the Confiscation of Jewish Property, Found in the Archives of the Prefecture of Venice Andrea Pelizza, Sources for the History of the Holocaust Extant in the State Archives of Venice. Inventory Projects and Digitization Full Papers Michael Levy, Sharing Collections Data: One Perspective on the Practice at USHMM, 17-22 Megan Lewis, Using Names Lists for Social Network Analysis Olga Tolokonsky, Integration of Heterogeneous Shared Data: Yad Vashem’s Perspective as a Data Aggregator Stephen Naron – Kevin Glick, Fortunoff Video Archive on Holocaust Survivors Testimonies Cord Pagenstecher, Digital Interview Collections at Freie Universitat Berlin. Survivors’ Testimonies as Research Data Paris Papamichos – Giorgios Antoniou, From Individual Survival to Social Networks of Survivors: Rethinking the Digital Archive of the Greek Shoah Laurent Romary – Charles Riondet Towards Multiscale Archival Digital Data Kai Eckert – Maral Dadvar Judaica Link. A Knowledge Base for Jewish Culture and History Annelies Van Nispen – Lizzy Jongma, Holocaust and WW2 LOD Developments in the Netherlands Giovanni Moretti – Rachele Sprugnoli – Sara Tonelli, LOD Navigator: Tracing Movements of Italian Shoah Victims vi Preface Organizers and Program Committee Organizing committee: Program Committee: Program Introduction Session I Session II LOD TUTORIAL Session III Session IV Table of Contents Preface Invited Talk Full Papers work_33ebahkoevdwrcgyzyftxrmrmi ---- Preface Jenkins, L 2020 Preface. Modern Languages Open, 2020(1): 37 pp. 1–2. DOI: https://doi.org/10.3828/mlo.v0i0.337 ARTICLE – DIGITAL MODERN LANGUAGES Preface Lucy Jenkins Cardiff University, GB JenkinsL27@cardiff.ac.uk When presented with the opportunity to contribute to a two-day workshop involving experts from Modern Languages and Digital Humanities, it appeared to me an unmissable and rare opportunity to come together with professionals from across disciplines that at times, are viewed on two separate spectrums. Unsurprisingly, the two days involved a number of curi- ous encounters and conversations as we discussed our experiences from differing levels of education and shared perspectives about the intersection between Languages and the Digital Humanities. 
Three things struck me in particular as emergent from the two days of discussions:

Firstly, the wealth of creative ideas and the enthusiasm of all involved to join together more closely Digital Humanities and Modern Languages in a way that could cascade to all levels of learning and teaching. We benefitted from a diverse set of experts spanning Higher Education Institutions and the schools sector, which made for valuable discussions about 'what matters' where digital technology and Modern Languages are concerned. It was evident that at the schools level there was increasing emphasis on improved digital competency and literacy, and that this needed to be translated more successfully to Higher Education Institutions, particularly where language learning is concerned.

Secondly, the clear and positive vision for how Modern Languages and Digital Humanities could contribute to learning experiences, but also the structural challenges in making this happen. Whilst colleagues all shared enthusiasm for the mission, there was wide experience of institutional reluctance to engage in dialogues where Digital Humanities and Modern Languages were concerned. There was much discussion of how to present the opportunity that digital and linguistic advances offer not only to the learner but also to the teacher. It was felt that time pressures and the need to 'keep up to date' with the digital agenda allowed little space for positive responses where Digital Humanities is concerned. This certainly influenced discussion about how to attack and structure a tutorial that would be manageable for a teacher who felt 'out of their comfort zone'.

Thirdly, the idea that there is no such thing as a digital native. Whilst the vision of the twenty-first-century learner is one with a handheld device rather than a pen and paper, it is important to acknowledge that learners' experiences of the digital are incredibly varied and, more often than not, connected primarily to social interactions, not educational ones. This means that when constructing and curating meaningful uses of digital technology in the language learning classroom, we need to consider how to structure the experience in a way that is accessible to all learners. Moreover, it became increasingly clear that when considering Digital Humanities it was not sufficient to understand only what could be feasibly and successfully transferred digitally, but more importantly what could be enhanced and improved in this transformation. This transformation poses inevitable challenges but also infinite opportunities.

A few thoughts from an excellent two days which I look forward to building upon over the coming months.

Author Information

Lucy Jenkins achieved First Class honours in English Literature and Italian at Cardiff University in 2014. Achieving a scholarship for excellence, she pursued a Master's degree in European Studies, also at Cardiff University, and graduated with a Distinction in 2015. Currently, Lucy is the National Coordinator for the MFL Student Mentoring Project, based at Cardiff University, and has progressed to create the role of Project Development Manager for Language Horizons, a DfE-funded digital mentoring project for languages. In the last year Lucy has also acted as a research assistant on an AHRC Open World Research Initiative project and on small GW4 and ESRC grants.
Lucy is developing her expertise in pedagogical approaches to language learning at both a research and a practical level. Currently working with particular interest on the interface between mentoring, learner motivation and online technology, Lucy has been instrumental in developing Language Horizons, and is looking to advance her work on engagement factors impacting upon the uptake of MFLs as well as pedagogical approaches to teaching languages. Lucy is also keenly developing her ability to work with policy makers to impact upon decisions made for language policy in Wales. This is particularly pertinent with the introduction of the New Curriculum for Wales.

How to cite this article: Jenkins, L 2020 Preface. Modern Languages Open, 2020(1): 37 pp. 1–2. DOI: https://doi.org/10.3828/mlo.v0i0.337

Published: 28 August 2020

Copyright: © 2020 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.

OPEN ACCESS Modern Languages Open is a peer-reviewed open access journal published by Liverpool University Press.
Learning Outcomes • Increase digital literacy skills, and an awareness of a wide variety of digital tools for historical research • Be able to comprehend and use language appropriate to digital humanities research • Understand and be able to analyze the advantages of different methodologies of digital humanities inquiry • Learn to collect, manage, and manipulate digital data from various sources • Be able to formulate, direct, and complete a digital humanities project, and explain its significance to academic and lay audiences • Have the ability to situate critically some of the larger debates within digital humanities and their relationship to traditional humanities disciplines 2 Methods of Evaluation and Weights Please note, unless otherwise mentioned in class and posted in the revised course outline on Slack, all assignments are due before class on Wednesday. Assessment Weight Participation 10% Wikipedia Assignment 10% Blog posts 15% Digital Tool Assessment 15% Proposal/Annotated Bibliography 15% Final Assignment 35% Total 100% Details on Assignments: Participation (10%) • You are expected to be prepared with questions and observations from the readings every week • Participate in discussions on the weekly readings • Provide reflection on the digital tools introduced • Contribute to online discussion outside of class (Slack Channel) *** Further information on all assignments below will be handed out as the term progresses *** Wikipedia Assignment (10%) Due: Feb 2nd in class. • Find several Wikipedia articles on a historical topic of interest • Critique the articles, and find one that you believe needs revision • Create a Wikipedia account (we’ll do this together in class) • Revise the articles where needed, OR create a new article if needed • Write a 750 word reflection on the process Blog Posts (15%) Due Date: Write throughout the term, all 3 in by March 28th • Create a blog on a site of your choice (I’d recommend Medium or WordPress) • During the semester write at least 3 blog posts on topics relevant to class • Blogs should be between 500 and 1000 words • If you write more than 3 posts, I will take your 3 highest grades for the final mark Digital Tool Assessment (15%) Due Date: Throughout the term, due the week you present on your chosen tool. • Sign up for a digital tool or tool type chosen from the syllabus below, or propose another tool but make sure to get this approved by instructor • Prepare a short (5 mins) presentation of the tool to the class. • The class will then have some time to experiment. Be prepared to answer questions on the tool. • Provide a detailed report (1000 words) outlining tool features, critical analysis, and potential tool uses. Cite all the sources you use including any screenshots you include. 3 Proposal & Annotated Bibliography (15%) Due Date: Friday, February 16th • Proposal for topic of final assignment (500 words) • Annotated bibliography of at least 8 sources • Details will be provided in handout Final assignment (digital project + showcase): 35% Due Date: Friday, April 6th Details to be provided Policy on Late Assignments and Extensions: Assignments are due in class on the dates listed in this outline. Late assignments will be penalized 5% per day (24 hours), including weekend days. Late assignments will not be accepted after 7 days. Extensions will only be considered if a student has written documentation from a doctor or counsellor. No extensions will be granted on the basis of workload. 
Non- medical extensions must be approved at least three days before the assignment is due. Course Resources This class assumes access to a laptop computer (not a tablet) for the hands-on activities in and out of class. If you do not have access to a laptop, please consult the instructor after the first class. It is possible to borrow laptops from the library. An ongoing list of useful links and resources will be created in a shared Slack Channel. You are expected to join Slack and to participate in the ongoing discussions. D2L CourseLink will be used for marks and reminders about course work. Required Texts There are two required textbooks for this course: Graham, S., I. Milligan, and S. Weingart. (2015) Exploring Big Historical Data: The Historian’s Macroscope. Imperial Press. = Macroscope in reading list below Dougherty, J., and K. Nawrotzki (Eds). (2013) Writing History in the Digital Age. The University of Michigan Press. Ann Arbor. = Writing History in reading list below. Recommended Texts Gold, M.K., and L. Klein. (2016) Debates in the Digital Humanities 2016. University of Minnesota Press. = Debates 2016 in reading list. ***This text is available online at http://dhdebates.gc.cuny.edu/debates/2 *** http://dhdebates.gc.cuny.edu/debates/2 4 Schedule & assigned work (provisional – updates to be posted in the Slack Channel) Week/Da te Wednesday Friday Week 1 Jan 10/12 Course expectations & outline overview Cummings, A.S. and J.Jarrett. Only Typing? Informal Writing, Blogging, and the Academy. Writing History. Background on DH TOOLS: Zotero, DIRT Directory Kirschenbaum, M. (2010) What is Digital Humanities and What’s it Doing in English Departments? https://mkirschenbaum.files.wordpress.com/2 011/03/ade-final.pdf Fitzpatrick, K. (2012). The Humanities, Done Digitally.. http://dhdebates.gc.cuny.edu/debates/text/30 Spiro, L. (2012). “This Is Why We Fight?”: Defining the Values of the Digital Humanities. In M. K. Gold (Ed.), Debates in the Digital Humanities. Minneapolis: University of Minnesota Press. http://dhdebates.gc.cuny.edu/debates/text/13 Gold, M.K. and L. Klein (eds) Debates 2016 Introduction. http://dhdebates.gc.cuny.edu/debates/2 Week 2 Jan 17/19 Background on Digital History Debates 2016: Chapter 25. Robertson, S. (2016) The Differences Between Digital Humanities and Digital History. Nawrotzki, K, and J. Dougherty. Introduction. Writing History Dorn, S. Is Digital History More than an Argument about the Past? In Writing History. Googling the Past TOOLS: Google NGRAMS Solberg, J. (2012). Googling the Archive: Digital Tools and the Practice of History. Advances in the History of Rhetoric, 15(1), 53–76. Leary, P. (2005). Googling the Victorians. Journal of Victorian Culture, 10(1), 72–86. Fyfe, P. (2015). Technologies of Serendipity. Victorian Periodicals Review, 48(2), 261–266. Leary, P. (2015). Response: Search and Serendipity. Victorian Periodicals Review, 48(2), 267–273. https://mkirschenbaum.files.wordpress.com/2011/03/ade-final.pdf https://mkirschenbaum.files.wordpress.com/2011/03/ade-final.pdf http://dhdebates.gc.cuny.edu/debates/text/30 http://dhdebates.gc.cuny.edu/debates/text/13 http://dhdebates.gc.cuny.edu/debates/2 5 Week/Da te Wednesday Friday Week 3 Jan 24/26 History on the Web (pt 1) TOOL: HTRC Bookworm Macroscope: Chapter 1. “The Joy of Big Data.” Macrosope: Chapter 2. “The DH Moment.” Rosenzweig, R. (2001). The Road to Xanadu : Public and Private Pathways on the History Web. In Organization (Vol. 88, pp. 548– 579). 
History on the Web (pt 2) TOOL: WIKIPEDIA Seligman, A. Teaching Wikipedia without Apologies. In Writing History. Wolff, R. The Historian’s Craft, Popular Memory, and Wikipedia. In Writing History. Rosenzweig, R. (2006). Can History Be Open Source? Wikipedia and the Future of the Past. Journal of American History, (June), 117–146. Week 4 Jan 31/Feb 2nd Text Mining Macroscope: Chapter 3. “Text Mining Tools.” D’Ignazio, C. (2015). “What would feminist data visualization look like?” https://civic.mit.edu/feminist-data- visualization Text Analysis TOOL: VOYANT Ramsay, S. (2014). The Hermeneutics of Screwing Around; or What You Do with a Million Books. In K. Kee (Ed.), PastPlay: Teaching and Learning with Technology. Ann Arbor: University of Michigan Press. *** Wikipedia Assignment Due *** Week 5 Feb 7/9 Data Visualization Macroscope: Chapter 5. “Making Your Data Legible.” *** Guest Lecture by DataVis team from the McLaughlin Library *** Diversity in DH TOOLS: TABLEAU, TIMELINE JS Debates 2016: Chapter 21. Earhart, A. and Taylor, T., “Pedagogies of Race” Debates 2016: Chapter 24. Hsu, W. ”Lessons on Public Humanities from the Public Sphere” Tim Sherratt, “It’s all about the stuff: collections, interfaces, power and people,” 1 December 2011 http://discontents.com.au/its- all-about-the-stuff-collections-interfaces- power-and-people/ https://civic.mit.edu/feminist-data-visualization https://civic.mit.edu/feminist-data-visualization http://discontents.com.au/its-all-about-the-stuff-collections-interfaces-power-and-people/ http://discontents.com.au/its-all-about-the-stuff-collections-interfaces-power-and-people/ http://discontents.com.au/its-all-about-the-stuff-collections-interfaces-power-and-people/ 6 Week/Da te Wednesday Friday Week 6 Feb 14/16 Crowd-sourcing History Graham, S. Massie, G., and N. Feuerherm. The HeritageCrowd Project: A Case Study in Crowdsourcing Public History. In Writing History. Rural Diary Archive. https://ruraldiaries.lib.uoguelph.ca/ Manuscript Transcription Projects. https://folgerpedia.folger.edu/Manu script_transcription_projects Feminist DH TOOLS: CWRC Debates 2016: Chapter 10. Losh, L. et al. “Putting the Human Back into the Digital Humanities” Nowviskie, B (2011). What Do Girls Dig? http://nowviskie.org/2011/what-do-girls-dig/ *** Annotated Bib & Proposal Due *** Week 7 Feb 21/23 *** No Class. Reading Week *** Week 8 Feb 28th/ Mar 2nd Playing the Past LaPensée, Elizabeth. (2017). Video Games Encourage Indigenous Cultural Expression. The Conversation. Never Alone: The Game. http://neveralonegame.com/ Kee, K., Graham, S., Dunae, P., Lutz, J., Large, A., Blondeau, M., & Clare, M. (2009). Towards a Theory of Good History Through Gaming. Canadian Historical Review, 90(2), 303–326. TOOLS: INKLEWRITER, TWINE, AURASMA Compeau, T., and R. MacDougall (2014) Tecumseh Lies Here: Goals and Challenges for a Pervasive History Game in Progress. In K. Kee (Ed.), PastPlay: Teaching and Learning with Technology. Ann Arbor: University of Michigan Press. Zucconi, L., Watrall, E., Ueno, H., and Rosner, L., Pox and the City: Challenges in Writing a Digital History Game. In Writing History. Spring, D. (2015). Gaming history: computer and video games as historical scholarship. Rethinking History, 19(2), 207–221. 
https://ruraldiaries.lib.uoguelph.ca/ https://folgerpedia.folger.edu/Manuscript_transcription_projects https://folgerpedia.folger.edu/Manuscript_transcription_projects http://nowviskie.org/2011/what-do-girls-dig/ https://theconversation.com/video-games-encourage-indigenous-cultural-expression-74138 https://theconversation.com/video-games-encourage-indigenous-cultural-expression-74138 https://theconversation.com/video-games-encourage-indigenous-cultural-expression-74138 http://neveralonegame.com/ 7 Week/Da te Wednesday Friday Week 9 Mar 7/9 Mapping the Past Jenstad, J. (2011) Using Early Modern Maps in Literary Studies: Views and Caveats from London. Nowviskie, B. (2010) Inventing the Map in the Digital Humanities: A Young Lady’s Primer. https://journals.tdl.org/paj/index.php /paj/article/view/11/61 TOOLS: GOOGLE EARTH, HISTORYPIN Harkema, C., and C. Nygren (2012). HistoryPin for Library Image Collections. http://synergies.lib.uoguelph.ca/index.php/perj /article/view/1970/2620#.V6TJFZOAOko Week 10 Mar 14/16 Network Analysis Macroscope: Chapter 6. “Network Analysis” Jackson, C. (2017). Using social network analysis to reveal unseen relationships in medieval Scotland. Digital Scholarship in the Humanities, 32(2). TOOLS: GEPHI, CYTOSCAPE Alan Liu. "Friending the past: The sense of history and social computing." New Literary History 42.1 (2011): 1-30. https://muse.jhu.edu/article/441862 Week 11 Mar 21/23 Publishing, Archives and Exhibits TOOLS: OMEKA, SCALAR Christen, K. (2012). Does Information Really want to be free? Indigenous Knowledge Systems and the Question of Openness. http://www.kimchristen.com/wp- content/uploads/2015/07/christen6. 2012.pdf McPherson, T (2015) “Post- Archive: The Humanities, the Archive, and the Database.” Ed. David T. Goldberg and Patrik Svensson. Between Humanities and the Digital. MIT Press. Manoff, M. (2010). Archive and Database as Historical Record. Portal: Libraries and the Academy, 10(4), 385–398 *** No Classes – Instructor Away *** https://journals.tdl.org/paj/index.php/paj/article/view/11/61 https://journals.tdl.org/paj/index.php/paj/article/view/11/61 http://synergies.lib.uoguelph.ca/index.php/perj/article/view/1970/2620#.V6TJFZOAOko http://synergies.lib.uoguelph.ca/index.php/perj/article/view/1970/2620#.V6TJFZOAOko https://muse.jhu.edu/article/441862 http://www.kimchristen.com/wp-content/uploads/2015/07/christen6.2012.pdf http://www.kimchristen.com/wp-content/uploads/2015/07/christen6.2012.pdf http://www.kimchristen.com/wp-content/uploads/2015/07/christen6.2012.pdf 8 Week/Da te Wednesday Friday Week 12 Mar 28/30 Linked Pasts TOOLS: HuViz, PERIPLEO (Kim will demo) Brown, S. (2015). Networking Feminist Literary History: Recovering Eliza Meteyard’s Web. In Virtual Victorians (pp. 57-82). Palgrave Macmillan US. The Linked Jazz Project: https://linkedjazz.org/ *** Blog Posts Due *** *** No Classes. Holiday Friday *** Week 13 April 4/6 Reflections Readings TBA. Final class – Showcase of Digital Projects in THINC Lab. *** Final Assignments Due *** Standard College of Arts Statements for Winter 2018 E-mail Communication As per university regulations, all students are required to check their e-mail account regularly: e-mail is the official route of communication between the University and its students. 
When You Cannot Meet a Course Requirement When you find yourself unable to meet an in-course requirement because of illness or compassionate reasons, please advise the course instructor (or designated person, such as a teaching assistant) in writing, with your name, id#, and e-mail contact. See the undergraduate calendar for information on regulations and procedures for Academic Consideration. Drop Date Courses that are one semester long must be dropped by the end of the fortieth day of class (Friday, 9 March 2018); two-semester courses must be dropped by the last day of the add https://linkedjazz.org/ http://www.uoguelph.ca/registrar/calendars/undergraduate/current/c08/c08-ac.shtml http://www.uoguelph.ca/registrar/calendars/undergraduate/current/c08/c08-ac.shtml 9 period in the second semester. The regulations and procedures for Dropping Courses are available in the Undergraduate Calendar. Copies of out-of-class assignments Keep paper and/or other reliable back-up copies of all out-of-class assignments: you may be asked to resubmit work at any time. Accessibility The University promotes the full participation of students who experience disabilities in their academic programs. To that end, the provision of academic accommodation is a shared responsibility between the University and the student. When accommodations are needed, the student is required to first register with Student Accessibility Services (SAS). Documentation to substantiate the existence of a disability is required, however, interim accommodations may be possible while that process is underway. Accommodations are available for both permanent and temporary disabilities. It should be noted that common illnesses such as a cold or the flu do not constitute a disability. Use of the SAS Exam Centre requires students to book their exams at least 7 days in advance, and not later than the 40th Class Day. For more information see the SAS web site. Student Rights and Responsibilities Each student at the University of Guelph has rights which carry commensurate responsibilities that involve, broadly, being a civil and respectful member of the University community. The Rights and Responsibilities are detailed in the Undergraduate Calendar Academic Misconduct The University of Guelph is committed to upholding the highest standards of academic integrity and it is the responsibility of all members of the University community – faculty, staff, and students – to be aware of what constitutes academic misconduct and to do as much as possible to prevent academic offences from occurring. University of Guelph students have the responsibility of abiding by the University's policy on academic misconduct regardless of their location of study; faculty, staff and students have the responsibility of supporting an environment that discourages misconduct. Students need to remain aware that instructors have access to and the right to use electronic and other means of detection. Please note: Whether or not a student intended to commit academic misconduct is not relevant for a finding of guilt. Hurried or careless submission of assignments does not excuse students from responsibility for verifying the academic integrity of their work before submitting it. Students who are in any doubt as to whether an action on their part could be construed as an academic offence should consult with a faculty member or faculty advisor. The Academic Misconduct Policy is detailed in the Undergraduate Calendar. 
Recording of Materials Presentations which are made in relation to course work—including lectures—cannot be recorded or copied without the permission of the presenter, whether the instructor, a classmate https://www.uoguelph.ca/registrar/calendars/undergraduate/current/c08/c08-drop.shtml http://www.uoguelph.ca/sas https://www.uoguelph.ca/registrar/calendars/undergraduate/2014-2015/c01/index.shtml https://www.uoguelph.ca/registrar/calendars/undergraduate/2014-2015/c01/index.shtml http://www.uoguelph.ca/registrar/calendars/undergraduate/current/c08/c08-amisconduct.shtml 10 or guest lecturer. Material recorded with permission is restricted to use for that course unless further permission is granted. Resources The Academic Calendars are the source of information about the University of Guelph’s procedures, policies and regulations, which apply to undergraduate, graduate and diploma programs. http://www.uoguelph.ca/registrar/calendars/index.cfm?index MCKN 059 Final Exam: There is no final exam for this course. Course Synopsis Methods of Evaluation and Weights Course Resources D2L CourseLink will be used for marks and reminders about course work. Required Texts Recommended Texts work_3dobvgv6hjcjzkwylztzizzpki ---- Enhancing digital human motion planning of assembly tasks through dynamics and optimal control Available online at www.sciencedirect.com 2212-8271 © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the organizing committee of the 6th CIRP Conference on Assembly Technologies and Systems (CATS) doi: 10.1016/j.procir.2016.02.125 Procedia CIRP 44 ( 2016 ) 20 – 25 ScienceDirect 6th CIRP Conference on Assembly Technologies and Systems (CATS) Enhancing digital human motion planning of assembly tasks through dynamics and optimal control Staffan Björkenstama,*, Niclas Delfsa, Johan S. Carlsona, Robert Bohlina, Bengt Lennartsonb aFraunhofer-Chalmers Centre, Chalmers Science Park, SE-412 88 Göteborg, Sweden bAutomation Research Group, Department of Signals and Systems, Chalmers University of Technology, SE-412 96 Göteborg, Sweden ∗ Corresponding author. Tel.: +46-31-772-4284; fax: +46-31-772-4260. E-mail address: staffan@fcc.chalmers.se Abstract Better operator ergonomics in assembly plants reduce work related injuries, improve quality, productivity and reduce cost. In this paper we investigate the importance of modeling dynamics when planning for manual assembly operations. We propose modeling the dynamical human motion planning problem using the Discrete Mechanics and Optimal Control (DMOC) method, which makes it possible to optimize with respect to very general objectives. First, two industrial cases are simulated using a quasi-static inverse kinematics solver, demonstrating problems where this approach is sufficient. Then, the DMOC-method is used to solve for optimal trajectories of a lifting operation with dynamics. The resulting trajectories are compared to a steady state solution along the same path, indicating the importance of using dynamics. c© 2016 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the organizing committee of the 6th CIRP Conference on Assembly Technologies and Systems (CATS). Keywords: Assembly; Digital human modeling; Ergonomy; Dynamics; Optimal control 1. Introduction Although the degree of automation is increasing in manu- facturing industries, many assembly operations are performed manually. 
To avoid injuries and to reach sustainable production of high quality, comfortable environments for the operators are vital, see [1] and [2]. Poor station layouts, poor product designs or badly chosen assembly sequences are common sources leading to unfavorable poses and motions. To keep costs low, preventive actions should be taken early in a project, raising the need for feasibility and ergonomics studies in virtual environments long before physical prototypes are available.

Today, in the automotive industries, such studies are conducted to some extent. The full potential, however, is far from reached due to limited software support in terms of capability for realistic pose prediction, motion generation and collision avoidance. As a consequence, ergonomics studies are time consuming and are mostly done for static poses, not for full assembly motions. Furthermore, these ergonomic studies, even though performed by a small group of highly specialized simulation engineers, show low reproducibility within the group [3].

To describe operations and facilitate motion generation, it is common to equip the manikin with coordinate frames attached to end-effectors like hands and feet. The inverse kinematic problem is to find joint values such that the positions and orientations of the hands and feet match certain target frames. For the quasi-static inverse kinematics this leads to an underdetermined system of equations, since the number of joints exceeds the number of end-effector constraints. Due to this redundancy there exists a set of solutions, allowing us to consider ergonomics aspects, collision avoidance, and maximizing comfort when choosing one solution.

The dynamic motion planning problem is stated as an optimal control problem, which we discretize using discrete mechanics. This results in an optimization problem, which can be solved using standard nonlinear programming solvers. Furthermore, this general problem formulation makes it fairly easy to include very general constraints and objectives.

In this paper we show, using a couple of case studies, where the quasi-static solver is sufficient, and where the DMOC solver could improve the solution. The paper extends the work presented in [4] and [5], and is part of the Cromm (Creation of Muscle Manikins) project [6].

2. Background
2.1. Manikin Model
In this section we present the manikin model and the inverse kinematic problems, both quasi-static and with dynamics.

2.2. Kinematics
The manikin model is a tree of rigid bodies connected by joints. Each body has a fixed reference frame and we describe its position relative to its parent body by a rigid transformation T(q), where q is the coordinate of the joint. To position the manikin in space, i.e. with respect to some global coordinate system, it has an exterior root as the origin and a prismatic joint and a rotation joint as exterior joints, as opposed to the interior links representing the manikin itself, see [4]. Together, the exterior links mimic a rigid transformation that completely specifies the position of the lower lumbar. In turn, the lower lumbar represents an interior root, i.e. it is the ancestor of all interior joints.
Note that the choice of the lower lumbar is not critical. In principle, any link could be the interior root, and the point is that the same root can be used through a complete simulation. No re-rooting or change of tree hierarchy will be needed.

Now, for a given configuration of each joint, collected in the joint vector $q = [q_1^T, \ldots, q_n^T]^T$, we can calculate all the relative transformations $T_1, \ldots, T_n$, traverse the tree beginning at the root, and propagate the transformations to get the global position of each body. We say that the manikin is placed in a pose, and the mapping from a joint vector into a pose is called forward kinematics. Furthermore, a continuous mapping $q(t)$, where $t \in \mathbb{R}$, is called a motion, or a trajectory of the system.

2.3. Quasi-Static Inverse Kinematics
In order to facilitate the generation of realistic poses that also fulfill some desired rules, we add a number of constraints on the joint vector. These kinematic constraints can, for example, restrict the position of certain links, either relative to other links or with respect to the global coordinate system, or ensure the manikin is kept in balance, see Section 2.3.2. All the kinematic constraints can be defined by a vector valued function $g$ such that

$$g(q) = 0 \quad (1)$$

must be satisfied at any pose. Finding a solution to equation (1) is generally referred to as inverse kinematics. Often, in practice, the number of constraints is far less than the number of joints of the manikin. Due to this redundancy there exist many solutions, allowing us to consider ergonomics aspects and maximize comfort when choosing a solution. To do so, we introduce a scalar comfort function

$$h(q) \quad (2)$$

capturing as many ergonomic aspects as desired. The purpose is to be able to compare different poses in order to find solutions that maximize comfort. The comfort function is a generic way to give preference to certain poses while avoiding others. Typically $h$ considers joint limits, distance to surrounding geometry in order to avoid collision, magnitude of contact forces, and forces and torques on joints, see Section 2.3.3. Furthermore, by combining equations (1) and (2) we can formulate the inverse kinematic problem as

$$\max_{q} \; h(q) \quad \text{subject to} \quad g(q) = 0. \quad (3)$$

2.3.1. Collision Avoidance
While some contact with the environment may be intended, e.g. grasping of objects and leaning, and contribute to the force and moment balance, other contacts, for example collisions, are undesired. The comfort function offers a convenient way to include a simple, yet powerful, method penalizing poses close to collision. In robotics this method is generally known as a repulsive potential [7][8]. The underlying idea is to define a barrier, say, around the obstacles, increasing the discomfort towards infinity near collision. This method does not address the problem of escaping an already occurring collision. The idea is merely that if the manikin starts in a collision-free pose, then the repulsive potential prevents the manikin from entering a colliding pose. Note: it is common to think of the repulsive potential, or rather its gradient field, as a force field pushing an object away from obstacles. In this work, we do not want such artificial forces to contribute to the force balance. To avoid confusion with real contact forces we will not use that analogy.
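To make the quasi-static formulation concrete, the following is a minimal sketch of problem (3) using SciPy's SLSQP solver. The comfort and constraint functions are hypothetical placeholders standing in for the paper's manikin model; the paper does not state which solver its implementation uses.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-ins for the paper's model: a comfort function h(q) as in (2)
# and kinematic constraints g(q) = 0 as in (1), e.g. hand-on-target and balance.
def comfort(q):
    # Example: penalize deviation from a neutral pose (placeholder for h).
    return -np.sum(q**2)

def constraints(q):
    # Placeholder for g(q); here: fix the first joint at 0.1 rad.
    return np.array([q[0] - 0.1])

nq = 10                  # number of joints (example value)
q0 = np.zeros(nq)        # initial guess: neutral pose

# Problem (3): maximize h(q) subject to g(q) = 0, written as a minimization.
res = minimize(lambda q: -comfort(q), q0, method="SLSQP",
               constraints={"type": "eq", "fun": constraints})
print(res.x, res.success)
```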
2.3.2. Balance and Contact Forces
One important part of $g$ is ensuring that the manikin is kept in balance. For this, the weight of links and objects being carried, as well as external forces and torques due to contact with the floor or other objects, must be considered. The sums of all forces and torques are

$$g_{\mathrm{force}}(q) = m g + \sum_{j \in J} f_j,$$
$$g_{\mathrm{torque}}(q) = m_c \times m g + \sum_{j \in J} \left( p_j \times f_j + \tau_j \right),$$

where $m$ is the total body mass, $g$ is the gravity vector, $m_c$ is the center of mass, $f_j$ and $\tau_j$ are external force and torque vectors at point $p_j$, and $J$ is the index set. Note that the quantities may depend on the pose, but this has been omitted for clarity. In general, external forces and torques due to contacts are unknown. For example, when standing with both feet on the floor it is not obvious how the contact forces are distributed between the feet. In what follows we let $f$ and $\tau$ denote the unknown forces and torques, and we stack them into the vector $x = [q^T \; f^T \; \tau^T]^T$. Then we can rephrase (3) as follows:

$$\max_{x} \; h(x) \quad \text{subject to} \quad g(x) = 0. \quad (4)$$

2.3.3. Joint Torque
The joint loads are key ingredients when evaluating poses from an ergonomic perspective [9]. Furthermore, research shows that real humans tend to minimize the muscle strain, i.e. minimize the proportion of load compared to the maximum possible load [10], so by normalizing the load on each joint by the muscle strength good results can be achieved. In this article we choose the function

$$h_t = \sum_{i=1}^{n} w_i^2 \tau_i^2,$$

where $\tau_i$ is the torque in joint $i$, and $w_i$ is the reciprocal of the joint strength. Note that it is straightforward to propagate the external forces and torques and the accumulated link masses through the manikin in order to calculate the load on each joint.
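As an illustration of the balance equations and the torque-based comfort term above, the sketch below evaluates the force and torque residuals and $h_t$ for given contact forces. All numerical values (masses, contact points, joint torques, strengths) are made-up examples, not data from the paper.

```python
import numpy as np

g_vec = np.array([0.0, 0.0, -9.81])            # gravity vector
m = 75.0                                       # total body mass (example)
m_c = np.array([0.0, 0.0, 1.0])                # center of mass (example)

# Example contact set J: one force/torque pair per foot (made-up values).
p = [np.array([0.1, 0.1, 0.0]), np.array([-0.1, 0.1, 0.0])]     # contact points
f = [np.array([0.0, 0.0, 368.0]), np.array([0.0, 0.0, 368.0])]  # contact forces
tau_c = [np.zeros(3), np.zeros(3)]                               # contact torques

# Balance residuals: both are zero when the manikin is in static equilibrium.
g_force = m * g_vec + sum(f)
g_torque = np.cross(m_c, m * g_vec) + sum(np.cross(pj, fj) + tj
                                          for pj, fj, tj in zip(p, f, tau_c))

# Torque comfort h_t = sum_i (w_i * tau_i)^2, with w_i = 1 / joint strength.
tau = np.array([12.0, 5.0, 3.0])               # joint torques (example)
w = 1.0 / np.array([120.0, 80.0, 60.0])        # reciprocal joint strengths (example)
h_t = np.sum((w * tau) ** 2)
print(g_force, g_torque, h_t)
```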
2.4. Discrete Mechanics and Optimal Control
2.4.1. The constrained discrete Euler–Lagrange equations
Consider the mechanical system specified by a configuration manifold $Q \subseteq \mathbb{R}^{n_q}$ and Lagrangian $L : TQ \to \mathbb{R}$, where $TQ$ is the tangent bundle of the configuration manifold. Furthermore, suppose the motion of the system is constrained by the equation $\phi(q) = 0 \in \mathbb{R}^m$ to lie in the constraint manifold $C = \phi^{-1}(0) \subset Q$. Let $U \subseteq \mathbb{R}^{n_u}$ be the set of admissible controls and $F : TQ \times U \to T^{*}Q$ the external force acting on the system, where $T^{*}Q$ is the cotangent bundle of the configuration manifold. Introducing the multiplier $\lambda(t) \in \mathbb{R}^m$, the Lagrange–d'Alembert principle states that trajectories of the system satisfy

$$\delta \int_{t_1}^{t_2} \left[ L(q(t), \dot q(t)) + \phi^{T}(q(t))\,\lambda(t) \right] dt + \int_{t_1}^{t_2} F(q(t), \dot q(t), u(t)) \cdot \delta q \, dt = 0, \quad (5)$$

where variations are taken with respect to $q$, fixed at the endpoints, and with respect to $\lambda$. Integration by parts and the fundamental lemma of calculus of variations give the following differential algebraic equations, known as the constrained Euler–Lagrange equations of motion:

$$\frac{\partial L}{\partial q}(q(t), \dot q(t)) - \frac{d}{dt}\frac{\partial L}{\partial \dot q}(q(t), \dot q(t)) + F(q(t), \dot q(t), u(t)) + \Phi^{T}(q(t))\,\lambda(t) = 0, \quad (6a)$$
$$\phi(q(t)) = 0, \quad (6b)$$

where $\Phi$ denotes the Jacobian of the constraint function.

The key idea of variational integrators is to directly approximate the variational principle (5) rather than the equations of motion (6). We now discretize $q(t)$ in $[t_1, t_2]$ using a fixed time step $h = (t_2 - t_1)/N$, so that $q^{(k)}$ is an approximation of $q(t_1 + kh)$ for $k = 0, \ldots, N$. Furthermore, we discretize the control such that $u^{(k)}$ is an approximation of $u(t_1 + (k + \tfrac{1}{2})h)$ for $k = 0, \ldots, N-1$. We are now ready to replace the continuous state space, $TQ$, with the discrete state space, $Q \times Q$, and construct a discrete Lagrangian $L_d : Q \times Q \times \mathbb{R} \to \mathbb{R}$ such that

$$L_d(q^{(k)}, q^{(k+1)}, h) \approx \int_{t_1 + kh}^{t_1 + (k+1)h} L(q(t), \dot q(t)) \, dt.$$

Introducing left and right discrete forces, $F_d^{+}$ and $F_d^{-}$, and discrete multipliers, $\lambda_d^{(k)}$ for $k = 0, \ldots, N$, a discrete variational principle corresponding to (5) can be formulated as

$$\delta \sum_{k=0}^{N-1} \left( L_d(q^{(k)}, q^{(k+1)}, h) + \tfrac{1}{2}\phi^{T}(q^{(k)})\lambda_d^{(k)} + \tfrac{1}{2}\phi^{T}(q^{(k+1)})\lambda_d^{(k+1)} \right) + \sum_{k=0}^{N-1} \left( F_d^{-}(q^{(k)}, q^{(k+1)}, u^{(k)}, h) \cdot \delta q^{(k)} + F_d^{+}(q^{(k)}, q^{(k+1)}, u^{(k)}, h) \cdot \delta q^{(k+1)} \right) = 0 \quad (7)$$

for all variations $\delta \lambda_d^{(k)}$ and $\delta q^{(k)}$ with $\delta q^{(0)} = \delta q^{(N)} = 0$. This principle is equivalent to the discrete Euler–Lagrange equations:

$$D_2 L_d(q^{(k-1)}, q^{(k)}, h) + D_1 L_d(q^{(k)}, q^{(k+1)}, h) + F_d^{+}(q^{(k-1)}, q^{(k)}, u^{(k-1)}, h) + F_d^{-}(q^{(k)}, q^{(k+1)}, u^{(k)}, h) + \Phi^{T}(q^{(k)})\lambda_d^{(k)} = 0, \quad (8a)$$
$$\phi(q^{(k+1)}) = 0, \quad (8b)$$

where $D_1 L_d$ and $D_2 L_d$ are the slot derivatives with respect to the first and second argument. These equations define the variational integrator by implicitly mapping $(q^{(k-1)}, q^{(k)}, u^{(k-1)}, u^{(k)})$ to $(q^{(k+1)}, \lambda_d^{(k)})$. Please refer to [11] for a thorough introduction to discrete mechanics and [12,13] for more on discrete mechanics and optimal control of multibody systems.

A reasonable trade-off between accuracy and performance is to use the midpoint rule to approximate the relevant integrals. The discrete Lagrangian then becomes

$$L_d(q_0, q_1, h) = h\, L\!\left( \frac{q_0 + q_1}{2}, \frac{q_1 - q_0}{h} \right). \quad (9)$$

Thus

$$D_1 L_d(q_0, q_1, h) = \frac{h}{2}\, \frac{\partial L}{\partial q}\!\left( \frac{q_0 + q_1}{2}, \frac{q_1 - q_0}{h} \right) - \frac{\partial L}{\partial \dot q}\!\left( \frac{q_0 + q_1}{2}, \frac{q_1 - q_0}{h} \right)$$

and

$$D_2 L_d(q_0, q_1, h) = \frac{h}{2}\, \frac{\partial L}{\partial q}\!\left( \frac{q_0 + q_1}{2}, \frac{q_1 - q_0}{h} \right) + \frac{\partial L}{\partial \dot q}\!\left( \frac{q_0 + q_1}{2}, \frac{q_1 - q_0}{h} \right).$$

Furthermore, it is then natural to use the following discrete forces:

$$F_d^{+}(q_0, q_1, u_0, h) = F_d^{-}(q_0, q_1, u_0, h) = \frac{h}{2}\, F\!\left( \frac{q_0 + q_1}{2}, \frac{q_1 - q_0}{h}, u_0 \right). \quad (10)$$

This discretization scheme results in a second order accurate integrator.

2.5. Optimal control problem
We consider the following optimal control problem: minimize

$$J = \chi(q(t_f), \dot q(t_f)) + \int_{t_0}^{t_f} L(q(t), \dot q(t), u(t)) \, dt \quad (11a)$$

subject to

$$\frac{\partial L}{\partial q}(q(t), \dot q(t)) - \frac{d}{dt}\frac{\partial L}{\partial \dot q}(q(t), \dot q(t)) + F(q(t), \dot q(t), u(t)) + \Phi^{T}(q(t))\lambda(t) = 0, \quad (11b)$$
$$\phi(q(t)) = 0, \quad (11c)$$
$$g(q(t), \dot q(t), u(t)) \ge 0, \quad (11d)$$
$$\psi_0(q(t_0), \dot q(t_0)) = 0, \quad (11e)$$
$$\psi_f(q(t_f), \dot q(t_f)) = 0 \quad (11f)$$

for $t \in [t_0, t_f]$. Thus, we want to minimize a performance index (11a), consisting of the terminal cost, $\chi$, and the integral of the control Lagrangian, $L$, along the trajectory, while satisfying the dynamics (11b)–(11c), path constraints (11d), and boundary conditions (11e)–(11f).

It is well known that the discrete mechanics formulation of the equations of motion shows excellent conservation of quantities, such as momenta and energy, conserved by the continuous system. This will enable us to take larger time steps and still get physically meaningful results [14]. There is, however, yet another computational advantage when used in optimal control. Namely, since there are no explicit references to velocities in the discrete equations of motion, the resulting optimization problem can be formulated using fewer variables, compared to standard discretizations of trajectories on $TQ$.
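To illustrate the midpoint scheme (8)–(10), here is a minimal sketch of one unconstrained, unforced integrator step for a single particle with $L = \tfrac{1}{2} m \dot q^2 - V(q)$, solving the discrete Euler–Lagrange residual for the next position with SciPy. The potential and parameters are example choices, not the paper's manikin model.

```python
import numpy as np
from scipy.optimize import fsolve

m, h = 1.0, 0.05                       # mass and time step (example values)
V = lambda q: 0.5 * q**2               # example potential V(q) = q^2 / 2
dV = lambda q: q                       # its gradient

def D1(q0, q1):
    # D1 Ld for Ld(q0, q1) = h*L((q0+q1)/2, (q1-q0)/h), with L = m*qdot^2/2 - V(q)
    qm, vm = 0.5 * (q0 + q1), (q1 - q0) / h
    return -0.5 * h * dV(qm) - m * vm

def D2(q0, q1):
    qm, vm = 0.5 * (q0 + q1), (q1 - q0) / h
    return -0.5 * h * dV(qm) + m * vm

# Discrete Euler-Lagrange residual (8a) without forces or constraints:
# D2 Ld(q_prev, q_cur) + D1 Ld(q_cur, q_next) = 0, solved for q_next.
def step(q_prev, q_cur):
    return fsolve(lambda q_next: D2(q_prev, q_cur) + D1(q_cur, q_next), q_cur)[0]

q = [0.0, 0.05]                        # two initial positions define the discrete state
for _ in range(200):
    q.append(step(q[-2], q[-1]))       # harmonic oscillation with well-behaved energy
```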
Approximating the objective using the midpoint rule and enforcing the path constraints at the midpoints, we get the following discrete optimal control problem: minimize

$$J_d = \chi(q^{(N)}, \dot q^{(N)}) + \sum_{i=0}^{N-1} h\, L\!\left( \frac{q^{(i)} + q^{(i+1)}}{2}, \frac{q^{(i+1)} - q^{(i)}}{h}, u^{(i)} \right) \quad (12a)$$

subject to

$$D_2 L(q^{(0)}, \dot q^{(0)}) + D_1 L_d(q^{(0)}, q^{(1)}, h) + F_d^{-}(q^{(0)}, q^{(1)}, u^{(0)}, h) + \tfrac{1}{2}\Phi^{T}(q^{(0)})\lambda_d^{(0)} = 0, \quad (12b)$$

$$D_2 L_d(q^{(k-1)}, q^{(k)}, h) + D_1 L_d(q^{(k)}, q^{(k+1)}, h) + F_d^{+}(q^{(k-1)}, q^{(k)}, u^{(k-1)}, h) + F_d^{-}(q^{(k)}, q^{(k+1)}, u^{(k)}, h) + \Phi^{T}(q^{(k)})\lambda_d^{(k)} = 0, \quad (12c)$$

$$-D_2 L(q^{(N)}, \dot q^{(N)}) + D_2 L_d(q^{(N-1)}, q^{(N)}, h) + F_d^{+}(q^{(N-1)}, q^{(N)}, u^{(N-1)}, h) + \tfrac{1}{2}\Phi^{T}(q^{(N)})\lambda_d^{(N)} = 0, \quad (12d)$$

$$\phi(q^{(k)}) = 0, \quad (12e)$$

$$g\!\left( \frac{q^{(k)} + q^{(k+1)}}{2}, \frac{q^{(k+1)} - q^{(k)}}{h}, u^{(k)} \right) \ge 0, \quad (12f)$$

$$\psi_0(q^{(0)}, \dot q^{(0)}) = 0, \quad (12g)$$
$$\psi_f(q^{(N)}, \dot q^{(N)}) = 0, \quad (12h)$$
$$h = \frac{t_f - t_0}{N}, \quad (12i)$$
$$h \ge 0, \quad (12j)$$

where $\dot q^{(0)}, \dot q^{(N)}$ are the initial and terminal velocities. The continuous optimal control problem (11) has now been transcribed into a nonlinear programming (NLP) problem of the form: find the vector $x$ minimizing the scalar objective function

$$f(x) \quad (13a)$$

such that the constraints

$$c_l \le c(x) \le c_u \quad (13b)$$

and simple bounds

$$x_l \le x \le x_u \quad (13c)$$

are fulfilled. An optimization problem of this form can be solved using nonlinear programming. Here we use the interior point solver IPOPT [15].
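The following sketch shows how a transcription of this kind can be handed to IPOPT. It uses CasADi's nlpsol interface as one convenient route (the paper does not say which modeling layer was used), and a trivial one-dimensional placeholder problem stands in for the manikin NLP (12): simple integrator "dynamics" replace the discrete Euler–Lagrange equations (12b)–(12d).

```python
import casadi as ca

N, h = 20, 0.05                          # number of steps and time step (example)
q = ca.MX.sym("q", N + 1)                # positions (placeholder: 1 DOF)
u = ca.MX.sym("u", N)                    # controls

# Midpoint-style objective as in (12a) with L = u^T u, and toy dynamics:
# q_{k+1} = q_k + h*u_k stands in for the discrete Euler-Lagrange constraints.
J = h * ca.sumsqr(u)
g = [q[k + 1] - q[k] - h * u[k] for k in range(N)]
g += [q[0] - 0.0, q[N] - 1.0]            # boundary conditions, cf. (12g)-(12h)

nlp = {"x": ca.vertcat(q, u), "f": J, "g": ca.vertcat(*g)}
solver = ca.nlpsol("solver", "ipopt", nlp)    # IPOPT, as in [15]
sol = solver(x0=0, lbg=0, ubg=0)              # equality constraints: 0 <= g <= 0
print(sol["f"])
```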
[Fig. 1: Automatic tunnel bracket assembly. Panels: (a) Start, (b) Enter, (c) Finishing, (d) End.]

3. Quasi-static case studies
3.1. Tunnel bracket assembly
The first case is to install a tunnel bracket with the help of an auxiliary tool. The tunnel bracket and the auxiliary tool are connected by a rotation joint. The case is provided by Volvo Cars. The manikin starts outside the car with the tool and tunnel bracket already connected. The manikin grasps the tool with the left hand on a bar, where the direction of the grasp is free, and the right hand is connected with the fingertips to the tunnel bracket. After the setup, the assembly is completely automatic and guarantees that the motion is collision-free, except for the grasping hands, and that the manikin is in balance for each time step. The motion can be seen in Figure 1. The simulation takes 5.1 seconds to compute on an Intel i7 2600 computer. The forces required to move the tunnel bracket are quite low, and the precision required for the final assembly step, and thereby the slow motion, makes this a good case for the quasi-static solver.

3.2. Washer placing
The second case is to place washers inside the trunk of a car; this case is also provided by Volvo Cars. The case can be divided into two steps: first place the washers, and then mount the bolts. Since both steps require the same reachability and force, we choose to simulate only the washer placing. The manikin uses the left hand as support on the trunk floor to extend the reach, and the hand is free to rotate on that surface. The case is tried with 8 different manikins to cover the anthropometric variables length and weight, and also both sexes.

[Fig. 2: Placing of multiple washers for the 90 percentile male manikin. Panels: (a) Start, (b) Second washer, (c) Third washer, (d) End.]
[Fig. 3: Top: the reachability for the shortest manikin is insufficient. Bottom: the end is in reach for the manikin.]
[Fig. 4: Weight positions: (a) start, (b) finish.]

In Figure 2, we see the 90 percentile male manikin perform the placing without any reachability issues. In Figure 3, we see the difference between the 5 percentile female and the 50 percentile male, where the former cannot reach all the way. This simulation takes an average of 14.5 seconds for all eight manikins on an Intel i7 2600 computer. The washers only weigh a few grams each, and the precision with which they need to be assembled, and thereby the slow motion, makes this a good case for the quasi-static solver.

4. Dynamic case study
Here we compute trajectories for the manikin using the optimal control approach described in Section 2.4. We then compute quasi-static solutions along the optimal paths, and compare the results. To make the problem more computationally attractive, we reduce the manikin model to a mechanical model of 40 degrees of freedom. This is done by removing joints, primarily in the spine and hands.

The example we study is a lifting operation using both hands, moving a weight from one predefined position to another, starting and ending at rest. We chose the height of the initial position of the weight to be 0.5 m above the ground plane and place the finish position at 1.8 m, while orientation and horizontal positions are identical; the positions can be seen in Figure 4. The weight is modeled as a rigid body, adding another six degrees of freedom to the system. To model contact, rigid constraints are added between the weight and the two hands, and also between the feet and ground. The reaction forces from the ground are, however, only allowed to push on the manikin, and must also fulfill Coulomb friction conditions. The resulting discrete optimal control problem has the structure of (12) with

$$\chi(q, \dot q) = 0, \quad L(q, \dot q, u) = u^T u, \quad \psi_0(q, \dot q) = \dot q, \quad \psi_f(q, \dot q) = \dot q,$$

where the control signal, $u$, is chosen to be the normalized actuator torque. The problem is then solved for both a 10 kg and a 20 kg weight using techniques from [16]. This results in two optimal trajectories for the system: the trajectory for the 10 kg weight with a duration of 0.92 s, and the trajectory for the 20 kg weight, which has a duration of 1.05 s.

The quasi-static control signal, $\{u_s^{(i)}\}_{i=1}^{N}$, is then computed as the steady state solution with minimum norm along the discrete trajectory, $\{q^{(i)}\}_{i=1}^{N}$, i.e. for each $i = 1, \ldots, N$: minimize $(u_s^{(i)})^T u_s^{(i)}$ subject to

$$\frac{\partial L}{\partial q}(q^{(i)}, 0) + F(q^{(i)}, 0, u_s^{(i)}) + \Phi^{T}(q^{(i)})\lambda_s^{(i)} = 0,$$

where $u_s^{(i)}$ and $\lambda_s^{(i)}$ are decision variables.

[Fig. 5: Control effort in weight lifting example: (a) 10 kg, (b) 20 kg. Both panels plot ‖u‖ against t [s] for the dynamic and quasi-static solutions.]

In Figure 5, we compare the control signal magnitudes for the dynamic and quasi-static solutions. As expected, the dynamic solutions, on average, require more control effort than the quasi-static solutions, in particular in the beginning of the lift, where a considerable effort is needed to accelerate both the weight and the manikin itself. It is interesting to note that at the end of the lift the dynamic solutions actually require less torque. This is explained by the fact that the direction of the lift is upward; hence the gravitational pull helps the deceleration.
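If the steady-state equation above is linear in $(u_s, \lambda_s)$ at a fixed pose, which holds, for example, when the applied force enters as $F = B(q)u$, the quasi-static solution can be computed per time step with a least-squares solve. The sketch below uses made-up matrices; for simplicity it minimizes the norm of the stacked vector $[u_s; \lambda_s]$, whereas the paper minimizes the norm of $u_s$ only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_q, n_u, m = 6, 5, 3                   # example dimensions (underdetermined system)

# Linearized steady-state balance at pose q_i:  B u + Phi^T lam = -dL/dq.
B = rng.standard_normal((n_q, n_u))     # made-up force input matrix
PhiT = rng.standard_normal((n_q, m))    # made-up constraint Jacobian transpose
dLdq = rng.standard_normal(n_q)

A = np.hstack([B, PhiT])                # unknowns stacked as x = [u; lam]
x, *_ = np.linalg.lstsq(A, -dLdq, rcond=None)   # minimum-norm solution of A x = -dL/dq
u_s, lam_s = x[:n_u], x[n_u:]
print(np.linalg.norm(u_s))
```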
5. Conclusions
In this paper we showed the importance of modeling dynamics when planning for manual assembly operations. Two case studies were performed on industrial cases, giving examples of where the quasi-static solution is sufficient. To demonstrate the dynamic effects, a third test case was studied, which indicates the importance of modeling dynamics in lifting operations. There is still work to be done before the dynamic solver reaches the maturity of the quasi-static solver. In particular, the solver needs to be equipped with collision avoidance and a comfort function.

Acknowledgements
This work was carried out within the Wingquist Laboratory VINN Excellence Centre, supported by the Swedish Governmental Agency for Innovation Systems (VINNOVA). It is also part of the Sustainable Production Initiative and the Production Area of Advance at Chalmers University of Technology.

References
[1] A.-C. Falck, R. Örtengren, D. Högberg, The impact of poor assembly ergonomics on product quality: A cost–benefit analysis in car manufacturing, Human Factors and Ergonomics in Manufacturing & Service Industries 20 (1) (2010) 24–41.
[2] A.-C. Falck, M. Rosenqvist, A model for calculation of the costs of poor assembly ergonomics (part 1), International Journal of Industrial Ergonomics 44 (1) (2014) 140–147.
[3] D. Lämkull, L. Hanson, R. Örtengren, Uniformity in manikin posturing: A comparison between posture prediction and manual joint manipulation, International Journal of Human Factors Modelling and Simulation 1 (2008) 225–243.
[4] R. Bohlin, N. Delfs, L. Hanson, D. Högberg, J. Carlson, Unified solution of manikin physics and positioning – exterior root by introduction of extra parameters, Proceedings of DHM, First International Symposium on Digital Human Modeling.
[5] N. Delfs, R. Bohlin, S. Gustafsson, P. Mårdberg, J. S. Carlson, Automatic creation of manikin motions affected by cable forces, Procedia CIRP 23 (2014) 35–40.
[6] D. Högberg, L. Hanson, R. Bohlin, J. Carlson, Creating and shaping the DHM tool IMMA for user-centred product and production design, International Journal of the Digital Human (IJDH).
[7] J.-C. Latombe, Robot Motion Planning, Vol. 124, Springer Science & Business Media, 2012.
[8] S. M. LaValle, Planning Algorithms, Cambridge University Press, 2006.
[9] R. Westgaard, A. Aarås, The effect of improved workplace design on the development of work-related musculo-skeletal illnesses, Applied Ergonomics 16 (2) (1985) 91–97.
[10] J. Rasmussen, M. Damsgaard, E. Surma, S. T. Christensen, M. de Zee, V. Vondrak, AnyBody – a software system for ergonomic optimization.
[11] J. E. Marsden, M. West, Discrete mechanics and variational integrators, Acta Numerica 10 (2001) 357–514.
[12] S. Leyendecker, J. Marsden, M. Ortiz, Variational integrators for constrained dynamical systems, ZAMM 88 (9) (2008) 677–708.
[13] S. Leyendecker, S. Ober-Blöbaum, J. E. Marsden, M. Ortiz, Discrete mechanics and optimal control for constrained systems, Optimal Control Applications and Methods 31 (6) (2010) 505–528.
[14] A. Lew, J. E. Marsden, M. Ortiz, M. West, An overview of variational integrators.
[15] A. Wächter, L. T. Biegler, On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming, Mathematical Programming 106 (1) (2006) 25–57.
[16] S. Björkenstam, J. S. Carlson, B. Lennartson, Exploiting sparsity in the discrete mechanics and optimal control method with application to human motion planning, in: Automation Science and Engineering (CASE), 2015 IEEE International Conference on, IEEE, 2015, pp. 769–774.
work_3ge2uucuafeonlhyw5hgv4p6ra ----

Study on the Digitalization of Festival Culture in Taiwan's Aboriginal Literature
Cheng-Hui Tsai 1,a, Chuan-Po Wang 2,b
1 Center for General Education, National Taichung University of Science and Technology, Taiwan (R.O.C.), TEL: +886-935-351201
2 Department of Industrial Design, Chaoyang University of Technology, Taiwan (R.O.C.), TEL: +886-926-776317
a chenghuitsai@nutc.edu.tw, b chuanpowang@gmail.com

Abstract
In this project, the term "teaching practice" is intended to focus on creative teaching and innovative research to promote multi-intelligence digital humanities and cultivate knowledge of aboriginal culture through field investigation and humane care. Therefore, the curriculum of Aboriginal Literature is based on: (1) an awareness of local and tribal culture and care; (2) an innovative teaching model (from a cognitive model to a cognitive skills model); (3) an emotional model (care of ethnic humanities); (4) a digital model (digital humanities and archives, learner-based learning, flipped classrooms and problem-oriented learning (PBL)). The aim of the curriculum is to guide students to reflect on contemporary multicultural values, learn about holistic education and focus on people's core concerns. The rituals that are part of Taiwan's Atayal and Thao cultures are integrated into the innovative education of aboriginal literature, and students are led to participate in field investigations of the ceremonies to complete a digital cultural documentary of the Atayal Thanksgiving ceremony, reaching the innovative teaching goal of digital humanities education.

Keywords: Aboriginal literature, Aboriginal culture, Digital humanities, Field study, Action research

Introduction
This study focuses on the digitalization of the ceremonial culture in Taiwan's aboriginal literature. The author has conducted action research as part of the field study of Atayal ritual culture-related visits. The research specifically incorporated the Atayal ritual culture into the aboriginal literature curriculum. The teaching goal of this research is to cultivate and advance the digital humanities skills of the students at two levels: 1) the level of "literature knowledge", which guides the students' study of aboriginal texts, invites aboriginal scholars and experts to give special lectures, and brings students into contact with the local tribes during field surveys; 2) the level of "digital innovation", which is supplemented by a field survey of the traditional rituals of the indigenous peoples, with the students invited to participate in the Atayal Thanksgiving Festival (Ryax Smqas Hnuway Utux Kayal) and to establish digital archives on aboriginal ceremonies. This included documentary filming, interviews with tribal elders, and exhibitions at the Aboriginal Cultural and Creative Documentary Film Festival. It is expected that a number of humanistic collections will be produced, such as the Atayal Thanksgiving Festival documentary, an original ethnic documentary interview, lectures by experts, the teachers' lectures, etc. These will enhance the students' multiple learning and lead to specific desired results. By introducing the concepts of innovation, creativity and originality, we have established a new teaching model for aboriginal literature. The terms "original teaching and research", "creative teaching" and "originality in research" are the key indicators of the project's teaching purposes.
The Diversified Festival Culture of Taiwan's Aboriginal Literature
Taiwan's population comprises various cultural and ethnic groups, including the Han people and those of Austronesian descent. Taiwan's aborigines belong to the Austronesian group and include the Pingpu tribes. Those from the Nandao language group, who comprise less than two percent of Taiwan's total population, are located in an area of more than 16,000 square kilometers, forty-five percent of the whole of Taiwan. Due to Taiwan's diverse natural environment, the aborigines developed different ways of life, such as farming, hunting, fishing and food collecting, depending on the ecology of their area. Different tribal types also developed. Therefore, aboriginal culture reflects a dialogue between the ethnic groups and the natural environment, and embraces the rich spiritual meanings of Taiwan's aboriginal culture.

The sacrificial rituals of the Atayal people are intrinsically connected with their creation narrative: when their ancestors, Mabuta and Mayan, went up the mountain, one of them was killed by a snake. It was believed that this tragedy occurred because no sacrifice had been offered, and to rectify this, a pig was slaughtered. Thus, sacrifice became part of the beliefs and customs. [1]

Traditional rituals are an extremely important part of Taiwan's aboriginal culture, with each group having its own idiosyncratic practices. The rituals of the various ethnic groups also have many different spiritual meanings. To understand the aboriginal culture in depth, we must first understand the cultural significance of the rituals of all the ethnic groups. (Table 1) The traditional rituals of the aborigines are often held on a mountain or at a river, with the sky and the earth as the stage and the night as the backdrop. Worship of the gods and respect for the ancestors are important parts of the tribal ethics and social life. Among the Atayal, for example, the practice of various traditional ceremonies is actually a declaration of belief in ancestral spirits. The rituals themselves mostly relate to the group's livelihood: agriculture, hunting and headhunting. Therefore, there are pioneering offerings, sowing festivals, weeding offerings, harvest festivals, collection offerings, picaning sapa, headhunting offerings, and ancestral offerings.

Words per sentence: in-service 33.28 vs. pre-service 11.11 (***p < 0.001, **p < 0.05). From the results we can see that pre-service teachers use more periods, colons and brackets than in-service teachers, while in-service teachers use more commas and exclamation marks. In-service teachers also write longer sentences than pre-service teachers, i.e. they have more words per sentence. According to [19], words per sentence is an important indicator of linguistic simplicity. In-service teachers have a much higher words-per-sentence count (almost twice that of pre-service teachers) and a lower percentage of periods than pre-service teachers. This phenomenon indicates that in-service teachers use more complex linguistic descriptions in their reflections than pre-service teachers. Pennebaker et al. indicate that linguistic complexity may correlate with cognitive load [20]: the higher the linguistic complexity, the more cognitive processing is involved.
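As an illustration of the kind of feature extraction behind these comparisons, the sketch below computes words per sentence and simple punctuation frequencies for a reflection text. It is a generic approximation for illustration only, not the LIWC/TextMind pipeline the study actually used.

```python
import re

def linguistic_features(text):
    # Split into sentences on terminal punctuation (rough heuristic).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[\w']+", text)
    n_chars = max(len(text), 1)
    return {
        "words_per_sentence": len(words) / max(len(sentences), 1),
        "period_pct": 100 * text.count(".") / n_chars,
        "comma_pct": 100 * text.count(",") / n_chars,
        "exclaim_pct": 100 * text.count("!") / n_chars,
        "colon_pct": 100 * text.count(":") / n_chars,
    }

reflection = "The students were engaged today! I adjusted the pace, and it helped."
print(linguistic_features(reflection))
```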
Conclusion
In this paper, we collected reflection texts from two online learning communities: one a teacher-training community for in-service teachers, the other an online course for pre-service teachers. Through linguistic analysis we can see that there are significant differences in linguistic features between in-service and pre-service teachers. In summary, compared to pre-service teachers, in-service teachers tend to use more third-person plural pronouns, more family words, more affect and positive emotion words, more causality words, perception and experience words, and achievement and leisure words, and they write more words per sentence. From these differences, we can infer that in-service teachers pay more attention to students' feelings and to the creation of classroom climate. They deliver more positive emotions and involve more cognitive processing in their reflections. Pre-service teachers focus more on understanding the teaching content. Due to sparse teaching experience, they produce fewer descriptions of students than in-service teachers; trainers should therefore give them more opportunities to engage with teaching practice. In the next phase, we will collect more data from different subjects to evaluate our conclusions.

Acknowledgement
This research is supported by Chinese National Natural Science Foundation Project "Research on Deep Aggregation and Personalized Service Mechanism of Web Learning Resources based on Semantic" (No. 71704062), Hubei Province Technology Innovation special projects "Key technologies and demonstration applications of Internet + Precision Education" (No. 2017ACA105), and self-determined research funds of CCNU from the colleges' basic research and operation of MOE (No. CCNU18QN022). We also thank the Computational Cyber-Psychology Lab for the software TextMind.

References
[1] M. Bayrakci, "In-service teacher training in Japan and Turkey: A comparative analysis of institutions and practices," Australian Journal of Teacher Education, vol. 34, no. 1, pp. 10-22, 2009.
[2] Y. N. Tan, and Y. H. Tan, "Blended Learning for In-service Teachers' Professional Development: A Preliminary Look at Perspectives of Two Singapore Chinese Language Teachers."
[3] A. Bouguen, "Adjusting content to individual student needs: Further evidence from an in-service teacher training program," Economics of Education Review, vol. 50, pp. 90-112, 2016.
[4] S. Zhang, Q. Liu, W. Chen, Q. Wang, and Z. Huang, "Interactive networks and social knowledge construction behavioral patterns in primary school teachers' online collaborative learning activities," Computers & Education, vol. 104, pp. 1-17, 2017.
[5] L. Bayram, "Use of Online Video Cases in Teacher Training," Procedia - Social and Behavioral Sciences, vol. 47, pp. 1007-1011, 2012.
[6] B. N. Nicolescu, T. Macarie, and T. Petrescu, "Some Considerations on the Online Training Programs for the Teachers from the Romanian Pre-university Educational System," Procedia - Social and Behavioral Sciences, vol. 180, pp. 878-884, 2015.
[7] W. Westera, M. Dascalu, H. Kurvers, S. Ruseti, and S. Trausan-Matu, "Automated essay scoring in applied games: Reducing the teacher bandwidth problem in online training," Computers & Education, vol. 123, pp. 212-224, 2018.
[8] K. Knight, D. Sperlinger, and M. Maltby, "Exploring the personal and professional impact of reflective practice groups: a survey of 18 cohorts from a UK clinical psychology training course," Clinical Psychology & Psychotherapy, vol. 17, no. 5, pp. 427-437, 2010.
[9] G. Wilson, "Evidencing Reflective Practice in Social Work Education: Theoretical Uncertainties and Practical Challenges," British Journal of Social Work, vol. 43, no. 1, pp. 154-172, 2013.
[10] J. Bennett-Levy, and C. A. Padesky, "Use It or Lose It: Post-workshop Reflection Enhances Learning and Utilization of CBT Skills," Cognitive and Behavioral Practice, vol. 21, no. 1, pp. 12-19, 2014.
[11] K. Kori, M. Mäeots, and M. Pedaste, "Guided Reflection to Support Quality of Reflection and Inquiry in Web-based Learning," Procedia - Social and Behavioral Sciences, vol. 112, pp. 242-251, 2014.
[12] Y.-T. Lin, M.-L. Wen, M. Jou, and D.-W. Wu, "A cloud-based learning environment for developing student reflection abilities," Computers in Human Behavior, vol. 32, pp. 244-252, 2014.
[13] N. Yurtseven, and S. Altun, "The Role of Self-Reflection and Peer Review in Curriculum-focused Professional Development for Teachers," Hacettepe University Journal of Education, vol. 33, no. 1, pp. 207-228, 2018.
[14] K. F. Hew, C. Qiao, and Y. Tang, "Understanding Student Engagement in Large-Scale Open Online Courses: A Machine Learning Facilitated Analysis of Student's Reflections in 18 Highly Rated MOOCs," International Review of Research in Open and Distributed Learning, vol. 19, no. 3, pp. 69-93, 2018.
[15] J. Luttenberg, H. Oolbekkink-Marchand, and P. Meijer, "Exploring scientific, artistic, moral and technical reflection in teacher action research," Educational Action Research, vol. 26, no. 1, pp. 75-90, 2018.
[16] P. A. Salus, Elements of General Linguistics, Faber and Faber, 2005.
[17] R. L. Robinson, R. Navea, and W. Ickes, "Predicting final course performance from students' written self-introductions: A LIWC analysis," Journal of Language & Social Psychology, vol. 32, no. 4, pp. 469-479, 2015.
[18] Q. He, C. A. W. Glas, M. Kosinski, D. J. Stillwell, and B. P. Veldkamp, "Predicting self-monitoring skills using textual posts on Facebook," Computers in Human Behavior, vol. 33, pp. 69-78, 2014.
[19] R. L. Robinson, R. Navea, and W. Ickes, "Predicting final course performance from students' written self-introductions: A LIWC analysis," Journal of Language and Social Psychology, vol. 32, no. 4, pp. 469-479, 2013.
[20] Y. R. Tausczik, and J. W. Pennebaker, "The psychological meaning of words: LIWC and computerized text analysis methods," Journal of Language and Social Psychology, vol. 29, no. 1, pp. 24-54, 2010.

The Atayal Festival Culture in Taiwan's Aboriginal Literature
Among the Atayal traditions and customs are the unique patterns on their faces, their music played on a mouth harp and their hip-hop form of dancing. The Atayal social organization conforms to ancestral rituals, the most important of which is the Thanksgiving ritual (Figures 1, 2, 3), which is held on August 31st at four o'clock in the morning. (Table 3)
Fig. 1. Atayal festival and ritual activities.

Table 2. The chronology of the aboriginal age ceremonies (refer to the original national information resources network, http://www.tipp.org.tw/tribecalendar.asp; https://eng.taiwan.net.tw/m1.aspx?sNo=0002023) [4]
Month 1: Saaroa (Miatungusu)
Month 2: Puyuma (Union Amiyan); Tao (mivanwa、mivanwa); Tsou (Mayasvi)
Month 3: Tao (mivanwa、mivanwa)
Month 4: Bunun (Malahtangia); SaiSiyat (pitaza、'Oemowazka kawas); Thao (Mulalu pisaza)
Month 5: Bunun (Malahtangia); Amis (Fishing festival); SaiSiyat (pas-taai); Tao (mivanwa)
Month 6: Amis (Fishing festival); SaiSiyat (pas-taai); Tao (mivanwa、Mapasamorang so piyavean)
Month 7: Puyuma (Misacpo'); Kavalan (Laligi); Amis (Malalikit); Paiwan (Masalut); Kebalan (Qataban)
Month 8: Amis (Malalikit); Atayal (maho); Thao (mulalu tuza); Tsou (Homeyaya); Rukai (Kalabecengane); Paiwan (Harvest festival)
Month 9: Amis (Malalikit); Thao (Lus'an); Paiwan (Harvest festival)
Month 10: Paiwan (Five-years Ceremony); Taroko (Mgay Bari); Kanakanavu (Mikong)
Month 11: Rukai (Tabesengane); SaiSiyat (Pasta'ai、pas-taai); Atayal (maho)
Month 12: Puyuma (mangayangayaw、mangayaw、gilabus); SaiSiyat (pas-taai)

Table 3. Atayal Smyus Festival
Participants: Males only. According to tradition, women may not participate in ancestral festivals.
Location: The various tribes host the festival in turn.
Time: 4 a.m., every August 31.
Festival refreshments: Wine, millet cakes, crops, fruit, fish, etc. Bacon may not be eaten at the festival.
Funeral oration: The main singer chants sacred words: "Ancestral ancestors! Ancestors! Today the people who worship you are in a serious mood. May the ancestors greet Jiana, and we sincerely invite all the ancestors to gather and share the gifts. Festival……" "Ancestral ancestors, we bring crops that have been cultivated this year. Every member of the family has acted according to the ancestral teachings (gaga) and has worked hard. We are your people, and we look forward to your blessing next year. Now, we are joyfully celebrating."
Spirit: The ancestral spirits are thanked for their gifts and the speaker reports to the ancestors on the tribe's life during the past year. The people promise to abide by the traditional culture of the ancestral teachings and gaga, and ask the ancestors to give the tribe health and happiness.
Taboos: 1. The ceremony must be completed before dawn. The tribe believes that the ancestors will come and participate in the festival at dawn. 2. Women are not allowed to participate in the ceremony.

The rituals also embody the enormous spiritual symbolism that is part of the aboriginal culture. "The aboriginal people believe that all things are spiritual, and the sorcerer is usually responsible for communicating with the gods." The people believe that the ancestors have a direct influence on their lives, good or bad. Indigenous people believe that the ancestral spirits live in the mountains and protect the crops for the tribes, so they are most revered by the aborigines. This shows the distinctiveness of the ethnic spirit of the diverse aboriginal cultures.

As mentioned earlier, each aboriginal ethnic group has its own traditional rituals. Among the many and diverse rituals are the Ancestral Spirits Festival of the Atayal and the Truku, the New Year's Festival of the Thao, the biennial Dwarf Festival (Pasta'ai) of the Saisiyat, and the Shearing Festival (Malahtangia) of the Bunun. Furthermore, there are the Tsou tribe's War Festival (Mayasvi), the Saaroa's Bei Shen Festival (Miatungusu), the Rukai's Millet Harvest Festival (Tsatsapipianu), the Amis' Sea Festival (Misacpo) and the Harvest Festival (Malalikit). The Puyuma have their Monkey Festival (Mangayangayaw) and Big Hunting Festival (Mangayaw) every five years.
Each year, there is the "Year of the Harvest Festival": the "Autumn Festival" and "Sea Festival" of the Amis. The Dawu people celebrate the Flying Fish Festival (Mivanwa) and the New Boat Festival (Mapabosbos). The Night Festival of the Pingpu tribes has gradually been revived. In addition, the important rituals of the Puyuma include the "Sea Festival", the "Monkey Festival" for men, and the "Hay Harvest Festival" for women. The La Aruwa believe that the ancestral spirits are attached to the collection of Bezhu, so there is a "Bei Shen Festival" (Miatungusu). The Tsou people have the "War Festival" (Mayasvi) and the "Harvest Festival". The diverse traditional rituals contribute in no small measure to the richness and distinctiveness of the aboriginal culture and the strong ethnic identity of the various population groups in Taiwan. A summary is given of the rituals that have persisted through the ages, as well as the extent to which the mountains, the sea and the natural environment have contributed to the cultural wisdom of the aboriginal people.

The lives of the Atayal people are controlled by the "gaga" (the ancestral teachings). These include all rites of passage such as birth, naming, marriage, death and the rituals associated with day-to-day activities, such as hunting, weaving, tattooing, and childbearing. Other rituals are associated with social norms, such as tribal farming, revenge and inheritance of rights. [3]

The Taiwan Atayal people live in the central and northern mountainous areas of Central Taiwan, from Puli to the north of Hualien County, with a population of approximately 89,958 (statistical data from March 2007). They live mainly by hunting and growing crops on burned-out mountain fields. The people are also very well known for their weaving skills. The woven fabric with its complex patterns has exquisite colors, the most predominant of which is red. This color, associated with blood, is deemed to ward off evil. In the Atayal creation mythology, the original ancestors were a brother and sister who lived for a very long time between heaven and earth. However, after the flood, the brother and sister were troubled and unable to have children. The sister decided to paint her face to disguise who she was from her brother, after which they had children and ensured the continuation of the tribe. However, now there are strong taboos against cognation marriage. (Table 1)

Table 1. 16 ethnic groups of indigenous peoples (refer to Wikipedia) [2]
Pangcah (Amis): 210,501; one of the nine ethnic groups officially recognized by the Ethnology Research Office of National Taiwan University in 1948.
Payuan (Paiwan): 101,234
Tayal (Atayal): 90,631
Bunun: 58,711
Pinuyumayan (Puyuma): 14,279
Drekay (Rukai): 13,368
Cou (Tsou): 6,653
SaiSiyat: 6,644
Tao: 4,620; one of the nine ethnic groups officially recognized by the Ethnology Research Office of National Taiwan University in 1948. Formerly known as the Yami, the name of the group has now been changed to Tao.
Thao: 792; originally classified as Tsou, the group was included on August 8, 2001.
Kebalan (Kavalan): 1,477; originally classified as Ami, the group was included on December 25, 2002.
Truku (Taroko): 31,689; originally classified as Atayal, the group was included on January 14, 2004.
Sakizaya: 947; originally classified as Ami, the group was included on January 17, 2007.
Seediq: 10,115; originally classified as Atayal, the group was included on April 23, 2008.
Hla'alua (Saaroa): 403; located in the Taoyuan and Namasa Districts of Kaohsiung City; originally classified as Southern Tsou, and legally recognized on June 26, 2014.
Kanakanavu: 340; living in the area of Namasa District, Kaohsiung City; originally classified as Southern Tsou, and legally recognized on June 26, 2014.
"The aboriginal people believe that all things are spiritual, and the sorcerer is usually responsible for communicating with the gods.” The people believe that the ancestors have a direct influence on their lives, good or bad. Indigenous people believe that the ancestral spirits live in the mountains, and protect the crops for the tribes, so they are most revered by the aborigines. This shows the distinctiveness of the ethnic spirit of the diverse aboriginal cultures. As mentioned earlier, each aboriginal ethnic group has its own traditional rituals. Among the many and diverse rituals are the Ancestral Spirits of the Atayal and the Truku, the New Year's Festival of the Thao, the biennial Dwarf Festival (Pasta'ai) of the Saisiyat, and the Shearing Festival (Malahtangia) of the Bunun. Furthermore, there are the Tsai tribe's War Festival (Mayasvi), the Shahru’s Bei Shen Festival (Miatungusu), the Rukai's Millet Harvest Festival (Tsatsapipianu), the Ami's Sea Festival (Misacpo) and the Harvest Festival (Malalikit). The Puyuma have their Monkey Festival (Mangayangayaw) and Big Hunting Festival (Mangayaw) every five years. Each year, there is the "Year of the Harvest Festival"; the "Autumn Festival" and "Sea Festival" of the Amis. The Dawu people celebrate the Flying Fish Festival (Mivanwa) and the New Boat Festival (Mapabosbos). The Night Festival of the Pingpu tribes has gradually been revived. In addition, the important rituals of the Puyuma include "Sea Festival", "Monkey Festival" for men, and "Hay Harvest Festival” for women. The La Aruwa believe that the ancestral spirits are attached to the collection of Bezhu, so there is a "Bei Shen Festival" (Miatungusu). The Zou people have "War Festival" (Mayasvi) and "Harvest Festival". The diverse traditional rituals contribute in no small measure to the richness and distinctiveness of the aboriginal culture and the strong ethnic identity of the various population groups in Taiwan. A summary is given of the rituals that have persisted through the ages, as well as the extent to which the mountains, the sea and the natural environment have contributed to the cultural wisdom of the aboriginal people. The lives of the Atayal people are controlled by the “gaga” (the ancestral teachings). These include all rites of passage such as birth, naming, marriage, death and the rituals associated with day-to-day activities, such as hunting, weaving, tattooing, and childbearing. Other rituals are associated with social norms, such as tribal farming, revenge and inheritance of rights. [3] The Taiwan Atayal people live in the central and northern mountainous areas of Central Taiwan, from Puli to the north of Hualien County with a population of approximately 89,958 (statistical data from March 2007). They live mainly by hunting and growing crops on burned-out mountain fields. The people are also very well known for their weaving skills. The woven fabric with its complex patterns has exquisite colors, the most predominant of which is red. This color, associated with blood, is deemed to ward off evil. In the Ayatal creation mythology, the original ancestors were a brother and sister who lived for a very long time between heaven and earth. However, after the flood, the brother and sister were troubled and unable to have children. The sister decided to paint her face to disguise who she was from her brother after which they had children and ensured the continuation of the tribe. However, now there are strong taboos against cognation marriage. (Table 2) * Table 2. 
The chronology of the aboriginal age ceremonies (refer to the original national information Ethnic group Popula tion Note Pangcah (Amis) 210,50 1 One of the nine ethnic groups officially recognized by the Ethnology Research Office of National Taiwan University in 1948. Payuan ( Paiwan) 101,23 4 Tayal (At ayal) 90,631 Bunun 58,711 Pinuyum ayan (Pu yuma) 14,279 Drekay ( Rukai) 13,368 Cou (Tso u) 6,653 SaiSiyat 6,644 Tao 4,620 One of the nine ethnic groups officially recognized by the Ethnology Research Office of National Taiwan University in 1948. Formerly known as the Yami, the name of the group has now been changed to Tao. Thao 792 Originally classified as Tsou, the group was included on August 8, 2001. Kebalan (Kavalan ) 1,477 Originally classified as Ami, the group was included on December 25, 2002. Truku (T aroko) 31,689 Originally classified as Atayal, the group was included on January 14, 2004. Sakizaya 947 Originally classified as Ami, the group was included on January 17, 2007. Seediq 10,115 Originally classified as Atayal, the group was included on April 23, 2008. Hla'alua ( Saaroa) 403 Located in the Taoyuan and Namasa Districts of Kaohsiung City and originally classified as Southern Tsou, and legally recognized on June 26, 2014. Kanakan avu 340 Living in the area of Namasa District, Kaohsiung City. Originally classified as Southern Tsou, and legally recognized on June 26, 2014. 386 Educational Innovations and Applications- Tijus, Meen, Chang ISBN: 978-981-14-2064-1 A Study on Constructing Historical and Cultural Textbooks for Hualien Sugar Factory, Taiwan -Based on Local Stories Hsin-Yu Chen*, Sung-Chin Chung**, Shyh-Huei Hwang***, Chia-Mei Liang**** National Yunlin University of Science and Technology, Graduate School of Design, Doctoral Program, Student No.42, Wen’an St., Douliu City, Yunlin County 640, Taiwan (R.O.C.) Douliu City, Yunlin County, Taiwan 886-983228204, albeehsinyu@gmail.com Abstract The purpose of this study is employing literature review, in-depth interviews, and the KJ method to uncover early stories of Hualien Sugar Factory, Taiwan through interviews with elders, categorizing the stories and analyzing their distinctiveness, and adapting and constructing them as historical and cultural textbooks for guided tours. For the result of this study, early local stories can be grouped into seven categories - stories from sugarcane fields, memories of life on sugar factory premises, memories at Dajin Elementary School, life outside the factory, life before and after the war, accidents and death of family, and employees of different identities. Keywords: Hualien Sugar Factory, Guangfu Sugar Factory, Historical and Cultural Stories of Hualien Sugar Factory, KJ method Introduction A. Background and Motivation Hualien Sugar Factory was one of the major sugar factories in east Taiwan. After ceasing production in 2002, the facilities transitioned into a tourism factory. Active measures have been made in recent years to generate tourism assets around the facilities (the official website of Hualien Sugar Factory, 2018), along with green landscaping to create a leisurely environment. Geographically, Hualien Sugar Factory is located in the central region of Hualien County. It is a tourism hub of the entire East Rift Valley, with more than 600 thousand visitors every year. The ice shop of the sugar factory and the surrounding shopping streets are the main source of revenue (Liang, 2018). 
However, it is a major current objective for Hualien Sugar Factory to entice visitors at the sugar factory to lengthen their stay beyond enjoying ice cream, connect the sugar industry with local culture, and present stories of the sugar factory, thus highlighting the cultural value of the sugar factory, promoting the rich history of the facilities, and passing on collective memories. Therefore, uncovering stories and values of the old sugar factory and editing them into historical and cultural textbooks for guided tours, so as to achieve sustainability and advancement of the cultural assets of the sugar factory has become an integral part of current efforts at the facilities. Figure 1 shows the location of Hualien Sugar Factory. Fig. 1 Location of Hualien Sugar Factory B. Objectives The following are the main objectives of the present study. 1. Investigating the historical and cultural stories of Hualien Sugar Factory, Taiwan. 2. Categorizing the historical and cultural stories of Hualien Sugar Factory, Taiwan. 3. Providing the historical and cultural stories as the basis data for guided tours textbooks of Hualien Sugar Factory, Taiwan, based on early local stories. Methodology The research methods used in this study were literature review, in-depth interviews, and KJ method. First, literature review and data collection were conducted on the cultural history and current development of Hualien Sugar Factory. Local stories about the sugar factory were extracted from in-depth interviews with elders. Records of the interviews were analyzed and adapted into historical and cultural textbooks about the sugar factory. The field interviews of the present study were conducted from January 30 to 31, February 6, 8 to 9, 15, and March 8 to 9, 2018. Lastly, the KJ method was used to categorize historical and cultural textbooks and discuss their distinctiveness for reference by the sugar factory management regarding guided tours and other related purposes. Table 1 below shows the profile information on the 15 interviewees for the study. TABLE 1 PROFILE INFORMATION ON THE 15 INTERVIEWEES Interviewee Number Location of Residence Year of Birth Date of Interview Background Fig. 2. Atayal Festival expert interview Fig. 3. Atayal Festival expert interview Last month, Lao Taiya who was in his 90s, propped up his body and shouted out: "No children have come to see me for a long time!" The children were in the city, like rogues who had abandoned their hometown. Old Atayal's eyes stretched far and wide, as if it were the light of compassion. Am I a ronin. [5] Conclusion The research methods on the digitalization of ceremonial culture in Taiwan's aboriginal literature comprise eight innovative research threads: in-depth problem awareness; multi-disciplinary consultation; extensive information collection; digital humanities cross-border; steps to implement research methods into teaching and research development of two-track information cross-border; text field adjustment; strategy for improving teaching and research improvement; and multi-product development. These are shown in the figure below: (1) In-depth problem awareness – identify research issues: the development of tribal ceremonies in aboriginal ethnic groups. (2) Multi-disciplinary consultation – discuss preliminary plans with relevant parties: project hosts and co-hosts, scholars and experts, students, tribal elders, interviewees, etc. 
(3) Extensive information collection – refer to relevant literature: search and inductive analysis, and historical review of text narratives and literature on aboriginal culture.
(4) Digital humanities cross-border – hold joint discussions to determine the research methods (observation, interviews, questionnaires, photographs, audio recordings, videos, documentaries, text and literature analysis, etc.), and show the specific results of the "humanization of humanities" process.
(5) Two-track information cross-border – collect text narratives and literature on the culture of aboriginal rituals, and collect digital information in actual textual teaching and tribal ritual fieldwork practice.
(6) Text field adjustment – during actual text teaching, summarize the text narratives and literature of aboriginal ritual culture, and analyze the digital information collected in tribal ritual fieldwork practice.
(7) Teaching and research improvement strategy – present a research report with suggestions for improving the text narratives and literature materials used in actual text teaching, the digital information collected in tribal ritual field research, and the inductive analysis of teaching practice. The report will offer strategies for action research aimed at enhancing teaching practices and results.
(8) Multiple results presentation – finally, share experiences and present the concrete results of teaching practice and action research in a multi-modal model of teaching research results, such as an aboriginal festival culture documentary contest and film festival, teaching achievement exhibitions, research results exchanges, talks, etc.

[Fig. 4: Research Threads – diagram of the eight threads listed above]

References
[1] Tian Zheyi, The Myths and Legends of the Atayal (Taichung: Morning Star Press, 2003), p. 202.
[2] Wikipedia, https://www.wikipedia.org (accessed 30 November 2018).
[3] Walis Nokan and Yu Guanghong, "Introduction", in Taiwan Aboriginal History – Atayal History (Nantou: Taiwan Literature Museum, 2002), p. 7.
[4] Walis Nokan, "Go back to the tribe, old Atayal!", in The Call of the Wilderness (Taichung: Morning Star Press, 15 December 1992), p. 37.
[5] Taiwan indigenous peoples portal, http://www.tipp.org.tw/tribecalendar.asp (accessed 30 November 2018).

work_3hhqhd4a4fbvvj3rqix3qqobdi ---- Digital Humanities 2018

Edinburgh Research Explorer
Legal Deposit Web Archives and the Digital Humanities
Citation for published version: Gooding, P., Terras, M. & Berube, L. 2018, 'Legal Deposit Web Archives and the Digital Humanities: A Universe of Lost Opportunity?', Digital Humanities 2018 annual conference, Mexico City, Mexico, 26/06/18–29/06/18, p. 590.
Document Version: Publisher's PDF, also known as Version of record.
Publication record: https://www.research.ed.ac.uk/portal/en/publications/legal-deposit-web-archives-and-the-digital-humanities(9f462b6d-7157-4241-9a88-279a6bda2e8b).html

Legal Deposit Web Archives and the Digital Humanities: A Universe of Lost Opportunity?
Paul Gooding, p.gooding@uea.ac.uk, University of East Anglia, United Kingdom
Melissa Terras, m.terras@ed.ac.uk, University of Edinburgh, United Kingdom
Linda Berube, l.berube@uea.ac.uk, University of East Anglia, United Kingdom

Introduction
Legal deposit libraries have archived the web for over a decade. Several nations, supported by legal deposit regulations, have introduced comprehensive national domain web crawling, an essential part of the national library remit to collect, preserve and make accessible a nation's intellectual and cultural heritage (Brazier, 2016). Scholars have traditionally been the chief beneficiaries of legal deposit collections: in the case of web archives, the potential for research extends to contemporary materials, and to Digital Humanities text and data mining approaches. To date, however, little work has evaluated whether legal deposit regulations support computational approaches to research using national web archive data (Brügger, 2012; Hockx-Yu, 2014; Black, 2016).

This paper examines the impact of electronic legal deposit (ELD) in the United Kingdom, particularly how the 2013 regulations influence innovative scholarship using the Legal Deposit UK Web Archive. As the first major case study to analyse the implementation of ELD, it will address the following key research questions:
• Is legal deposit, a concept defined and refined for print materials, the most suitable vehicle for supporting DH research using web archives?
• How does the current framing of ELD affect digital innovation in the UK library sector?
• How does the current information ecology, including not-for-profit archives, influence the relationship between DH researchers and legal deposit libraries?

Research Context
The British Library began harvesting the UK web domain under legal deposit in 2013. The UK Web Archive had, by 2017, grown to 500TB. However, UK legal deposit regulations, based on a centuries-old model of reading-room access to deposited materials, limit the archive's significant potential for research: in practice, researchers can only access the full range of UK websites within the walls of selected institutions. DH scholars, though, require access to textual corpora and metadata in addition to interfaces for discovery and reading (Gooding, 2012). Winters argues that "it is the portability of data, its separability from an easy-to-use but necessarily limiting interface, which underpins much of the exciting work in the Digital Humanities" (2017: 246). Restricted deposit library access requires researchers to look elsewhere for portable web data: by undertaking their own web crawls, or by utilising datasets from Common Crawl (http://commoncrawl.org/) and the Internet Archive (https://archive.org).
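To make the contrast with reading-room access concrete, the sketch below shows the kind of portable, programmatic retrieval such third-party archives allow. It is a minimal illustration, assuming the Internet Archive's publicly documented CDX search endpoint; the queried domain is an arbitrary placeholder and the function name is ours, and nothing here reflects how the Legal Deposit UK Web Archive itself is accessed.

```python
# Minimal sketch: list archived captures of a URL via the Internet
# Archive's public CDX search endpoint. Illustrative only; the target
# domain below is a placeholder, not data from this study.
import json
import urllib.parse
import urllib.request

def list_snapshots(url, limit=5):
    """Return (timestamp, original_url) pairs for archived captures of `url`."""
    query = urllib.parse.urlencode({"url": url, "output": "json", "limit": limit})
    endpoint = "http://web.archive.org/cdx/search/cdx?" + query
    with urllib.request.urlopen(endpoint) as resp:
        rows = json.load(resp)
    if not rows:  # no captures found for this URL
        return []
    header, data = rows[0], rows[1:]  # the first row names the columns
    ts, orig = header.index("timestamp"), header.index("original")
    return [(row[ts], row[orig]) for row in data]

if __name__ == "__main__":
    for timestamp, original in list_snapshots("example.org"):
        print(timestamp, original)
```

Common Crawl offers comparable portability at a different scale, distributing whole-crawl archive files on public storage; this is precisely the separability of data from interface that the paper finds missing under UK ELD.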
Both organisations provide vital services to researchers, and both innovate in areas that would traditionally fall under the deposit libraries' purview. They support their mission by exploring the boundaries of copyright, including exceptions for non-commercial text and data mining (Intellectual Property Office, 2014). This contrast between risk-enabled independent organisations and deposit libraries, described by interviewees as risk averse, challenges library/DH collaboration models such as BL Labs (http://labs.bl.uk) and Library of Congress Labs (https://labs.loc.gov).

Methodology
This paper analyses the impact of the UK regulatory environment upon DH reuse of the Legal Deposit UK Web Archive. It presents a quantitative analysis of information-seeking behaviour, supported by insights from 30 interviews with UK legal deposit library practitioners. Quantitative datasets consisted of Google Analytics reports and web logs of UK Web Archive usage, which were analysed in SPSS and Excel. These datasets allowed us to identify broad patterns of information-seeking behaviour.

Practitioner interviews were hand-coded to three levels in NVivo: initial coding, to provide the foundations for higher-level analysis; focused coding, to further refine the data; and axial coding, using the convergence of ideas as a basis for exploring the research questions (Hahn, 2008). This analysis will inform two further research phases: a broader quantitative analysis of UK ELD collections; and qualitative analysis of the ways that the research community, and DH researchers, use ELD collections.

Conclusion
This paper provides a vital case study of how legal deposit regulations can influence library/DH collaboration. It argues that UK ELD regulations use a print-era view of national collections to interpret digital preservation and access. A lack of media specificity, combined with a more cautious approach to text and data mining than allowed under UK copyright, restricts DH research: first, by limiting opportunities for innovative computational research; and second, by excluding lab-based library/DH collaborative models. As web preservation activities become concentrated in a small group of key organisations, current regulations disadvantage libraries in comparison to not-for-profits, whose vital work is supported by an ability to take risks denied to legal deposit libraries. The UK's approach to national domain web archiving represents a lost opportunity for computational scholarship, requiring us to rethink legal deposit in light of the differing affordances of born-digital archives.

References
Black, M. L. (2016). The World Wide Web as Complex Data Set: Expanding the Digital Humanities into the Twentieth Century and Beyond through Internet Research. International Journal of Humanities and Arts Computing, 10(1): 95–109.
Brazier, C. (2016). Great Libraries? Good Libraries? Digital Collection Development and What it Means for Our Great Research Collections. In Baker, D. and Evans, W. (eds), Digital Information Strategies: From Applications and Content to Libraries and People. Waltham, MA: Chandos Publishing, pp. 41–56.
Brügger, N. (2012). Web History and the Web as a Historical Source. Studies in Contemporary History, 2. http://www.zeithistorische-forschungen.de/site/40209295/default.aspx (accessed 9 January 2017).
Gooding, P. (2012). Mass Digitization and the Garbage Dump: The Conflicting Needs of Quantitative and Qualitative Methods.
Literary and Linguistic Computing. doi:10.1093/llc/fqs054. http://llc.oxfordjournals.org/content/early/2012/12/22/llc.fqs054.abstract (accessed 30 July 2013).
Hahn, C. (2008). Doing Qualitative Research Using Your Computer: A Practical Guide. London: Sage Publications Ltd.
Hockx-Yu, H. (2014). Access and Scholarly Use of Web Archives. Alexandria, 25(1/2): 113–27.
Intellectual Property Office (2014). Exceptions to Copyright: Research. UK Government. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/375954/Research.pdf.
Winters, J. (2017). Coda: Web Archives for Humanities Research – Some Reflections. In The Web as History. London: UCL Press, pp. 238–48.

Plenary lectures Tramando la palabra Janet Chávez Santiago Digital Experimentation, Courageous Citizenship and Caribbean Futurism Schuyler Esprit Panels Digital Humanities & Colonial Latin American Studies Roundtable Hannah Alpert-Abrams Clayton McCarl Ernesto Priani Linda Rodriguez Diego Jimenez Baldillo Patricia Murrieta-Flores Bruno Martins Ian Gregory Bridging Cultures Through Mapping Practices: Space and Power in Asia and America Cecile Armand Christian Henriot Sora Kim gorgeousora@gmail.com Ian Caine Jerry Gonzalez jerry.gonzalez@utsa.edu Rebecca Walter Critical Theory + Empirical Practice: “The Archive” as Bridge James William Baker Caroline Bassett David Berry D.M. Sharon Webb Rebecca Wright Networks of Communication and Collaboration in Latin America Nora Christine Benedict Cecily Raynor cecily.raynor@mcgill.ca Roberto Cruz Arzabal Rhian Lewis Norberto Gomez Jr. Carolina Gaínza Digital Decolonizations: Remediating the Popol Wuj Allison Margaret Bigelow Pamela Espinosa de los Monteros Will Hansen Rafael Alvarado rca2t@virginia.edu Catherine Addington ca2bb@virginia.edu Karina Baptista Mid-Range Reading: Manifesto Edition Grant Wythoff Alison Booth Sarah Allison Daniel Shore Precarious Labor in the Digital Humanities Christina Boyles Carrie Johnston Jim McGrath Paige Morgan Miriam Posner Chelcie Rowell Experimental Humanities Maria Sachiko Cecire Dennis Yi Tenen Wai Chee Dimock Nicholas Bauch Kimon Keramidas Freya Harrison Erin Connelly Reimagining the Humanities Lab Tanya Clement Lori Emerson Elizabeth Losh Thomas Padilla Legado de las/los latinas/os en los Estados Unidos: Proyectos de DH con archivos del Recovery Isis Campos Annette Zapata Maira E. Álvarez Sylvia A. Fernández Social Justice, Data Curation, and Latin American & Caribbean Studies Lorena Gauthereau Hannah Alpert-Abrams Alex Galarza Mario H. Ramirez Crystal Andrea Felima Digital Humanities in Middle and High School: Case Studies and Pedagogical Approaches Alexander Gil Roopika Risam Stan Golanka Nina Rosenblatt David Thomas Matt Applegate James Cohen Eric Rettberg Schuyler Esprit Remediating Machistán: Bridging Espacios Queer in Culturas Digitales, or Puentes over Troubled Waters Carina Emilia Guzman T.L. Cowan Jasmine Rault Itzayana Gutierrez Beyond Image Search: Computer Vision in Western Art History Leonardo Laurence Impett Peter Bell bell@uni-heidelberg.de Benoit Auguste Seguin benoit.seguin@epfl.ch Bjorn Ommer ommer@uni-heidelberg.de Building Bridges With Interactive Visual Technologies Adeline Joffres Rocio Ruiz Rodarte Roberto Scopigno George Bruseker Anaïs Guillem Marie Puren Charles Riondet Pierre Alliez Franco Niccolucci The Impact of FAIR Principles on Scientific Communities in (Digital) Humanities.
An Example of French Research Consortia in Archaeology, Ethnology, Literature and Linguistics Adeline Joffres Nicolas Larrousse Stéphane Pouyllau Olivier Baude Fatiha Idmhand Xavier Rodier Véronique Ginouvès Michel Jacobson DH in 3D: Multidimensional Research and Education in the Digital Humanities Rachel Hendery Steven Jones Micki Kaufman Amanda Licastro Angel David Nieves Kate Richards Geoffrey Rockwell Lisa M. Snyder Si las humanidades digitales fueran un círculo estaríamos hablando de la circunferencia digital Tália Méndez Mahecha Javier Beltrán Stephanie Sarmiento Duván Barrera Sara del mar Castiblanco María Helena Vargas Natalia Restrepo Camilo Martinez Juan Camilo Chavez Digital Humanities meets Digital Cultural Heritage Sander Münster Fulvio Rinaudo Rosa Tamborrino Fabrizio Apollonio Marinos Ioannides Lisa Snyder Digital Chicago: #DH As A Bridge To A City’s Past Emily Mace Rebecca Graff Richard Pettengill Desmond Odugu Benjamin Zeller Bridging Between The Spaces: Cultural Representation Within Digital Collaboration and Production Stephanie Mahnke Shewonda Leger Suban Nur Cooley Victor Del Hierro Laura Gonzales Pensar filosóficamente las humanidades digitales Marat Ocampo Gutiérrez de Velasco Francisco Barrón Tovar Ana María Guzmán Olmos Sandra Reyes Álvarez Elena León Magaña Ethel Rueda Hernández Perspectivas Digitales y a Gran Escala en el Estudio de Revistas Culturales de los Espacios Hispánico y Lusófono Ventsislav Ikoff Laura Fólica Diana Roig Sanz Hanno Ehrlicher Teresa Herzgsell Claudia Cedeño Rocío Ortuño Joana Malta Pedro Lisboa Las Humanidades Digitales en la Mixteca de Oaxaca: reflexiones y proyecciones sobre la Herencia Viva o Patrimonio Emmanuel Posselt Santoyo Liana Ivette Jiménez Osorio Laura Brenda Jiménez Osorio Roberto Carlos Reyes Espinosa Eruvid Cortés Camacho José Aníbal Arias Aguilar José Abel Martínez Guzmán Project Management For The Digital Humanities Natalia Ermolaev Rebecca Munson Xinyi Li Lynne Siemens Ray Siemens Micki Kaufman Jason Boyd Can Non-Representational Space Be Mapped? 
The Case of Black Geographies Jonathan David Schroeder Clare Eileen Callahan Kevin Modestino Tyechia Lynn Thompson Producción y Difusión de la investigación de las colecciones de archivos gráficos y fotográficos en el Archivo Histórico Riva-Agüero (AHRA) Rita Segovia Rojas Ada Arrieta Álvarez Daphne Cornejo Retamozo Patricio Alvarado Luna Ivonne Macazana Galdos Paula Benites Mendoza Fernando Contreras Zanabria Melissa Boza Palacios Enrique Urteaga Araujo Unanticipated Afterlives: Resurrecting Dead Projects and Research Data for Pedagogical Use Megan Finn Senseney Paige Morgan Miriam Posner Andrea Thomer Helene Williams Global Perspectives On Decolonizing Digital Pedagogy Anelise Hanson Shrout Jamila Moore-Pewu Gimena del Rio Riande Susanna Allés Kajsa Hallberg Adu Computer Vision in DH Lauren Tilton Taylor Arnold Thomas Smits Melvin Wevers Mark Williams Lorenzo Torresani Maksim Bolonkin John Bell Dimitrios Latsis Harnessing Emergent Digital Technologies to Facilitate North-South, Cross-Cultural, Interdisciplinary Conversations about Indigenous Community Identities and Cultural Heritage in Yucatán Gabrielle Vail Sarah Buck Kachaluba Matilde Cordoba Azcarate Samuel Francois Jouault Digital Humanities Pedagogy and Praxis Roundtable Amanda Heinrichs James Malazita Jim McGrath Miriam Peña Pimentel Lisa Rhody Paola Ricaurte Quijano Adriana Álvarez Sánchez Brandon Walsh Ethan Watrall Matthew Gold Justice-Based DH, Practice, and Communities Vika Zafrin Purdom Lindblad Roopika Risam rrisam@salemstate.edu Gabriela Baeza Ventura Carolina Villarroel Long papers The Hidden Dictionary: Text Mining Eighteenth-Century Knowledge Networks Mark Andrew Algee-Hewitt De la teoría a la práctica: Visualización digital de las comunidades en la frontera México-Estados Unidos Maira E. Álvarez Sylvia A. Fernández Comparing human and machine performances in transcribing 18th century handwritten Venetian script Sofia Ares Oliveira Frederic Kaplan Metadata Challenges to Discoverability in Children’s Picture Book Publishing: The Diverse BookFinder Intervention Kathi Inman Berens Christina Bell The Idea of a University in a Digital Age: Digital Humanities as a Bridge to the Future University David M. Berry Hierarchies Made to Be Broken: The Case of the Frankenstein Bicentennial Variorum Edition Elisa Beshero-Bondar Raffaele Viglianti Non-normative Data From The Global South And Epistemically Produced Invisibility In Computationally Mediated Inquiry Sayan Bhattacharyya The CASPA Model: An Emerging Approach to Integrating Multimodal Assignments Michael Blum Quechua Real Words: An Audiovisual Corpus of Expressive Quechua Ideophones Jeremy Browne Janis Nuckolls Negentropic linguistic evolution: A comparison of seven languages Vincent Buntinx Frédéric Kaplan Labeculæ Vivæ. Building a Reference Library of Stains Found on Medieval Manuscripts with Multispectral Imaging Heather Wacha Alberto Campagnolo Erin Connelly Dall’Informatica umanistica alle Digital Humanities. 
Per una storia concettuale delle DH in Italia Fabio Ciotti Linked Books: Towards a collaborative citation index for the Arts and Humanities Giovanni Colavizza Matteo Romanello Martina Babetto Vincent Barbay Laurent Bolli Silvia Ferronato Frédéric Kaplan Organising the Unknown: A Concept for the Sign Classification of not yet (fully) Deciphered Writing Systems Exemplified by a Digital Sign Catalogue for Maya Hieroglyphs Franziska Diehr Sven Gronemeyer Christian Prager Elisabeth Wagner Katja Diederichs Nikolai Grube Maximilian Brodhun Automated Genre and Author Distinction in Comics: Towards a Stylometry for Visual Narrative Alexander Dunst Rita Hartel Social Knowledge Creation in Action: Activities in the Electronic Textual Cultures Lab Alyssa Arbuckle Randa El Khatib Ray Siemens Network Analysis Shows Previously Unreported Features of Javanese Traditional Theatre Miguel Escobar Varela Andrew Schauf To Catch a Protagonist: Quantitative Dominance Relations in German-Language Drama (1730–1930) Frank Fischer Peer Trilcke Christopher Kittel Carsten Milling cmil@hashtable.de Daniil Skorinkin Visualising The Digital Humanities Community: A Comparison Study Between Citation Network And Social Network Jin Gao Julianne Nyhan Oliver Duke-Williams Simon Mahony SciFiQ and “Twinkle, Twinkle”: A Computational Approach to Creating “the Perfect Science Fiction Story” Adam Hammond Julian Brooke Minna de Honkoku: Learning-driven Crowdsourced Transcription of Pre-modern Japanese Earthquake Records Yuta Hashimoto Yasuyuki Kano Ichiro Nakasnishi Junzo Ohmura Yoko Odagi Kentaro Hattori Tama Amano Tomoyo Kuba Haruno Sakai Data Scopes: towards Transparent Data Research in Digital Humanities Rik Hoekstra Marijn Koolen Marijke van Faassen Authorship Attribution Variables and Victorian Drama: Words, Word-Ngrams, and Character-Ngrams David L. Hoover Digital Humanities in Latin American Studies: Cybercultures Initiative Angelica J. Huizar A machine learning methodology to analyze 3D digital models of cultural heritage objects Diego Jimenez-Badillo Salvador Ruiz-Correa Mario Canul-Ku Rogelio Hasimoto Women’s Books versus Books by Women Corina Koolen Digital Modelling of Knowledge Innovations In Sacrobosco’s Sphere: A Practical Application Of CIDOC-CRM And Linked Open Data With CorpusTracer Florian Kräutli Matteo Valleriani Esther Chen Christoph Sander Dirk Wintergrün Sabine Bertram Gesa Funke Chantal Wahbi Manon Gumpert Victoria Beyer Nana Citron Guillaume Ducoffe Quantitative microanalysis? Different methods of digital drama analysis in comparison Benjamin Krautter Computational Analysis and Visual Stylometry of Comics using Convolutional Neural Networks Jochen Laubrock David Dubray Classical Chinese Sentence Segmentation for Tomb Biographies of Tang Dynasty Chao-Lin Liu Yi Chang Epistemic Infrastructures: Digital Humanities in/as Instrumentalist Context James W. 
Malazita Visualizing the Feminist Controversy in England, 1788-1810 Laura C Mandell Megan Pearson Rebecca Kempe Steve Dezort ZX Spectrum, or Decentering Digital Media Platform Studies approach as a tool to investigate the cultural differences through computing systems in their interactions with creativity and expression Piotr Marecki Michał Bukowski Robert Straky Ciências Sociais Computacionais no Brasil Juliana Marques Celso Castro Distributions of Function Words Across Narrative Time in 50,000 Novels David William McClure Scott Enderle Challenges in Enabling Mixed Media Scholarly Research with Multi-media Data in a Sustainable Infrastructure Roeland Ordelman Carlos Martínez Ortíz Liliana Melgar Estrada Marijn Koolen Jaap Blom Willem Melder Jasmijn Van Gorp Victor De Boer Themistoklis Karavellas Lora Aroyo Thomas Poell Norah Karrouche Eva Baaren Johannes Wassenaar Julia Noordegraaf Oana Inel El campo del arte en San Luis Potosí, México: 1950-2017. Análisis de Redes Sociales y Capital Social José Antonio Motilla The Search for Entropy: Latin America’s Contribution to Digital Art Practice Tirtha Prasad Mukhopadhyay Reynaldo Thompson Ego-Networks: Building Data for Feminist Archival Recovery Emily Christina Murphy Searching for Concepts in Large Text Corpora: The Case of Principles in the Enlightenment Stephen Osadetz Kyle Courtney Claire DeMarco Cole Crawford Christine Fernsebner Eslao Achieving Machine-Readable Mayan Text via Unicode: Blending “Old World” script-encoding with novel digital approaches Carlos Pallan Gayol Deborah Anderson Whose Signal Is It Anyway? A Case Study on Musil for Short Texts in Authorship Attribution Simone Rebora J. Berenike Herrmann Gerhard Lauer Massimo Salgaro Creating and Implementing an Ontology of Documents and Texts Peter Robinson Detection and Measurement of Digital Imbalances on a Local Scale Related to the Mechanism for Production and Distribution of Cultural Information Nuria Rodríguez-Ortega #SiMeMatan Será por Atea: Procesamiento Ciberactivista de la Religión como Parte del Canon Heteropatriarcal en México Michelle Vyoleta Romero Gallardo Edición literaria electrónica y lectura SMART Dolores Romero-López Alicia Reina-Navarro Lucía Cotarelo-Esteban José Luis Bueren-Gómez-Acebo Para la(s) historia(s) de las mujeres en digital: pertinencias, usabilidades, interoperabilidades Amelia Sanz From print to digital: A web-edition of Giacomo Leopardi’s Idilli Desmond Schmidt Paola Italia Milena Giuffrida Simone Nieddu Designing Digital Collections for Social Relevance Susan Schreibman The Digitization of “Oriental” Manuscripts: Resisting the Reinscribing of Canon and Colonialism Caroline T. Schroeder A Deep Gazetteer of Time Periods Ryan Shaw Adam Rabinowitz Patrick Golden Feminismo y Tecnología: Software Libre y Cultura Hacker Como Medio Para la Apropiación Tecnológica Martha Irene Soria Guzmán Interpreting Difference among Transcripts Michael Sperberg-McQueen Claus Huitfeldt Modelling Multigraphism: The Digital Representation of Multiple Scripts and Alphabets Peter Anthony Stokes Chinese Text Project A Dynamic Digital Library of Pre-modern Chinese Donald Sturgeon Handwritten Text Recognition, Keyword Indexing Dominique Stutzmann Christopher Kermorvant Enrique Vidal Sukalpa Chanda Sébastien Hamel Joan Puigcerver Pérez Lambert Schomaker Alejandro H. 
Toselli Estudio exploratorio sobre los territorios de la biopirateria de las medicinas tradicionales en Internet : el caso de America Latina Luis Torres-Yepez Khaldoun Zreik In Search of the Drowned in the Words of the Saved: Mining and Anthologizing Oral History Interviews of Holocaust Survivors Gabor Toth LitViz: Visualizing Literary Data by Means of text2voronoi Tolga Uslu Alexander Mehler Dirk Meyer Lo que se vale y no se vale preguntar: el potencial pedagógico de las humanidades digitales para la enseñanza sobre la experiencia mexicano-americana en el midwest de Estados Unidos Isabel Velázquez Jennifer Isasi Marcus Vinícius Barbosa Solving the Problem of the “Gender Offenders”: Using Criminal Network Analysis to Optimize Openness in Male Dominated Collaborative Networks Deb Verhoeven Katarzyna Musial Stuart Palmer Sarah Taylor Lachlan Simpson Vejune Zemaityte Shaukat Abidi “Fortitude Flanked with Melody:” Experiments in Music Composition and Performance with Digital Scores Raffaele Viglianti Joseph Arkfeld On Alignment of Medieval Poetry Stefan Jänicke David Joseph Wrisley Short Papers Archivos digitales, cultura participativa y nuevos alfabetismos: La catalogación colaborativa del Archivo Histórico Regional de Boyacá (Colombia) Maria Jose Afanador-Llach Andres Lombana The Programming Historian en español: Estrategias y retos para la construcción de una comunidad global de HD Maria Jose Afanador-Llach La Sala de la Reina Isabel en el Museo del Prado, 1875-1877: La realidad aumentada en 3D como método de investigación, producto y vehículo pedagógico Eugenia V Afinoguenova Chris Larkee Giuseppe Mazzone Pierre Géal A Digital Edition of Leonhard Euler’s Correspondence with Christian Goldbach Sepideh Alassi Tobias Schweizer Martin Mattmüller Lukas Rosenthaler Helmut Harbrecht Bridging the Divide: Supporting Minority and Historic Scripts in Fonts: Problems and Recommendations Deborah Anderson Conexiones Digitales Afrolatinoamericanas. El Análisis Digital de la Colección Manuel Zapata Olivella Eduard Arriaga Dal Digital Cultural Heritage alla Digital Culture. Evoluzioni nelle Digital Humanities Nicola Barbuti Ludovica Marinucci Mesurer Merce Cunningham : une expérimentation en «theatre analytics» Clarisse Bardiot Is Digital Humanities Adjuncting Infrastructurally Significant? Kathi Inman Berens Transposição Didática e atuais Recursos Pedagógicos: convergências para o diálogo educativo Ana Maria Bosse Juliana Bergmann Hurricane Memorial: The United States’ Racialized Response to Disaster Relief Christina Boyles Backoff Lemmatization as a Philological Method Patrick J. 
Burns Las humanidades digitales y el patrimonio arqueológico maya: resultados preliminares de un esfuerzo interinstitucional de documentación y difusión Arianna Campiani Nicola Lercari Cartonera Publishers Database, documenting grassroots publishing initiatives Paloma Celis Carbajal Integrating Latent Dirichlet Allocation and Poisson Graphical Model: A Deep Dive into the Writings of Chen Duxiu, Co-Founder of the Chinese Communist Party Anne Shen Chao Qiwei Li Zhandong Liu Sensory Ethnography and Storytelling with the Sounds of Voices: Methods, Ethics and Accessibility Kelsey Marie Chatlosh Seinfeld at The Nexus of the Universe: Using IMDb Data and Social Network Theory to Create a Digital Humanities Project Cindy Conaway Diane Shichtman Exploring Big and Boutique Data through Laboring-Class Poets Online Cole Daniel Crawford Organizing communities of practice for shared standards for 3D data preservation Lynn Cunningham Hannah Scates-Kettler Legacy No Longer: Designing Sustainable Systems for Website Development Karin Dalziel Jessica Dussault Gregory Tunink Histonets, Turning Historical Maps into Digital Networks Javier de la Rosa Pérez Scott Bailey Clayton Nall Ashley Jester Jack Reed Drew Winget Alfabetización digital, prácticas y posibilidades de las humanidades digitales en América Latina y el Caribe Gimena del Rio Riande Paola Ricaurte Quijano Virginia Brussa Listening for Religion on a Digital Platform Amy DeRogatis Words that Have Made History, or Modeling the Dynamics of Linguistic Changes Maciej Eder Locative Media for Queer Histories: Scaling up “Go Queer” Maureen Engel Analyzing Social Networks of XML Plays: Exploring Shakespeare’s Genres Lawrence Evalyn Susan Gauch Manisha Shukla Resolving the Polynymy of Place: or How to Create a Gazetteer of Colonized Landscapes Katherine Mary Faull Diane Katherine Jakacki Audiences, Evidence, and Living Documents: Motivating Factors in Digital Humanities Monograph Publishing Katrina Fenlon Megan Senseney Maria Bonn Janet Swatscheno Christopher R. Maden Mitologias do Fascínio Tecnológico Andre Azevedo da Fonseca Latin@ voices in the Midwest: Ohio Habla Podcast Elena Foulis Spotting the Character: How to Collect Elements of Characterisation in Literary Texts? 
Ioana Galleron Fatiha Idmhand Cécile Meynard Pierre-Yves Buard Julia Roger Anne Goloubkoff Archivos Abiertos y Públicos para el Postconflicto Colombiano Stefania Gallini Humanidades Digitales en Cuba: Avances y Perspectivas Maytee García Vázquez Sulema Rodriguez Roche Ania Hernández Quintana Corpus Jurídico Hispano Indiano Digital: Análisis De Una Cultura Jurisdiccional Víctor Gayol Expanding the Research Environment for Ancient Documents (READ) to Any Writing System Andrew Glass The Latin American Comics Archive: An Online Platform For The Research And Teaching Of Digitized And Encoded Spanish-Language Comic Books Through Scholar/Student Collaboration Felipe Gomez Scott Weingart Daniel Evans Rikk Mulligan Verba Volant, Scripta Manent: An Open Source Platform for Collecting Data to Train OCR Models for Manuscript Studies Samuel Grieggs Bingyu Shen Hildegund Muller Christine Ascik Erik Ellis Mihow McKenny Nikolas Churik Emily Mahan Walter Scheirer Indagando la cultura impresa del siglo XVIII Novohispano: una base de datos inédita Víctor Julián Cid Carmona Silvia Eunice Gutiérrez De la Torre Guadelupe Elisa Cihuaxty Acosta Samperio Puesta en mapa: la literatura de México a través de sus traducciones Silvia Eunice Gutiérrez De la Torre Jorge Mendoza Romero Amaury Gutiérrez Acosta Flexibility and Feedback in Digital Standards-Making: Unicode and the Rise of Emojis S. E. Hackney The Digital Ghost Hunt: A New Approach to Coding Education Through Immersive Theatre Elliott Hall Exploration of Sentiments and Genre in Spanish American Novels Ulrike Edith Gerda Henny-Krahmer Digitizing Paratexts Kate Holterhoff A Corpus Approach to Manuscript Abbreviations (CAMA) Alpo Honkapohja On Natural Disasters In Chinese Standard Histories Hong-Ting Su Jieh Hsiang Nungyao Lin REED London and the Promise of Critical Infrastructure Diane Katherine Jakacki Susan Irene Brown James Cummings Kimberly Martin Large-Scale Accuracy Benchmark Results for Juola’s Authorship Verification Protocols Patrick Juola Adapting a Spelling Normalization Tool Designed for English to 17th Century Dutch Ivan Kisjes Wijckmans Tessa Differential Reading by Image-based Change Detection and Prospect for Human-Machine Collaboration for Differential Transcription Asanobu Kitamoto Hiroshi Horii Misato Horii Chikahiko Suzuki Kazuaki Yamamoto Kumiko Fujizane The History and Context of the Digital Humanities in Russia Inna Kizhner Melissa Terras Lev Manovich Boris Orekhov Anastasia Bonch-Osmolovskaya Maxim Rumyantsev Urban Art in a Digital Context: A Computer-Based Evaluation of Street Art and Graffiti Writing Sabine Lang Björn Ommer ¿Metodologías en Crisis? Tesis 2.0 a través de la Etnografía de lo Digital Domingo Manuel Lechón Gómez Hashtags contra el acoso: The dynamics of gender violence discourse on Twitter Rhian Elizabeth Lewis Novas faces da arte política: ações coletivas e ativismos em realidade aumentada Daniela Torres Lima Sandra van Ginhoven Critical Data Literacy in the Humanities Classroom Brandon T. Locke Ontological Challenges in Editing Historic Editions of the Encyclopedia Britannica Peter M Logan Distinctions between Conceptual Domains in the Bilingual Poetry of Pablo Picasso Enrique Mallen Luis Meneses A formação de professores/pesquisadores de História no contexto da Cibercultura: História Digital, Humanidades Digitais e as novas perspectivas de ensino no Brasil. 
Patrícia Marcondes de Barros Presentation Of Web Site On The Banking And Financial History Of Spain And Latin America Carlos Marichal Spatial Disaggregation of Historical Census Data Leveraging Multiple Sources of Ancillary Data João Miguel Monteiro Bruno Emanuel Martins Patricia Murrieta-Flores João Moura Pires The Poetry Of The Lancashire Cotton Famine (1861-65): Tracing Poetic Responses To Economic Disaster Ruth Mather READ Workbench – Corpus Collaboration and TextBase Avatars Ian McCrabb Preserving and Visualizing Queer Representation in Video Games Cody Jay Mejeur Segmentación, modelado y visualización de fuentes históricas para el estudio del perdón en el Nuevo Reino de Granada del siglo XVIII Jairo Antonio Melo Flórez Part Deux: Exploring the Signs of Abandonment of Online Digital Humanities Projects Luis Meneses Jonathan Martin Richard Furuta Ray Siemens A People’s History? Developing Digital Humanities Projects with the Public Susan Michelle Merriam Peer Learning and Collaborative Networks: On the Use of Loop Pedals by Women Vocal Artists in Mexico Aurelio Meza Next Generation Digital Humanities: A Response To The Need For Empowering Undergraduate Researchers Taylor Elyse Mills La creación del Repositorio Digital del Patrimonio Cultural de México Ernesto Miranda Vania Ramírez Towards Linked Data of Bible Quotations in Jewish Texts Oren Mishali Benny Kimelfeld Towards a Metric for Paraphrastic Modification Maria Moritz Johannes Hellrich Sven Buechel Temporal Entity Random Indexing Annalina Caputo Gary Munnelly Seamus Lawless IncipitSearch - Interlinking Musicological Repositories Anna Neovesky Frederic von Vlahovits OCR’ing and classifying Jean Desmet’s business archive: methodological implications and new directions for media historical research Christian Gosvig Olesen Ivan Kisjes The 91st Volume — How the Digitised Index for the Collected Works of Leo Tolstoy Adds A New Angle for Research Boris V. Orekhov Frank Fischer Adjusting LERA For The Comparison Of Arabic Manuscripts Of _Kalīla wa-Dimna_ Beatrice Gründler Marcus Pöckelmann Afterlives of Digitization Lily Cho Julienne Pascoe Rapid Bricolage Implementing Digital Humanities William Dudley Pascoe The Time-Us project. Creating gold data to understand the gender gap in the French textile trades (17th–20th century) Eric de La Clergerie Manuela Martini Marie Puren Charles Riondet Alix Chagué Modeling Linked Cultural Events: Design and Application Kaspar Beelen Ivan Kisjes Julia Noordegraaf Harm Nijboer Thunnis van Oort Claartje Rasterhoff Bridging Divides for Conservation in the Amazon: Digital Technologies & The Calha Norte Portal Hannah Mabel Reardon Measured Unrest In The Poetry Of The Black Arts Movement Ethan Reed Does “Late Style” Exist? New Stylometric Approaches to Variation in Single-Author Corpora Jonathan Pearce Reeve Keeping 3D data alive: Developments in the MayaCityBuilder Project Heather Richards-Rissetto Rachel Optiz Fabrizio Galeazzi Finding Data in a Literary Corpus: A Curatorial Approach Brad Rittenhouse Sudeep Agarwal Mapping And Making Community: Collaborative DH Approaches, Experiential Learning, And Citizens’ Media In Cali, Colombia Katey Roden Pavel Shlossberg The Diachronic Spanish Sonnet Corpus (DISCO): TEI and Linked Open Data Encoding, Data Distribution and Metrical Findings Pablo Ruiz Fabo Helena Bermúdez Sabel Clara Martínez Cantón Elena González-Blanco Borja Navarro Colorado Polysystem Theory and Macroanalysis. 
A Case Study of Sienkiewicz in Italian Jan Rybicki Katarzyna Biernacka-Licznar Monika Woźniak Interrogating the Roots of American Settler Colonialism: Experiments in Network Analysis and Text Mining Ashley Sanders Garcia ¿Existe correlación entre importancia y centralidad? Evaluación de personajes con redes sociales en obras teatrales de la Edad de Plata? Teresa Santa María Elena Martínez Carro Concepción Jiménez José Calvo Tello Cultural Awareness & Mapping Pedagogical Tool: A Digital Representation of Gloria Anzaldúa’s Frontier Theory Rosita Scerbo Corpus Linguistics for Multidisciplinary Research: Coptic Scriptorium as Case Study Caroline T. Schroeder Extracting and Aligning Artist Names in Digitized Art Historical Archives Benoit Seguin Lia Costiner Isabella di Lenardo Frédéric Kaplan A Design Process Model for Inquiry-driven, Collaboration-first Scholarly Communications Sara B. Sikes Métodos digitales para el estudio de la fotografía compartida. Una aproximación distante a tres ciudades iberoamericanas en Instagram Gabriela Elisa Sued Revitalizing Wikipedia/DBpedia Open Data by Gamification -SPARQL and API Experiment for Edutainment in Digital Humanities Go Sugimoto The Purpose of Education: A Large-Scale Text Analysis of University Mission Statements Danica Savonick Lisa Tagliaferri Digital Humanities Integration and Management Challenges in Advanced Imaging Across Institutions and Technologies Nondestructive Imaging of Egyptian Mummy Papyrus Cartonnage Michael B. Toth Melissa Terras Adam Gibson Cerys Jones Towards A Digital Dissolution: The Challenges Of Mapping Revolutionary Change In Pre-modern Europe Charlotte Tupman James Clark Richard Holding An Archaeology of Americana: Recovering the Hemispheric Origins of Sabin’s Bibliotheca Americana to Contest the Database’s (National) Limits Mary Lindsay Van Tine Tweets of a Native Son: James Baldwin, #BlackLivesMatter, and Networks of Textual Recirculation Melanie Walsh Abundance and Access: Early Modern Political Letters in Contemporary and Digital Archives Elizabeth Williamson Balanceándonos entre la aserción de la identidad y el mantenimiento del anonimato: Usos sociales de la criptografía en la red Gunnar Eyal Wolf Iszaevich A White-Box Model for Detecting Author Nationality by Linguistic Differences in Spanish Novels Albin Zehe Daniel Schlör Ulrike Henny-Krahmer Martin Becker Andreas Hotho Media Preservation between the Analog and Digital: Recovering and Recreating the Rio VideoWall Gregory Zinman The (Digital) Space Between: Notes on Art History and Machine Vision Learning Benjamin Zweig Posters World of the Khwe Bushmen: Accessing Khwe Cultural Heritage Data by Means of a Digital Ontology Based on Owlnotator Giuseppe Abrami Gertrude Boden Lisa Gleiß Design on View: Imagining Culture as a Digital Outcome Ersin Altin Introducing Polo: Exploring Topic Models as Database and Hypertext Rafael Alvarado The Spatial Humanities Kit Matt Applegate Jamie Cohen The Magnifying Glass and the Kaleidoscope. 
Analysing Scale in Digital History and Historiography Florentina Armaselu Encoding the Oldest Western Music Allyn Waller Toni Armstrong Nicholas Guarracino Julia Spiegel Hannah Nguyen Marika Fox Creating a Digital Edition of Ancient Mongolian Historical Documents Biligsaikhan Batjargal Garmaabazar Khaltarkhuu Akira Maeda Shedding Light on Indigenous Knowledge Concepts and World Perception through Visual Analysis Alejandro Benito Amelie Dorn Roberto Therón Eveline Wandl-Vogt Antonio Losada The CLiGS Textbox José​ Calvo Tello Ulrike Henny-Krahmer Christof Schöch Katrin Betz CITE Exchange Format (CEX): Simple, plain-text interchange of heterogenous datasets Christopher William Blackwell Thomas Köntges Neel Smith Digitizing Whiteness: Systemic Inequality in Community Digital Archives Monica Kristin Blair How to create a Website and which Questions you have to answer first Peggy Bockwinkel Michael Czechowski La Aptitud para Encontrar Patrones y la Producción de Cine Suave (Soft Cinema) Diego Bonilla Women’s Faces and Women’s Rights: A Contextual Analysis of Faces Appearing in Time Magazine Kathleen Patricia Janet Brennan Vincent Berardi Aisha Cornejo Carl Bennett John Harlan Ana Jofre Decolonialism and Formal Ontology: Self-critical Conceptual Modelling Practice George Bruseker Anais Guillem Rules against the Machine: Building Bridges from Text to Metadata José Calvo Tello Prospectiva de la arquitectura en el siglo XXI. La arquitectura en entornos digitales Luis David Cardona Jiménez Visualizando Dados Bibliográficos: o Uso do VOSviewer como Ferramenta de Análise Bibliométrica de Palavras-Chave na Produção das Humanidades Digitais Renan Marinho de Castro Ricardo Medeiros Pimenta Mapping the Movida: Re-Imagining Counterculture in Post-Franco Spain (1975-1992) Vanessa Ceia Intellectual History and Computing: Modeling and Simulating the World of the Korean Yangban Javier Cha More Than “Nice to Have”: TEI-to-Linked Data Conversion Constance Crompton Michelle Schwartz Animating Text Newcastle University James Cummings Tiago Sousa Garcia Una Investigación a Explotar : Los Cristianos de Alá, Siglos XVI y XVII Marianne Delacourt Véronique Fabre The Iowa Canon of Greek and Latin Authors and Works Paul Dilley Digital Storytelling: Engaging Our Community and The Humanities Ruben Duran Charlotte Hamilton Text Mining Methods to Solve Organic Chemistry Problems, or Topic Modeling Applied to Chemical Molecules Maciej Eder Jan Winkowski Michał Woźniak Rafał L. 
Górski Bartosz Grzybowski Studying Performing Arts Across Borders: Towards a European Performing Arts Dataverse (EPAD) Thunnis van Oort Ivan Kisjes The Archive as Collaborative Learning Space Natalia Ermolaev Mark Saccomano Tensiones entre el archivo de escritor físico y el digital: hacia una aproximación teórica Leonardo Ariel Escobar Using Linked Open Data To Enrich Concept Searching In Large Text Corpora Christine Fernsebner Eslao Stephen Osadetz Pontes into the Curriculum: Introducing DH pedagogy through global partnerships Pamela Espinosa de los Monteros Joshua Sadvari Maria Scheid Milpaís: una wiki semántica para recuperar, compartir y construir colaborativamente las relaciones entre plantas, seres humanos, comunidades y entornos María Juana Espinosa Menéndez Camilo Martinez Cataloging History: Revisualizing the 1853 New York Crystal Palace Steven Lubar Emily Esten Steffani Gomez Brian Croxall Patrick Rashleigh Crowdsourcing Community Wellness: Coding a Mobile App For Health and Education Katherine Mary Faull Michael Thompson Jacob Mendelowitz Caroline Whitman Shaunna Barnhart Bad Brujas Only: Digital Presence, Embodied Protest, and Online Witchcraft Amanda Kelan Figueroa Ravon Ruffin La geopólitica de las humanidades digitales: un caso de estudio de DH2017 Montreal José Pino-Díaz Domenico Fiormonte Using Topic Modelling to Explore Authors’ Research Fields in a Corpus of Historical Scientific English Stefan Fischer Jörg Knappen Elke Teich Stranger Genres: Computationally Classifying Reprinted Nineteenth Century Newspaper Texts Jonathan D. Fitzgerald Ryan Cordell Humanities Commons: Collaboration and Collective Action for the Common Good Kathleen Fitzpatrick Making DH-Course Together Dinara Gagarina Standing in Between. Digital Archive of Manuel Mosquera Garcés. Maria Paula Garcia Mosquera Research Environment for Ancient Documents (READ) Andrew Glass Stephen White Ian McCrabb Manifold Scholarship: Hybrid Publishing in a Print/Digital Era Matthew K. Gold Jojo Karlin Zach Davis Legal Deposit Web Archives and the Digital Humanities: A Universe of Lost Opportunity? Paul Gooding Melissa Terras Linda Berube Crafting History: Using a Linked Data Approach to Support the Development of Historical Narratives of Critical Events Karen F. Gracy Prosopografía de la Revolución Mexicana: Actualización de la Obra de Françoise Xavier Guerra Martha Lucía Granados-Riveros Diego Montesinos Developing Digital Methods to Map Museum “Soft Power” Natalia Grincheva Brecht Beats Shakespeare! A Card-Game Intervention Revolving Around the Network Analysis of European Drama Angelika Hechtl Frank Fischer Anika Schultz Christopher Kittel Elisa Beshero-Bondar Steffen Martus Peer Trilcke Jana Wolf Ingo Börner Daniil Skorinkin Tatiana Orlova Carsten Milling Christine Ivanovic Visualizando una Aproximación Narratológica sobre la Producción y Utilización de los Recursos Online de Museos de Arte. 
María Isabel Hidalgo Urbaneja Transatlantic knowledge production and conveyance in community-engaged public history: German History in Documents and Images/Deutsche Geschichte in Dokumenten und Bildern Matthew Hiebert Simone Lässig A Tool to Visualize Data on Scientific Performance in the Czech Republic Radim Hladík Augmenting the University: Using Augmented Reality to Excavate University Spaces Christian Howard Monica Blair Spyros Simotas Ankita Chakrabarti Torie Clark Tanner Greene An Easy-to-use Data Analysis and Visualization Tool for Studying Chinese Buddhist Literature Jen-Jou Hung ‘This, reader, is no fiction’: Examining the Rhetorical Uses of Direct Address Across the Nineteenth- and Twentieth-Century Novel Gabrielle Kirilloff Reimagining Elizabeth Palmer Peabody’s Lost “Mural Charts” Alexandra Beall Courtney Allen Angela Vujic Lauren F. Klein TOME: A Topic Modeling Tool for Document Discovery and Exploration Adam Hayward Nikita Bawa Morgan Orangi Caroline Foster Lauren F. Klein Bridging Digital Humanities Internal and Open Source Software Projects through Reusable Building Blocks Rebecca Sutton Koeser Benjamin W Hicks Building Bridges Across Heritage Silos Kalliopi Kontiza Catherine Jones Joseph Padfield Ioanna Lykourentzou Voces y Caras: Hispanic Communities of North Florida Constanza M. López Baquero Empatía Digital: en los pixeles del otro Carolina Laverde Atlas de la narrativa mexicana del siglo XX y la representación visualizada de México en su literatura. Avance de proyecto Nora Marisa León-Real Méndez HuViz: From _Orlando_ to CWRC… And Beyond! Kim Martin Abi Lemak Susan Brown Chelsea Miya Jana Smith-Elford Endangered Data Week: Digital Humanities and Civic Data Literacy Brandon T. Locke Herramienta web para la identificación de la técnica de manufactura en fotografías históricas Gustavo Lozano San Juan Propuesta interdisciplinaria de un juego serio para la divulgación de conocimiento histórico. Caso de estudio: la divulgación del saber histórico sobre la vida conventual de los carmelitas descalzos del ex-Convento del Desierto de los Leones Leticia Luna Tlatelpa Fabián Gutiérrez Gómez Edné Balmori Feliciano García García Dr. Luis Rodriguez Morales Digital 3D modelling in the humanities Sander Münster Question, Create, Reflect: A Holistic and Critical Approach to Teaching Digital Humanities Kristen Mapes Matthew Handelman “Smog poem”. Example of data dramatization Piotr Marecki Leszek Onak ANJA, ¿dónde están los encabalgamientos? Clara Martinez-Canton Pablo Ruiz-Fabo Elena González-Blanco Combining String Matching and Cost Minimization Algorithms for Automatically Geocoding Tabular Itineraries Rui Santos Bruno Emanuel Martins Patricia Murrieta-Flores How We Became Digital? Recent History of Digital Humanities in Poland Maciej Maryl Hacia la traducción automática de las lenguas indígenas de méxico Jesús Manuel Mager Hois Ivan Vladimir Meza Ruiz Towards a Digital History of the Spanish Invasion of Indigenous Peru Jeremy M. 
Mikecz Style Revolution: Journal des Dames et des Modes Jodi Ann Mikesell Avery Schroeder Anne Higonnet Alex Gil AnaKaren Aguero Sarah Bigler Meghan Collins Emily Cormack Zoë Dostal Barthelemy Glama Brontë Hebdon The Two Moby Dicks: The Split Signatures of Melville’s Novel Chelsea Miya devochdelia: el Diccionario Etimolójico de las Voces Chilenas Derivadas de Lenguas Indíjenas Americanas de Rodolfo Lenz en versión digital Francisco Mondaca Unsustainable Digital Cultural Collections Jo Ana Morfin La automatización y “digitalización” del Centro de Documentación Histórica “Lic. Rafael Montejano y Aguiñaga” de la Universidad Autónoma de San Luis Potosí, mediante la autogestión y software libre José Antonio Motilla Ismael Huerta A Comprehensive Image-Based Digital Edition Using CEX: A fragment of the Gospel of Matthew Janey Capers Newland Emmett Baumgarten De’sean Markley Jeffrey Rein Brienna Dipietro Anna Sylvester Brandon Elmy Summey Hedden Using Zenodo as a Discovery and Publishing Platform Daniel Paul O’Donnell Natalia Manola Paolo Manghi Dot Porter Paul Esau Carey Viejou Roberto Rosselli Del Turco SpatioScholar: Annotating Photogrammetric Models Burcak Ozludil Altin Augustus Wendell Decolonising Collections Information – Disrupting Settler Colonial Power In Information Management in response to Canada’s Truth & Reconciliation Commission and the United Nations Declaration on the Rights of Indigenous Peoples Laura Phillips An Ontological Model for Inferring Psychological Profiles and Narrative Roles of Characters Mattia Egloff Antonio Lieto Davide Picca A Graphical User Interface for LDA Topic Modeling Steffen Pielström Severin Simmler Thorsten Vitt Fotis Jannidis Eliminar barreras para construir puentes a travès de la Web semántica: Isidore, un buscador trilingüe para las Ciencias Humanas y Sociales Sthephane Pouyllau Laurent Capelli Adeline Joffres Desseigne Adrien Gautier Hélène SSK by example. 
Make your Arts and Humanities research go standard Marie Puren Laurent Romary Lionel Tadjou Charles Riondet Dorian Seillier Monroe Work Today: Unearthing the Geography of US Lynching Violence RJ Ramey Educational Bridges: Understanding Conservation Dynamics in the Amazon through The Calha Norte Portal Hannah Mabel Reardon Building a Community Driven Corpus of Historical Newspapers Claudia Resch Dario Kampkaspar Daniela Fasching Vanessa Hannesschläger Daniel Schopper Expanding Communities of Practice: The Digital Humanities Research Institute Model Lisa Rhody Hannah Aizenmann Kelsey Chatlosh Kristen Hackett Jojo Karlin Javier Otero Peña Rachel Rakov Patrick Smyth Patrick Sweeney Stephen Zweibel Hispanic 18th Connect: una nueva plataforma para la investigación digital en español Rubria Rocha Laura Mandell Lorenzetti Digital Elvis Andrés Rojas Rodríguez Jose Nicolas Jaramillo Liévano Traditional Humanities Research and Interactive Mapping: Towards a User-Friendly Story of Two Worlds Collide Vasileios Routsis Digital Humanities Storytelling Heritage Lab Mariana Ruiz Gonzalez Renteria Angélica Amezcua Digital Humanities Under Your Fingertips: Tone Perfect as a Pedagogical Tool in Mandarin Chinese Second Language Studies and an Adaptable Catherine Youngkyung Ryu Codicological Study of pre-High Tang Documents from Dunhuang: An Approach using Scientific Analysis Data Shouji Sakamoto Léon-Bavi Vilmont Yasuhiko Watanabe Connecting Gaming Communities and Corporations to their History: The Gen Con Program Database Matt Shoemaker Resolving South Asian Orthographic Indeterminacy In Colonial-Era Archives Amardeep Singh Brâncuși’s Metadata: Turning a Graduate Humanities Course Curriculum Digital Stephen Craig Sturgeon A Style Comparative Study of Japanese Pictorial Manuscripts by “Cut, Paste and Share” on IIIF Curation Viewer Chikahiko Suzuki Akira Takagishi Asanobu Kitamoto Complex Networks of Desire: Fireweed, Fuse, Border/Lines Felicity Tayler Tomasz Neugebauer Locating Place Names at Scale: Using Natural Language Processing to Identify Geographical Information in Text Lauren Tilton Taylor Arnold Courtney Rivard 4 Ríos: una construcción transmedia de memoria histórica sobre el conflicto armado en Colombia Elder Manuel Tobar Panchoaga Building a Bridge to Next Generation DH Services in Libraries with a Campus Needs Assessment Harriett Green Eleanor Dickson Daniel G. Tracy Sarah Christensen Melanie Emerson JoAnn Jacoby Chromatic Structure and Family Resemblance in Large Art Collections — Exemplary Quantification and Visualizations Loan T Tran Kelly Park Poshen Lee Jevin West Maximilian Schich Ethical Constraints in Digital Humanities and Computational Social Science Anagha Uppal Bridging the Gap: Digital Humanities and the Arabic-Islamic Corpus Dafne Erica van Kuppevelt E.G. Patrick Bos A. Melle Lyklema Umar Ryad Christian R. Lange Janneke van der Zwaan Off-line Strategies for On-line Publications: Preparing the Shelley-Godwin Archive for Off-line Use Raffaele Viglianti Academy of Finland Research Programme “Digital Humanities” (DIGIHUM) Risto Pekka Vilkko Modeling the Genealogy of Imagetexts: Studying Images and Texts in Conjunction using Computational Methods Melvin Wevers Thomas Smits Leonardo Impett History for Everyone/Historia para todos: Ancient History Encyclopedia James Blake Wiener Gimena del Rio Riande Princeton Prosody Archive: Rebuilding the Collection and User Interface Meredith Martin Meagan Wilson Mary Naydan ELEXIS: Yet Another Research Infrastructure.
Or Why We Need A Special Infrastructure for E-Lexicography In The Digital Humanities Tanja Wissik Ksenia Zaytseva Thierry Declerck “Moon:” A Spatial Analysis of the Gumar Corpus of Gulf Arabic Internet Fiction David Joseph Wrisley Hind Saddiki Preconference Workshops New Scholars Seminar Geoffrey Rockwell Rachel Hendery Juan Steyn Elise Bohan Getting to Grips with Semantic and Geo-annotation using Recogito 2 Leif Isaksen Gimena del Río Riande Romina De León Nidia Hernández Semi-automated Alignment of Text Versions with iteal Stefan Jänicke David Joseph Wrisley Innovations in Digital Humanities Pedagogy: Local, National, and International Training Diane Katherine Jakacki Raymond George Siemens Katherine Mary Faull Machine Reading Part II: Advanced Topics in Word Vectors Eun Seo Jo Javier de la Rosa Pérez Scott Bailey Fernando Sancho Interactions: Platforms for Working with Linked Data Susan Brown Kim Martin Building International Bridges Through Digital Scholarship: The Trans-Atlantic Platform Digging Into Data Challenge Experience Elizabeth Tran Crystal Sissons Nicolas Parker Mika Oehling Herramientas para los usuarios: colecciones y anotaciones digitales Amelia Sanz Alckmar Dos Santos Ana Fernández-Pampillón Oscar García-Rama Joaquin Gayoso María Goicoechea Dolores Romero José Luis Sierra Where is the Open in DH? Wouter Schallier Gimena del Rio Riande April M. Hathcock Daniel O’Donnell Indexing Multilingual Content with the Oral History Metadata Synchronizer (OHMS) Teague Schneiter Brendan Coates SIG Endorsed Distant Viewing with Deep Learning: An Introduction to Analyzing Large Corpora of Images Taylor Baillie Arnold Lauren Craig Tilton The re-creation of Harry Potter: Tracing style and content across novels, movie scripts and fanfiction Marco Büchler Greta Franzini Mike Kestemont Enrique Manjavacas Archiving Small Twitter Datasets for Text Analysis: A Workshop for Beginners Ernesto Priego Bridging Justice Based Practices for Archives + Critical DH T-Kay Sangwand Caitlin Christian-Lamb Purdom Lindblad

work_3horm3ctfzfelcnmozatqdozsa ---- White Paper Report

Report ID: 100089
Application Number: HT5003610
Project Director: Philip Ethington (philipje@usc.edu)
Institution: University of Southern California
Reporting Period: 9/1/2011-8/31/2012
Report Due: 11/30/2012
Date Submitted: 12/17/2012

Office of Grant Management
Room 311
National Endowment for the Humanities
1100 Pennsylvania Avenue, N.W.
Washington, D.C. 20506

30 Nov. 2012
Dear Office of Grant Management:

Please find attached the White Paper for "Broadening the Digital Humanities: The Vectors-CTS Summer Institute on Digital Approaches to American Studies," 18 July to 12 August 2011 (ID Number: HT-50036-10), awarded to the University of Southern California. Do not hesitate to contact us if you have any questions or require additional materials.

Sincerely,
Philip J. Ethington, Tara McPherson, John Carlos Rowe

White Paper

Grant ID Number: HT-50036-10
Grant Term: 7/18/2011 to 8/12/2011
Grant Title: "Broadening the Digital Humanities: The Vectors-CTS Summer Institute on Digital Approaches to American Studies," 18 July to 12 August 2011
Project Directors: Philip Ethington, Tara McPherson, John Carlos Rowe
Grant Institution: University of Southern California
30 Nov. 2012

WHITE PAPER: "Broadening the Digital Humanities: The Vectors-CTS Summer Institute on Digital Approaches to American Studies," co-hosted by the Vectors-Center for Transformative Scholarship and the American Studies and Ethnicity Department at USC, 18 July to 12 August 2011. 30 Nov. 2012

Background: During the summer of 2011, we held a very productive four-week Summer Institute, "Broadening the Digital Humanities: The Vectors-CTS Summer Institute on Digital Approaches to American Studies," co-hosted by the Vectors-Center for Transformative Scholarship and the American Studies and Ethnicity Department at USC, 18 July to 12 August 2011. Our primary audience was the American Studies humanities scholar who does not have a great deal of computing experience but who has begun to express an interest in the digital humanities and in interactive media more broadly. Scholars were offered the opportunity to explore the benefits of interactive media for scholarly analysis and authorship, with an emphasis on two interoperable authoring and publishing platforms: the multimedia authoring platform Scalar and the geohistorical narrative visualization platform HyperCities. Please see "Outcomes" below for detailed project descriptions and post-Institute project developments.

Response to the Call for Proposals: The response to the call for proposals was very strong. Ninety-nine (99) proposals were submitted and reviewed, indicating a high level of interest in the Institute's vision. After review, seventeen fellows representing fourteen proposals were selected. Both the submitted proposals and the selected fellows came from applicants at a range of career levels (from advanced Ph.D. student to an endowed professor), a variety of colleges and universities, and a broad geographic distribution. (See below for a list of 2011 Summer Fellows.) Additionally, an advanced undergraduate was included in the pool, working alongside his father, a professor of Africana Studies. We were quite gratified at the strong overall quality across the applicant pool and could easily have accepted additional high-caliber proposals. In fact, narrowing the final pool to ten proved very difficult and, in the end, the decision was made to accept seventeen fellows by tapping additional USC financial resources. Our selection criteria were threefold: to achieve a diversity of content matter and theoretical frameworks; to optimize the match-up between our expertise and that of the applicants; and to achieve a balance of junior and senior scholars.
In keeping with the value that many place on collaborative scholarship in digital humanities, we accepted four two-scholar partnerships, pushing the total number of fellows to 20, all of the highest caliber.

2011 NEH Summer Institute Fellows:

Nicholas Brown and Sarah Kanouse – Recollecting Black Hawk
Wendy Cheng – A People's Guide to Los Angeles
Elizabeth Cornell – Keywords Collaboratory
Brandon Costelloe-Kuehn and Nick Shapiro – Networking Asthmatic Spaces
Matt Delmont – The Nicest Kids in Town: American Bandstand, Rock 'n' Roll, and Civil Rights in 1950s Philadelphia
Kara Keeling and Thenmozhi Soundararajan – Digital Media and Social Movements
David Kim and Mike Rocchio – Mapping the Murals: Chicano Community Murals in LA
Debra Levine – ACT UP Oral History Project
Curtis Marez – Cesar Chavez's Video Library
Mark Marino – Critical Code Studies
Carrie Rentschler – There/Not There: Witness in Genovese Case
Nicholas Sammond – Biting the Invisible Hand: Blackface Minstrelsy and Animation
Jonathan Sterne – MP3: The Meaning of a Format
Kara Thompson – A Future Perfect: Time, Queerness, Indigeneity
Oliver Wang – Legions of Boom: Mobile Sounds, Sights and Sites
Scott Wilson – Century Villages of Cabrillo (adjacent to the Port of Long Beach)

USC Instruction Team: Steve Anderson, Craig Dietrich, Phil Ethington (PI), Erik Loyer, Tara McPherson (Co-PI), Jillian O'Connor, John Carlos Rowe (Co-PI)

Visiting Presenters (please see the Appendix for biographical sketches): Mark Allen, Anne Balsamo, Randy Bass, Anne Burdick, Sharon Daniel, Kathleen Fitzpatrick, Gary Hall, Alexandra Juhasz, Marsha Kinder, Caroline Levander

Work Timeline: Based upon our assessments and those of our fellows, the Institute was a strong success. The Institute began July 18, 2011, ran for four weeks, and brought 20 fellows to the USC campus. Our work on the grant began well in advance of the Institute and continues to the present, as we continue to support various fellows in ongoing projects. Specifically, we have undertaken the following, in line with our proposed activities in the initial grant proposal:

Fall-Winter 2010-11
- Call for Proposals, curriculum, and lining up of visiting presenters
- Call for Proposals posted online and broadcast via discussion lists
- Logistical planning for the summer institute began (housing, lab schedule, etc.)

Spring 2011
- Erik Loyer and Craig Dietrich worked with Todd Presner and Dave Shepard of HyperCities to establish interoperability between the platforms. HyperCities was established as a partner archive and HyperCities content can be inserted into Scalar pages.
- Proposals received and evaluated by Review Committee
- Participants announced and confirmed
- Visiting faculty for the Institute confirmed
- Travel and housing plans established
- Workshop curriculum fine-tuned

June 2011
- Finalized logistics, technical support, and curriculum
- Finalized daily schedule
- Opened Institute wiki
- Assisted fellows with travel and housing needs
- Finalized travel for guest presenters

July-August 2011
- "Broadening the Digital Humanities: The Vectors-CTS Summer Institute on Digital Approaches to American Studies" held at USC, July 18 to August 12, 2011

Fall 2011-present
- Continued assessment of outcomes
- Continued support of several fellows' projects
- Publication and publicity support for fellows

Evaluation of the Institute: Fellows and presenters quickly established an atmosphere of risk-free learning and creativity.
Institute co-PIs and the instruction team carefully interviewed and tracked each participant to support participants' visions, helping them to understand the affordances of each platform, strategies for choosing between goals for each project, and how best to position the project to become a "publication." Each participant received weekly instruction concerning a) needed tutorials and support, arranging for hands-on expert help and lab time on specific topics ranging from spatial data processing and 3-D tools to audio and video processing and archiving; and b) reasonable goals and best use of time.

The morning sessions of the Institute comprised seminar-style discussions led by guest presenters, the grant PIs, or the fellows themselves. Afternoon sessions comprised labs, technical workshops and demonstrations, and project development time. This mix of activities seemed to strike a balance that allowed "something for everyone." While some fellows seemed to garner more from the seminars, others gravitated toward the hands-on aspects of the workshops and lab time. There were several things that the fellows all seemed to appreciate. Most found the quality of the guest presenters to be top notch, and many commented on how useful it was to see a range of types of work presented. Each also expressed gratitude for the way in which the Institute interwove both conceptual questions and technical approaches. For this set of scholars (all tied to the "traditional" humanities), lodging questions of the digital within larger intellectual frames proved deeply satisfying. This indeed proved key to "broadening" the digital humanities (as was our theme).

Assessment Protocol: PIs and the Instruction Team met weekly with each fellow in scheduled meetings to assess each participant's satisfaction with the curriculum and rate of progress, and their immediate, intermediate, and longer-term goals. Assignments of hands-on experts, lab topics, and demos were adjusted accordingly.

Outcomes: Project Update Reports, as of November 2012

We are very pleased that about half of the projects launched during the 2011 Summer Institute are either already published (e.g. Matt Delmont, with UC Press), being submitted (e.g. Curtis Marez, to the journal American Literature), or being presented at major conferences (e.g. Keeling and Soundararajan). These projects are listed first under "A. Projects Being Published, Submitted, or Presented at Conferences." Of course, several of the projects have entered longer development cycles, but these also seem to hold promise for the participants. These projects are listed second below, under "B. Projects With Longer Development Timelines."

A. Projects Being Published, Submitted, or Presented at Conferences

We followed our work in the four-week summer institute with several reports at the American Studies Association annual conventions. At the 2010 Convention, John Carlos Rowe advertised the Institute at the Digital Caucus and with handouts at the Convention; at the 2011 Convention, he reported on our results to the Digital Caucus. Finally, at this year's 2012 ASA Convention, he discussed long-term implications of the Institute at a panel on Transnational Publishing and another on Digital Education for Graduate Students.

*************

Matt Delmont, Professor of American Studies, Scripps College
http://mattdelmont.com/
Project title: The Nicest Kids in Town

I used Scalar to create a digital companion to my book. I "published" the Scalar project in January 2012.
The Scalar project adds to the book because I was able to include 100+ images (compared to 27 in the book), as well as video clips related to my research. The link is here: http://scalar.usc.edu/nehvectors/nicest-kids

Excerpt from a review citing the Scalar site: Gayle Wald (2012). Review of The Nicest Kids in Town: American Bandstand, Rock 'n' Roll, and the Struggle for Civil Rights in 1950s Philadelphia, by Matthew F. Delmont (Berkeley: University of California Press, 2012), and of The Nicest Kids in Town digital project, http://scalar.usc.edu/nehvectors/nicestkids/index. Journal of the Society for American Music, 6, pp. 489-492. doi:10.1017/S1752196312000399

"Some of the sources that Delmont uses in this regard are available in a free online companion to The Nicest Kids in Town, constructed using innovative Scalar software developed by the Alliance for Networking Visual Culture. Scalar allows content producers to author projects, or "books," that combine text and media, without subordinating the former to the latter. As users navigate the text of the Scalar-based Nicest Kids in Town, images and video clips scroll into view, accompanied by useful links to information about their provenance and content. A "stripe," or index, running down the left-hand side of the page provides the user with an index to the material, making it easy to navigate among the three "paths," in addition to an introduction, which constitute the main body of the project. The dozens of images on the digital Nicest Kids in Town are of far higher quality than the illustrations in the book, with its grainy black-and-white reproductions. Users will also appreciate what amount to visual "footnotes"—images of the newspaper clippings from which Delmont quotes. It's easy to imagine the digital Nicest Kids as a nice tool for helping undergraduates understand the significance and use of primary sources and other archival materials in the production of knowledge."

*************

Kara Keeling and Thenmozhi Soundararajan
Project title: Digital Media and Social Movements

Kara Keeling, Assistant Professor in the Division of Critical Studies in the School of Cinematic Arts and in the Department of American Studies and Ethnicity at the University of Southern California
http://dornsife.usc.edu/ase/people/faculty_display.cfm?Person_ID=1016530
Thenmozhi Soundararajan, Ph.D. student, School of Cinematic Arts, USC

From Third Cinema to Media Justice: Third World Majority and the Promise of Third Cinema is a collaborative multi-media archive and scholarship project consisting of an archive that contains the materials produced by Third World Majority (TWM) during the years of their existence as a collective, and a collection of scholarly pieces, historical retrospectives, and other dialogues with the work of TWM. TWM was one of the first women of color media justice collectives in the United States. It operated from 2001 to 2008. Both the TWM archive and the writings about it are part of the Scalar project. Since the Institute ended, we have successfully uploaded the entire TWM archive to the Internet Archive and begun the process of linking that material to the Scalar anthology. We have also collaboratively produced content for the anthology, started recruiting others to contribute to the volume, and presented about the project on a plenary for the US Cultural Studies Association's annual conference in San Diego.
We are scheduled to present about it at the Allied Media Conference in Detroit, MI and at the Association for Cultural Studies conference in Paris, France, in June and July of 2012. Both presentations provide opportunities to produce additional content for the anthology and to identify and solicit contributors to the volume. We plan to have the archive available in Scalar and to issue invitations to contributors by the end of August 2012.

*************

Elizabeth Cornell
Project title: Keywords Collaboratory

Elizabeth Cornell, Project Coordinator, Keywords Collaboratory, Fordham University
http://www.elizabethfcornell.net/

I worked on a digital version of the book Keywords for American Cultural Studies, edited by Bruce Burgett and Glenn Hendler and published by NYU Press. At the moment, we plan to use Scalar for 30 keyword essays, to be published alongside the second edition of the print version, which will contain 60 other essays. We anticipate publication in the fall of 2013.

*************

Curtis Marez
Project title: Cesar Chavez's Video Library

Curtis Marez, Associate Professor of Ethnic Studies at the University of California, San Diego

My project is called "Cesar Chavez's Video Library, or Farm Workers and the Secret History of New Media." It argues that farm workers have been influential actors in the political history of new media. Farm worker unions, most notably the UFW, have been early and innovative adopters of older forms of "new media," such as portable film and video technology, in ways that illuminate the political limits and possibilities of more recent new media practices among immigrant rights activists. I have continued to work on the project and have recently submitted a multimedia "essay" to the journal American Literature.

*************

David Kim and Mike Rocchio
Project title: Mapping the Murals: Chicano Community Murals in LA

David Kim, Ph.D. student, Department of Information Studies (expected 2012), UCLA
Mike Rocchio, Ph.D. student, Department of Architecture and Urban Design, UCLA

Mapping Chicano Murals in LA is a digital model and simulation of the Estrada Courts public housing in East Los Angeles, which features 60+ community murals installed in the 1970s and 1980s. During the 2011 Summer Institute, we built the beta version of the model in Google SketchUp and integrated the model into the HyperCities platform, which allowed us to combine narratives, archival materials and other resources towards a spatial analysis of race, ethnicity and cultural nationalism as these concepts are embodied in the murals. We have now wrapped up the digital publication version of the project in HyperCities and will be presenting it at the MLA (special session on race in the digital humanities) and at an architecture studies conference. We submitted the online version for peer review to the Cambridge University Press journal Urban History in October 2012.

*************

Nick Shapiro and Brandon Costelloe-Kuehn
Project title: Networking Asthmatic Spaces

Nick Shapiro, graduate student, University of Oxford
http://oxford.academia.edu/NickShapiro
Brandon Costelloe-Kuehn, PhD candidate, Department of Science and Technology Studies, Rensselaer Polytechnic Institute
http://rpi.academia.edu/BrandonCostelloeKuehn

We created a Scalar site that includes a short film, GIS maps and an oral history journey through the experiences of residents in the "FEMA" trailers.
These temporary housing units, originally built to accommodate Gulf Coast residents who were displaced by the hurricanes of 2005, were found to contain potentially toxic chemicals, and have been resold across the United States in tandem with a widening foreclosure crisis. The project enhances users' capacity to 1) visualize connections between environmental, public health and economic crises, 2) move across scales, engaging material that situates them inside the trailers and the lives of residents, and then zooming out to see how hazards at the local level are distributed nationally, and 3) understand how scientifically engaged media can generate new perspectives on complex problems. As these units continue to be sold to every corner of the U.S., we have been interviewing residents with irritated eyes, bloody noses, memory loss, insomnia, diarrhoea and respiratory issues. We have launched an auxiliary study, in collaboration with a private indoor air quality lab, which questions the prevailing scientific consensus that the trailers have off-gassed their store of potentially toxic chemicals in the almost seven years since they were manufactured. To date, the average level of formaldehyde found, across eight states, is over 100 parts per billion, which is the EPA's recommended maximum indoor air concentration. We are currently working on a paper that layers the embodied knowledge of FEMA trailer inhabitants and our numerical data on the ongoing toxicity of these domestic spaces. Building on our work with Scalar, we aim to craft new contexts in which layered claims of toxicity, based on embodied awareness and technologically mediated measurements and visualizations, can be heard, making these trailers and the effects and affects they engender graspable as objects of epistemic action. We are in conversation with the interactive web designer at wired.com to develop a website, and we have lectured internationally on our research in addition to having been featured in internationally syndicated news media.

*************

Oliver Wang
Project title: Legions of Boom: Mobile Sounds, Sights and Sites

Oliver Wang, Assistant Professor, Sociology Department, California State University, Long Beach
http://www.csulb.edu/colleges/cla/departments/sociology/people/OliverWang.htm

My digital project is a research repository focused on the history of the Filipino American mobile disc jockey community in the San Francisco Bay Area. It includes text, audio and visual resources, designed to introduce the social history of this community to both newcomers and those who grew up in the scene. The long-term goal is to create a dynamically updated repository that can include contributions from visitors, thus emphasizing the community aspect of "community history." As I was preparing my tenure file and book revisions (the latter of which relates to the research on the site), I made minimal progress this past year. However, now that I have gained tenure and a fall semester sabbatical, I will complete the site this summer and publicly launch it by early fall (2012).
B. Projects With Longer Development Timelines

Wendy Cheng
Project title: A People's Guide to Los Angeles

Wendy Cheng, Assistant Professor, Asian Pacific American Studies and Justice and Social Inquiry, School of Social Transformation, Arizona State University
https://webapp4.asu.edu/directory/person/1634360

I worked to develop a digital, interactive version of my coauthored book, A People's Guide to Los Angeles, which presents 115 sites of struggles over power and alternative and minority histories throughout Los Angeles County. Although the book has now been published, unfortunately I have not made any progress on the digital project since the end of the institute. We (my coauthors and I, in conversation with UC Press) are currently working to develop an A People's Guide book series and would like to return to the question of a digital, interactive online presence in the future that would serve as a hub for these various projects, but I don't have a sense of when or how that would develop. The institute helped tremendously, however, to identify what the digital project might look like, and what the questions, problems, and needs would likely be in order to realize the project.

*************

Nicholas Sammond
Project title: Biting the Invisible Hand: Blackface Minstrelsy and Animation

Nicholas Sammond, Associate Professor, Cinema Studies Institute and Department of English, University of Toronto
http://www.utoronto.ca/cinema/faculty-sammond.html

My project entailed developing an online companion to my upcoming book, Biting the Invisible Hand: Blackface Minstrelsy and the Industrialization of American Animation (Duke University Press, forthcoming). The companion is not a literal translation of the book, but a complementary resource, one that permits the reader (or stand-alone visitor) to view the cartoons, minstrel ephemera, and other media elements to which the book refers, but which the book cannot deliver in substantial form. To date, the companion, which is only a prototype, has been undergoing beta testing by student workers, with the goal of refining its organization, structure, and flows. With the book slated to be under review by the end of the summer, significant development on the companion will commence from July forward, intensifying in September and October.

*************

Sarah Kanouse and Nicholas Brown
Project title: Recollecting Black Hawk

Sarah Kanouse, Assistant Professor, Intermedia Program, School of Art and Art History
http://www.readysubjects.org/bio.html
Nick Brown, PhD Candidate, Department of Landscape Architecture and American Indian Studies, University of Illinois at Urbana-Champaign
http://walkinginplace.org/
http://criticalspatialpractice.blogspot.com/

Last summer, we set out to work on a digital supplement to the photo-text book Re-collecting Black Hawk. Immersion in the discussion of digital humanities at the summer institute helped us to realize that a spin-off project (related but stand-alone) would be more appropriate than what we initially envisioned. The reconceptualized project has been delayed by the need to finish the print book, but we plan to return to it, using resources at our home campuses, once the publication timeline is firmed up.
*************

Carrie Rentschler
Project title: There/Not There: Witness in Genovese Case

Carrie Rentschler, Associate Professor and William Dawson Scholar of Feminist Media Studies in the Department of Art History and Communication Studies at McGill University
http://www.mcgill.ca/igsf/about/staff

My digital project is an annotated archive of materials that animate the cultural life and case construction of the infamous 1964 Kitty Genovese murder. While the archive constitutes the research materials for a book I am writing, when complete, the digital archive will have significance beyond that publication, and will be of interest to students who are taught the case in high school and university classrooms. The project is not yet complete. It is much as it was at the end of the seminar last summer due to my current administrative responsibilities. I am, however, quite eager to complete it, because of how useful I believe it will be pedagogically and as a small research archive.

*************

Jonathan Sterne
Project title: MP3: The Meaning of a Format

Jonathan Sterne, Associate Professor, Department of Art History and Communications Studies, McGill University
http://media.mcgill.ca/en/jonathan_sterne

My project was to create a digital companion to my book MP3: The Meaning of a Format. I was especially interested in expanding the audio capabilities of Scalar and companion sites like Critical Commons. I've made some progress but am not yet that close to done. I hope to "go live" with something in August when my book comes out. I've replotted the project: the concept was a little foggier last summer and I tried using existing models, like pouring in text. Instead, what I need to do is pick a few core "takeaway" concepts and provide additional illustration and material to the book, especially audio material.

*************

Kara Thompson
Project Title: Colonized Time

Kara Thompson, Assistant Professor of English & American Studies, College of William and Mary
ktthompson@wm.edu

I submitted a project proposal for "Mapping with Reservations," a multimedia, multilayered cartographic representation of the reservation system in the U.S., beginning in 1880 and continuing to the present day. This was clearly a project much too onerous for the time and scope of the fellowship. After some training with Scalar, I created a minimal framework for "Colonized Time," which is a cultural history of the Black Hills from 1850-1890. I try to use the paths and visual orientations of Scalar to show how the Black Hills is a site well known in a dominant tourist imagination, but underneath the venerated, re-enacted "wild west" is the very present colonization of the Lakota people. I have not made any progress on the project since I left the Institute. Immediately following our time at the IML, I began my first tenure-track position, which completely consumed my research and writing time. I do hope to take up the project again and devise a way to use it for either research or teaching. http://scalar.usc.edu/nehvectors/kara-thompson/index. [I notice the HyperCities inserts are not loading properly, at least not on my computer, which is why I am sending a link instead of a screenshot.]

*************

Debra Levine
Project Title: ACT UP Oral History Project

Appendix

Visiting Presenters

Mark Allen is the founder of Machine Project, a non-profit community space in the Echo Park neighborhood of Los Angeles investigating art, technology, natural history, science, music, literature, and food.
In the Machine Project storefront on North Alvarado Street, Allen and his colleagues produce events, workshops and site-specific installations using hands-on engagement to make rarefied knowledge accessible. In his own work, Allen is interested in sculpture and performance, asking how they can affect the viewer in a deep, personal way. How can the viewer be moved from a passive position to a state of engagement and communal experience? Allen has been working with these concerns since graduate school, and his practice has transformed from studio artist to include collaborator, facilitator and producer as he has investigated these questions. Under his direction, Machine functions as a research laboratory, investigating performance, sculpture and installation as lived experience for the viewer.

Anne Balsamo is Professor of Interactive Media in the School of Cinematic Arts, and of Communication in the Annenberg School of Communication and Journalism at the University of Southern California. From 2004-2007, she served as Director of the Institute for Multimedia Literacy. Her work focuses on the relationship between culture and technology. In 2002, she co-founded Onomy Labs, Inc., a Silicon Valley technology design and fabrication company that builds cultural technologies. Her first book, Technologies of the Gendered Body: Reading Cyborg Women (Duke University Press, 1996), investigated the social and cultural implications of emergent bio-technologies. Her new book project, Designing Culture: The Technological Imagination at Work, examines the relationship between cultural reproduction and technological innovation.

Randy Bass is Executive Director of Georgetown's Center for New Designs in Learning and Scholarship, a University-wide center supporting faculty work in new learning and research environments. He is the director of the Visible Knowledge Project (VKP). In conjunction with the VKP, he is also the Director of the American Studies Crossroads Project, an international project on technology and education in affiliation with the American Studies Association, with major funding in the past by the US Department of Education and the Annenberg/CPB Project. In conjunction with the Crossroads Project, Bass is the supervising editor of Engines of Inquiry: A Practical Guide for Using Technology to Teach American Studies, and executive producer of the companion video, Engines of Inquiry: A Video Tour of Learning and Technology in American Culture Studies. He has served as co-leader of the NEH-funded "New Media Classroom Project: Building a National Conversation on Narrative Inquiry and Technology," in conjunction with the American Social History Project/Center for Media and Learning (at the CUNY Graduate Center). He is also the Electronic Resources Editor for the Heath Anthology of American Literature (third edition, Paul Lauter, ed.).

Anne Burdick is Chair of the Media Design Program at the Pasadena Art Center. She is a regular participant in the international dialogue regarding the future of graduate education and research in design. In addition, she designs experimental text projects in diverse media, for which she has garnered recognition, from the prestigious Leipzig Award for book design to I.D. Magazine's Interactive Design Review for her work with interactive texts. Burdick has designed books of literary/media criticism by authors such as Marshall McLuhan and N. Katherine Hayles and she is currently developing electronic corpora with the Austrian Academy of Sciences.
Burdick's writing and design can be found in the Los Angeles Times, Eye Magazine and Electronic Book Review, among others, and her work is held in the permanent collections of both SFMOMA and MoMA. Burdick studied graphic design at both Art Center College of Design and San Diego State University prior to receiving a B.F.A. and M.F.A. in graphic design at California Institute of the Arts.

Sharon Daniel is Professor of Film and Digital Media at the University of California, Santa Cruz, where she teaches classes in digital media theory and practice. Her research involves collaborations with local and online communities, which exploit information and communications technologies as new sites for "public art." Daniel is the co-creator of the Web-based interactive project Public Secrets, which examines the spaces of the prison system through the voices of incarcerated women. The award-winning project exemplifies precise and elegant interface design and the use of an algorithm to generate random "text boxes" that act as metaphors for the project's central thesis.

Kathleen Fitzpatrick is Associate Professor of English and Media Studies and chair of the Media Studies program at Pomona College in Claremont, California. She is the author of The Anxiety of Obsolescence: The American Novel in the Age of Television (Vanderbilt UP, 2006), which was selected as an "Outstanding Academic Title" for 2007 by CHOICE. She serves on the editorial board of the Pearson Custom Introduction to Literature database anthology, as well as of the Journal of e-Media Studies and the Journal of Transformative Works, and is a member of the executive committee of the MLA Discussion Group on Media and Literature. She has recently finished a book-length project, to be published by New York University Press, entitled Planned Obsolescence: Publishing, Technology, and the Future of the Academy. She is a founder of MediaCommons and a frequent blogger, and has recently been appointed the first Director of Scholarly Communication for the MLA.

Gary Hall is a London-based cultural and media theorist working on new media technologies, continental philosophy and cultural studies. He is Professor of Media and Performing Arts in the School of Art and Design at Coventry University, UK. He is the author of Culture in Bits (Continuum, 2002) and Digitize This Book!: The Politics of New Media, or Why We Need Open Access Now (Minnesota UP, 2008) and co-editor of New Cultural Studies: Adventures in Theory (Edinburgh UP, 2006) and Experimenting: Essays with Samuel Weber (Fordham UP, 2007). He is also founding co-editor of the open access journal Culture Machine, director of the cultural studies open access archive CSeARCH, co-founder of the Open Humanities Press and co-editor of the OHP's Culture Machine Liquid Books series. His work has appeared in numerous journals, including Angelaki, Cultural Politics, Cultural Studies, Parallax and The Oxford Literary Review. He is currently developing a series of politico-institutional interventions – recently dubbed 'deconstructions in the public sphere' – which use digital media to creatively perform critical and cultural theory. In 2009/10 he will be a Visiting Fellow at the Centre for Research in the Arts, Humanities and Social Sciences at Cambridge University.

Alexandra Juhasz is Professor of Media Studies at Pitzer College. She makes and studies committed media practices that contribute to political change and individual and community growth.
She is the author of AIDS TV: Identity, Community and Alternative Video (Duke University Press, 1995), Women of Vision: Histories in Feminist Film and Video (University of Minnesota Press, 2001), F is for Phony: Fake Documentary and Truth's Undoing, co-edited with Jesse Lerner (Minnesota, 2005), and Media Praxis: A Radical Web-Site Integrating Theory, Practice and Politics, www.mediapraxis.org. She has published extensively on documentary film and video. Dr. Juhasz is also the producer of educational videotapes on feminist issues from AIDS to teen pregnancy. She recently completed the feature documentaries SCALE: Measuring Might in the Media Age (2008), Video Remains (2005) and Dear Gabe (2003), as well as Women of Vision: 18 Histories in Feminist Film and Video (1998) and the shorts RELEASED: 5 Short Videos about Women and Film (2000) and Naming Prairie (2001), a Sundance Film Festival, 2002, official selection. She is the producer of the feature films The Watermelon Woman (Cheryl Dunye, 1997) and The OWLS (Dunye, 2010). Her current work is on and about YouTube: www.youtube.com/mediapraxisme and www.aljean.wordpress.com.

Marsha Kinder is a cultural theorist and prolific film scholar, whose specializations include narrative theory, digital media, children's media culture, and Spanish cinema. She has published more than 100 essays and 10 books, including Blood Cinema: The Reconstruction of National Identity in Spain with companion CD-ROM (1993), Playing with Power in Movies, Television and Video Games: From Muppet Babies to Teenage Mutant Ninja Turtles (1991), Self and Cinema (1982) and Closeup (1978). Since 1997 Kinder has directed The Labyrinth Project, an art collective and research initiative on interactive cinema and database narrative. Labyrinth has produced a series of award-winning interactive installations and DVDs that have been exhibited at museums, conferences, film festivals and new media festivals worldwide. Kinder has worked for Sega as a rater of violence in video games, has written, directed and produced game prototypes and online courseware projects, and received a number of awards, including the Sundance Online Festival Jury Award for New Narrative Forms, the British Academy of Film & TV Arts award for Best Interactive Project in the Learning Category, and a New Media Invision Award for Best Overall Design.

Caroline Levander is the Vice Provost for Interdisciplinary Initiatives, Carlson Professor in the Humanities, and Professor of English at Rice University. She is currently writing Laying Claim: Imagining Empire on the U.S. Mexico Border (Oxford University Press) and Where Is American Literature? (Wiley-Blackwell's Manifesto Series). Levander has recently co-edited Teaching and Studying the Americas (2010), A Companion to American Literary Studies (2011), and "The Global South and World Disorder," with Walter Mignolo, for The Global South (2011). She is the recipient of grants and fellowships from the Mellon Foundation, the National Endowment for the Humanities, the National Humanities Center, the Huntington Library, and the Institute of Museum and Library Science's National Leadership grant, among other agencies. In addition to co-editing a book series, Imagining the Americas, with Oxford University Press, Levander is author of Cradle of Liberty: Race, the Child and National Belonging from Thomas Jefferson to W.E.B. Du Bois (Duke UP 2006) and Voices of the Nation: Women and Public Speech in Nineteenth-Century American Culture and Literature (Cambridge UP 1998, paperback reprint 2009) and co-editor of Hemispheric American Studies (2008) and The American Child: A Cultural Studies Reader (2003). She is also involved in the Our Americas Archive Partnership (OAAP), a digital archive supported by search tools and teaching materials that provides open access to historical documents on the Americas, which are housed at collaborating institutions.

Ramesh Srinivasan is Assistant Professor of Information Studies with a courtesy appointment in Design|Media Arts. Srinivasan, who holds M.S. and doctoral degrees from the MIT Media Laboratory and Harvard's Design School respectively, has focused his research globally on the development of information systems within the context of culturally-differentiated communities. He is interested in how an information system can function as a cultural artifact, as a repository of knowledge that is commensurable with the ontologies of a community. As a complement, he is also interested in how an information system can engage and re-question the notion of diaspora and how ethnicity and culture function across distance. His research therefore involves engaging communities to serve as the designers, authors, and librarians/archivists of their own information systems. His research has spanned such bounds as Native Americans, Somali refugees, Indian villages, Aboriginal Australia, and Maori New Zealand. He has published widely in scholarly journals.

work_3hxrsoea2bdiln2bg76j552qdi ----

Umanistica Digitale - ISSN:2532-8816 - n.8, 2020
S. Calzati – Digital Autoethnography & Connected Intelligence: Two Qualitative Practice-Based Teaching Methods for the Digital Humanities
DOI: http://doi.org/10.6092/issn.2532-8816/9881

Digital Autoethnography & Connected Intelligence: Two Qualitative Practice-Based Teaching Methods for the Digital Humanities

Stefano Calzati
Tallinn University of Technology, Estonia
stefano.calzati@taltech.ee

Abstract

In higher education we witness a unique conjuncture: on the one hand, students who attend academic courses are the first generation to have fully grown up in a digitalized world; on the other hand, teachers, while having grown up and studied in a still largely analogue world, have witnessed the evolution of today's techno-society since its infancy. By connecting the field of the Digital Humanities with education, this article discusses the conception, design and results of two practice-based teaching experiences aimed at exploring the tensions embedded in our daily use of digital technologies, as well as in today's techno-society as a whole. The first is a "digital autoethnography" developed at the City University of Hong Kong; the second refers to the course "Anthropology of Communication" – co-delivered at Politecnico of Milan – which adopted a "connected intelligence" approach to urge students to reflect on tomorrow's techno-society in a collaborative way. While the first experience was chiefly a self-reflexive study on the impact of social media on the individual, the second mapped the main criticalities of techno-society as a whole, according to seven macro-themes, and asked students to elaborate possible solutions. Both courses considered students as active learners/users, insofar as they are at the forefront of today's digital revolution, but also the subjects most in need of critical tools to face it.
Today, in universities, we are witnessing a unique conjuncture: on the one hand, the students attending academic courses are the first generation to have grown up entirely in a digitalized world; on the other hand, teachers, despite having grown up and studied in a still largely analogue world, have witnessed the evolution of today's techno-society since its infancy. By fostering a dialogue between the Digital Humanities and (academic) teaching, this article discusses the conception, design and results of two practice-based teaching experiences aimed at exploring the tensions implicit in our daily use of digital technologies, as well as in today's techno-society as a whole. The first experience is a "digital autoethnography" developed at the City University of Hong Kong; the second is connected to the course "Anthropology of Communication" – taught at Politecnico of Milan – in which we adopted a "connected intelligence" approach to stimulate students to reflect collaboratively on tomorrow's techno-society. While the first experience was chiefly a self-reflexive study of the impact of social media on the individual, the second mapped the main criticalities of our techno-society, starting from seven macro-themes, in order to elaborate possible solutions. Both courses consider students as active users, since they are at the forefront in the use of new technologies but are also those most in need of a solid critical toolkit to use and develop them at their best.

Introduction

This paper explores, through a qualitative approach, the field of the Digital Humanities – or better, its "outer ring", as Fabio Ciotti put it during his keynote lecture at the 2019 AUCID conference – in connection with education and teaching practices. Notably, it does so by maintaining a critical standpoint (in the broad sense of the term) towards the impact that digital technologies are having on today's teachers and students. To be sure, here "digital technologies" loosely refers both to Web services – such as apps, platforms, social networks (SNSs) – and to hardware devices, especially mobile phones. At the same time, it is, above all, higher education that is at the centre of the present discussion, although it would certainly be useful to promote a debate intersecting teaching, learning and digital technologies which spans all levels of education. As one last addendum, it is necessary to specify that the focus of this article is not on the (impact of the) use of digital technologies within a teaching-learning context. On this topic, the literature is already substantial, involving all levels of education, as well as a variety of subjects ([32]; [5]; [25]; [18]). By contrast, this article makes digital technologies the subject of attention, highlighting the importance of developing new digital tech literacies, able to bring to the surface how digital technologies affect the individual and our daily life. A practice-based orientation towards the teaching of new digital tech literacies – of which we are increasingly in need, given where society is heading – is outlined. Concrete examples concerning two courses developed by the author, in conjunction with colleagues in Italy and abroad, will be provided.
Firstly, the discussion will dwell upon the design and pedagogical goals of these courses; secondly, the research insights coming from these experiences will be presented; lastly, future developments and research lines will be sketched. As a side note, it is important to stress that, although these two courses are reviewed together, their design and objectives are different; as a consequence, their results, as we will see, are not comparable.

Crucial Times in Higher Education

If we take as a starting point for the present discussion the mass diffusion of the Web in the mid-1990s (alongside that of mobile phones, although initially they were not smartphones yet), we realize that by now the generation of digital natives born out of that milieu has reached the stage of undergraduate or postgraduate education. This means that these students have grown up, at least since their first cycle in school, in an increasingly digitalized world, and certainly one in which digital technologies have had a progressively radical impact on daily life. Education, however, has often been reactive, rather than proactive, towards this paradigmatic shift: most of the time, digital technologies have been implemented in the curricula of primary and secondary schools as "mere" tools of support to otherwise unchanged teaching practices, rather than as technologies with unique features to be exploited ([11]). It is only over the last five to ten years that this power relation has been rebalanced, with technology gradually taking the lead in a process of reconceptualization of teaching practices ([29]). This rebalancing, after all, has become a necessity by now, insofar as digital natives represent the pulling force of digital technology's (r)evolution. In fact, they are both the main target (as consumers) of tech companies and services and the main producers of digital content, providing an epitomizing example of what is meant by the term "produsers" (see, among others, [20]; [44]). According to recent statistics ([30]), people aged between 18 and 29 years old are those using smartphones the most – 96% – while the percentage decreases to 79% for 50 to 64 year-old people. Concerning social media, surveys ([31]) show that users from 18 to 29 years old are the most active. If we look at Facebook and Instagram – two of the most popular social media platforms – we see that the percentages are, respectively, 79% and 67%; at the same time, these two figures drop to 68% and 23% among users aged 50-64. On a similar note, it is interesting to remark that user-generated content (UGC) is mainly produced by Millennials, who contribute over 70% of all UGC found online ([40]). In contrast to this picture, today's teachers and scholars are still, by and large, members of earlier generations, i.e. generations that, to various degrees, have transitioned from an analogue to a digital society. In Italy, for instance, the average age of tenure-track professors is 59; the average age of associate professors is 52, while researchers are on average 47 years old ([43]).
This means that academics were largely born and educated within a radically different socio-technological framework from the one we live in today; most importantly, the underpinning pedagogical vision of this framework privileged written words over moving images, syntagmatic step-by-step approaches to knowledge over paradigmatic hypertextual ones, and individual reading and memorization over interactional learning practices ([9]; [16]). Under these conditions, teachers represent a cornerstone within today's education system for reasons that go well beyond mere pedagogical issues and point, rather, to their generational bridging role within the class. Indeed, teachers literally keep one foot in an epoch that predates the digital revolution, while the other foot is now solidly grounded in today's technologized society, which they have seen growing since its birth and of which they can pinpoint, for this very reason, both potentialities and shortcomings. In other words, today's teachers are the public owners of a "knowledge heritage" about technology that is unique – due to demographic circumstances – and which is now crucial to pass on to younger generations, in order for them to become aware of the roots and evolution of that technological shift which, to their eyes, appears as nothing more than a conditio de facto. Restating what is perhaps obvious, although sometimes overlooked – i.e. the encounter in class of two generations that have radically different approaches towards digital technologies – is crucial for highlighting the potential fruitfulness that can spring out of the synergy between today's teachers and students in university. To be sure, this fruitfulness also comes with a responsibility, i.e. the need – now more than ever – to rethink teaching and learning as really interactive and mutually beneficial processes for all actors involved. Students are certainly the subjects who most easily come into contact and familiarize with (new) digital technologies, precisely because these are conceived for them in the first place. And yet, students largely lack – as one of the two teaching experiences discussed below will show – the tools for critically engaging with and using these technologies. Teachers, by contrast, can help students both to put things into perspective – e.g. to investigate the archaeology of new media – and to develop practical and critical skills for de-commoditizing technology and making wiser use of it. At the same time, it is through constant dialogue with the students that teachers can remain abreast of technological innovation, which, by its very nature, tends to reach older generations only when it has already consolidated. In other words, the class becomes a space of negotiation for fruitfully engaging with what Ragnedda ([33]) has called the "second digital divide", meaning by that the competences and skills needed for an effective use of technology (rather than mere access, which is described as the "first digital divide"). Above all, it is important to stress that the class remains the privileged environment where this encounter and exchange can mature at best. This is so because it is only through the collective sharing of the same teaching-learning horizon that knowledge transfer can occur most productively. On this point, studies ([25]) show that blended courses – i.e.
courses that combine in-class and distant, technologically mediated learning – are those leading to the best results for students; and yet, it is only when the in-class component is in the equation that we witness, in fact, an effective knowledge transfer in the long run. The risk with distance learning courses fully conceived as mediated by technology – which is a consequence of the evolution of digital platforms – is to witness what Van Dijck, Poell and de Waal ([45]) call "learnification", that is, the fragmentation and parcelling of the learning process into self-contained units, which ultimately fail to foster effective acquisition. Building on Stephen Krashen's ([22]) distinction between "learning" and "acquisition", it could be said that the units resulting from this process of fragmentation and parcelling tend to be apprehended on a superficial level, rather than acquired in depth, precisely because technology still functions as a barrier or, at best, as a form of mediation of the learning process, from which a shared collective dimension has been subtracted. Here, the distinction made by German philosopher Walter Benjamin ([2]) between two different kinds of experience – "Erfahrung" and "Erlebnis" – might be of help to clarify the point. According to Benjamin, "Erfahrung" is a collective, qualitative experience that leads to forms of shared reflection, knowledge, and understanding across individuals, while "Erlebnis" is a kind of immediate experience that is focused on the moment and is lived through momentarily by the single subject. According to Benjamin, the passage from oral storytelling to written storytelling and further down to the technologized information conveyed by mass media has produced a decay of "Erfahrung" in favour of a blossoming of parcelled and individually lived experiences as Erlebnisse. The latest occurrence along this line – although Benjamin could not foresee it – might well be considered the kind of information and socialising practices fuelled by today's digital technologies. Such premises are crucial to pave the way for the present discussion. In fact, they highlight the need, at all levels of education, to foster collaboratively shaped (teacher-student) new digital tech literacies which consider digital devices not only as tools, but as the subject of a critical reflection to be performed also, but not exclusively, through them, in the context of a broader discussion concerning the individual, technology and society as a whole. In this respect, digital tech literacies are framed within the fields of philosophy of technology (e.g. [14]), critical media studies ([20]) and digital cultures ([39]), and their coming into being has to be regarded more as a synergetic ongoing praxis involving all actors than as a set of guidelines for the understanding of what technology can do for education and pedagogical purposes.

Digital Autoethnography & Connected Intelligence

The theoretical-critical premises outlined above were at the core of two practice-based teaching experiences aimed at exploring, in innovative ways, the tensions embedded in our daily use of social networks and in today's techno-society. Overall, the shared goal of these experiences was to enhance students' and teachers' awareness of the impact that digital technologies can have on the single individual, as well as on society in its entirety, thus stressing the role that digital tech literacies play (and will increasingly play) for all actors involved in the fostering of tomorrow's society, from scholars to students to professionals. In order to do so, such technologies were put at the centre of two academic courses and approached, at once, as subjects and objects of a critical reflection connecting both teachers and students. In this way, the teaching-learning experience really allowed for the emergence of collaboratively built digital tech literacies. As we will see, this approach led, in one case, to reshaping students' attitudes towards social media use; and, in the other case, to the design of projects meant to concretely tackle the tensions implied by today's technologization of society. The following section is dedicated to the description of the conception and design of both experiences, after which a discussion of their results will follow. This will allow us to stress the strengths and weaknesses of both experiences as potentially replicable courses aimed at fostering digital tech literacies at university level.

Digital Autoethnography

The first experience was a course in new digital literacies – titled "Facebook and Autobiography" – that was delivered at the City University of Hong Kong in the fall semester of 2016 by myself together with Prof. Roberto Simanowski. The goal of the course was to
Overall, the shared common goal of these experiences was to enhance students and teachers awareness about the impact that digital technologies can have on the single individual, as well as on society in its entirety, thus stressing the relevance that digital tech literacies play (and will increasingly play) for all actors involved in the fostering of tomorrow’s society, from scholars to students to professionals. In order to do so, such technologies were put at the centre of two academic courses and approached, at once, as subjects and objects of a critical reflection connecting both teachers and students. In this way, the teaching-learning experience really allowed for the emergence of collaboratively built digital tech literacies. As we will see, this approach led, in one case, to reshape students’ attitude towards social media use; and, in the other case, to the design of projects meant to concretely tackle the tensions implied by today’s technologization of society. The following section is dedicated to the description of the conception and design of both experiences, after which a discussion on their results will follow. This will lead to stress strengths and weaknesses of both experiences as potentially replicable courses aimed at fostering digital tech literacies at university level. Digital Autoethnography The first experience was a course in new digital literacies – titled “Facebook and Autobiography” – that was delivered at the City University of Hong Kong in the fall semester of 2016 by myself together with prof. Roberto Simanowski. The goal of the course was to 31 Umanistica Digitale - ISSN:2532-8816 - n.8, 2020 explore the practices of self-representation on social networks (i.e. Facebook and Instagram 1) and assess how these differ from traditional forms of written self-representation, such as traditional diaries. Most importantly, we aimed to do so by putting the students enrolled in the course at the centre of the analysis and the learning experience so that they could capitalise on it in terms of critical insights about technological subjugation. Overall, 38 students were involved. Methodologically speaking, beyond the delivery of typical lectures focused on autobiographic writing across old and new media (e.g. [15]; [21]; [27]; [38]), we elaborated a “digital autoethnography”, defined as the study of “the discourses that emerge at the intersection of online/offline and the offline context through which the online worlds are entered” ([36]). In fact, the digital autoethnography consisted of a double-sided analysis. On the one hand, as researchers, we entered Facebook and Instagram via the creation of profiles that students befriended, on a voluntary basis, in order for us to monitor their activities over a period of five weeks; on the other hand, we instructed our participants to self- reflect upon their SNSs use through a number of assignments, whose assessment was part of the final evaluation for the course. The assignments were designed, together with three other colleagues from Germany, during a workshop held at the University of Wuppertal in July 2016 (thus prior to the beginning of the semester). To begin with, students were asked to answer a first round of questions aimed at providing us with a general understanding of their use of Facebook. Questions were: a) “Why do you use Facebook?”; b) “To what extent would you say that your profile reflect yourself?”; c) “What is a diary for you?”; d) “Does Facebook work as a diary for you? 
(Why or why not)”; e) “What do you look up on Facebook?” Students had to write down the answers and they could elaborate on them as freely as they wished. Secondly, they had to parse all of their Facebook posts over five weeks and tag them by using a set of previously elaborated tags. We developed four categories of tags which respectively referred to a) the type of posts’ content; b) the authorial stance responsible for the posts and its relation to the user’s self-representation; c) the mood of the posts; d) if/how posts had a time- related connotation or interrelation with other posts on the user’s Timeline. Specifically, the first category was inspired by Roman Jakobson’s ([19]) communication functions. Participants were instructed that posts would have a “referential function” whenever the Facebook’s user, or one of his/her friends, geolocated themselves or tagged other friends; posts (and comments) bore an “emotive function” when they overtly expressed the user’s emotion or state of mind; posts (and comments) had a “phatic function” when they were meant to simply keep in contact with friends (this function comprised emoticons, bare expressions of agreement/disagreement, likes and similar reactions). The second category of tags moved along the Self-Other axis: we asked participants, on the one hand, to identify if they had published the post themselves (“self-authored”), or if this activity had been outsourced (“other-authored”, further disentangled as “shared by user”, “shared by other friends”, or “frictionless sharing” by external apps); on the other hand, we wanted to know whether the content of the post directly referred 1 Initially, our focus was solely on Facebook, as this is the most widely used social network. Then, through in-class debates, we realized the necessity to also include Instagram into the picture, as this social network is increasingly popular especially among young adults. 32 S. Calzati – Digital Autoethnography & Connected Intelligence: Two Qualitative Practice- Based Teaching Methods for the Digital Humanities to the user (“self-related”) or to a different topic/issue (“other-related”, such as news, commercials, entertaining content, etc.). The third category addressed the mood of the posts: “euphoric” (positive content), “dysphoric” (negative content), or “neutral”. Under the fourth category fell those tags that dealt with time. In fact, we were interested in exploring if/how posts connected to each other along one’s Timeline, as well as in those occurrences where a single post contained a “small story” within itself. From here we defined three tags: “temporal”, which signalled the centrality of time (either as a single moment or a duration) with respect to the action/event described in one or several posts (e.g. journeys, anniversaries, timeframe of the semester, etc.); “hermeneutic”, in which posts (or comments) displayed an effective process of understanding among users (it is the case of posts and comments that contain questions and answers); “cause-effect”, when posts on the Timeline were linked by a clear cause-effect relation (e.g. when a post is published as a critique or in support of precedent posts or comments). Since, in practice, these tags overlap and can be co-present, we instructed students that each post could well be labelled with more than one tag belonging to the same category. 
Thirdly, alongside the tagging of the posts – which we also observed and captured with screenshots – we asked the participants to keep a written diary in which they jotted down, on a weekly basis, reflections about their SNS diet and all their activities on Facebook and Instagram, from posting, to sharing and liking, to commenting. The goal, in this regard, was to let participants digest their daily SNS use and prompt a "distanced" reflection – via the traditional act of writing – which could trigger a retrospective assessment of their SNS activities. Lastly, because the befriending of our avatars was on a voluntary basis, a distinction was made. Those students who did befriend our avatars – thus allowing us to closely monitor their activity – had to answer, at the end of the five weeks, a second round of questions tailored to their specific Timelines and aimed at understanding the underlying reasons for their SNS posting. The students who opted to take part in the experience without revealing their own profiles to us were required to write a final essay reflecting upon the whole experience of having kept a written diary alongside their daily SNS use. Eventually, two groups were constituted: group A (16 students) submitted a diary, the tagging, and a final essay; group B (22 students) – those who befriended us – submitted a diary, the tagging, and answers to a second round of questions at the end of the five-week survey period. By comparing the insights derived from our monitoring of the SNSs with the assignments of the students, it was possible to better understand how participants represent themselves on SNSs (indeed, a fragmented representation across different media platforms, as we will see) and, most importantly, to sharpen the students' awareness concerning their online self-projection, which is, in fact, an almost unperceived drowning rather than a conscious and controlled exposure. In this respect, the experience did bring to the surface the embedded tensions involved in the technological subjugation to which individuals are exposed when using SNSs.

Connected Intelligence

The second teaching experience refers to the course "Anthropology of Communication", which I co-delivered during the 2018 fall semester together with Prof. Derrick de Kerckhove at the Politecnico di Milano. By addressing seven macro-themes – ethics, education, ecology, politics, economy, urbanism, and technology – the course aimed to map the status of today's techno-society and to provide students with new critical insights and tools for consciously reflecting upon its evolution, in order eventually to elaborate possible alternatives. To do so, the 54 students enrolled in the course (curriculum in "Design of Communication") were put at the centre of the learning experience, in the awareness that they are, indeed, the pulling force of today's techno-society and the designers of tomorrow's. In fact, the overarching goal of the course was to have students conceive and design a technologically sustainable village, intended as a community space – of the dimensions of a neighbourhood or a small city – in which technology is at once tool and framework of the citizens' daily life, following Martin Heidegger's ([17]) well-known idea that technology is instrumental to the individual but always, inevitably, also "enframing" him/her.
By “technologically sustainable” we meant a village in which technologically delivered services were free to access, fair in their algorithmic functioning (i.e. unbiased, see discourses on algorithmic and data justice: e.g. [10]; [26]; [42]; [34]) and respectful of privacy (e.g. private data ownership, or also the possibility of withdrawing from the use of technology, without losing access to services and rights; see discussion on the ethical boundaries of the digitalized society, e.g. [14]; [13]). More concretely, the course was inspired by the then recent news2 of Google’s goal to plan and build a fully smart neighbourhood in an area of the city of Toronto. Given the corporate-driven conditions behind Google’s project – which led to harsh critiques from various actors both within and outside of the project3 – our course also came to have a meta-political relevance, especially with regard to the repurposing of technology as a public utility. Hence, while we, as teachers, provided students with recent evidence of what has been called “surveillance capitalism” ( [46]), the project aimed – at a broader level – to confront students with the need to rethink the relation between technology, individuals and collectivity, renewing the debate on what it means to acquire (and put to use) tech competences. Methodologically, apart from traditional lectures focused on various topics revolving around critical data studies ([6]), digital cultures ([7]; [12]; [39]), transhumanism ([37]) and digital methods (Rogers 2013), as teachers we provided the conceptual framework of the course, which took the form of a wiki cloud of 54 keywords (Figure 1) that helped students navigate today’s techno-society (examples of keywords are: “participatory democracy”, “datacracy”, “digital twins”, “deep learning”, “smart city”, “social credit”, “transparency”, “net neutrality”, “big data”, “algorithmethics”). For each keyword a short definition was given, together with a couple of references for further exploring the concept, as well as links to its most closely related keywords. Beyond this initial setting, we relied upon a “connected intelligence” approach for the development of the course, which meant to leave students autonomously manage their 2 https://www.citylab.com/solutions/2019/06/alphabet-sidewalk-labs-toronto-quayside-smartcity- google/592453/ 3 https://www.bbc.com/news/technology-47815344 34 https://www.bbc.com/news/technology-47815344 https://www.citylab.com/solutions/2019/06/alphabet-sidewalk-labs-toronto-quayside-smartcity-google/592453/ https://www.citylab.com/solutions/2019/06/alphabet-sidewalk-labs-toronto-quayside-smartcity-google/592453/ S. Calzati – Digital Autoethnography & Connected Intelligence: Two Qualitative Practice- Based Teaching Methods for the Digital Humanities research work (albeit supervised). Such approach was articulated on three different and interconnected levels. Initially, we asked students to pick one keyword and research upon it by expanding its definition and the reference list. In so doing, each student became an expert of his/her own keyword. Subsequently, students gathered in groups of three to four (coordinated by us in order to avoid the formation of too big groups) depending on the similarities among the owned keywords and the researches conducted individually. 
Subsequently, students gathered in groups of three to four (coordinated by us in order to avoid the formation of overly large groups) depending on the similarities among the keywords they owned and the research they had conducted individually. As part of this second stage we asked students, in groups, to deliver weekly in-class presentations highlighting the interrelations across their three or four keywords, according to one of the seven macro-themes identified at the beginning of the course. So, for instance, we had the students focusing on "big data", "algorithm" and "blockchain" working together under the macro-theme of technology, insofar as their individual research had led them to explore the technical/operative side of those keywords. Thirdly, over the last three weeks of the course, students clustered in seven bigger groups composed of six to nine members, always following the affiliation of their keywords to one of the macro-themes. This enlarged grouping allowed a cross-fertilization of ideas based upon the research conducted up to that stage. In fact, the principle at the basis of the "connected intelligence" approach – differently from Pierre Lévy's ([23]) idea of "intelligence collective" – is to favour innovation through collaboration and sharing. As a matter of fact, "connected intelligence" is neither "owned" by single individuals, nor is it simply the sum of the links connecting them; rather, it is the outcome/surplus that derives from such rhizomatic connectivity ([8]). Eventually, the objective for each macro-group was to elaborate a project that either outlined the conception of a product or service addressing one key issue of its macro-theme – e.g. the unwilling circulation of private data on digital platforms, tackled by the technology group – or defined a manual of good practices for the design/use of technology (as was the case with the ethics macro-group, whose work inevitably remained on a more conceptual level). In so doing, students not only became more aware of the criticalities of technological innovation, but also learnt to think collaboratively in view of possible solutions for making tomorrow's techno-society (more) sustainable.

Figure 1: The wiki cloud of keywords. The font size was randomly assigned.

Results

In the following section the main findings of the two teaching experiences are presented and discussed in light of the debate on the use of digital technologies in/for today's techno-society, as well as in connection with an assessment of the results achieved in enhancing digital tech literacies and tech-related awareness in both students and teachers. In the last part of the article the main limitations of these experiences and possible future developments will be addressed.

Digital Autoethnography

The teaching experience in Hong Kong provided valuable insights in two respects (see also [4]): 1) how young adults choose and use SNSs for self-representation; 2) the (often surreptitious) impact that SNSs have on users' cognitive and social self-perception. Concerning the first point, in her work on small life stories on Facebook, Ruth Page ([28]: 410) notes that, despite the fact that updates are "self-contained units rather than the bricks of an ongoing narrative", it is still possible for readers to "fill the gaps" between statuses and reconstruct small stories about the user. By contrast, we highlighted an increasing difficulty in identifying a coherent self-representation of the user on Facebook. One main explanation for our diverging conclusion likely has to do with the renewed medial and technological affordances of the platform.
When Page conducted her study, Facebook had not yet introduced the Timeline; now, as also noted by McNeill ([24]) a few years later, the platform has completely changed its design and, consequently, the use its users make of it. First of all, we witnessed a rather limited frequency of posting. We recorded a total of 378 posts, which means an average of 17.1 posts/user over five weeks, i.e. a bare 3.4 posts/user per week (in line with the tendency of young adults to shift towards the use of Instagram and also Snapchat). Moreover, the majority of these posts (8.1 posts/user) were tagged as "phatic", thus reasserting the primary function of "keeping in contact" rather than providing insightful information about one's own life (6.8 posts/user were tagged as "referential", 5.2 as "time-related", and 4.8 posts/user as "emotive"). The phatic dimension of Facebook communication appears even more vividly in relation to comments: out of a total of 678 comments recorded, 420 simply consisted of emoticons or phatic expressions. Secondly, we noted the tendency to post or share content that was "other-related" and "other-authored" – such as news, entertaining videos, or advertisements – rather than "self-related" and "self-authored", that is, produced by the users and directly pertaining to their lives. Taken individually, the tag "posted by user" is in fact the one that recurs slightly more often than the other two: 6.5 posts/user against 5.5 ("shared by others") and 4.8 ("shared by user"). And yet, as soon as we add up all the posts that are not authored by the Timeline's owner ("shared by user" and "shared by others"), they amount to almost two thirds of the total. This means that, for the greatest part, the Timelines of our participants are already an outsourced projection of them; one that produces a sort of depersonification of their representation and perception (it is in this respect that Franco Berardi ([3]: 21) warns against the reduction, brought about by technology, of the uniqueness of the subject to "a set of components, or a format"). In fact, the gradual withdrawal of users from Facebook – which, according to the majority (31) of our participants, is considered, more radically than Instagram and Snapchat, a public space rather than a diary – is, at once, cause and effect of the platform's shift from being user-focused to functioning as a news aggregator (with all the related issues concerning the control of fake news and publishers' copyrights). This finding can also be derived from the replies of our students to the first round of questions. In particular, to the question "What do you look up on Facebook?" 85% of the students (32) responded "news", among whom twenty-six coupled "news" with "entertaining stuff", highlighting the extent to which "hard news" and "entertaining content" are perceived as overlapping. Moreover, from the participants' essays and replies to the second round of questions we realized that for our students Facebook constitutes just one platform within a broader SNS diet. Overall, our participants claimed to post on Facebook only very relevant life events or episodes of public interest, delegating the bulk of social interactions to other SNSs, namely Instagram and Snapchat.
More precisely, Snapchat is where users tend to be more authentic and unreflexive, Facebook is where they choose to present a strongly and positively crafted self, and Instagram works as an in-between, semi-private form of photographic diary. In the words of one student: "I use Snapchat almost on a daily basis whereas my Instagram posts depends on when I go out […] so I would post at least once or twice every week in Instagram, whereas I have almost stopped using Facebook". This means that, across the three platforms, there is a quantitative narrowing down as well as qualitative discrepancies concerning what is being posted. Hence, if we are to look for coherent self-representations, we need to conceive of a comprehensive approach to SNSs, in that "to fill the gaps" has become a matter of collation among different platforms. Concerning the impact that SNSs have on users' cognitive and social self-perception, by collating our monitoring of the posting with the participants' diaries, we realized that users: 1) often share materials and reply to comments uncritically (i.e. without really checking the content of the posts shared or commented on); 2) by and large forget what they have liked/shared after a few days. These phenomena are symptomatic of broader tensions affecting the relation between users and social media. An example of the first kind can be found in a video, shared by a female participant, in which a woman jokingly pretends to be against public breastfeeding. The irony of the video is quite evident, in that the woman's supposed puritanism is contrasted with images showing the fetishization of the female body, which goes well beyond the exposure of breasts. What is significant is that one of the user's friends did not perceive the irony of the video at all and commented disappointedly on the post. When asked to elaborate on that, the student said: "I suppose my friend didn't reflect enough when watching the video and concluded that the woman in the video was serious". It seems, then, that not only did the user's friend not interpret the "hidden" ironic meaning of the video, but she also felt the urge to intervene, without much consideration. On the other hand, an example of the process of forgetfulness triggered by social media is particularly acute with regard to liking. In fact, this act remained largely untracked by the majority of participants in their written diaries. Prompted by our question, one student reported that "out of my expectations, when checking my activity log I discovered that I liked an overwhelming number of 639 posts in five weeks!" The main problem is that, while users tend to quickly forget what they liked, it is not so for the platform, which tracks and remembers everything they do on it. More broadly, technology's erosion – through the unreflexive actions it promotes – of the (human) ability to remember opens the way to what Benjamin ([1]) defined precisely as an "impoverishment of experience" (as "Erfahrung") brought about by technology: as soon as users are led, by technology itself, to act mechanically, their acts are deprived of a shared value and turned into solipsistic "Erlebnisse" (in this regard, Bernard Stiegler ([41]: 78) talks of a "mercantile production" of memory).
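For reference, the aggregate figures reported in this subsection can be recomputed directly from the reported totals. The following sketch assumes that the 378 posts refer to the 22 monitored users of group B over the five-week window, which is what the reported averages imply:

total_posts, users, weeks = 378, 22, 5
posts_per_user = total_posts / users      # ~17.2 posts/user over five weeks (reported as 17.1)
posts_per_week = posts_per_user / weeks   # ~3.4 posts/user per week

phatic_comments, total_comments = 420, 678
phatic_share = phatic_comments / total_comments   # ~0.62: most comments are purely phatic

# Authorship: posts not authored by the Timeline's owner dominate.
posted_by_user, shared_by_others, shared_by_user = 6.5, 5.5, 4.8   # posts/user
outsourced = (shared_by_others + shared_by_user) / (posted_by_user + shared_by_others + shared_by_user)
print(f"{posts_per_user:.1f} posts/user, {posts_per_week:.1f} per week; "
      f"phatic comments: {phatic_share:.0%}; outsourced posts: {outsourced:.0%}")
# outsourced evaluates to ~61%, i.e. "almost two thirds" in the article's rounding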
The experience in Hong Kong allowed us (as teachers/researchers) as well as the students (as the main actors of the experience) to bring these tendencies to the surface, in order to become fully aware of them via direct experience. The most relevant result was achieved when such consciousness triggered a counter-action in the way of using SNSs. We were glad, for instance, to witness one student discussing how the keeping of a written diary affected his reflection about potential upcoming posts: "I can't deny that keeping a written diary affected my posting: I became more aware of things or moments around me and I wondered whether I would really like to share them with others." Another participant confessed: "over the five weeks of logs, I changed some of my views toward my use of Facebook and other SNSs. I have always thought that I kept a very low profile on social media. However, after this self-tracking, I found that I don't keep at all a low profile." These testimonies clearly attest to the synergies between the online and offline realms and to the distanced (more aware) self-perception that the autoethnography triggered with regard to online modes of self-representation.

Connected Intelligence

The teaching experience in Milan led to the elaboration of seven macro-projects (thematically clustered), which were presented and discussed in class during the last week of the semester. What is most interesting to note is the interconnection among the various projects, a result of the "connected intelligence" approach adopted throughout the course. This is particularly evident in the final project of the technology group. Since this group focused on the "stuff" itself – i.e. technology – which also traversed all the other macro-groups, its members decided to conceive a tech space on which the other projects might converge. More specifically, the technology group designed a mockup platform, named Village Technology Service (VTS), which addressed the issues of data security and data privacy by allowing citizens/users to re-appropriate the data they create through their multiple interactions with digital services and devices. In the words of the students, the platform "compiles the history of all our data transactions, allowing each citizen to easily manage her own data". In so doing, this project came to be connected in particular to the ethics and education macro-groups, although the politics and ecology groups were also involved. Concerning the ethics project, the group drafted a chart discussing the pros and cons of a more transparent society, based upon the open circulation of and access to data made public either necessarily by services and companies, or voluntarily by users/citizens. Offering a critique of China's top-down social credit system, the group advanced a collective assessment of services through publicly relevant data, leading to forms of rewards and/or disincentives for both services and users (so that social responsibility is double-sided). Concerning education, the group focused its attention on the ethical and practical implications of the emergence of digital twins (see https://cmte.ieee.org/futuredirections/2019/07/07/digital-twins-where-we-are-where-we-go-vii/), that is, the datafied doppelganger of the individual, made up of the collection of all his/her (so far dispersed) data.
In their own words, the group explored what it means to have "a digital twin serving the individual as a personal assistant and a digital face in society" and how to think of and frame its coming into being (i.e. through which data and under which conditions of liability). As for the politics group, their project elaborated on the concept of "epistemocracy", i.e. the idea that processes of democratic/participative decision-making, especially on local matters, should be based upon the acquisition of prior knowledge. To advance this idea the group designed an e-governance online service for promoting the direct participation, collaboration and voting of citizens on a number of proposals. In the cases envisioned by the group, before voting online citizens are required to pass a test focused on the debated proposal and meant to assess the citizen/user's knowledge of its key tenets (and s/he can only attempt the test twice). Subsequently, we have two projects that offer concrete examples of a more sustainable circular economy based on e-services. The ecology group conceived the mockup of an app called "Veg-eat-ables" for launching fair practices in the production and consumption of local food. In this spirit, vegetables are grown in collectively managed gardens – at the level of streets or neighbourhoods (see also the urbanism group) – their consumption is meant for self-subsistence and, if needed, the app puts members of different neighbourhoods in contact for the recirculation of leftovers (which can be given in exchange for other small social services; see the economy group). The app also contains a section with information on how to preserve food, limit packaging, and recycle organic and inorganic waste. Strictly connected with the ecology group is the economy group, which came up with a platform for the sharing of (voluntary) services based on the logic of time banking. Time banking is, indeed, a grassroots way of trading – close to bartering – where the currency is actually time. The economy group, then, collaborated closely with the ecology group to implement a sustainable model intersecting the working hours spent in the collectively managed gardens with the possibility of receiving food (or getting lower utility bills; see also the urbanism group) by accumulating a "time capital" for the social work provided. Last, the project of the urbanism group addressed three layers – building, mapping and mobility – which were deeply interwoven and eventually described via a few renderings at the micro level (e.g. single houses and streets), meso level (e.g. neighbourhoods, social spaces, natural areas) and macro level (the whole village). Linking the three layers was an environmentally sustainable mobility plan, which also included tech connectivity and free WiFi. In more detail, the group planned house building using both renewable materials and 3D printing, with houses designed to be energetically sustainable (reducing, if not zeroing, utility bills); the topographical organisation of the village included modularly planned streets and neighbourhoods hosting canals, green areas, public spaces and buildings, as well as info points describing the overall conception of the village; mobility was based on pedestrian zones, bike sharing and electric car sharing fuelled by renewable energies (obtained by allotting communal areas for solar panels and windmills).
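To make the logic of the politics group's "epistemocracy" service concrete, the following is a minimal, hypothetical sketch of the gating rule described above: a citizen can vote on a proposal only after passing a knowledge test focused on it, with at most two attempts. All names and the pass mark are our own illustrative assumptions, not part of the students' mockup.

MAX_ATTEMPTS = 2
PASS_MARK = 0.6   # illustrative threshold; the students' project does not specify one

attempts = {}     # (citizen, proposal) -> number of test attempts used
eligible = set()  # (citizen, proposal) pairs cleared to vote

def take_test(citizen, proposal, score):
    """Record a test attempt; unlock voting if the score reaches the pass mark."""
    key = (citizen, proposal)
    if attempts.get(key, 0) >= MAX_ATTEMPTS:
        return False                      # no attempts left
    attempts[key] = attempts.get(key, 0) + 1
    if score >= PASS_MARK:
        eligible.add(key)
    return key in eligible

def cast_vote(citizen, proposal, choice):
    """Accept a vote only from citizens who passed the test on that proposal."""
    if (citizen, proposal) not in eligible:
        return "rejected: knowledge test not passed"
    return f"recorded: {citizen} voted {choice!r} on {proposal!r}"

take_test("ada", "bike-lane-proposal", score=0.7)   # passes on the first attempt
print(cast_vote("ada", "bike-lane-proposal", "yes"))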
Overall, the projects presented different levels of accuracy and depth. And yet, they all relied upon existing technologies for the conception, design and implementation of the proposed solutions, thus stressing the importance of creating synergies across sometimes distant areas (such as politics and technology, or economy and ecology) in innovative, tech-based ways. Most importantly, by building upon their practical skills as designers, students demonstrated that they were able to cut through the critical discourses surrounding digital technologies and techno-society in order to pragmatically address (if not solve) some of the most relevant issues connected to them.

Limitations and Further Developments

The experience in Hong Kong brought to all actors involved a more "distanced" perception of the use of SNSs (and of how these, in turn, surreptitiously impact the user's life). This is certainly significant in light of the goal of fostering critical awareness as far as our daily social media diet is concerned. However, given the small cohort of participants and its socio-demographic uniformity (all Asian students between 18 and 22 years old), the findings would require further testing to be confirmed. No doubt, the research would greatly benefit from the replication of the digital autoethnography in a different cultural context (e.g. Europe) and with the involvement of a larger and more varied group of participants. The experience in Milan, for its part, represented a very proactive approach towards the status of today's techno-society, the unveiling of its shortcomings and potentialities, and the reflection upon the possible more sustainable directions it can take. In fact, students had the chance not only to discuss cutting-edge issues related to the pervasiveness of technology in our society, but also to be at the forefront of innovative solutions to be conceived and implemented collaboratively. And yet, the breadth of the macro-themes likely constituted a major limitation on the students' effective elaboration of their final projects, especially given the short span of time they had. In this respect, the possibility of linking this course to a second one, either in the same semester or in a subsequent one, might represent a viable option for finalizing sounder projects. This would also be enhanced by connecting students in DH with peers studying computer science: the collaboration would certainly favour the conception of more feasible projects, possibly ready to be presented and implemented at the municipal level.

References

[1] Benjamin, Walter. (1933) 1999. "Experience and Poverty." In Walter Benjamin: Selected Writings, edited by Marcus Bullock and Michael W. Jennings, vol. 2.2, 731-736. Cambridge: Harvard University Press.
[2] Benjamin, Walter. (1938) 2002. "The Storyteller: Observations on the Works of Nikolai Leskov." In Walter Benjamin: Selected Writings, edited by Marcus Bullock and Michael W. Jennings, vol. 3, 143-166. Cambridge: Harvard University Press.
[3] Berardi, Franco. 2015. And: Phenomenology of the End. Cambridge: MIT Press.
[4] Calzati, Stefano, and Roberto Simanowski. 2018.
“Self-Narratives on Social Networks: Trans-Platforms Stories and Facebook’s Metamorphosis into a Postmodern Semi-Automated Repository.” Biography 41 (1): 24-47. [5] Cradler, John, Mary McNabb, Molly Freeman, and Richard Burchett. 2002. “How Does Technology influence Student Learning?” Learning and Leading 29(8): 46–49. [6] Dalton, Creig, Linnet Taylor, and Jim Thatcher. 2016. “Critical Data Studies: A Dialogue on Data and Space.” Big Data & Society 3 (1): 1-5. https://doi.org/10.1177/2053951716648346 [7] De Kerckhove, Derrick. 1995. The Skin of Culture: Investigating the New Electronic Reality. Toronto: Somerville Publisher. [8] De Kerckhove, Derrick. 1997. Connected Intelligence: The Arrival of the Web Society. Toronto: Somerville Publisher. [9] De Kerckhove, Derrick. 2009. Dall’alfabeto a Internet. L’homme «littéré»: alfabetizzazione, cultura, tecnologia. Milano: Mimesis Edizioni. [10] Dencik, Lina, Arne Hintz , and Jonathan Cable. 2016. “Towards Data Justice? The Ambiguity of Anti-Surveillance Resistance in Political Activism.” Big Data & Society 3(2): https://doi.org/ 10.1177/2053951716679678 [11] Farinelli, Fiorenza. 2010. “Competenze e opinioni degli insegnanti sull’introduzione delle TIC nella scuola italiana.” Programma Education – Fondazione Agnelli. Accessed September 1, 2019. https://www.fondazioneagnelli.it/wpcontent/uploads/2017/08/F._Farinelli__Compete nze _e_opinioni_degli_insegnanti_sull_introduzione_delle_TIC_nella_scuola_italiana_- _FGA_WP29.pdf [12] Finn, Ed. 2017. What Algorithms Want. Cambridge: MIT Press. [13] Floridi, Luciano. 2015. The Ethics of Information. Oxford: Oxford University Press. [14] Floridi, Luciano, ed. 2013. The Onlife Manifesto. London: Springer. [15] Freeman, Mark. 2011. “Stories Big and Small: Toward a Synthesis.” Theory & Psychology 21 (1): 114–21. [16] Hayles, Katherine. 2012. How We Think: Digital Media and Contemporary Technogenesis. Chicago: University of Chicago Press. [17] Heidegger, Martin. 1977. The Question Concerning Technology and Other Essays. Translated by William Lovitt. New York: Harper Books. [18] IEEE. (2018). “What Education Would Be Like in 2050?” Accessed February 13, 41 https://www.fondazioneagnelli.it/wpcontent/uploads/2017/08/F._Farinelli__Competenze%20_e_opinioni_degli_insegnanti_sull_introduzione_delle_TIC_nella_scuola_italiana_-_FGA_WP29.pdf https://www.fondazioneagnelli.it/wpcontent/uploads/2017/08/F._Farinelli__Competenze%20_e_opinioni_degli_insegnanti_sull_introduzione_delle_TIC_nella_scuola_italiana_-_FGA_WP29.pdf https://www.fondazioneagnelli.it/wpcontent/uploads/2017/08/F._Farinelli__Competenze%20_e_opinioni_degli_insegnanti_sull_introduzione_delle_TIC_nella_scuola_italiana_-_FGA_WP29.pdf https://doi.org/%2010.1177/2053951716679678 https://doi.org/10.1177%2F2053951716648346 Umanistica Digitale - ISSN:2532-8816 - n.8, 2020 2020: https://cmte.ieee.org/futuredirections/2018/02/23/what-would-education-be- like-in-2050-gig-economy/ [19] Jakobson, Roman. 1960. “Closing Statements: Linguistics and Poetics.” In Style in Language, edited by Thomas A. Sebeok. Cambridge: MIT Press. [20] Jenkins, Henry, Sam Ford, and Joshua Green. 2013. Spreadable Media: Creating Value and Meaning in a Networked Culture. New York: New York University Press. [21] Kennedy, Helen. 2014. “Beyond Anonymity, or Future Directions for Internet Identity Research.” In Identity Technologies: Constructing the Self Online, edited by Anna Poletti and Julie Rak, 25-41. Madison: University of Wisconsin Press. [22] Krashen, Stephen. 1981. 
Second Language Acquisition and Second Language Learning. Oxford: Pergamon.
[23] Lévy, Pierre. 1994. L'intelligence collective. Pour une anthropologie du cyberspace. Paris: La Découverte.
[24] McNeill, Laurie. 2012. "There Is No 'I' in Network: Social Networking Sites and Posthuman Auto/Biography." Biography 35 (1): 65-82.
[25] Means, Barbara, Yukie Toyama, Robert Murphy, Marianne Bakia, and Karla Jones. 2009. "Evaluation of Evidence-Based Practices in Online Learning: A Meta-Analysis and Review of Online Learning Studies." US Department of Education. Accessed July 31, 2019. https://www2.ed.gov/rschstat/eval/tech/evidence-based-practices/finalreport.pdf
[26] Metcalf, Jacob, and Kate Crawford. 2016. "Where Are Human Subjects in Big Data Research? The Emerging Ethics Divide." Big Data & Society 3 (1). https://doi.org/10.1177/2053951716650211
[27] Morrison, Aimée. 2014. "Facebook and Coaxed Affordances." In Identity Technologies: Constructing the Self Online, edited by Anna Poletti and Julie Rak, 112-131. Madison: University of Wisconsin Press.
[28] Page, Ruth. 2010. "Re-Examining Narrativity: Small Stories in Status Updates." Text and Talk 30 (4): 423-444.
[29] Petrucco, Corrado, and Valentina Grion. 2015. "Insegnanti in formazione e integrazione delle tecnologie in classe: Futuri docenti ancora poco social?" Qwerty 10 (2): 30-45.
[30] Pew Research Center. 2019a. "Mobile Fact Sheet." Pew Research Center Internet & Technology. Accessed September 1, 2019. https://www.pewinternet.org/fact-sheet/mobile/
[31] Pew Research Center. 2019b. "Social Media Fact Sheet." Pew Research Center Internet & Technology. Accessed September 1, 2019. https://www.pewinternet.org/fact-sheet/social-media/
[32] Pierson, M. E. 2001. "Technology Integration Practice as a Function of Pedagogical Expertise." Journal of Research on Computing in Education 33: 413-430.
[33] Ragnedda, Massimo. 2019. "Conceptualising the Digital Divide." In Mapping the Digital Divide in Africa, edited by Bruce Mutsvairo and Massimo Ragnedda, 27-44. Amsterdam: Amsterdam University Press.
[34] Rieder, Gernot, and Judith Simon. 2016. "Datatrust: Or, the Political Quest for Numerical Evidence and the Epistemologies of Big Data." Big Data & Society 3 (1). https://doi.org/10.1177/2053951716649398
[35] Rogers, Richard. 2013. Digital Methods. Cambridge: MIT Press.
[36] Rybas, Natalia, and Radhika Gajjala. 2007. "Developing Cyberethnographic Research Methods for Understanding Digitally Mediated Identities." Forum: Qualitative Social Research 8 (3): n.p.
[37] Saracco, Roberto. 2018. Transhumanism. IEEE – Future Directions Committee, Symbiotic Autonomous Systems Initiative. Ebook. Accessed September 1, 2019. https://digitalreality.ieee.org/images/files/pdf/transhumanism.pdf
[38] Sauter, Theresa. 2014.
“‘What’s on Your Mind?’: Writing on Facebook as a Tool for Self Formation.” New Media & Society 16 (5): 823–39. [39] Simanowski, Roberto. 2016. Digital Humanities and Digital Media: Conversations on Politics, Culture, Aesthetics and Literacy. London: Open University Press. [40] Stackla. 2019. “43 Statistics About User-Generated Content You Need to Know.” Accessed September 1, 2019. https://stackla.com/resources/blog/42-statistics-about- user-generated-content-you-need-to-know/ [41] Stiegler, Bernard. 2010. “Memory.” In Critical Terms for Media, edited by W. J. T. Mitchell and Mark Hansen, 64-87. Chicago: University of Chicago Press. [42] Taylor, Linnet. 2015. “No Place to Hide? The Ethics and Analytics of Tracking Mobility Using Mobile Phone Data.” Environment and Planning D: Society and Space 34 (2): 319-336. [43] Università. 2018. “Università, in calo professori e ricercatori. Quasi 5mila in meno in 7 anni.” Accessed September 1, 2019. https://www.universita.it/universita-calo- professori-ricercatori/ [44] Van Dijck, José. 2013. The Culture of Connectivity: A Critical History of Social Media. Oxford: Oxford University Press. [45] Van Dijck, José, Thomas Poell, and Martijn de Waal. 2018. The Platform Society: Public Values in a Connective World. New York: Oxford University Press. [46] Zuboff, Shoshana. 2018. The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. New York: Public Affairs. 43 https://www.universita.it/universita-calo-professori-ricercatori/ https://www.universita.it/universita-calo-professori-ricercatori/ https://stackla.com/resources/blog/42-statistics-about-user-generated-content-you-need-to-know/ https://stackla.com/resources/blog/42-statistics-about-user-generated-content-you-need-to-know/ https://digitalreality.ieee.org/images/files/pdf/transhumanism.pdf https://doi.org/%2010.1177/2053951716649398 Abstract Introduction Crucial Times in Higher Education Digital Autoethnography & Connected Intelligence Digital Autoethnography Connected Intelligence Results Digital Autoethnography Connected Intelligence Limitations and Further Developments References work_3hxzu73yiva45pzyuwqvvxu5ki ---- Anarchive as technique in the Media Archaeology Lab | building a one Laptop Per Child mesh network RESEARCH ARTICLE Anarchive as technique in the Media Archaeology Lab | building a one Laptop Per Child mesh network libi striegl1 & Lori Emerson2 Published online: 1 April 2019 # Springer Nature Switzerland AG 2019 Abstract The Media Archaeology Lab (MAL) at the University of Colorado at Boulder (U.S.A.) acts as both an archive and a site for what the authors describe as ‘anarchival’ practice- based research and research creation. ‘Anarchival’ indicates research and creative activity enacted as a complement to an existing, stable archive. In researching the One Laptop Per Child Initiative, by way of a donation of XO laptops, the MAL has devised a modular process which could be used by other research groups to investigate the gap between the intended use and the affordances of any given piece of technology. Keywords Archive.Anarchive.Meshnetwork.Medialab.Mediaarchaeology.Practice- based research . Research creation 1 Introduction What follows in part one is an overview of the philosophy and holdings of the Media Archaeology Lab (MAL), based at the University of Colorado at Boulder (U.S.A), along with a summary of its key ongoing activities. 
We discuss the MAL in terms of conventional notions of the archive and more anarchic notions of the anarchive as developed by Siegfried Zielinski, the University of Toronto's WalkingLab, and Concordia University's SenseLab. Part two of our article focuses exclusively on our One Laptop Per Child (OLPC) XO Mesh Network Project and the four-part set of guidelines we have developed as a result of this project and as a way of documenting our anarchival process. Though our anarchival process is a living document and thus subject to change and revision, our guidelines serve as an initial point of consideration for future case studies. By approaching the OLPC XO collection anarchivally, we suggest a novel approach to assessment and to knowledge creation related to this specific technology, while also suggesting how these guidelines might be developed to approach other technologies.

2 The Media Archaeology Lab as archive and anarchive

In his essay 'AnArcheology for AnArchives: Why Do We Need—Especially for the Arts—A Complementary Concept to the Archive?' Siegfried Zielinski, who is sometimes aligned with a somewhat softer practice or a mode of thinking called media archaeology, eloquently clarifies what a classic archive is: channeling Michel Foucault, the archive is, in short, 'the externalization of historical consciousness, thereby documenting a consciousness fundamentally tied to power. The utterances, objects, and artefacts produced by artists and thinkers closely involved with the arts are liable to end up in these archives. Once this happens, archivists, librarians, and curators transform heterogeneous objects into structures to whom they are and will remain profoundly alien' (116). The 'anarchive,' however, is, Zielinski posits, 'a complementary opposite and hence an effective alternative to archive... Following a logic of plurality and wealth of variants, they are particularly suited to handle events and movements; that is, time-based sensations. Just as the anarcheological sees itself first and foremost as an activity, anarchives are principally in an active mode' (121). While the foregoing has helped us think through how to handle experiments with still-functioning but obsolete networks in the MAL, Zielinski also asserts that artists and researchers like us need both archives and anarchives:

archives that collect, select, preserve, restore, and sort in accordance with the logic of a (dispositive) whole, and the autonomous, resistant, continually reactivated anarchives geared toward individual needs and work methods. It is the utopia, the non-place, which in an ongoing process reshapes and reinterprets the materials from which memories are made. Anarchives necessarily challenge, indeed provoke, the archive: otherwise, they would be devoid of meaning. Caring for anarchives may help prevent the many idiosyncratically designed particular collections from changing into a rule-bound administrative apparatus.
It may even enable us to celebrate the past as a regained present. (122)

Thus, since 2009, when we founded the Media Archaeology Lab, the lab has become known as both an orderly and an unruly place. On the one hand, the MAL's extensive collection of still-functioning media from the late nineteenth century through the twenty-first century has been carefully accessioned and catalogued, and we have also created disk images of all our valuable pieces of early digital art and literature. If you visit the lab, you will be greeted by roughly one hundred and thirty years' worth of media to turn on, play with, open up, create with, move around, and juxtapose with other media. Our oldest media objects include a camera from 1880, a collection of early twentieth-century magic lanterns, and an Edison diamond disc phonograph player from 1912. Our more recent media include desktops, laptops, luggables, portables, and game consoles from the mid-1970s through the early 2000s. We also have a collection of printed matter and software from the 1950s through the 2000s. Highlights of the collection include: a 1976 Altair 8800b; a 1981 desktop computer from Sweden; a 1984 Vectrex game console; a 1986 desktop from East Germany; and a rare 1987 'advanced work processor' called the Canon Cat computer. On the other hand, the MAL changes from year to year, depending on who is in the lab and what donations have arrived at our doorstep, and thus it undoes many assumptions about what archives as well as labs should be or do. As a testament to the flexibility and open-endedness of the MAL, in the last three years (coinciding with the opening of the new Intermedia Arts, Writing, and Performance PhD program at CU Boulder) the lab's vitality has grown substantially because of the role of three PhD students affiliated with the program. These students have been invited to develop their own unique career trajectories in and through the lab. One student, who wishes to obtain an academic position after graduation, has created a hands-on archive of scanners in conjunction with a dissertation chapter, soon to be published as an article, on the connections between the technical affordances of scanners and online digital archives. Another student, who wishes to obtain a curatorship after graduation, founded an event series called MALfunctions, which pairs nationally and internationally recognized artists with critics on topics related to the MAL collection; this student also arranges residencies at the lab for these visiting artists/critics who, in turn, generate technical reports on their time spent in the MAL; furthermore, as a result of her work with this event series, she has been invited to serve as a curator for an annual media arts festival at the Boulder Museum of Contemporary Art. Finally, another student (and a coauthor of this essay), who wishes to pursue a career in alternative pedagogical practice outside of higher education, has started a monthly retro games night for members of the CU Boulder community; she is also running monthly workshops teaching students and members of the public how to fix vintage computers and game consoles and the basics of surveillance and privacy. Thus, unlike archives or labs that are structured hierarchically and driven by a single person with a single vision, the MAL takes many shapes.
It is, as we write above, an archive for original works of early digital art/literature and their original platforms; it is also an apparatus through which we come to understand a complex history of media and the consequences of that history; it is a site for artistic interventions, experiments, and projects; it is a flexible, fluid space where students and faculty from a range of disciplines can undertake practice-based research; it is a space where graduate students come for hands-on training in fields ranging from digital humanities, literary studies, media studies, and curatorial studies to community outreach and education. In other words, the MAL is an intervention in the notions of 'archive' and 'lab' insofar as it is a place where, depending on your approach, you will find opportunities for research and teaching in myriad configurations and a host of other, less clearly defined activities made possible by a collection that is both object and tool. The MAL has also evolved into a 'real life' and virtual community enterprise: it has an international advisory board of scholars, archivists, and entrepreneurs; faculty fellows from across the CU Boulder campus; and a regularly rotating cohort of undergraduate interns, graduate research assistants, post-graduate affiliates, and volunteers from the general public who help with class tours and guest visits to the lab. We also host media studies reading groups, artist residencies, an event series called MALfunctions, retro game nights, and workshops on how to fix old or new devices and even on how to build mesh networks, as we discuss in part two. The more the MAL becomes a communal enterprise, the more it also appears open and accessible to all kinds of people who themselves may have no background in programming or tinkering or making or building, but who understand that we are increasingly compelled to have some understanding of how our everyday technologies work and how we might build alternatives. The objects in the MAL demonstrate how determinisms (ideological and otherwise) are built into technologies of the past, and they do this partly as a result of hands-on interactions with them and partly as a result of experience with the ways in which objects in the lab depart from our present-day expectations. They show how technological determinisms are historical, and therefore changeable, according to the values and concerns we develop. What follows is a description of one particular archival/anarchival project on which we have been working in the MAL since 2017. By presenting this project, we wish not only to describe thoroughly one possible activity one might undertake archivally/anarchivally in the lab, but also to explore the ways in which such an activity has the potential to guide other hands-on experiments with obsolete technology. In other words, while we are suggesting a novel approach to assessment and knowledge creation related to this specific technology, we hope this can serve as a model to approach other technologies within and beyond the MAL.

3 Archiving and anarchiving the MAL's collection of OLPCs

In early 2017, the MAL received a donation of twenty OLPC XO laptops, opening up an avenue for hands-on research into and critical consideration of the history, implementation, and outcomes of the OLPC project.
The OLPC initiative was founded in 2005 by then MIT Media Lab Director Nicholas Negroponte with the following mission: 'to create educational opportunities for the world's poorest children by providing each child with a rugged, low-cost, low-power, connected laptop with content and software designed for collaborative, joyful, self-empowered learning' [OLPC n.d.-e]. The goal was to design and manufacture laptops which could be sold en masse to governments or Non-Governmental Organizations involved in educational programs for $100, approximately 1/10th the cost of the average laptop at the time. The project was originally funded by member organizations including eBay, Red Hat, Quanta, and Google [OLPC n.d.-g]. As the project continued, the price never actually dropped to $100, and the initiative faced backlash from one-time member Intel as well as a dramatic drop in overall funding. The OLPC project was thus immediately polarizing. Positive responses came from within the tech industry, evidenced by the support of the member organizations willing to fund the initiative. Several governments, including Uruguay, Rwanda, and Peru, also responded positively with a willingness to sign up for the laptop distribution program. Negative responses came from the tech industry and from diplomats and leaders from countries in the target market. Marthe Dansokho of Cameroon was quoted at the 2005 World Summit on the Information Society held in Tunisia as saying, 'What is needed is clean water and real schools.' At the same Summit, Mohammed Diop of Mali stated, 'It is a very clever marketing tool. Under the guise of non-profitability hundreds of millions of these laptops will be flogged off to our governments' [Smith]. Bill Gates was skeptical of the project when it was proposed at Davos [Olson], and Lee Felsenstein, in a blog post written shortly after the initiative was founded, noted that 'By marketing the idea to governments and large corporations, the OLPC project adopts a top-down structure. So far as can be seen, no studies are being done among the target user populations to verify the concepts of the hardware, software and cultural constructs' [Felsenstein]. Criticism notwithstanding, beginning in 2007, laptops were distributed in 42 countries [OLPC n.d.-a]. Philosophically, the OLPC project was based on an educational foundation derived from the work of Seymour Papert. 'Constructionism' was the name of the philosophy Papert developed around principles of student-centered, active learning; Papert's philosophy, in turn, was based on the work of Jean Piaget and his notions of Constructivist ontology. The active, discovery-based, unstructured learning process advocated by Papert formed the core of the hands-off methods central to the OLPC initiative. This hands-off method underpins the belief that the XO laptops, through their careful design, can be handed to children in any situation and that the children will simply figure out how the devices work, progressing via self-guided learning without the aid of a teacher. Negroponte also took inspiration from Sugata Mitra's 'Hole-in-the-wall' project, which called for learning with no or minimal interaction from an instructor [Venkatraman]; Mitra's project inspired Negroponte to pursue implementation plans which included possible helicopter drops of XO laptops in remote locations [OLPC News].
Even though the OLPC project is effectively over, research on the overall effectiveness of the initiative is ongoing. Most studies have so far suggested that the success of the implementation depends on whether devices are properly integrated into classrooms, whether there is appropriate teacher education with regard to laptop use and pedagogical deployment, and whether there is general enthusiasm around the project in the target community. Thus, since these devices have what one might call 'contextual baggage' as part of their associated global education project, their presence in the broader MAL collection has special significance as a clear illustration of the nature of top-down technological solutions to global problems. By providing opportunities for active exploration, the MAL opens up all devices in its collection to a consideration of their complexity through an investigation of their affordances. As we point out above, by design the MAL is both an archive housing these and other devices and a lab for experimental work and knowledge creation. It provides space for archiving but also for moving beyond and through the archive (Fig. 1). At first glance, the XO collection exists as a static set of objects - nothing more than a pile of plastic and electronics in bright and ostensibly friendly colors sitting in a corner of the lab. In other words, these devices fit neatly into the least generous definition of an archive as a collection of things in their original state which are usually only considered in terms of their place in whatever has been deemed 'history.' The same is true of any object in the MAL: if they are not activated, they are lifeless. Furthermore, while the XO collection could be used to illustrate the laptops' original intended use, thereby demonstrating their capacity within that sphere (for example, as an educational tool for children in underserved communities and developing countries), if the collection of XO laptops is only activated in this way it runs the risk of simply replicating the outcomes for which the OLPC project's initial implementation has been criticized. The archival impulse in this scenario is necessarily backwards-looking, where any attempts to reframe or reimagine the devices are bound to be purely abstract in the sense that they would merely serve the original intentions and even the ideological purpose of the OLPC project. Thus, the MAL's research on the XO laptops is intended to move beyond simple situational replication and into active critique. In other words, we are constantly seeking ways to activate our collection which will enable us to examine the hardware and software independently of their original associations. By reflecting on the relationship of the technology to its broader socio-political context, we are able both to provide space for critique and to create pathways for future action. While the impulse is normally to dismiss a piece of technology entirely when the broader project of which it is part is unsuccessful or problematic, we are suggesting that the alternative is to reframe the technology in terms of its real potential and to address how its associated project fell short of this potential. The challenge is to accept the technology for what it can do and compare these capabilities with what the technology was designed to do.
3.1 The Anarchive and the counter-archival impulse

Our desire to do something new with the laptops, as opposed to preserving them in place, was, again, a decidedly counter-archival one. Preserving in place, in the case of the XO laptops, seemed to give implicit approval to the OLPC project as a whole without offering any space to negotiate and understand the project's successes and failures, both ideologically and technologically. Instead of figuratively and metaphorically placing the history of these devices on a shelf, ready to be abstracted and transported into a conventional narrative, we wanted to take the opportunity to confront the project's history by experimenting with the devices' aforementioned real potential activated via their functionalities; for example, the XO laptop is particularly well suited to low and/or variable power consumption and mesh networking. Also, by expressing the capabilities of the hardware and software within the scope of what they seem well suited for rather than within the scope of what they were intended for, we hope to find a perspective from which accurately to critique a project like OLPC, which had both complex intentions and outcomes.

Fig. 1 A part of the MAL OLPC XO collection in its inactive state. The collection contains an additional 6 computers and accessories

3.2 Inspiration

We want to be clear that our OLPC project is not intended to be scientific and it refuses a prescriptive methodology in favor of offering guidelines which we hope can be adapted to the circumstances of any particular research group. By proposing an explicitly open-ended and modular anarchival process for the lab, we are suggesting a way of channeling the counter-archival impulse. For our purposes, we are revising the University of Toronto WalkingLab's definition of the anarchive as 'an activity that resists mere documentation and interpretation in favour of affective and material processes of production, where archival "technicities" create new compositions and new nodes of research' [WalkingLab]. Combining this with the notions expressed by Zielinski, we are declaring that the anarchive is deliberate activity which resists collection, documentation, and abstraction in favor of affective, concrete knowledge production wherein the archive is activated in order to create new directions for critique and research.

We have also adapted a description from SenseLab (based at Concordia University in Montréal, Canada) of their anarchiving process as a basis from which to construct our own process of approaching the XO laptops. SenseLab's definition begins with the following assertions:

1. The anarchive is best defined for the purposes of the Immediations project as a repertory of traces of collaborative research-creation events. The traces are not inert, but are carriers of potential. They are reactivatable, and their reactivation helps trigger a new event which continues the creative process from which they came, but in a new iteration.

2. Thus the anarchive is not documentation of a past activity. Rather, it is a feed-forward mechanism for lines of creative process, under continuing variation.

And the authors continue, concluding with the following:

7. Approached anarchivally, the product of research-creation is process. The anarchive is a technique for making research-creation a process-making engine. Many products are produced, but they are not the product.
They are the visible indexing of the process's repeated taking-effect: they embody its traces (thus bringing us full circle to point 1). [SenseLab]

Once again, the aim of this project is to construct a range of situations and interactions with the OLPC XOs which take advantage of the particular innate qualities of the OLPC XO hardware in order to imagine alternative potential uses. Iterating different versions of these interactions/situations will, ideally, generate a course of action for an anarchiving process which extends beyond the OLPC project. With the formation of a process which is fundamentally both iterative and generative, we hope that future projects carried out in the lab will build upon this anarchiving framework.

3.3 Direction

The questions we have sought to answer over the course of the project are: what relationship does a technology have to its intended deployment? Does everything need to be preserved in its original state? Can the process of engaging with the technology be preserved without the content? And, is it critically productive to interrogate the functional reality of an object and re-deploy it for new (potentially better suited) ends? In answering these questions within the framework proposed, the goal is both to examine this particular example of technology and also to create a framework for a process-based examination of technologies which can be transferred to other devices held in similar archives.

The questions we have sought to answer require both that we examine the technology itself and work outwards; they also require that we examine the history and context of the technology within the context of the project for which it was designed and deployed. As such, initially we surveyed the technical specs of the hardware and software package that make up the XO. This survey was undertaken both by examining the laptops physically and interacting with them as a user, and also by accessing various online documentation about the device. This was a necessary step in order to understand the value of the device as it stands, rather than resting on the assumptions gleaned from press coverage as well as personal and academic accounts. In addition, we researched the OLPC initiative as it was first conceived, the immediate and ongoing critical response to the project, and the outcomes thus far from its various implementations. The OLPC project is at present largely defunct, but the XO laptops are still in use in several countries, including Uruguay and Ethiopia, and they are still being deployed in these contexts by various non-governmental entities and other organizations with an educational mission. The research we conducted has largely been through reports available publicly, though some anecdotal information has been collected as well, including the origins of the MAL collection. The MAL collection was donated by one of the entities referenced above, a church mission group which purchased the devices for use in Ethiopia. This contextual research was necessary to understand the archive from which we were drawing, through which we were moving, and from which we were exiting.

From these initial explorations, we were able to derive the first two guidelines for our anarchival process:

1. Become familiar with the context of the archived technology and understand the intended manifestation of the technology, both of which are necessary in order to move beyond them.

2. Understand the technology itself.
Conduct hands-on research in order to determine the technology's capabilities and failings. This might take the form of using the device as a primary computer for an extended period of time. Conduct hands-off research as a supplement to this process, especially if there is something one cannot learn by hands-on use. This might take the form of reading manuals or other documentation. Document what is discovered during hands-on versus hands-off research.

3.4 Learning the technology

In terms of software, the XO laptops were designed with the idea that the user should not require fluency in any particular computer language, nor even any previous experience with computers. Surprisingly, in our own informal studies conducted in and around the MAL, we found the foregoing generally holds true. We gave XOs to people with varying degrees of computer literacy and they were all able to navigate the basic functions of the device within a very short period of time. Text documents, camera, and games are all readily findable within a few minutes of opening the laptop. In fact, we observed that our testers' familiarity with the standards of computer layout and interface normalized by Windows and Apple ecosystems was a hindrance, as these users were forced to overcome their own presuppositions about interface design in order to familiarize themselves with the laptops. However, we also noted in our users' interactions that there were some pieces of information about the devices not easily obtained by interaction alone, and this required investigation in the secondary documentation available online. Because open source is one of the tenets of the OLPC project, the documentation of hardware and software specifications is extensive (Fig. 2).

Fig. 2 The Sugar interface in 'Neighborhood' mode. The small 'person' icons indicate nearby XO laptops with open connections. The solid circles indicate available internet connections, the solid circles with parentheses around them indicate a connection in progress, the circles with a line across them indicate internet connections requiring a password, and the concentric circles indicate mesh networking channels

In terms of hardware, the XO laptops were also designed to be easy and intuitive to navigate, durable, and connectable. The devices also require little power and have variable power consumption which is directly tied to the software activities and the hardware use. Devices have a wifi module which, in conjunction with the Sugar OS, is tailored towards transmission. In addition, devices which are in low power mode or even powered off can still be used as transmitters, all of which makes the laptops ideal for mesh networking.

Broadly speaking, a mesh network is a dynamic, nodal networking model that exists as a complement to the traditional single-access-point network familiar to most internet users. More specifically, a mesh network is an ad hoc, node-to-node network connection whereby each node provides and accesses a signal, as opposed to a direct connection with a single signal source. The XO laptops were designed for mesh networking in order to amplify an internet signal in areas with minimal connectivity, and the devices could use a single access point (ethernet line, satellite phone connection, landline) to provide internet access to many devices.
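To make the node-to-node relay model concrete, the following minimal Python sketch simulates how a single uplink can serve an entire mesh; the topology and node names are hypothetical, and the code illustrates the general principle rather than the XO's actual networking stack.

```python
from collections import deque

# Hypothetical mesh: each laptop relays to the laptops it can reach by radio.
# Only "gateway" has a direct internet uplink (e.g. an ethernet line).
reachable = {
    "gateway": ["xo1", "xo2"],
    "xo1": ["gateway", "xo3"],
    "xo2": ["gateway", "xo3"],
    "xo3": ["xo1", "xo2", "xo4"],
    "xo4": ["xo3"],  # reachable only through xo3
}

def nodes_with_internet(gateway: str) -> dict:
    """Breadth-first flood from the gateway; every reached node gains
    connectivity and records how many relay hops its traffic takes."""
    hops = {gateway: 0}
    queue = deque([gateway])
    while queue:
        node = queue.popleft()
        for neighbor in reachable[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops

print(nodes_with_internet("gateway"))
# {'gateway': 0, 'xo1': 1, 'xo2': 1, 'xo3': 2, 'xo4': 3}
```

In a traditional single-access-point model, xo3 and xo4 would simply be offline; in the mesh, intermediate laptops carry their traffic, which is why devices that can keep transmitting even in low-power mode make such a network resilient.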
The XO laptops can also connect to one another wirelessly even when no internet signal is available, to facilitate data sharing on a local network (Fig. 3).

Fig. 3 Available networks list showing the olpc-mesh AdHoc network available for connection on a MacBook Pro

3.5 Active process

The anarchive pertains to events rather than objects. It extends outside of the archive and exists in addition to it. Thus, for our purposes, we defined a third step in our process as follows:

3. Determine a path by which you can escape the archive using the technology you are activating. Follow that path to its logical conclusion and re-trace it with variations if necessary, for as long as necessary. This escape and activation might range from playful interactions to more rigorous hardware or software hacking. The active process is the anarchive. The anarchive may have byproducts (including documentation), but these byproducts are not the anarchive. Document whatever seems appropriate. Save the residue of all experiments wherever appropriate.

Based on the information gathered during our survey of the physical capabilities of the XO laptops, we are in the process of creating a cross-campus mesh network using solely these devices, which act as communication nodes and transfer points. In the case of our OLPC Mesh Network project, the event or active process is the individual instantiations of network connectivity, and the interactions made possible therein are the byproducts. When we treat the XO laptops as the objects in this case, it becomes clear they are ideal facilitation devices for the networking event rather than containers for it. If a network is not initiated and a place for connectivity is not created, the devices return to the archive and are no longer participants in the anarchival process. They can, however, always be reengaged for new instantiations of the network.

We have so far formed active networks in both formal and informal sessions. Each network activation builds upon previous activations. Thus, while our first session involved two devices and two individuals working simultaneously in a shared text document, by the fifth session, the laptops had been 'abandoned' in a building with their network active so that users would interact with the network as they encountered it. Every instance of the network has involved a shared text document because this has proven the best method for provoking and promoting interactions that will leave evidence of the network's existence. However, just as the nature of the network is ephemeral and ad hoc, the evidence of the network is fleeting. If it is not preserved at regular intervals during the event, it is possible for a participant to erase all traces of their participation or all traces of any participation. In this case, the anarchive carries out its function as an external energetic force around the archive, escaping the function of the archive as a located, documented, and catalogued entity. In other words, the counter-archival impulse expressed earlier is continued through these anarchival activities, wherein preservation neither is the goal nor is desirable. Given our research thus far, we have decided that future network events should be broader and should take place outside rather than inside various campus buildings.
Exploiting the maximum range of the XO's wireless unit and taking advantage of the unit's ruggedness, we plan to conduct two network events: one involving facilitators guiding interactions with the network, and one involving laptops that are simply left for passersby to investigate. These two modes of interaction are based directly upon the contextual research on the intended implementation of the laptops and the studies on post-implementation success of the project. From our plans for these next network activations, we have devised a final step for our anarchiving process:

4. Resolve the archive and the anarchive, if possible. Use the results of the events to address the context of the archive. This might take the form of repeating the activities with the original intended user base for the technology.

We have not yet reached the point of resolving the archive and the anarchive. In spite of this, the anarchival process is in constant conversation with the archive and with the contextual information surrounding the physical manifestation of the archive. Without consideration of the history held within the archive, the next path for anarchival activity would not be so clear.

4 Conclusions

Our research into the OLPC remains incomplete, as our anarchival activity around the OLPC Mesh Network Project is ongoing. Though our four-step anarchival process is a living document and as such is subject to change and revision, our guidelines serve as an initial point of consideration for future case studies. The anarchival process presented serves as an extension of the archival function served by the Media Archaeology Lab and a supplementary activity which enlivens the archive as it stands. By approaching the OLPC XO collection anarchivally, we are suggesting a novel approach to assessment and knowledge creation related to this specific technology. We are also developing a model for approaching other technologies within or beyond the MAL.

References

Bender, W. (2012). Learning to change the world: The social impact of one laptop per child. New York: Palgrave Macmillan.

Felsenstein, L. (2005, November 10). Problems with the $100 Laptop. The Fonly Institute. Retrieved from http://www.fonly.typepad.com/fonlyblog/2005/11/problems_with_t.html.

McArthur, V. (2009, September). Communication Technologies and Cultural Identity: A Critical Discussion of ICTs for Development. Paper presented at the IEEE Toronto International Conference: Science and Technology for Humanity, 2009, Toronto, Canada, 910-914. https://doi.org/10.1109/TIC-STH.2009.5444367.

Media Archaeology Lab (n.d.). Retrieved from http://mediaarchaeologylab.com. [Last accessed 21 February 2018].

OLPC (n.d.-a). About The Project > Countries. laptop.org. Retrieved from http://laptop.org/about/countries.

OLPC (n.d.-b). Laptop. Retrieved from http://laptop.org/en/laptop/index.shtml.

OLPC (n.d.-c). Laptop Hardware > Specs. Retrieved from http://laptop.org/en/laptop/hardware/specs.shtml.

OLPC (n.d.-d). Mission. Retrieved from http://laptop.org/en/vision/mission/index.shtml.

OLPC (n.d.-e). Vision. Retrieved from http://laptop.org/en/vision/index.shtml.

OLPC (n.d.-f). Vision > History. Retrieved from http://laptop.org/en/vision/project/index.shtml.

OLPC (n.d.-g). Vision > Mission > FAQ. Retrieved from http://laptop.org/en/vision/mission/faq.shtml.

Olson, P. (2006, March 16). Gates Pours Water on $100 Laptop. Forbes. Retrieved from https://www.forbes.com/2006/03/16/gates-laptop-microsoft-cx_po_0316autofacescan06.html#42282ce43e38.
SenseLab (n.d.). Anarchive - Concise Definition. Retrieved from http://senselab.ca/wp2/immediations/anarchiving/.

Smith, S. (2005, December 1). The $100 laptop - is it a wind-up? CNN. Retrieved from http://edition.cnn.com/2005/WORLD/africa/12/01/laptop/.

Venkatraman, V. (2011, December 7). I want to give poor children computers and walk away. New Scientist. Retrieved from https://www.newscientist.com/article/mg21228425.500-i-want-to-give-poor-children-computers-and-walk-away.

Vota, W. (2006, November 17). An implementation miracle. OLPC News. Retrieved from http://www.olpcnews.com/implementation/plan/implementation_miracle.html.

WalkingLab (n.d.). Walking Anarchive. Retrieved from https://walkinglab.org/portfolio/walking-anarchive/.

Zielinski, S. (2015). AnArcheology for AnArchives: Why do we need - especially for the arts - a complementary concept to the archive? (Winthrop-Young, G., trans.). Journal of Contemporary Archaeology, 2(1), 1-147. https://doi.org/10.1558/jca.v2i1.27134.

work_3mandpkqi5gtxkwlowj2uubdy4 ----

White Paper Report. Report ID: 107229. Application Number: HK-50022-12. Project Director: Nancy Maron (nancy.maron@ithaka.org). Institution: Ithaka Harbors, Inc. Reporting Period: 9/1/2012-3/31/2014. Report Due: 6/30/2014. Date Submitted: 6/30/2014.

Sustaining the Digital Humanities: Lessons Learned (NEH white paper)

Nancy L.
Maron and Sarah Pickle, Ithaka S+R, June 30, 2014

Introduction

Ithaka S+R recently completed a study, with generous funding from the National Endowment for the Humanities' Office of Digital Humanities, that explored the different models colleges and universities have adopted to support digital humanities (DH) outputs on their campuses. The final report, entitled Sustaining the Digital Humanities: Host Institution Support beyond the Start-Up Phase, and the accompanying Sustainability Implementation Toolkit, are intended to guide faculty, campus administrators, librarians, and directors of support units as they seek solutions for coordinating long-term support for digital humanities resources at their institutions. By exploring both the assumptions and practices that govern host support, from the grant stage to the post-launch period, we hoped to gain a clearer understanding of the systems currently in place and to identify examples of good practice.

Over the course of this study, Ithaka S+R interviewed more than 125 stakeholders and faculty project leaders at colleges and universities within the US. These interviews included a deep-dive phase of exploration focused on support for the digital humanities at four campuses: Columbia University, Brown University, Indiana University Bloomington, and University of Wisconsin-Madison. This research helped us to better understand how institutions are navigating issues related to the sustainability of DH resources and what successful strategies are emerging.

Research for this study began in October 2012 and involved two stages:

• Phase I, Sector-Wide Research: Interviews and desk research with stakeholders at a variety of higher education institutions (public and private, teaching- and research-focused, large universities and small liberal arts colleges) provided an overview of the practices and expectations of digital humanities project leaders, funders, and their university administrators, as well as the challenges and successes they have encountered along the way.

• Phase II, Deep-Dive Research: More extensive analysis of four institutions that have created and managed several of their own digital projects allowed us to develop a map of the full scope of their activities, the value they offer to the host university, and the dynamics that drive decision making around the role the university plays in supporting them.

Unlike many other recipients of Digital Implementation Grants who are developing digital tools and online resources, the primary deliverable for this grant is a white paper to share findings from our work. We refer our readers to that paper, Sustaining the Digital Humanities: Host Institution Support beyond the Start-Up Phase, for the most comprehensive discussion of methodology and lessons learned. In this paper, we are pleased to have the opportunity to reflect further on the project as a project, and to consider its challenges and impacts.
• The URL for the final report is: http://www.sr.ithaka.org/sites/default/files/SR_Supporting_Digital_Humanities_20140618f.pdf

• The URL for the Sustainability Implementation Toolkit is: http://www.sr.ithaka.org/research-publications/sustainability-implementation-toolkit

Lessons learned and changes in course

During the course of the study we chose to modify the methodology due to both a sharpening of the focus on institutional models and an awareness of the difficulty in collecting reliable financial data. This shift resulted in our conducting more case profiles and more interviews, but in collecting less financial data than first planned.

• Landscape focus on campus profiles. Our initial plan for our landscape review was to interview 20-25 individuals at institutions across the United States, in faculty, administration, and department head roles. As we sharpened our focus on institutional strategies, we decided to use the landscape phase of our research to create profiles of a dozen campuses. Rather than interviewing individuals specifically by job role, we chose campuses to profile and then sought key individuals on those campuses.

• Expanded from two deep dives to four. We conducted four deep profiles, instead of two, as originally planned. This afforded us a greater understanding of both the common and the unique challenges faced by universities in this area, making it possible for us to describe in our report three campus 'models' for supporting DH, while remaining attentive to the influences that local idiosyncrasies can have when adopting any one of these models.

• De-emphasized cost data. An initial goal of the study was to quantify the cost (to the PIs, to their host institutions, to granting agencies) of creating and sustaining digital humanities resources. The motivation for attempting this was to develop a view of all the resources already being spent on doing this work in an ad hoc fashion. Between the time of the grant proposal and our undertaking the work, however, we had completed another study [IMLS-funded case studies of digitized special collections and an ARL-funded survey of digitized special collections] that had allowed us to do further cost data gathering, specifically at some institutions, including academic libraries with special collections. This exercise, as well as our experience in interviewing staff and faculty for this project, made it painfully clear that accurate cost data would be difficult to obtain, as in most cases neither faculty nor library staff were in the habit of tracking the time they were devoting to specific digital projects. We did gather some data concerning budgets in our faculty surveys, but chose to focus on the larger issue of which units were devoting time to specific activities, and determining whether or not they were doing so on an in-kind or paid basis.

• Shifted timing of campus meetings. The initial plan was to visit each campus twice, first for interviews with senior administrators and support staff, and later on, to interview faculty once the survey results had been analyzed.
Due in part to the challenges of scheduling these sessions around holidays and campus schedules, we opted to conduct most faculty interviews via phone, to get to them more quickly. This turned out to be an even better plan; the second campus meetings were then devoted to sharing back our findings and hosting facilitated sessions with groups of stakeholders. These sessions offered us valuable feedback on our findings, and also were in some cases run as workshops, where senior administrators, faculty and unit heads actively discussed the roles they currently play and how they see their own systems developing to better manage the demands of faculty and the work they create.

Perhaps the most difficult question was how to define the particular flavor of 'digital humanities' we would examine. Did we care about all the shapes and sizes that DH engagement comes in, or just the large-scale digital outputs that seem to garner the most attention and funding? In the end, we developed a method we hoped would acknowledge and capture data on the widespread interest in digital humanities, while also identifying practitioners who are actually building and managing long-term resources. The survey was directed at all faculty in a few departments selected by our campus-based partners (often based in the library) and we tried to get as broad participation as possible. But the survey also sought to identify those among the respondents who had managed or created digital projects that they considered to be for public use and that were expected to need ongoing support and development. This approach worked for the most part, but while we were eager to learn more about those major, public digital research initiatives, we soon realized that campus leaders still need a better understanding of what faculty (and even students) are doing, and to what extent those other activities generate materials that will require a support strategy. We hope that those who choose to undertake a campus-based survey for themselves will consider ways to capture more data about the sorts of files, formats, and intentions of even those practitioners whose work is not intended for public use. In other words, while we focused on a particular use case that is known to create significant sustainability challenges, there are many faculty and students who are creating other types of resources and data that may also pose challenges over the long term, and the survey could prompt respondents to offer greater detail about that work so that a better-informed and finer-tuned system of support could be developed.

Accomplishments

The paper and toolkit were published on June 18, 2014 and represent the final deliverables from this grant. In the course of conducting the study and developing the paper and tools, we had several accomplishments worth emphasizing:

• We undertook and completed four full campus profiles, twice as many as originally proposed, by altering the methodology used to focus less on gathering cost data and more on understanding process and strategy.

• Our original estimate was to interview about 45 people in the course of this project. In the end, we interviewed over 125 individuals, including some more than once.

• We held on-campus meetings to share back our findings and discuss them with campus stakeholders. Each campus partner was offered a short menu of types of events we might host for them. This phase of the project was extremely productive; rather than just providing us with feedback on our work (though they served this purpose, also), in many cases, the sessions ended up being a good neutral ground for people across campus to begin to have substantive conversations about how to better coordinate their activities. Several times, we were told that meetings like that were very valuable but 'just don't happen.' It may take some time to see the results from this work; we will continue to track evidence of people and teams using this approach to develop their own campus-based strategies.

• Our marketing team developed lists of contacts and communications to disseminate the report and the toolkit. An announcement was sent to 3,239 contacts, including US library deans and directors, digital humanities centers, digital humanists, publishers, and higher education and libraries media. Additionally, the announcement was posted on the ACRL Digital Humanities Interest Group listserv, the ACRL Sustainability listserv, and on Ithaka S+R's blog and Twitter account.

Audiences

The readership for this report includes several groups. While it is too soon after publication to have a full picture of the impact the paper and toolkit will have, we expect the readership to include:

• Library administrators and DH coordinators. We see as the main audience for this report those in the library who manage digital projects, whether for the library's own collections or as a service to faculty who come to the library for support. We have heard from some library directors that the report will be useful to them and others who are considering developing DH strategies for themselves. In just the last week, we have heard from an AUL for technology at a major research institution (Wisconsin) and a head of a liberal arts college publishing program (Amherst) who reported that they had shared the report widely with campus colleagues.

• DH practitioners. Faculty who are engaged in building digital projects of their own will be one of our audiences here, too. As many of the initiatives to gain further funding to support staff hires, technology capacity and education for practitioners are led by faculty members, we believe that the report will provide them with the tools they need to gather data on the nature of the need on their campus, and to have structured conversations with administrators about possible paths forward.

• Heads of other related units on campus: Many units on campus, from digital humanities centers to technology or visualization groups, to the university press, are or could be participating in the process of creating and managing the new digital research resources being created on campus. While digital humanities centers seem the obvious leader in these discussions, we hope that the paper encourages a discussion about the roles that the DH center does assume, and important roles that others will need to take on.

• Senior administrators (deans, provosts). Our research made clear that in most places, this issue is only beginning to emerge at the highest levels of administration, and yet the instances of greatest coordinated investment only occur with support from the top. We hope that senior administrators will find this to be a useful paper for framing the issues, and we imagine that library directors and faculty will direct them to it for this purpose.
The reach of this report and this topic is nationwide and even international. While geographic differences do exist concerning institutional strategy, the tools offered here are easily translatable to other settings. A complete list of interviewees is available in the appendices of the final report, starting on page 67. In total, we spoke with 126 individuals from 23 institutions of higher education and 5 other organizations, such as funding agencies. Those institutions included 10 public and 13 private universities and colleges. While most were research universities, 6 were liberal arts colleges.

In terms of outreach, within the first ten days of publication (June 18 - June 28), we had 1,578 total page views of the final report, which has been downloaded 567 times. There have been 986 page views for the Toolkit, and various elements of the kit were downloaded 346 times. Social media has played a significant role in spreading the word about this publication. The initial Ithaka S+R announcement was re-tweeted 34 times, reaching 21,473 followers. Another 57 people and organizations tweeted independently about the project, and those tweets were re-tweeted 71 times, for a total reach of 397,686.

Evaluation

The project was supported by an advisory committee, which included Richard Detweiler, President, Great Lakes Colleges Association; Martin Halbert, Dean of Libraries, University of North Texas; Stanley N. Katz, Director, Center for the Arts and Cultural Policy Studies, Lecturer with rank of Professor, Woodrow Wilson School of Public and International Affairs, President Emeritus of the American Council of Learned Societies; Maria C. Pantelia, Professor, Classics, University of California, Irvine, Director, Thesaurus Linguae Graecae®; Richard Spies, Former Executive Vice President for Planning and Senior Advisor to the President at Brown University,
The campus workshops included the following:  Columbia University: Roundtable of several senior library directors and staff dedicated to supporting digital humanities work, including the AUL for Collections and Services, Associate VP, Digital Programs and Technology Services, the Director of the Center for Digital Research and Scholarship, Acting Executive Director for the Center for New Media Teaching and Learning. the Director, Humanities and History Libraries and the Digital Humanities Coordinator.  Indiana University Bloomington: Roundtable with the Libraries Executive Council, which included the dean of the library and five associate deans; a presentation of research findings attended by about thirty people, including the majority of the Libraries Executive Council, several members of the Libraries’ Digital Collections Services, a handful of faculty members who have created DH projects, and few support staff from other units around campus; finally, a library staff training session on sustainability principles attended by several members of the Libraries Executive Council, members of the Libraries’ Digital Collections Services, and the reference librarians. Ithaka S+R: Sustaining the Digital Humanities: Lessons Learned (NEH white paper) 8 / 10  University of Wisconsin-Madison: Key stakeholder roundtable, including the dean of the library, an AUL for library technologies, the CIO, and two associate deans in the College of Letters and Science.  Brown University: Key stakeholder roundtable, including the university librarian, two AULs, the DH librarian, the deputy provost, two key administrators in other support units, and a handful of faculty with DH projects. These sessions were structured to include a formal presentation of findings from the campus- based survey, including DH activity on campus; a review of overlaps and gaps in the current system of supporting services to digital humanities project leaders; and a facilitated discussion on the key motivators for offering DH support. The feedback from these sessions, and our observations of how the ―key stakeholder‖ sessions helped to surface often sensitive topics in very productive ways strongly influenced the final design of the Sustainability Implementation Toolkit, in particular. The broader public is just now starting to respond to the project, and we will continue to track this over the months ahead. At the Annual meeting of the Associate of American University Presses (AAUP, June 2014) a session on Publishing and Digital Humanities included a brief synopsis and discussion of the paper. At the annual meeting of the American Library Association (June 2014) a discussion of the paper is on the agenda of the ACRL interest group for digital humanities. Responses to the paper will vary for different categories of readers. DH practitioners, particularly faculty members, may find this useful as a way to raise awareness of the topic on their campuses. Some well-known DH practitioners (Alex Gil at Columbia and Trevor Muñoz at MITH) were recently quoted in ―When Digital Projects End,‖ an article in Inside Higher Education, devoted to the study. Gil pointed out that ―The report does a fine job of teasing out the diversity of support approaches at different universities…Now that they have brought this level of detail to the conversation, I hope we can begin expanding the concept of support that the study assumes to include the learning of faculty, students and librarians. 
Nothing in my estimate will support digital scholarship and allow it to endure constant technological change -- on any campus -- more than shared knowledge.'1

1 Carl Straumsheim, 'When Digital Projects End,' Inside Higher Ed, June 26, 2014. http://www.insidehighered.com/news/2014/06/26/study-preserve-digital-resources-institutions-should-play-their-strengths

Continuation and Long Term Impact

Unlike some of the other grantees in this program, this paper is considered to be the end product of a successful research project, so there are no immediate plans to continue the project itself. Ithaka S+R will continue to host the paper and the toolkit, and to promote it through webinars and other speaking engagements that we participate in. The papers that Ithaka S+R publishes tend to remain relevant over many years, so we have reason to believe that the readership of this work will continue to grow, as we continue to promote it.

As a result of the project we came to know the senior library and DH leaders at the four campuses we worked most closely with: Columbia, Brown, Indiana and Wisconsin. These relationships have been wonderfully productive, not just for the paper, but in other ways as well. We are developing a training course, for example, and may now end up partnering with Columbia in future years. This grant gave us license to speak with many of the leaders of the DH community, and this led to other possible partnerships, as well. It has been a pleasure getting to know many of the library directors, faculty, senior administrators and other departmental heads, and these relationships will certainly last well beyond the end of the grant.

We have started to hear of some encouraging illustrations of the impacts the process has had for those campuses we partnered with for this study. According to University Librarian Harriette Hemmasi of Brown University, 'The process at Brown heightened insight among the various stakeholders about the ways in which we see ourselves and each other as part of the campus infrastructure that supports digital humanities and digital scholarship, more generally. It also provided an impetus for increased collaboration, resulting in an award from the Provost to fund a two-year Digital Humanities Lecture Series, including at least one short-term Scholar-in-Residence each year.'

According to Lee Konrad, Associate University Librarian, Technology Strategies and Data Services at University of Wisconsin-Madison, 'The process helped to illustrate both the pros and cons of supporting [DH-related] work in a highly decentralized manner. I came away feeling that while this type of support model has its challenges, it also has great rewards in that it brings together scholars, technologists, and librarians from across the campus in ways that might be difficult in a highly structured environment. The process gave us a very important opportunity to work together at administrative levels, and… to discuss engaging in sustainable digital humanities work at scale.'

In addition, as is often the case, while this project has answered some questions it has also suggested others in need of further investigation.
For example, it became clear that there is much more to discuss concerning what it means to 'publish' or 'disseminate' one's work. Many campus roundtables with library staff and faculty suggested that posting materials in a campus repository was all that was needed. And yet, we heard very little about significant impact or efforts to build audience for these projects, and even where there was a university press on campus, it was not generally considered a key player. We hope to further explore this topic, by working with members of the Association of American University Presses as well as with library publishing units that are starting to play a role in this area.

Grant Products

During the course of this grant, we wrote and published the final report, entitled Sustaining the Digital Humanities: Host Institution Support beyond the Start-up Phase, as well as the Sustainability Implementation Toolkit. Both are freely available and hosted on the Ithaka S+R website: http://www.sr.ithaka.org/research-publications/sustaining-digital-humanities

• Sustaining the Digital Humanities: Host Institution Support beyond the Start-Up Phase
http://www.sr.ithaka.org/sites/default/files/SR_Supporting_Digital_Humanities_20140618f.pdf

• Sustainability Implementation Toolkit
http://www.sr.ithaka.org/research-publications/sustainability-implementation-toolkit

The toolkit outlines three key phases, each including several downloadable files:

Step One: Assess the Landscape (http://www.sr.ithaka.org/content/assess-landscape)
• Survey of Faculty Creation of Digital Content, Tools, and Infrastructure
• Customizing and Implementing the Survey
• Interview Guide: Directors of Support Units
• Interview Guide: Senior Administrators
• Interview Guide: Digital Project Leaders

Step Two: Identify Overlaps and Gaps (http://www.sr.ithaka.org/content/identify-overlaps-and-gaps)
• Analyzing the Data Gathered
• Overlaps and Gaps Worksheet

Step Three: Discuss and Address Institutional Priorities (http://www.sr.ithaka.org/content/discuss-and-address-institutional-priorities)
• Hosting a Stakeholder Roundtable
• Stakeholder Roundtable: Presentation Template

Additional features of the Toolkit include:
• A Briefing Paper for Digital Project Leaders (http://www.sr.ithaka.org/sites/default/files/BRIEFING_PAPER.pdf)
• Intake Questionnaire for New Digital Projects (http://www.sr.ithaka.org/sites/default/files/IntakeQuestionnaire.pdf)

work_3mifxysn6jb35np3ddn5iatkue ---- Poster-Lidia-Bocanegra.pages

TEN YEARS RECOVERING THE MEMORY OF REPUBLICAN EXILE WITH CITIZEN COLLABORATION. THE RESULTS OF E-XILIAD@S PROJECT: A PERSPECTIVE FROM DIGITAL HUMANITIES AND DIGITAL PUBLIC HISTORY

DH2020. CARREFOURS / INTERSECTIONS. ALLIANCE OF DIGITAL HUMANITIES ORGANIZATIONS, UNIVERSITY OF OTTAWA, CARLETON UNIVERSITY.
OTTAWA, 20, 21, 25 JULY 2020.

THE PROJECT AIM

E-xiliad@s is a crowdsourcing research project about the Spanish republican exile, financially supported by the Spanish Ministry of Labour and Immigration in 2009 and by the Ministry of Employment and Social Security in 2011, through the Dirección General de Migraciones. The aim is to collect unpublished data, online, about the anonymous Spanish Republican exile, mainly for the period from 1939 to 1959 of the Franco regime, related to social, public and contemporary history and with a strong focus on gender, hence the name 'e-xiliad@s'.

CREDITS

by Dr. Lidia Bocanegra Barbecho - Digital Humanities specialist and DH responsible at Medialab UGR (Universidad de Granada, Spain). Tenure track position at the Contemporary History Department, UGR. lbocanegra@ugr.es | @Lidia_Bocanegra
* https://orcid.org/0000-0001-9479-5921
* https://ugr.academia.edu/LidiaBocanegra
* https://www.researchgate.net/profile/Lidia_Bocanegra_Barbecho
* https://hcommons.org/members/lbocanegra/
* https://contemporanea.ugr.es/informacion/directorio-personal/lidia-bocane-

SPANISH HISTORICAL MEMORY RECOVERY - WWW.EXILIADOSREPUBLICANOS.INFO

CO-CREATION METHODOLOGY

The unpublished data are provided by users who, after registration, answer a series of questions collected in a web form created ad hoc, called Ficha del exiliado in Spanish or Exile Record in English. It has a series of questions, mandatory or not, some of them with a free text field and others with closed lists. The questions respect a chronological order; the objective is to stimulate the family memory of those who fill it in. A sketch of how such a record could be modeled as a data structure follows the strategy list below.

THE COMMITMENT

E-xiliad@s collaborates to recover the memory of the republican exile through Open Data, with the consent of the users; at the same time, it is responsible for communicating to society, with scientific rigor, the topics of exile and return, through the methodology of Digital Public History.

CITIZEN SCIENCE STRATEGY - DIGITAL PUBLIC HISTORY

Visibility - Multilingual project (Spanish / French / English). Countries of highest audience: Spain, France, Mexico, USA, Argentina, Great Britain, Chile, Colombia, Puerto Rico. For more than 10 years, the project has appeared among the top queries, on the topic of the Republican exile, in the main search engines.

Search Engine Optimisation (SEO) - code without errors, internal metatags, URLs, etc.

Large amount of published content - creation of new Exile Records and new sections. Use of the project's social networks (Facebook and Twitter), with more than 1.5k followers, to generate and disseminate new content.

Reciprocal exchange - Using the system 'you give me, I offer you' helps to expand the project, generating greater confidence in new users who want to deposit their memory in it. How does the system work? a) Collection of information through the internal web form (you give me); b) Services offered by the project (I offer you): publication of the information received in related sections, with prior consent; historical advice via email; bulletin board; dissemination of information through project social media; creation of informative sections about exile: biographies, geolocated map, biblio/webography, travelogues.

User friendly web layout - Easy-to-use and professional layout that incorporates e-commerce web design knowledge: positioning of information and images strategically, writing by paragraphs.
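A minimal Python sketch of how an Exile Record could be modeled as a data structure, with mandatory fields, optional free-text fields and a closed list, ordered chronologically; all field names and list values here are hypothetical, since the poster does not enumerate the form's actual questions.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical closed list: the real form's choices are not given on the poster.
EXILE_DESTINATIONS = ["France", "North Africa", "Mexico", "Argentina", "USSR", "Other"]

@dataclass
class ExileRecord:
    """One 'Ficha del exiliado': fields follow the life of the exile in
    chronological order, from birth through exile to possible return."""
    # Mandatory fields
    full_name: str
    birth_year: int
    # Optional fields: free text or a value from a closed list
    birthplace: Optional[str] = None            # free text
    exile_year: Optional[int] = None
    destination: Optional[str] = None           # expected to be in EXILE_DESTINATIONS
    occupation_in_exile: Optional[str] = None   # free text
    return_year: Optional[int] = None
    testimony: Optional[str] = None             # free-text family memory
    public: bool = False                        # publication requires user consent

    def validate(self) -> list:
        """Collect simple validation problems instead of raising."""
        problems = []
        if self.destination and self.destination not in EXILE_DESTINATIONS:
            problems.append(f"unknown destination: {self.destination}")
        if self.exile_year and self.exile_year < self.birth_year:
            problems.append("exile predates birth")
        return problems

record = ExileRecord(full_name="Example Name", birth_year=1915,
                     exile_year=1939, destination="France", public=True)
print(record.validate())  # []
```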
E-XILIAD@S RESULTS

Since 2010, e-xiliad@s has published approximately 200 records of anonymous exiles and compiled about 500 files among images, scanned official documents, old newspaper articles, memoirs, poems, and sound documents (interviews). More than 80% of the records are public, which shows a wide availability of users internationally to share and recover the memory of republican exile. The project has more than 1,500 followers in its social networks and nearly 45,000 visits to the project's website every year, becoming today one of the reference projects at academic and international level about the Republican exile.

Exile Records: total 200 (published 81%, unpublished 19%)
Files: total 485 (published 69%, unpublished 31%)
Announcements: total 65; comments: 64
Image galleries: total 21
Web sections: total 89
Web visits: total 507,068; users: 160,578
Followers: total 1,528 (Facebook 924, Twitter 604)

SCIENTIFIC PUBLICATIONS

* Co-creación, participación y redes sociales para hacer historia. Ciencia con y para la sociedad (2017) - DOI 10.5209/HICS.57847
* 'Cada día atrasamos el reloj un cuarto de hora para llegar con la hora americana'. Diario de viaje hacia el exilio (2015) - DOI 10.5281/zenodo.1182943
* La web 2.0 y el estudio del exilio republicano español: el análisis de la movilidad social y el retorno a través del proyecto e-xiliad@s (2015) - DOI 10.5281/zenodo.1182456
* El exilio republicano español: estudio y recuperación de la memoria a través de la web 2.0. Nuevo enfoque metodológico con el proyecto e-xiliad@s (2015) - DOI 10.5281/zenodo.1182238
* El semanario Exilio y los intelectuales del campo de Bram, 1939 (2015) - DOI 10.5281/zenodo.1182305
* Revista Exilio. Campo de Bram (2015) - DOI 10.5281/zenodo.1182839
* Memoria, exilio republicano e historia digital: El Proyecto e-xiliad@s (2014) - DOI 10.5281/zenodo.1182357

ACHIEVEMENTS

Award for the best participation / presence in social media 2019 - Asociación de Humanidades Digitales Hispánicas (Premios HDH 2020).

This contribution has been funded by the research project: Análisis de la Participación Pública en la investigación histórica desde el ámbito de la ciencia ciudadana (Co-Historia), funded by FEDER/Junta de Andalucía-Consejería de Economía y Conocimiento/Proyecto (E-HUM-507-UGR18). Principal Investigator: Lidia Bocanegra Barbecho.
The designed prototype is motivated by a practical use case on a corpus of circa 15.000 digitized resources about European integration since 1945. The corpus allowed generating a dynamic multilayer network which represents different kinds of named entities appearing and co-appearing in the collections. To our knowledge, Intergraph is one of the first interactive tools to visualize dynamic multilayer graphs for collections of digitized historical sources. Graph visualization and interaction methods have been designed based on user requirements for content exploration by non-technical users without a strong background in network science, and to compensate for common flaws with the annotation of named entities. Users work with self-selected subsets of the overall data by interacting with a scene of small graphs which can be added, altered and compared. This allows an interest-driven navigation in the corpus and the discovery of the interconnections of its entities across time. Keywords: Visual analytics, Network visualization, Dynamic multilayer networks, Digital humanities Introduction In recent years, vast quantities of the human cultural records have been digitized, further described with metadata and made available in the form of collections. Such collections within the fields of cultural heritage and digital humanities typically consist of digitized multimedia objects with a strong bias towards unstructured text, metadata of various lev- els of detail and completeness, and often a layer of named entity annotations. Today, most scholars in the humanities and related disciplines rely on keyword search and faceted search to retrieve relevant content. Any analysis of such collections needs to be based on an understanding of the underlying rationale for the creation of collections, how they are organized and to be able to retrieve relevant content in an exploratory manner (van Ham and Perer 2009; Brown and Greengrass 2006). In this paper we present Inter- graph, a technical demonstrator for the exploration of such collections based on named entity linking and collection-inherent metadata as well as the results from preliminary user evaluations. Intergraph was designed to utilise multilayer network visualizations to © The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. http://crossmark.crossref.org/dialog/?doi=10.1007/s41109-020-00295-x&domain=pdf http://orcid.org/0000-0001-7702-0614 mailto: sb@eisti.eu http://creativecommons.org/licenses/by/4.0/ Bornhofen and Dü Applied Network Science (2020) 5:54 Page 2 of 13 support non-technical users in the exploration of historical document collections. 
More specifically, it helps to answer the following questions: How is a given named entity (person, institution or location) represented in a collection? Who appears with whom? How does this change over time? How can we compare the coverage of entities? Answers to these questions can help to detect patterns in the collection, such as biases in its composition (in our case, for example, gaps in the coverage of specific entities), or to highlight unexpected links between entities.

Intergraph was developed as one of three technical demonstrators during the BLIZAAR project, a French-Luxembourgish research project dedicated to developing novel visualizations of dynamic multilayer graphs (https://blizaar.list.lu). BLIZAAR concentrated on a use case in biology and a principal use case in history. The targeted users in both cases were scholars with no or very limited experience with data analysis in general and network analysis and visualization in particular.

The following section describes the principal use case in more detail, with its dataset and user requirements. "State of the art" reviews the state of the art in the visualization of dynamic multilayer networks. "Intergraph" presents the framework designed in response to the use case and provides an overview of its main features. "User test results" gives an account of the user tests conducted. Finally, "Conclusion & future work" concludes the paper with a discussion and prospects for future work.

Dataset and requirements

The data is derived from resources on the European integration process since 1945 collected by the Centre Virtuel de la Connaissance sur l'Europe (CVCE, https://www.cvce.eu), a former research and documentation center which in 2016 was integrated into the University of Luxembourg. The CVCE created a multilingual collection of approximately 25,000 digitized documents organized in 29 hierarchically structured thematic corpora. The documents differ significantly in nature: they include newspaper articles, diplomatic notes, personal memoirs, audio interview transcripts, cartoons and photos, all with descriptive captions. The histograph project (Guido et al. 2016) processed a subset of circa 15,000 of these documents with named entity recognition (NER) and disambiguation and stored links between entities and documents in a Neo4j graph database. This dataset was made available to and further processed by the BLIZAAR project for the development of more advanced graph exploration prototypes.

Figure 1 shows the BLIZAAR data structure with its nodes and relationships.

[Fig. 1: BLIZAAR data structure (nodes and relationships)]

Firstly, resources are part of one or more collections, from the highest logical unit of thematic corpora (ePublications) down to the corresponding hierarchical units and subunits; this is modeled by the "is_part_of" relationship. Secondly, named entities (people, locations, themes, institutions) have been extracted using named entity recognition software such as YAGO (Max-Planck-Institut für Informatik: YAGO) and TextRazor (The natural language processing API). This process enabled the generation of the "appears_in" relationship. Entities "co-appear" in resources, and collections "share" resources, by bipartite network projection (Latapy et al. 2008; Zweig and Kaufmann 2011). Finally, a collection "mentions" an entity, and accordingly the entity "is_mentioned_in" the collection, if the collection contains at least one resource where the entity appears. Table 1 gives an idea of the size of the dataset.

Table 1 BLIZAAR dataset magnitudes

  Item                     Approximate count
  Resources                15,000
  Collections              4,000
  Identified entities      36,000
  Entity appearances       300,000
  Entity co-occurrences    7,000,000
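To make these relationships concrete, the following minimal sketch (our illustration, not code from the BLIZAAR or histograph projects) expresses the "appears_in" structure and the derived co-appearance relationship in Python with networkx. All node names, dates and attribute keys are invented.

import networkx as nx
from networkx.algorithms import bipartite

# Bipartite graph carrying the "appears_in" relationship (cf. Fig. 1):
# entity nodes on one side, resource nodes on the other.
B = nx.Graph()
B.add_nodes_from(["Willy Brandt", "Egon Bahr"],
                 bipartite=0, kind="entity", subtype="person")
B.add_node("East Berlin", bipartite=0, kind="entity", subtype="location")
B.add_node("doc_001", bipartite=1, kind="resource", year=1963)
B.add_node("doc_002", bipartite=1, kind="resource", year=1970)
B.add_edges_from([("Willy Brandt", "doc_001"), ("Egon Bahr", "doc_001"),
                  ("Willy Brandt", "doc_002"), ("East Berlin", "doc_002")])

# "co-appear" is not stored but derived by one-mode bipartite projection
# onto the entity set; edge weights count the shared resources
# (cf. Latapy et al. 2008; Zweig and Kaufmann 2011).
entities = [n for n, d in B.nodes(data=True) if d["kind"] == "entity"]
co = bipartite.weighted_projected_graph(B, entities)
print(list(co.edges(data=True)))
# e.g. [('Willy Brandt', 'Egon Bahr', {'weight': 1}),
#       ('Willy Brandt', 'East Berlin', {'weight': 1})]

The "share" relationship between collections can be derived in the same way by projecting a collection-resource bipartite graph onto its collection side.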
A previous study by the BLIZAAR project on visual analytics requirements for research in digital cultural heritage was published in McGee et al. (2016). The authors suggest that the data structures supporting the analysis of a complex digital corpus, in which people, organizations, places, multimedia documents and document collections are connected across time, are best modeled as a dynamic multilayer network. As a matter of fact, concerning the given dataset:

• Nodes can be considered on at least three layers, a resource layer, an entity layer and a collection layer, and they have different relationships;
• Nodes on the two latter layers have subtypes which can be treated as extra layers: entities are people, locations, institutions or themes; collections are ePublications, units or subunits;
• Resources are time-stamped by their historical publication date. The network therefore changes depending on the studied time period. Moreover, distinct time slices can also be regarded as defining separate layers.

The definition of layers in this data model deliberately remains flexible (node types, node subtypes, time periods) and may depend on the user's research question and vision of the data, as the sketch below illustrates. Also, note that compared to the framework of Kivelä's universal model of multilayer networks (Kivelä et al. 2014), the BLIZAAR data does not possess multiple types of relationship between nodes. This network dimension has therefore not been considered for the rest of this paper.
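The sketch below, again our own and building on the previous one, shows how such flexible layers can be realised as filters applied before projection; the subtype and year attributes are the invented ones from the data model sketch.

import networkx as nx
from networkx.algorithms import bipartite

def layer(B, subtype=None, start=None, end=None):
    # Extract one layer of the multilayer network from the bipartite
    # graph B of the previous sketch: optionally restrict entities to a
    # subtype (e.g. "person") and resources to a publication-year window.
    nodes = [n for n, d in B.nodes(data=True)
             if d.get("kind") == "entity"
             and (subtype is None or d.get("subtype") == subtype)]
    H = nx.Graph()
    H.add_nodes_from(nodes, bipartite=0)
    for e in nodes:
        for r in B[e]:
            year = B.nodes[r].get("year")
            if (start is None or (year is not None and year >= start)) and \
               (end is None or (year is not None and year <= end)):
                H.add_node(r, bipartite=1)
                H.add_edge(e, r)
    return bipartite.weighted_projected_graph(H, nodes)

person_layer = layer(B, subtype="person")       # a node-subtype layer
sixties_layer = layer(B, start=1960, end=1969)  # a time-slice layer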
In view of the concrete dataset and the CVCE (subsequently the University of Luxembourg) acting as a collaborator and a stakeholder, the BLIZAAR project is an instance of problem-driven visualization research. It has therefore been conducted in the spirit of a design study, which is defined as a project in which visualization researchers analyze a specific real-world problem faced by domain experts, design a visualization system that supports solving this problem, validate the design, and reflect about lessons learned in order to refine visualization design guidelines (Sedlmair et al. 2012). The collaboration was organized as follows: a kick-off workshop with four CVCE domain experts helped to assess technical skill levels, to determine the research priorities for historians and to develop user stories. This was followed by monthly exchanges with a primary domain expert (also the second author of this paper). One intermediate and one final evaluation with four CVCE domain experts validated the user stories and provided feedback on the usability of the demonstrator.

In the kick-off workshop, the domain experts identified content retrieval and insights concerning the representation and interconnections of entities in a corpus as main objectives. Due to the heterogeneity of the documents in the corpus and lacking information on the nature of links between entities, the data was not considered usable for the reconstruction and quantitative analysis of a historical social network without significant manual annotation, which was beyond the scope of the project. In addition, domain experts did not have a strong background in advanced data analysis but highly valued at least a basic comprehension of the inherent logic of the tools they work with. These needs were expressed in the following user stories:

(1) Content overview: "I would like to have an overview of how a specific person/institution/location is represented in the corpus and of the other entities with whom they are mentioned. This helps me to decide which documents I want to study in greater detail."
(2) Query knowledge expansion: "I am interested in a topic but simple keywords are not suitable to retrieve relevant documents. Starting with my limited knowledge of the topic I want to receive suggestions for promising contents and additional keywords which can guide my exploration."
(3) Explore search results: "I am interested in a broader topic but am overwhelmed by the very large number of diverse search results. I want to be able to dissect and organize these results and understand how they are related to each other and their attribute values."
(4) Entity comparison: "I want to compare specific entities (persons, institutions, locations, but also collections) in the corpus to get a better understanding of their presence in the corpus. I want to study how the contexts in which they appear change over time. I want to explore links between the entities I compare."

Within the BLIZAAR project, one or more of these user stories were targeted by different prototypes (https://blizaar.list.lu). The design of the Intergraph demonstrator put special focus on (1) Content overview and (4) Entity comparison. To implement these user stories within a graph-based environment, the following user requirements were identified:

1. Create subgraphs of understandable size and complexity
2. Set up multiple graphs for the sake of comparison or contrasting
3. Observe temporal changes
4. Filter for node and edge properties
5. Maintain a straightforward link to all connected resources for further study
6. Compensate for errors in named entity linking, e.g. duplicates

A specific problem related to the automatic generation of the network is data imperfections. Most commonly we observe fragments which were wrongly identified as entities, duplicate entities which have not been disambiguated correctly, and entities which have been disambiguated wrongly and linked to homonyms (the politician "Robert Schuman" vs. the composer "Robert Schumann"). The effort required to rectify all of the above-mentioned flaws is too costly and therefore unrealistic for this and comparable corpora. Functionalities moderating the flaws were therefore considered to be the most promising strategy in this case.

State of the art

Network visualizations offer a unique way to understand and analyze complex data by enabling users to inspect and comprehend relations between individual units and their properties. Some scientific fields have been using network visualizations for a long time, most notably systems biology, where purpose-built visualizations have been developed for more than twenty-five years (Mendes 1993; Shannon et al. 2003; Pavlopoulos et al. 2008). Interactive network visualizations have been used in and around the digital humanities sphere to make datasets accessible for exploration and research (Jänicke et al. 2015; Düring 2019; Jessop 2008; Boukhelifa et al. 2015; Düring 2013; Windhager et al. 2018)
inasmuch as they offer novel search and discovery tools which enhance well-established techniques such as faceted search and keyword search. Examples include stand-alone applications based on letter exchanges (Warren et al. 2016), tools to explore bibliographic data (SNAC; Verhoeven and Burrows 2015), and collections of documents based on unstructured text (Guido et al. 2016; Moretti et al. 2016), often with a strong element of decentralised and collaborative data curation. To avoid costly manual relationship extraction, network data is often generated automatically based on named entity recognition (NER), existing document metadata or other data extracted from unstructured text, and inferred relations between them. With its focus on the exploration of automatically enriched unstructured texts, Intergraph bears closest resemblance to Guido et al. (2016) and Moretti et al. (2016) but adds new functionality for the exploration of multilayer networks on multiple canvases.

Dynamic networks represent relationships between entities that evolve over time (Beck et al. 2017). A small number of tools are readily available for dynamic graph visualization, such as Gephi (Bastian et al. 2009) or Commetrix (Trier 2008), which are arguably the two most prominent solutions. These applications are under continuous development and have a large user community, but they are not adapted to visualizing multiple layers and therefore cannot properly meet the specificities of multilayer data. As a matter of fact, recent research suggests that multilayer graphs allow for more complexity in the exploration of historical data (McGee et al. 2016; Valleriani et al. 2019; van Vugt 2017; Grandjean 2019). In multilayer networks, subnetworks are considered on independent layers, but they can also interact with each other (Kivelä et al. 2014). Multilayer networks can have multiple types of node (Ghani et al. 2013), with different attributes (Kerren et al. 2014; Nobre et al. 2019) and different types of relationships (Singh et al. 2007). Tulip (Auber 2004) is a powerful graph visualization framework capable of embracing the complexity of multilayer data; however, its configuration requires expertise in programming and network analysis, and it presents a substantial learning curve for the user to obtain the intended visualization.

The particularities of networks that are both dynamic and multilayer have recently come to the attention of network science due to their importance for real-world applications. Multilayer networks open up new opportunities for the interactive exploration of (historical) datasets but also require novel types of data visualization (Rossi and Magnani 2015). In recent years, two collaborative European projects, Plexmath (2012-2015, https://cordis.europa.eu/project/rcn/105293_fr.html) and Multiplex (2012-2016, https://cordis.europa.eu/project/rcn/106336_en.html), were entirely dedicated to this topic, allowing the design of a number of novel visualization methods (http://www.mKivela.com/pymnet, https://github.com/sg-dev/multinet.js) (De Domenico et al. 2014; Piškorec et al. 2015). The tools published in the scope of these two projects were primarily designed to demonstrate concepts and to illustrate universal approaches to dynamic multilayer graph visualizations. Given the lack of ongoing development and active support, their usability for the BLIZAAR use case given in this paper is problematic.
A complete survey of current visualization solutions for multilayer graphs and their features can be found in Ghoniem et al. (2019). Based on the BLIZAAR user requirements identified in the previous section, three feature sets were found to be essential for any adopted solution:

1. the swift creation of subgraphs from a larger dataset to support an iterative exploration workflow;
2. follow-up functionalities beyond selection, layout rearrangement or camera movement; in our case, this includes querying, generating, and juxtaposing related subgraphs and layers;
3. a flexible layer model which allows the user to switch between, and even to combine, models (e.g. node type layers vs. time slice layers).

To our knowledge there is no tool available which contains all three feature sets and allows their combination in exploratory workflows. This motivated the development of the Intergraph visualization platform, which we describe in the following section.

Intergraph

Intergraph offers a novel approach to exploring digital humanities corpora by means of an iterative search and discovery workflow. The demonstrator has been designed to meet all user requirements specified in the previous section. Written in JavaScript, Intergraph runs in a web browser and communicates with a Node.js server which queries the data from a Neo4j database. The front-end client renders the graphs using the Three.js graphics library.

Given the size of the BLIZAAR dataset, an overall visualization of the corpus is neither suitable nor desirable for exploration. Instead, users are rather interested in creating and inspecting subnetworks with entities relevant to their current research. The main idea of Intergraph is therefore to begin the exploration from one or more known start nodes. Following the expand-on-demand principle (van Ham and Perer 2009), the user will encounter new relevant nodes and pursue their exploration by conveniently creating additional graphs stemming from the existing ones. This path of exploration yields a sequence of linked subgraphs (user requirement 1). Depending on the query and the user's understanding of the data, a new graph may be used and looked upon as a complementary layer, or as a complementary graph of an existing layer.

Figure 2 shows a general screenshot of the Intergraph interface. Graphs can be dynamically added to and deleted from the scene. Following the VisLink approach (Collins and Carpendale 2007), they are rendered on free-floating planes which can be arbitrarily translated, oriented and scaled using familiar transformation widgets. Depending on the user's tasks and preferences, the scene can be viewed from a 2D or a 3D perspective. The default 2D view is known to be most effective for visual data exploration and analytics, since 3D visualizations tend to suffer from occlusion, overlapping and distortion, and they often require increased viewpoint navigation to find an optimal perspective (Shneiderman 2003). However, 3D scenes allow users to stack multiple planar graph layers in space and to create so-called "2.5D" visualizations, which can be useful for understanding complex networks (Ware 2001).
[Fig. 2: A general screenshot of the Intergraph interface]

Since target users are not experts in network science, and since the data itself does not lend itself to the reconstruction of meaningful social networks, Intergraph forgoes advanced graph concepts. Network analysis functions, metrics and algorithms like clustering coefficients or betweenness have not been considered for implementation in this prototype. On these grounds, Intergraph also uses standard visual encoding. Node colors reflect the node type and sizes indicate the number of underlying resources. A click on a node or edge gives immediate access to these resources (user requirement 5). New graphs are typically produced by querying ego-networks of existing nodes, i.e. subgraphs linking a node with its immediate neighbors, via easy-to-communicate operations such as:

• All entities co-appearing with a given entity
• All collections mentioning a given entity
• All entities mentioned in a given collection
• All collections sharing resources with a given collection

If the same node appears on two or more graphs of the scene, coupling edges (Kivelä et al. 2014) are drawn (see Figs. 2 and 3). This user-driven network generation approach was partly inspired by "citation-chaining", one of the most commonly used search strategies for literature among historians (Ellis 1989; Buchanan et al. 2005). Intergraph applies the citation-chaining principle to documents and the entities mentioned in them. This allows users to create their own interest-driven search and discovery paths across the dataset.

With regard to data imperfections, one of the most frequently encountered flaws with entity disambiguation in the CVCE dataset is entity duplicates, i.e. multiple recognized entities where in reality only one was actually mentioned. For example, named entity recognition yielded three separate nodes for "East Berlin". If the user wants to consider these three nodes as one in order to create an ego-network, it is possible to multi-select a number of nodes and to query "All entities co-appearing with a given entity (union)", meaning that the result will be the list of nodes co-appearing with at least one of the selected nodes. The user can then define a unique group node for "East Berlin" and draw a meaningful graph (user requirement 6). It is also possible to query "All entities co-appearing with a given entity (intersection)". This operation returns the list of entities co-appearing in the corpus with all selected nodes and can be used to merge multiple nodes, for example if understood as representatives of a social group.
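To illustrate how such operations can be posed against the Neo4j store, the following sketch uses the official Python driver. The node labels, relationship types, property names and connection details are our assumptions modelled on Fig. 1; the actual Intergraph schema may differ.

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "secret"))  # placeholder credentials

# Union: entities co-appearing with at least one of the selected
# (possibly duplicate) seed nodes, e.g. the three "East Berlin" variants.
EGO_UNION = """
MATCH (e:Entity)-[:appears_in]->(r:Resource)<-[:appears_in]-(n:Entity)
WHERE e.name IN $names AND NOT n.name IN $names
RETURN n.name AS name, count(DISTINCT r) AS shared
ORDER BY shared DESC
"""

# Intersection: entities co-appearing with ALL selected seed nodes.
EGO_INTERSECTION = """
MATCH (e:Entity)-[:appears_in]->(r:Resource)<-[:appears_in]-(n:Entity)
WHERE e.name IN $names AND NOT n.name IN $names
WITH n, count(DISTINCT e.name) AS hits
WHERE hits = size($names)
RETURN n.name AS name
"""

def ego_network(names, query=EGO_UNION):
    with driver.session() as session:
        return [record.data() for record in session.run(query, names=names)]

neighbours = ego_network(["East Berlin", "East-Berlin", "Berlin (East)"])

In Intergraph, the rows returned by queries of this kind are what the user first sees in the node table described next.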
The results of new queries first appear in the form of a table in the left pane. This first kind of visualization, itemizing only the nodes without the edges, may in some cases already be sufficient to work with. The table lets users decide whether it is worth generating the graph or whether they prefer to recompile the list of nodes, for example if there are missing nodes or nodes which should be excluded from the graph. A graph of a given node table, or part of it, can be generated on demand and is added to the canvas on the right side of the interface.

The scene can be submitted to a filter which operates on resource type and time period (user requirement 4). Subgraphs of a given resource type can provide a better understanding of its distribution within the corpus. Subgraphs considering the resources within a specific time window allow the user to assess the relevance and interconnections of entities during a specific period. The user can shift the time window and obtain an animated representation of the dynamic graph (user requirement 3). If time-to-time mapping, i.e. animation, is not convenient to analyze the evolution of a network over time, time-to-space mapping is also possible. For this purpose the user can clone and "freeze" a graph of the scene, meaning that its current filter is fixed. Using this method, several graphs with the same nodes but distinct time periods can be juxtaposed (2D) or superimposed (3D) in space (Beck et al. 2017) (user requirement 2).

Figure 3 illustrates how Intergraph addresses the three challenges specified in the state of the art by offering interactive subgraph creation, multiple operations to define layers, and the free organization of these layers in the canvas. We take as an example Willy Brandt, former Chancellor of the Federal Republic of Germany, and his advisor Egon Bahr. The scene shows a set of subgraphs which have been incrementally constructed. Starting with a search for the two nodes "Willy Brandt" and "Egon Bahr", we use the node-level follow-up operation "persons co-appearing with" and the graph-level operations "clone", "freeze" and "time filter" to create two ego networks at three consecutive time periods (before 1964, 1964-1987 and after 1987). The flexible layer concept allows organizing these six layers freely, in this case to reveal how the co-occurrence networks evolve and overlap.

[Fig. 3: Comparing graphs - co-evolution of two dynamic networks over consecutive time periods]
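Mirroring the walkthrough of Fig. 3, the sketch below (ours, reusing the layer() helper defined earlier) builds frozen ego-network snapshots for three consecutive periods; the period bounds reproduce those named above, while node names and dates remain illustrative.

import networkx as nx

snapshots = {}
for start, end in [(1945, 1963), (1964, 1987), (1988, 2009)]:
    co = layer(B, subtype="person", start=start, end=end)
    ego = nx.ego_graph(co, "Willy Brandt")    # persons co-appearing with Brandt
    # "clone and freeze": the snapshot is made immutable, so later changes
    # to the scene filter can no longer alter it
    snapshots[(start, end)] = nx.freeze(ego)

The frozen snapshots can then be laid out side by side (2D) or stacked in space (3D), which is precisely the time-to-space mapping described above.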
User test results

To assess the usability of Intergraph, two user evaluation sessions were held: the first to validate the user stories and to provide feedback on an earlier version of Intergraph, the second to evaluate the current version. The tests were conducted with a group of four scholars, all of whom were former CVCE employees. The selection criteria were familiarity with the underlying corpus on European integration and with the application of digital tools and methods. These criteria were applied to ensure that users could turn their attention to the interaction with the prototype with only minimal reminders of the underlying data model and content, and also that they were qualified to judge the pertinence of the output. For the second evaluation, evaluators were asked to submit in advance a list of five persons or institutions, and optionally up to three time periods between 1945 and 2009, which they wished to explore in greater depth, under the precondition that they had expert knowledge about their presence in the CVCE corpus.

To compensate for their relative unfamiliarity with the software they were to test, the scholars were reminded of BLIZAAR's research objectives and received a circa 10-minute demonstration of the Intergraph core functionalities. After the presentation of the platform, they were invited to use Intergraph for themselves and to begin their session with an elementary keyword search for an entity they knew was mentioned in the corpus. From this starting point, they were free to perform more synoptic tasks, such as finding relevant collections and resources, searching for co-appearing entities and comparing their corresponding networks, in order to obtain a comprehensive overview of how the investigated element is represented, positioned and linked in the corpus. Throughout the session, users were encouraged to continuously verbalize their train of thought and actions, in line with the thinking-aloud approach (Boren and Ramey 2000). Following the 45-minute testing period, users were asked to give verbal feedback and to complete a questionnaire.

In their verbal feedback, users appreciated the ease of navigating through the corpus, the flexibility and freedom to combine different elements, the links across canvases, the management of duplicate entities, as well as the ability to drill down to the underlying resources. With regard to the added value in their research workflows, they highlighted the ability to detect unexpected relationships between entities (in this case between a bank and politicians) and the ability to contribute to a global assessment of the collection and its composition. Critical remarks addressed the absence of additional inter-layer links across canvases (e.g. multiple entities mentioned in the same ePublication), the obscurity of the frequent use of the context menu triggered by a right-click, and long loading times for graphs with more than 100 nodes due to performance limitations.

Users were invited to fill in a questionnaire and to quantify the utility of Intergraph on a scale of 1 to 7 with regard to all aspects of the initially defined user story. As a result, Fig. 4 shows a high general acceptance. In one case, Intergraph did not produce a number of documents the evaluator expected to retrieve, which explains the low mark below average. Most notably, all users declared in the questionnaire that for the given user story they would prefer to use Intergraph over the other available tools (CVCE homepage search and CVCE backend search).

[Fig. 4: User test questionnaire results]

Conclusion & future work

This paper presented Intergraph, a visual analytics platform designed for effective navigation through the content of digital humanities corpora by non-experts. The work is inspired by recent advances in the visualization of dynamic multilayer networks, and has been enhanced and optimized for humanities scholars with little or no skill in data analysis and visualization, and for their subject-specific workflows. The user tests conducted showed a high acceptance of the demonstrator with respect to the original requirements.

The user tests did, however, also reveal a number of challenges. Our evaluation setup sought to minimize the hurdles involved with an ad-hoc evaluation of a new tool and uncommon methods of search and discovery based on named entities and visualizations, but could not remove them. Users require significant time to learn how to operate especially prototype software and to adjust long-established research workflows. Most crucially, however, automated data extraction and visualization-based exploration need to be understood well, and their potential can only be explored by extensive experimentation and adoption in a new domain, which was beyond the scope of this project.
Given the exploratory nature of Intergraph, future work will concentrate on additional ways of suggesting related nodes and creating pertinent graphs out of existing ones, for example by applying recommendation algorithms (Bobadilla et al. 2013). The multilayer character of the data should be leveraged by adding other types of interlayer edges, such as those indicating the "mentions" relationship between collections and entities. For the time being, it is also impossible to visualize more than one type of link within a given layer, which precludes, for example, the possibility of visualizing registered family or friendship relationships between people in addition to their co-appearance relationship.

Moreover, data imperfections may cause significantly skewed results and therefore need to be taken into account. Such imperfections may stem from the automated processing of digitized materials by methods such as Optical Character Recognition (OCR) or Named Entity Recognition (NER), but also from the manual curation of metadata, for example, as well as from intrinsic ambiguities in the source material. Since data cleaning is typically too costly, we observe a strong need for systems which enable users to cope with data-inherent imperfections. Intergraph's functionality to merge duplicate nodes is a step in this direction. Another promising direction is the critical assessment of the composition of datasets and the potential of visualization to reveal their inherent biases. These requirements present interesting challenges and opportunities for the design of innovative tools for visual analytics.

Finally, it is important to observe that the data model used is highly generic. Entities identified in collections of time-stamped resources are likely to be found in a huge number of digital corpora. It is intended to open the existing platform to other datasets, and for this purpose the authors are currently working on extending Intergraph and its functionalities to two additional historical databases:

• Regesta Imperii (Kuczera 2019, http://www.regesta-imperii.de): a collection of records of the Roman-German kings and emperors, as well as of the popes, from the Middle Ages;
• Romans 1by1 (Varga 2017, http://romans1by1.com): a population database registering the people who are attested in Greek and Roman epigraphic sources.

By this means, we hope that Intergraph can evolve into a valuable visualization and exploration tool for many scholars working in the field of digital humanities.

Abbreviations
CVCE: Centre Virtuel de la Connaissance sur l'Europe; NER: Named Entity Recognition; OCR: Optical Character Recognition

Acknowledgements
The authors would like to thank the master's student Samuel Guillaume for his excellent contribution to the implementation of the Intergraph platform. The authors would furthermore like to express their gratitude to the journal editors and reviewers for their highly constructive feedback on earlier versions of this article.

Authors' contributions
Stefan Bornhofen is responsible for the design and implementation of the Intergraph software. Marten Düring is responsible for the development of the use case and evaluation. Both authors read and approved the final manuscript.

Authors' information
Stefan Bornhofen studied mathematics and computer science at the University of Mainz, Germany, and obtained a PhD in Computer Science from Paris-Sud University, France, in 2008.
He holds a teaching and research position at CY Tech engineering school in Cergy near Paris, and is the head of a Master's program specializing in computer graphics, computer vision and human-computer interaction. Marten Düring studied history at the universities of Augsburg and Manchester, and received a PhD degree in contemporary history from the University of Mainz, Germany, in 2015. He is Assistant Professor in Digital History at the Luxembourg Centre for Contemporary and Digital History.

Funding
This research was funded by the ANR grant BLIZAAR ANR-15-CE23-0002-01 and the FNR grant BLIZAAR INTER/ANR/14/9909176.

Availability of data and materials
The datasets generated and/or analysed during the current study are not publicly available due to copyright restrictions but are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Author details
1Laboratoire ETIS, CY Cergy Paris Université, ENSEA, CNRS, UMR8051, 33 Boulevard du Port, 95000 Cergy, France. 2Luxembourg Centre for Contemporary and Digital History, 11 Porte des Sciences, 4366 Esch-sur-Alzette, Luxembourg.

Received: 24 February 2020; Accepted: 29 July 2020

References
Auber D (2004) Tulip - A Huge Graph Visualization Framework. Graph Draw Softw:105-126. https://link.springer.com/chapter/10.1007/978-3-642-18638-7_5#citeas
Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. Int AAAI Conf Weblogs Soc Media:361-362. https://www.aaai.org/ocs/index.php/ICWSM/09/paper/view/154
Beck F, Burch M, Diehl S, Weiskopf D (2017) A taxonomy and survey of dynamic graph visualization. Comput Graph Forum 36:1
Bobadilla J, Ortega F, Hernando A, Gutiérrez A (2013) Recommender systems survey. Knowl Based Syst 46:109-132
Boren T, Ramey J (2000) Thinking aloud: reconciling theory and practice. IEEE Trans Prof Commun 43(3):261-278
Boukhelifa N, Giannisakis E, Dimara E, Willett W, Fekete J-D (2015) Supporting Historical Research Through User-Centered Visual Analytics. In: Bertini E, Roberts JC (eds) EuroVis Workshop on Visual Analytics (EuroVA). The Eurographics Association, pp 1-5. https://diglib.eg.org/handle/10.2312/eurova.20151095.001-005
Brown SC, Greengrass M (2006) RePAH: a user requirements analysis for research portals in the arts and humanities. Humanities Research Institute Online, Sheffield University, Leicester
Buchanan G, Cunningham SJ, Blandford A, Rimmer J, Warwick C (2005) Information Seeking by Humanities Scholars:218-229
Collins C, Carpendale S (2007) VisLink: Revealing Relationships Amongst Visualizations. IEEE Trans Vis Comput Graph (Proc IEEE Conf Inf Vis (InfoVis)) 13(6):1-6
De Domenico M, Porter MA, Arenas A (2014) MuxViz: a tool for multilayer analysis and visualization of networks. J Complex Netw 3:159-176
Düring M (2013) HNR Bibliography. Historical Network Research. Retrieved September 1, 2018 from http://historicalnetworkresearch.org/bibliography
Düring M (2019) Networks as Gateways. Gleanings from applications for the exploration of historical data. In: Kerschbaumer F, von Keyserlingk L, Stark M, Düring M (eds) The Power of Networks. Prospects of Historical Network Research. Routledge Publishers, Abingdon
Ellis D (1989) A behavioural approach to information retrieval system design. J Doc 45(3):171-212
Ghani S, Kwon BC, Lee S, Yi JS, Elmqvist N (2013) Visual analytics for multimodal social network analysis: A design study with social scientists. IEEE Trans Vis Comput Graph 19(12):2032-2041
Ghoniem M, McGee F, Melançon G, Otjacques B, Pinaud B (2019) The State of the Art in Multilayer Network Visualization. Comput Graph Forum 38(6):125-149
Grandjean M (2019) A Conceptual Framework for Multilayer Historical Networks. Sharing the Experience: Workflows for the Digital Humanities. DARIAH-CH Proceedings, Neuchâtel
Guido D, Wieneke L, Düring M (2016) histograph. Graph-based exploration, crowdsourced indexation. CVCE, Luxembourg
Jänicke S, Franzini G, Cheema MF, Scheuermann G (2015) On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges. In: Eurographics Conference on Visualization (EuroVis) - STARs. The Eurographics Association. https://diglib.eg.org/handle/10.2312/eurovisstar.20151113
Jessop M (2008) Digital visualization as a scholarly activity. Lit Linguist Comput 23(3):281-293
Kerren A, Purchase HC, Ward MO (2014) Multivariate network visualization. Lect Notes Comput Sci 8380
Kivelä M, Arenas A, Barthelemy M, Gleeson JP, Moreno Y, Porter MA (2014) Multilayer networks. J Complex Netw 2(3):203-271
Kuczera A (2019) Die "Regesta Imperii" im digitalen Zeitalter. Das Regest als Netzwerk von Entitäten. Das Mittelalter 24:157-172. https://doi.org/10.1515/mial-2019-0011
Latapy M, Magnien C, Vecchio ND (2008) Basic notions for the analysis of large two-mode networks. Soc Netw 30(1):31-48
Max-Planck-Institut für Informatik: YAGO - A high quality knowledge base. https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago. Accessed 17 Aug 2020
McGee F, Düring M, Ghoniem M (2016) Towards Visual Analytics of Multilayer Graphs for Digital Cultural Heritage
Mendes P (1993) GEPASI: a software package for modelling the dynamics, steady states and control of biochemical and other systems. Comput Appl Biosci 9(5):563-571
Moretti G, Sprugnoli R, Menini S, Tonelli S (2016) ALCIDE: Extracting and visualising content from large document collections to support humanities studies. Knowl-Based Syst 111:100-112
Nobre C, Streit M, Meyer M, Lex A (2019) The State of the Art in Visualizing Multivariate Networks. Comput Graph Forum 38(3):807-832
Pavlopoulos GA, O'Donoghue SI, Satagopam VP, Soldatos T, Pafilis E, Schneider R (2008) Arena3D: visualization of biological networks in 3D. BMC Syst Biol 2:104
Piškorec M, Sluban B, Šmuc T (2015) MultiNets: Web-Based Multilayer Network Visualization. In: Machine Learning and Knowledge Discovery in Databases. Springer, Cham, pp 298-302
Rossi L, Magnani M (2015) Towards effective visual analytics on multiplex and multilayer networks. Chaos Solitons Fractals 72:68-76
Sedlmair M, Meyer MD, Munzner T (2012) Design study methodology: Reflections from the trenches and the stacks. IEEE Trans Vis Comput Graph 18(12):2431-2440
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13(11):2498-2504
Shneiderman B (2003) Why not make interfaces better than 3D reality? IEEE Comput Graph Appl 23(6):12-15
Singh L, Beard M, Getoor L, Blake M (2007) Visual mining of multimodal social networks at different abstraction levels. In: Information Visualization, 2007. IV '07. 11th International Conference. IEEE, Zurich, pp 672-679
SNAC: Social Networks and Archival Context. http://socialarchive.iath.virginia.edu. Accessed 17 Aug 2020
TextRazor: The natural language processing API. https://www.textrazor.com. Accessed 17 Aug 2020
Trier M (2008) Research Note - Towards Dynamic Visualization for Understanding Evolution of Digital Communication Networks. Inf Syst Res 19:335-350
Valleriani M, Kräutli F, Zamani M, Tejedor A, Sander C, Vogl M, et al. (2019) The emergence of epistemic communities in the 'Sphaera' corpus: mechanisms of knowledge evolution. J Hist Netw Res 3:50-91. https://doi.org/10.25517/jhnr.v3i1.63
van Ham F, Perer A (2009) Search, Show Context, Expand on Demand: Supporting Large Graph Exploration with Degree-of-Interest. IEEE Trans Vis Comput Graph 15(6):953-960
van Vugt I (2017) Using Multi-layered Networks to Disclose Books in the Republic of Letters. J Hist Netw Res 1(1):25-51
Varga R (2017) Romans 1 by 1 v.1.1. New developments in the study of Roman population. Digit Class Online 3(2):44-59
Verhoeven D, Burrows T (2015) Aggregating Cultural Heritage Data for Research Use: The Humanities Networked Infrastructure (HuNI). In: Metadata and Semantics Research. Springer, Cham, pp 417-423
Ware C (2001) Designing with a 2 1/2D Attitude. Inf Des J 10(3):171-182
Warren C, Shore D, Otis J, Wang L, Finegold M, Shalizi C (2016) Six Degrees of Francis Bacon. A Statistical Method for Reconstructing Large Historical Social Networks. Digit Human Quart 10(3)
Windhager F, Federico P, Schreder G, Glinka K, Dörk M, Miksch S, Mayr E (2018) Visualization of Cultural Heritage Collection Data: State of the Art and Future Challenges. IEEE Trans Vis Comput Graph 25:20
Zweig KA, Kaufmann M (2011) A systematic approach to the one-mode projection of bipartite graphs. Soc Netw Anal Min 1(3):187-218

work_3rdcathodvd3fkmrh367r5nskm ---- Textuality in 3D: three-dimensional (re)constructions as digital scholarly editions

RESEARCH ARTICLE

Textuality in 3D: three-dimensional (re)constructions as digital scholarly editions

Susan Schreibman1 & Costas Papadopoulos1
Published online: 14 May 2019. © The Author(s) 2019

Abstract
3D (re)constructions of heritage sites and Digital Scholarly Editions face similar needs and challenges and have many concepts in common, although they are expressed differently. 3D (re)constructions, however, lack a framework for addressing them.
The goal of this article is not to create a single or lowest common denominator to which both DSEs and 3D models subscribe, nor is it to reduce 3D to one scholarly editing tradition. It is rather to problematise the development of a model by borrowing concepts and values from editorial scholarship in order to enable public-facing 3D scholarship to be read in the same way that scholarly editions are: by providing context, transmission history, and transparency of the editorial method and decision-making process.

Keywords: 3D, (Re)construction, Digital scholarly editions, 3D scholarly editions, Ambiguity, Transparency, Evidence

International Journal of Digital Humanities (2019) 1:221-233. https://doi.org/10.1007/s42803-019-00024-6
*Correspondence: Susan Schreibman, s.schreibman@maastrichtuniversity.nl. 1Department of Literature & Art, Faculty of Arts and Social Sciences, Maastricht University, Maastricht, The Netherlands

1 Introduction

The nature, functionality, and theories informing Digital Scholarly Editions (DSEs) have flourished over the past three decades, moving beyond how to represent print-based texts in digital forms into new types of knowledge production and dissemination informed by the affordances of the medium (Apollon et al. 2014; Sutherland 1997; Driscoll and Pierazzo 2016; Pierazzo 2015). The creators of the earliest DSEs were cognizant of these differences in the nomenclature used to describe their scholarship: The Rossetti Archive, The Blake Archive, The MacGreevy Archive, The Whitman Archive. The shift from conceiving of their scholarship as an editorial product for print publication to a web-based one carried with it a new spatiality in which variants, versions, and secondary sources did not have to be decomposed into a notational shorthand delivered at the bottom of the page, but rather connoted a shift in thinking of scholarly editions from sites of textual compression to decentred webs of textuality (after all, this was the height of hypertext theory). In Derridean terms, the digital archive was a space for gathering together texts on a specific theme, individual, or topic, expanding the world of its creation into an archontic space for inscription and investigation (Derrida and Prenowitz 1995, 10). These new forms of editions could also be described as assemblages, little machines of knowledge (Deleuze and Guattari 1987, 4). Thinking of DSEs as machines of knowledge which represent texts as open objects that can be read and understood through and by their means of production and reception re-elevates long-standing textual practices such as annotation, apparatus, and commentary as interlinked discourses, 'not in serially additive arrangements, but in functional interdependence', providing a palette for both textual and contextual study (Gabler 2010, 46). This model of DSEs repositions the editor, not as an objective arbiter of the text, but more like a hunter-gatherer, a constructor of a dynamic and rhizomatic knowledge site. These knowledge sites need not privilege the alphanumeric. As early as 1999, when the first digital archives were being published, McKenzie described the panoply of objects that could be open to the kinds of intensive bibliographical study (transmission, production, and reception) that textual scholars had traditionally reserved for print and manuscript traditions, 'as verbal, visual, oral and numeric data' (McKenzie 1999, 13).
Like McKenzie, we believe that the digital provides us with a pan-glossary to abstract the material medium of the objects of our contemplation to conceptual ones, and that the text(s) at the centre (or periphery) of the knowledge site we create is worthy of representing any human activity, not simply the artistic but the anthropologic. In his definition of digital scholarly editions, Sahle has also advocated that Scholarly Digital Editions not be restricted to literary texts but rather 'cover all cultural artefacts from the past that need critical examination in order to become useful sources for research in the humanities' (Sahle 2016, 22). What is key, however, is not simply that the object, or what we are broadly calling the 'text', is digitised and distributed electronically, but that it exists within a knowledge site, what we are calling a Digital Scholarly Edition; what Derrida (Derrida and Prenowitz 1995, 10) and the earliest textual editors of web-based scholarship called the archive, what Deleuze and Guattari (1987, 4) called the assemblage, and McKenzie (1999, 15) 'the sociology of texts': integrating, interrogating and interlinking the textual and the contextual (Gabler 2010, 46).

At first glance, conceiving of three-dimensional (3D) (re)constructions as Digital Scholarly Editions might seem counter-intuitive. After all, the technologies, methodologies, and theories that have informed the creation of DSEs, everything from TEI/XML, documentary vs critical editions, and relational vs XML-aware databases to more recent discussions of how the texts created in DSEs can be repurposed, remodelled, and algorithmically analysed and visualised, share little with the technologies, methodologies, and theories that have informed scholarship expressed through 3D reconstructions. 3D reconstructions are computer-generated models produced in computer graphics software packages. They represent geometric data in three dimensions, combined with textures and a simulation of how light interacts with different surfaces. The 3D (re)constructions we refer to in this article concern cultural heritage visualisations and simulations. These visualisations range from schematic representations of buildings to photorealistic renderings and predictive simulations of ancient structures (see Dawson et al. 2007), and from spatial analysis (see Paliou et al. 2011) and physics simulations (see Oetelaar 2016) to interactive virtual worlds utilising online platforms (see Sequiera and Morgado 2013) and game engines (see the projects carried out as part of the Humanities Virtual Worlds Consortium, http://virtualworlds.etc.ucla.edu/). Examples of the types of projects that could be informed by the principles outlined in this article include the Virtual Rosewood Research project (http://www.rosewood-heritage.net/), which focuses on Rosewood, Florida, an African American town destroyed during the 1923 race riot (González-Tennant 2015), Virtual Williamsburg 1776 (Fischer 2012; http://research.history.org/vw1776/), and Digital Hadrian's Villa (Frischer and Stinson 2016), the Unity version of which allows users to test archaeoastronomical theories (e.g. the alignment of the sun with the tower of Roccabruna on the summer solstices during Hadrian's reign to discover celestial arrangements in the night sky as they would have been seen in the past).
It also includes Contested Memories: The Battle of Mount Street Bridge project (http://mountstreet1916.ie), which will be discussed at the end of this article.

We will be utilising the term (re)constructions (as opposed to reconstruction) to signify the theoretical nature of the research being undertaken, reinforcing the hypothetical nature of the models created, as described in more detail below. Although three-dimensional modelling has been seen as an essential research practice in fields such as archaeology and architecture, with hundreds of applications providing opportunities for experimentation and new insights, it has never assumed a central role or been established as scholarship in its own right, and it has been less frequently used by other fields in the humanities. There is no doubt that these models can have a 'wow' factor, and they are commonly used as the last step that takes place at the end of a project, as a means of communicating absence in a visually engaging way (Hermon and Fabian 2002; Gillings 2005) and of attracting publicity and funding. The situation is changing, however, as the cost of the technologies and equipment needed to create 3D models has become more affordable, and more institutions at third level teach the skills, methods, and theories involved.

Over a decade ago, Gillings (2005, 224) called for a robust theoretical and conceptual framework that would realise the potential of the method, providing a means of making visible the underlying decision-making process while disseminating research findings within a single methodological framework. That framework still does not exist. Today 3D (re)constructions are being used as analytical tools within the broad area of history and heritage studies to explore the impact of light on the experience and perception of ancient environments, as simulations to investigate how battles unfold, as a means of providing an embodied and sensorial understanding, and as immersive experiences of a period, culture, or historical event. In most cases these (re)constructions have been created for offline use due to the computational power required to undertake various analyses or even to host the worlds online, making it difficult for researchers outside project teams to learn from the work of others. It may be due to these difficulties that those working with these technologies have spent less time developing a framework within which to convey to a wider public the scholarship that has gone into the creation of the (re)construction, including the choices used in creating the model, the methodologies used, the history of the subject under investigation, and the decisions made during the process of its creation.

Current 3D scholarship primarily exists in a bifurcated information space. The models exist electronically, but the knowledge generated from them is written about in journal articles, illustrated by static two-dimensional images, while the models themselves cannot be accessed beyond the individual or the team who worked on them.
While 15 years ago Earl and Wheatley (2002) ascribed the under-theorisation of the field to a widespread belief that 3D visualisation had a small role to play as an interpretive mechanism, there is now a growing body of practitioners in a wide variety of fields who are looking for a holistic information environment within which to disseminate their work.

The authors of this article, one with a background that stretches back to scholarly editing in print and, since the mid-1990s, has included digital forms, and the other with a background in digital archaeology and particularly in the use of 3D modelling for analytical and documentary purposes, have come to believe that recent scholarship in digital scholarly editions, as it moves away from the conceptual and medium-specific frameworks of print traditions into more dynamic knowledge sites, can provide a productive model of the ways in which 3D (re)constructions can be used to reach larger audiences and shape intellectual debates. This scholarship can also provide a model for the ways in which the knowledge gained by the researchers who created it can be embedded in 3D (re)constructions.

This is a speculative article, the essential intention of which is to propose and test theories from two domains which at first glance seem to have very little in common. However, its authors have come to realise after several years of collaboration and conversation that they share significant concerns and have wrestled with similar concepts, from expressing transparency and ambiguity to modelling absence from the extant record. This is an opening salvo to generate discussion about how the current theories about DSEs could be expanded and applied beyond their dominantly text-based practice, and how 3D (re)constructions can be conceived of as open knowledge networks which contain, embedded within them, the analytical and archival scholarship that informed their creation.[1]

[1] This article has grown out of present and previous conversations. The authors thank the participants in the Virtual Worlds as Digital Scholarly Editions Masterclass (Maynooth University, 13-14 June 2017, http://dhprojects.maynoothuniversity.ie/vwdse/), funded by the Digital Scholarly Editions Initial Training Network (DiXiT). This Masterclass provided a fruitful dialogue between scholars from the fields of heritage 3D visualisation and Digital Scholarly Editing, allowing them to explore theoretical and practical issues pertaining to the creation, annotation, and publication of 3D models, with the ultimate aim of marrying the practices of the two communities. The authors are also part of the Andrew W. Mellon Foundation project 'Scholarship in 3D: Digital Edition Publishing Cooperative', which attempts to build a sustainable framework within which to reconceive 3D works as digital editions and create an infrastructure that will enable their recognition as scholarship.

2 Modelling in 3D: an introduction

Three-dimensional computer graphical approaches flourished in the television and film industry, with the first commercial 3D software package, Wavefront Technologies, released in 1984 to serve the increasing needs of motion pictures.
Around the same time, 3D modelling was used to (re)construct heritage datasets, with the earliest reported work being that of the bath building at Caerleon Roman Fort in South Wales (Smith 1985), and a year later the first animated virtual tour, that of the Old Minster of Winchester (Reilly et al. 2016). In this paper, we focus on the type of 3D modelling in which computer graphics are used to (re)construct 'what is not there' as a means of providing a better understanding of, and of generating hypotheses concerning and interpretations of, different datasets, from ancient structures to twentieth-century buildings. Like textual scholarship, 3D modelling requires evaluations of the reliability of different, often incomplete and ambiguous sources, and it gathers evidence that leads to a series of decisions regarding the type, amount, and style of the models to be constructed, depending on their intended uses and audiences.

The 3D scholarship to which we are referring in this paper has been called by many names, from three-dimensional (solid) modelling to virtual reconstructions and 3D computer graphic simulations. These terms have emerged from particular schools of thought, theoretical and methodological traditions, and biases and assumptions. For example, the term Virtual Reconstruction became popular in the 1990s after Reilly (1991) coined the term Virtual Archaeology, while in the mid-2000s the term Virtual Worlds was used to reflect online, multiplayer, avatar-based game approaches (see for example Bell 2008; Nevelsteen 2017), facilitated by platforms such as Second Life, Open Simulator, and Unity. In these three decades, no consistency or consensus emerged concerning the terminology used to describe this type of scholarship; some consistency has only been seen in the association of these terms with the word reconstruction.

The fallacy of the term 'reconstruction' (Taylor 1948; Clark 2010) is ingrained in the practice of many disciplines. For example, in archaeology under the influence of the processual school, emphasis was placed on the objectivity of scientific methods, while in textual editing, according to the American school, particularly within the Bowers/Tanselle approach, the goal of the edition is to restore the text to the author's final intention (Shillingsburg 1999, 29). Reconstruction implies an attempt to bring something back to its original state based on available evidence as its starting point. However, such attempts can never be accurate, as they are interpretations of some past reality or of the documentary evidence, and as such, they reflect present-day social and cultural agendas.

Much as editing texts for digital publication has made editors aware of the constraints of editing for print publication (McGann 1997, 20), coupled with new theoretical approaches which do not favour (re)construction of the text according to what the editor believes the author would have intended, so the possibility of photorealistic rendering of 3D heritage opened up new debates as to the purpose and role of 3D models in the interpretive process (Gillings 2005).
Calling what we model a ‘reconstruction’ which is indistinguishable from real life (as in the case of 3D), or professing to know what a long-dead author wanted (in terms of textual editing), leaves little room to question the process according to which the ‘reconstruction’ was made, the authenticity of the new work that has been created, or the reliability or interpretation of the evidence used in its construction. Therefore, we use the term (re)construction to emphasise the decision-making process and the non-absolutist approach in the construction of the model.

3 3D and textual scholarship: a parallel path

Scholars in both textual editing and 3D (re)constructions have faced similar issues in modelling the textual/material record, and surprisingly, they have employed either the same or similar terminology in confronting these challenges. This is what we refer to as ‘a parallel path’. These issues concern the use of evidence, ways of making the decision-making process visible, and ways of dealing with ambiguity. In this section we outline these parallels, which form the key features of the last section, Towards 3D Digital Scholarly Editions.

3.1 Evidence

Both fields work to reconstruct the text on the basis of existing, albeit imperfect evidence (if the evidence were perfect, there would be no need for editors or editions). In both cases, the researcher is evaluating the evidentiary record, and the further we go back in this record chronologically, the less evidence remains. In many respects, the 3D (re)constructions of ancient spaces are more akin to the textual editing tradition of stemmatics applied to premodern texts, in which the editor reconstructs the original text by working backwards from extant witnesses through a set of relationships expressed as a tree structure, in an effort to come as close as possible to the original text before it was corrupted (or indeed thought to be improved) through the copying process. 3D (re)constructions of sites that exist in fragmentary form, e.g. a prehistoric settlement, also go through a process of evaluation of extant data, resolving, discarding, and harmonising evidence in an effort to understand what might have existed. This can be likened to an interpretation of secondary sources documenting remains in the form of textual records/field notes, photographs, and illustrations, site-specific physical remains, and information coming from other sites which bear some resemblance (temporal or spatial) to the subject under investigation. The goal in both fields is the (re)construction of an artefact (be it a poem or an ancient building) which makes clear to the user where there are gaps in knowledge of the putative original and what methods and evidence were used in order to fill these gaps. On the other hand, 3D (re)constructions based on (nearly) complete evidence, e.g. (re)constructions of a twentieth-century urban battlefield in which old buildings still stand, share more with traditions and methods used to edit nineteenth-century and twentieth-century texts. In both cases, there tends to exist an abundance of evidence, and the researcher adjudicates among extant sources, each of which carries its own authority.
Just as texts can exist in multiple versions, manuscripts, typescripts, and printings, some or all of which might have been authorised by the writer, bringing back in 3D a building or a street, or what happened in that space at a specific point in time (or indeed, over time), may utilise a plethora of evidence, including documentary, oral, and visual. This evidence may be contradictory and fragmentary, and some items of evidence may be more ‘authorised’ or credible than others. The theories underlying the textual tradition of versioning, in which the goal is not to establish a definitive or reading text, but to reconstruct not only a textual history but the underlying view of the nature of the text’s production (Tanselle 1995, 24), might prove a useful foundation for models of 3D (re)constructions in which what is of interest is the potential to make visible multiple states over time or alternative possibilities in interpretation. These ‘textual’ moments can be viewed as snapshots, each providing a unique, equally valid window onto the past (Schreibman 1993, 93). Textual editors have the ability to model these differences in TEI/XML within one apparatus or framework, explicitly denoting regions of both shared and divergent text. Once these regions are marked, they can be analysed and visualised, providing the modeller with a method for recording difference over time and the reader with a standard notation for the decision-making process. While TEI/XML is not a suitable language for 3D, the theoretical framing may well be, as it offers a way to make variation visible within a single model.

3.2 Ambiguity

The Oxford English Dictionary defines ambiguity as the ‘quality of being open to more than one interpretation; inexactness’. It is precisely because of the ambiguous nature of evidence that textual scholarship exists. Revealing, acknowledging, and/or resolving this ambiguity is part of the interpretive process. In the transition from print to digital editing of alphanumeric texts, it has become possible to represent ambiguity more explicitly in a non-notational manner, providing the reader with the documentary evidence to mediate between different textual states (Schreibman 2002, 288–89). There have been lengthy discussions concerning the contention that simply providing the reader with facsimiles of the various witnesses is a form of unediting (Schreibman 2013), i.e. an abdication of the editorial role. On the other hand, modelling the text according to a theory such as genetic editing or versioning to make visible the writing process allows readers to formulate their own sense of the work (or work-in-progress) situated in both time and place (Machan 1994, 303). In other words, the point of this type of editing is not to resolve the ambiguity of the textual record, but to expose it through modelling. Modelling ambiguity has been the subject of much discussion in 3D (re)constructions. While the subjective nature of gathering, selecting, and interpreting evidence exists in both fields, with 3D (re)constructions there are fewer codified ways to express ambiguity (or indeed to consider whether or not it should be expressed) and to make the modelling process as transparent as possible (for a recent discussion see Watterson 2015). Representing ambiguity began to become more pressing in the mid-1990s and especially in the 2000s, as photorealistic models started dominating heritage representation.
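Before turning to the debates this development provoked, the region-marking idea borrowed from textual editing can be made concrete in a minimal sketch. The following is our illustration only, in Python rather than TEI itself, and all names (Region, render, the witness identifiers) are hypothetical; it shows how marking shared and divergent regions lets any single witness, or the variation itself, be recovered from one structure:

from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Region:
    # Text common to all witnesses, or None when the region carries variants.
    shared: Optional[str] = None
    # Witness identifier -> variant reading for divergent regions.
    readings: Dict[str, str] = field(default_factory=dict)

edition = [
    Region(shared="The bath building at "),
    Region(readings={"MS-A": "Caerleon", "MS-B": "Carleon"}),
    Region(shared=" was excavated."),
]

def render(witness: str) -> str:
    # Reconstruct one witness text from the marked shared/divergent regions.
    return "".join(
        r.shared if r.shared is not None else r.readings[witness]
        for r in edition
    )

print(render("MS-A"))  # The bath building at Caerleon was excavated.
print(render("MS-B"))  # The bath building at Carleon was excavated.

The same region-marking logic, applied to model elements rather than spans of text, is what would allow a 3D (re)construction to make alternative states visible within a single model.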
This, in turn, fuelled debates about their misleading nature (Miller and Richards 1995; James 1997; Goodrick and Gillings 2000; Eiteljorg 2000) and the problematic use of the word ‘reconstruction’ (Clark 2010). These concerns gave rise to a series of proof-of-concept implementations that demonstrated intellectual rigour in 3D models and explicated decision-making in the process of their creation as a means of counteracting their problematic photorealistic nature. These implementations included alternative reconstructions, annotations, renderings in different colours, textures, and shadings, as well as ways of activating or deactivating ambiguous features and alternative models that would in turn affect other elements (Kensek 2007). Unlike digital scholarly editing practices, in which there exists a core set of technologies and methods (largely around XML/TEI), to a large extent 3D visualisations utilise bespoke solutions with no sustainable platforms, methodologies, or standards which might lead to the wider adoption of such practices (see for example the Archaeology Data Service Guide to Archiving Virtual Reality Projects (2018): http://guides.archaeologydataservice.ac.uk/g2gp/Vr_6-1; McDonough et al. 2010). The World Wide Web currently hosts a wide variety of DSEs, some going back to the earliest digital archives in the mid-1990s, providing the field with a tradition from which new theories, models, and editions are developed. However, due to their intensive computational nature, most 3D visualisation projects are developed for offline use and in a constantly changing environment in which software and frameworks become obsolete soon after their release. Also, the Internet itself, with constant changes to browsers, does not offer a stable environment for such projects (Ruan and McDonough 2009), providing little opportunity to build a body of knowledge and a practice-based community.

3.3 Transparency

As mentioned above, the creation of 3D models under the influence of photorealism gave birth to a series of debates regarding transparency, documentation of decision-making, standards, and intellectual rigour in the process of (re)construction. Earlier, more schematic work did not carry with it the same calls for transparency, as it did not, however unintentionally, ‘trick’ the viewer into reading reality into the (re)construction. Several projects attempted to establish principles to address this new environment. The best-known is the London Charter (Denard 2009; http://www.londoncharter.org/), developed in 2006 at a meeting of 3D visualisation specialists who came up with a series of principles that would ensure a certain level of standardisation in terms of the creation of documentation, ensuring sustainability and access, and articulating the aims and methods of (re)creation (Denard 2012). For example, principle 4.6, ‘Documentation of Process (Paradata)’, states that: Documentation of the evaluative, analytical, deductive, interpretative and creative decisions made in the course of computer-based visualisation should be disseminated in such a way that the relationship between research sources, implicit knowledge, explicit reasoning, and visualisation-based outcomes can be understood (Denard 2009, 8–9).
While a worthy goal, it is impossible not only to document, but, as was found in editing according to the Greg/Bowers (Bowers 1978) school of copy-text, even to represent each and every editorial decision and the rationale for that decision. This would amount to providing a textual record of not only how each accidental (e.g. punctuation, spelling, word division, which was considered less significant and meaning-making than substantives, e.g. word choice) was handled, but why it was handled that way. In both traditions, this goal becomes even more unobtainable, as it would involve a full-time amanuensis following the researcher (or team of researchers), recording every conversation, saving every version of every document, every state of every electronic file. Even if it were possible to save all this, how could it be presented to the reader so as to make more transparent the decision-making process? This might be more akin to an archive or an assemblage, but even if this could be ordered and catalogued, what it does not necessarily demonstrate is what happens in the space between the evidence and the ideas that inform the decision-making process, tracing a path from evidence to its representation. Despite the fact that several reports and frameworks followed the London Charter,2 none of them tackled the inherent visual illiteracy in reading photorealistic 3D visualisations as texts. There is no scaffolding within which the 3D model is situated to provide the reader with access to its conceptual and methodological underpinnings.

4 Towards 3D scholarly editions

We have argued that Digital Scholarly Editions act as mediators and gatherings of evidence (Gabler 2010, 44), textual, social, and historical, that is read in an increasingly multimodal infrastructure. As knowledge sites, they encompass within the same computational paradigm both the primary text and the evidence that informed the decisions in creating the text, thus providing the community which it serves with a tool for ‘prying problems apart and opening up a new space for the extension of learning’ (Apollon et al. 2014, 5–6). This framework provides the information structures and evidence that make up the edition so that the audience can understand the process behind the creation of the edition and adjudicate its authenticity and reliability. Hence we believe that there exists a case and a rationale for designing a blueprint to link editorial, epistemological, and technical practices in the development of editions of 3D (re)constructions as scholarship in its own right, as assessable assemblages to combat the problem of the vacuous nature of most models: empty sites where the research that went into their creation remains invisible to those outside project teams. While photorealistic models are ever more beautiful to behold, if their raison d’être is not to serve as works of art but as mediators of evidence, then the chain of production that informed their creation needs to be made visible in the same information space as the models.
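As a thought experiment only, the kind of scaffolding argued for here could begin with a record type that links research sources, explicit reasoning, and the model element they produced, in the spirit of principle 4.6. The sketch below is a hedged assumption of ours, not a published schema; every field name and value is hypothetical:

from dataclasses import dataclass
from typing import List

@dataclass
class ParadataRecord:
    element: str        # model element the decision concerns
    sources: List[str]  # research sources consulted
    reasoning: str      # explicit rationale behind the decision
    confidence: str     # e.g., "attested", "analogical", "conjectural"

decision = ParadataRecord(
    element="roof pitch of the bath building",
    sources=["excavation plan", "comparative evidence from dated parallels"],
    reasoning="No direct evidence survives; the pitch is inferred by analogy.",
    confidence="analogical",
)
print(decision)

Even a minimal record of this kind, attached to each element of a model, would trace the path from evidence to representation that an undifferentiated archive cannot.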
We are not advocating that a DSE of a 3D (re)construction be thought of as a defined object, but rather as a methodological field in which a set of codes imposes a prefiguring frame on the reality being created: not only the technological codes that govern the creation of this reality, but also the social, theoretical, and historical codes that its makers adopt in its creation (Barthes 1977). The construction of such an edition entails building an intertextual network composed of the 3D model along with its accompanying annotation and apparatus, thus providing a base from which the reader can actively engage in the knowledge creation process. This approach is being taken in the redevelopment of the Contested Memories: The Battle of Mount Street Bridge (BMSB) project, a spatiotemporal (re)construction of one of the most important battles of the 1916 Easter Rising, fought between a small group of Irish Volunteers (17) and a much larger force (1750) of British soldiers (https://mountstreet1916.ie/; Papadopoulos and Schreibman in press). The apparatus being explored includes a narrative-driven camera with a voice-over providing an evidence-informed interpretation of how the battle unfolded; audio files that replicate the types of sounds that accompanied the battle; and a user interface to display in-world textual annotations about the combatants and other participants (such as the medical personnel), key buildings and events, the types and makes of guns used, and the sources for the (re)construction (from the methods used to the primary sources consulted; see Fig. 1 for a mock-up). The project has also experimented (see Fig. 2) with animated agents to help the reader better visualise troop movements as the battle unfolded.

Fig. 1 The new user interface of the standalone Unity 3D environment includes an annotation panel that provides contextual information (text and/or multimedia) activated by hotspots, as well as a timeline for exploring the battle temporally.

Fig. 2 A total of 252 AI agents (36 for each of the seven companies of British soldiers) have been included in a WebGL version of the 3D model.

Footnote 2: Pletinckx (2007) and the Network of Excellence, Epoch, developed the Interpretation Management KnowHow booklet, which explained different methods, including source assessment and correlation, hypotheses trees, and updating, to ensure scholarly transparency in 3D visualisation. This information and the comments, decisions, and interpretive processes comprise the paradata of the project (Baker 2012, 163–176), which allows a clearer interpretation management and understanding of the relationships between primary data and the outcome. More recently, the Seville Principles (Lopez-Menchero and Grande 2011) used the London Charter as a theoretical framework to develop a series of principles to increase the applicability of the latter to 3D visualisations of archaeological heritage. Other works have also elaborated on how scientific reasoning can become visible (Hermon 2012; Niccolucci 2012).

Embedding the iconography of 3D (re)constructions into what we might broadly describe as scholarly editing practice opens up new vistas for scholarship and the communication of the results of that scholarship within spatio-temporal environments that are immersive and multisensorial.
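A hedged sketch of the kind of data model such an apparatus implies is given below. It is our illustration, not the actual BMSB codebase; all names, coordinates, and identifiers (including the witness-statement label) are hypothetical. It shows the two ideas the interface combines: hotspots positioned in the scene that activate annotations, and a timeline that governs which hotspots are visible:

from dataclasses import dataclass, field

@dataclass
class Annotation:
    title: str
    body: str
    media: list = field(default_factory=list)    # e.g., audio or image references
    sources: list = field(default_factory=list)  # evidence behind the annotation

@dataclass
class Hotspot:
    position: tuple    # (x, y, z) in the 3D scene
    time_range: tuple  # (start, end) on the battle timeline, ISO timestamps
    annotation: Annotation

clanwilliam = Hotspot(
    position=(12.0, 0.0, -3.5),
    time_range=("1916-04-26T12:00", "1916-04-26T20:00"),
    annotation=Annotation(
        title="Clanwilliam House",
        body="Key Volunteer position overlooking the bridge.",
        sources=["witness statement WS-0157 (hypothetical identifier)"],
    ),
)

def visible(h: Hotspot, t: str) -> bool:
    # Show a hotspot only within its span on the timeline
    # (ISO timestamps compare correctly as strings).
    return h.time_range[0] <= t <= h.time_range[1]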
If the goal of the modelled dataset is to create its own ecosystem to provoke and encourage evolving thought about the materials, aesthetics, and cultures it simulates (Schreibman 2013), then the scholarly editing framework we have outlined here, we believe, fulfils that goal. The real difference between the two domains is not the technologies utilised in digital production, but the fact that textual editors have a long history of models, theories, and paradigms of the documentation and display of text and paratext (as well as models, theories, and paradigms for the fashioning of arguments), and 3D (re)constructions do not. This article presents a rationale for the development of such as framework. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and repro- duction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. References Apollon, D., Bélisle, C., & Régnier, P. (2014). Digital critical editions. Urbana: University of Illinois Press. Archiving Virtual Reality Projects. (2018) Archaeology Data Service / Digital Antiquity. Guides to Good Practice. Retrieved from http://guides.archaeologydataservice.ac.uk/g2gp/Vr_6-1. Accessed 2018, March. Baker, D. (2012). Defining Paradata in heritage visualisation. In A. Bentkowska-Kafel, H. Denard, & D. Baker (Eds.), Paradata and transparency in virtual heritage (pp. 163–176). Fanrham: Ashgate. Barthes, R. (1977). Image-music-text. Essays selected and translated by Stephen heath. London: Fontana. Bell, M. (2008). Toward a definition of Bvirtual worlds^. Journal of Virtual Worlds Research 1(1). Retrieved from https://journals.tdl.org/jvwr/index.php/jvwr/article/view/283/237. Accessed 2018, March 6. Bowers, F. (1978). Greg’s ‘rationale of copy-text’ revisited. Studies in Bibliography, 31, 90–161. Clark, J. T. (2010). The fallacy of reconstruction. In Forte, M. (Ed.), Cyber-Archaeology (pp. 63–73). BAR International Series S2177. Oxford: Archaeopress. Dawson P., Levy, R., Gardner, D. & Walls, M. (2007). Simulating the behavior of light inside Arctic dwellings: Implications for assessing the role of vision in task performance. World Archaeology 39, pp. 17–55. Retrieved from https://doi.org/10.1080/00438240601136397. Accessed 2018, March 6. Deleuze, G., & Guattari, F. (1987). A thousand plateaus: Capitalism and schizophrenia. London: University of Minnesota Press. Denard, H. (ed). (2009). The London charter. Retrieved from http://www.londoncharter.org/. Accessed 2018, March 10. Denard, H. (2012). A new introduction to the London charter. In A. Bentkowska-Kafel, D. Baker, & H. Denard (Eds.), Paradata and transparency in virtual heritage (pp. 57–71). Brookfield: Ashgate. Derrida, J. & Prenowitz, E. (1995). Archive fever: A Freudian impression. Diacritics 25(2), 9–63. Retrieved from http://www.jstor.org/stable/465144. Accessed 2018, March 6. Driscoll, M., & Pierazzo, E. (2016). Digital scholarly editing: Theories and practices. Cambridge: Open Book Publishers. Earl, G., & Wheatley, D. W. (2002). Virtual reconstruction and the interpretative process: A case-study from Avebury. In D. W. Wheatley, G. Earl, & S. Poppy (Eds.), Contemporary themes in archaeological computing (pp. 5–15). Oxford: Oxbow. Eiteljorg, H. (2000). The compelling computer image – A double-edged sword. 
Internet Archaeology 8. Retrieved from http://intarch.ac.uk/journal/issue8/eiteljorg_index.html. Accessed 2018, March 6. Fischer L. (2012). Visualizing Williamsburg: Modeling an early American City in 2D and 3D. In Niccolucci, F., Dellepiane, M., Pena Serna, S., Rushmeier, H. & Van Gool, L. (Eds.), VAST11: International Textuality in 3D: three-dimensional (re)constructions as digital scholarly editions 231 http://guides.archaeologydataservice.ac.uk/g2gp/Vr_6-1 https://journals.tdl.org/jvwr/index.php/jvwr/article/view/283/237 https://doi.org/10.1080/00438240601136397 http://www.londoncharter.org/ http://www.jstor.org/stable/465144 http://intarch.ac.uk/journal/issue8/eiteljorg_index.html Symposium on Virtual Reality, Archaeology and Intelligent Cultural Heritage (pp. 77–80. The Eurographics Association, 2011. Retrieved from https://doi.org/10.2312/PE/VAST/VAST11S/077-080. Accessed 2018, March 6. Frischer B. & Stinson, P. (2016). The importance of scientific authentication and a formal visual language in virtual models of archaeological sites: The case of the house of Augustus and villa of the mysteries. In Silberman, N. A. & Callebaut, D. (Eds.), Proceedings of the Interpreting the Past: Heritage, New Technologies and Local Development Conference on Authenticity, Intellectual Integrity and Sustainable Development of the Public Presentation of Archaeological and Historical sites and Landscapes (pp. 49– 83). Brussels: Flemish Heritage Institute. Retrieved from http://www.iath.virginia. edu/images/pdfs/frischer_stinson.pdf. Accessed 2018, March 6. Gabler, H.W. (2010). Theorizing the digital scholarly edition. Literature Compass 7, pp. 43–56. Retrieved from https://doi.org/10.1111/j.1741-4113.2009.00675.x. Accessed 2018, March 6. Gillings, M. (2005). The real, the virtually real, and the hyperreal: The role of VR in archaeology. In Smiles, S. & Moser, S. (Eds.), Envisioning the past: Archaeology and the image. Oxford: Blackwell Publishing. https://doi.org/10.1002/9780470774830.ch12. Accessed 2018, March 6. González-Tennant, E. (2015). Resurrecting Rosewood: new heritage as applied visual anthropology. In A. Gubrium, K. Harper, & M. Otanez (Eds.), Participatory visual and digital research in action (pp. 163– 180). Walnut Creek: Left Coast Press. Goodrick, G., & Gillings, M. (2000). Constructs, simulations and hyperreal worlds: The role of virtual reality (VR) in archaeological research. In G. Lock & K. Brown (Eds.), On the theory and practice of archaeological computing (pp. 41–58). Oxford: Oxford University Press. Hermon, S. (2012). Scientific method, Chaîne Opératoire and visualisation: 3D modelling as a research tool in archaeology. In A. Bentkowska-Kafel, H. Denard, & D. Baker (Eds.), Paradata and transparency in virtual heritage (pp. 13–22). Farnham: Ashgate. Hermon, S. & Fabian, P. (2002). Virtual reconstruction of archaeological sites, some scientific considerations: Avdat Roman military camp as a case-study. In Niccolucci, F. (Ed.), Virtual archaeology: Proceedings of the VAST Euroconference (pp. 103–108). Arezzo 2000, November 24–25. Oxford. James, S. (1997). Drawing inferences. In L. B. Molyneaux (Ed.), The cultural life of images: Visual representation in archaeology (pp. 22–48). Great Britain: Routledge. Kensek, K. (2007). Survey of methods for showing missing data, multiple alternatives, and uncertainty in reconstructions. CSA Newsletter 19 (3). Retrieved from http://csanet.org/newsletter/winter07/nlw0702. html. Accessed 2018, March 6. Lopez-Menchero, V.M. & Grande, A. (2011). 
Principles of Seville. International Principles of Virtual Archaeology. Forum Internacional de Arqueología Virtual. Retrieved from http://www. arqueologiavirtual.com/carta/wp-content/uploads/2012/03/BORRADOR-FINAL-FINAL-DRAFT.pdf. Accessed 2018, March 6. Machan, T. W. (1994). Chaucer’s poetry, versioning and hypertext. Philological Quarterly, 73(3), 299–316. McDonough, J., Olendorf, R., Kirschenbaum, M., et al. (2010). Preserving virtual worlds final report. Illinois: IDEALS: Illinois digital environment for access to learning and scholarship. Retrieved from http://hdl. handle.net/2142/17097. Accessed 2018, March 6. McGann, J. (1997). A rationale of hypertext. In K. Sutherland (Ed.), Electronic text: Investigations into method and theory (pp. 19–46). Oxford: Clarendon Press. McKenzie, D. F. (1999). Bibliography and the sociology of texts. Cambridge: Cambridge University Press. Miller, P., & Richards, J. (1995). The good, the bad, and the downright misleading: Archaeological adoption of computer visualization. In J. Huggett & N. Ryan (Eds.), Computer applications and quantitative methods in archaeology 1994 (pp. 19–22). Oxford: BAR International Series 600. Nevelsteen, K. (2017). Virtual world, defined from a technological perspective, and applied to video games, mixed reality and the Metaverse. Computer Animation & Virtual Worlds. Retrieved from https://doi. org/10.1002/cav.1752. Accessed 2018, March 12. Niccolucci, F. (2012). Setting standards for 3D visualisation of cultural heritage in Europe and beyond. In A. Bentkowska-Kafel, H. Denard, & D. Baker (Eds.), Paradata and transparency in virtual heritage (pp. 23–36). Farnham: Ashgate. Oetelaar, T. (2016). CFD, thermal environments, and cultural heritage: Two case studies of Roman baths. In IEEE 16th International Conference on Environment and Electrical Engineering (EEEIC), Florence, Italy: IEEE. Retrieved from https://doi.org/10.1109/EEEIC.2016.7555484. Accessed 2018, March 6. Paliou, E., Wheatley, D., & Earl, G. (2011). Three-dimensional visibility analysis of architectural spaces: Iconography and visibility of the wall paintings of Xeste 3 (late bronze age Akrotiri). Journal of Archaeological Science, 38(2), 375–386. https://doi.org/10.1016/j.jas.2010.09.016 Accessed 2018, March 6. 232 S. Schreibman, C. Papadopoulos https://doi.org/10.2312/PE/VAST/VAST11S/077-080 http://www.iath.virginia.edu/images/pdfs/frischer_stinson.pdf http://www.iath.virginia.edu/images/pdfs/frischer_stinson.pdf https://doi.org/10.1111/j.1741-4113.2009.00675.x https://doi.org/10.1002/9780470774830.ch12 http://csanet.org/newsletter/winter07/nlw0702.html http://csanet.org/newsletter/winter07/nlw0702.html http://www.arqueologiavirtual.com/carta/wp-content/uploads/2012/03/BORRADOR-FINAL-FINAL-DRAFT.pdf http://www.arqueologiavirtual.com/carta/wp-content/uploads/2012/03/BORRADOR-FINAL-FINAL-DRAFT.pdf http://hdl.handle.net/2142/17097 http://hdl.handle.net/2142/17097 https://doi.org/10.1002/cav.1752 https://doi.org/10.1002/cav.1752 https://doi.org/10.1109/EEEIC.2016.7555484 https://doi.org/10.1016/j.jas.2010.09.016 Papadopoulos, C. and Schreibman, S. (in press). Towards 3D Scholarly Editions: The Battle of Mount Street Bridge. Digital Humanities Quarterly. Pierazzo, E. (2015). Digital scholarly editing: Theories, models and methods. New York: Ashgate. Pletinckx, D. (2007). Interpretation management: How to make sustainable visualisations of the past, Epoch. Retrieved from http://media.digitalheritage.se/2010/07/Interpretation_Managment_TII.pdf. Accessed 2018, March 6. 
Reilly, P. (1991). Towards a virtual archaeology. In Rahtz, S. & Lockyear, K. (Eds.), CAA90. Computer Applications and Quantitative Methods in Archaeology 1990. British archaeological reports international series 565 (pp. 132–139). Oxford: Tempus Reparatum. Reilly, P., Walter, A., & Todd, S. (2016). Rediscovering and Modernising the digital old minster of Winchester. Digital Applications in Archaeology and Cultural Heritage, 3(2), 33–41. https://doi.org/10.1016/j. daach.2016.04.001 Accessed 2018, March 12. Ruan, J. & McDonough, P. (2009). Preserving born-digital cultural heritage in virtual world. In IEEE International Symposium on IT in Medicine & Education (pp. 745–748). Jina, China, IEEEA, 2009, 745–748. Retrieved from https://doi.org/10.1109/ITIME.2009.5236324. Accessed 2018, March 6. Sahle, P. (2016). What is a scholarly digital edition. In M. Driscoll & E. Pierazzo (Eds.), Digital scholarly editing: Theories and practices (pp. 20–39). Cambridge: Open Book Publishers. Schreibman, S. (1993). Re-envisioning versioning: a scholar’s toolkit. In A. Ciula & F. Stella (Eds.), Digital philology and medieval texts (pp. 93–102). Pisa: Pacini editore. Schreibman, S. (2002). Computer-mediated texts and textuality. Computers and the Humanities 36: pp. 283– 293. Pisa: Pacini Editore. Schreibman, S. (2013). Digital scholarly editing. In Price, K. & Siemens, R. (Eds.), Literary studies in a digital age: A methodological primer. MLA Commons. Retrieved from https://dlsanthology.mla.hcommons. org/digital-scholarly-editing/. Accessed 2018, March 6. Luis M. Sequiera & Morgado, L. C. (2013). Virtual archaeology in second life and OpenSimulator. Journal of Virtual Worlds Research 6(1), pp. 1–16. Retrieved from https://journals.tdl.org/jvwr/index. php/jvwr/article/view/7047/6310. Accessed 2018, March 6. Shillingsburg, P. (1999). Scholarly editing in the computer age: Theory and practice. Ann Arbor: the University of Michigan Press. Smith, I. (1985). Romans make a high-tech comeback: Sid and Dora’s bath show pulls in the crowd. Computing, June, 7–8. Sutherland, K. (1997). Electronic text: Investigations in method and theory. Oxford: Clarendon Press. Tanselle, G. T. (1995). The varieties of scholarly editing. In D. C. Greetham (Ed.), Scholarly editing: A guide to research (pp. 9–32). New York: Modern Language Association. Taylor, W. W. (1948). A study of archeology. Memoir no 69. American Anthropologist, 50(3), Pt 2. Watterson, A. (2015). Beyond digital dwelling: Re-thinking interpretive visualisation in archaeology. Open Archaeology 1 (1). Retrieved from https://doi.org/10.1515/opar-2015-0006. Accessed 2018, March 12. 
Textuality in 3D: three-dimensional (re)constructions as digital scholarly editions 233 http://media.digitalheritage.se/2010/07/Interpretation_Managment_TII.pdf https://doi.org/10.1016/j.daach.2016.04.001 https://doi.org/10.1016/j.daach.2016.04.001 https://doi.org/10.1109/ITIME.2009.5236324 https://dlsanthology.mla.hcommons.org/digital-scholarly-editing/ https://dlsanthology.mla.hcommons.org/digital-scholarly-editing/ https://journals.tdl.org/jvwr/index.php/jvwr/article/view/7047/6310 https://journals.tdl.org/jvwr/index.php/jvwr/article/view/7047/6310 https://doi.org/10.1515/opar-2015-0006 Textuality in 3D: three-dimensional (re)constructions as digital scholarly editions Abstract Introduction Modelling in 3D: an introduction 3D and textual scholarship: a parallel path Evidence Ambiguity Transparency Towards 3D scholarly editions References work_3y22ufoqi5euja4cnjqg52t4aa ---- Preventing Work-Related Musculoskeletal Disorders in Manufacturing by Digital Human Modeling International Journal of Environmental Research and Public Health Article Preventing Work-Related Musculoskeletal Disorders in Manufacturing by Digital Human Modeling Jerzy Grobelny and Rafał Michalski * Faculty of Computer Science and Management, Wrocław University of Science and Technology, 50-370 Wrocław, Poland; jerzy.grobelny@pwr.edu.pl * Correspondence: rafal.michalski@pwr.edu.pl Received: 12 October 2020; Accepted: 20 November 2020; Published: 22 November 2020 ����������������� Abstract: This research concerns the workplace design methodology, involving digital human models, that prevents work-related musculoskeletal disorders (WMSDs). We propose an approach that, in conjunction with one of the classic WMSD risk assessment methods, allows one to simplify simulations in a three-dimensional digital environment. Two real-life workstations from a manufacturing industry were modelled in a 3D Studio Max environment by means of an Anthropos ErgoMax system. A number of simulations show that, for the examined cases, classic boundary mannequins’ approaches can be replaced by using 50th percentile of a population individual, with a minimal impact on the WMSD risk. Although, the finding might not be suitable in all situations, it should be considered, especially where compromise solutions are being sought due to other criteria. Keywords: occupational safety and health; musculoskeletal disorders; digital human models; anthropometry; design 1. Introduction Work-related musculoskeletal disorders (WMSDs) refer to diminishing the functionality or damaging of such human body structures as muscles, joints, tendons, ligaments, nerves, cartilage, bones as well as the blood circulation system. These impairments result mainly from such performed work that requires repetitive manual activities, transporting heavy loads manually, excessive energy expenditure, prolonged static forced body posture, etc., and immediate work environment conditions [1] (p. 2), [2] (p. 12). It is well known that inappropriate body posture is one of the causes of musculoskeletal disorders. A particularly awkward body posture taken while carrying heavy objects can cause a serious problem. Similarly, minor postural inconveniences repeated hundreds or thousands of times may also deteriorate human health [1] (p. 4). Such ailments can turn into medical pathological problems and physical changes in the human locomotor system. 
Usually, WMSDs cover both specific medical diseases, such as tendonitis, tenosynovitis, or carpal tunnel syndrome, and pain felt in various anatomical structures that is not clearly defined in clinical terms, e.g., neck muscular tensions or non-specific lower back pain [2] (pp. 28–29). These types of ailments and medical conditions are the most common health problem in the European Union. As shown in the report of the European Agency for Safety and Health at Work [2] (pp. 13–14), among work-related health problems, musculoskeletal disorders are reported by about 60 percent of all workers in EU countries (data from 2010 and 2015). The highest rates of complaints were recorded in such occupational categories as agriculture, forestry and fishing (69%), machine operators in industry and assemblers (66%), and craftsmen (65%). The most numerous groups among those listed are operators and assemblers. For example, in Germany, the EU's largest economy, manufacturing companies generate 45% of national income. Major industrial companies of a production nature (e.g., electrotechnical, mechanical, automotive) are characterized by the presence of large areas of assembly workstations. The constant trend towards product complexity and diversity, as well as the shortening of the product life cycle and the reduction of product batches, favors the use of manual assembly [3]. The health consequences of professional activity in this group, along with production machine operators, across the EU [2] (p. 15) include mainly back pain (55%), pain in the muscles of the arms, shoulders, neck, and upper limbs (47%), as well as in the lower limb muscles (33%). For many years, the efforts of researchers analyzing the WMSD area have been focused on ways to minimize them. General knowledge of the mechanisms and factors generating ailments [1] (pp. 13–33) allowed, among other things, for the development of a number of methods for risk occurrence and identification. Thanks to this, it was possible to design and arrange workspaces in a way that minimizes the WMSD risk. Methods supporting the prevention of WMSDs have been in use for many years. Among the most popular are the Ovako Working Posture Analyzing System (OWAS) [4,5], Rapid Upper Limbs Assessment (RULA) [6,7], and Rapid Entire Body Assessment (REBA) [8,9]. The OWAS method is intended mainly for the risk assessment of WMSDs in physical work. It analyzes the body posture defined by the relative positions of its segments and force loads and assigns them to various risk categories. The RULA tool is focused on the analysis of the upper limbs in a sedentary posture. It is similar to OWAS, but it more precisely distinguishes between the positions of the hands while performing tasks. The REBA approach combines the perspectives of RULA and OWAS and identifies the risk of WMSDs where the static work load is predominant [8]. The general idea of the WMSD risk appraisal consists of assessing the deviations of individual body segment angles from their natural, neutral values. This evaluation approach is justified, among others, by precise physiological studies and research on perceived discomfort and fatigue.
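The deviation-based scoring idea common to these methods can be illustrated with a minimal sketch; the thresholds below are illustrative placeholders of ours, not the published OWAS, RULA, or REBA values, which follow later in the paper:

def deviation_code(angle_deg: float, neutral_deg: float,
                   thresholds=(20.0, 45.0, 90.0)) -> int:
    # Return 1 (near neutral) .. 4 (extreme) from the absolute angular
    # deviation of a body segment from its neutral position.
    deviation = abs(angle_deg - neutral_deg)
    code = 1
    for t in thresholds:
        if deviation > t:
            code += 1
    return code

print(deviation_code(30.0, 0.0))   # moderate flexion -> 2
print(deviation_code(100.0, 0.0))  # extreme flexion  -> 4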
Aaras et al. [10] document significant relations between the magnitude of the hand segments' deviations from their neutral positions and the physical intensity of the load on muscles and tendons. Investigations regarding the subjective perception of such body postures were undertaken by a number of researchers. For instance, Corlett and Bishop [11] examined welders' discomfort and pain located in specific body segments. Drury and Coury [12], in turn, developed a methodology for evaluating overall comfort while sitting in a chair. Bhatnager et al. [13] associated poorer work performance with greater perceived discomfort, whereas Kee [14] took advantage of the subjects' perceived discomfort to automatically generate a three-dimensional isocomfort workspace. Traditional procedures of using the cited methods of assessing body postures most often involve observing and documenting body segment positions by taking photos or making videos of typical activities. On the basis of such data, the angles between the body segments analyzed in a specific approach are determined more or less precisely. Then, values of the workload and/or WMSD risk indicators are determined in accordance with the developed procedures. Usually, these approaches also suggest the appropriate type of intervention when unfavorable results are obtained. As a result, corrective actions may include redesigning tools, rearranging spatial relations, or changing the organization of work processes. Case studies for such analyses and interventions are described, for example, in [15], where the authors examined postural behavior while performing repetitive tasks by using photographs and activity sampling techniques. Priya et al. [16] also examined body postures by taking pictures and using the Corlett and Bishop methodology [11], while Gómez-Galán et al. [17] used pictures for analyzing postural load during melon cultivation. There are important limitations of this type of method. It is rather difficult to precisely determine the angles that characterize the body posture. Furthermore, the investigator might have practical problems in achieving anthropometric representativeness of the surveyed persons (participants). Most often, simply an employee currently working in the studied environment is being examined. In light of the increasing incorporation of females into the industrial workforce, anthropometric analyses involving diverse populations are especially important. They may include not only typical data but also specific anthropometric proportions and body shapes. Undoubtedly, the practical inconveniences can be overcome, and special modelling needs met, by using modern computer systems supporting 3D design together with the existing digital human models (DHMs). The main research goal is to examine the possibility of simplifying, in a specific context, the workplace design methodology involving digital human models. We propose an approach that allows one to replace classic boundary mannequins with the use of the 50th percentile individual, with a minimal impact on the WMSD risk level. Although the finding might not be suitable in all situations, it should be considered, especially where compromise solutions are being sought due to other criteria.
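The photographic procedure described above ultimately reduces to measuring angles between body-segment vectors defined by digitised landmarks, which also shows why its precision depends on landmark placement. A minimal sketch (the pixel coordinates are made up for illustration):

import math

def segment_angle(a, b, c):
    # Angle at joint b (in degrees) formed by points a-b-c,
    # e.g., shoulder-elbow-wrist digitised from a photograph.
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    return math.degrees(math.acos(dot / (n1 * n2)))

shoulder, elbow, wrist = (420, 310), (460, 420), (560, 450)
print(round(segment_angle(shoulder, elbow, wrist), 1))  # elbow flexion angle

Shifting any landmark by a few pixels changes the computed angle, which is one source of the imprecision noted above; a digital environment removes this step entirely, since segment angles are read directly from the mannequin.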
2. Digital Human Models

The concept of DHMs and its first implementations were created in the late 1960s. The basic idea of taking into account human body properties and limitations as fully as possible while designing workstations in CAD systems was, and still is, quite obvious. Including these features in the virtual space before the physical project is created allows for conducting tests and making appropriate adjustments very fast and at a comparatively low cost. The possible changes are limited only by the creativity of the designer. Of course, a necessary condition for the success of such an approach is the appropriate construction of virtual human body models. The digital 3D mannequins should statistically correctly represent real populations in terms of anthropometric features as well as biomechanical and physiological capabilities. The effectiveness and efficiency of such virtual analyses can be increased by the automatic generation of various ergonomic assessments incorporated in software that supports DHMs. They may include, for instance, mechanical workload calculations, approximations of the level of postural discomfort, or the degree of thermal comfort. The development of the concept and implementation of DHMs has historically been twofold [18]. In the years 1960–1990, the parallel trends included computer systems that were primarily designed to support static anthropometric analyses and those meant mainly for studies of dynamic processes. Within the first area, started already in the early 1960s, the SAMMIE [19–22], Apolin [23,24], and Anthropos [25–28] systems were created, among others. Research on analyses of dynamic processes involving human participation resulted in the development of such programs as CALSPAN 3D CSV [29], ADAMS [30,31], or MADYMO 3D [32–34]. Systems of this type were mainly aimed at analyzing crash tests of virtually designed vehicles. Since the 1990s, one can observe the trend of integrating both directions within complex systems and incorporating them into professional CAD software, e.g., Apolinex [35–37], Human [38], or 3DSSPP/AutoCAD [39,40]. Reviews of these earlier solutions can be found, e.g., in [41,42]. The best-known modern applications of this type, i.e., JACK [43] (now part of the Tecnomatix software [44]), RAMSIS [45,46], SAFEWORK [47] (now part of the DELMIA 3DExperience software [48]), or Santos [49,50], are constantly being improved. New features facilitate, e.g., analyses with sophisticated methods of dynamic load assessment or the support of psychophysiological evaluations by artificial intelligence [50]. In studies of postural loads and WMSD risks, DHM software packages offer a wide range of multidimensional analyses of processes and workstations. Some of the modern implementations include modules that automatically calculate classic assessments of postural loads (RULA, REBA) or postural discomfort indicators based on empirical formulas. For conducting research and analyses of this type, older systems that are not developed further are still in use (e.g., SAMMIE or Anthropos). Most of them, apart from representing the anthropometric features of many populations, have built-in WMSD risk analysis tools and, moreover, they are integrated or cooperate with popular CAD systems. A considerable advantage of these programs is also a relatively simple user interface. The simplicity results not only from many years of experience in their applications, but also from a much smaller range of functionalities compared to solutions aimed at complex dynamic analyses such as JACK, RAMSIS, or DELMIA.
3. Applications of DHM

The use of any DHM system for the ergonomic analysis of tools, workstations, or human work processes in view of potential threats such as discomfort or inconvenience usually consists of:

1. Creating or recreating a work environment in a virtual space.
2. Insertion of a human body model (dummy) which is appropriate in terms of anthropometric features.
3. Simulating body posture during the most frequently performed work tasks.
4. Generating workload or discomfort assessments, observing potential inconveniences, e.g., related to the field of view, ranges, etc., and performing the risk assessment of WMSDs.
5. Correction of the workstation and its environment aimed at reducing the identified potential threats and removing inconveniences.

Any population representation can be used in this general procedure. In special cases, digital mannequins representing the anthropometric characteristics of specific people intended to work in a given environment may be used. Most often, however, ergonomic design consists in ensuring that geometrical features are matched to the potential population of workers in the range from the 5th to the 95th percentile of their body dimensions. For example, Deros et al. [51] applied it to assembly line workstation design, Gragg et al. [52] to the virtual vehicle cab, and Michalski and Grobelny to designing emergency-medical-service helicopter interiors [53]. In similar situations, the designer should predict the appropriate ranges of adjustment of work-related environmental components or, when necessary, look for compromise solutions for the studied population. In the first case, the standard approach is to use mannequins representing the 5th and 95th percentiles of body dimensions. Most often the body height is applied, but in specific situations, as in the analysis of arm ranges, individual body segments can also be used. The compromise solution usually involves an average individual, that is, a human model with anthropometric parameters reflecting the 50th percentile of the given population [54–56]. Various types of digital human models were applied for the ergonomic assessment and design of workplaces in different areas. A relatively substantial number of studies were performed within automotive manufacturing, for instance, examining automotive assembly tasks [40], driver's seat adjustment ranges [52], driver's workplace design [57], reach envelopes in the vehicle workspace [58], or, lately, statistical approaches for predicting postures [59]. Software enabling preproduction analyses of this kind was also used in the aviation industry, e.g., for emergency-medical-service helicopters [53] and other digital human modeling applications in aviation [60,61], in the medical field, e.g., in a surgical ward [62], or while designing for disabled or elderly people, e.g., [63]. For a review, refer to [64]. Among applications concerned with manufacturing, a wide variety of positions were investigated. For example, Grobelny et al. [65] examined painters, fitters, polishers, pressers, technicians, forklift truck operators, and stockroom deliverers; Schall et al. [66] focused on manual material handling by means of transfer carts and performing tasks such as window and door construction; Peruzzini et al. [67] examined external and internal pipe grinding, cleaning, and ovalization control, whereas Zhang et al. [68] investigated welders. Studies directly involving assembly works were conducted, among others, by [69–71].
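As a worked illustration of the boundary-mannequin convention discussed above, the design limits follow directly from the normality assumption: given a population mean and standard deviation, the 5th, 50th, and 95th percentile statures are simple quantiles. The figures below are illustrative assumptions of ours, not the data used in this study:

from statistics import NormalDist

stature = {"women": NormalDist(mu=1620, sigma=62),   # assumed values, in mm
           "men":   NormalDist(mu=1765, sigma=68)}   # assumed values, in mm

p5_woman = stature["women"].inv_cdf(0.05)  # lower boundary mannequin
p95_man  = stature["men"].inv_cdf(0.95)    # upper boundary mannequin
p50_man  = stature["men"].inv_cdf(0.50)    # compromise (average) individual

print(f"5th percentile woman: {p5_woman:.0f} mm")   # about 1518 mm
print(f"95th percentile man:  {p95_man:.0f} mm")    # about 1877 mm
print(f"50th percentile man:  {p50_man:.0f} mm")    # 1765 mm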
The present study may be treated as a continuation of the trend related to these investigations. A comprehensive review of the applications and trends of DHM systems in the manufacturing industry was provided by Zhu et al. [72]. The hardware and software related to this area were, in turn, reviewed by Mgbemena et al. [73].

4. Case Studies

4.1. Material and Method

The study presented in this paper covers two real-life manual assembly workstations existing in a Polish branch of an international company. The company produces, among other things, internal and external mirrors for various types of cars. The enterprise has operated for dozens of years and is present in 16 countries worldwide. The Polish branch employs more than 500 workers.

4.1.1. Workstations Characteristics

Both workplaces are operated alternately by men and women. The construction of the stands does not allow for the adjustment of the position of its components. Therefore, the current research is focused on determining the most important parameters related to the location of individual movable items of the work environment that will result in the lowest risk of WMSDs for the entire population of potential employees. The following assembly workstations were investigated:

(a) The station for manual positioning and fixing of elements inside the mirror body.
(b) The station for fixing the mirror's components with a pneumatic screwdriver.

Tasks performed on station (a) include manual operations of connecting structural and electrotechnical elements of the mirror with the plastic body. The mirror body is placed on a special stand fixed on the work surface, and the individual assembly items are placed in containers behind the work surface. In station (b), the employee places the module completed in station (a) inside a special holder and tightens, in succession, several screws securing the mirror body parts. Figure 1 shows a 3D model of both stations in a digital 3D space, where all dimensional relationships of the workstation environment are kept.
Figure 1. Examined workstations (a) and (b) and basic work posture configurations for the 5th percentile woman and the 95th percentile man while performing basic tasks. The right bottom window presents the field of view of the human model at workstation (a).

4.1.2. Applied Methodology

The performed analyses were carried out in the Anthropos ErgoMAX system (ver. 6.0.2, HS Group, Kaiserslautern, Germany), which operates within the 3D Studio Max (ver. 6.0, Autodesk, Inc., San Rafael, CA, USA) virtual environment. In the first step, simplified models of the examined workplaces and their equipment were prepared in the 3D digital environment. These models precisely mapped the essential dimensions of the key workspace components. General three-dimensional contours were employed to represent work tools and objects. Figure 1 also illustrates the idea of the simulation research presented in this paper. Digital mannequins representing the appropriate dimensions of the population were generated by the Anthropos software and placed at the virtual stands. Next, the specific body postures taken by employees while performing basic working activities were simulated. Two animation functionalities offered by the Anthropos system were employed for this purpose: the inverse kinematics component and the module for directly setting the angles in the joints that connect body segments. Taking advantage of these precise data and the REBA methodology, the risk level of WMSDs was determined for both examined workplaces. The simulations were performed separately for the body dimensions of the 5th and 95th percentiles of the Eastern European population. In the second stage, a procedure for correcting the height of the work surface was proposed. It was aimed at reducing the risk of WMSDs. Detailed research steps are described in the next section. The workstations modeled in the first stage of the research, along with the animated mannequins, allowed for the simulation of basic working positions and their evaluation by the REBA method [8,74]. Such an assessment consists of assigning appropriate codes, represented by natural numbers, to the positions of key body segments. Two groups, called A and B, are distinguished. Group A includes the torso, neck, and legs, whereas group B comprises the arms, forearms, and hands. The trunk movements are divided into four groups depending on flexion or extension, the neck movements are categorized into two groups in relation to movement angles, and leg positions are likewise assessed in two groups. When it comes to category B, upper arm positions are evaluated according to four different classes, whereas lower arm and wrist movements include two groups.
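A small container for the codes just described can make the group structure explicit. The sketch below is ours; the admissible ranges are inferred from the description above and omit the corrections the full REBA scheme adds (such as +1 adjustments), so it only validates inputs while the published tables do the actual scoring:

from dataclasses import dataclass

RANGES = {"trunk": (1, 4), "neck": (1, 2), "legs": (1, 2),
          "upper_arm": (1, 4), "lower_arm": (1, 2), "wrist": (1, 2)}

@dataclass
class PostureCodes:
    trunk: int
    neck: int
    legs: int
    upper_arm: int
    lower_arm: int
    wrist: int

    def __post_init__(self):
        # Reject codes outside the category ranges described in the text.
        for name, (lo, hi) in RANGES.items():
            value = getattr(self, name)
            if not lo <= value <= hi:
                raise ValueError(f"{name} code {value} outside {lo}..{hi}")

codes = PostureCodes(trunk=1, neck=1, legs=1, upper_arm=3, lower_arm=1, wrist=2)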
The general principle of coding is to assign higher values to the positions of body segments that deviate more from those favorable from the point of view of biomechanics. More specifically, the scores depend on the extensions and/or flexions of the given body segments. Overall, the group A categories allow for representing as many as 60 posture combinations, and class B, 36. The determination of the WMSD risk for a given activity comes down to reading the values from a table in which the risk levels are assigned to all 144 combinations of the A and B group codes. The result obtained in this way is finally corrected by adding 1 for static work. Static work is defined here as any type of activity in which at least one body segment is held in the same position for at least one minute. All the positions analyzed in this research fulfill this criterion; hence, each overall rating was increased accordingly. A detailed analysis of the REBA methodology and the features of the workplaces studied here allowed for a significant simplification of the calculations. Employees work at the analyzed workstations in a standing, unforced body posture. This allows for assigning code 1 for the basic torso and leg posture. Admittedly, observation of work tools and objects requires a head tilt, but only in the sagittal plane, without twists or tilting the head to the sides. The maximum value of the code for the extreme head tilt is 2. In the cases investigated here, the A value will always amount to 1. Given the above, our analyses will focus only on the configurations of the employee's group B segments while simulating work tasks. Since the code for group A equals 1 for our cases, the overall assessment of the risk level is based on the first row of the REBA resulting risk level matrix. The B code, as mentioned earlier, is determined by the positions of the arm, forearm, and hand of the more heavily loaded limb. In the Anthropos system, the locations of the main body segments are generated automatically in the form of graphs showing the percentage deviations of the current position from the neutral position. Table 1 includes the translation of this Anthropos software posture indicator to the angles expressed in degrees and, finally, in the last column, to the appropriate REBA partial codes. The data from Table 1 allow for specifying the general code for part B and are used in further analyses.

Table 1. Correspondence between the Anthropos posture indicators, expressed as a percentage of the maximum range, degrees of body segments' flexion or extension, and part B codes of the Rapid Entire Body Assessment (REBA) methodology, which was used for assessing work-related musculoskeletal disorders (WMSDs) in the virtual environment. The bigger the REBA code number, the more WMSD risk is associated with the given body segment.

Upper arm:
  <11% of range | Flexion <20 degrees     | REBA code 1
  11–25%        | Flexion 20–45 degrees   | REBA code 2
  25–50%        | Flexion 45–90 degrees   | REBA code 3 (+1 if shoulder is raised)
  >50%          | Flexion >90 degrees     | REBA code 4 (+1 if shoulder is raised)
Lower arm:
  0–50%         | Flexion 0–60 degrees    | REBA code 2
  50–83%        | Flexion 60–100 degrees  | REBA code 1
  >83%          | Flexion >100 degrees    | REBA code 2
Hand:
  +/−25%        | Extension/flexion +/−15 degrees | REBA code 1 (+1 if wrist deviated or twisted)
  >25%          | Extension/flexion >15 degrees   | REBA code 2 (+1 if wrist deviated or twisted)
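Table 1 translates directly into a short scoring routine. The sketch below implements that mapping (boundary percentages are assigned to the lower class where the table rows overlap, which is our assumption) and reproduces the partial codes reported later in Table 2:

def upper_arm_code(pct: float, shoulder_raised: bool = False) -> int:
    # Table 1, upper arm row: <11 -> 1, 11-25 -> 2, 25-50 -> 3, >50 -> 4.
    if pct < 11:
        code = 1
    elif pct <= 25:
        code = 2
    elif pct <= 50:
        code = 3 + (1 if shoulder_raised else 0)
    else:
        code = 4 + (1 if shoulder_raised else 0)
    return code

def lower_arm_code(pct: float) -> int:
    # Mid-range flexion (60-100 degrees, i.e., 50-83% of range) scores 1.
    return 1 if 50 <= pct <= 83 else 2

def wrist_code(pct: float, deviated_or_twisted: bool = False) -> int:
    return (1 if abs(pct) <= 25 else 2) + (1 if deviated_or_twisted else 0)

# Workstation (a), 5th percentile woman (values reported in Figure 2/Table 2):
print(upper_arm_code(26), lower_arm_code(76), wrist_code(42))  # -> 3 1 2
# Workstation (a), 95th percentile man:
print(upper_arm_code(14), lower_arm_code(58), wrist_code(18))  # -> 2 1 1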
These models were used to assess the risk of WMSDs with the REBA methodology described above. For both examined workplaces, the body postures for the typical tasks were first configured by applying inverse kinematics, which automatically sets appropriate body segment locations based on a target point indicated by the hand position. This step was followed by precise corrections of the angles in the individual joints to obtain the final body postures. The flexibility of the kinematic chains of human body segments in Anthropos allows one to simulate limb positions for the same manual task in many ways. Therefore, in this study we adopted the rule of configuring the body segments so that, for the required final hand locations, the remaining segments exhibit the minimum partial indicators from Table 1. This means that the presented settings deviate the least from the optimal ones. Moreover, they are in line with the general paradigm suggesting a relationship between the subjective feeling of postural discomfort and the objective threat of musculoskeletal ailments, e.g., [75]. In our analyses, the line of sight was also simulated and used to ensure that the work tools and items were in the center of the employee's virtual field of view. A sample of this simulation for the 5th percentile female is shown in Figure 1. The values of the individual body segment angles obtained in this way were the basis for determining the partial and overall REBA codes. The figures and tables in the next section illustrate the results of these analyses.

4.2. Workstations Analyses, Design Improvements, and Discussion

4.2.1. Workstation (a)—Manual Assembly

Figures 2 and 3 show the simulations of the working postures of the 5th percentile woman and the 95th percentile man at workstation (a). The figures also present the values of the basic angles of the hand segment positions obtained from Anthropos.

Figure 2. Body posture simulation of the 5th percentile woman performing basic activities at workstation (a). Angles of the body segment positions, expressed as a percentage of the maximum ranges, are shown on the right side of the image.
Figure 3. Body posture simulation of the 95th percentile man performing basic activities at workstation (a). Angles of the body segment positions, expressed as a percentage of the maximum ranges, are shown on the right side of the image.

The analysis of the partial codes of REBA part B for the simulations from Figures 2 and 3 was made in accordance with the data in Table 1 and is put together in Table 2.

Table 2. Partial and overall codes for REBA part B for the workstation (a) analyses involving the extreme mannequins.

Body Segment | 5th: Anthropos (% of Range) | 5th: REBA Code | 95th: Anthropos (% of Range) | 95th: REBA Code
Upper arm | 26 | 3 | 14 | 2
Lower arm | 76 | 1 | 58 | 1
Hand | 42 | 2 | 18 | 1
REBA part B | | 4 | | 1
REBA general (+1 for static work) | | 3 | | 2

According to the REBA methodology, the overall occupational risk assessment for the 5th percentile employee at workstation (a) is 3, and for the 95th percentile employee it is 2. These codes fall into the second of five risk categories, where the first denotes small risk and the fifth the highest. It may be observed that the workplace is designed rather for taller people; however, according to the classification and interpretation of the REBA authors [8], this is not a big risk, although taking corrective actions may be necessary.
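Table 1 yields only the partial codes; combining them into the part B score requires Table B of the REBA method. The matrix below is our transcription from the published method [8] (it is not printed in this paper); it reproduces all part B scores reported in Tables 2–5. A sketch, reusing nothing but the partial codes:

```python
# REBA Table B, indexed as [lower_arm][wrist] -> list over upper-arm codes 1..6.
# Values transcribed from Hignett and McAtamney [8].
REBA_TABLE_B = {
    1: {1: [1, 1, 3, 4, 6, 7],
        2: [2, 2, 4, 5, 7, 8],
        3: [2, 3, 5, 5, 8, 8]},
    2: {1: [1, 2, 4, 5, 7, 8],
        2: [2, 3, 5, 6, 8, 9],
        3: [3, 4, 5, 7, 8, 9]},
}

def part_b_score(upper, lower, wrist):
    """Combine the three partial codes into the REBA part B score."""
    return REBA_TABLE_B[lower][wrist][upper - 1]

# Table 2, workstation (a): 5th percentile (3, 1, 2) and 95th percentile (2, 1, 1)
print(part_b_score(3, 1, 2), part_b_score(2, 1, 1))  # -> 4 1
```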
In view of the obtained ratings, an attempt was made to correct workstation (a). The general methodology of the applied improvement approach results from the fact that body segment dimensions in each population are approximately normally distributed. Hence, matching the working environment to average individuals provides relatively the largest number of people with good spatial conditions. Since, in the analyzed case, the height of the work surface is the key parameter, a simple procedure was adopted in the simulation studies to correct this parameter. The height of the work surface should be set in such a way that a mannequin with anthropometric parameters corresponding to the 50th percentile of the adult Eastern European population can adopt the posture that ensures a minimal risk of WMSDs according to REBA. For this purpose, the dummy was first positioned in a configuration of the arm segment angles that ensured the minimum values of the REBA partial codes according to Table 1, and then the location of the work surface was adjusted to this position. The effect of this approach is shown in Figure 4.

Figure 4. Simulation of the modified workstation (a) adjusted to the optimal hand configuration of the 50th percentile mannequin of the Eastern European population. This arrangement's risk score for part B of the REBA method is minimal.
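The adjustment itself was done interactively in the virtual environment. As a rough closed-form illustration of the underlying geometry, one can estimate the hand height of a mannequin whose arm angles sit inside the favorable Table 1 bands; the segment lengths below are hypothetical placeholders, not measurements from the paper:

```python
import math

def hand_height(shoulder_h, upper_arm, forearm,
                arm_flex_deg=10.0, elbow_flex_deg=80.0):
    """Approximate standing hand height (m) with the upper arm flexed
    arm_flex_deg from vertical and the elbow flexed elbow_flex_deg,
    i.e., inside the favorable REBA bands of Table 1 (<20 and 60-100 deg).
    Illustrative simplification, not the Anthropos procedure itself."""
    a = math.radians(arm_flex_deg)
    f = math.radians(arm_flex_deg + elbow_flex_deg)  # forearm angle from vertical
    return shoulder_h - upper_arm * math.cos(a) - forearm * math.cos(f)

# Hypothetical 50th percentile segment data (m); the work surface would be
# placed at roughly this hand height.
print(round(hand_height(1.42, 0.33, 0.27), 2))  # -> 1.1
```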
The overall REBA rating for the solution from Figure 4 remains at 2, due to the static workload, but it is the best spatial solution under the existing conditions and constraints. The overall quality of this solution is further validated by simulating the working postures of the extreme digital mannequins and checking the REBA scores once again. The results of this operation are shown in Figures 5 and 6, and Table 3 presents the outcomes of the REBA assessment for the extreme human models placed in the corrected workstation (a).

Figure 5. Simulation of the 5th percentile woman at the modified workstation (a) adjusted to the optimal hand configuration of the 50th percentile individual.

Figure 6. Simulation of the 95th percentile man at the modified workstation (a) adjusted to the optimal hand configuration of the 50th percentile individual.

Table 3. REBA WMSD risk assessment results for the extreme human models at the modified workstation (a) adjusted to the optimal hand configuration of the 50th percentile individual.

Body Segment | 5th: Anthropos (% of Range) | 5th: REBA Code | 95th: Anthropos (% of Range) | 95th: REBA Code
Upper arm | 23 | 2 | 11 | 1
Lower arm | 64 | 1 | 45 | 2
Hand | 11 | 1 | 11 | 1
REBA part B | | 1 | | 1
REBA general (+1 for static work) | | 2 | | 2

The calculations in Table 3 show that the slight change of lowering the work surface by only 5 cm improved the risk category for the 5th percentile female and did not change the risk for the 95th percentile male. As it is not possible to obtain a lower score for part B of the REBA method, the obtained solution can be considered optimal from the point of view of WMSD risk, that is, the best under the assumptions made.
4.2.2. Workstation (b) with a Screwdriver

A similar procedure was applied to examine the workstation equipped with a mechanical screwdriver. Figure 7 shows the analysis of the existing design for the 5th percentile female, whereas Figure 8 presents the simulated body posture of the 95th percentile male. Both models are from the Eastern European population.

Figure 7. Body posture simulation of the 5th percentile woman performing basic activities at workstation (b). Angles of the body segment positions, expressed as a percentage of the maximum ranges, are shown on the right side of the image.

Figure 8. Body posture simulation of the 95th percentile man performing basic activities at workstation (b). Angles of the body segment positions, expressed as a percentage of the maximum ranges, are shown on the right side of the image.
The analysis of the angular values shown in Figures 7 and 8, confronted with the REBA rules, provided the results shown in Table 4.

Table 4. Partial and overall codes for REBA part B for the workstation (b) analyses involving the extreme mannequins.

Body Segment | 5th: Anthropos (% of Range) | 5th: REBA Code | 95th: Anthropos (% of Range) | 95th: REBA Code
Upper arm | 54 | 4 (+1, arm raised) | 32 | 3
Lower arm | 50 | 1 | 77 | 1
Hand | 67 | 2 | 28 | 2
REBA part B | | 7 | | 4
REBA general (+1 for static work) | | 4 | | 2

The results of this analysis indicate that the risk level is average and that appropriate actions are necessary to correct the worker's posture. As at workstation (a), the work surface here is placed too high; therefore, the risk of WMSDs is especially high for shorter people.
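Reusing the lookup and Table B sketches above, the Table 4 scores can be reproduced; note the +1 raised-arm adjustment entering the 5th percentile upper arm code:

```python
# 5th percentile woman at workstation (b) (Figure 7): 54 %, 50 %, 67 %
ua = upper_arm_code(54, shoulder_raised=True)                    # 4 + 1 = 5
print(ua, part_b_score(ua, lower_arm_code(50), hand_code(67)))   # -> 5 7

# 95th percentile man (Figure 8): 32 %, 77 %, 28 %
print(part_b_score(upper_arm_code(32), lower_arm_code(77), hand_code(28)))  # -> 4
```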
As before, adjustments to the workstation's spatial arrangement were made based on the optimal settings for the 50th percentile mannequin. In this case, the correction required a significant lowering of the tool holder position. The optimal solution for the average human model is illustrated in Figure 9.

Figure 9. Simulation of the modified workstation (b) adjusted to the optimal hand configuration of the 50th percentile mannequin of the Eastern European population. This arrangement's risk score for part B of the REBA method is minimal.

The analysis of the changed design was performed, again, for the threshold representatives of the examined population. The outcomes are shown in Figures 10 and 11.

Figure 10. Simulation of the 5th percentile woman at the modified workstation (b) adjusted to the optimal hand configuration of the 50th percentile individual.

Figure 11. Simulation of the 95th percentile man at the modified workstation (b) adjusted to the optimal hand configuration of the 50th percentile individual.

Table 5 summarizes the REBA risk assessment components for the data obtained in the simulations from Figures 10 and 11.
Table 5. REBA WMSD risk assessment results for the extreme human models at the modified workstation (b) adjusted to the optimal hand configuration of the 50th percentile individual.

Body Segment | 5th: Anthropos (% of Range) | 5th: REBA Code | 95th: Anthropos (% of Range) | 95th: REBA Code
Upper arm | 24 | 2 | 10 | 1
Lower arm | 79 | 1 | 65 | 1
Hand | 21 | 1 | 11 | 1
REBA part B | | 1 | | 1
REBA general (+1 for static work) | | 2 | | 2

As can easily be noticed, applying the strategy of adjusting the height of the work tool to the anthropometry of a 50th percentile individual resulted in a radical improvement of the WMSD risk level assessments for workstation (b). From the analyzed point of view, the solution is almost perfect.

4.2.3. REBA Sensitivity Analysis

Even a cursory analysis of the relationships reflected in the matrices of the REBA methodology shows that, in manual work, the risk assessment is most sensitive to the deviation of the upper arm from its neutral position. Therefore, we examined a solution in which the height of the screwdriver body is optimal in the sense that the upper arm's position is set at the border of its optimal range (i.e., smaller than 11% of the maximal range) for the 5th percentile female. This assumption resulted in a radical lowering of the work surface at workstation (b), by as much as 35 cm. The corresponding simulation of this solution for the 95th percentile male is shown in Figure 12.
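This sensitivity claim can be illustrated numerically with the part_b_score() sketch above, sweeping one partial code at a time away from the neutral configuration (a hypothetical one-at-a-time sweep, not a computation from the paper):

```python
# One-at-a-time sweep from the neutral configuration (upper=1, lower=1, wrist=1).
print([part_b_score(u, 1, 1) for u in range(1, 7)])  # upper arm: [1, 1, 3, 4, 6, 7]
print([part_b_score(1, l, 1) for l in (1, 2)])       # lower arm: [1, 1]
print([part_b_score(1, 1, w) for w in (1, 2, 3)])    # wrist:     [1, 2, 2]
```

The upper arm alone drives the part B score across almost its whole range, while the lower arm and wrist barely move it.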
Figure 12. Body posture simulation of the 95th percentile male at workstation (b) designed to be optimal for the 5th percentile female. The distance of 70 cm from the employee's eyes to the work area was measured with the Tape tool.

The outcome of this analysis is surprising, because the ideal solution for the 5th percentile female turned out to be good for the 95th percentile male as well. The only criterion here is the WMSD risk assessment performed according to the REBA convention. The only doubt diminishing the acceptance of the "for the smallest" design strategy is the distance between the employee's eyes and the work items, illustrated in Figure 12. At the analyzed workstation (b), such a solution could be accepted because this distance amounts to approximately 70 cm, which is the upper limit of the ergonomic recommendation regarding the placement of visual information that requires reading (50–70 cm; Młodkowski, 1998, p. 354; Woo et al., 2016). However, with more precise work this can be a problem, especially in assembly work where, apart from manual activities, visual information processing is also important. In such situations, the arrangement of information components within the employee's field of view may be crucial for work effectiveness and efficiency.
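A trivial range check of this recommendation, assuming only the 50–70 cm band cited above:

```python
def viewing_distance_ok(distance_cm, low=50.0, high=70.0):
    """True if the eye-to-work distance lies in the recommended reading band."""
    return low <= distance_cm <= high

print(viewing_distance_ok(70))  # workstation (b), Figure 12 -> True (at the limit)
print(viewing_distance_ok(68))  # workstation (a), Figure 14 -> True
```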
There are no excessive requirements in this respect for the examined workstations. Despite that, the fields of view of the extreme mannequins given in Figure 13 illustrate significant differences caused by anthropometric and design factors.

Figure 13. Fields of view of the extreme human models in the basic configuration of the body posture at workstation (a) designed to be optimal for the 5th percentile female. The left side of the image shows the 5th percentile female's field of view, whereas the right side presents the field of view of the 95th percentile male.

A similar procedure applied to workstation (a), with an ideal solution for the 5th percentile woman, does not change the REBA WMSD risk assessment for the 95th percentile male. In this case, the shift in the work surface level is small relative to the 50th percentile optimization strategy. The simulated posture, along with the angle ranges, is shown in Figure 14.

Figure 14. Body posture simulation of the 95th percentile male at workstation (a) designed to be optimal for the 5th percentile female. The distance of 68 cm from the employee's eyes to the work area was measured with the Tape tool.

In comparison to the simulation results presented in Figure 6, the lower arm angle slightly deteriorated in Figure 14, but the REBA assessment did not change. What is more, the workspace design is now ideal, from the REBA perspective, for the 5th percentile woman.

5. Conclusions

The presented research results show, above all, the broad possibilities of DHM in the analysis and design of human workplaces. Such analyses seem to be very useful as workforce diversity keeps growing due to, among other things, the growing proportion of women and people of different ethnic origins. Taking advantage of digital models of workstations and humans is potentially very beneficial for both employers and employees. Though it is naturally possible to improve existing solutions by, e.g., providing platforms for shorter people, it is much better and usually cheaper to design the workstations correctly in the first place. Poorly designed workplaces that do not take into account the anthropometric features of different individuals may significantly increase the risk of WMSDs. Improving workplace conditions is important not only from the economic point of view but also from the medical perspective: the higher the risk of WMSDs, the more severe the consequences for human health. The possible medical problems associated with WMSDs, along with their codes from the International Classification of Diseases, have been comprehensively listed in [76] (p. 10).
The catalog contains as many as 31 disease entities, including seven tendinopathies, eight tunnel syndromes and nerve compressions, three hygromas, four bone syndromes, three vascular syndromes, meniscus lesions, and five non-specific disorders.

Although the case study presented here relates to very specific and concrete situations, the presented results appear to have significant and potentially universal applications. First of all, they show how many aspects of user-centered design can be addressed using relatively simple DHM software developed, as mentioned earlier, many years ago.
The undoubted advantage of the Anthropos ErgoMAX system is its implementation in the 3D Studio Max environment. Version 6.0 of this program is easy to learn and use and is completely sufficient for analyzing existing work environments and designing new ones in terms of their ergonomic properties. The Anthropos software facilitates the flexible insertion of human digital models of many national and regional populations. The inverse kinematics functionality, along with the precise positioning of body segments through rotations in the joints, allows for simulating any working posture. As shown in this study, by combining the features of the 3D Studio Max environment and the Anthropos ErgoMAX system, one is able to obtain detailed data for ergonomic assessments in the areas of anthropometry, fields of view, workloads, and WMSD risk. The REBA methodology used here allowed us to significantly improve the designs of existing workplaces in a specific company. Furthermore, a universal finding of this research concerns the effectiveness of designing the height of the work surface for the 5th percentile individual of the population in minimizing the risk of WMSDs for the entire population. As far as we are aware, such a result has not yet been reported in the existing literature. Although the presented approach might not always be suitable, it is worth checking while analyzing workstations in a digital environment. Naturally, one also needs to take into account specific limitations, for example, those suggested in this work: the level of work precision or the visibility of work items in the employee's field of view.

Author Contributions: Conceptualization, methodology, formal analysis, investigation, resources, data curation, writing—original draft preparation, writing—review and editing, visualization, supervision, project administration, funding acquisition, J.G., R.M. All authors have read and agreed to the published version of the manuscript.

Funding: This work was partially supported by the Ministry of Science and Higher Education (MNiSW, Poland).

Conflicts of Interest: The authors declare no conflict of interest.

References
1. Simoneau, S.; St-Vincent, M.; Chicoine, D. Work-Related Musculoskeletal Disorders (WMSDs): A Better Understanding for More Effective Prevention; Institut de Recherche Robert Sauvé: Montréal, QC, Canada, 1996.
2. De Kok, J.; Vroonhof, P.; Snijders, J.; Roullis, G.; Clarke, M.; Peereboom, K.; van Dorst, P.; Isusi, I. Work-Related Musculoskeletal Disorders: Prevalence, Costs and Demographics in the EU; European Agency for Safety and Health at Work, Publications Office of the European Union: Luxembourg, 2019; ISBN 978-92-9479-145-0.
3. Hinrichsen, S.; Riediger, D.; Unrau, A. Assistance Systems in Manual Assembly. In Proceedings of the Production Engineering and Management, Bandung, Indonesia, 21–23 September 2016.
4. Karhu, O.; Kansi, P.; Kuorinka, I. Correcting working postures in industry: A practical method for analysis. Appl. Ergon. 1977, 8, 199–201. [CrossRef]
5. Gómez-Galán, M.; Pérez-Alonso, J.; Callejón-Ferre, Á.-J.; López-Martínez, J. Musculoskeletal disorders: OWAS review. Ind. Health 2017, 55, 314–337. [CrossRef] [PubMed]
6. McAtamney, L.; Nigel Corlett, E. RULA: A survey method for the investigation of work-related upper limb disorders. Appl. Ergon. 1993, 24, 91–99. [CrossRef]
7. Gómez-Galán, M.; Callejón-Ferre, Á.-J.; Pérez-Alonso, J.; Díaz-Pérez, M.; Carrillo-Castrillo, J.-A. Musculoskeletal Risks: RULA Bibliometric Review. Int. J. Environ. Res. Public Health 2020, 17, 4354. [CrossRef] [PubMed]
8. Hignett, S.; McAtamney, L. Rapid Entire Body Assessment (REBA). Appl. Ergon. 2000, 31, 201–205. [CrossRef]
9. Hita-Gutiérrez, M.; Gómez-Galán, M.; Díaz-Pérez, M.; Callejón-Ferre, Á.-J. An Overview of REBA Method Applications in the World. Int. J. Environ. Res. Public Health 2020, 17, 2635. [CrossRef]
10. Aarås, A.; Westgaard, R.H.; Stranden, E. Postural angles as an indicator of postural load and muscular injury in occupational work situations. Ergonomics 1988, 31, 915–933. [CrossRef]
11. Corlett, E.N.; Bishop, R.P. A Technique for Assessing Postural Discomfort. Ergonomics 1976, 19, 175–182. [CrossRef]
12. Drury, C.G.; Coury, B.G. A methodology for chair evaluation. Appl. Ergon. 1982, 13, 195–202. [CrossRef]
13. Bhatnager, V.; Drury, C.G.; Schiro, S.G. Posture, Postural Discomfort, and Performance. Hum. Factors 1985, 27, 189–199. [CrossRef]
14. Kee, D. A method for analytically generating three-dimensional isocomfort workspace based on perceived discomfort. Appl. Ergon. 2002, 33, 51–62. [CrossRef]
15. Floyd, W.F.; Ward, M.J. Posture in Industry. Int. J. Prod. Res. 1967, 5, 213–224. [CrossRef]
16. Priya, K.C.; Singh, J.K.; Kumar, A. A Comparative Study of Postural Stress for Ergonomically Compatible Design in Selected Manual Weeding Tool. Int. J. Curr. Microbiol. Appl. Sci. 2018, 7, 136–141. [CrossRef]
17. Gómez-Galán, M.; Pérez-Alonso, J.; Callejón-Ferre, Á.-J.; Sánchez-Hermosilla-López, J. Assessment of Postural Load during Melon Cultivation in Mediterranean Greenhouses. Sustainability 2018, 10, 2729. [CrossRef]
18. Bubb, H. Why do we need digital human models. In DHM and Posturography; Scataglini, S., Paul, G., Eds.; Academic Press: Cambridge, MA, USA, 2019; pp. 7–32. ISBN 978-0-12-816713-7.
19. Bonney, M.C.; Case, K.; Hughes, B.J.; Kennedy, D.N.; Williams, R.W. Using SAMMIE for Computer-Aided Workplace and Work Task Design; SAE International: Warrendale, PA, USA, 1974.
20. Case, K.; Porter, J.M.; Bonney, M.C. SAMMIE: A Computer Aided Design Tool for Ergonomists. Proc. Hum. Factors Soc. Annu. Meet. 1986. [CrossRef]
21. Case, K.; Porter, J.M.; Bonney, M.C. SAMMIE: A man and workplace modelling system. In Computer-Aided Ergonomics; Karwowski, W., Genaidy, A.M., Asfour, S., Eds.; Taylor and Francis: London, UK, 1990; pp. 31–56.
22. Feeney, R.; Summerskill, S.; Porter, M.; Freer, M. Designing for disabled people using a 3D human modelling CAD system. In Ergonomic Software Tools in Product and Workplace Design; Landau, K., Ed.; Verlag ERGON GmbH: Stuttgart, Germany, 2000; pp. 195–203.
23. Grobelny, J. Including anthropometry into the AutoCAD-microcomputer system for aiding engineering drafting. In Trends in Ergonomics/Human Factors; Aghazadeh, V.F., Ed.; North-Holland: Amsterdam, The Netherlands, 1988; pp. 77–82.
24. Grobelny, J.; Cysewski, P.; Karwowski, W.; Zurada, J. Apolin: A 3-dimensional ergonomic design and analysis system. In Computer Applications in Ergonomics, Occupational Safety and Health; Mattila, M., Karwowski, W., Eds.; Elsevier: Amsterdam, The Netherlands, 1992; pp. 129–135.
25. Lippman, R. Anthropos quo vadis? Anthropos human modeling past and future. In Ergonomic Software Tools in Product and Workplace Design; Landau, K., Ed.; Verlag ERGON GmbH: Stuttgart, Germany, 2000; pp. 156–168.
26. Bauer, W.; Lippman, R.; Rossler, A. Virtual human models in product development. In Ergonomic Software Tools in Product and Workplace Design; Landau, K., Ed.; Verlag ERGON GmbH: Stuttgart, Germany, 2000; pp. 114–120.
27. IST. Anthropos 5 das Menschmodell der IST GmbH. Manual 2 Interaktionen; IST GmbH: Kallmünz, Germany, 1998.
28. IST. Anthropos ErgoMAX—User Guide; IST GmbH: Kallmünz, Germany, 2002.
29. Fleck, J.; Butler, F.E. Validation of the Crash Victim Simulator; NTIS: Springfield, VA, USA, 1981.
30. Esteves, G.; Ferreira, C.; Veloso, A.; Brandão, F. Development of a Model of the Muscle Skeletal System using Adams. In Its Application to an Ergonomic Study in Automotive Industry; SAE International: Warrendale, PA, USA, 2004; pp. 2004–2169.
31. Hatami, M.; Wang, D.; Qu, A.; Xiangsen, Z.; Wang, Q.; Baradaran Kazemian, B. Dynamic Simulation of Biomechanical Behaviour of the Pelvis in the Lateral Impact Loads. J. Healthc. Eng. 2018, 2018. [CrossRef]
32. Maltha, J. Madymo Crash Victim Simulations Handbook; The National Academies of Sciences, Engineering, and Medicine: Washington, DC, USA, 1983.
33. Schmitt, K.-U.; Niederer, P.F.; Muser, M.H.; Walz, F. Trauma Biomechanics: Accidental Injury in Traffic and Sports, 3rd ed.; Springer-Verlag: Berlin/Heidelberg, Germany, 2010; ISBN 978-3-642-03713-9.
34. TASS International, Siemens Madymo. Available online: https://tass.plm.automation.siemens.com/madymo (accessed on 7 August 2020).
35. Grobelny, J.; Karwowski, W. A Computer Aided System for Ergonomic Design and Analysis for AutoCad User; Landau, K., Ed.; Human Factors Association of Canada: Toronto, ON, Canada, 1994; pp. 302–303.
36. Grobelny, J.; Karwowski, W. Apolinex: A human model and computer-aided approach for ergonomic workplace design in open CAD environment. In Ergonomic Software Tools in Product and Workplace Design; Landau, K., Ed.; Verlag ERGON GmbH: Stuttgart, Germany, 2000; pp. 121–131.
37. Grobelny, J.; Michalski, R.; Kukuła, J. Wirtualne manekiny w projektowaniu ergonomicznym stanowisk pracy. In Obciążenie Układu Ruchu. Przyczyny i Skutki; Palucha, R., Jach, K., Michalskiego, R., Eds.; Oficyna Wydawnicza Politechniki Wrocławskiej: Wrocław, Poland, 2006; pp. 61–67. ISBN 83-7085-946-1.
38. Sengupta, A.K.; Das, B. Human: An autocad based three dimensional anthropometric human model for workstation design. Int. J. Ind. Ergon. 1997, 19, 345–352. [CrossRef]
39. Chaffin, D.B. Development of computerized human static strength simulation model for job design. Hum. Factors Ergon. Manuf. Serv. Ind. 1997, 7, 305–322. [CrossRef]
40. Feyen, R.; Liu, Y.; Chaffin, D.; Jimmerson, G.; Joseph, B. Computer-aided ergonomics: A case study of incorporating ergonomics analyses into workplace design. Appl. Ergon. 2000, 31, 291–300. [CrossRef]
41. Chaffin, D.B. Digital Human Modeling for Workspace Design. Rev. Hum. Factors Ergon. 2008. [CrossRef]
42. Berlin, C.; Kajaks, T. Time-related ergonomics evaluation for DHMs: A literature review. Int. J. Hum. Factors Model. Simul. 2010, 1, 356. [CrossRef]
43. Badler, N.; Becket, W.; Webber, B. Simulation and analysis of complex human tasks. Cent. Hum. Model. Simul. 1995, 2596, 225–233.
44. Siemens Tecnomatix. Human Modeling and Simulation. Available online: https://www.plm.automation.siemens.com/global/en/products/tecnomatix/human-modeling-simulation.html (accessed on 7 August 2020).
45. Seidl, A. RAMSIS-A New CAD-Tool for Ergonomic Analysis of Vehicles Developed for the German Automotive Industry; SAE International: Warrendale, PA, USA, 1997.
46. Bubb, H.; Engstler, F.; Fritzsche, F.; Mergl, C.; Sabbah, O.; Schaefer, P.; Zacher, I. The development of RAMSIS in past and future as an example for the cooperation between industry and university. Int. J. Hum. Factors Model. Simul. 2006, 1, 140. [CrossRef]
47. Fortin, C.; Gilbert, R.; Beuter, A.; Laurent, F.; Schiettekatte, J.; Carrier, R.; Dechamplam, B. SAFEWORK: A microcomputer-aided workstation design and analysis. New advances and future developments. In Computer Aided Ergonomics; Karwowski, W., Genaidy, A.M., Asfour, S., Eds.; Taylor & Francis: London, UK, 1990; pp. 157–180.
48. Dassault Systèmes DELMIA 3Dexperience Platform. Available online: https://www.3ds.com/products-services/delmia/products/delmia-3dexperience/ (accessed on 7 August 2020).
49. Abdel-Malek, K.; Yang, J.; Kim, J.H.; Marler, T.; Beck, S.; Swan, C.; Frey-Law, L.; Mathai, A.; Murphy, C.; Rahmatallah, S.; et al. Development of the Virtual-Human Santos™. In Proceedings of the Digital Human Modeling; Duffy, V.G., Ed.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 490–499.
50. Abdel-Malek, K.; Arora, J.; Bhatt, R.; Farrell, K.; Murphy, C.; Kregel, K. Santos: An integrated human modeling and simulation platform. In DHM and Posturography; Scataglini, S., Paul, G., Eds.; Academic Press: Cambridge, MA, USA, 2019; pp. 63–77. ISBN 978-0-12-816713-7.
51. Deros, B.M.; Khamis, N.K.; Ismail, A.R.; Jamaluddin, H.; Adam, A.M.; Rosli, S. An Ergonomics Study on Assembly Line Workstation Design. Am. J. Appl. Sci. 2011, 8, 1195–1201. [CrossRef]
52. Gragg, J.; Yang, J.; Howard, B. Hybrid method for driver accommodation using optimization-based digital human models. Comput. Aided Des. 2012, 44, 29–39. [CrossRef]
53. Michalski, R.; Grobelny, J. Designing Emergency-Medical-Service Helicopter Interiors Using Virtual Manikins. IEEE Comput. Graph. Appl. 2014, 34, 16–23. [CrossRef]
54. Roebuck, J.A.; Kroemer, K.H.E.; Thomson, W.G. Engineering Anthropometry Methods; Wiley-Interscience: New York, NY, USA, 1975; ISBN 978-0-471-72975-4.
55. Kroemer, K.H.E. Engineering Anthropometry. Proc. Hum. Factors Soc. Annu. Meet. 1976, 20, 365–367. [CrossRef]
56. Garneau, C.J.; Parkinson, M.B. A comparison of methodologies for designing for human variability. J. Eng. Des. 2011, 22, 505–521. [CrossRef]
57. Grobelny, J. Anthropometric data for a driver's workplace designing the AutoCAD system. In Computer-Aided Ergonomics; Karwowski, W., Genaidy, A.M., Asfour, S.S., Eds.; Taylor & Francis: London, UK, 1990; pp. 80–89.
58. Yang, J.; Abdel-Malek, K. Human reach envelope and zone differentiation for ergonomic design. Hum. Factors Ergon. Manuf. Serv. Ind. 2009, 19, 15–34. [CrossRef]
59. Park, J.; Reed, M.P. Predicting vehicle occupant postures using statistical models. In DHM and Posturography; Scataglini, S., Paul, G., Eds.; Academic Press: Cambridge, MA, USA, 2019; pp. 799–803. ISBN 978-0-12-816713-7.
60. Sanjog, J.; Karmakar, S.; Patel, T.; Chowdhury, A. Towards virtual ergonomics: Aviation and aerospace. Aircr. Eng. Aerosp. Technol. Int. J. 2015, 87, 266–273. [CrossRef]
61. Bernard, F.; Zare, M.; Sagot, J.-C.; Paquin, R. Using Digital and Physical Simulation to Focus on Human Factors and Ergonomics in Aviation Maintainability. Hum. Factors 2019. [CrossRef] [PubMed]
62. Bartnicka, J. Knowledge-based ergonomic assessment of working conditions in surgical ward—A case study. Saf. Sci. 2015, 71, 178–188. [CrossRef]
63. Şuteu Băncilă, A.M.A.; Buzatu, C. Digital Human Modeling in the Development of Assistive Technologies for Elderly Users. Available online: https://www.scientific.net/AMM.809-810.835 (accessed on 6 August 2020).
64. Maurya, C.M.; Karmakar, S.; Das, A.K. Digital human modeling (DHM) for improving work environment for specially-abled and elderly. SN Appl. Sci. 2019, 1, 1326. [CrossRef]
65. Grobelny, J.; Michalski, R.; Karwowski, W. Workload Assessment Predictability for Digital Human Models. In Handbook of Digital Human Modeling; Duffy, V., Ed.; CRC Press: Boca Raton, FL, USA, 2008; Volume 20081561, pp. 28-1–28-13. ISBN 978-0-8058-5646-0.
66. Schall, M.C.; Fethke, N.B.; Roemig, V. Digital Human Modeling in the Occupational Safety and Health Process: An Application in Manufacturing. IISE Trans. Occup. Ergon. Hum. Factors 2018, 6, 64–75. [CrossRef] [PubMed]
67. Peruzzini, M.; Pellicciari, M.; Gadaleta, M. A comparative study on computer-integrated set-ups to design human-centred manufacturing systems. Robot. Comput.-Integr. Manuf. 2019, 55, 265–278. [CrossRef]
68. Zhang, Y.; Wu, X.; Gao, J.; Chen, J.; Xv, X. Simulation and Ergonomic Evaluation of Welders' Standing Posture Using Jack Software. Int. J. Environ. Res. Public Health 2019, 16, 4354. [CrossRef]
69. Dukic, T.; Rönnäng, M.; Christmansson, M. Evaluation of ergonomics in a virtual manufacturing process. J. Eng. Des. 2007, 18, 125–137. [CrossRef]
70. Lämkull, D.; Hanson, L.; Örtengren, R. A comparative study of digital human modelling simulation results and their outcomes in reality: A case study within manual assembly of automobiles. Int. J. Ind. Ergon. 2009, 39, 428–441. [CrossRef]
71. Yang, Q.; Wu, D.L.; Zhu, H.M.; Bao, J.S.; Wei, Z.H. Assembly operation process planning by mapping a virtual assembly simulation to real operation. Comput. Ind. 2013, 64, 869–879. [CrossRef]
72. Zhu, W.; Fan, X.; Zhang, Y. Applications and research trends of digital human models in the manufacturing industry. Virtual Real. Intell. Hardw. 2019, 1, 558–579. [CrossRef]
73. Mgbemena, C.E.; Tiwari, A.; Xu, Y.; Prabhu, V.; Hutabarat, W. Ergonomic evaluation on the manufacturing shop floor: A review of hardware and software technologies. CIRP J. Manuf. Sci. Technol. 2020. [CrossRef]
74. Rizkya, I.; Syahputri, K.; Sari, R.M.; Siregar, I. Evaluation of work posture and quantification of fatigue by Rapid Entire Body Assessment (REBA). IOP Conf. Ser. Mater. Sci. Eng. 2018, 309, 012051. [CrossRef]
75. Cho, C.-Y.; Hwang, Y.-S.; Cherng, R.-J. Musculoskeletal Symptoms and Associated Risk Factors Among Office Workers With High Workload Computer Use. J. Manipulative Physiol. Ther. 2012, 35, 534–540. [CrossRef] [PubMed]
76. Roquelaure, Y. Musculoskeletal Disorders and Psychosocial Factors at Work; ETUI, The European Trade Union Institute: Brussels, Belgium, 2018.

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
work_3zytjanwqfhq7mfbrf6vdrtdxi ---- Preprocessing Greek Papyri for Linguistic Annotation

Marja Vierros, Erik Henriksson. Preprocessing Greek Papyri for Linguistic Annotation. Journal of Data Mining and Digital Humanities, Episciences.org, 2017, Special Issue on Computer-Aided Processing of Intertextuality in Ancient Languages. HAL Id: hal-01279493v2, https://hal.archives-ouvertes.fr/hal-01279493v2, submitted on 9 Jun 2017.

Preprocessing Greek Papyri for Linguistic Annotation

Marja Vierros (1)*, Erik Henriksson (2)
1, 2 University of Helsinki, Finland
*Corresponding author: Marja Vierros, marja.vierros@helsinki.fi

Abstract
Greek documentary papyri form an important direct source for Ancient Greek. They have been exploited surprisingly little in Greek linguistics, owing to the lack of good tools for searching linguistic structures. This article presents a new tool and digital platform, “Sematia”, which enables transforming the digital texts available in TEI EpiDoc XML format into a format that can be morphologically and syntactically annotated (treebanked), and in which the user can add new metadata concerning the text type, writer and handwriting of each act of writing. An important aspect of this process is taking into account the original surviving writing as opposed to the standardization of language and the supplements made by the editors. This is achieved by creating two different layers of the same text. The platform is in its early development phase. Ongoing and future developments, such as tagging linguistic variation phenomena and queries performed within Sematia, are discussed at the end of the article.

Keywords
Greek; papyri; linguistic annotation; treebank; dependency grammar; TEI EpiDoc XML; MySQL; Python; JavaScript

INTRODUCTION

Greek papyri from Egypt have preserved longer and shorter stretches of Greek as it was written by ancient speakers from ca. 300 BCE to 700 CE. Different registers and styles are found within a variety of text types: the vernacular becomes visible in private letters, official phraseology in contracts. The papyrological corpus therefore forms an important direct source for Greek linguists.
The documentary papyrological corpus is freely available in digital form on the [Papyrological Navigator] (PN) platform, which also allows users to search both text strings and metadata (such as date and provenance). The search facilities do not, however, lend themselves to querying linguistic structures or variation in spelling or morphosyntax. Partly for this reason, the papyrological corpus has received little attention in most linguistic research on Ancient Greek. A research project of the first author (“SEMATIA: Linguistic Annotation of the Greek Documentary Papyri – Detecting and Determining Contact-Induced, Dialectal and Stylistic Variation”, funded by the Academy of Finland) sought methods to make better use of the papyri for linguistic research. In this first phase we needed a way to preprocess the papyri into a form which could be linguistically annotated. The Sematia tool presented in this article results from that project, but the tool is still being developed further. A new research project, [“Act of the Scribe: Transmitting Linguistic Knowledge and Scribal Practices in Graeco-Roman Antiquity”], in which the first author is currently a researcher, concentrates on scribes, their level of competence and their linguistic skills. We study the mechanisms of language production in order to separate technical effects from linguistic and cognitive processes. This enables us to pinpoint the scribe’s part in language change. We have added to Sematia the possibility of entering new metadata especially for the purposes of that project. We approach the texts by dividing them into “acts of writing” in order to distinguish each writer within one text. Sometimes a text is the product of one writer only, but in many cases two or more different people have written in one document, as attested by changes of handwriting.

I BACKGROUND

In this section, we first briefly describe the digital papyrological corpus used in this project, as well as the nature of a papyrus text, in order to illustrate the basic requirements for preprocessing the data. We then summarize the linguistic annotation process in 1.2, essential for the later discussion on how we plan to utilize treebanks in this project. Lastly, in order to motivate the way in which we address the texts, we briefly discuss what we mean by linguistic variation in 1.3.

1.1 The Papyrological Corpus in Digital Form

The platform Papyrological Navigator (PN) is the most important digital tool for papyrologists and anyone using papyri, potsherds and wooden tablets as primary sources for their studies of the Ancient World. It is an umbrella platform under which several databases with different scopes are linked together. Its history goes back to 1982, when a papyrological text corpus in digital form was created at the Packard Humanities Institute, resulting in a CD-ROM (PHI #7, Duke Databank of Documentary Papyri). A more detailed history is given on the information page at PN. At the moment the Duke Databank text corpus is open source and available online via the Papyrological Navigator, and the texts have been migrated into [TEI EpiDoc XML] form. New publications are added to the corpus, and old entries can be corrected and new data added via the Papyrological Editor by the papyrological community (the workflow is curated by an editorial team).
Thus, the corpus is kept in an up-to-date, reliable state. Currently, it hosts ca. 70,000 Greek texts, 2,000 Latin texts and 1,000 Coptic texts. A word count is not available, and texts vary from very short to extremely long. The PN also includes a search interface, where the texts, metadata, translations and images can be searched using different parameters.

1.1.1 The Nature of a Papyrus Text and its Realization in TEI EpiDoc XML

Papyri, like inscriptions, are seldom preserved in perfect condition. This results in gaps (lacunae) within the text. The ink may have faded in places, or the handwriting might be difficult to read, with the result that the editor cannot always be certain how to read each letter. Moreover, many texts contain a large number of abbreviations, because they come from the pens of professional scribes working with texts of an administrative nature. These features are marked in the paper editions according to the editorial conventions called the Leiden System, commonly agreed upon in 1931. For example, a lacuna is marked with square brackets, abbreviations are expanded within parentheses, and uncertain letters have a dot under them. For a full list, see [Schubert 2009, 202–203]. The EpiDoc XML marks the same phenomena in TEI-compatible tags within the text, e.g. <gap> for the lacunae, <unclear> for the uncertain letters. The display in the PN shows the text in a traditional Leiden System layout (with the apparatus criticus below the text), but the text is stored in the GitHub repository in the XML form.

Example 1. The first two lines of P.Petra 1 6 in PN display layout (A) and in EpiDoc XML (B):

(A) [image of the PN display layout; not preserved in this extraction]

(B) <choice><reg>γνῶσις</reg><orig>γνο͂σις</orig></choice> <choice><reg>ὧν</reg><orig>ὁ͂ν</orig></choice> <choice><reg>ἀπώλεσα</reg><orig>ἀπόλεσα</orig></choice> <choice><reg>ἐγώ</reg><orig>ἐγὸ</orig></choice> Ἐπιφάνιος

Although Example 1 exhibits no gaps or uncertain letters, it shows another feature that is highly relevant to our project and to linguists in general, namely editorial corrections. Within the <choice> tag, the <orig> tag records the form the ancient writer actually wrote on the papyrus, and the <reg> tag the form the editor thinks is the regular or standard form that was meant. A linguist is usually interested in the forms that the writer originally wrote, since they give us information on language change, phonology and the vernacular. However, with regard to our project it is highly important that the edited text contains the assumed standard forms too. Using that information, the lemmatization and the comparison between the original and standard forms are much easier to perform. Of course, we may in several cases be hesitant about what, in fact, is the standard we should be comparing with, and whether we agree with the editor’s interpretation of what the original writer was aiming at. For discussions on this topic, see [Colvin 2009] and briefly [Vierros 2012, 25].
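To make the encoding concrete, the following minimal sketch (not part of the original article) shows how the <choice> elements of Example 1 can be read programmatically with Python's standard library; the wrapping <ab> element and the variable names are ours.

import xml.etree.ElementTree as ET

# Fragment (B) of Example 1, wrapped in a root element so it parses.
fragment = (
    '<ab>'
    '<choice><reg>γνῶσις</reg><orig>γνο͂σις</orig></choice> '
    '<choice><reg>ὧν</reg><orig>ὁ͂ν</orig></choice> '
    '<choice><reg>ἀπώλεσα</reg><orig>ἀπόλεσα</orig></choice> '
    '<choice><reg>ἐγώ</reg><orig>ἐγὸ</orig></choice> Ἐπιφάνιος'
    '</ab>'
)

for choice in ET.fromstring(fragment).iter('choice'):
    # what the scribe wrote -> what the editor regularizes it to
    print(choice.findtext('orig'), '->', choice.findtext('reg'))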
1.2 Treebanks

For Ancient Greek literature, two (constantly growing) linguistically annotated treebank corpora exist, as mentioned by [Haug 2014]: the Ancient Greek Dependency Treebank (currently ca. 558,000 tokens of Homer, Hesiod and the tragedies) and the PROIEL treebank (currently ca. 230,000 tokens of the New Testament, Herodotus and later Greek); see also [Universal Dependencies]. These treebanks follow the dependency grammar originally used for Czech in the Prague Dependency Treebank outlined in [Hajič 1998]. The suitability of treebanks for historical linguistic research, as well as of dependency grammar for Ancient Greek, has recently been discussed by [Haug 2015].

The most reasonable solution, in our opinion, was to follow the same annotation framework with the papyrological material. In this way we can utilize the best practices and annotation infrastructure of those projects, and gain maximal synergy between the corpora of literary and documentary texts. In the annotation process each word is supplied with a tag including its lemma, postag (i.e. a string containing the part of speech and morphological analysis of the form), syntactic role and a reference to the head word. The analysis is performed according to the Guidelines for the annotation of Ancient Greek (see [Bamman and Crane 2008] and [Celano 2014] for versions 1.1 and 2.0, respectively). The annotation tool we have used is an editor called [Arethusa] on the [Perseids] platform. Arethusa first divides the text into sentences at certain punctuation marks (full stop, colon) and performs the tokenization, i.e. gives each sentence and each word within the sentence an ID number. It employs the [Morpheus] tool to provide each word with a lemma and a morphological analysis. This means that lemmatization and morphological analysis are performed semi-automatically in the Arethusa editor; the human annotator must evaluate the correctness of the analyses where several options are possible in the case of homonyms, and add forms in cases where the tool does not recognize the lemma (e.g. many Egyptian names in the papyri). The syntactic roles and dependencies have to be analysed by the human annotator and implemented manually, because a syntactic parser for Ancient Greek is still a desideratum; the first attempts have been reported by [Mambrini and Passarotti 2012].

Example 2. Treebanked sentence in XML format. [The XML listing is not preserved in this extraction.]

The “postag” is a nine-place string marking each lemma with 1) part of speech, 2) person, 3) number, 4) tense, 5) mood, 6) voice, 7) gender, 8) case and 9) the degree of comparison, using certain agreed letters and numerals; e.g. “n” stands for nominative and “g” for genitive within the 8th place of the string, marking “case”.
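Since the XML listing of Example 2 is not preserved, the following sketch illustrates the format with an invented treebank word element; the attribute names follow the AGDT conventions described above, but the values and IDs are hypothetical, and the decoder spells out only a few letters per slot.

import xml.etree.ElementTree as ET

word = ET.fromstring(
    '<word id="3" form="ἔτους" lemma="ἔτος" '
    'postag="n-s---ng-" relation="ATR" head="1"/>'
)

SLOT_NAMES = ["part of speech", "person", "number", "tense",
              "mood", "voice", "gender", "case", "degree"]
SLOT_VALUES = {  # only a few example letters per slot are listed here
    0: {"n": "noun", "v": "verb", "a": "adjective", "p": "pronoun"},
    2: {"s": "singular", "p": "plural", "d": "dual"},
    6: {"m": "masculine", "f": "feminine", "n": "neuter"},
    7: {"n": "nominative", "g": "genitive", "d": "dative",
        "a": "accusative", "v": "vocative"},
}

def decode_postag(postag):
    """Name the marked features of a nine-place postag ('-' = unmarked)."""
    return {SLOT_NAMES[i]: SLOT_VALUES.get(i, {}).get(c, c)
            for i, c in enumerate(postag) if c != "-"}

print(word.get("lemma"), decode_postag(word.get("postag")))
# ἔτος {'part of speech': 'noun', 'number': 'singular',
#       'gender': 'neuter', 'case': 'genitive'}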
1.3 Linguistic Variation

The documentary papyri include many types of linguistic variation which often cannot be found in the literary texts preserved via the manuscript tradition. Variation means the existence of competing linguistic forms either within one single speech community or in a language as a whole. When we witness a change in a language, it is normally preceded by a great deal of synchronic variation; that is, many variants compete until one of them becomes popular and consistent. Studying the variants as such tells us a great deal not only about language change and the processes leading to it, but also about the community: where the people came from, and with whom they interacted (contact-induced variation). Some of the variants in papyri can be categorized as “scribal errors”, a category which is not always treated consistently. It may include mere slips of the pen, but sometimes even a difference of one letter may be an important phonological variant signalling changes in pronunciation. For example, the genitive singular of the word “wheat” (standard: πυροῦ) is written in two different nonstandard ways in the potsherds from Narmouthis (the potsherds, ostraca, are included in the papyrological corpus): πουροῦ (OGN I 42 and 47) and ποιροῦ (OGN I 46 and 86). The latter (ποιροῦ) attests the merging of /y/ and /oi/ that was an internal development in Greek in the Roman period, whereas the former (πουροῦ) rather shows transfer from Egyptian, which did not have the front vowel /y/; /u/ and /y/ were often confused by Egyptians writing Greek, see [Dahlgren 2016 and 2017].

In addition to spelling variants, we wish to present a couple of examples of morphosyntactic variation in order to make our treatment of the papyri more understandable. First, the phrase-initial inflection strategy. Greek is an inflecting language where morphological case agreement is essential. Certain examples of case incongruence were earlier considered mainly “bad Greek”, but were shown by [Vierros 2012] to represent a pragmatic strategy of certain scribes: they inflected only the phrase-initial words and left the rest of the words belonging to the same phrase in the nominative case. This also reflected the writers’ native language, Egyptian, which did not have case inflection. Likewise, the relative pronouns of the same writers were inflected according to the wrong head, again evidencing contact-induced transfer from Egyptian.

A different type of dilemma is presented by some spellings that prevent us from making direct assumptions about which form the ancient writer aimed at. [Leiwo 2010] discusses, for example, how the phrase καλῶς ποιήσεις (a way of saying “please”: “you will do well…”) is used, i.e. which form of a verb can act as its complement. Usually, an aorist participle complement denotes what is being asked. However, in the ostraca from Mons Claudianus a form πέμψε is used (O.Claud. II 243, 2–3). In this particular case it is difficult to say how it should be interpreted: taken at face value, πέμψε would be the aorist indicative 3rd person singular form of the verb “to send”, and this is how an automatic morphological tool would classify it. In this sentence it cannot be a 3rd person form, since the phrase is directive. We could interpret it in at least two different ways. It could be an aorist imperative 2nd person singular, πέμψον, because unstressed /e/ and /o/ could be confused, especially by Egyptian native speakers, and the final /n/ could easily be dropped; this is how the editors chose to regularize it. However, the infinitive form πέμψαι would also be a phonologically possible interpretation here, because <αι> and <ε> are often confused in the papyri. All the forms discussed above were probably pronounced in the same way: /pémpsə/. The annotator may wish to mark up both options, the infinitive and the imperative, because the question here is whether the infinitive form was an accepted variant with this directive phrase or not.

II PREPROCESSING THE PAPYRI

In this section, we first present the idea of layering as a solution to preprocessing the papyrological data. Second, 2.2 contains a detailed description of how each XML tag is treated in the selection or deselection of elements for each layer. The technical side of building the platform and tool, for which the second author was in charge, is described in 2.3.

2.1 Layers in Sematia

As mentioned in 1.1.1, the XML tags in the papyrus texts encode important information. The tags are located inside the text, between words and letters. Similarly, the choices and apparatus entries for one word follow each other.
In the treebank editor, a word is the basic element that the tool tries to identify automatically. The EpiDoc XML texts can therefore not be uploaded to the treebank editor as such, because the tags break up the words, and if we only removed the tags, all the apparatus choices would be included side by side. For the study of linguistic variation, we first and foremost need to know what the ancient author really wrote (and how much of what he wrote is extant). However, the standard variant is useful to have for the sake of comparison. Moreover, the fragmentary nature of many texts makes the syntactic structure discontinuous, and the editor’s supplements may therefore help in obtaining a solid syntactic tree of a sentence which is otherwise broken.

For these reasons, it seemed justified to create two different layers of the same text, each of which is treebanked separately. First, the original layer contains only what has been preserved on the papyrus, in the form the ancient writer wrote it. For abbreviated words, for example, only the part that was actually written is taken into the original layer, to prevent us from annotating case inflection that the ancient writer did not produce. The standard layer, on the other hand, includes all the editorial work: the expanded abbreviations and supplements as well as the standardized forms of misspelled words are all accepted. In this way, we get two different treebanks of one act of writing, and comparisons can be made between them to see where the morphology differs.

Since treebanking does not allow us to mark all features relating to linguistic variation, we decided to add a third layer, in which a new variation mark-up is added to the treebank XML. This mostly concerns phonology and spelling, but can also benefit morphosyntactic analyses. Moreover, different editors are not always consistent in which spellings they standardize. The variation layer is discussed in chapter IV (Future developments).

An important division of a document is performed before the layering. A change of handwriting, <handShift/>, indicates a different person penning the letters. Thus, each act of writing gets its own layers and, eventually, treebanks. The new metadata we enter (discussed in III) also concern each act of writing.

One caveat may be mentioned, although the present article is not the right place to take the discussion very far. The original layer in fact contains some editorial work too; that is, it does not present a so-called diplomatic transcript. The writing on the papyrus is usually without word divisions (in scriptio continua) and does not contain diacritical marks (accents, breathings, or iota subscripts). The word divisions and diacritics are part of the editor’s interpretation and make the text readable. We have not moved towards a diplomatic transcript in the original layer, for the sake of readability as well as to facilitate the automatic lemmatization and morphological analysis. If the annotator disagrees with some word divisions or diacritics, s/he has the possibility to make a change in the text in the Arethusa tool. In that case, however, the interpretation should be well supported, and the same correction should be suggested to the Papyrological Navigator.

2.2 How tags were treated

This chapter consists of a full discussion of how the TEI EpiDoc XML tags are treated when creating the original vs. the standard layer (for a quick overview, the same information is collected in Table 1 at the end of this section).
It was important to keep the word count, i.e. the tokenization, the same in both layers, so that a word-for-word comparison between the layers is possible using the word IDs. We use “dummy” elements to replace the parts not included in a layer, on account of the tokenization. Another reason for using dummy elements is to help the annotator notice the missing parts of the text: the annotator will clearly see that something is missing, either between the words or at the end of an abbreviation, when s/he sees the dummy element. For this reason, the dummy elements are written in capital letters.

2.2.1 Editorial corrections: <choice>, <reg>, <orig>, <corr> and <sic>

The <choice> element usually contains two alternatives. First, <reg> gives the standardized, regularized version, and is thus selected for the standard layer. On the other hand, <orig> consists of what was originally written on the papyrus, and is naturally selected for the original layer. E.g. from

<choice><reg>γνῶσις</reg><orig>γνο͂σις</orig></choice>

we choose γνῶσις for the standard layer and γνο͂σις for the original layer. Sometimes the editor may have suggested two different possibilities for regularization, or another scholar may have suggested a new interpretation. In those cases, the platform allows the user to choose one of the options for the text which will be annotated (see 2.3.3 below). Pure scribal mistakes are sometimes coded with the pair <corr> and <sic>. Then, from e.g.

<choice><corr>τιμὴν</corr><sic>τμμὴν</sic></choice>

we choose what is marked as corrected with <corr> for the standard layer, i.e. τιμὴν, and what is marked with <sic> for the original layer, i.e. τμμὴν.

2.2.2 Abbreviations: <expan>, <ex>

Words are abbreviated in different ways in the papyri. Sometimes only the end of the word is left unwritten (usually with some sort of abbreviation mark at the break-up point). In TEI EpiDoc XML, the tag <expan> surrounds the whole abbreviated word in its expanded form, and within it the part which was left unwritten is surrounded by the tag <ex>. For example, when the word στερεοῦ is abbreviated by leaving out the ending οῦ, it is written στερε(οῦ) according to the Leiden System, but in EpiDoc XML it is marked:

<expan>στερε<ex>οῦ</ex></expan>

In this case, we take the whole word in expanded form into the standard layer (στερεοῦ), and for the original layer we choose only what was written on the papyrus, i.e. στερε, now completed with the dummy element for abbreviation, A. Thus in the original layer we get στερεA. The annotator immediately sees that the scribe did not write the ending of the word, and can annotate the word for its lemma and other visible features, but not, in this instance, for its morphological case. Some words have been abbreviated with a certain abbreviation mark only. One of the most common is the sign for ἔτος, “year”. In this case the word is most often expanded in the genitive and marked within parentheses in the Leiden System: (ἔτους). The markup is:

<expan><ex>ἔτους</ex></expan>

The whole word in expanded form, ἔτους, is chosen for the standard layer, and for the original layer it is substituted with the marker A. The annotator may be confident enough to lemmatize the word as ἔτος, but otherwise the morphological analysis should be left open.
2.2.3 Supplements and omissions: <supplied>, <surplus>

When there is a hole in the papyrus, it may be possible for the editor to make an educated assessment of what was probably written in the gap and to restore the text. Especially if the gap is short (only a few letters), or if the missing part belongs to a formulaic section of a text, parallel documents help in restoring the text. Text restored in a lacuna is written inside square brackets in the Leiden System; in TEI EpiDoc XML it is marked with the tag <supplied> with the reason attribute “lost”. The markup can run over word boundaries. For example:

μ[ε]λίχρως = μ<supplied reason="lost">ε</supplied>λίχρως
ὄντ[ος ἐ]ν = ὄντ<supplied reason="lost">ος ἐ</supplied>ν

We choose the restorations for the standard layer without brackets; that is, we get μελίχρως and, in the latter example, two words: ὄντος ἐν. This way, the linguistic annotation tool correctly recognizes these words. For the original layer, however, the supplements are not taken in, since we cannot be sure that the editor was right; the ancient writer could have written a nonstandard variant even in a short space. The supplement receives the dummy marker SU in the original layer: μSUλίχρως and, in the case of two words, each word gets its own marker: ὄντSU SUν. Especially when there are several words in a lacuna, it is important that each word (and punctuation mark) is counted in the same way in both layers in order to keep the tokenization identical.

Another type of supplement occurs when the editor of the papyrus thinks that the ancient writer has, by mistake, not written something we would expect. The editor can add the omitted text using angle brackets in the Leiden System; in EpiDoc XML it is rendered with the supplied tag with the reason attribute “omitted”:

ἀπ<ε>γραψάμην = ἀπ<supplied reason="omitted">ε</supplied>γραψάμην

Again, we choose the supplement for the standard layer as the editor suggests: ἀπεγραψάμην. For the original layer the supplement is replaced with the dummy marker OM, i.e. ἀπOMγραψάμην. The opposite case is <surplus>, which indicates text which the original writer wrote but which the editor considers superfluous. This surplus text is replaced with the marker SR in the standard layer, but included as such in the original layer.

2.2.4 No supplements in lacuna: <gap>

When there is a lacuna in the papyrus for which the editor has not been able to suggest a supplement, it is replaced with the dummy element G both in the standard and in the original layer. The reason is that, also when annotating the standard layer, the annotator should see whether the sentence is incomplete.

2.2.5 Uncertain letters: <unclear>

The ‘conscience’ of a papyrologist, the underdot, signals that a letter is only partially preserved, or so faded that the editor cannot be certain beyond doubt which letter the ancient writer wrote. The editor makes an assumption based on the ink traces he sees, writes the letter he assumes to have been written on the papyrus, but puts a dot under the letter in the edition. In EpiDoc XML those letters are marked with the tag <unclear>:

Ἀλεξάνδρο̣υ = Ἀλεξάνδρ<unclear>ο</unclear>υ

For the standard layer it was an easy decision to include the uncertain letters in the same way as the supplemented letters. However, it was difficult to decide how to address the problem in the original layer, since we need the letter without markers interfering with word recognition in the annotation environment. We decided to take the uncertain letters into the original layer in the same way as into the standard one. This may result in sometimes annotating a word which will later be read as another word. However, that may happen even in cases where the editor has not used underdots. Moreover, the annotator need not annotate the word at all if s/he does not trust the reading. The annotator also has the possibility to change the text in the annotation framework, as mentioned in 2.1.
2.2.6 The apparatus: <app>

As with <choice> above (2.2.1), the apparatus criticus entries can include several options reflecting what the editor or other scholars suggest for the readings (tags are, e.g., <lem> and <rdg>). We have again decided to give the power of decision to the user: s/he can choose the best alternative to be included in the text which will be uploaded to the annotation tool.

Table 1. The treatment of TEI EpiDoc XML tags in the original vs. standard layer in Sematia.

Tag | Original layer | Standard layer
<choice>: <reg>/<orig> | text within <orig> | text within <reg>
<choice>: <corr>/<sic> | text within <sic> | text within <corr>
<ex> (within <expan>) | dummy element: A | text within the tag
<supplied reason="lost"> | dummy element: SU | text within the tag
<supplied reason="omitted"> | dummy element: OM | text within the tag
<surplus> | text within the tag | dummy element: SR
<gap> | dummy element: G | dummy element: G
<unclear> | text within the tag | text within the tag
<app>: <lem>/<rdg> | user in Sematia chooses | user in Sematia chooses
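As an illustration of the rules summarized in Table 1, the sketch below applies them to a small EpiDoc fragment. This is a deliberately simplified re-implementation in Python, not Sematia's actual client-side jQuery code; among other things it ignores the reason attribute of <supplied> (always emitting SU) and the user's choice between <lem> and <rdg>.

import xml.etree.ElementTree as ET

# tag -> (replacement in the original layer, replacement in the standard
# layer); None means "keep the element's text and recurse as usual".
RULES = {
    "orig": (None, ""), "sic": (None, ""),   # writer's form
    "reg": ("", None), "corr": ("", None),   # editor's form
    "ex": ("A", None),                       # unwritten part of an abbreviation
    "supplied": ("SU", None),                # simplified: OM not handled here
    "surplus": (None, "SR"),
    "gap": ("G", "G"),
}

def layer(elem, which):  # which: 0 = original, 1 = standard
    repl = RULES.get(elem.tag, (None, None))[which]
    if repl is not None:
        return repl + (elem.tail or "")
    text = elem.text or ""
    return text + "".join(layer(child, which) for child in elem) + (elem.tail or "")

frag = ET.fromstring(
    '<ab><expan>στερε<ex>οῦ</ex></expan> '
    'μ<supplied reason="lost">ε</supplied>λίχρως '
    '<choice><reg>γνῶσις</reg><orig>γνο͂σις</orig></choice></ab>'
)
print(layer(frag, 0))  # original layer: στερεA μSUλίχρως γνο͂σις
print(layer(frag, 1))  # standard layer: στερεοῦ μελίχρως γνῶσις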
2.3 Technical realisation

In this section, we describe the technical realization of “Sematia” as a web-based tool for creating, managing and querying the original and standard layers of EpiDoc XML texts. We begin by sketching the overall data structure of Sematia, and go on to discuss how the system automatically extracts metadata and creates the two layers from imported documents. We then describe Sematia’s integration with the Perseids API, and finally how the annotated layers can be queried in Sematia. Sematia is hosted on a University of Helsinki server at https://sematia.hum.helsinki.fi and is publicly available to everyone with a Google account, which is required for logging in. Alternatively, the tool can also be installed locally (but without the database) from the open source code available at https://github.com/ezhenrik/sematia/. The back-end of Sematia was developed with Python and MySQL, and the client-side interface with HTML and JavaScript.

2.3.1 Database

From the perspective of the scribal production of papyri, the <handShift/> elements in EpiDoc XML files are crucial, as they divide the document into parts penned by different persons (see also section 2.1 above). In Sematia, these “hands” each get their own linguistic layers (original and standard) as well as metadata (discussed in detail in III). Moreover, the documents imported to Sematia can be described with metadata about composition date and provenance. This results in the following database schema (fields are in parentheses):

• Document (id, user_id, XML, HTML, date, provenance)
• Hand (id, document_id, no, [metadata fields])
• Layer (id, hand_id, type, treebankXML, settings)
• User (id, name)
• User Document (id, user_id, document_id)

In this schema, each “layer” record is linked to a single “hand”, which, in turn, is a child of a “document” record that belongs to the “user” who imported the document to Sematia. The document table contains fields for the source XML, the HTML conversion (see 2.3.2 below) as well as date and provenance metadata. To avoid duplicating any data, no actual text is stored in the hand table; its only purpose is to serve as metadata storage for each act of writing. Lastly, the layer table has fields for the layer type, the treebank annotation XML (see 2.3.5) and user settings for manually chosen variants (see 2.3.4, item 6).

2.3.2 Importing documents

In order to minimize the effort of creating the layers in Sematia, we have automatized the workflow wherever possible. Thus, when the user imports a document (by entering the document URI into the system), Sematia first parses the XML and counts the <handShift/> elements in the document, which mark the boundaries of the different acts of writing. A corresponding number of hand records is then created, as well as the standard and original layer records for each hand. Initially, these records are created as empty templates, to be filled in the stages described below. Since the actual layering happens within the browser using JavaScript code (see 2.3.4 below), the XML tree is next converted to an HTML string that can be manipulated using the Document Object Model (DOM) interface. The following template is used in the conversion — element nodes: [template markup not preserved in this extraction]; text nodes: [text]. The converted HTML string is then saved to the database.

2.3.3 Document metadata

Next, Sematia tries to populate the metadata fields of the new document record automatically via PN’s Apache Solr API, available at http://papyri.info/solr/select/?q=id:[document id]. At the time of writing, Sematia is configured to fetch date and provenance metadata from this public API, in case these data are available for the imported document. As regards PN’s date metadata, Sematia includes a mapper which converts the diverse formattings (e.g. “II spc”, “II/IIIspc”, “AD709”) to a machine-readable form (“101-200”, “101-300” and “709-709”, respectively). There is also an interface in Sematia for editing the metadata fields manually.
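A simplified sketch of such a date mapper is given below; the real Sematia mapper handles many more of the HGV date formats, and the function and pattern names here are ours.

import re

ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6, "VII": 7, "VIII": 8}

def century_span(numeral):
    c = ROMAN[numeral]
    return (c - 1) * 100 + 1, c * 100

def map_date(raw):
    raw = raw.replace(" ", "")
    m = re.fullmatch(r"AD(\d+)", raw)
    if m:  # a single year, e.g. "AD709"
        return f"{m.group(1)}-{m.group(1)}"
    m = re.fullmatch(r"([IVX]+)(?:/([IVX]+))?spc", raw)
    if m:  # one century CE or a span of two, e.g. "II spc", "II/IIIspc"
        start, end = century_span(m.group(1))
        if m.group(2):
            end = century_span(m.group(2))[1]
        return f"{start}-{end}"
    return None  # format not handled by this sketch

for raw in ("II spc", "II/IIIspc", "AD709"):
    print(raw, "->", map_date(raw))
# II spc -> 101-200, II/IIIspc -> 101-300, AD709 -> 709-709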
The element is a special case as it may contain several words as well as punctuation marks. In the original layer, we want to replace each word and punctuation with “SU” or “OM” (depending on the value of the “reason” attribute) to maintain the same word count in all layers. Likewise, we had to make sure that the tokenization would work the same way in both Sematia’s layering tool and Arethusa’s treebanking service. For these reasons, the regular expressions used to split up words in Sematia follow Arethusa’s tokenization rules as closely as possible. For example, Arethusa has been configured to deal with crasis (e.g. κἀγώ, “I too”) by treating the merged words as separate. In Sematia, a similar mechanism is currently under development. 6.   In some cases, the editor of the papyrus has provided multiple readings for the same text part, contained in or elements in the XML. Only one of the readings should end up in the layer, making it necessary for the user to choose the preferred interpretation manually. This feature was implemented by adding a click event listener to the elements that may have multiple readings, which allows the user to make the choice simply by clicking on the preferred variant. The manual edits are saved to the database and automatically enabled whenever the user returns to view or edit the layer. 7.   Finally, the layer is created by collecting the new values in the act of writing that the user is working on. The resulting text is loaded into a panel next to the HTML version. 2.3.5 Perseids API integration Sematia uses the [Perseids] Project API (https://sosol.perseids.org/sosol/api/v1) to handle the treebank annotation of the layers. We opted for a strong integration with the Perseids Project, since it is home to the syntactic annotation framework we use, [Arethusa]. Moreover, the Perseids platform offers a centralized review process for the annotations, which helps us to control the quality of the treebanks uploaded to Sematia. The integration works out roughly as follows: First, a layer is created in Sematia according to the steps described in the previous section. Next, Sematia prepares the layer into a treebank annotation template using Perseids’ tokenization and transformation tools (https://github.com/perseids-project/perseids-client-apps), which is then POSTed to Perseids for annotation. When the annotation is finished, it is placed in a review queue, where it can be approved, sent back to the annotator for revision or rejected by one of Sematia’s administrators. Finally, after the approval, Perseids sends the treebank back to Sematia via a public GitHub repository dedicated to Sematia’s finalized treebank annotations (https://github.com/ezhenrik/sematia-tb). 2.3.6 Queries Sematia also includes a preliminary set of tools (https://sematia.hum.helsinki.fi/tools) for exporting the treebanked layers as a single .zip archive, listing frequencies of tokens in the treebanks, visualizing the data as a hierarchical document cluster, as well as for searching for occurrences of morpho-syntactic features or text segments in the treebanked layers. Using the search functionality, users can limit the results with metadata filters (e.g. document date) and combine them with regular expressions targeting individual fields of the treebanks (e.g. syntactic relation), which makes it possible to create highly specific queries on the data. 
III METADATA

3.1 Metadata in existing databases

The metadata concerning the actual papyrus document can be found via the Papyrological Navigator in several different databases: e.g. the Heidelberger Gesamtverzeichnis der griechischen Papyrusurkunden Ägyptens (HGV) has collected information on the date and provenance of the text, the original title and the subject matter (in German); similarly, the Trismegistos portal adds metadata on the people involved and the places mentioned, to name a few aspects. For the needs of the project “Act of the Scribe” we wish to add metadata that help in identifying the writers as well as the linguistic register. In addition, the date and provenance are extracted automatically for each document from the PN, as discussed in 2.3.2.

3.2 Metadata to be added

The new metadata always concern one act of writing; that is, each writer in a papyrus gets his or her own metadata field. The metadata are divided into four sections: Handwriting, Writer and author, Text type, and Addressee.

3.2.1 Handwriting

The printed editions of papyri quite often contain some sort of description of the handwriting, at least for the main hand of the text. Moreover, later research may have identified the hand as the same as in some other text, or made other observations on it. If the current user of Sematia has seen the original text or a photograph of it, s/he can also add his/her own evaluation of the handwriting. We included four subfields for describing handwriting. The first two, “Description in the edition” and “Custom description”, are free-text fields serving mainly as a reference for the user. The third field is a drop-down list for the level of professionalism, with four possible values: Not known, Professional, Non-professional and Practised letterhand. The first is applicable when there is no description, photograph or possibility to check the original. The last option lies between the professional and the non-professional: a person who is accustomed to writing, but has obviously not received scribal training. The fourth subfield is reserved for entering a list of texts where the same handwriting is found. This list is stored as a JSON string in the database and may be used in the future for connecting the acts of writing by the same person in queries.

3.2.2 Writer and author

In our project, we are interested in distinguishing the linguistic acts of the actual writer (usually a scribe who had received more or less education) from those of the author of the text, who may have dictated the text or given written and/or oral instructions. Moreover, in official contracts there may be a scribal official ‘responsible’ for the text, e.g. a notary who may even sign the document with his own name but is not its actual writer, like the agoranomoi from Pathyris discussed by [Vierros 2012]. For these reasons, we have three categories which can be filled in if the information is available, but left blank if not: “Actual writer”, “Scribal official” and “Author”. For each one, there are three fields to be filled in: Name, Title and [Trismegistos] Person ID number. Later, when the corpus contains a sufficient number of texts, this information can be used, for example, for connecting people with similar titles to similar uses of language, or even for finding texts that have been written or authored by the same person.
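By way of illustration, a complete hand-metadata record of the kind described in 3.2.1–3.2.2 could be serialized as follows; the field names paraphrase the descriptions above rather than reproduce Sematia's actual database columns, and all values are invented.

import json

hand_metadata = {
    "handwriting": {
        "description_in_edition": "practised documentary cursive",
        "custom_description": "",
        "professionalism": "Professional",  # or Non-professional,
                                            # Practised letterhand, Not known
        "same_hand_in": ["P.Petra 1 2", "P.Petra 1 3"],  # kept as a JSON string
    },
    "actual_writer": {"name": "Theodoros", "title": "notarios",
                      "tm_person_id": 12345},
    "scribal_official": {},   # left blank when the information is missing
    "author": {},
}
print(json.dumps(hand_metadata, ensure_ascii=False, indent=2))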
3.2.3 Text type and Addressee

The genre of a text naturally influences the language used: a private letter belongs to a different register than a notarial contract. The addressee has a similar impact: a text is more formal if written to a superior than if written to a peer or a subordinate. It is therefore important to gather this metadata when possible. We have added a drop-down list for the text type, trying to cover the basic text types found in the papyri while limiting the list to fairly general categories (e.g. “contract” with certain subfields, “letter” with certain subfields, among others). For the addressee, we wanted a general description selected from a drop-down list: “official”, “private” or “not known/applicable”. The first two options get the subfields “superordinate”, “peer” and “subordinate”. In addition, there are fields for the addressee’s name, title and Trismegistos Person ID number.

IV ONGOING AND FUTURE DEVELOPMENTS

4.1 Variation layer

Research on linguistic variation, discussed above in 1.3, is the driving force behind building the Sematia corpus. Quite a number of such phenomena can be queried by comparing the original and standard layers. For example, if we are interested in morphological case agreement, the standard layer includes the grammatically ‘correct’ versions and the original layer the variant forms. A search comparing, e.g., the case encoded in the postag of each word reveals when a word has been written in an unexpected case (and similar comparisons can be made for mood, person, tense, etc.). The biggest missing block of linguistic information concerns phonology, since spelling is not taken into account in the existing treebank templates. This issue is to some extent addressed in the new database of Text Irregularities within the [Trismegistos] platform compiled by [Depauw and Stolk 2015]. Their data concern the whole Duke Databank of Documentary Papyri and are collected phoneme by phoneme, based on the editorial corrections (i.e. the tags within the <choice> element, cf. 2.2.1). However, the editorial corrections are not always present in the DDbDP (for example, certain editors have not thought it worthwhile to regularize all confusions between the spellings ι and ει), and it is not always clear whether a writer’s confusion concerned a phoneme or a morpheme, i.e. whether the variation had a phonological or a morphological basis. For these occasions, and for greater accuracy in studying linguistic variation, we plan to add a variation layer in Sematia. The treebank XML of the original layer would be duplicated and a new variation tag added for those words where variation exists.

The variation tagset in all its depth is still under consideration. We could have a tag for variation, <variation>, and define it with different type attributes for phonology, morphology and syntax. The types could be further specified with different values, e.g. for the immediate context, if that seems to play a role (in phonology at least). A certain variation could also be defined with two or more options, for example suggesting that we are fairly certain that a feature is either a phonological confusion (e.g. αι written instead of ε), a morphological one (e.g. confusion of aorist infinitive and imperative endings), or both at the same time.
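Since the tagset is explicitly still open, the snippet below merely sketches one possible shape for this mark-up: a <variation> child added to an original-layer word element. The element, attribute and value names are invented for illustration.

import xml.etree.ElementTree as ET

word = ET.fromstring(
    '<word id="4" form="πουροῦ" lemma="πυρός" postag="n-s---mg-"/>'
)
var = ET.SubElement(word, "variation")
var.set("type", "phonology")
var.set("subtype", "u-for-y")   # transfer from Egyptian, cf. section 1.3
var.set("standard", "πυροῦ")
print(ET.tostring(word, encoding="unicode"))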
4.2 Queries

Several tools for querying treebanked data already exist. Both Ancient Greek treebank corpora can be queried with, e.g., [SETS Treebank Search], the [PML Tree Query Engine] or [XQuery/BaseX] (see also [Universal Dependencies]). Moreover, the PROIEL corpus is available in the INESS query interface. These employ somewhat different query languages, but all support detailed and complicated linguistic queries on treebanked data. As mentioned in 2.3.6, all the available treebanked data in Sematia can be exported, either all layers as one .zip archive or the original layers and the standard layers separated as their own sets. Some querying possibilities have already been integrated into the platform itself (see 2.3.6), but they are still at a testing and development stage. The important feature is to allow comparative queries between the original and standard layers. For example, one can search for instances where the original layer has a dative case (Postag field: ^.......d) but the standard layer has a genitive case (Postag field: ^.......g). The searches can also be performed on, or limited by, our new metadata.

Conclusion

In this article, we have described a process by which individual texts from the corpus of documentary Greek papyri can be preprocessed for the purposes of linguistic annotation. The annotation follows the same framework as other corpora of Ancient Greek texts. For the first time, we can automatically separate the original text written by the ancient writer from the editorial interpretation. The original layer can be studied in its own right as well as compared with the standardized version. We have not disregarded the results of the hard editorial work devoted to these texts over the previous centuries, as they form the parallel layer of the text. The layers enable the comparison of the linguistic variants abundant in the papyri with the scholarly standard forms. The tool is currently optimized for retrieving texts from the Papyrological Navigator, but there is no impediment to modifying it for other texts encoded in EpiDoc XML, such as many epigraphic corpora.
References

For the abbreviations of papyrological editions, see Checklist of Editions of Greek, Latin, Demotic, and Coptic Papyri, Ostraca, and Tablets, the updated version of which is found online: http://papyri.info/docs/checklist.

“Act of the Scribe: Transmitting Linguistic Knowledge and Scribal Practices in Graeco-Roman Antiquity”: http://blogs.helsinki.fi/actofscribe/.
Arethusa: available via Perseids sign-in: http://sosol.perseids.org/sosol/signin.
Bamman, D. and Crane, G. Guidelines for the Syntactic Annotation of the Ancient Greek Dependency Treebank (1.1). The Perseus Project, Tufts University, 2008. http://nlp.perseus.tufts.edu/syntax/treebank/greekguidelines.pdf.
Bamman, D. and Crane, G. The Ancient Greek and Latin Dependency Treebanks. Language Technology for Cultural Heritage, ser. Foundations of Human Language Processing and Technology. Springer (Berlin–Heidelberg), 2011:79–98.
Bamman, D., Mambrini, F. and Crane, G. An Ownership Model of Annotation: The Ancient Greek Dependency Treebank. Proceedings of the 8th Workshop on Treebanks and Linguistic Theories (TLT8), 2009. http://www.perseus.tufts.edu/~ababeu/tlt8.pdf.
Celano, G. G. A. Guidelines for the annotation of the Ancient Greek Dependency Treebank 2.0. 2014. https://github.com/PerseusDL/treebank_data/edit/master/AGDT2/guidelines
Colvin, S. The Greek Koine and the Logic of a Standard Language. Standard Languages and Language Standards: Greek, Past and Present. Ashgate (Farnham), 2009:33–45.
Dahlgren, S. Outcome of long-term language contact: Transfer of Egyptian phonological features onto Greek in Graeco-Roman Egypt. University of Helsinki, doctoral dissertation, 2017. http://urn.fi/URN:ISBN:978-951-51-3218-5.
Dahlgren, S. Towards a definition of an Egyptian Greek variety. Papers in Historical Phonology, 2016;1:90–108. http://journals.ed.ac.uk/pihph/article/view/1695.
Depauw, M. and Stolk, J. Linguistic Variation in Greek Papyri: Towards a New Tool for Quantitative Study. Greek, Roman, and Byzantine Studies, 2015;55:196–220.
Hajič, J. Building a Syntactically Annotated Corpus: The Prague Dependency Treebank. Issues of Valency and Meaning. Studies in Honor of Jarmila Panevová. Charles University Press (Prague), 1998:12–19.
Haug, D.T.T. Computational Linguistics and Greek. Encyclopedia of Ancient Greek Language and Linguistics. Brill Online, 2014 (first appeared 2013; last online update November 2013).
Haug, D.T.T. Treebanks in historical linguistic research. Perspectives on Historical Syntax. John Benjamins, 2015:188–202.
Haug, D.T.T., Eckhoff, H. M., Majer, M. and Welo, E. Breaking down and putting back together: analysis and synthesis of New Testament Greek. Journal of Greek Linguistics, 2009;9:56–92.
INESS (Norwegian Infrastructure for the Exploration of Syntax and Semantics): http://iness.uib.no.
Leiwo, M. Imperatives and other directives in the Greek letters from Mons Claudianus. The Language of the Papyri. Oxford University Press (Oxford), 2010:97–119.
Mambrini, F. and Passarotti, M. Will a parser overtake Achilles? First experiments on parsing the Ancient Greek Dependency Treebank. Proceedings of the 11th Workshop on Treebanks and Linguistic Theories (TLT11). Colibri, 2012.
Morpheus: https://wiki.digitalclassicist.org/Morpheus.
nltk: http://www.nltk.org/.
Papyrological Navigator: http://papyri.info/.
Perseids: http://sites.tufts.edu/perseids/.
PML Tree Query Engine: http://lindat.mff.cuni.cz/services/pmltq/#!/home.
Schubert, P. Editing a Papyrus. The Oxford Handbook of Papyrology. Oxford University Press (New York), 2009:195–215.
Sematia: http://sematia.hum.helsinki.fi/.
SETS Treebank Search: http://bionlp-www.utu.fi/dep_search
TEI EpiDoc XML: http://sourceforge.net/p/epidoc/wiki/Home/.
Trismegistos Portal: http://www.trismegistos.org/.
Trismegistos Text Irregularities: http://www.trismegistos.org/textirregularities/
Universal Dependencies: http://universaldependencies.org/.
Vierros, M. Bilingual Notaries in Hellenistic Egypt. A Study of Greek as a Second Language. KVAB (Brussel), 2012.
XQuery/BaseX: http://docs.basex.org/wiki/Startup.
work_432zobppujgrfl5xfphnoyz4p4 ---- Assessing and improving human movements using sensitivity analysis and digital human simulation

Pauline Maurice, Vincent Padois, Yvan Measson, Philippe Bidaud. Assessing and improving human movements using sensitivity analysis and digital human simulation. International Journal of Computer Integrated Manufacturing, Taylor & Francis, 2019, 32 (6), pp.546-558. 10.1080/0951192X.2019.1599432. HAL Id: hal-01221647v3, https://hal.archives-ouvertes.fr/hal-01221647v3, submitted on 6 Feb 2019.

Assessing and improving human movements using sensitivity analysis and digital human simulation

Pauline Maurice (a,b,d), Vincent Padois (b,c), Yvan Measson (d) and Philippe Bidaud (b,e)
(a) Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France; (b) Sorbonne Université, CNRS UMR 7222, Institut des Systèmes Intelligents et de Robotique, ISIR, F-75005 Paris, France; (c) Inria, Centre Bordeaux Sud-Ouest, Équipe AUCTUS Inria / IMS (Univ. Bordeaux, CNRS UMR5218), F-33405 Talence, France; (d) CEA, LIST, Interactive Robotics Laboratory, Gif-sur-Yvette, F-91191, France; (e) ONERA, 91123 Palaiseau, France
Corresponding author: Pauline Maurice. Email: pauline.maurice@polytechnique.org

ABSTRACT
Enhancing the performance of technical movements aims both at improving operational results and at reducing biomechanical demands. Advances in human biomechanics and modeling tools allow human performance to be evaluated in more and more detail. Finding the right modifications to improve the performance is, however, still addressed with extensive, time-consuming trial-and-error processes. This paper presents a framework for easily assessing human movements and automatically providing recommendations to improve their performances. An optimization-based whole-body controller is used to dynamically replay human movements from motion capture data, in order to evaluate existing movements. Automatic digital human simulations are then run to estimate performance indicators when the movement is performed in many different ways. Sensitivity indices are thereby computed to quantify the influence of postural parameters on the performance. Based on the results of the sensitivity analysis, recommendations for posture improvement are provided. The method is successfully validated on a drilling activity.

KEYWORDS
Digital human simulation; Dynamic motion replay; Sensitivity analysis of human motion; Ergonomics

1. Introduction

Performance enhancement in technical postures or movements has always been of great concern.
Workstation designers now often take into account the exposure to musculoskeletal disorder risk factors in addition to workers’ productivity (Schneider and Irastorza 2010; NRC 2001). In sports, coaches aim at finding the right movement to improve athletes’ results while preventing injuries (Fortenbaugh, Fleisig, and Andrews 2009; Robinson and O’Donoghue 2008). In rehabilitation, knowing which motion patterns alleviate the stress on a weakened body part helps provide exercises or recommendations to prevent further injury (Sturnieks et al. 2008).

The assessment and improvement of a movement are usually conducted under the supervision of an expert (e.g. an ergonomist or physiotherapist) who observes the person performing the activity and provides recommendations based on his/her knowledge and experience. The availability of experts may, however, be limited. Besides, observational methods provide only qualitative measures of the biomechanical demands experienced by the person. Digital human software (software in which human motions are reproduced with a digital human model) has therefore been developed to assist and supplement experts (e.g. OpenSim (Delp et al. 2007), AnyBody (Damsgaard et al. 2006)). Such software enables easy access to detailed force- and motion-related biomechanical quantities, which otherwise can only be measured on real humans with complex instrumentation, if at all (e.g. muscle or joint forces). However, the reliability of these biomechanical measurements is questionable (Hicks et al. 2015). One key factor is the mapping of the human motion onto the digital model: the resulting motion should be dynamically consistent (i.e. respect the laws of physics) to enhance the reliability of force-related quantities. Yet, existing software do not guarantee such consistency (Hicks et al. 2015).

The other difficulty in improving technical movements lies in the identification of suitable modifications which will enhance the performance (Fig. 1). Relations between the macroscopic parameters of the movement (i.e. adjustable parameters defining the way the movement is performed) and the resulting performances are often complex. Therefore, despite advances in human biomechanics and modeling tools, successful modifications generally still result from an intensive trial-and-error process. In (Demircan 2012), Demircan proposes a tool for analyzing the relation between novice and expert athletes’ movements and the resulting performances. This tool reveals the features differentiating an efficient movement from a non-efficient one, but cannot provide explicit recommendations for an optimal execution. Besides, a single performance criterion is considered (tool acceleration), whereas a detailed assessment of biomechanical performances relies on several quantities which may be differently affected by a same parameter of the movement (e.g. joint loads, joint positions, energy consumption).

This paper presents a framework to assess the performance of a technical movement and easily identify how to improve it, while addressing the aforementioned concerns. The proposed framework consists of two components:

• A method for replaying pre-recorded human motions while ensuring the dynamic consistency of the resulting motion. This dynamic replay relies on an optimization-based controller which enables tracking the subject’s motion in operational space, while imposing dynamic and biomechanical constraints. Existing situations can thus be evaluated.
Existing situations can thus be evaluated.
• A method for analyzing the dependence between the parameters and the performances of the movement. A variety of situations are automatically created and evaluated using an autonomous digital human model (no motion capture), and a sensitivity analysis is conducted on the simulation results. The critical parameters of the movement can thus be identified and tuned, using only a small amount of input data.

The paper is organized as follows. Section 2 presents the two components of the framework, with an emphasis on the dynamic replay method. The sensitivity analysis part has already been published in (Maurice et al. 2017) and is only briefly described here. Section 3 describes the experimental set-up which is used as a proof of concept of the proposed framework. The results are presented in section 4 and discussed in section 5.

2. Method

Human motion is often captured through optical motion capture techniques, in which markers are positioned on the human body and cameras record the markers' 3D positions. The recorded marker trajectories are then mapped onto a digital human model (DHM) to estimate movement-related biomechanical quantities. Despite their extensive use, the current mapping techniques still lack physical consistency, especially when the motion is highly dynamic and/or involves significant interaction forces with the environment. Note that the retargeting problem (i.e. mapping the motion of a subject onto an avatar with a different morphology) is not considered here. This work assumes that the DHM morphology can be adapted to each subject, and that its kinematics is very similar to the human body kinematics.

2.1. Dynamic replay of human motion: Related work

Recorded human motions are commonly mapped onto a DHM with inverse kinematics techniques (IK), which convert markers' operational (i.e. Cartesian) space trajectories into joint space trajectories. With IK, kinematic quantities such as joint positions and velocities can be measured. Conversely, the estimation of the driving forces (joint torques or muscle forces) requires an additional inverse dynamics (ID) step, in which forces are computed from the DHM dynamic model and the joint space trajectories resulting from IK. Though widely used, the IK+ID process has several drawbacks. First, the IK step can be time consuming. Second, the IK solution is not unique, so the resulting motion may not be plausible. Many authors address this concern by using a modified IK to match the resulting motion with a given set of constraints (Lee and Shin 1999; Grochow et al. 2004). However, the dynamic properties of the human body are not considered, so the computed motion is not dynamically consistent (such techniques are especially used in computer animation, where visual realism is the primary concern). This inconsistency prevents force equilibrium in the ID step when experimental external forces are added (e.g. the ground reaction force). In OpenSim, for instance, this inconsistency appears as residual forces (Hicks et al. 2015).

In order to improve the dynamic consistency of the replayed motion, some studies include dynamic considerations in the motion computation. Multon et al. (Multon et al. 2009) combine IK with dynamic corrections, modifying motion capture data to respond to physical collision forces. Da Silva et al. (Da Silva, Abe, and Popović 2008) and Muico et al. (Muico et al. 2009) directly use controller-based techniques including dynamic constraints to animate a DHM, but they still require the joint trajectories resulting from IK as an input. To avoid the IK step, some authors work directly in the operational space. John and Dariush (John and Dariush 2014) use a task space kinematic control method (closed-loop inverse kinematics) and dynamic constraints to track the motion directly in task space. Ott et al. (Ott, Lee, and Nakamura 2008) connect the markers to the DHM body with virtual springs, and use the generated forces to compute the DHM motion through the dynamic model equation. Demircan et al. (Demircan et al. 2010) use an operational space approach based on null-space projection (Khatib 1987) to track the Cartesian marker trajectories. However, these techniques cannot explicitly take into account certain constraints of the movement. Specifically, inequality constraints such as joint limits cannot be included in null-space projection techniques. Instead, authors resort to suboptimal heuristics to account for inequality constraints through avoidance tasks (Sentis and Khatib 2006), or simply dismiss these constraints.

2.2. DHM controller

The dynamic replay method presented in this work is based on direct control of the markers in the operational space through an optimization-based controller. Unlike analytical techniques such as explicit null-space projection (Khatib 1987), numerical optimization-based techniques make it possible to resolve the human kinematic redundancy while explicitly taking into account both equality and inequality constraints.

The DHM used here is a rigid body model which does not include muscles. Each joint is controlled by a single ideal rotational actuator, so the actuation variables are the joint torques. Though muscle-related quantities cannot be measured with such a model, biomechanical demands can be estimated with quantities such as joint loads, joint dynamics, or mechanical energy. Besides, while musculoskeletal models have proved valid and insightful in specific cases, no general criterion has been established yet to solve the muscle recruitment problem. This is a concern for the reliability of muscle-related measurements (Hicks et al. 2015; Thelen, Anderson, and Delp 2003; Damsgaard et al. 2006; Chaffin, Andersson, and Martin 2006). The questionable gain in information and the significantly higher computational cost therefore reduce the interest of musculoskeletal models in the current context.

The DHM motion is computed with the linear quadratic programming (LQP) controller framework developed by Salini et al. (Salini, Padois, and Bidaud 2011). LQP handles the optimization of a quadratic objective that depends on variables subjected to linear equality and inequality constraints. The variables are the joint torques, but also the contact forces. The ground reaction force (GRF) is therefore computed in the optimization process, and does not need to be recorded beforehand to replay the motion. The GRF estimation is a significant advantage over most other motion replay techniques since it simplifies the experimental set-up. The control problem is formulated as follows:

$$\operatorname*{argmin}_{X}\; \sum_i \omega_i T_i(X) \quad \text{s.t.} \quad \begin{cases} M(q)\dot{\nu} + C(q,\nu) + g(q) = S\tau - \sum_j J_{c_j}^{T}(q)\, w_{c_j} \\ G X \le h \end{cases} \tag{1}$$

where $\tau$ is the vector of joint torques, $w_{c_j}$ the contact wrench of the j-th contact point, $q$ the generalized coordinates of the system (i.e. joint angles), $\nu$ the generalized velocity concatenating the free-floating base twist and the joint velocities $\dot{q}$, and $X = (\tau^T, w_c^T)^T$. The equality constraint is the equation of motion; $M$ is the inertia matrix of the system, $C$ the vector of centrifugal and Coriolis forces, $g$ the vector of gravity forces, $S$ the actuation selection matrix due to the free-floating base, and $J_{c_j}$ the Jacobian of the j-th contact. The inequality constraint includes the bounds on joint positions, velocities, and torques formulated in $\tau$, and the contact existence conditions for each contact point according to the Coulomb friction model:

$$C_{c_j} w_{c_j} \le 0 \quad \forall j, \qquad J_{c_j}(q)\,\dot{\nu} + \dot{J}_{c_j}(\nu, q)\,\nu = 0 \quad \forall j \tag{2}$$

where $C_{c_j}$ is the linearized friction cone of the j-th contact point.

The objective function is a weighted sum (weights $\omega_i$) of tasks $T_i$ representing the squared error between a desired acceleration or wrench and the system acceleration/wrench. The solution is a compromise between the different tasks, based on their relative weights (the proposed method could however easily be adapted to a strict priority strategy such as hierarchical quadratic programming (Escande, Mansard, and Wieber 2014)). Four categories of tasks are defined by the following errors:

• Operational space acceleration: $\| J_i \dot{\nu} + \dot{J}_i \nu - \ddot{X}_i^* \|^2$
• Joint space acceleration: $\| \ddot{q} - \ddot{q}^* \|^2$
• Operational space wrench: $\| w_i - w_i^* \|^2$
• Joint torque: $\| \tau - \tau^* \|^2$

where $\ddot{X}_i$ is the Cartesian acceleration of body i, and $w_i$ the wrench associated with body i. The superscript $*$ refers to the desired acceleration/wrench/torque. The desired acceleration is defined by a proportional-derivative control law:

$$\ddot{z}^* = \ddot{z}^{goal} + K_v(\dot{z}^{goal} - \dot{z}) + K_p(z^{goal} - z) \tag{3}$$

where $z$ stands for $X$ or $q$, and $K_p$ and $K_v$ are the proportional and derivative gains. The superscript goal indicates the position, velocity and acceleration wanted for the body or joint (reference trajectory). Note that the acceleration variable $\dot{\nu}$ can be expressed as a function of $X$ using the equation of motion.

2.3. Tasks for motion replay

The recorded motion is mapped onto the DHM by creating an operational acceleration task for each marker placed on the subject's body, and using the recorded marker trajectories as reference trajectories (Fig. 2). Due to unavoidable differences between the human and DHM kinematics, the marker tracking tasks alone are often not sufficient to maintain the DHM balance (the balance is in open-loop control and the DHM falls). A center of mass (CoM) acceleration task is therefore added to control balance. The reference CoM acceleration is computed with a Zero Moment Point (ZMP) preview control method (Kajita et al. 2003). Additionally, if the activity includes exerting intentional force on the environment (e.g. pushing an object), an operational space wrench task is created at the hand. The original ZMP preview control scheme is modified to take into account these known external forces acting on the DHM. Unlike the GRF, the intentional reference force must be given as an input to the controller (e.g. from a force sensor measurement or known object features). In order to obtain a natural posture even when some body segments are not entirely constrained by the marker tracking tasks, preferred joint angles are specified with joint acceleration tasks (corresponding to a standing posture with arms along the body). Finally, joint torque minimization tasks are added to prevent useless effort and ensure the uniqueness of the solution to the optimization problem.
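To make the structure of problem (1) concrete, the following minimal sketch sets up a single-task instance of such a weighted quadratic program. It is an illustration only, not the authors' controller: the dimensions and dynamics matrices are random toy placeholders, the free-floating base, the full task set and the full friction-cone linearization are omitted, and the off-the-shelf cvxpy modelling library stands in for the LQP solver used in the paper.

```python
# Toy weighted-task quadratic program in the spirit of Eq. (1).
import numpy as np
import cvxpy as cp

n, m = 4, 3                              # toy joint and contact-wrench dimensions
rng = np.random.default_rng(0)

# Toy dynamics quantities at the current state (a real DHM would provide these).
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))
M = M @ M.T                              # symmetric positive definite inertia matrix
Cg = 0.1 * rng.standard_normal(n)        # Coriolis/centrifugal + gravity vector
Jc = rng.standard_normal((m, n))         # contact Jacobian
Jt = rng.standard_normal((2, n))         # Jacobian of one operational task
xdd_des = np.array([0.5, -0.2])          # desired operational acceleration (PD output)
tau_max = 50.0                           # joint torque bound

tau = cp.Variable(n)                     # joint torques
w = cp.Variable(m)                       # contact wrench

# Equation of motion solved for the acceleration: nu_dot = M^-1 (tau + Jc^T w - C - g).
nu_dot = np.linalg.inv(M) @ (tau + Jc.T @ w - Cg)

objective = cp.Minimize(
    10.0 * cp.sum_squares(Jt @ nu_dot - xdd_des)   # weighted tracking task
    + 1e-6 * cp.sum_squares(tau)                   # background torque-minimization task
)
constraints = [
    Jc @ nu_dot == 0,                    # contact points do not accelerate
    cp.abs(tau) <= tau_max,              # torque bounds
    w[m - 1] >= 0,                       # unilateral normal force (stand-in for the cone)
]
cp.Problem(objective, constraints).solve()
print("joint torques:", tau.value)
```

A real controller would rebuild and solve such a problem at every control step, with M, C, g and the Jacobians re-evaluated at the current DHM state.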
Given the high number of tasks in the controller and the differences between the human and DHM kinematics, not all tasks can be fully fulfilled. The weighting strategy of the controller makes it possible to deal with conflicting objectives, but the task weights nevertheless affect the resulting motion. The balance task, for instance, is required to prevent the DHM from falling, but it alters the lower- and mid-body marker tracking tasks. Approximate values of the task weights are first determined from common sense. In accordance with Demircan et al. (Demircan et al. 2010), distal marker tasks are assigned larger weights than proximal marker tasks to reduce the effect of cumulative errors in preceding joint positions. Joint space tasks (torque minimization and preferred posture) have the lowest weights since they are background tasks. Weights are then manually tuned by trial-and-error. Though time consuming in the first place, the tuning process does not need to be repeated; the weights obtained are general enough to be used for successfully replaying many different activities. The weight values are given in Table 1.

[Figure 1. Two different ways of pushing a heavy object, resulting in different biomechanical demands on the human body (inspired from Demircan 2012).]

[Figure 2. Joint space and operational space tasks used in the LQP controller for the dynamic replay of human motion: marker, balance and manipulation-force inputs feed operational space acceleration and wrench tasks, while posture and effort objectives feed joint space acceleration and joint torque tasks, ensuring physical consistency of positions, velocities, accelerations, forces and torques.]

Table 1. Numerical values of the task weights used for dynamic replay.

  balance                          10
  markers:  back                    1
            head                    2.5
            shoulder                2.5
            elbow                   5
            hand                   20
            knee                    1
            ankle                   1
  posture:  back                   10^-1
            neck                   10^-2
            scapula                10^-1
            shoulder               10^-2
            elbow                  10^-3
            wrist                  10^-3
            hip                    10^-2
            knee                   10^-2
            ankle                  10^-2
  torque minimization              10^-8

2.4. Sensitivity analysis of human performances

The dynamic replay method allows easy measurement of operational and biomechanical performances with a DHM. But these measurements alone do not give information about the postural changes which would enhance the overall performance. Providing postural recommendations requires knowing the influence of the adjustable postural parameters on the movement performance. Most of the time, however, no straightforward analytical relation between parameters and performances is available. This work therefore proposes to establish the parameter-performance influence through a statistical sensitivity analysis. The sensitivity analysis method has already been published in (Maurice et al. 2017) in the context of collaborative robot assessment. The method is summarized below, with a focus on its application to movement improvement.

Statistical sensitivity analysis relies on numerical evaluation of the output (indicators of performance) for numerous values of the input parameters (Saltelli, Chan, and Scott 2000). Given the large number of trials required, movements are simulated with an autonomous DHM so that many situations can rapidly be tested without the need for any human subjects. Unlike motion capture and replay, the motion of an autonomous DHM is automatically generated from high-level descriptions of the tasks to execute. The whole process for analyzing the dependence between the postural parameters of a given movement and the resulting performance is summarized as follows (Fig. 3):

(1) Define the adjustable parameters characterizing the way the movement is performed (e.g. position/orientation with respect to the environment, initial posture), and select among all the possible combinations the values that should be tested.
(2) Simulate the movement with an autonomous DHM for each selected combination of parameter values, and measure the associated performance indicators.
(3) Compute sensitivity measures for the performance indicators, based on their values in all the tested cases.

[Figure 3. Flow chart of the method for analyzing the dependence between postural parameters of the task and the resulting operational/biomechanical performances: human and task parameter sets drive dynamic simulations of the LQP-controlled manikin, the resulting indicator sets are analyzed, and the relevant indicators and influential parameters are extracted.]

In step 1, the adjustable parameters and the numerical bounds within which these parameters are allowed to vary depend on the movement that is considered. As such, their choice is not addressed by the method described here; instead the choice is left to the user. Appropriate numerical values to test for each of the selected parameters are determined according to the experimental design of the extended FAST method (Fourier amplitude sensitivity testing) (Saltelli, Tarantola, and Chan 1999). The FAST exploration method is a good compromise between the comprehensiveness of the space exploration and the number of trials.

In step 2, the DHM is animated with an LQP controller similar to the one used for motion replay, but the marker tracking tasks are replaced by hand and/or foot operational acceleration tasks (depending on the goal of the movement). The reference trajectory (or at least the start and end points) must still be given as an input to the simulation.

A detailed assessment of biomechanical performances requires several indicators to account for the different demands (e.g. posture, effort, energy). The number of performance indicators should however be limited to facilitate the analysis, while sufficiently accounting for the overall performance. Hence, step 3 aims at identifying both the most relevant performance indicators and the most influential parameters (i.e. the parameters that have the strongest effects on the relevant indicators). In the context of performance improvement, the relevance of an indicator is not related to its value but to its variations when the movement is performed in different ways; if the value of an indicator remains unchanged whichever way the movement is performed, this indicator is not useful for comparing different situations. Therefore the indicators are ranked according to their variance, after they have been normalized to make the variances comparable. The number of indicators that should be kept is then chosen according to the Scree test (Jolliffe 2002). Finally, the influence of each parameter on the relevant performance indicators is estimated by computing Sobol indices, which measure the percentage of variance of an indicator that is explained by the parameter (Hoeffding 1948; Sobol 1993). Sobol first order indices Si and total indices STi are used because they give information about the i-th parameter Xi independently from the influence of other parameters (Homma and Saltelli 1996).
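By way of illustration, this sample-then-analyze loop can be reproduced with the Python SALib library, which implements both the extended FAST design and the resulting first-order and total indices. This is a toy stand-in, not the study's tool chain (the paper uses the R sensitivity package and DHM simulations): three parameter names and bounds are borrowed from Table 2, and a closed-form function replaces the 11601 DHM simulations.

```python
# Toy extended-FAST sensitivity analysis with SALib.
import numpy as np
from SALib.sample import fast_sampler
from SALib.analyze import fast

# Three of the Table 2 parameters, with their user-defined bounds.
problem = {
    "num_vars": 3,
    "names": ["elbow_flexion_deg", "pelvis_orientation_deg", "interfeet_sagittal_m"],
    "bounds": [[10, 135], [-30, 30], [-0.25, 0.25]],
}

X = fast_sampler.sample(problem, 1000)   # extended-FAST design (one row per trial)

# In the real study each row would drive one autonomous-DHM simulation;
# here an arbitrary analytic "indicator" keeps the sketch self-contained.
Y = np.sin(X[:, 0] / 50.0) + 0.5 * X[:, 1] / 30.0 + 0.1 * X[:, 2]

Si = fast.analyze(problem, Y)            # first-order (S1) and total (ST) indices
for name, s1, st in zip(problem["names"], Si["S1"], Si["ST"]):
    print(f"{name:24s} S1={s1:5.2f}  ST={st:5.2f}")
```

The printed S1/ST pairs per parameter mirror the indices tabulated later in Table 5 of the paper.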
A high Si means that Xi alone strongly affects the performance indicator, while a small STi means that Xi has very little influence, even through interactions. This method makes it possible to identify which parameters should mainly be tuned to improve the overall performance. It should be noted that Sobol indices represent relative contributions, i.e., they inform on the influence of a parameter compared to all parameters within the tested set.

3. Experiment

This section presents an application of the method for guiding performance enhancement presented in the previous sections. Human motions are recorded and replayed to evaluate the dynamic replay method. The sensitivity analysis is then applied to the considered movement, and the results are used to provide postural recommendations. The improved situation is compared to the original one to ensure that the proposed recommendations do enhance the performance. It should be noted that the application presented here is a proof of concept of the method proposed in this paper.

3.1. Task description

An industrial manual task requiring significant effort is used as a test case. The task consists in drilling six holes consecutively in a vertical slab of autoclaved aerated concrete (dimensions: 30 × 60 cm) with a portable electric drill. The locations of the holes are imposed and depicted in Fig. 4. The drill weighs 2.1 kg. The average force needed to drill a hole in these conditions is around 40 N (measured with a force sensor embedded in the drill). The task duration is not constrained, but it takes about 1 min to perform the whole activity (take the drill, drill the six holes, put the drill down). Aside from the correct execution of the task (i.e. localization and depth of the holes), the main concern is the biomechanical performance: biomechanical demands should be minimized in order to decrease the risk of disease or injury.

3.2. Motion capture set-up

3.2.1. Participants

Five right-handed healthy students (3 males and 2 females) aged 25 to 30 years take part in the experiment (average height 1.72 ± 0.1 m; average body mass index 22.6 ± 0.8 kg.m−2). All participants gave informed consent before starting the experiment. Each participant performs the task ten times, with a resting period between each trial. The drill is held with the right hand only. Participants choose their feet positions; they are allowed to move their feet between each trial but not within a trial.

3.2.2. Instrumentation

Participants' motions are recorded with a CodaMotion system (www.codamotion.com) at 100 Hz. Participants are equipped with 25 markers spread all over their body (both legs, both arms, back and head). They stand on an AMTI force plate (http://www.amti.biz/) while performing the task to measure the GRF (for validation purposes only). A 6-axis ATI force sensor (http://www.ati-ia.com/products/ft/ft_models.aspx?id=Gamma) is embedded in the drill handle to measure the drilling forces (Fig. 4). The recorded data are filtered with a zero-phase 10 Hz low-pass 4th-order Butterworth filter. All recorded data are available upon request.

3.2.3. Replay

The recorded motions are replayed with a DHM using the dynamic replay method described in section 2.3. The drilling force measured with the force sensor is used as an input to the simulation, whereas the GRF measured with the force plate is used for validation purposes (in the simulation the GRF is automatically computed by the DHM controller). Simulations are run in the XDE dynamic simulation framework developed by CEA-LIST (Merlhiot et al. 2012).
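As an aside on the signal conditioning mentioned in the Instrumentation subsection, the zero-phase Butterworth filtering of the recorded data can be sketched in a few lines with SciPy. The signal below is synthetic; note also that the paper does not say whether the stated 4th order is the single-pass or the effective (forward-plus-backward) order, so a 4th-order design is simply passed to filtfilt here.

```python
# Zero-phase 10 Hz low-pass Butterworth filtering of a 100 Hz signal,
# as applied to the recorded marker and force data (synthetic signal here).
import numpy as np
from scipy.signal import butter, filtfilt

fs = 100.0                                           # CodaMotion sampling rate (Hz)
fc = 10.0                                            # cut-off frequency (Hz)
b, a = butter(N=4, Wn=fc / (fs / 2.0), btype="low")  # normalised cut-off in (0, 1)

t = np.arange(0.0, 5.0, 1.0 / fs)
raw = np.sin(2 * np.pi * t) + 0.2 * np.random.default_rng(0).standard_normal(t.size)
smoothed = filtfilt(b, a, raw)                       # forward-backward pass: no phase lag
```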
The DHM consists of 21 rigid bodies linked together by 20 compound joints, for a total of 45 degrees of freedom (DoFs), plus 6 DoFs for the free-floating base. Each DoF is a revolute joint controlled by a single actuator. Given each participant's height and mass, the DHM is automatically scaled according to average anthropometric coefficients (segment lengths: http://www.openlab.psu.edu/tools/calculators/proportionalityConstant; segment masses: http://biomech.ftvs.cuni.cz/pbpk/kompendium/biomechanika/geometrie_hmotnost_vypocet_en). Each body segment can be further manually modified to match the participant's morphology when needed.

3.3. Sensitivity analysis set-up

3.3.1. Postural parameters

In manual tasks, the postural parameters that can be adjusted are generally related to the participants' position/orientation with respect to the environment. For the drilling task, nine parameters are defined and listed in Table 2, along with their user-defined limits. They include morphology-related parameters, to check whether the formulated recommendations should depend on the person's morphology or not. Note that the parameters used in this work are specific to the task addressed. The proposed analysis method is generic and can be applied to any movement; however, the list of parameters and their bounds are movement-specific.

The R software sensitivity toolbox (http://www.r-project.org) is used to select, within the user-defined numerical bounds, the parameter values that need to be tested for the extended FAST analysis. The sample size and set of frequencies are chosen based on the number of parameters, according to the recommendations of Saltelli et al. (Saltelli, Tarantola, and Chan 1999). They result in a total of 11601 simulations. One simulation takes approximately 2 min (real time: 75 s) on one core of a 2.4 GHz Intel Core i7 laptop. The simulations are parallelized on 8 cores.

3.3.2. Simulations

The drilling task is simulated in the XDE framework with the autonomous DHM. Only the right hand trajectory and force are explicitly specified. The reference hand trajectory and drilling force profile are estimated from the data recorded for the replay step. The DHM feet do not move during a simulation, except if balance cannot be maintained and the DHM falls. The correctness of the drilling task execution is ensured by checking the actual hand trajectory and force in each simulation.

3.3.3. Performance indicators

25 indicators are measured to assess the biomechanical performance. They are chosen to quantify as exhaustively as possible the effects of all kinds of physical demands, including dynamic phenomena (see Maurice et al. (2014) for a detailed description of the indicators). 20 indicators are local quantities which directly estimate joint demands: joint position, velocity, acceleration, power and torque for the right arm, left arm, back and legs respectively. 5 indicators are global quantities which represent the ability of a person to comfortably perform certain actions. The force (resp. velocity) transmission ratio of the right hand estimates the capacity to produce force (resp. movement) in the drilling direction (Chiu 1987).
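The two transmission ratios just mentioned can be computed directly from the hand Jacobian, as in the sketch below. The formulas used are the usual manipulability-ellipsoid radii along a direction u (stated here from standard robotics results, assuming a full-row-rank Jacobian); the Jacobian and direction are toy values, not taken from the DHM.

```python
# Force and velocity transmission ratios along a task direction u,
# in the spirit of Chiu (1987): radii of the force and velocity
# manipulability ellipsoids of a (toy) hand Jacobian J along u.
import numpy as np

def transmission_ratios(J: np.ndarray, u: np.ndarray):
    """Return (force_ratio, velocity_ratio) along unit direction u."""
    u = u / np.linalg.norm(u)
    A = J @ J.T                                        # ellipsoid core matrix
    force_ratio = 1.0 / np.sqrt(u @ A @ u)             # large: force is "cheap" along u
    velocity_ratio = 1.0 / np.sqrt(u @ np.linalg.inv(A) @ u)  # large: motion is easy
    return force_ratio, velocity_ratio

J = np.array([[0.4, 0.3, 0.1],                         # toy 2x3 Jacobian (planar arm)
              [0.0, 0.5, 0.2]])
u = np.array([1.0, 0.0])                               # drilling direction (toy)
print(transmission_ratios(J, u))
```

Along the principal axes of the ellipsoid the two ratios are exact reciprocals, which is Chiu's force-velocity duality: a posture that favours force production along u penalizes velocity production along u, and vice versa.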
The sum of the squared distances between the center of pressure (CoP) and the base of support boundaries (balance stability margin) (Xiang et al. 2010), and the time before the CoP reaches the base of support boundary (dynamic balance), estimate the balance quality. The kinetic energy of the whole body estimates the human power consumption due to movement.

In order to make the variances comparable, the indicators must be scaled because they have non-homogeneous units and different orders of magnitude. Experimentally obtained reference values are used for the scaling (see Maurice et al. (2014) for more details). To summarize each time-varying indicator in a single value, time-integral values over a whole simulation are used. It should be noted that these biomechanical indicators are independent from the method presented in section 2. The method can be used with any indicators of human performance that can be measured on a DHM.

[Figure 4. Motion and force capture instrumentation for the drilling task: motion capture camera, force plate, motion capture markers, and a force sensor embedded in a modified commercial drill. The red circles on the slab represent the drilling points.]

Table 2. Parameter definitions and limit values for the drilling task. The total horizontal distance between the pelvis and the slab center is equal to the value of the corresponding parameter plus the arm length of the DHM. The right foot is in front when the inter-feet distance in the sagittal plane is positive. The influence of the person's morphology is taken into account both directly (with the DHM size and body mass index parameters) and indirectly, since some parameters are partly calculated based on the DHM size (inter-feet distance in the frontal plane, vertical distance between shoulder and slab center, horizontal distance between pelvis and slab center).

  Parameter                                                     Minimum   Maximum
  DHM size (m)                                                    1.65      1.85
  DHM body mass index (kg.m^-2)                                   21.0      27.0
  Preferred elbow flexion angle (deg)                             10        135
  Inter-feet distance in frontal plane (% of hip width)           100       200
  Inter-feet distance in sagittal plane (m)                      -0.25      0.25
  Orientation of drill handle w.r.t. vertical (deg)               0         90
  Pelvis horizontal orientation w.r.t. normal to slab (deg)      -30        30
  Vertical distance between shoulder and slab center (m)         -0.2       0.1
  Horizontal distance between pelvis and slab center
    (m + DHM arm length)                                         -0.3       0.0

4. Results

This section presents the comparison between the recorded and replayed motions (Fig. 5), the output of the sensitivity analysis, and the comparison of the initial and improved situations.

4.1. Dynamic replay validation

4.1.1. Motion

The reliability of the replayed motion is assessed by comparing the 3D Cartesian positions of the experimental markers (recorded with the CodaMotion system) with the simulated ones (points on the DHM body). The RMS errors between the experimental and simulated marker positions are presented in Table 3. The tracking error is smaller than 3 cm for all markers except the knee and right shoulder markers. The tracking error is smaller for the distal body parts, in accordance with the task weight distribution in the controller (higher weights for distal body parts). The tracking is better for the left arm than for the right arm because the left arm remains almost still. The right hand tracking error is nevertheless satisfactory (around 1 cm), given that the overall length of the hand trajectory is about 1 m. The results are consistent across participants.
4.1.2. Force

The reliability of the force data should be assessed by comparing the DHM joint torques computed with the controller with the human joint torques estimated from muscle forces. However, obtaining reliable human joint torque measurements is a practical issue. Conversely, the GRF is easily measured and provides an indirect estimation of the joint torques through the equation of motion (equality constraint in Eq. 1). The experimental GRF (measured with the force plate) and the simulated GRF (computed with the DHM controller) are therefore compared.

The Pearson's linear correlation coefficient r between the experimental and simulated GRF components is given in Table 4. A good correlation is observed for each GRF component (four components with r ≥ 0.90, two components with r ≥ 0.70). No significant permanent force/moment offset is observed (Fig. 6). FY (the direction of drilling) shows a better correlation than FX and FZ because the variations of FY have a larger amplitude (Fig. 6). There are no significant differences across participants, except for the vertical force FZ. The disparity of the FZ results might however be due to the lower precision of the force plate in this direction, because of the higher load.

4.2. Sensitivity analysis

Out of 25 biomechanical indicators, 6 are identified as relevant for the drilling task by the sensitivity analysis. These relevant indicators are given in Table 5. Together they account for 80 % of the total variance information, so little information is lost by not taking into account the other indicators. The presence of the upper-body torque and position indicators among the relevant indicators is consistent with the physical demands of the drilling task (exerting a significant force with the right hand while covering an extended area). The absence of any velocity and acceleration indicators is expected, since the drilling task does not require fast motions.

[Figure 5. Motion capture (left) and dynamic replay with the LQP controller (right) of the drilling task. A video of the recorded and replayed motions is available in the supplementary material.]

Table 3. Average RMS error between the experimental and simulated 3D marker positions across the 10 trials, for the 5 participants (Si stands for subject i). Several markers are placed on each body/joint, but only the largest RMS error of all markers placed on a body/joint is given. Values are RMS errors in cm.

                      S1    S2    S3    S4    S5    Mean   SD
  Ankle              1.5   1.0   0.9   1.1   1.3    1.2   0.2
  Knee               4.6   4.9   3.7   3.8   4.8    4.4   0.5
  Back               2.9   3.2   1.7   2.5   3.1    2.7   0.5
  Head               1.6   1.6   0.6   1.5   1.0    1.3   0.4
  Right Shoulder     7.8   6.9   2.0   6.8   7.0    6.1   2.1
  Left Shoulder      3.9   2.7   1.9   2.2   3.5    2.8   0.8
  Right Elbow        2.9   3.0   2.7   2.9   3.1    2.9   0.1
  Left Elbow         0.8   0.5   0.3   0.8   0.7    0.6   0.2
  Right Wrist/Hand   0.8   1.0   1.5   0.7   1.3    1.1   0.3
  Left Wrist/Hand    0.5   0.4   0.2   0.3   0.2    0.3   0.1

Table 4. Average Pearson's correlation coefficient between the simulated and experimental GRF components across the 10 trials, for the 5 participants (Si stands for subject i). X is the sagittal axis, Y the frontal axis (drilling direction), and Z the vertical axis. FX-FZ are forces, MX-MZ are moments.

             FX    FY    FZ    MX    MY    MZ
  S1        0.82  0.98  0.70  0.78  0.98  0.96
  S2        0.82  0.98  0.62  0.95  0.98  0.95
  S3        0.62  0.98  0.57  0.96  0.98  0.96
  S4        0.78  0.98  0.91  0.96  0.98  0.98
  S5        0.77  0.98  0.82  0.96  0.98  0.97
  Average   0.76  0.98  0.72  0.92  0.98  0.97
  SD        0.07  0.00  0.13  0.07  0.00  0.01

[Figure 6. Time evolution of the experimental and simulated GRF components, for one trial of participant No. 5. A: Force FX. B: Force FY. C: Force FZ. D: Moment MX. E: Moment MY. F: Moment MZ. The force and moment errors of this trial are representative of their average values across all participants and trials. The moments are given at the center of the feet.]

Table 5 also gives the values of the Sobol indices for the 6 relevant indicators. Some parameter-indicator relations represented by these indices are expected and confirm the consistency of the proposed analysis (e.g. the strong influence of the inter-feet frontal distance on the balance stability margin, indicated by the high value of the Sobol first order index). Other relations are less straightforward and could not easily be guessed without the sensitivity analysis, for instance the influence of the pelvis orientation on the right arm torque indicator. Overall, the Sobol first order indices indicate that the pelvis orientation, the inter-feet sagittal distance, and the desired elbow flexion have a significant influence on several of the relevant indicators. These parameters should primarily be optimized to enhance the biomechanical performances. Conversely, low values of the Sobol total indices indicate that the influences of the slab height and the pelvis-slab distance are small. The values of these parameters can be freely chosen by the person performing the activity. Surprisingly at first, the DHM morphological parameters are not identified as influential according to the Sobol indices. This phenomenon can however be explained by the fact that other parameters are scaled based on the DHM size. Therefore, the set-up is adjusted depending on the DHM size, which likely reduces the influence of the morphological parameters on the biomechanical demands.

4.3. Posture modification

Sobol indices provide quantitative information about the magnitude of the parameters' influence, but they do not inform on the detail of the indicator-versus-parameter evolution. Such an evolution can be estimated with a metamodel (Box and Draper 1987), but building a metamodel requires many more trials. As such, Sobol indices are not useful for finding the parameters' optimal values. However, trend curves can be obtained from the large number of trials performed for the sensitivity analysis, and used to identify well-performing parameter values (a toy version of this selection step is sketched after this subsection). For each parameter, the optimal value is determined by considering only the associated relevant indicators. Recommendations for the drilling task are provided in Table 6. Pure morphological parameters (DHM size and body mass index) are excluded since they do not have a significant influence on the performance according to Table 5.

The modified activity is compared to the initial one to validate the benefit resulting from the proposed recommendations. As a first validation, both the initial and the modified situations are evaluated with the autonomous DHM simulation (a complete validation would require the recording and replay of the movement performed by a human subject following the recommendations). The initial and recommended values of the parameters are displayed in Table 6 and the corresponding situations are illustrated in Fig. 7. The values of the relevant indicators measured in both situations are presented in Table 7. Out of the 6 relevant indicators, 5 are significantly improved by the proposed modifications, while the last one is only slightly worsened.
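As flagged above, here is a minimal sketch of the trend-curve selection step, on invented data rather than the study's 11601 trials: one influential parameter is binned over its allowed range, the mean of a (normalised) indicator is computed per bin, and the bin with the lowest mean demand is retained.

```python
# Toy trend-curve selection for one parameter (pelvis orientation, deg).
# X and Y are synthetic stand-ins for the FAST trials and one indicator.
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-30.0, 30.0, size=5000)            # tested parameter values
Y = (X + 15.0) ** 2 / 900.0 + 0.1 * rng.standard_normal(X.size)  # toy demand

edges = np.linspace(-30.0, 30.0, 13)               # 12 bins over the allowed range
centres = 0.5 * (edges[:-1] + edges[1:])
means = np.array([Y[(X >= lo) & (X < hi)].mean()
                  for lo, hi in zip(edges[:-1], edges[1:])])

print(f"recommended value ~ {centres[np.argmin(means)]:.1f} deg")
```

The toy indicator is built so that its minimum sits near -15 deg; the agreement with Table 6's recommended pelvis orientation is by construction, not a result.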
Biomechanical demands associated with the right arm torque, legs position, right arm position, force transmis- sion ratio and balance stability are reduced by 29, 42, 14, 16 and 35 % respectively, whereas the back torque demand is increased by 9 %. Importantly, the significant re- duction in biomechanical demand is achieved even though the range of variation of the adjustable parameter is limited. This result advocates for the use of the proposed method which can identify minor changes with a major impact. The evolution of the shoulder flexion and rotation torques during the drilling move- ment are plotted in Fig. 8 to illustrate the reduced physical demands on more detailed 16 Table 5. Sobol indices for the 6 biomechanical indicators identified as relevant for the drilling task. For each parameter and indicator, the upper value is the first order index, the lower value is the total index (Sobol indices can range from 0 to 1). The biomechanical indicators are presented in decreasing order of importance (decreasing variance) from left to right. The percentages below their names correspond to the percentage of the total variance they explain. FTR stands for force transmission ratio. Numbers are colored from blue (minimum) to red (maximum), to facilitate the reading. Relevant biomechanical indicators Right Arm Legs Right Arm Back FTR drilling Balance stability torque position position torque direction margin 23 % 20 % 15 % 9 % 7 % 6 % P a r a m e t e r s DHM size 10−3 0.01 10−3 0.02 0.09 0.03 0.02 0.35 0.04 0.10 0.11 0.05 DHM bmi 10−4 10−3 10−4 0.01 0.26 10−4 0.03 0.31 0.10 0.10 0.30 0.03 Elbow flexion 0.22 0.11 0.03 10−3 0.02 10−3 0.26 0.67 0.08 0.09 0.06 0.03 Inter-feet sagittal 0.01 0.16 0.08 0.15 10−3 0.17 distance 0.03 0.66 0.15 0.30 0.02 0.23 Inter-feet frontal 10−4 10−3 10−5 0.01 10−4 0.53 distance 0.02 0.31 10−3 0.07 0.02 0.60 Drill 0.16 10−4 0.16 10−3 10−3 10−4 orientation 0.26 0.32 0.24 0.12 0.05 0.02 Pelvis orientation 0.32 0.02 0.45 0.43 0.45 0.10 0.37 0.50 0.59 0.63 0.52 0.19 Stab height 0.04 10−4 0.05 0.06 0.04 10−4 0.09 0.26 0.09 0.20 0.07 0.02 Pelvis distance 0.09 0.01 0.06 0.01 0.04 10−4 0.15 0.30 0.19 0.10 0.08 0.03 Table 6. Initial and recommended parameters values for the drilling task. The parameters which have the largest influence on the performance are highlighted in bold. The values for the initial situation are measured on participant No.5. Parameter Initial Recommended Preferred elbow flexion angle (◦) 100 135 Inter-feet distance in frontal plane (% of hip width) 120 200 Inter-feet distance in sagittal plane (m) 0 -0.1 Orientation of drill handle w.r.t. vertical (◦) 0 0 Pelvis horizontal orientation w.r.t. normal to stab (◦) 0 -15 Vertical distance between shoulder and stab center (m) 0 -0.1 Horizontal distance between pelvis and stab center (m + DHM arm length) 0 0.1 Table 7. Values of the relevant biomechanical indicators measured in the initial and modified situations. FTR for Force Transmission Ratio. For each indicator, the value displayed is the percentage of the indicator reference value used for the scaling. Right Arm Legs Right Arm Back FTR drilling Balance stability torque position position torque direction margin Init. 138 69 126 76 128 106 Modif. 98 40 108 83 107 69 17 biomechanical quantities. Both joint torques are reduced in the modified situation, and the DHM simulation allows a quantitative estimation of this reduction. 5. Discussion The results presented in the previous section demonstrate the usefulness of the pro- posed method. 
On one hand, the dynamically replayed motion is very similar to the original one, and the simulated GRF is consistent with the experimental one. On the other hand, the situation modified according to the sensitivity analysis results exhibits enhanced performances, confirming the benefit of the postural modifications. Never- theless, the application of the proposed method should be considered carefully because of some current limitations of the human model and control. Such limitations are dis- cussed thereafter, along with leads on future research directions which may help lift those limitations. 5.1. Accuracy of dynamic replay The tracking error of the markers 3D trajectories obtained with the dynamic replay method is similar to the tracking error reported for other replay methods including dynamic considerations (Table 3). Demircan et al. report a tracking error between 0 and 4 cm depending on markers, when replaying a throwing motion (Demircan et al. 2010). John and Dariush report an RMS tracking error of 4 cm for the worst marker, when tracking a set of 30 markers (John and Dariush 2014). This latter study however addresses seated motions where balance is less of an issue. Though good, the replayed motion is nonetheless not exact (non-zero tracking error). In addition to soft tissue deformation and uncertainty on markers placement, the tracking errors are mainly due to differences between the human and DHM kinematics. For instance, the complexity of the human shoulder (De Sapio 2007) is not rendered in the DHM kinematics, hence the right shoulder tracking error. The balance task in the controller also alters the accuracy of the replayed motion. Due to differences between the human and DHM kinematics and the lack of decision skills of the DHM regarding how to recover balance, the ZMP preview control scheme in the balance task is tuned to be conservative. Most unstable situations are thus avoided as long as the original motion is not too unstable. However, the balance improvement is achieved at the cost of a modified motion, hence a less accurate replay. This partly explains the knee tracking error observed despite the small displacement of the knees during the drilling movement. The kinematic differences between the human and the DHM can be minimized with more complex musculosketetal models, but they remain unavoidable and necessarily affect the quality of the replayed motion and forces (Hicks et al. 2015). These differ- ences are, however, not specific to the dynamic replay method, but affect any kind of motion replay. Moreover, the replay method presented in this paper with a basic human model could also be used with more accurate musculosketetal models – though at a larger computational cost – to reduce model-induced errors. 18 A B Figure 7. Snapshots of the initial (A) and modified (B) drilling movement simulated with the autonomous DHM. sh ou ld er f le xi on to rq ue ( N .m ) time (s) 10 15 20 25 10 20 30 40 50 60 70 initial modified 0 2 4 -2 -4 -6 -8 time (s) 10 20 30 40 50 60 70 sh ou ld er r ot at io n to rq ue ( N .m ) initial modified A B Figure 8. Time evolution of the shoulder flexion (A) and rotation (B) torques in the initial and modified situations. The grey areas correspond to the drilling periods. 19 5.2. Musculoskeletal model and motor control Unlike the muscular actuation of the human body, the actuation of the DHM is de- scribed at joint level only, with each DoF controlled by a single actuator. 
The biome- chanical quantities measured with such a model are necessarily less detailed than what could be measured with a musculoskeletal model. Moreover, the DHM joint torques do not fully represent the physical effort exerted by a person. Due to the redun- dancy of human actuation, different combinations of muscle forces can result in a same joint torque. When simultaneously contracting antagonist muscles (co-contraction phe- nomenon), a person can generate forces which do not produce any net joint torque. These ”internal forces” do not have any equivalent in the DHM and are not accounted for in the evaluation. Co-contraction of antagonist muscles aims at increasing joint impedance to withstand perturbations arising from limb dynamics or external loads (Gribble et al. 2003); though especially important in high accuracy motions, it is nev- ertheless present in all motions. Not taking co-contraction forces into account therefore leads to an under-estimation of the human effort. Nevertheless, the motion replay method presented in this work is modular, in that it decouples the rigid body dynamics from the actuation dynamics. The output of the dynamic replay fully describes the evolution of the system dynamics in terms of state [ qT (t), q̇T (t) ]T , joint torques τ (t) and applied external wrenches wc(t). Given this evolution, the use of an inverse musculoskeletal model (MM) could give access to the evolution of muscle activations um(t) = f −1(q(t), q̇(t), τ (t), wc(t), MM). Muscle-related performance indicators could thus be estimated. However the question of muscle recruitment in the MM – especially regarding co-contraction – remains an open issue. 5.3. Autonomous motion generation The sensitivity analysis being based on DHM simulations, the biomechanical reliabil- ity of the results depends on the realism of the autonomous DHM motion. Simulating highly realistic human motions requires a model of whole-body motor control which accounts for the redundancy of the human musculoskeletal system and for the slow dynamics of human muscle activation. The slow dynamics is particularly limiting as it requires to consider the motor control problem from an optimal control perspective rather than from a purely reactive perspective. The first consequence is a large increase in the computational cost, which is not compatible with running thousands of simula- tions in a reasonable amount of time. Moreover, solving this optimal control problem requires to understand the psychophysical principles that voluntary movements obey. Though many studies have been conducted to establish mathematical formulae of such principles (e.g. Fitt’s law (Fitts 1954), minimum jerk principle (Flash and Hogan 1985), two-thirds power law (Viviani and Flash 1995)) and some have been success- fully applied to DHM simulations (De Magistris et al. 2013), these formulations remain largely limited to reaching motions. Indeed, driving principles are not yet known for all kinds of whole-body motions, especially when significant external forces are at play. Transposed to a DHM, determining the underlying principles of human motion comes down to establishing which mathematical quantities are optimized when human-like motions are performed. Optimality criteria can be investigated through inverse opti- mization techniques – as proposed by Clever et al. (Clever, Hatz, and Mombaur 2014) for human locomotion or by Berret et al. (Berret et al. 2011) for reaching motions – but they remain an open research problem. 
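Of the motor-control principles cited in this subsection, the minimum-jerk profile of Flash and Hogan (1985) is the easiest to state concretely. The sketch below generates such a profile, of the kind that can serve as a reference hand trajectory for a reaching task of an autonomous DHM; the values are toy ones and only one Cartesian dimension is shown.

```python
# Minimum-jerk point-to-point profile (Flash and Hogan 1985): the unique
# fifth-order polynomial with zero velocity and acceleration at both ends.
import numpy as np

def minimum_jerk(x0: float, xf: float, duration: float, t: np.ndarray) -> np.ndarray:
    """Position along a 1-D minimum-jerk reach of the given duration."""
    s = np.clip(t / duration, 0.0, 1.0)            # normalised time in [0, 1]
    return x0 + (xf - x0) * (10 * s**3 - 15 * s**4 + 6 * s**5)

t = np.linspace(0.0, 1.0, 101)
x = minimum_jerk(0.0, 0.3, 1.0, t)                 # a 30 cm reach in 1 s
```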
20 Nevertheless, the sensitivity analysis method in itself is independent from the DHM control. If an improved control law is available, it can be used to generate more realistic motions, while the analysis method remains the same. 6. Conclusion This paper presents a framework to easily assess and enhance the performance of human postures and movements, based on dynamic DHM simulations. Existing situ- ations are replayed from motion capture data with an LQP controller. This method guarantees the dynamic consistency of the replayed motion and forces, unlike what is currently achieved with most existing replay methods. The reliability of the biome- chanical measurements taken on the DHM is thus increased, without requiring exper- imental GRF measurement. Then a sensitivity analysis of the movement is conducted with autonomous DHM simulations to identify the most influential parameters of the movement, and thereby provide recommendations for improvement. Because the DHM motion is automatically generated in this step, only little input data are needed to carry out the analysis. In particular, there is no need for human participants to per- form multiple repetitions of the motion. The proposed method is applied to a drilling movement. Experiments carried out on 5 participants show that motions and forces are reliably replayed. The sensitivity analysis allows to highlight and rank some non trivial phenomena, which cannot be quantified a priori. Finally, the assessment of the modified situation shows significant improvement in performances compared to the initial situation, demonstrating the usefulness of the proposed method. Futur work includes presenting the movement improvement tool to a human move- ment expert (e.g., ergonomist) to receive his/her feedback and compare the optimal movement obtained with the method proposed in this work to his/her own recommen- dations. Future research will also be directed towards coupling the developed tool with more detailed human models, such as musculo-skeletal models. Such a coupling will allow to measure quantities that more accurately represent the physiological demands during the movement. This will require investigating optimal coupling between the fast rigid-body simulation presented here and computationnally expensive musculo- skeletal models, in order to keep the simulation time compatible with the execution of multiple simulation instances. To conclude, though the application presented in this paper focuses on biomechan- ical performance, the method is more general and can be applied to other domains, such as rehabilitation. Funding This work was partially supported by the RTE company through the RTE/UPMC chair ”Robotics Systems for field intervention in constrained environments”, held by V. Padois. P. Maurice is supported in part by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 731540 (An.Dy). 21 References Berret, B., E. Chiovetto, F. Nori, and T. Pozzo. 2011. “Evidence for composite cost functions in arm movement planning: an inverse optimal control approach.” PLoS Comput Biol 7 (10). Box, G.E.P., and N.R. Draper. 1987. Empirical model-building and response surfaces. John Wiley & Sons. Chaffin, D.B., G.B.J. Andersson, and B.J. Martin. 2006. Occupational biomechanics. 4th ed. Wiley. Chiu, SL. 1987. “Control of redundant manipulators for task compatibility.” Proceedings of the IEEE International Conference on Robotics and Automation 4: 1718–1724. Clever, D., K. Hatz, and K. Mombaur. 2014. 
“Studying Dynamical Principles of Human Loco- motion using Inverse Optimal Control.” Proceedings in Applied Mathematics and Mechanics 14 (1): 801–802. Da Silva, M., Y. Abe, and J. Popović. 2008. “Simulation of Human Motion Data using Short- Horizon Model-Predictive Control.” Computer Graphics Forum 27 (2): 371–380. Damsgaard, M., J. Rasmussen, S. T. Christensen, E. Surma, and M. de Zee. 2006. “Analysis of musculoskeletal systems in the AnyBody Modeling System.” Simulation Modelling Practice and Theory 14 (8): 1100–1111. De Magistris, G., A. Micaelli, P. Evrard, C. Andriot, J. Savin, C. Gaudez, and J. Marsot. 2013. “Dynamic control of DHM for ergonomic assessments.” International Journal of Industrial Ergonomics 43 (2): 170–180. De Sapio, V. 2007. “Task-level strategies for constrained motion control and human motion synthesis.” PhD diss., Stanford University. Delp, S.L., F.C. Anderson, A.S. Arnold, P. Loan, A. Habib, C.T. John, E. Guendelman, and D.G. Thelen. 2007. “OpenSim: open-source software to create and analyze dynamic simu- lations of movement.” IEEE Transactions on Biomedical Engineering 54 (11): 1940–1950. Demircan, E. 2012. “Robotics-based Reconstruction and Synthesis of Human Motion.” PhD diss., Stanford University. Demircan, E., T. Besier, S. Menon, and O. Khatib. 2010. “Human motion reconstruction and synthesis of human skills.” Advances in Robot Kinematics: Motion in Man and Machine 283–292. Escande, A., N. Mansard, and P.B. Wieber. 2014. “Hierarchical quadratic programming: Fast online humanoid-robot motion generation.” The International Journal of Robotics Research . Fitts, P.M. 1954. “The information capacity of the human motor system in controlling the amplitude of movement.” Journal of experimental psychology 47 (6): 381. Flash, T., and N. Hogan. 1985. “The coordination of arm movements: an experimentally con- firmed mathematical model.” The journal of Neuroscience 5 (7): 1688–1703. Fortenbaugh, D., G.S. Fleisig, and J.R. Andrews. 2009. “Baseball pitching biomechanics in relation to injury risk and performance.” Sports Health: A Multidisciplinary Approach 1 (4): 314–320. Gribble, P.L., L.I. Mullin, N. Cothros, and A. Mattar. 2003. “Role of cocontraction in arm movement accuracy.” Journal of Neurophysiology 89 (5): 2396–2405. Grochow, K., S.L. Martin, A. Hertzmann, and Z. Popović. 2004. “Style-based inverse kine- matics.” ACM Transactions on Graphics 23 (3): 522–531. Hicks, J.L., T.K. Uchida, A. Seth, A. Rajagopal, and S.L. Delp. 2015. “Is My Model Good Enough? Best Practices for Verification and Validation of Musculoskeletal Models and Sim- ulations of Movement.” Journal of biomechanical engineering 137 (2). Hoeffding, W. 1948. “A class of statistics with asymptotically normal distribution.” The annals of mathematical statistics 293–325. Homma, T., and A. Saltelli. 1996. “Importance measures in global sensitivity analysis of non- linear models.” Reliability Engineering & System Safety 52 (1): 1–17. John, C., and B. Dariush. 2014. “Dynamically Consistent Human Movement Prediction for In- 22 teractive Vehicle Occupant Package Design.” Proceedings of the 3rd Digital Human Modeling Symposium . Jolliffe, I. 2002. Principal component analysis. Wiley Online Library. Kajita, S., F. Kanehiro, K. Kaneko, K. Fujiwara, K. Harada, K. Yokoi, and H. Hirukawa. 2003. “Biped walking pattern generation by using preview control of zero-moment point.” Proceedings of the IEEE International Conference on Robotics and Automation 2: 1620– 1626. Khatib, O. 1987. 
“A unified approach for motion and force control of robot manipulators: The operational space formulation.” IEEE Journal of Robotics and Automation 3 (1): 43–53. Lee, J., and S.Y. Shin. 1999. “A hierarchical approach to interactive motion editing for human- like figures.” Proceedings of the 26th annual conference on Computer graphics and interactive techniques 39–48. Maurice, P., V. Padois, Y. Measson, and P. Bidaud. 2017. “Human-oriented design of collab- orative robots.” International Journal of Industrial Ergonomics 57: 88–102. Maurice, P., P. Schlehuber, V. Padois, Y. Measson, and P. Bidaud. 2014. “Automatic selection of ergonomie indicators for the design of collaborative robots: A virtual-human in the loop approach.” 14th IEEE-RAS International Conference on Humanoid Robots 801–808. Merlhiot, X., J. Le Garrec, G. Saupin, and C. Andriot. 2012. “The XDE mechanical kernel: Effi- cient and robust simulation of multibody dynamics with intermittent nonsmooth contacts.” Proceedings of the 2nd Joint International Conference on Multibody System Dynamics . Muico, U., Y. Lee, J. Popović, and Z. Popović. 2009. “Contact-aware nonlinear control of dynamic characters.” ACM Transactions on Graphics 28 (3): 81. Multon, F., R. Kulpa, L. Hoyet, and T. Komura. 2009. “Interactive animation of virtual humans based on motion capture data.” Computer Animation and Virtual Worlds 20 (5-6): 491–500. NRC. 2001. Musculoskeletal Disorders and the Workplace: Low Back and Upper Extremities. Institute of Medicine and National Research Council, National Academy Press. Ott, C., D. Lee, and Y. Nakamura. 2008. “Motion capture based human motion recognition and imitation by direct marker control.” Proceedings of the 8th IEEE-RAS International Conference on Humanoid Robots 399–405. Robinson, Gemma, and Peter ODonoghue. 2008. “A movement classification for the investi- gation of agility demands and injury risk in sport.” International Journal of Performance Analysis in Sport 8 (1): 127–144. Salini, J., V. Padois, and P. Bidaud. 2011. “Synthesis of complex humanoid whole-body be- havior: a focus on sequencing and tasks transitions.” Proceedings of the IEEE International Conference on Robotics and Automation 1283–1290. Saltelli, A., K. Chan, and E.M. Scott. 2000. Sensitivity analysis. Wiley. Saltelli, A., S. Tarantola, and K.P.S. Chan. 1999. “A quantitative model-independent method for global sensitivity analysis of model output.” Technometrics 41 (1): 39–56. Schneider, E., and X. Irastorza. 2010. OSH in figures: Work-related musculoskeletal disorders in the EU - Facts and figures. European Agency for Safety and Health at Work. Sentis, L., and O. Khatib. 2006. “A whole-body control framework for humanoids operating in human environments.” Proceedings of the IEEE International Conference on Robotics and Automation 2641–2648. Sobol, I.M. 1993. “Sensitivity estimates for non linear mathematical models.” Mathematical Modelling and Computational Experiments 407–414. Sturnieks, Daina L, Thor F Besier, Peter M Mills, Tim R Ackland, Ken F Maguire, Gwidon W Stachowiak, Pawel Podsiadlo, and David G Lloyd. 2008. “Knee joint biomechanics following arthroscopic partial meniscectomy.” Journal of Orthopaedic Research 26 (8): 1075–1080. Thelen, D.G., F.C. Anderson, and S.L. Delp. 2003. “Generating dynamic simulations of move- ment using computed muscle control.” Journal of biomechanics 36 (3): 321–328. Viviani, P., and T. Flash. 1995. 
“Minimum-jerk, two-thirds power law, and isochrony: converging approaches to movement planning.” Journal of Experimental Psychology: Human Perception and Performance 21 (1): 32.
Xiang, Y., J. S. Arora, S. Rahmatalla, T. Marler, R. Bhatt, and K. Abdel-Malek. 2010. “Human lifting simulation using a multi-objective optimization approach.” Multibody System Dynamics 23 (4): 431–451.

work_43xudc3jafc3zdkihf3w2jfpca ---- Crowd simulation: A video observation and agent-based modelling approach

Repository file: IJDH Shahrol 2016.pdf (889.45 kB). Journal contribution posted on 03.11.2016, 10:00 by Shahrol Mohamaddan, Keith Case.

Human movement in a crowd can be considered as complex and unpredictable, and accordingly large scale video observation studies based on a conceptual behaviour framework were used to characterise individual movements and behaviours. The conceptual behaviours were Free Movement (Moving Through and Move-Stop-Move), Same Direction Movement (Queuing and Competitive) and Opposite Direction Movement (Avoiding and Passing Through). Movement in crowds was modelled and simulated with an agent-based method implemented in the gaming software Dark BASIC Professional. The agents (individuals) were given parameters of personal objective, visual perception, speed of movement, personal space and avoidance angle or distance within different crowd densities. Two case studies including a multi-mode transportation system layout and a bottleneck / non-bottleneck evacuation are presented.

Categories: Mechanical Engineering not elsewhere classified. Keywords: Agent-based modelling; Crowd simulation; Observational study. School: Mechanical, Electrical and Manufacturing Engineering. Published in: International Journal of the Digital Human, Volume 1, Issue 3, Pages 229-247.

Citation: MOHAMADDAN, S. and CASE, K., 2016. Crowd simulation: A video observation and agent-based modelling approach. International Journal of the Digital Human, 1(3), pp. 229-247.

Publisher: © Inderscience. Version: AM (Accepted Manuscript). Publisher statement: This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence, https://creativecommons.org/licenses/by-nc-nd/4.0/. Acceptance date: 03/03/2016. Publication date: 2016-10-18. Notes: This paper was accepted for publication in the International Journal of the Digital Human; the definitive published version is available at http://dx.doi.org/10.1504/IJDH.2016.10000735. DOI: https://doi.org/10.1504/IJDH.2016.10000735. ISSN: 2046-3375. Language: en.
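The abstract above only names the agent parameters; as a loose illustration of how such parameters can drive an update rule, here is a minimal goal-seeking step with personal-space repulsion. It is written in Python for consistency with the other sketches in this collection (the original work used Dark BASIC Professional), and every rule and value here is invented rather than taken from the paper.

```python
# Toy crowd step: each agent steers towards its goal at its walking speed
# and is pushed away from neighbours inside its personal-space radius.
import numpy as np

def step(pos, goal, speed=1.3, personal_space=0.5, dt=0.1):
    """Advance all agents one tick (positions and goals are N x 2 arrays)."""
    drive = goal - pos
    drive /= np.linalg.norm(drive, axis=1, keepdims=True) + 1e-9
    repulse = np.zeros_like(pos)
    for i in range(len(pos)):
        d = pos[i] - pos                       # vectors from others to agent i
        dist = np.linalg.norm(d, axis=1)
        near = (dist > 0) & (dist < personal_space)
        if near.any():                          # inverse-distance repulsion
            repulse[i] = (d[near] / dist[near, None] ** 2).sum(axis=0)
    return pos + dt * (speed * drive + repulse)

pos = np.array([[0.0, 0.0], [0.2, 0.1], [4.0, 0.0]])
goal = np.array([[5.0, 0.0], [5.0, 0.0], [0.0, 0.0]])  # two opposing flows
for _ in range(50):
    pos = step(pos, goal)
```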
work_4cq3f23fdfhdzpyexkis4bms2a ---- Reconstructing Urbanization of a Pennine Fringe Township through Computational Chaining of Land Tax Records: Mottram in Longdendale 1784-1830

White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/87374/ Version: Accepted Version. Citation: Bibby, P.R. (2014) Reconstructing Urbanization of a Pennine Fringe Township through Computational Chaining of Land Tax Records: Mottram in Longdendale 1784-1830. International Journal of Arts and Humanities Computing, 8 (2). pp. 125-186. ISSN 1753-8548. https://doi.org/10.3366/ijhac.2014.0127

RECONSTRUCTING URBANIZATION OF A PENNINE FRINGE TOWNSHIP THROUGH COMPUTATIONAL CHAINING OF LAND TAX RECORDS: MOTTRAM IN LONGDENDALE 1784–1830

PETER BIBBY

Abstract

This paper uses Land Tax records to attempt to reconstruct the pattern of urbanization in a Pennine fringe township which formed part of the Lancashire cotton complex during the early industrial revolution. It uses logic programming to articulate rules to develop a longitudinal approach which chains together individual Land Tax records for successive years to identify perduring property objects, which are then located geographically using the pooled descriptors drawn from the returns. It investigates not only house repopulation, but also the character of new property development, of subdivision and amalgamation of holdings, and the changing control of housing. It allows a remarkably detailed reconstruction of change in the particular locality, revealing events that have gone unnoticed. Pent-up demand associated with proto-industrialization combined with the self-interest of a major absentee landlord to allow a flurry of small scale construction between 1785 and 1805; property then converted to workers' housing with the onset of industrial urbanism.
More generally, it is suggested that a computational approach of this sort allows for a more serious engagement with a source all too often dismissed as unpromising. The paper concludes by drawing out implications of the work for more traditional approaches to interpreting Land Tax returns.

Keywords: Land Tax, logic programming, house repopulation, Pennines, proto-industrialization

International Journal of Humanities and Arts Computing 8.2 (2014): 125-186. DOI: 10.3366/ijhac.2014.0127. © Edinburgh University Press 2014. www.euppublishing.com/ijhac

introduction

This paper explores the feasibility of using Land Tax returns to examine urbanization of a particular locality over the period 1784-1830. It attempts to chain together individual Land Tax records for successive years to identify enduring property objects, and to locate them geographically using any of the pooled descriptors within the returns. It also seeks to identify change and development as these property objects divide or combine. More tentatively, it attempts to move beyond the phenomenal level, beginning to examine the relationship between these physical changes and broader changes in economic organization.

Urbanization universally involves a reduction in direct economic dependence upon the land through the adoption of more indirect methods of production, and also the accretion of buildings. The form of any urbanization - that is, the scale and configuration of the physical effects, the balance of working time assigned to direct agricultural production and the organization of all forms of production - is historically specific. The particular locality of concern - Mottram-in-Longdendale, a township in the Pennine fringe in the north-eastern 'panhandle' of the former county of Cheshire - perhaps epitomized in 1780 the mutual dependency of domestic textile production and dairying. A 'cold and inclement' place, where 'the herbage is sour and turns to rushes' if not sufficiently limed,1 Mottram shared the archetypal preconditions for the emergence of the classic dual economies discussed by Thirsk.2 Its place in the geographic division of labour did not entail severance from the land, but a system of land use and development similar to that held by Defoe to typify the country around Halifax, in which 'as every clothier must keep a horse, perhaps two, to fetch and carry for the use of his manufacture . . . then every manufacturer generally keeps a cow or two, or more, for his family, and this employs the two, or three, or four pieces of enclosed land about his house'.3

It has long been appreciated that by the late eighteenth century the Pennine fringe was studded with cottages and adjoining crofts, intercalated within a mosaic of larger holdings, themselves still too small to provide adequate income by agriculture alone.4
Dependence on agriculture had been reduced not only through domestic spinning and weaving, but through engagement in crafts and trades such as tailoring and shoemaking.5 Population growth had been accommodated 'not so much [by] an urban increase but a thickening of the population over the countryside' as farm units were successively fragmented,6 a process which continued, producing by the 1840s spaces

'dotted with villages and groups of dwellings, and white detached houses, and manufactories presenting an appearance somewhat like that of a vast city scattered amongst meads and pastures, and belts of woodland'.7

Such specific configurations might be seen through the lens of the proto-industrialization thesis: their dispersed domestic industry reliant on distant markets, their need for greater capital inputs and changes in the organization of production driving subsequent industrialization.8 Indeed, Mottram formed part of the South East Lancashire cotton complex, producing for markets in Ireland, America and Europe, and was considered by Walton to bear all the 'stigmata of the classic proto-industrial model'.9 A specifically proto-industrial perspective on this dual economy might suggest that this landscape would be subject to intense demographic pressure, and raise questions about the lines of continuity with an emerging industrial urbanism, although the grand narratives turn away from such patterns of economic organization (and physical development) after 1800; as Walton has argued, proto-industrialization without industrial urbanism was not necessarily a 'dead-end'.10 Moreover, because the mechanization of weaving lagged so long behind the spinning branch, weaving continued to be undertaken by 'nearly identical household units of production'11 which composed Bamford's vast scattered city. Nevertheless, at least one local family active within the traditional dual economy - the Sidebottoms - became a major industrial capital within the township.

Part of the challenge in this current paper involves attempting to assess how changes in patterns of land-use and development that might be imputed from the Land Tax returns might variously have contributed to intensification of a proto-industrial pattern or to the constitution of an urban-industrial ensemble. Within the confines of a single township, however, competition for land implies that development of one form necessarily excludes others, and the perspectives of specific landowners become important. Despite Levine's view that a landlord-dominated proto-industrial village would be a contradiction,12 two thirds of the land in this particular township was controlled by a single absentee landlord. The tendency to fragmentation of farm units found here, and frequently associated with proto-industrialization,13 cannot be ascribed in this instance to partible inheritance. It must be understood in relation to the Tollemache family's perception of their interests, to the perceptions of their stewards, which are central to what follows, and also in relation to the contemporary discourse of estate management, which ran seamlessly into political economy. Practice on the Tollemache estate ran counter to contemporary conventional wisdom regarding the proper size of cottage grounds and the desirable size of farm units on landed estates, which usually favoured large farms. Although a counter position was championed by Nathaniel Kent14 and the potential of an alternative 'cow and cottage economy' was promoted by the Society for Bettering the Condition of the Poor, this was denounced by Malthus, as it might lead to a general diminution in means of subsistence, and feared by others because of its association with Jacobinism.15

Much of what follows therefore is concerned with attempting to identify different moments of urbanization variously associated with different patterns of social organization, in circumstances where physical property was continually being put to new uses. Running through this long period of adaptation is an intriguing idealistic continuity between the principal landlord's fragmentation of holdings in the last two decades of the eighteenth century and the celebrated advocacy of cottage farms by his successor, Lord John Tollemache. This paper does not prioritize the views of the Tollemache estate, but inclines towards a market perspective, imagining a marketized cottage economy. Tollemache interests shaped the supply of land, but the pattern of demand was driven by the same forces that led to fragmentation under proto-industrialization. No common remained in the township, and the pressure towards proletarianization is seen as the squeezing out, through the market, of particular households' claim on land. Investigation of these possibilities proceeds by attempting to infer physical change and change in the organization of holdings by chaining Land Tax records, and by attempting to impute the function of property by gathering information about occupiers through nominal record linkage to a range of further sources. The following sections first introduce the Land Tax returns and the idea of chaining them, set out the relation to previous studies, and explain the centrality of the value of the sum assessed in constructing chains and the need to link it to the physical. Subsequent sections seek to identify the influence of, first, national legislative and, second, local administrative practice on the values assessed, so as to filter out extraneous influences not attributable to physical change or change in occupancy.

land tax and land tax chains: introduction

To readers familiar with the Land Tax returns, the foregoing may seem quite unreasonably ambitious. Any attempt of this sort requires a detailed understanding of the Land Tax assessments for the period, described in some detail by Ginter.16 Land Tax was introduced in Great Britain in 1692, initially being levied not only on the annual rental value of real property, but on assessments of (personal) sources of income other than land and buildings. From 1745 the returns were used to establish entitlement to vote in county elections, and as Clerks of the Peace for counties were between 1780 and 1832 required to keep copies for electoral purposes, they survive in large numbers for that period in County Record Offices. The information within the returns is minimal (see Figure 1). Adopting Ginter's terminology, these 'duplicates' for any year and township comprise a series of 'line entries' providing the name of the proprietor, the occupier, the sum for which they were liable, and very often a description of the property 'bundle'.

Figure 1. Land Tax Duplicates (Extracts); Mottram-in-Longdendale Township; a) 1785; b) 1797. The 1785 duplicate shows least detail, providing occupier and sum assessed only; that of 1797 provides fuller property descriptions than all other years. Source: Land Tax Returns, Mottram-in-Longdendale, QDV/2/299, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester.

The returns do not appear to have been used to explore urbanization, let alone competing forms. Indeed, Turner and Mills' collection of studies based on the Land Tax maintained a clear distinction between urban applications and rural applications.17 The crux of the present work involves matching the line entries longitudinally into chains, gathering together the scanty information about particular holdings to reveal their successive occupiers, to identify new development and to track the reconfiguration of individual holdings over time. Although Land Tax returns have often been used in local studies to point to changes in occupation of particular properties of interest,18 they have rarely been used systematically to enrich information about enduring entities. There are exceptions. Hunt, for example, attempted to track holdings over time to identify tithes (where this was not stated);19 Henstock linked Land Tax line entries over time more systematically to examine 'house repopulation' in Ashbourne, a Derbyshire market town.20 There do not, however, appear to have been studies which attempt to reconstruct the changing pattern of physical development and occupation of land and property by tackling the far more difficult task of examining the amalgamation or subdivision of particular bundles, and the systematic identification of new property. This study attempts that task, through a computationally realized extension of Henstock's approach. While Henstock's study was designed to examine the succession of occupiers of a fixed set of property objects, deliberately excluding the rural area and abstracting from land parcellation, the present work allows for far more complex patterns of succession.

One way of visualizing the central task is to imagine the individual line entries as a set of vertices, and then consider the problem of specifying a set of edges - that is, linkages between line entries for successive years - so as to construct a directed graph showing the history of the various property objects within the township. Under the idealized Fixed Property Objects assumption, each line entry would refer to one of a fixed number of unchanging properties. Each separate property could be represented by a disjoint subgraph, a simple 'chain', with only the occupiers changing (suiting Henstock's prime purpose). With physical development and reorganization of agricultural holdings, however, the township 'Land Tax graph' and the constituent subgraphs for different holdings take the form of 'trees'. When tracking individual bundles, any tendency for yeoman holdings to give way to large scale capitalist farms21 would imply tree structures with fewer disjoint graphs, different chains joining together over time as holdings were combined. Conversely, when tracking individual properties, if there was a tendency for holdings to fragment (in a manner frequently associated with proto-industrialization), the number of disjoint subgraphs would be maintained, though more would take the form of trees.
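The graph can be given a very direct computational reading. The following minimal Prolog sketch is illustrative only - entry/3, succeeds/2 and chain/2 are hypothetical simplifications introduced here, not the predicates of the project itself - but it shows how line entries become vertices and how a chain is simply a forward path through value-preserving edges:

% Each line entry is a vertex: entry(Id, Year, SumAssessed).
entry(13, 1784, 0.7875).          % illustrative values only
entry(57, 1785, 0.7875).
% An edge is posited where sums assessed in successive years agree
% within a small tolerance (the Fixed Property Object case).
succeeds(I, J) :-
    entry(I, Y, Ti),
    Y1 is Y + 1,
    entry(J, Y1, Tj),
    abs(Ti - Tj) =< 0.004.
% A chain is the forward path from a given entry; the cut simply takes
% the first candidate successor, which a real system must not do.
chain(I, [I|Rest]) :- succeeds(I, J), !, chain(J, Rest).
chain(I, [I]).

A query such as ?- chain(13, C). would then enumerate one candidate history for bundle 13; the substance of the method lies in choosing among competing edges, to which the discussion returns below.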
This paper sketches out a method for reconstructing the entire Land Tax graph for the township as a set of chains, each chain corresponding to a series of line entries. When properties are combined, chains join (or, more strictly, one is absorbed into the other). When a property is divided, loosely speaking a chain splits; strictly, a new additional chain begins. Not surprisingly, when reconstructed, the actual graph for the township proves to be a hybrid, though the tendency to fragmentation dominates (as will become evident in Figure 3a). Identification of the succession of line entries forming any particular chain rests principally on the limited information which they themselves contain, and it is important that the character of this information is understood. It is the identification of enduring property objects which is crucial, and although the bundle descriptions might seem the most obvious indicators of continuity, returns for many years include no such description. Where they are present, most descriptions take relatively uninformative generic forms such as 'house and land' or 'cottage and croft.' Moreover, in a given year the same property name (eg 'Hague Farm') may occur in several line entries. Hence continuity must also be sought in the names of proprietors and occupiers and in the sum assessed.

The approach to forming chains taken here rests crucially (but not solely) on consideration of the sum assessed. Under idealized conditions, unchanging property bundles should be expected to have unchanging tax liability, whatever changes of ownership or tenancy might occur. Similarly, one might expect that where two bundles had been amalgamated, the corresponding line entries would be replaced the following year by a single one with the sums assessed combined. Where a bundle had been divided, it would likewise seem reasonable to anticipate that in the following year new line entries would show apportioned liabilities. The approach developed centres on the articulation and testing of rules expressing such continuities. These idealized conditions include the maintenance of a stable legal and administrative system, fixity of valuations and poundages, and fairness and consistency of local practice. The fundamental assumption (implicit in Henstock's study) that liability can be added and divided as suggested rests on a principle embedded in English law and custom from the time of commutation of feudal services into money values. The principle is set out as a dialogue in an early nineteenth century commentary:

'Q: What if the tenant since that statute enfeoff a stranger of part of the land? A: Then the stranger shall hold of the lord per particular [sic] morum, viz. the rent shall be apportioned; as if there be twenty acres of land, and twenty shillings rent, the purchaser shall hold by three shillings rent, for three acres: but if there be an entire service that cannot be apportioned, as a horse, a hawk, the lord shall have the whole'.22

In this particular locality, evidence of such apportionment is found at least from the 1360s.23 The next sections consider firstly stability and change in the Land Tax regime over the period in question, and secondly the nature of valuation and administrative practice in the particular township.
Together they form a basis for identifying potential discontinuities and for constructing modified and augmented line entries, compensating where possible for administrative changes and hence exposing substantive changes in value.

influences on individual assessments: the land tax regime

Critical aspects of the statutory provisions and their implications for the present work are summarized in Table 1. In principle at least, the greatest difficulty in interpreting any individual Land Tax assessment lies in understanding its place within a system in which individual townships were required to return a fixed sum in accordance with a hierarchy of quotas, irrespective of physical change. County quotas were set in statute (annually before 1798), while Commissioners at county level were statutorily required to set quotas for Hundreds or Divisions in proportion to assessments of 1692, and to set township quotas without statutory instruction. These quotas are usually regarded as having been fixed in practice from 1698.24

Table 1. Potential sources of change in reported land tax liability.
- Changed Valuation. Issue: prerogative of local assessors.25 Significance and treatment: local revaluation in 1822; specific adjustments applied (see text).
- Changed Poundage. Issue: 20% statutory maximum; otherwise prerogative of local assessors.26 Significance and treatment: imputed from returns; standardised values calculated (see text).
- Change in Asset Classes Recorded. Issue: land, buildings, tithes and official salaries identifiable in the township returns; no effect on quota locally.27 Significance and treatment: official salaries and tithes excluded from analyses.
- Treatment of Property worth Less than £1. Issue: statutory provisions refer to the wealth of the individual, not the value of the parcel; lower assessments are recorded locally.28 Significance and treatment: property included in analyses regardless of value; inconsistencies investigated (see text).
- Redemption of Liability and Exoneration. Issue: individuals buying out their liability were exonerated from further payment and the property was not subject to reassessment; those exonerated are listed in the township returns.29 Significance and treatment: exoneration of the Sidebottom brothers means the development of the Broadbottom colony cannot be tracked.
- Redemption of Liability by Third Party. Issue: in principle, property on which liability was redeemed but where owners or occupiers were not exonerated remained listed and subject to reassessment.30 Significance and treatment: no known instances in the township.
- Provision for Redemption by Ecclesiastical and other Bodies. Issue: provisions made under various statutes for ecclesiastical and other bodies to sell property in order to redeem Land Tax liability.31 Significance and treatment: the Church's liability was locally redeemed from 1818, and further change is not traceable; possible land sale.
- Reduced Liability following Appeal. Issue: clear provisions for appeal against assessment throughout, but no surviving local appeal documentation.32 Significance and treatment: some falls between 1822 and 1823 might result from appeal after the 1822 revaluation.
- Double Taxation of Roman Catholics. Issue: Roman Catholics were in principle liable to double taxation, though this may not have occurred in practice; there were no known Catholic households locally.33 Significance and treatment: ignored.
- Revaluation of individual properties to reflect in situ change. Issue: occurred in principle, but doubted in practice by contemporary commentators and later analysts.34 Significance and treatment: a very large number of upward in-situ revaluations is evident (see text).
- Complete omission of influential owners or occupiers. Issue: occurred in principle, but doubted in practice by contemporary commentators and later analysts.35

Generally, reconciling physical growth with fixed quotas represented a significant challenge. The geographically inequitable nature of quotas which took no account of population shifts since the 1690s was much discussed.36 Although this poses major problems for comparing assessments between townships, for present purposes it is more important to understand how, or if, equitable treatment of those with interests in new and existing property might be achieved within a township. Writing in 1798, Lord Fitzwilliam believed:

'It occurs most frequently that a land tax rate levies a sum considerably beyond the sum payable to Government as the land tax of the district. This has arisen from various causes, but principally from new property arising within the district, as for instance a House is built. The House immediately becomes liable to bear its proportion in the Landtax of the district. The Assessors rate it regulating the sum, we suppose, by the known Standard of some antient house of equal size. To keep the levy down to the precise demand of Government upon the district every article of taxed property within the district ought to be relieved in its just proportion on such an occasion, but this has not been the practice.'37

Other commentators, by contrast, were quick to suggest that new property avoided the tax and that newly developing areas contributed little.38 In principle, local revaluations and adjustment of local poundages might have been used to bring the township quota and assessments of individual properties into alignment. Specific local adjustments evident in the Mottram returns are examined in the next section.

Beyond the general difficulties implied by fixed quotas, account must be taken of discontinuities arising from arrangements introduced from 1798 allowing the redemption or purchase of Land Tax liability in order to ease the debt crisis arising from engagement in the Napoleonic wars. At this time the Land Tax, formerly agreed annually, became perpetual, the quotas became statutory, and a series of further measures was introduced to encourage redemption of debt in return for lump sum payments. The main consequence for the present investigation is that incremental development of particular sites in the township was obscured where land tax liability had been redeemed. Apart from the Church (after 1818), only two landholders in the township bought exoneration: John Bostock and the Sidebottom brothers. From 1804 they redeemed their liability respecting holdings at the southern limit of the township, precluding the use of the returns to track development year-on-year within the Sidebottoms' cotton works and their adjoining Broadbottom colony. When the Sidebottoms later secured further land, they again redeemed their Land Tax liability, and so subsequent incremental development was again obscured.
The valuation(s) underlying the new Land Tax assessments of 1822 do not survive, but their principles are presumed similar to those underlying the surviving rating valuations of 1818, to which they are closely related statistically. There is, however, a sharp contrast between the rough and ready valuations of NAV0 and the number of gradations in value found from 1822 (referred to here as NAV1)42 . Perhaps it is no coincidence that this shift occurred the year after the death Wilbraham Tollemache, Earl of Dysart, the principal landowner since 1770. Certainly, this discontinuity was limited to the township, not affecting the neighbouring townships or Stockport Division more generally. Because of the changed valuation principles applied after 1821, a different approach must be taken to standardization. To extend the chains beyond 1821 in a consistent manner, a specific assessment conversion factor is used for every 1821–1822 transition. These factors are also used to produce estimates of NAV0 for each bundle from 1822 onwards, by applying them to the later Land Tax assessments. In the few cases where new property was built after 1822, the value of NAV0 is set at 95% of the NAV1 value43 overall change in aggregate assessment 1784–1830 On the basis of the foregoing, a modified version of the line entries was produced including standardized assessments and NAV0 estimates. Aggregations of these provide an initial picture of the overall trajectory of development (see Figure 2). Series A represents the constant quota. The actual sum of the individual assessments represented by Series B (unadjusted and including liability in respect of tithes and salaries) in fact diverged from the quota even where this was not reflected in the reported totals. Ginter treats such returns as ‘defective’ and warns against their use.44 Nevertheless, it is clear that these divergences were transparent and approved by those Commissioners serving the Stockport Division who allowed the assessments. Subsequent analyses of the chains, in fact confirms the internal integrity of the aggregations. It is suggested that the latitude displayed should be seen as part of the actual approach to accommodating the tension between fixed quotas and local equity in circumstances of growth. 137 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Reconstructing Urbanization Figure 2. Land Tax Aggregates; Mottram-in-Longdendale Township; 1784–1829. Series C shows the total sums actually assessed in respect of land and buildings alone, highlighting the effect of the shift to a new valuation in 1822. Excluding tithes and official salaries seems desirable in principle, as explained above. In this specific instance their exclusion seems straightforward. They had not been commuted into land, were owned by the Bishop of Chester, and were leased to absentees.45 From time to time, excisemen were resident in the township, and in principle there is a possibility that as their contribution to meeting the quota rose and fell, the contributions of other taxpayers might alter correspondingly. It is clear, however, from Figure 2 that no such adjustments were made.46 Series D adjusts C, removing the effect of local variations in poundage, all occurring before 1798. Series D summarizes the core facts represented by the adjusted line entries used to generate the chains. The final series shown, NAV0, tracks the imputed notional value of property on the basis of the old valuation. 
These initial analyses clearly demonstrate that at least some new physical development was recorded year on year, and reveal a continual rise in aggregate valuational rent, contrary to initial expectations given fixed quotas. Changes in local poundage aside, two of the possible forms of local revaluation discussed 138 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Peter Bibby by Ginter are found47 ; the one-off revaluation of 1822, and the continual reassessment on which the following analyses depend. nature of the physical objects corresponding to the line entries Having attempted to ensure as far as possible that any change in the standardized values reflects either a physical change or a change in occupancy, the next step is to attempt to understand the likely physical character of bundles with a particular value in order firstly to link the Land Tax records to other information, and secondly to gain an appreciation of the character of unlinked bundles. General principles distilled from Ginter’s analysis form the starting point:48 i) bundles cannot be assumed to be either functional ‘wholes’ (such as farm units), or geographically contiguous parcels, ii). specific buildings cannot be assumed to be individually represented; but may instead be ‘clumped’ and represented in line entries along with other buildings (whether contiguous or scattered), and iii). there may be an untaxed residuum and hence many buildings may not be included (either individually or within a composite line entry). On the initial assumption that a bundle will usually correspond to a ‘holding’ defined by a specific lease or deed, information about its physical character - in the case of property owned by the principal landlord- might be found within Tollemache estate documentation. Nearly all holdings on that estate fell into one of three types; property let on fourteen-year leases, property let on annual ‘cottage tenancies’, and property leased for 99 years determined by three lives. Very little documentation survives for the annual cottage tenancies though it appears that they typically included more than one dwelling and encompassed small parcels of land, the tenants serving as gatekeepers, subletting property and controlling access to clusters of dwellings.49 The legal power to grant 99- year leases was only secured by the principal landlord in 1786 by a private parliamentary Bill, which proved a pivotal moment in the physical development of the township.50 In the case of agricultural land leased from the Tollemache estate, the relation between the physical character of a bundle and its assessment is readily understood. Property held on 14-year lease included parcels of agricultural land which themselves might or might not be contiguous and which might include disjoint cottage property. These leases ran concurrently, the period examined being covered by five allocations or ‘tacks’ made in 1771, 1785, 1799, 1813, and 1827 with associated surveys being undertaken in the preceeding year. For years when a survey took place, the Land Tax liability of a bundle may be compared with the area and rent of the corresponding holding. Restricting attention to 1799 and cases where a one-to-one match between a holding and a line-entry can 139 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Reconstructing Urbanization be identified, a strong correlation between assessment and rent is found (0.76), but a much stronger one with acreage (0.986). 
A rate of tax per hectare for the township may be estimated by regression using area measures for holdings on the Tollemache estate in 1799: T = 0 + 1 A + where T represents the Land Tax Assessment, A is the area of undeveloped parcels let, 0 and 1 are parameters to be estimated, and is an error term Lt ax = 0.181 + 0.065*Hectares Statistically, variation in acreage accounts for 97.3% of the variability of the Land Tax assessment (or equivalently of NAV0). With tax payable estimated at 6.5p per hectare as above, the notional annual value (NAV0) of agricultural land in the township would be $1.15 per hectare (or 46p per acre). This relationship is used to guide the matching of Land Tax and estate documentation more generally and to make rough estimates of the acreage of holdings outside the Tollemache estate for the period up to 182151 . The intercept in the above expression (18.1 pence) is interpreted here as the Land Tax typically payable on the built property within a holding leased for fourteen years- equivalent to an annual value of £3.22 (NAV0), representing say 4.4 bays of building.52 Although the value of buildings has been largely ignored in estimating area equivalent Land Tax assessments, it should not be discounted. Gregory King’s estimates53 imply that in 1692 the assessed value of land and buildings were in the ratio 13:3. In the Pennine fringe, where holdings were typically very small, this lack of attention seems difficult to justify. Only limited inferences can be made about the nature of built property, especially property with £2.00 NAV0 (the usual minimum in the township). This is because very few holdings leased for terms of 14 years had values as low as this, and no descriptions of annual cottage property survive. The area/tax relationship discussed above suggests that one form might be a one-bay cottage with three acres of land. Some other possibilities appear. Descriptions of Phoebe Stead’s 14-year holding grandiosely styled Taylors Hospital stands as an example- a house, a shop, a cottage and a wash house (with a NAV0 of £2) beside the turnpike road at the Lane End tollhouse, makes no explicit reference to a croft or any garden ground. It is clear that some property went untaxed. The potential scale and nature of this untaxed residuum might be crudely gauged by comparing receipts for cottage rentals for the Tollemache estate in 1785 with Land Tax entries for the same year. Assuming that any bundle represents a holding, and that the ‘tenant’ and the ‘occupier’ should always be identical, any cottage tenancy without a corresponding line entry might be considered to have gone untaxed. Of the 36 140 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Peter Bibby Tollemache cottage tenancies of 1785, 30 can be found immediately on the Land Tax returns. Some of the mismatch should be expected to be attributable to divergence between recorded occupiers and tenants, and the accuracy with which the residuum can be measured depends on the approach to matching. Tollemache cottage tenancies without a corresponding a line entry all have a (market) rent of £2 per annum or less, and three of these six have a rent of under £1. As all the property on 14-year lease can be matched, and most of the cottage tenancies, for 1785 that portion of the rental income for the estate attributable to property identifiable in the Land Tax returns accounts for 99.4% of the total. 
The untaxed residuum would therefore appear of no significance in terms of aggregate rental value, although it may be of more significance in terms of tracking development. The untaxed residuum might result from a particular interpretation of statute, from deliberate local policy, from oversight or from the simple play of power. These possibilities have slightly different implications for the attempt to construct Land Tax chains. Any principle that bundles with an annual value less than £1 were exempt from Land Tax either from 1798 or throughout- supposedly grounded in statute- is disputed,54 and the practice in Mottram township was evidently to tax such parcels in some circumstances both before and after 1798. Over the entire period, 111 entries are found with values of NAV0 less than £1, the smallest value being 4s (£0.20) for ‘part of Brick Croft’ in 1796. Even assuming that market rent rather than valuational rent were the appropriate measure and that this might be four times higher, the £1 threshold would still not be exceeded in that case. A literal interpretation of successive statutes would suggest that the value test should be applied to the entire property of the person assessed, rather than the specific bundle. On this reading, the undeveloped houseplot at Brick Croft was liable because of the value of the occupier’s entire holding (which amounted to £8 NAV0 within the township).55 Subsequent sections take this further by exploring circumstances where chains appear to break down as existing property ceases to be or starts to be taxed. constructing chains: overview Assembly of the chains, and establishment of the links between them to construct the entire Land Tax graph is achieved by applying a series of a ‘rules’ to ‘facts’ drawn primarily from the line entries. The facts and rules together might be thought of as a knowledge base, coded in the logic programming language Prolog56 which serves as an ‘inferencing engine’. It might be thought of as a computational theorem prover which can be made to draw out the implications of knowledge of very different forms (including topological, geometric and grammatical relations) provided that knowledge can be expressed either as facts or rules. 141 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Reconstructing Urbanization The ‘facts’ derived from the line entries with some preprocessing take the form landtax(Case,Year,Proprietor,Occupier,Bundle,Tax), for example landtax(13, 1784, [wilbraham, tollemache], [john, shaw], [], 0.787501). landtax(1120, 1799, [wilbraham, tollemache], [widow, stead], [cottage],0.15). landtax(1121, 1799, [john, bostock], [john, bostock], [broadbottom], 1.6875). Spellings of personal names are standardized at the outset. Where the line entry records the occupier as ‘tenants’ or similar, this is replaced by the proprietor. Tax is expressed in pounds, the standardized measure being used for the years 1784–1821. As shown above, Proprietor, Occupier and Bundle description are represented as Prolog lists, allowing various required natural language processing tasks using definite clause grammars.57 An empty list, [], indicates that the line entry has no property description. Where possible, property descriptions are added to line entries originally lacking them by recursively copying descriptions from the previous (or following) year, provided that the specific combination of occupier name and (standardized) tax matches uniquely. 
Facts based on the line entries are supplemented by further Prolog facts based on a body of other material (summarized in Table 1) which might are used both to locate the bundles to which particular line entries refer, and to guide the construction of chains. This encodes some estate documentation, facts recording familial relationships derived from parish registers, enumerators’ books from the 1841 census and the tithe apportionment survey of 1846. Other historic sources, such as wills, have been used to corroborate linkages, confirming reconstructed events, but are not stored as Prolog facts. Most of the effort in the project lies in the specification and re-specification of rules. Taken together the rules seek to identify the most likely successor(s) to any line entry. In terms of the graph metaphor, this involves identifying the ‘edges’ most likely to link line-entries (vertices). A bundle in year t might be succeeded by one or more bundles in year t+1 if their aggregate values were equal (subject to some tolerance). From the various sets of linking arcs that meet this minimal condition, further rules are designed to identify the most likely links by scoring potential arcs principally in terms of continuity - a composite based on continuity of occupier, of proprietor and of bundle continuity. Each time the procedure is run, (that is the rules are applied to the facts), links are made and chains are extended computationally if the scores merit. Where two or more candidate links score equally as potential ways of extending a chain, or where no candidates score sufficiently highly, no link is made, but documentary evidence is reconsidered or more sought. As possibilities are resolved, linkages between line entries in these uncertain cases are recorded as specific facts and assigned superior scores. Incomplete matches (ie those which do not maintain value in full) can also be recorded by the analyst as specific facts, ‘pseudobundles’ being A ugust 7, 2014 T im e: 02:36pm ijhac.2014.0127.tex P eter B ib b y 1 4 2 Table 1. Detailed sources of supplementary information. Topic Scope Dates Content Source Application Valuation, Surveys, Tollemache Estate 1771,1799, Names, areas, Cheshire CRO Locating Land Particulars of estates 1813,1826 rents, tenants of Tax bundles land parcels Leases (99 years) Tollemache Estate 1786 onwards date, lessee, Cheshire CRO Identifying expected property description new construction Register of Leases Tollemache Estate 1814,1837 date, lessee Cheshire CRO Identifying assignees Cottage Rentals Tollemache Estate 1785 Cheshire CRO Tithe Apportionment Township 1846 Cheshire CRO Locating Land Tax bundles Census Enumerators’ Books Township 1841 Locating Land Tax bundles Household Heads Township 1700–1820 dates of marriage, burial, Parish Registers Assessing Continuity business partnerships Wives Township 1700–1820 Parish Registers Assessing Continuity Children Township 1700–1820 dates of baptism, Parish Registers Assessing Continuity link to HOH Tollemache Estate Map Tollemache Estate 1771- 1826 Reconstruction Locating Land Tax bundles Sale plan Part of Tollemache estate 1841 Cheshire CRO Locating Land Tax bundles Highway Rate Book Township 1818 Tameside Archives and Local Studies Toponomy Township throughout subareas All the above Restricting chain formation 143 created to account for discrepancies. By repeated application of the procedures problems reduce and chains are defined. 
The rule-based inferencing deployed has parallels with the approach of expert- systems58 , but crucially the rules used must rely largely on consideration of statute, contemporary texts and modern scholarship rather than ‘expertise’.59 An assessor’s awareness of local practice and his understanding of matters taken for granted in everyday life are all missing. Posited rules rely instead on abduction60 - ie on positing a hypothetical relation (concerning admissable arithmetic mismatch, for example) and then testing it by applying it to the facts. To develop rules in this manner is to explore what must be true for the particular outcomes to be possible and this extends from admissable arithmetic mismatch towards the more general tacit knowledge at the core of social relations. If posited rules admit too many possibilities, they are of little immediate value as they suggest too many plausible chains. If they admit too few possibilities, chains will not form at all. Progress depends on repeatedly respecifying rules, which serves not only to construct the chains, but also to reconstruct some of this tacit knowledge to a limited degree as discussed below. Not only therefore is there a symmetry between the specification of rules and the resulting outcomes, but the rules provide pointers to how language and legal provisions must have been interpreted. At any particular stage in the analysis, there may be competing ways of extending a chain, and this opens up new approaches to making sense of undated endorsements, crossings out and annotations in estate documentation for example. Potential paths may be supported by and illuminate such minutiae. The approach, however, is very unforgiving. Chains break down where posited rules cannot be satisfied. This might result from failure to identify consistency in local practice correctly, from the inconsistent practice of assessors, or simply from error in data preparation. In more familiar quantitative analyses of Land Tax concerned with aggregates, to overlook a single line entry, to duplicate one or to mistype a value, though undesirable, is of relatively little consequence. In attempting to chain individual records, the emphasis is largely on the difference between line entries and such errors are crucial. The overall approach demands the presumption of order is absolutely maintained until it is no longer possible. following the value: introduction The first group of rules express principles for defining summations of individual Land Tax assessments for a given year to compare with a specific assessment the following year. Each line entry is assigned a unique identifier, and considered to denote a property bundle with the same identifier. In terms of the graph, this corresponds to a specific vertex (node). 
On the basis of very restrictive assumptions about how property may be broken up, an initial Identity [1] is 144 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Peter Bibby posited in which Lj, the Land Tax liability in respect of bundle j, can be related to liability with respect to bundles present in the previous year by the expression: Lj = Li + Lk − Ll + Cj −Dj +Rj +Aj + Zj (1) where k栗m l栗s m is a set of bundles merged with i, s is a set of bundles split from i, Cj represents liability in respect of new development observed in bundle j Dj represents a reduction in liability corresponding to physical change (devalorisation of capital) in bundle j, or the outcome of an appeal in respect of property forming (part of) that bundle, Rj is a revaluation adjustment that takes a specific parcel for all bundles dated 1822 and is 0 otherwise, Aj represents an adjustment for rounding errors and other very small changes in liability, and Zj represents an adjustment for all other attributes of bundle j, its proprietor and its occupier which affect change in liability from one year to the next. The following sections elaborate the principle underlying [1], and extend it, first relaxing the assumptions about property subdivision and second accommodating matters of administrative practice which emerge. Identity [1] considerably extends the logic implicit in Henstock’s study of Ashbourne61 , which presumes that almost invariably an (important) special case of [1] will hold, in which there will be no material change in physical character from year to year. In this Fixed Property Object case, a single line entry j for a particular year would be found in place of entry i the previous year and (without wholesale revaluation), Li and Lj would be identical and all the other terms on the right hand side of [1] would be 0. Even in the Ashbourne study, however, it was necessary to recognize ‘occasional subdivision of properties’ and one case of amalgamation, and hence to identify bundles corresponding to sets m and s in [1], and in these cases, the principle that liability could simply be summed and divided (‘and resolved by simple arithmetic’) was implicitly accepted.62 It should be appreciated that in Equation [1], the distinction between bundle i (the predecessor) and bundles in the set m is one of convention. Identity [1] moves beyond the Fixed Property Object case by considering change in the building stock. In the case of construction of a new cottage all terms on the right hand side other than Cj will be 0. By Dj, the possibility of devalorization, or of successful appeal is admitted, but without any expectation that these effects would be substantial. The tolerance, Aj, avoids including changes which might be considered de minimis. Initially set at ±£0.0083 (2d), it was later reduced to ±£0.004, ‘filtering out’ change with a notional annual value (NAV0) of less than 1/6d63 145 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Reconstructing Urbanization Identity [1] allows that bundles may be combined or split, but only in very restricted ways. Under [1] a bundle found in any particular year must either comprise one or more bundles recorded in the previous year, or be part of a single bundle from the previous year. It is, however, quite possible that a bundle comprises parcels which were never explicitly represented as bundles. 
It is far from adequate, however, as it does not admit the possibility that a parcel might cease to be part of one bundle and become part of another.64 A totally general solution would be to treat any Land Tax bundle as a mereological sum of atoms of real property at an instant in time65 . Identity [1] would be re-written without a specific ‘predecessor’ Li and replacing sets m and m by sets of infinitessimal property elements. Within mereological calculus any objects may have a sum, though following Quine those which are not useful are discounted.66 Implementation would obviously be impractical and moreover the formulation would suggest a world that were infinitely and immediately plastic. A less comprehensive approach might define potentially useful sums by recognizing that property transfers may be hidden wherever (subject to some tolerance) some set of bundles found in one year carries the same aggregate Land Tax liability as another set of bundles the following year. This would imply a large but finite set of sums, rather than an infinite set of combinations of atoms of real property. In the work reported, a more modest extension of [1] has been applied. ‘Useful sums’ have been defined only in three very restrictive sets of circumstances: • when there is a possibility that property objects would (from an endurantist perspective) be treated as changing in value (eg where a taxpayer name or bundle name remains constant) • when the specific value of a bundle suggests that an apportionment has occurred (ie falls outside the set values usually encountered), and • when the value of a particular bundle cannot be expressed as the sum of the values of a series of bundles in the year previous or following. Relaxing the highly restrictive assumptions of Identity [1], Lj , the Land Tax liability in respect of bundle j might be related to liability with respect to bundles present in the previous year by summation of liability for ‘property elements’ or simply ‘elements’ for short. An element may be either a bundle as [1] or part of a bundle recognized as a ‘useful sum’. On this basis, a revised identity is defined: where Lj = k栗m Vk − l栗s Vl + Cj −Dj +Rj +Aj (2) and Vk = Ln pnk 146 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Peter Bibby m is a set of elements merged with i, s is a set of elements split from i, Vk is the liability assigned to element k, Ln is the standardised liability carried by bundle n, pnk is the proportion of the standardized liability for bundle n assigned to element k inter-temporal adjustments In applying [1] and [2] it became clear that modification was necessary to capture intertemporal adjustments made by local assessors, which frustrate the formation of chains and shift the interpretation of individual line entries. At least three such types of adjustment are found. The first accounts for vacant property. A second form of adjustment, entirely unanticipated within the literature, is found to occur in some circumstances after the death of an occupier, and is assumed to allow for an executor to settle an individual’s affairs. In these cases an occupier’s name may disappear from the Land Tax return, but one or more lagged assessments may subsequently be recorded after a gap (in the name of the deceased and at the former level). Hence following ten deaths in 1800 for example, new occupiers for the respective bundles are recorded in both 1801 and 1802, before a final lagged assessment for the deceased occupier is recorded in 1803. 
Third, it appears that further lagged assessments were recorded, consequent on the second group. In these cases the liability of those entering on property vacated on the death of the previous occupier was set at the level appropriate to the bundle that they themselves had previously occupied. (Besides these three sets of adjustments are very small year-on-year changes where occupancy appears continuous, which are filtered out in Identities [1] and [2] by the Aj tolerance.)

likelihood scores

In principle (though not procedurally), the computational exercise is concerned to identify, for each particular line entry, all summations which might satisfy [2]. The attempt to reconstruct change involves choosing between them, which demands further rules, and perhaps suggests a probabilistic approach. Although such an approach was not finally preferred, consideration of probability forms a useful stepping stone to explaining the procedures adopted. Restricting attention to the Fixed Property Object case, and without any further information, the probability that line entry j (dated y + 1) with liability Lj would succeed line entry i (dated y) might be considered to depend on n, the number of line entries dated y + 1 with a liability equal to Lj. The probability pij that j would succeed i might be estimated as 0 if Lj ≠ Li, or 1/n otherwise. This might be thought of as a uniform prior probability of succession. Given the crude nature of the valuations, there are many cottages assessed at £2, while far fewer smallholdings share a specific value. The prior probabilities of succession under these assumptions are thus far higher in the latter case. This principle is easily extended to consider not only the frequency with which a particular value is recorded, but the number of bundles with that value owned by a particular landowner.

Of course, the line entries provide substantive evidence relevant to assessment of the probability of a particular linkage. In the spirit of Henstock, it is assumed that the similarity of line entries year-on-year should influence the degree of belief that occupation continued. A Bayesian approach to assessing the probability of a particular linkage developed on this basis might consider not only the prior probability of succession, but estimate two further probabilities. The first would be that of finding the observed degree of similarity between line entries if they really did represent the same property. Technically, this is the likelihood that the succession occurred. The second would be the probability of finding that degree of similarity otherwise. On the basis of these three values, the probability of the particular transition might be estimated.67 Although Henstock judged the similarity of line entries year on year (implicitly allied to the likelihood of the transition), he did not consider the three probabilities. Estimation of the likelihood of specific transitions was attempted in the present study, but this proved impractical.68 Moreover, as the work progressed it appeared that rather than assigning a probability to each potential succession, it might be possible and preferable to identify a single most likely solution. Indeterminacy, rather than being commuted into probability, has driven the search for additional evidence. The approach taken does not estimate likelihood as such, but assigns a likelihood score to each potential succession based primarily on similarity.
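For completeness, the Bayesian estimate considered (and set aside) above can be written out explicitly. Writing p for the prior probability of succession and S for the observed degree of similarity, the three values identified combine in the usual way; this formula is a reconstruction from the verbal description, not one printed in the source:

P(\mathrm{succession} \mid S) = \frac{P(S \mid \mathrm{succession})\,p}{P(S \mid \mathrm{succession})\,p + P(S \mid \lnot\mathrm{succession})\,(1 - p)}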
The likelihood score for a particular summation rests on four groups of considerations: similarity, structural priority, the broader evidence of related lagged summations, and the ordering of the line entries within the return. Each of these considerations is outlined below. The values taken by the scores are illustrated in Table 2, and examples of scores assigned to particular potential transitions are provided in Tables 3 and 4.69 An overall succession score is calculated for any summation by combining the likelihood score with the prior probability of the transition (which varies with the prevalence not only of the sum assessed, but of the other details - the proprietor being particularly significant in practice). The goal is to find the best overall succession score for each bundle. It should be understood, however, that identification of the 'best' summation of elements in year t corresponding to any particular line entry i in year t-1 does not depend solely on the overall succession scores for line entry i. It also depends on the scores associated with all other summations, such as that for line entry k in year t-1, which might 'compete' for the same elements in year t. Potential changes involving the same bundle or element are mutually exclusive; if a given bundle or sub-bundle forms part of one summation, it cannot participate in another.

Table 2. Similarity component scores.

                                                Sequence   Related   Simplicity   Proprietor   Occupier     Bundle       Continuity
                                                           Parts                  Similarity   Similarity   Similarity   Score
Possible Values
  Minimum                                          0          0         -inf          0            0            0           inf
  Maximum                                          1          1          2            2            2            2            0
  Forced                                           3          3          3            3            3            3           -3
Calculated Values for All Tested Summations
  Average                                        0.22       0.09       -0.06        1.70         0.10         0.86         8.18
  Minimum                                          0          0         -1            0            0            0            0
  Maximum                                          1          1          2            2            3            3           16
Calculated Values for Best Tested Summations
  Average                                        0.46       0.10        1.90        1.75         0.90         1.26         2.37
  Minimum                                          0          0         -1            0            0            0            0
  Maximum                                          1          1          2            2            3            3           11
Values for Best Tested Summations (including forced)
  Average                                        1.82       1.65        2.49        2.42         2.02         2.19        -0.50
  Minimum                                          0          0         -1            0            0            0           -3
  Maximum                                          3          3          3            3            3            3           11

Notes: Scores for individual components increase with similarity. Values under 'All Tested Summations' refer to the scores for the relevant components of similarity between any given line entry and all its potentially matchable line entry summations (ie those for which Land Tax liabilities are equal subject to a tolerance, and which respect all other constraints). Values under 'Best Tested Summations' refer to the scores for the relevant components of similarity between any given line entry and the potentially matchable line entry summation(s) with the best (ie lowest) continuity score (combining similarity and structural priority). Values under 'Forced' summations refer to the scores assigned to the relevant components of similarity between any given line entry and that identified by the analyst as the preferred line entry summation (to which a continuity score of -3 is assigned).

Table 3. Similarity scores; examples. (Each row gives the Type of transition; the component scores for Structure, Sequence, Related Parts, Simplicity, Proprietor, Occupier and Bundle, with the overall Score; the Line Entry; and the Potentially Matching Summation.)
split (0 1 1 1 2 2 1 0): [[4825, 1795, [wilbraham, tollemache], [james, stead], [cottage, and, croft], 0.253125]] → [[904, 1796, [wilbraham, tollemache], [james, stead], [house, and, land], 0.125625], [926, 1796, [wilbraham, tollemache], [james, stead], [house, and, land], 0.1275]]
continuation (0 1 0 2 0 1 2 1): [[4122, 1827, [edward, hollingworth], [william, heap], [part, of, roe, cross, farm], 0.2875]] → [[4253, 1828, [john, roberts], [james, heap], [part, of, roe, cross, farm], 0.2875]]
continuation (0 1 0 2 1 3 2 1): [[1243, 1800, [john, swindells], [john, swindells], [summerbottom, and, lands], 0.604167]] → [[1346, 1801, [john, dale], [john, dale], [summerbottom, and, lands], 0.604167]]
continuation (0 1 0 2 2 0 2 1): [[1374, 1802, [wilbraham, tollemache], [wilbraham, tollemache], [brick, croft], 0.0125]] → [[1475, 1803, [wilbraham, tollemache], [thomas, hill], [part, of, brick, croft], 0.0125]]
continuation (0 1 0 2 2 1 1 1): [[955, 1797, [wilbraham, tollemache], [occupation, [plasterer], thomas, harrop], [house, and, garden], 0.28125]] → [[1058, 1799, [wilbraham, tollemache], [ann, harrop], [house, and, land], 0.28125]]
continuation (7 0 0 2 2 2 2 1): [[1011, 1797, [wilbraham, tollemache], [john, lee], [kelsall, farm], 0.675]] → [[1110, 1799, [wilbraham, tollemache], [john, lee], [kelsall, farm, and, cottage], 0.7875]]
continuation (7 0 0 2 2 2 2 1): [[1700, 1805, [wilbraham, tollemache], [jonathan, hadfield], [hurstclough, farm], 0.5]] → [[1802, 1806, [wilbraham, tollemache], [jonathan, hadfield], [hurst, clough], 0.50625]]
merge (0 1 1 2 2 0 2 1): [[1525, 1803, [john, bostock], [john, bostock], [broadbottom], 1.6875]] → [[1423, 1802, [john, bostock], [john, bostock], [broadbottom], 1.51875], [1424, 1802, [john, bostock], [william, and, george, sidebottom], [broadbottom], 0.16875]]
merge (0 1 1 2 2 0 2 1): [[1529, 1803, [widow, wood], [widow, wood], [silver, spring], 0.59375]] → [[1430, 1802, [widow, wood], [john, harrison], [roe, cross], 0.297917], [1431, 1802, [widow, wood], [edward, chadwick], [roe, cross], 0.297917]]
continuation (7 0 0 2 2 2 1 2): [[3042, 1818, [wilbraham, tollemache], [john, langwith], [foundry], 0.1125]] → [[3165, 1819, [wilbraham, tollemache], [john, langwith], [], 0.16875]]
split (0 1 0 2 1 0 2 2): [[1226, 1800, [widow, wood], [widow, wood], [silver, spring], 0.59375]] → [[1329, 1801, [joseph, wood], [john, harrison], [roe, cross], 0.297917], [1330, 1801, [joseph, wood], [edward, chadwick], [roe, cross], 0.297917]]
split (0 1 1 1 2 0 1 2): [[52, 1784, [wilbraham, tollemache], [james, harrop], [], 0.900001]] → [[66, 1785, [wilbraham, tollemache], [jonathan, bowers], [], 0.618751], [117, 1785, [wilbraham, tollemache], [senior, james, harrop], [], 0.28125]]

Table 4. Similarity scores; examples (continued). (Columns as for Table 3.)
continuation (0 0 0 2 0 0 2 3): [[2410, 1813, [edmund, kershaw], [nathan, bowers], [harryfields], 0.6875]] → [[2618, 1814, [william, and, george, sidebottom], [william, and, george, sidebottom], [harryfields, farm], 0.6875]]
continuation (0 0 0 2 2 0 2 3): [[2704, 1815, [edward, hollingworth], [daniel, mercer], [roe, cross], 0.9]] → [[2803, 1816, [edward, hollingworth], [robert, heap], [roe, cross], 0.9]]
merge (0 0 0 2 2 0 1 3): [[2637, 1814, [james, hurst], [james, hurst], [], 0.595833]] → [[2421, 1813, [james, hurst], [occupation, [innkeeper], thomas, chadwick], [roe, cross], 0.297917], [2422, 1813, [james, hurst], [occupation, [innkeeper], thomas, chadwick], [roe, cross, land], 0.297917]]
invention (4 1 0 1 2 0 1 4): [[363, 1790, [wilbraham, tollemache], [edward, moss], [], 0.45]] → [[284, 1789, [wilbraham, tollemache], [wilbraham, tollemache], [], 0.337499], [4901, 1789, [wilbraham, tollemache], [neddy, holt], [], 0.112501]]
continuation (3 0 0 2 2 2 1 5): [[4831, 1795, [joseph, bardsley], [joseph, bardsley], [house, and, garden], 0.148125]] → [[932, 1796, [joseph, bardsley], [joseph, bardsley], [house, and, garden], 0.140625]]
lagged continuation (1 0 0 2 1 2 2 5): [[1431, 1802, [widow, wood], [edward, chadwick], [roe, cross], 0.297917]] → [[1633, 1804, [joshua, wood], [edward, chadwick], [roe, cross], 0.297917]]
split (0 0 0 -1 2 0 1 6): [[4849, 1795, [wilbraham, tollemache], [robert, bennett, and, james, harrop], [house, and, land], 0.27]] → [[842, 1796, [wilbraham, tollemache], [john, swindells], [hodge, mill], 0.028125], [897, 1796, [wilbraham, tollemache], [john, lee], [cottage], 0.1125], [926, 1796, [wilbraham, tollemache], [james, stead], [house, and, land], 0.1275]]
lagged merge (1 0 1 1 2 1 1 7): [[956, 1797, [wilbraham, tollemache], [widow, stead], [cottage, and, croft], 0.253125]] → [[798, 1794, [wilbraham, tollemache], [james, stead], [house, and, land], 0.126562], [817, 1794, [wilbraham, tollemache], [james, stead], [house, and, land], 0.126562]]
continuation (0 0 0 2 2 0 0 8): [[2241, 1810, [wilbraham, tollemache], [james, shaw], [garlick, cottage], 0.1125]] → [[2358, 1811, [wilbraham, tollemache], [robert, bennett], [late, hills, barn], 0.1125]]
lagged continuation (1 0 0 2 2 0 1 8): [[4888, 1785, [wilbraham, tollemache], [samuel, doxon], [], 0.337499]] → [[255, 1787, [wilbraham, tollemache], [joel, howard], [house, and, land], 0.3375]]
lagged continuation (1 0 0 2 0 0 1 9): [[4797, 1795, [john, reddish], [john, reddish], [cottage], 0.1125]] → [[1131, 1799, [john, sidebottom], [robert, bennett], [silent, mill], 0.114583]]
lagged merge (1 1 0 -1 2 0 1 10): [[1311, 1801, [wilbraham, tollemache], [john, reddish], [house, and, land], 0.675]] → [[952, 1797, [wilbraham, tollemache], [robert, bennett], [harrops, land], 0.140625], [963, 1797, [wilbraham, tollemache], [thomas, lowe], [cottage], 0.225], [965, 1797, [wilbraham, tollemache], [enoch, bretnor], [croft, late, woolley], 0.196875], [1012, 1797, [wilbraham, tollemache], [john, lee], [cottage], 0.1125]]

The overall succession score for the 'best' summation for any bundle is assessed against a threshold. Where the threshold is satisfied, the necessary new links are created to extend the chain(s). Where the overall score is not considered determinate, no links are made, but more evidence has been sought from other documentation.
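Schematically, the selection procedure just described might be rendered as follows. The predicate candidate/3 and the threshold value are illustrative assumptions introduced here; only the general shape - scoring every potentially matchable summation, preferring the lowest continuity score, and declining to link where the result is indeterminate - follows the account above.

% Sketch only: candidate/3 is assumed to enumerate the potentially matchable
% summations for a line entry, with their overall succession scores
% (lower is better, as with the continuity scores of Table 2).
score_threshold(4).   % illustrative value

accepted_link(Entry, Summation) :-
    findall(Score-S, candidate(Entry, S, Score), Pairs),
    Pairs \== [],
    keysort(Pairs, [Best-Summation|_]),   % best (lowest) score first
    score_threshold(T),
    Best =< T.
% In the full procedure, competing claims must also be respected: an element
% accepted into one summation cannot participate in any other.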
similarity: proprietor and occupier names

Three dimensions of similarity - of the names of the proprietor, of the occupier and of the bundle description - are assessed, extending to the several occupiers of the several potential bundles or elements in the case of merges and splits. Assessment of similarity makes use of elementary natural language processing techniques using the Definite Clause Grammars (DCG) extension of Prolog. Each dimension of similarity is assigned a score between 0 (no similarity) and 3 (identity). The limited value of the bundle descriptions underlies the emphasis on proprietor and occupier names.70

Assessment of the similarity of personal names extends beyond direct matching to consider possible transfer to family members, business partners and, in the case of 99-year leases, assignees, using sources referenced in Table 1. Well-understood problems of nominal record linkage apart, matching proprietor names proves straightforward, save insofar as account must be taken of Tollemache long leaseholders, who were not consistently treated as proprietors. As the principal landowner did not dispose of freehold property in the township over the period, a specific rule discounted any summations implying such transfers. Using similarity of occupiers' surnames to frame judgments about likelihood of succession makes implicit assumptions about security of tenure. Locally, where property was held on fourteen-year lease, there seem to have been strong expectations of a tenant right of renewal, and of nominating a successor, which seem matched by equally strong presumptions in contemporary treatises on estate management. This is not clear in the case of annual tenants.71

similarity: bundle descriptions

Comparisons of bundle descriptions may entail assessment of the compatibility of generic property descriptions with each other, of the compatibility of generic descriptions with topographic proper names (definite descriptions), and of the compatibility of topographic proper names with each other. Particular attention is given to the compatibility of parts with each other (where property units are being divided or combined). Generic property descriptions are recognized as such, and compatibility of pairs of generic property descriptions is assessed by decomposing noun phrases (eg 'house, mill and land') into their components and applying similarity constraints based on implied physical changes. This prevents, for example, any built property being part of a summation corresponding to an undeveloped bundle (described as, say, 'lands'). As work progressed, these 'physical' constraints were loosened (allowing compatibility of 'cottage' with 'house' or 'cottages', for example) in recognition of the far from precise way such generic terms were actually used.

Topographic names are treated as a special class of noun phrases. They are regarded as attributes of the places to which they refer rather than as rigid designators,72 so several bundles in one year may be described as 'Hague Farm', while 'Harrop Edge' is treated as identical to 'part of Harrop Edge'. The tendency for the referents of names to drift implies that allusions to topographic features, holdings and localities are not easily distinguished. Few presumptions are therefore made about the assumed extent of places denoted (for example, 'Nogon' or 'Lane End').
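The decomposition of generic descriptions described above can be illustrated with a small DCG; the grammar and its tiny lexicon are invented here for exposition and are far cruder than the grammar actually employed.

% Parse a generic description, eg [house, and, land], into component kinds.
description([K|Ks]) --> kind(K), opt_conj, description(Ks).
description([K])    --> kind(K).

opt_conj --> [and].
opt_conj --> [].

kind(built)       --> [house].
kind(built)       --> [cottage].   % 'cottage' later treated as compatible with 'house'
kind(built)       --> [mill].
kind(undeveloped) --> [land].
kind(undeveloped) --> [lands].

% A physical-change constraint of the sort described: a description containing
% built property may not correspond to a purely undeveloped bundle.
compatible(KindsA, KindsB) :-
    ( member(built, KindsA) -> member(built, KindsB) ; true ).

A query such as phrase(description(Ks), [house, and, land]) then yields Ks = [built, undeveloped], and compatible/2 rejects, for example, matching that description against plain 'lands'.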
Given this drift of reference, the phrase 'in Mottram' in the township returns was likewise treated as having no specific import (ie 'X in Mottram' or 'X at Mottram' are treated identically to 'X' alone). Assessment of the similarity between a topographic name and a generic description is relatively straightforward where the proper name has both a proper element and a generic element which indicates a building (eg 'Woolley Cottage'), or takes a related form (such as 'a cottage, late Platts'). By analogy with the matching of personal names above, a similarity score of 3 is assigned to matches such as that between 'Woolley cottage' and 'Wooleys cottage', but a score of 2 is assigned to that between 'Woolley cottage' and 'cottage'. In treating names of this specific type, comparisons are also made between the proper element of the bundle description and the name of the preceding occupier (potentially allowing a higher score of 3 to be assigned). This form can even justify merges (in the case of the description 'Barbers cottage, Bretnors field'). Acknowledging once again the typical transference of reference from landscape features to buildings, there is, however, no assumption that topographic names such as 'Harrop Edge' or 'Dolly Meadow' necessarily denote parcels of undeveloped land, and so matches including built property are permitted. Thus the score for a match between such a name and a generic cottage, house or land remains 1.

A specific approach to topographic matching was designed to exclude the implausible without attempting an exhaustive assignment of bundles to geographic locations, which patchy knowledge would not permit. A difficulty particularly of historic applications of GIS is that it can be difficult to hold information that is not placeable. To make best use of the locational information inherent in such terms as 'at Lane End', a number of sublocalities were identified with which particular bundles might be associated (deliberately without any further definition). Hence 'Mudd', 'New Mudd', 'Mudd Island' and also 'Dolly Meadow' were treated as having the property of association with the sublocality Mudd. Most bundle descriptions do not imply association with any sublocality, but when two line entries are compared, both of which can be associated with sublocalities, a match is considered implausible if the implied sublocalities differ. This allows use to be made of locational knowledge while maintaining the overall strategy of building relationships between historic textual data while permitting locational reference to be deferred.

structural priority

In casually comparing two line entries for successive years believed to refer to the same proprietor and occupier, higher liability in the later year might be attributed either to development or to expansion. Both possibilities fit the endurantist intuition that the value of a persisting object had increased. Contrarily, it would also be consistent with an individual having relinquished occupation of one bundle and entered into another comprising entirely different property.
Structural priority refers specifically to the following predispositions about which changes in landscape and occupancy are more or less likely:

i) as there is no evidence in the Tollemache estate documentation of any abandonment of buildings, a 'fall' in liability is presumed to imply transfer of property, unless it is impossible to identify any plausible set of corresponding increases;
ii) while the possibility of loss of value or of appeal is admitted, these are treated as outcomes of last resort;
iii) given relative values of land and buildings, and on the evidence of property constructed, any increase in the value of an apparently continuing holding greater than £3 (NAV0) is presumed to result from transfer rather than construction, and must be offset by a fall in the liability of another holding;
iv) given the overall precedence accorded to transfers over new development, a penalty of 1 is applied to any other in situ development; and
v) although summations that satisfy [2] might include any number of property elements and imply any configuration of property, a penalty is imposed which increases with the number of property elements combined within or carved out of a bundle.

The penalties associated with summations not preferred by principles i, ii and iii prevent the associated linkages being formed automatically. In the absence of preferable options, chains will remain incomplete and further review will be necessary.

ordering

The ordering of line entries within a return played an important role in Henstock's longitudinal matching, as before 1815 the Ashbourne returns followed a consistent street sequence. Ordering is of much less significance in the present study because (alphabetic listings apart) different topographic orderings were followed in different years. The sequence numbers added to each line entry allow order to be exploited, however. For any line entry, expected sequential positions for the previous and following years may be calculated. When the expected position differs from the actual position by fewer than three entries, the likelihood score is adjusted, as sketched below.

lagged summations

When considering succession from or to a particular line entry, summations are identified and likelihood scores calculated not only for entries in the year immediately following (or preceding), but also for more distant years (termed 'lagged summations'). This allows identification of the various intertemporal effects outlined, and assists in identifying those holdings repeatedly divided and recombined, or whose occupiers alternate. The scores assigned preclude lagged summations ever being preferred to non-lagged ones (thereby preventing jumping through time).
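The sequence adjustment referred to under 'ordering' can be sketched as follows; the predicate name and the size and direction of the adjustment are placeholders assumed for illustration, but the rule that an entry lying within three places of its expected position attracts an adjustment follows the text.

% Sketch: adjust a likelihood score where a line entry falls within three
% places of its expected sequential position in the adjoining year's return.
% (The magnitude and direction of the adjustment are placeholders.)
sequence_adjust(Expected, Actual, Score0, Score) :-
    D is abs(Expected - Actual),
    (   D < 3
    ->  Score is Score0 + 1
    ;   Score = Score0
    ).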
results: chains, geographic reference and audit

Each time the procedures are run (ie the rules are applied to the facts), a series of chains is created, together with the link information required to produce an entire graph. Each chain represents a continuous path between bundles through time. An example of a chain is provided in Box 1, while the entire reconstructed graph is illustrated in Figure 3 (a, b and c), the thickness of the edges in Figure 3a being proportional to the associated notional annual value (NAV0). The information associated with each chain includes, together with the successive estimates of NAV0, the content of the line entry corresponding to successive vertices (and also a reference to its geographic 'patch' as described below). It also includes the imputed circumstances of the chain's origin, of its termination, and of critical events within it (such as gaining value from, or losing value to, another chain), together with matched information from Tollemache estate documentation where applicable (as in Box 1). Each chain is identified by the number of its starting vertex, that is, the unique reference of the specific line entry. A chain may originate by being 'split from' another chain, or be treated as 'expected built' in the case of properties matched with a Tollemache 99-year building lease. The origin of chains starting in 1784 is described as 'censored'. All other chains are initially considered to have an 'unknown' origin, though most have subsequently been reclassified as 'new built'. Chains may end when they are merged into another chain, or in 1830 (after which they are 'censored'), or in 'unknown' circumstances. The extent and character of these unknown origins and terminations is considered below.

Figure 3a. Mottram-in-Longdendale Land Tax Graph 1784–1829; Tollemache Estate; larger properties extant in 1784.

Figure 3b. Mottram-in-Longdendale Land Tax Graph 1784–1829; Tollemache Estate; smaller properties and holdings created after 1784.

Figure 3c. Mottram-in-Longdendale Land Tax Graph 1784–1829; bundles controlled by other freeholders.

Key to Figures 3a–3c. Freeholds 1784: 1 Cresswell (Lowe from 1785); 2 Kershaw; 5 Bostock; 9 Harrison; 42 Parish; 53, 54, 55, 56, 57, 58, 60, 2040, 3330, 3575 Stamford & Warrington; 61 Church; 1863 Shaw. Chains 11 (Shaw 1784) and 26 (Hill 1784) are included in Figure 3a (as in some years Land Tax liability for a constituent bundle also includes property subject to Tollemache freehold). Numbers on the horizontal axis denote year of assessment; numbers in grey for 1799 indicate the 'patch' occupied by the chain in that year (see text). All other numbers denote the start of specific Chains. A number in a rectangular box denotes a Chain originating with new construction; numbers in red indicate that property appears to have been previously untaxed. Italic script indicates a Chain representing only an inter-temporal adjustment associated with liability of a deceased occupier. Bundles (vertices) are represented by black points. The thickness of an edge is proportional to the notional annual value of property (NAV0) transferring to the bundle at its right hand side. An arrowhead on an edge indicates that liability in regard to the bundle on the right is exonerated, and hence changes in value (NAV0) arising from new physical development cannot be traced.

Box 1: A Specific Chain: Chain 10; Cooper Holding at Hague; Tollemache Estate

Each chain is represented as a Prolog list. Each element in that list provides information for a specific year. Each element is itself a list which takes the form [Identifier, Year, Proprietor, Occupier, Bundle, TollemacheParcels, Sum_Assessed, NAV0, Patch]. Where a Bundle corresponds to a series of parcels on the Tollemache estate, these appear as a list in the TollemacheParcels slot, otherwise [] appears. The information about any Tollemache parcel is also ordered as a list of the form [Identifier, Alpha, Num, Parcel, Sqmetres, Value]. Alpha and Num together (eg h8) refer to the missing estate map.
The reconstructed version of this map forms a key source for the map of Land Tax patches for 1799 included as Figure 6. Chain 10 remained with the Cooper family throughout, but was augmented by the addition of William Oldham's Old Gate in 1804 (involving an intertemporal adjustment).

[[10, 1784, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [74, 1785, [wilbraham, tollemache], [william, cooper], [hague, farm], [[100, h, 8, [great, arney, road], 16617.4, 14], [101, h, 9, [little, arney, road], 6576.15, 20], [102, h, 10, [long, croft], 5918.53, 25], [103, h, 11, [dodds, butts, and, little, brow], 6095.58, 20], [104, h, 12, [top, of, arney, road], 1011.72, 25], [105, h, 13, [wheat, croft], 303.515, 40], [106, h, 14, [higher, croft], 961.129, 40], [107, h, 15, [wall, hey, meadow], 16086.3, 35], [108, h, 16, [sick, meadow], 4173.32, 25], [109, h, 17, [new, meadow], 9434.24, 20], [110, h, 18, [brow, above, house, and, homesites], 1947.55, 40], [93, h, 1, [higher, banks], 19247.9, 5], [94, h, 2, [middle, banks], 9484.83, 5], [95, h, 3, [lower, banks], 1315.23, 8], [96, h, 4, [lowermost, banks], 20335.5, 5], [97, h, 5, [catt, tor, meadow], 14315.8, 10], [98, h, 6, [farmost, field, and, wood], 14771.0, 5], [99, h, 7, [middle, field], 10623.0, 20]], 1.2375, 22.0, 10], [142, 1786, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [212, 1787, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [290, 1789, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [369, 1790, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [456, 1791, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [552, 1792, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [652, 1793, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [752, 1794, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [4767, 1795, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [851, 1796, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [1003, 1797, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [1101, 1799, [wilbraham, tollemache], [william, cooper], [hague, farm], [[100, h, 6, [farmost, field, and, wood], 14771.0, 5], [101, h, 7, [middle, field], 10623.0, 20], [102, h, 8, [great, arney, road], 16617.4, 14], [103, h, 9, [little, arney, road], 6576.15, 20], [104, h, 10, [long, croft], 5918.53, 25], [105, h, 11, [dodds, butts, and, little, brow], 6095.58, 20], [106, h, 12, [top, of, arney, road], 1011.72, 25], [107, h, 13, [wheat, croft], 303.515, 40], [108, h, 14, [higher, croft], 961.129, 40], [109, h, 15, [wall, hey, meadow], 16086.3, 35], [110, h, 16, [sick, meadow], 4173.32, 25], [111, h, 17, [new, meadow], 9434.24, 20], [112, h, 18, [brow, above, house, and, homesites], 1947.55, 40], [95, h, 1, [higher, banks], 19247.9, 5], [96, h, 2, [middle, banks], 9484.83, 5], [97, h, 3, [lower, banks], 1315.23, 8], [98, h, 4, [lowermost, banks], 20335.5, 5], [99, h, 5, [catt, tor, meadow], 14315.8, 10]], 1.2375, 22.0, 10], [1201, 1800, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 10], [1291, 1801, [wilbraham, tollemache], [william, cooper], [house, and, land], [],
1.44583, 25.7036, 1291], [1392, 1802, [wilbraham, tollemache], [william, cooper], [house, and, land], [], 1.40625, 25.0, 1291], [1505, 1803, [wilbraham, tollemache], [william, cooper], [hague, farm], [], 1.2375, 22.0, 1505], [1596, 1804, [wilbraham, tollemache], [william, cooper], [cottage], [], 1.40625, 25.0, 1596], [1697, 1805, [wilbraham, tollemache], [william, cooper], [cottage], [], 1.40625, 25.0, 1596], [1799, 1806, [wilbraham, tollemache], [william, cooper], [house, and, land], [], 1.40625, 25.0, 1596], [1903, 1807, [wilbraham, tollemache], [william, cooper], [house, and, land], [], 1.40625, 25.0, 1596], [2007, 1808, [wilbraham, tollemache], [william, cooper], [house, and, land], [], 1.40625, 25.0, 1596], [2112, 1809, [wilbraham, tollemache], [william, cooper], [house, and, land], [], 1.10625, 19.6667, 1596], [2249, 1810, [wilbraham, tollemache], [jo, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [2294, 1811, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [2431, 1813, [wilbraham, tollemache], [betty, cooper], [house, and, land], [[100, h, 5, [catt, tor, meadow], 14315.8, 14], [101, h, 6, [farmost, field, and, wood], 14771.0, 8], [102, h, 7, [middle, field], 10623.0, 20], [103, h, 8, [great, arney, road], 16617.4, 16], [104, h, 9, [little, arney, road], 6576.15, 24], [105, h, 10, [long, croft], 5918.53, 30], [106, h, 11, [dodds, butts, and, little, brow], 6095.58, 25], [107, h, 12, [top, of, arney, road], 1011.72, 25], [108, h, 13, [wheat, croft], 303.515, 24], [109, h, 14, [higher, croft], 961.129, 40], [110, h, 15, [wall, hey, meadow], 16086.3, 40], [111, h, 16, [sick, meadow], 4173.32, 45], [112, h, 17, [ new, meadow], 9434.24, 30], [113, h, 18, [brow, above, house, and, homesites], 1947.55, 40], [208, v, 1, [old, gate], 14695.2, 18], [96, h, 1, [higher, banks], 19247.9, 8], [97, h, 2, [middle, banks], 9484.83, 12], [98, h, 3, [lower, banks], 1315.23, 10], [99, h, 4, [lowermost, banks], 20335.5, 6]] , 1.40521, 24.9815, 1596], [2550, 1814, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [2666, 1815, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [2780, 1816, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [2897, 1817, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [3007, 1818, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [3125, 1819, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [3246, 1820, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [3361, 1821, [wilbraham, tollemache], [betty, cooper], [house, and, land], [], 1.40521, 24.9815, 1596], [3480, 1822, [john, tollemache], [betty, cooper], [house, and, land], [], 0.967708, 24.9815, 1596], [3609, 1823, [john, tollemache], [betty, cooper], [house, and, land], [], 0.967708, 24.9815, 1596], [3729, 1824, [john, tollemache], [betty, cooper], [house, and, land], [], 0.967708, 24.9815, 1596], [3851, 1825, [john, tollemache], [betty, cooper], [house, and, land], [], 0.967708, 24.9815, 1596], [3972, 1826, [john, tollemache], [betty, cooper], [houses, and, farm], [], 0.967708, 24.9815, 1596], [4097, 1827, [john, tollemache], [betty, cooper], [houses, and, farm, at, hague], [], 0.967708, 24.9815, 1596], [4228, 1828, [john, tollemache], [thomas, and, holland, cooper], [houses, and, farm, 
at, hague], [], 0.967708, 24.9815, 1596], [4371, 1829, [john, tollemache], [thomas, and, holland, cooper], [houses, and, farm, at, hague], [], 0.967708, 24.9815, 1596]]

Inherent within each chain is an imputed development history, complemented by a locational history. A chain comprises one or more subchains, each corresponding to a geographic patch. The geographic footprint of a chain obviously alters as holdings are combined or divided, but each of the subchains that stretch between such events corresponds to a fixed (though initially unknown) geographic patch. The specification of subchains, and hence of patches, rests on the separation of those changes in notional value arising from change in geographic extent from those due to physical development and intertemporal adjustments. Potentially, therefore, a chain might be thought of not as a one-dimensional object attenuated through time, but as a three-dimensional object - the additional dimensions allowing representation of its footprint at the time of each successive Land Tax assessment.73

The final processing step - locating the patches geographically - is largely distinct from generation of the chains, and predominantly involves clerical rather than computational effort. This matching rests on the one hand on the information in the chains themselves, and on the other on the availability of appropriate cartographic sources. The locational evidence attached to the chains is of two forms. The first derives from matches with estate documentation which (where appropriate) associate the names of parcels held on 14-year lease with specific patches, and from matches with property subject to 99-year lease. The second comprises the successive descriptions of enduring features provided by the chains themselves. Although many individual bundle descriptions (when present) may be uninformative (eg 'cottage') or now untraceable, an entire chain frequently provides one or more recognizable descriptions (eg 'cottages on Pingot Lane'). Problems remain in locating cottage property, which are discussed below. As in Henstock's study of Ashbourne, there is some reliance on the Tithe Map (of 1846 in this case). A computational reconstruction of a lost Tollemache estate plan, produced for a sister project relying on higher quality plans of the 1840s and a range of other material, provides the other principal cartographic resource.

audit: can the chains be completed?

Although later sections attempt to draw out emergent understandings of urbanization prompted or supported by the reconstructed chains, the present concern is simply with the extent to which it is possible to complete them. This proves very satisfactory; Table 5 provides some summary statistics.

Table 5. Completion of chains; number of bundles in chains by status.

                      Termination known   Termination unknown   Total
Origin known          4461 (97.2%)        15 (0.3%)             4476 (97.5%)
Origin unknown          77 (1.7%)         36 (0.8%)              113 (2.5%)
Total                 4538 (98.9%)        51 (1.1%)             4589 (100.0%)

Table 6. Chains ending unexpectedly.
Origin year   Term (years)   Occupier            Property      NAV0   Chain   Comment   Unknown origin?   Bundles
1790          14             John Richardson     []            £2     412     Wealth?   ✓                 15
1790          99             Joseph Dewsnap      []            £2     391     lost      ..                4
1791          1              Samuel Richardson   cottage       £1     534     Wealth?   ✓                 18
1795          99             James Shaw          Silent Mill   £2     4843    lost      ..                11
1813          99             Joseph Band         []            £2     2398    Wealth?   ✓                 3
All                                                                                                       51

There are 4589 bundles in total, each representing a property object at a point in time, which, using the methods outlined, can be arranged into 186 chains, defining the Land Tax graph in Figure 3 (a, b and c). In contrast to the Fixed Property Object case, only five chains simply continue with a constant notional value from 1784 through to 1829 (implying both unchanging boundaries and the absence of material development affecting Land Tax assessment). Overall, 4461 bundles (97.2%) occupy a place in a chain for which the circumstances of origin and termination are both known. Thirty-six bundles (0.8%) form part of a chain where neither the circumstances of origin nor of termination are clear. In 1784, the township was assessed as sixty bundles (excluding tithes and official salaries), fifty-three of which define chains which can be traced directly through to 1829. Seven of the remaining ten were merged into others, and two incurred some radical rupture. None of the chains beginning in 1784 becomes untraceable.

Chains begin or end unexpectedly when the logic set out above fails to capture the practices of the assessors. They may also begin unexpectedly when new property is built. Problems of continuity are thus more easily understood by focussing on chains that end unexpectedly (which are listed in Table 6).

Though the inconsistencies seem modest, most of the broken chains appear to arise from changes in the untaxed residuum. Sometimes such changes seem to result from the personal circumstances of an occupier, consistent with the interpretation of statute in Table 1. Thus while let to Samuel Richardson, a Tollemache cottage with an unusually low NAV (£1) was assessed for Land Tax, but ceased to be traceable after his death. More convincingly, a Tollemache 14-year let with an unusually low rent was not assessed for Land Tax while occupied by Jacob Jackson, but became so once it was occupied by the wealthy attorney, Robert Bennett.74 Other broken chains seem to reflect possibly systematic changes in the margin of the untaxed residuum. Thus in 1813, when the assessment seems especially assiduous, two additional holdings were assessed for the first time although they had been built some years earlier, though given the values of the property their previous exclusion might have been a matter of policy.75 Moreover, with the local revaluation of 1822, the Earl of Stamford and Warrington's plantation appears for the first time, which might be more likely to be oversight. Nevertheless, it seems clear both that these inconsistencies are modest, and that the method adopted goes quite a way towards unravelling them.

audit: can the chains be placed - and with what degree of precision?

Each of the 4705 individual bundles was assigned to a patch, thereby defining 324 distinct patches. Some 268 (82.7%) of them can be located. For the remaining 56 (17.3%), different solutions are possible.76 Figure 4 illustrates the extent to which it proves possible to locate the patches by mapping (where possible) the footprint of the chains in a single year - 1799.
Each number shown on Figure 4 corresponds to a chain shown on Figure 3 (a, b or c) at that particular stage. The ease or difficulty of locating a particular patch depends fundamentally upon the richness of the cluster of descriptions associated with the chain on the one hand, and the cartographic resource on the other. There are, however, two mediating considerations: the geographic configuration of the patch itself, and the extent of changes in occupancy between 1830 and the time at which cartographic survey was undertaken. These are considered in turn.

As the amount of descriptive matter brought together within a chain increases, the chance of locating the patch improves, even though many individual bundle descriptions may be either entirely uninformative (eg 'cottage', or 'house and land') or now untraceable ('Badgers Hall', 'Bolton Hill', 'Baron (or Barren) Alley'). Cottages traceable only by their occupier are thus hard to place, and hence only 73.9% of patches with a NAV0 of £2 or less can be located, as opposed to 86.0% of other patches, as Table 7 shows. Locating such patches depends largely on continuity of occupancy between 1830 and the time of the Tithe Commutation survey.

Figure 4. Land Tax Patches: Mottram-in-Longdendale 1799. (Legend: 9999 - part of the Stamford estate which cannot be assigned to a specific patch; -1 - untaxed land (Stamford plantation and places of worship); parts of the Tollemache estate which cannot be assigned to a patch are shaded yellow.)

Table 7. Percentage of all patches traceable 1784–1829; by circumstance.

             Stamford Estate    Other Freeholds    All
Traced            19.4               90.6          82.7
Untraced          80.6                9.4          17.3
All              100.0              100.0         100.0

Entire Township
             Cottage Holdings   Larger Holdings    All
Traced            73.9               86.0          82.7
Untraced          26.1               14.0          17.3
All              100.0              100.0         100.0

Township Excluding Stamford Estate
             Cottage Holdings   Larger Holdings    All
Traced            74.7               96.6          90.3
Untraced          25.3                3.4           9.7
All              100.0              100.0         100.0

The range of cartographic resources is obviously critical. For patches within the Tollemache estate, the availability of a digital plan reconstructed on the basis of the Tithe map, surviving books of reference and other textual and graphical sources proves very valuable. Chains are linked directly to parcels on 14-year lease as reconstructed. For patches within the Stamford and Warrington estate, no special cartographic sources are currently available, and this proves a problem. In the case of the other minor freeholds, the chains derived are not complex and hence the Tithe map suffices.

Configuration of holdings has less obvious effects. Where a freeholding (for which no estate map is available) comprised several contiguous patches with different occupiers, the possibility of defining their limits depends entirely on the extent of changes in occupancy between 1830 and the tithe commutation survey of 1846. The configuration of patches corresponding to 14-year Tollemache leases also causes difficulties. Although the location of the agricultural parcels included in such leases is known from the reconstruction (however scattered they may be), the location of disjoint cottages is not known. Once again, the feasibility of locating such cottages depends on the extent of changes in occupancy between 1830 and 1846.
Ultimately, therefore, where the Tithe Map is the only cartographic resource, the most critical consideration is the extent of changes in occupancy between 1830 and 1846. Turnover of tenants on the Stamford and Warrington estate was such that it proves possible to locate only seven of its 36 patches (19.4%). The specific pattern of turnover on the Tollemache estate in that period allows that 75% of cottage holdings can be located (but this should be compared with 97% of larger holdings).

reconstructing urbanization: the pattern of land allocation and development

The chains derived from surviving Land Tax records expose a period of substantial change, revealing the chronology and pattern of development in remarkable detail. They establish the succession of occupiers of a changing mosaic of holdings, and of an expanding stock of building, providing a framework for organizing and making sense of further materials. Overall, they reveal three phases:

Phase 1: 1784–1804; a 'considerable increase': development of the village by petty capitalists guided by the landed interest, with active subdivision of small farms, creation of new cottage farms and the establishment of first-generation machine spinning factories;

Phase 2: 1805–1825; the 'finished town': establishment of second-generation spinning and calico printing factories, with intensification of housing, and the transfer of control of cottage property to larger capitals;

Phase 3: 1826–1830; minor dispersed development: resumption of small-scale development on the Tollemache estate.

Phase 1: 1784–1804; A 'considerable increase': Unlike neighbouring townships, Mottram had experienced little of the rapid demographic growth typical of proto-industrialization. Aiken in 1795 observed that 'it is only of late years that the town has had any considerable increase, which has been chiefly at the bottom of the hill, but some latterly on the top'.77 The chains allow that period of increase to be reconstructed, and more surprisingly point towards some of the processes underlying his observation.

Chaining indicates that this growth was almost entirely within the Tollemache estate. Chains begin as the 14-year leases of 1771 come to an end, making way for the 1785 'tack'. It was also in 1785 that Wilbraham Tollemache secured the Parliamentary Act allowing him to grant long leases on his Mottram estate (overcoming limitations of tenure shared with other major landowners).78 The new 'tack' provided an opportunity for change, and the power provided by the Act was critical to the program of subdivision and physical development shown by the Land Tax chains (see Figures 3a and 3b respectively). Neither a record of the 1785 'tack' nor a contemporary survey survives, but the chains reveal its effects. Opportunity was taken to break up the two tenant farms to the east of the Manchester turnpike (Chains 41 and 52), releasing plots immediately adjoining the road for development, the remaining parcels either being packaged into smaller bundles (Chains 116, 117 or 118) or assigned to other very small-scale holdings (augmenting Chains 24, 25, 27 and 59). Single parcels of land adjoining the Woodhead and Stockport turnpikes were separated from former holdings and assigned to publicans (Chains 15 (Bennett) and 39 (Goddard)).
Elsewhere, cottages were severed from the small farms with which they had previously been let (reducing Chain 7). The overall effect of the 1785 changes across the Tollemache Mottram estate was to re-configure holdings in a form more attuned to the pattern of demand, reducing their typical size and presumably contributing to the increase in rents per acre discussed below. Moreover, the apparent rigidity of 14-year leases did not prevent further subdivision after 1785. Between 1786 and 1787, Thomas Cardwell's farm (Chain 47) was divided so as to create four 'cottage farms' (Chains 197, 255, 253, 256), at minimal expense to the landowner as existing buildings provided the dwellings.79

Developers of the 'middling sort', representing a specific 'combination of work and property' (in the spirit of Lubow80), re-centred the village. New housing built on the roadside plots by the surgeon James Stead (Chain 197) and by Thomas Chadwick, a woollen clothier (Chain 129), became subject to Land Tax by 1786. By the same year, William Garside, a shopkeeper, had built his 'Baron Alley' (Chain 198) in what was becoming the core of the village near the junction of the three turnpikes. Alongside, the tailor Robert Hamilton completed the property subsequently styled 'Grocers' Hall' (Chain 199), and Wagstaffe's mill (Chain 771) was built adjoining a farm house built a century before, a remnant of a holding evidently divided before the period examined. All these had the benefit of 99-year leases; the market was unmuzzled, but regulated by the aspirations of the Tollemache estate.

Chaining shows how the release of further parcels by the principal landowner allowed for thickening and extension of this core (see Figure 3b). Housebuilding by the publican-farmer Samuel Cook on the Pit Croft by 1791 (Chain 528) drew it southwards on the Stockport turnpike, while development by the weaver Joseph Bardsley (Chain 442) extended it northwards on the Stalybridge road. By 1790, Thomas Cardwell, the farmer whose holding had been divided into cottage farms, had completed the first housing on 'Brick Croft' (Chain 390), the remainder being incrementally built out and subdivided, changing hands repeatedly before development was ultimately completed in 1813. While most of this activity contributed to the formation of a minor commercial centre - 'a sort of market' as Aiken put it - at its peak in 1791 building started at the hamlet of Mudd, the top of the hill which he described. Thomas Shaw's houses (Chain 536) and Jonathan Hadfield's Badgers Hall (Chain 535) of 1791 were followed by Joshua Binns' Bolton Hill (Chain 636) from 1792. Again the
Within any particular Land Tax chain, incremental development appears as increases in NAV0 of between 10 shillings (£0.50) and £2 not attributable to transfer of property from others. Figure 7 shows the aggregate value of these incremental changes year by year, highlighting their significance in the late 1780s and into 1791/2 . Much of the property built in this way at the village core was evidently poor, and was demolished in the early years of the twentieth century. At Mudd too, incremental accretion once again produced ‘a number of irregular tumble-down houses’.82 Closely spaced parallel terraces played no part in this form of urbanization (although they typified the later Broadbottom colony). Indeed, the building plots released on 99-year lease were too narrow to permit this. Instead, the discontinuous ribbon of development meant occupiers of the new property might still occupy garden land and grazing land rented separately. In the absence of property it was not possible - in Malcomson’s terms - to provide ‘for one’s own needs by one’s own efforts, without the mediation of wage-employment’83 , but access to means of subsistence was possible. The development forms of the township in the 1780s and 1790s thus had no necessary direct connection with proletarianization. They were consistent with the extension of a mixed marketized cottage economy, and the small-scale developers frequently occupied (adjoining) land for fourteen-year terms, allowing them or their tenants the possibility of cow keeping. Moreover, the particular pressure of demand for small areas of pasture and grazing abutting the village is strongly suggested by the pattern of Tollemache rents and increases in those rents. Alongside those changes in the closing years of the late eighteenth century that appear to reflect the late flowering of a proto-industrial economy - or rather one based on pluriactivity - the Land Tax chains also track the onset of industrialization proper. On the Tollemache estate, chaining shows both minor textile development intercalated in the village and larger-scale machine spinning on riverside sites at the southern limit of the township. Chaining shows the severance of an old fulling mill - Hodge Mill (Chain 68) - from a small farm (Chain 4) subdivided in the 1785 ‘tack’ (also forming Chain 66). It shows the succession of its occupiers and following realignment of business interests, the construction of an adjacent factory - Wharf Mill (Chain 1108) - by 1799.84 168 August 7, 2014 Time: 02:36pm ijhac.2014.0127.tex Peter Bibby Figure 5. Incremental Development. Building A with three steps predates the rebuilding, appearing on a lease plan of 1789 as a saddler’s shop and is part of Chain 276. The property which extends it, B, is accounted for within Chain 38, its construction presumably corresponding to an increase in its annual value of £1 in 1792, or a further increase of £1 in 1793. Property C occupies a site accounted for within Chain 40. A notional value of £1.50 in 1784 increased to £3 by 1792, and £3.50 by 1806. The site was granted a 99-year lease in 1796, including an area where development had taken already taken place (Source: Tameside Image Archive; Copyright Tameside MBC). Beyond the limits of the principal landlord’s estate, the Land Tax chains track the construction of Thomas Lowe’s mill (Chain 740) on his family’s freehold by 1794, and its absorption once again into Chain 1 on the death of his father. 
Chaining shows, however, that even the reconfiguration of land uses and the pattern of development accompanying machine spinning did not begin to constitute urban forms typified by 'confined streets'. Alongside a demand for workers' housing, machine spinning induced a demand for land for grazing horses, pasturing cattle and growing fodder crops, leading to the displacement of households engaged in more traditional activity. In the case of Hodge Mill, a single terrace was built, unrelated to a street system, in nearby pastureland, accommodating workers and collective loomshops (Chain 835). Chaining tracks the block's initial construction on land taken from the holding of the farmer-clothier John Lees (Chain 3), its later extension, and the subsequent increase in notional value as adjoining land was transferred from Lees (presumably a cow ground for the benefit of the occupiers).85 Chaining shows that through the 1790s the demand of incipient cotton capitals for 'agricultural land' prompted the displacement of long-settled families - Bowers (Chains 4 and 66) and Lees (Chain 3) - who perhaps epitomised the traditional dual economy, culminating in the subdivision of Lees' Hurst Clough farm in the tack of 1799, and the assignment of a further portion to Moss (Chains 835, 1108), the cotton spinner who controlled Wharf Mill. Thus although in this locality land ownership was highly concentrated, the demand of both craftsmen and of machine spinners had continued to stimulate fragmentation of holdings, in contrast to pervading trends and the conventional wisdom of estate management.

Figure 6. Incremental Development. Samuel Cook's development first appears as Chain 528 in 1791, with a notional annual value of £2, being subject to a 99-year lease from Wilbraham Tollemache dated January 1789. With incremental development, the value rose to £3 in 1793, and to £3.50 in 1806. In 1815, the property was split, forming two further chains - 2778 and 2831 (Copyright: Author).

Figure 7. Notional annual value of new built development: Mottram-in-Longdendale 1784–1829. 'Further Development' refers to the notional value of built development undertaken within any bundle, other than development in the year the bundle first became taxable. 'All Development' refers to the notional value of all built development undertaken in any given year, including development within bundles taxed for the first time and 'further development' as defined above.

Phase 2: 1805–1825; The 'finished town': Analysis of the chains suggests a marked change in the pattern of development and the organization of housing with the opening of the nineteenth century. Apart from the change at Hodge discussed, the 'tack' of 1799 involved little further subdivision of holdings. For whatever reason, possibly an active policy of restraint, new housebuilding ceased on the Tollemache estate, and Mottram became a 'finished town'.86 The remaining plots on Brick Croft (Chains 875 and 876) transferred between petty capitalists without development until construction on Chain 876 in 1813 by the publican William Warhurst.
With restraint came intensification and escalation of rents, with rents of cottages securing twice their rateable value by 1818, and yields on cottages double expectation by 1826.87 The balance of forces driving change in the new century, and the physical character of development, seem quite different. Factory-based industry came to the fore, while some of the first-generation machine spinning businesses disappeared. Most significantly, the Sidebottom brothers - locally rooted major capitals - established Broad Mills and the adjoining Broadbottom colony outwith the Tollemache estate at the southern limit of the township (Chain 1627). Although chaining cannot trace subsequent development because of exoneration, other evidence highlights the stark difference between the physical configuration of housing in the colony and that elsewhere in the township. These parallel terraces in their tightly confined complex, emblematic of proletarianization, were the principal addition to the housing stock in the period.88 While physical development was restricted, the township's economy grew, becoming dominated by large-scale textile manufacture.89

Development by the Sidebottoms apart, the Land Tax chains show that the opportunities offered by the early mechanization of calico printing were realized by Samuel Matley and Co, who took over Tollemache property at Hodge following the collapse of the earlier spinning partnership (Chain 68). Despite the scale of the physical investment by this second major capital which the chains suggest (an increase of £22.25 NAV0 in 1805–6), the Matleys built no further workers' housing. Rather, they intervened in the supply of housing space by buying up property built on long lease in the preceding period. Besides the block controlled by their predecessors at Hodge (Chain 835), its second-floor loomshops divided to provide further accommodation, they acquired housing constructed at the height of the boom by Jonathan Hadfield at Mudd (Chain 633).90

Although chaining shows that building had virtually stopped on Tollemache land, successive Censuses (1801, 1811, 1821) demonstrate that the number of households in the township continued to grow at levels outstripping building by the Sidebottoms.91 Property use thus intensified. Much of the building developed in the boom which Tollemache had promoted was bought up to provide workers' housing. The chains show that the interests involved were not limited to industrialists such as Matley, who might be motivated at least in part by their own need to secure labour power. Centrally, chaining exposes the hitherto unrecognized market-making role of the local attorney and rentier, Robert Bennett. Evidently a man of substantial means holding Bank of England debt, by the close of the eighteenth century he had become actively involved in the township's growth. He had played a transitory role in Silent Mill (Chain 4843), an enterprise of John Sidebottom, two of whose sons controlled the Broad Mills complex. By 1795, Bennett had both himself built new property on 99-year lease (Chain 4842), and engaged in a series of transfers allowing further subdivision of Harrop's holding and the creation of housing out of its stable (Chain 4789). Chaining shows Bennett's transitory direct involvement before the property passed to the Manchester liquor merchant, Henry Cardwell, while Bennett's account book shows that he retained an interest as Cardwell's mortgagee.
After 1800, Bennett's role grew as he acquired properties demised by Tollemache for 99 years, including housing developed in the preceding spurt of growth by the woollen clothier Thomas Chadwick (Chain 129) and the cotton spinner Thomas Shaw (Chain 536). He took older cottage property demised to John Sykes (Chain 21), and the property demised to the publican Edmund Hill, creating further housing out of his barn (Chain 2145). Bennett, moreover, played an important role in providing mortgage finance for the final developments on building leases granted in the period of expansion (e.g. Chain 876).92 Renting out land that he held from Tollemache on 14-year lease (including Chains 36 and 1875) - not merely parcels of meadow but gardens and pigcotes - gave Bennett further income streams, and influence both over what remained of the 'cow and cottage' system in its continuing form and over the provision of housing space with no land at all.93

The Land Tax chains show, moreover, that the major cotton spinning and calico printing capitals also came to control substantial areas of pasture land and grazing. The Sidebottom brothers succeeded to freehold land formerly held by two lesser freeholders - Lowe (through Chain 1) and Kershaw (Chain 2) - but leased nothing from the principal landlord. Subsequent development on those freehold bundles (limited in fact to substantial mansions for their own occupation) cannot be traced through the Land Tax returns, because they purchased exoneration once they came into possession of the property. Matley secured Tollemache land, not only taking that leased to the spinners that preceded them (Chain 68), but also the adjoining Hurst Clough holding (Chain 3) from 1810 and the farm previously occupied by the publican Samuel Cook (Chain 31) from 1824.

Phase 3: After 1825: 'Minor Dispersed Development'

The final years of the surviving Land Tax record form a codicil to this account, indicating a new period of construction on the principal landlord's estate (Chains 4196?, 4223, 4225, 4269), following a flurry of building leases. An estate survey of 1826, preparatory to a new tack, perhaps signalled this change, noting that Matley's houses at Hodge (Chain 835) made a return of 12% per annum, roughly double that expected. The pattern of land release was similar to that of the period before 1805, with the blocks longer, but still widely dispersed. The Land Tax chains show development in 1828 on Tollemache land near, but distinctly separate from, the Broadbottom colony. John Clayton, a publican-shopkeeper, developed Haven House 350 metres to the west. William Loughridge's terrace was built 250 metres to the north, surrounded by pasture land which the chains show was taken from Brown Road Farm (Chain 4270). Any suspicion that this continues a pattern of thirty years before seems confirmed by the roughly contemporary comment that 'Loughridge wants a cow-keeping out of I6' in an estate notebook.94

urbanization: summary

In summary, therefore, chaining allows the construction of a history of changing land-use and development at the simple, phenomenal level.
It reveals a period of subdivision and rapid village development whose character and sudden ending remain unrecognized.95 At this level there are significant gaps - most obviously problems arising from exoneration, rendering the account offered of the Sidebottom cotton enterprise seriously deficient.96 Understanding the nature and extent of the untaxed residuum also remains a problem, despite the possibility of noting changes under the longitudinal approach. Moreover, these changes at the phenomenal level can be seen to have contributed to the attenuation of a proto-industrial configuration, creating cottage farms, stimulating pluriactivity, accommodating crafts and trades, and satisfying pent-up housing demand. Their significance within the constitution of an urban-industrial ensemble can also be appreciated; there is direct evidence of construction and expansion of mills, printworks and (though occluded) of a factory colony. While other sources show more clearly the nature of the Sidebottoms' starkly class-divided locale, chaining reveals some of the less obvious aspects of this urbanization, including the manner in which the substantial capitals secured control of undeveloped land, and the control and intensification of previously developed housing. Chaining exposes the role of Robert Bennett, which seems to have been totally unknown.97 The capacity of the Land Tax chains to track the building stock (subject to exoneration) assists obliquely in appreciating facets of proletarianization, provided one recalls that it does not track numbers of households.

The present study qualifies the nature of this urban-industrial ensemble. It stands as a warning against simplistic imagining of this form of urbanization as a force 'that covered the hills and valleys of Lancashire and the West Riding with the factory towns that were to introduce a new social type for the world to follow'.98 In the township examined, physical urbanization after 1804 did not engage forces strong enough to increase the flow of housing output substantially. The present study shows the modest scale of the particular physical developments that contributed to urban growth, and their configuration relative to each other. It points to the persistence of the cottage farms and the demand of industrial capitalists for grazing and pasture, which partly underpinned this intermingling of agriculture and industry. Mottram became part of the vast scattered city that Bamford described in 1844, a place where people, although deeply engaged in urban social movements, did not spend their 'lives in the confined streets of large towns, shaded alike from the winter's wind and the summer's sun', as in the imaginings of John Revans, scourge of the Chartist Land Plan.99 Although portrayed by a Royal Commission in that same year as part of the third largest town in the country outside London, its households were as deeply rooted in small-scale agriculture as in textiles, and the large town was merely a geostatistical artifact.100
The evidence of the Land Tax chains allows for reflection on the strategies open to specific actors, and shows, before 1805, Tollemache and his steward apparently pursuing a form of planning barely discussed.101 Favouring subdivision of holdings, and turning their backs on conventional wisdom, they pursued an approach to promoting cottage farming far removed from any sort of paternalism, avoiding all capital expenditure but employing regulation in a period of local economic expansion. Moreover, they seem to have recognized that while issuing ninety-nine-year building leases would generate only modest income, it unlocked development potential, in turn stimulating increased economic output, a portion of which would accrue to the estate in the form of rack rent on the adjoining land. Whatever the ideological position, the material benefits to the landlord of extending the 'cow and cottage' system outweighed those offered by industrial urbanism.

conclusions

Finally, some broader conclusions are offered about aspects of the Land Tax returns, and about the nature of the social relationships implied, which are thrown into high relief when the methods developed here are adopted. Although the chain perspective deploys an interpretative strategy which emphasizes the relation between line entries in successive returns rather than the individual line entries alone, its insights carry implications for more familiar approaches to the Land Tax. It demonstrates, for example, that without an understanding of the pattern of assessments across a township it is not possible to interpret change at individual properties.102

Chaining reveals, moreover, the variety of linguistic descriptions which may be applied to the same enduring referent, and thus stands as a warning against overly nice interpretation of particular terms. It shows, in the context considered, that terms such as 'house' and 'cottage' do not clearly pick out different property types. Neither can 'a cottage' be distinguished from 'a cottage and croft', or a 'house' from a 'house and land'. (In the circumstances, both should be understood to include small areas of land.) Such variant terms are variously applied in different years without any change in the proprietor, occupier or, most critically, the notional annual value of the property described. No more can a 'cottage' be distinguished from 'cottages', though the plural is rarely used. In fact, comparison with estate documentation shows that reference to 'a cottage' in a line entry may include a number of dwellings, and typically denotes more than one.103 Chaining also indicates that assessors were varyingly assiduous in carrying out their duties, and more significantly points to limited systematic local change in the use of language over time. In some years, sharper (although not necessarily definite) descriptions are provided (e.g. 'a croft, late Woolleys' or 'Barber's cottage; Bretnor's field'). For other years, particularly in the 1820s when development was locally very limited and holdings static, bundle descriptions (though present) are very bland. In the Mottram returns, the manner in which the use of the term 'farm' shifted as the mixed economy developed is, however, of more significance. As subdivision continued, the term 'farm' came to be subordinated to 'house' within the returns (as in the entry 'house and farm'), and to refer to smaller and smaller holdings, denoting perhaps a single close.
It seems no accident that the description 'Barber's Cottage; Bretnor's field' later became 'Barber's Cottage; Bretnor's farm', and that the field in question had been divided into two. Despite the fact that only four of the 356 households in the township in 1821 were primarily dependent upon agriculture,104 by 1828 the Land Tax returns describe 61 units within the township as 'X Farm ...' or 'house and farm ...' and a further ten as 'part of farm ...'. Crucially, however, the sustained investigation demanded suggests that, at least in this case, the returns have more integrity than either Mingay's105 entirely dismissive view or even Noble's106 more detailed examination might incline us to believe. The prime purpose of the returns was to communicate liability as economically as possible, and if our intentions are different, it seems reasonable that we should pay the price. The present approach makes great demands of the returns and of related sources. In its pursuit, mismatches between the Land Tax returns and estate documentation have repeatedly been found to be explicable, and the sources to have different but complementary strengths.

Difficulties stem from the extended pyramidal nature of property-holding and occupancy, which eludes the simple distinction between 'proprietor' and 'occupier' of the Land Tax returns, and from occasional legal uncertainties. As local practice in Mottram treated the Earl of Stamford and Warrington's lifeholders as proprietors for Land Tax purposes, the freeholder went almost unrecorded in the returns. The local assessor's unexpected identification of the Earl as proprietor in a line entry for 1822 in Chain 60 proves consistent with the agent's supposition that the property was likely to be forfeit following the death of the tenant in 1819. Line entries for the years until 1829 reflect actual occupation and shifting assumptions of ownership, until it became clear that another of the three lives had survived.107 Those holding ninety-nine-year leases determined by lives from the Tollemache estate were not, however, consistently treated as proprietors. These anomalies are, however, a characteristic of the complexity of land ownership, not of the inadequacy of the returns.

Difficulties in matching 'occupiers' from the Land Tax returns with 'tenants' from estate documentation are thus to be expected. Occasional notes in estate records serve as a reminder of the depth of the landholding pyramid. Given a reference to property let by Tollemache 'occupied by Joshua Wagstaff and Benjamin Holdgate under James Shaw under John Reddish', it is not clear which 'occupier' should be expected in the Land Tax returns. The repeated instances where leases are granted in the name of one partner while another is recorded as occupier should not occasion surprise. Alternating occupiers may also be expected to appear in the Land Tax returns in such circumstances, and the inter-temporal adjustments noted provide a further reason for alternating occupancy. Divergences between tenancy and occupation may be expected to carry meaning even if it cannot always be recreated. The Mottram Land Tax returns record John Harrison as the occupier of Tollemache's Titterton Farm (Chain 36) in 1813, but Robert Bennett is recorded as lessee in the 1813 'tack'. In this case, Bennett's own account book survives, showing the terms on which it was indeed sub-let to Harrison.
Neither is John Hadfield found in estate documentation through the 1820s, despite appearing as the occupier of Tollemache land in the Land Tax returns. In this case, however, a later note in an estate document claims that 'John Hadfield, joiner, has held W1, W2 and W3 for all the present lease' - again showing the distinction between the lessee and the occupier.108 There seems a very real possibility, at least in the township considered, that the Land Tax returns provide a more accurate record of actual occupation than the evidence of leases. It is presumably easier for the modern-day analyst observing mismatch to conclude that the returns are deficient than for the assessor or collector of the Land Tax to justify a baseless demand.

Ultimately, this investigation raises the question of what must be true of the organization of society at the time in question for this manner of reconstruction to be possible, and what aspects of that social organization underlie the difficulties and limitations. Obviously, despite the contemporary belief that many went untaxed, in the locality considered regular partial updating of valuations allowed for the inclusion of newly built property, local changes in poundage were applied, and a local comprehensive 'revaluation' in 1822 systematically shifted the relative value of agricultural, domestic and industrial property. The possibility of forming the chains, however, demands much more - requiring (and hence providing evidence of) a high degree of consistency in practice at local level in a time of change. Where chains are broken, sustained investigation has usually resolved the problem, demonstrating that breakage is more usually the result of complexity than of oversight. This implied order and stability rests on the mentality of part-time assessors of a 'middling sort', whose common sense also provided for the local administration of the Elizabethan Poor Law. It was this mentality which compensated for the deficiency of the Land Tax returns as a 'technology of power'.109 The mechanisms of government were crucially undeveloped. Government's lack of awareness of, or concern for, the operation of the Land Tax at local level as late as the 1830s is amply demonstrated in the evidence provided to a Royal Commission by the officials responsible at national level.110 Governing at a distance was hardly possible.111 The Pennine fringe at the end of the eighteenth century was barely a 'geocoded landscape': the 'spatial regime of inscriptions', in Rose-Redwood's terms, was poorly developed.112 The underdeterminacy (and occasional inaccuracy) of the line entries, however, would only have prejudiced the original purpose of a return in the absence of collectors and occupiers whose local knowledge allowed them to appreciate its assumptions and draw necessary inferences, resolving the problems of reference, both personal and geographical. In the terminology of relevance theory, the message of each line entry is linguistically communicated, but not (fully) linguistically encoded.113 The core challenge of the current paper has lain in the attempt to compensate for that tacit knowledge, and to reconstruct it to a degree. This, however, is only possible because of an original order.

end notes

1 Aikin, J., Description of the country from thirty to forty miles round Manchester. 1795, London: Stockdale, p. 472.
2 Thirsk, J., 'Industries in the countryside', in The Rural Economy of England, J. Thirsk, Editor. 1984, Hambledon Press: London.
3 Defoe, D., A Tour Through the Whole Island of Great Britain. 1968, London: Dent, p. 602.
4 Wadsworth, A.P. and J.D.L. Mann, The Cotton Trade and Industrial Lancashire 1600–1780. 1931, Manchester: Manchester University Press, p. 317–21; Walton, J.K., 'Proto-industrialisation and the first industrial revolution: the case of Lancashire', in Regions and Industries: A Perspective on the Industrial Revolution in Britain, P. Hudson, Editor. 1989, Cambridge University Press: Cambridge. p. 41–68, at p. 60.
5 Stobart, J., The First Industrial Region: North-West England, c.1700–1760. 2004b, Manchester: Manchester University Press, p. 143, p. 151; also Aikin, Description of the country from thirty to forty miles round Manchester, p. 472.
6 Wadsworth and Mann, The Cotton Trade and Industrial Lancashire, p. 311.
7 Bamford, S., Walks in South Lancashire. 1844, Blackley: S. Bamford, p. 11.
8 See p. 968 of Marfany, J., 'Is it still helpful to talk about proto-industrialization? Some suggestions from a Catalan case study', Economic History Review, 2010. 63(4): p. 942–973.
9 See p. 59 and p. 61 of Walton, J.K., 'Proto-industrialisation and the first industrial revolution: the case of Lancashire', in Regions and Industries: A Perspective on the Industrial Revolution in Britain, P. Hudson, Editor. 1989, Cambridge University Press: Cambridge. p. 41–68.
10 ibid., p. 68.
11 Levine, D., Reproducing families: the political economy of English population history. 1987, Cambridge: Cambridge University Press, p. 111.
12 ibid., p. 117.
13 See for example van Bavel, B.J.P., 'Early Proto-Industrialization in the Low Countries? The Importance and Nature of Market-Oriented Non-Agricultural Activities in the Countryside in Flanders and Holland, c. 1250–1570', Belgisch tijdschrift voor filologie en geschiedenis / Revue belge de philologie et d'histoire, 2003. 81: p. 1109–65; Gray, J., Spinning the Threads of Uneven Development: Gender and Industrialization in Ireland during the Long Eighteenth Century. 2005, Lanham: Lexington; Hudson, P., 'Proto-Industrialization in England', in European Proto-Industrialization, S. Ogilvie and M. Cerman, Editors. 1996, Cambridge University Press: Cambridge.
14 Kent, N., Hints to Gentlemen of Landed Property. 1792, Dublin: Lawyer's And Magistrate's Magazine.
15 See Marshall, W.H., On The Management Of Landed Estates: A General Work For The Use Of Professional Men. 1806: Printed for Longman, Hurst, Rees, and Orme.
16 Ginter, D.E., A Measure of Wealth: The English Land Tax in Historical Analysis. 1992, London: Hambledon Press.
17 Turner, M. and D. Mills (eds.), Land and Property: The English Land Tax 1692–1832. 1986, Gloucester: Alan Sutton.
18 Examples for this locality are found in Nevell, M. and J. Walker, Tameside in Transition. 1999, Ashton-under-Lyne: Tameside Metropolitan Borough Council; and Haynes, I., The Cotton Industry in Hollingworth and Mottram-in-Longdendale. 2008, Ashton-under-Lyne.
19 Hunt, H.G., 'Land Ownership and Enclosure 1750–1830', Economic History Review, 1959. 24: p. 497–505.
20 Henstock, A., 'House Repopulation from the Land Tax Assessments in a Derbyshire Market Town, 1780–1825', in Land and Property: The English Land Tax 1692–1832, Turner, M. and D. Mills, Editors. 1986, Alan Sutton: Gloucester. p. 118–135.
21 This was a particular concern of writers such as Gray, H.L., 'Yeoman Farming in Oxfordshire from the Sixteenth Century to the Nineteenth', Quarterly Journal of Economics, 1910. 24: p. 293–326, and Davies, E., 'The Small Landowner, 1780–1832, in the Light of the Land Tax Assessments', Economic History Review, 1927. 1: p. 88–89.
22 Noy, W., The principal grounds and maxims: with an analysis of the laws of England. 1821, London: Sweet, p. 292.
23 Harrop, J., P. Booth, and S.A. Harrop, Extent of the lordship of Longdendale 1360. 2005, Record Society of Lancashire and Cheshire, p. xxxiii.
24 On statutory provisions and practice see evidence of Wood and Garnett included within Second report from the Select Committee appointed to inquire into the state of agriculture; with the minutes of evidence, and appendix, 1836, (189), p. 259. Mottram was included within, and subject to quotas for, the Stockport Division of Cheshire.
25 See Ginter, D., 'The Incidence of Revaluation', in Land and Property: The English Land Tax 1692–1832, Turner, M. and D. Mills, Editors. 1986, Alan Sutton: Gloucester. p. 180–188.
26 Despite frequent claims that the Land Tax poundage was fixed at 4s in the pound (20%) from 1798, this appears to be a statutory maximum. See Miller, S., The Laws Relating To The Land Tax, Its Assessment, Collection, Redemption And Sale. 1849, London: Sweet, p. 3. Latitude at township level to alter both poundage and valuation is discussed in Ginter, A Measure of Wealth.
27 Between 1780 and 1832 Land Tax fell almost entirely on real property. Tithes, in principle analogous to rent, remained subject to Land Tax and have been variously treated by previous analysts. Gray, 'Yeoman Farming', ignored them; Hunt, 'Land Ownership and Enclosure', attempted to exclude tithes in situations where before enclosure they were received in kind. Ginter, A Measure of Wealth, systematically included tithes. Tithes would be pertinent to this study only if commuted into land, which does not appear to be the case. A small range of specific income streams remained liable to Land Tax, including official salaries (at a nationally constant poundage of 20s in £100, i.e. 1%). See evidence of Wood and Garnett in Second Report, p. 268. Some line entries in the Mottram return refer to the salaries of excise officers.
28 Despite claims (by authorities including Miller, The Laws Relating To The Land Tax, p. 11) that property with an annual value of less than £1 ceased to be liable from 1798, the stipulation of s80 of the 1797 Act refers to the exemption of persons whose entire real property is worth less than £1, and merely extends previous provision. See for example Burn, R., The Justice of the Peace, and Parish Officer. 1772, London: Cadell, p. 45. There appear to be no consequent changes in line entries in Mottram in the years immediately after 1798, and it is clear that locally property bundles with lower annual values were taxed where the aggregate values of property held by those liable exceeded £1.
29 From 1798, when Land Tax became perpetual, proprietors or occupiers (other than tenants at rack rent) were allowed to redeem their liability by payment of a lump sum (equivalent to fifteen years' purchase). See p. 284 of Hunt, H.G., 'Land Tax Assessments', Economic History Review, 1966. NS 11: p. 283–286. They were thereafter 'exonerated', and although still listed in the returns, subsequent assessments of the annual value of their property remained constant, and so further development cannot be traced.
30 From 1798, further provisions allowed that if an owner did not redeem their Land Tax, another private person or group of people ('the redemptioner') could redeem but not exonerate that tax. Under this arrangement the Government received the lump sum as if the land tax had been exonerated, but continued collecting the tax for the redemptioner. (These arrangements were repealed in 1802.) Property subject to these arrangements was still liable to reassessment. See Wood and Garnett's evidence in Second Report, p. 256. At least locally, such property was not separately identified in the returns.
31 An Act of 1817, confirmed and amended by further statutes, allowed ecclesiastical and other bodies to sell property in order to redeem Land Tax liability. See Miller, The Laws Relating To The Land Tax, p. 242. From 1818, the Church's liability with respect to both the Glebe Land and the Tithes in Mottram township was exonerated.
32 Provision for appeal is found in the statutory timetable for administering the Tax as discussed by Beckett, J.V., 'Land Tax Administration at the Local Level, 1692–1832', in Land and Property: The English Land Tax 1692–1832, Turner, M. and D. Mills, Editors. 1986, Alan Sutton: Gloucester. p. 161–179. In some cases the returns themselves indicate provision for appeal.
33 Although Catholics who had not sworn an oath of allegiance and supremacy and who refused to do so were liable to double tax, Beckett, 'Land Tax Administration at the Local Level', p. 166, suggests that this did not usually occur in practice.
34 National administrators doubted that liability was revised as development occurred, and assessors might have avoided reassessment 'in the interests of local harmony' (Beckett, 'Land Tax Administration at the Local Level', p. 170). Ginter, 'The Incidence of Revaluation', and Ginter, A Measure of Wealth, point to the frequency of reassessment of individual property.
35 This possibility is raised in Beckett, 'Land Tax Administration at the Local Level', p. 163. Note also that case law established that while the tenant might deduct the Land Tax from his rent, the amount deducted could only be the sum for which the property would be liable had it remained in the form it was when the tenancy began, with liability for any improvements falling upon the tenant.
36 E.g. Sayer, B., An Attempt To Shew The Justice And Expediency Of Substituting An Income Or Property Tax For The Present Taxes. 1833, London: Hatchard; and Miller, The Laws Relating To The Land Tax.
37 Lord Fitzwilliam, Letter to Laurence French, 10th or 17th Dec 1798, Box X51 5/138, Milton Manuscripts, Northamptonshire Record Office, quoted in Ginter, A Measure of Wealth, p. 110.
38 Beckett, 'Land Tax Administration at the Local Level', p. 171.
39 The validity of this procedure can be tested by considering the frequency with which the standardized assessment remains constant from one year to the next in circumstances where the owner, occupier and description of the 'property bundle' remain the same. In 1,542 such cases the hypothesis of constancy holds; in 34 it does not. Those 34 cases appear to reflect substantive change of the types of concern, allowing the conclusion that the fundamental stability of the (standardized) assessments provides a basis for identifying material change.
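The test reported in note 39 is easy to mechanize once the returns are in tabular form. A minimal sketch, with a hypothetical tuple layout rather than the study's actual data model:

```python
# Illustrative check of assessment constancy (cf. note 39). Each entry is a
# hypothetical (year, owner, occupier, description, standardized_assessment)
# tuple; the study's real data model is not reproduced here.

def constancy_counts(entries):
    """Count year-on-year pairs whose owner, occupier and description all
    match, split by whether the standardized assessment also held constant."""
    last_seen = {}
    held = violated = 0
    for year, owner, occupier, desc, assessment in sorted(entries):
        key = (owner, occupier, desc)
        if key in last_seen and last_seen[key][0] == year - 1:
            if abs(last_seen[key][1] - assessment) < 1e-9:
                held += 1
            else:
                violated += 1
        last_seen[key] = (year, assessment)
    return held, violated  # the paper reports 1,542 and 34 such cases
```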
40 Miller, The Laws Relating To The Land Tax.
41 Highway Rate book for the township of Mottram, 1818–19, DD8/8A, Records of the Heginbottom family, Tameside Local Studies and Archives, Ashton-under-Lyne.
42 Most Land Tax payers' liability fell slightly, but the assessment of Samuel Matley's printworks almost doubled, and that of Beckett's Hodge Mill increased by over 40%. Land Tax due on the larger of the township's small farms in 1822 typically fell to two-thirds of the 1821 assessment, while the sum due on blocks of cottages with little associated land either remained almost constant or increased slightly. Given the large number of distinct property values recorded, and the very low likelihood that a new survey had been undertaken, it seems possible that the 'valuations' underlying the Land Tax assessments from 1822 onward rest on a combination of more than one source.
43 Limiting consideration to cases where the sum assessed was less than 10s (implying £1 NAV0), regression was used to estimate NAV1 on the basis of NAV0 for chains starting in different circumstances (censored, expected built, split), with estimated values of NAV1 ranging from 90% to 96% of NAV0 (95.2% for expected built property). On this basis, NAV0 for new property has been treated as being 1.052632 times the value of NAV1.
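The factor quoted at the end of note 43 appears to be the simple reciprocal of a 95% working ratio, since 1/0.95 = 1.052632 to six decimal places - an inference from the quoted figure rather than a derivation stated in the text:

```python
# Reading the note 43 factor as the reciprocal of a 95% ratio of NAV1 to
# NAV0 for new property. The 95% working figure is inferred, not stated.
ratio = 0.95             # assumed NAV1 / NAV0
factor = 1 / ratio       # so NAV0 = factor * NAV1
print(round(factor, 6))  # 1.052632, matching the factor applied in the study
```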
44 Ginter, A Measure of Wealth, p. 14.
45 Tithes were leased from 1768 to 1808 to William Ulithorn Wray, Rector of Darley, Derbs, and afterwards to his widow. See Lease for 3 lives (copy 21 Nov 1818), 1768, P 25/8/13, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester.
46 From time to time, excisemen were resident in the township, and in principle there is a possibility that as their contribution to meeting the quota rose and fell, the contributions of other taxpayers might alter correspondingly. It is clear, however, from Figure 1 that no such adjustments were made.
47 Ginter, 'The Incidence of Revaluation', p. 182.
48 See Ginter, 'The Incidence of Revaluation'; Ginter, A Measure of Wealth.
49 References to numbers of cottages, undertenants etc. are found in Tollemache (Wilbraham of Woodhey) Collection, DTW series, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester.
50 Local assessors were not consistent in their treatment of holdings of this last type - lessees being considered as proprietors in some years and in some cases, but not all.
51 This notional value is very much less than the actual rental value, though this was not material to the local operation of the Land Tax in the period in question (or of consequence for the estimation of area equivalents).
52 Unfortunately, although there are good descriptions of the buildings in holdings leased from Tollemache for 14 years from 1799, the range of descriptors seems too variable to provide useful measures of influences on the assessed value of built property. In some cases the number of houses or cottages is included, in others the numbers of bays of building together with their age and quality.
53 These are shown in Hilditch, R., Aristocratic Taxation: Its Present State, Origin And Progress, With Proposals For Reform. 1843, London: Simpkin and Marshall, p. 30.
54 Any principle that bundles with an annual value less than £1 were exempt from Land Tax, either from 1798 or throughout - supposedly grounded in statute - is disputed (see Endnote 29).
55 It is also possible that local assessors might seek to avoid taxing the least valuable cottages, even though this was not a legal requirement. Equally, the steward of a major landowner (treating the aggregate tax liability of tenants as a deduction from total rent) might seek to determine the share of an estate's Land Tax liability to be placed on each tenant. This practice is advised in Mordant, J., The Complete Steward. 1761, London: Sandby. Mordant's exemplar estate accounts place no Land Tax responsibility on cottage tenants (vol. 2, p. 16–18). Although he recommends this on the basis that the manor includes all the property within the township, he adds that 'where the Lord owns only part of the land ... the tax is to be proportioned to each tenant exactly (if he pays it) or if not to the value of the whole estate compared with others &c by the rule of proportion', ibid., p. 17. If very small potential liabilities are spread across the entire estate, an untaxed residuum will exist, but the distinction between clumping and omission becomes a fine one.
56 See Bratko, PROLOG Programming for Artificial Intelligence. 2011, Boston: Addison-Wesley.
57 See Pereira, F.C.N. and S.M. Shieber, Prolog and natural-language analysis. 2002, Brookline MA: Microtome Publishing.
58 See Siler, W. and J.J. Buckley, Fuzzy expert systems and fuzzy reasoning. 2005, New York: Wiley.
59 Key sources amongst these are Miller, The Laws Relating To The Land Tax, and Ginter, A Measure of Wealth.
60 Peirce, C.S., 'Harvard Lectures on Pragmatism', in Collected Papers of Charles Sanders Peirce, Hartshorne, C. and P. Weiss, Editors. 1903, p. 171–172.
61 Henstock, 'House Repopulation'.
62 ibid., p. 124.
63 Various Land Tax statutes specify that sums less than a halfpenny should be spread between years. See Miller, The Laws Relating To The Land Tax.
64 Comparison of line entries for Tollemache property held by John Goddard and Samuel Cook in 1784 and 1785 provides a simple illustration of this. Between 1784 and 1785 Cook's liability for Land Tax fell by precisely the same amount as Goddard's increased, consistent with the possibility that property transferred between them. In this case, there is evidence from estate documentation that in fact Cook's occupation of Marled Field gave way to occupation by Goddard. No bundle, however, corresponds to Marled Field; it passed from being part of one bundle to being part of another. It therefore cannot be sufficient to suggest that bundles in one year can be represented as combinations of bundles in adjacent years, or that the Land Tax liability carried by a bundle can be decomposed as in [1]. It is possible to define a set of differences that, together with the values for the bundles themselves and in contrast to [1], exhaust all the ways in which any specific bundle could be composed. Without any other information, it is possible (though very far from likely) that, if any set of bundles in year y (for example [C',G] in 1784) carried the same aggregate liability as another set in year y+1 (e.g. [C,G']), they comprised the same property. In some of these cases, however, if for example C' represented Cook's bundle in 1784 (i.e. with Marled Field), G represented Goddard's holding in that same year, and C and G' represented their respective holdings in the following year (G' including Marled Field), then the two differences C'-G and C-G' would represent sums useful for present purposes, and the liability carried by the hidden part can of course be found by subtraction.
65 Cf. Quine, W.v.O., Word and Object. 1960, Cambridge MA: MIT Press; Heller, M., The Ontology of Physical Objects: Four-Dimensional Hunks of Matter. 1990, Cambridge: Cambridge University Press; and Jubien, M., Ontology, Modality, and the Fallacy of Reference. 1993, Cambridge: Cambridge University Press.
66 Quine, Word and Object.
67 In principle, it would be appropriate to combine such simple prior probabilities with further subjective probabilities based on the evidence of the strength of similarity of the line entries. In the example above, the prior probability P(H) might be seen as the degree of belief in the proposition that when tenant T surrendered his lease, S occupied his cottage, without knowing the identity of S or T. Assessment of the similarity of the line entries provides some additional evidence, which might potentially be combined with P(H), allowing the subjective probability to be revised. According to Bayes' theorem, the revised probability P(H|E), given the evidence of similarity (E), is P(H|E) = P(E|H) × P(H) / P(E), where P(E|H) is the probability of finding that similarity given the proposition, and P(E) is the overall probability of such evidence of similarity being found. P(E) must be estimated as the sum of two components: the probability of finding the evidence of similarity if the hypothesis of continuity were true, P(E|H), weighted by P(H), and the probability of finding that evidence if it were not, P(E|∼H), weighted by 1−P(H).
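The updating described in note 67 can be made concrete with a small worked sketch. The probability values below are invented for illustration - the paper reports none - and the function simply applies Bayes' theorem with P(E) expanded over the two cases:

```python
# Illustrative Bayesian revision of a candidate chain linkage (cf. note 67).
# All numerical values are invented; none come from the study.

def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """P(H|E) by Bayes' theorem, with P(E) expanded over H and not-H."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1.0 - p_h)
    return p_e_given_h * p_h / p_e

# Prior degree of belief that occupier S succeeded tenant T in the same
# cottage, before weighing the similarity of the two line entries:
prior = 0.5
# Close similarity of liability, proprietor and description is assumed far
# likelier if the two entries refer to the same property:
print(posterior(prior, p_e_given_h=0.9, p_e_given_not_h=0.1))  # 0.9
```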
68 The actual pattern of transitions that occurred cannot be known. Moreover, as an element of property can only pass to one chain, there are strong interactions between the probabilities.
69 It combines seven sub-scores, modified by a score for sequence within the line entries, inclusion of continuing parts, and a penalty for complexity. In the middle of the range, a likelihood score of 3 is assigned to a possible linkage between two unmatched line entries for successive years sharing the same liability and proprietor but different occupiers.
70 Reference to occupier names may be the only way to identify a specific bundle. Indeed, contemporary legal opinion held that 'the names of the tenants were only inserted in order to shew for what property the landlords were rated' (Lord Kenyon CJ, R v The Inhabitants of Folkestone, Michaelmas Term 1789). See Durnford, C. and Sir Edward Hyde East, Term Reports in the Court of King's Bench, Volume 3. 1817, London: Butterworth. It is very important, however, that a circumstance where a tenant relinquishes one tenancy and takes on another is not mistaken for a change in the nature of a holding.
71 See bequest of 14-year interest in Tollemache land 'with the tenant right and benefit of renewal thereof': will of Samuel Radcliffe of Mottram in Longdendale, 1797, WS 1797, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester. More generally see Mordant, Complete Steward, p. 360–361, or Marshall, On The Management Of Landed Estates, who asks 'What superintendent who knows the difficulty of procuring a good tenant would wish to discharge him? And no such tenant will readily leave the farm he is settled upon if he find proper treatment' (p. 381).
72 The term rigid designator was introduced in Kripke, S., Naming and Necessity. 1980, Cambridge MA: Harvard University Press, to characterize the relation between a name and its referent. The less orthodox approach here, resting on Jubien, Ontology, Modality, and the Fallacy of Reference, proves helpful in dealing with changing objects of uncertain extent.
73 This might be thought of as analogous to treating perduring objects as a series of stages rather than a 4D space-time worm, as in Sider, T., 'The Stage View and Temporary Intrinsics', Analysis, 2000. 60: p. 84–88.
74 Bennett held a wide range of financial assets including Bank of England debt and a mortgage portfolio, in addition to substantial real property interests including the freehold of substantial cotton mills. See also Endnote 92.
75 In 1813 a printed form was used in Mottram for setting out Land Tax liabilities, annual values were recorded for the first and only time, and care was taken to identify Tollemache 99-year leaseholders accurately, consistently treating them as proprietors. Two additional holdings were assessed for the first time (both at £2 NAV0), both of which were subject to 99-year leases granted some years before (to Band (1798) and Marshall (1786)), one on an existing cottage with an unusually low rent. Assessment of one of these (Band) ceased after 1815, ending the chain, but resumed in 1822, the year of the local revaluation.
76 There are three groups of circumstances in which a patch cannot be located with sufficient precision to identify a land parcel that could be projected on to the national grid. In the first, a series of patches will be known to correspond to an area of land, but the portions of that land belonging to each patch cannot be known. This is typical of patches on the Stamford and Warrington estate, but the bounds of the two patches created by temporary division of Tollemache estate C from 1829, for example, are similarly unknown. The second circumstance involves cottage property (with any associated land) of unknown extent. Such patches might be thought of as occupying unassigned cottage space (the overall distribution of which can be approximated). In principle, the location of property of this second type might be taken further by first representing unassigned cottage space as a grid of probabilities (having taken account of assigned space within the township and various sketch maps and drawings). Having probabilistically represented the entire unassigned cottage space, a particular patch might then be probabilistically located by reference to ordering information from the Land Tax returns and the notional annual value. The third circumstance is where cottage property of this type represents part of a patch.
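Note 76 floats this probabilistic location as a possibility rather than an implemented method. A minimal sketch of the idea, with wholly invented grids and weights, might run:

```python
# Hypothetical sketch of the suggestion in note 76: unassigned cottage space
# as a probability grid, sharpened by a patch-specific weighting derived from
# return ordering and notional annual value. Nothing here reproduces an
# implementation from the study.
import numpy as np

def locate_patch(unassigned_prior, patch_weight):
    """Combine the township-wide prior over unassigned cottage space with a
    patch-specific weight, renormalizing to a posterior location surface."""
    surface = unassigned_prior * patch_weight
    return surface / surface.sum()

prior = np.array([[0.0, 0.2, 0.1],
                  [0.1, 0.4, 0.2]])   # prior over grid cells (sums to 1)
weight = np.array([[1.0, 1.0, 0.5],
                   [0.2, 2.0, 0.5]])  # relative support from ordering and NAV
print(locate_patch(prior, weight))
```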
77 Aikin, Description of the country from thirty to forty miles round Manchester, p. 458.
78 See Chalkin, C.W., The Provincial Towns of Georgian England: A Study of the Building Process, 1740–1820. 1974, London: Edward Arnold.
79 The holding broken up in 1771 included 11 messuages - far more than any other holding within the Mottram estate. Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester.
80 Lubow, L.B., 'From Carpenter to Capitalist: The Business of Building in Postrevolutionary Boston'. 1997, Boston: Northeastern University Press, p. 185.
81 The lease is amongst counterpart leases and expired leases, Mottram, 1786–1899, DTW/2477/F/12, Tollemache (Wilbraham of Woodhey) Collection, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester. In 1792, Bretnor's or Brow cottage is also found away from the village.
82 Chadwick, W., Reminiscences of Mottram. 1882, Stalybridge, p. 7.
83 Malcolmson, R.W., Life and Labour in England, 1700–1780. 1981, London: Hutchinson, p. 26.
84 Between 1785 and 1801 the tenants of Hodge Mill were Marsland, Holt (bankrupt), Moss and Swindells, with a vacant spell between Holt and Moss when Tollemache himself became liable for the Land Tax. John Swindells and his partner John Dale are alternately reported between 1801 and 1804. The partnership between Moss and Swindells was dissolved in 1796, after which Moss with other partners developed Wharf Mill.
85 Cow keeping had been used to attract skilled labour by Samuel Greg at Styal. See also Redford, A., Labour Migration in England, 1800–1850. 1926, Manchester: Manchester University Press, Chapter 2.
86 'Mottram-in-Longdendale has often been spoken of as a finished town, as few, if any one, could speak of new houses being erected, except in place of other houses or repairs'. The Manchester Times and Gazette (Manchester, England), Saturday, July 16, 1836; Issue 403. 'Finished town' seems to have been in fairly frequent use in the nineteenth century.
87 In 1818 rents on properties shown in 'Rent accounts for premises, mainly cottages and small houses in Mottram, Hattersley and Glossop co. Derby, 1806–1837' (DDX563/1, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester) were typically double the rateable value (notionally a measure of annual value) shown in the Mottram Highway Rate book for that year (see Endnote 41). A comment in a survey of 1826 (DTW/29, Tollemache (Wilbraham of Woodhey) Collection, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester) confirms the unexpectedly high yield of cottage property.
88 The Land Tax shows virtually no development on the Tollemache or Stamford and Warrington estates in this period, nor does the Tollemache archive in Chester Record Office (DTW, Tollemache (Wilbraham of Woodhey) Collection, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester). The Sidebottom brothers were recorded as owners of 34 units (all built since 1801) in the Highway Rate book (see Endnote 41).
89 Expansion of the Sidebottom enterprise between 1802 and 1834, with the operation of a second mill on the site from 1815 and a third from 1827, is summarised in their evidence to the Royal Commission on the Employment of Children in Factories: Factories Inquiry Commission, Supplementary report of the Central Board of His Majesty's commissioners appointed to collect information in the manufacturing districts, as to the employment of children in factories, and as to the propriety and means of curtailing the hours of their labour, 1834 (167). Its scale relative to other local cotton mills can be gauged by the Crompton census of 1811–1812. The expansion of the Matleys' Hodge printworks can be gauged in part from the Land Tax returns, but its employment cannot be estimated before 1843 (Resolution of confidence in Richard Cobden and his work towards the repeal of the Corn Laws, 1843, employees in the calico printing works of Richard Matley of Hodge, 140 signatures, COBDEN/551, West Sussex County Record Office, Chichester).
90 The block at Hodge (Chain 835) is described at p. 37–38 of Nevell, M., 'The Archaeology of Industrialisation and the Textile Industry: the Example of Manchester and the South-western Pennine Uplands During the 18th Century (Part 1)', Industrial Archaeology Review, 2008. 30(1): p. 33–48.
91 The 312 'houses' of the 1821 Census for Mottram township should be compared with the 220 of the 1801 Census and the 175 separately rated properties in the 1818 Mottram Highway Ratebook (see Endnote 41), of which 34 were owned by the Sidebottom brothers.
92 See Rent accounts and papers relating to the affairs of John [sic] Bennett of Mottram in Longdendale, solicitor and property owner, 1806–1837, DDX563/1, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester. His role subsumes those discussed by Anderson, B.L., 'The attorney and the early capital market in Lancashire', in Liverpool and Merseyside, J.R. Harris, Editor. 1969, Cass: London. p. 50–77.
93 Apart from Bennett's control of the Titterton Farm, he received rents on the 'cottages by the Church yard side' that estate documentation indicates had been let with it. Whatever his relationship with Kershaw's freehold at Harryfields, his account book shows a stream of rents coming from Bowers, its tenant, and this role continued when ownership of the Harryfields freehold shifted to the Sidebottom Brothers.
94 DTW2406/30, Tollemache (Wilbraham of Woodhey) Collection, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester.
95 Despite the availability of a recently commissioned series of local studies, and despite the fact that property remaining from the boom of 1785–1795 lends the present-day village much of its physical character, a recent conservation area appraisal (Tameside 2011) demonstrates that this decisive episode in its development remains unknown.
96 These events are, however, documented in Nevell and Walker, Tameside in Transition; Sayer, M., Broadbottom 1795–1975: A History. 2007, Broadbottom: Broadbottom Community Association; and Haynes, The Cotton Industry in Hollingworth and Mottram-in-Longdendale.
97 The current narrative could potentially be enriched by sustained analysis of Bennett's account books (DDX563, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester), but this would be difficult without the organizing framework of the Land Tax chains.
98 Hammond, J. and B. Hammond, The Skilled Labourer. 1919, London: Longmans, Green & Co, p. 4.
99 The phrase is from Revans' evidence in Fourth report from the Select Committee on the National Land Company; together with the minutes of evidence, 1847–48 (503), p. 38.
100 Appendix 1 of First Report of the Commission on the State of Large Towns and Populous Districts, 1844–5, (572), provides population and mortality statistics for the Ashton and Oldham Registration District (Appendix p. 1). This area had a population of almost 174,000 - much greater than Birmingham or Leeds - but only 22,700 people (i.e. 13%) lived in Ashton itself, the 'large town' examined subsequently. The area included both Mottram and Bamford's Middleton, 16 km away (see Endnote 7); the true character of Bamford's vast scattered city being evident from the Ordnance Survey six-inch maps of Lancashire of c. 1848.
101 The specific case is discussed tangentially in Mathews, S., 'The Cheshire estates of John Tollemache of Peckforton, 1861–1872', Transactions of the Historic Society of Lancashire and Cheshire, 2005. 154: p. 117–136. That examination is critical of the consequences of granting 99-year leases, wrongly assuming that they covered agricultural land rather than building plots.
102 One local example of this failure overlooks the total local revaluation in 1822 and treats the change at the Hodge Printworks as vast investment (Haynes, The Cotton Industry in Hollingworth and Mottram-in-Longdendale).
103 Given concern with the provision of shelter, and the frequent references in Tollemache estate notebooks to the effect that dwellings have been sub-divided ('house in two dwellings') or recombined, the lack of concern with numbers of units is easily understood. The work reported here indicates that, at least in this particular circumstance, it is possible to track the creation of dwelling space, but not the number of dwelling units.
104 Abstract of the answers and returns made pursuant to an act, passed in the first year of the reign of His Majesty King George IV, intituled 'an act for taking an account of the population of Great Britain, and of the increase or diminution thereof'. 1822 (502).
105 Mingay, G.E., 'The Land Tax Assessments and the Small Landowner', Economic History Review, 1964. 17: p. 381–388.
106 Noble, N., 'The Land Tax Returns in the Study of the Physical Development of Country Towns', in Land and Property: The English Land Tax 1692–1832, Turner, M. and D. Mills, Editors. 1986, Alan Sutton: Gloucester.
107 Stamford property was held by tenants on leases for three lives, rather than for a fixed term (cf. Clay, C., 'Lifeleasehold in the Western Counties of England 1650–1750', Agricultural History Review, 1981. 29: p. 83–96). On payment of a fine, on the death of the first or second life, tenants might make up their three lives again. This is a form of tenure distinct from that of 'leasehold for lives-determinable-on-years' found after 1786 on the Tollemache estate, which occupied an intermediate position between lifehold and a lease for a fixed term. Divergence between Land Tax and estate documentation in this particular case arises from uncertainty over the whereabouts of an individual who, deserting his wife, surreptitiously left the district more than 25 years before. See Letter, Joshua Hegginbottom to Worthington & Nicholls, 1829, Hattersley Building Grounds, Box 2 No 38: Hill: 1768–1829, Ashton-Stalybridge from Enville, Tameside Local Studies and Archives, Ashton-under-Lyne.
108 Valuation (detailed) of the Mottram, Micklehurst and Arnfield Tollemache estate, 1811, DTW2406/30, Tollemache (Wilbraham of Woodhey) Collection, Cheshire Archive and Local Studies Service, Cheshire Record Office, Chester.
109 The term is used in the sense of Rose-Redwood, R. and A. Tantner, 'Introduction: Governmentality, House Numbering and the Spatial history of the modern city', Urban History, 2012. 39: p. 607–613.
110 Wood and Garnett's evidence in Second Report, p. 265–268.
111 Cf. Miller, P. and N. Rose, 'Governing economic life', Economy and Society, 1990. 19: p. 1–31; Barry, A., 'Lines of communication and spaces of rule', in Foucault and political reason: liberalism, neo-liberalism and rationalities of government, Barry, A., T. Osborne and N. Rose, Editors. 1996, University of Chicago Press: Chicago. p. 123–141; and Rose, N., Powers of Freedom: Reframing Political Thought.
1999, Cambridge: Cambridge University Press. Rose-Redwood, R., ‘Governmentality, Geography, and the Geo-Coded World’. Progress in Human Geography, 2006. 30(4): p. 469–486, esp. p470. Eg Sperber, D., and D. Wilson, Relevance: Communication and Cognition. 1986, Oxford: Blackwell. work_4dj4z77x55bbdgfvnbthd7gmdy ---- LAB14_001 Laboratorio dell’ISPF, XI, 2014 DOI: 10.12862/ispf14L001 QUESTO NUMERO L’undicesimo fascicolo del «Laboratorio» si apre con la riproduzione anastatica di un raro volume celebrativo di inizio Settecento, il Pubblico funerale per Carlo di Sangro e Giuseppe Capece (Napoli, 1708), contenente un ragguaglio storico e alcu- ni versi di Giambattista Vico. È una scelta in continuità con la tradizione della nostra rivista, che risponde in questo caso anche ad una lieta occasione esterio- re, cioè l’avvio di una nuova collaborazione tra l’ISPF e la «Fondazione Pietro Piovani per gli studi vichiani», nella cui preziosa Collectio viciana è stato selezio- nato l’esemplare qui riprodotto. Lo abbiamo scelto sia per l’eleganza grafica sia per la sua natura di documentazione minore ma necessaria dell’opera di Vico e del suo ambiente, ben corrispondente all’ispirazione che accomuna le due istituzioni coinvolte, cioè la Fondazione dedicata a Piovani e l’Istituto erede del «Centro di studi vichiani» da lui fondato, e che adesso vive anche nel progetto di biblioteca digitale intrapreso dall’ISPF. Siamo alle battute finali del lavoro di digitalizzazione per il Portale Vico dei fondi storici della biblioteca dell’Istituto, grazie al cofinanziamento dell’Unione Europea per il tramite della Regione Campania, di cui si è già riferito nei fascicoli precedenti. A questo si è aggiunta recentemente, grazie a un accordo con la Biblioteca Nazionale di Napoli «Vit- torio Emanuele III», l’acquisizione digitale dell’intero, straordinario patrimonio di manoscritti, postillati ed editiones principes di Vico ivi conservato, la cui pubbli- cazione sul Portale è prevista a breve. In questo quadro non si può non salutare con entusiasmo la collaborazione con la Fondazione Piovani – voluta dal suo presidente, Fulvio Tessitore, in passato anche direttore del «Centro di studi vichiani» – e la possibilità di acquisire in digitale un’impareggiabile biblioteca vichiana “d’autore”. Altre due novità vanno segnalate al riguardo: la creazione all’interno dell’Istituto di un Centro di Umanistica Digitale1 destinato a dare continuità e prospettiva alle attività così avviate, e soprattutto, per la nostra rivista, la nascita, al suo fianco, di una collana di volumi in forma di e-book, i «Quaderni del Lab»2, che ha visto quest’anno la pubblicazione dell’edizione digitale annotata di due significativi testi della cultura meridionale: il Della Mente sovrana del mondo dell’abate antispinozista Tommaso Rossi (1743) e i Diari (1800- 1808) – finora inediti – del medico evoluzionista Giosuè Sangiovanni, dispo- nibili in open access sullo stesso sito della rivista3. Accanto a queste e ad altre attività nel campo della storia del pensiero, l’ISPF ha dato seguito ai lavori dell’«Osservatorio sui saperi umanistici» con un nutrito calendario di incontri. La sezione relativa di questo numero presenta i materiali dei primi seminari del 2014 (i successivi troveranno posto, anche sotto 1 Cfr. . 2 Cfr. . 3 Rispettivamente agli indirizzi e . 
THIS ISSUE

The eleventh issue of our «Laboratorio» opens with the anastatic reproduction of a rare early eighteenth-century celebrative volume: the Publicum Caroli Sangrii et Josephi Capycii, Nobilium Neapolitanorum, Funus (Naples, 1708), which includes a historical report and some verses by Giambattista Vico. This choice continues our journal's tradition and also responds, this time, to a happy external occasion: namely the start of a new collaboration between the ISPF and the «Fondazione Pietro Piovani per gli studi vichiani», the copy we reproduce having been selected from the latter's precious Collectio viciana. We have chosen this book both for its remarkable graphic elegance and for its nature as a minor but necessary document of Vico's work and milieu, in keeping with the common inspiration shared by two institutions such as the Foundation dedicated to Piovani and the Institute that has inherited the legacy of the «Centro di studi vichiani» founded by Piovani himself. The same inspiration animates the ISPF's digital library project. We are now attending to the final steps of the digitalization of the historical collections of the Institute's library, thanks to the co-financing offered by the European Community and the Regione Campania, which has already been mentioned in past issues.
Moreover, a recent agreement with the National Library of Naples «Vittorio Emanuele III» has permitted us to acquire a complete digital version of its extraordinary patrimony of Vico's manuscripts, annotated copies and editiones principes, which we are going to publish soon on our Portale Vico. In this framework, we enthusiastically inaugurate our collaboration with the Fondazione Piovani – promoted by its president, Fulvio Tessitore, once director of our «Centro di studi vichiani» – and the possibility of including in our digital collection a Vichian library incomparable both for its contents and for its "author".

In this respect we must mention two other new initiatives. The first is the creation, inside our Institute, of a Center for Digital Humanities,4 where these activities will find a perspective of continuity. The second innovation regards more directly our journal, now flanked by a new series of e-books, the «Quaderni del Lab».5 This series started this year with the digital edition of two relevant texts of Southern Italian cultural history: Della Mente sovrana del mondo by the anti-Spinozian abbot Tommaso Rossi (1743), and the journal, unpublished until now, that Giosuè Sangiovanni, an evolutionist physician and Neapolitan revolutionary, kept from 1800 to 1808. The two volumes are available on the same site as this journal.6

4 See .
5 See .
6 They are respectively available at the following addresses: and .

Besides these and other initiatives, the ISPF has continued the activities of the «Osservatorio sui saperi umanistici» with a substantial calendar of conferences. The corresponding section of this issue offers the texts of the first seminars of 2014 (the later ones will find a place in next year's issue). These include two articles by Andrea Battistini and Dario Generali, taking their cue from the new edition of A. Vallisneri's Che ogni italiano debba scrivere in lingua purgata italiana in order to discuss the meaning of the recent projects that aim to introduce English-language courses into the Italian higher education system, as well as a critical review of the geo-linguistic landscape of the Digital Humanities, proposed by Domenico Fiormonte as a preview of his book The Digital Humanist: A Critical Inquiry, which addresses a topic fundamental to the redefinition of humanistic studies and combines in the best way the requirements of documentation and informed reflection fostered by our «Osservatorio».

In continuity with this first group of articles, the «Osservatorio» also hosts a remarkable special section on the teaching of history in schools, edited by Maria Pia Donato and including four essays – by Donato herself, Luigi Cajani, Christophe Charle and Elvira Valleri. The subject that these articles investigate is a very delicate one, also considering the recent and controversial projects that aim to redefine the school's mission in our country.

The «Essays» section, too, keeps its balance between the analysis of modern thought and the contemporary debate. It offers a long and articulated Vichian essay by Horst Steinke, dealing with the rhetorical structure of the De Antiquissima and also including an annex on the composition of the Orazioni inaugurali; a substantial study by Valeria Gammella on Foucault as a reader of Descartes; and Francesco Varricchio's analysis of the relationship between Foucault's conception of the psychiatric real and the New Historicism.
Finally, the «Instruments» section that closes this year's issue contains a note by Roberto Evangelista on the Van Gent–Tschirnhaus correspondence, the catalogue of an eighteenth-century Neapolitan medical library compiled and commented by Flavia Luise, and the index of the first ten volumes of our journal edited by Assunta Sansone.

work_4eqp73och5hudeerggco4uknfy ---- The .txtual condition, .txtual criticism and .txtual scholarly editing in Spanish philology

RESEARCH ARTICLE

The .txtual condition, .txtual criticism and .txtual scholarly editing in Spanish philology

Bénédicte Vauthier, Institut für spanische Sprache und Literaturen, Universität Bern, Bern, Switzerland. Published online: 5 March 2019. © Springer Nature Switzerland AG 2019. International Journal of Digital Humanities (2019) 1:29–46. DOI: 10.1007/s42803-019-00003-x

Dedicated to Thorsten Ries for his interest in my work. Also dedicated to Agustín Fernández Mallo, Robert Juan-Cantavella and Vicente Luis Mora, without whose generous support this study would not have the same relevance.

Abstract: The impact of new technologies on writing processes is not new at all. This digital revolution first resulted in the appearance of new text formats and in the development of an ad hoc literary theory. In the Anglo-American area, the same revolution also led philologists and heritage institutions to reflect on the need to develop formats for the study, edition and long-term preservation of these new kinds of digital texts. How can the delay of these disciplines that can be observed in Europe be explained? Why can we say that digital forensics and media archaeology (Kirschenbaum) are not transnational disciplines? In this paper, I assess the impact of the .txtual condition in Europe and in the Anglo-American area. Moreover, I contrast these conclusions with the answers given by three emblematic writers of the 'New Spanish Narrative' to a survey about their ways of managing and preserving digital files.

Keywords: Spanish .txtual condition · .txtual criticism · .txtual editing · New comparative philology · Spanish philology

1 The .txtual condition and .txtual criticism in Spanish philology

A little more than ten years ago, in June 2007, around forty representatives of contemporary book culture––among them, authors, critics, journalists, publishers and booksellers––gathered in Seville at the initiative of a prestigious Spanish publishing house, Seix Barral, and the José Manuel Lara Foundation. The aim of the three-day summit and its round-tables was to exchange ideas about and survey the achievements, objectives and innovations of a group of writers who were born between the early 1960s and mid-1970s and began publishing their work in the early years of this millennium.
By biological age, they could belong to different generations, but this meeting proved to be the foundational act of the 'New Spanish Narrative'.2 While a reader of their works will notice many compositional, stylistic and thematic differences between these authors, there are two features many of their works have in common and which have received scholarly attention in the form of doctoral theses (Calles Hidalgo 2011; Barker 2011; Pantel 2012; del Pozo Ortea 2012; Saum-Pascual 2012): hypertextuality, on the one hand, and inter-, multi- or transmediality on the other. As is widely known, both concepts have usually been linked to the impact that new media have had on literary production and creative writing, and this has most prominently been reflected in the pioneering and internationally well-known studies by Bolter, Landow, Ryan, Douglas, etc. Compared to the most influential works in the Anglo-American area (Michael Joyce's afternoon, a story, Stuart Moulthrop's Victory Garden), Spanish electronic literature has received less scholarly attention so far (Pérez 2015), especially in Spain. This might be understandable in some cases with respect to the limited innovative aesthetic quality and the ephemeral character of the digital works in question. As a matter of fact, several works of Spanish electronic literature which have been collected, preserved and presented in a section of the institutional online portal Virtual Library Miguel de Cervantes are not available online anymore.3 There is a risk that some of the 'expanded literature' texts––or exonovels, to use the neologism one of the New Spanish Narrative's most famous representatives coined (Fernández Mallo 2012: 67)––will sooner or later be confronted with digital obsolescence. The second part of this article discusses this problem, but will also address another issue. I will look beyond contemporary literary studies, focused on the interpretation of singular authorised texts published under their author's name, which often fail to recognise the multitude of reprints,4 and instead turn to questions of the (digital) writing process, scholarly editions (of born-digital material) and (born-digital) archives.5

1 The label 'Nocilla generation' refers to the title of a novel by Agustín Fernández Mallo called Nocilla Dream. Nuria Azancot reused part of the title in an article published just a few weeks after the meeting (07.07.2007). Other labels used – related to monographs, anthologies or compilations – are 'afterpop', 'mutant', 'last generation Spanish narrative', 'pangeic', 'postmodern', 'New Spanish Narrative', 'post-humanist narrative'.
2 According to the literature, the number of writers that would be part of the group varies from six to twenty. Among them, it is common to find Lolita Bosch, Javier Calvo, Harkaitz Cano, Jorge Carrión, Diego Doncel, Domenico Chiappe, Álvaro Colomer, Juan Francisco Ferré, Javier Fernández, Agustín Fernández Mallo, Eloy Fernández Porta, Salvador Gutiérrez Solís, Robert Juan-Cantavella, Milo Krmpotic, Gabi Martínez, Javier Moreno, Vicente Luis Mora, Sofía Rhei, Isaac Rosa, Mario Cuenca Sandoval, Germán Sierra, Manuel Vilas, etc.
3 Literatura Electrónica Hispánica 2018. Fundación Biblioteca Virtual Miguel de Cervantes (Ed.): Biblioteca Virtual Miguel de Cervantes. URL: http://www.cervantesvirtual.com/bib/portal/literaturaelectronica/obras.html (accessed 06/27/18).
4 An exception is the three volumes Nocilla Dream (2006), Nocilla Experience (2008) and Nocilla Lab (2009) that form a trilogy, republished under the common title Proyecto Nocilla (2013).
5 Text in the broad sense of the word, that is, as defined by Donald McKenzie: 'I define "texts" to include verbal, visual, oral, and numeric data, in the form of maps, prints, and music, of archives of recorded sound, of films, videos, and any computer-stored information, everything in fact from epigraphy to the latest forms of discography' (1999: 13).
In this area, the landscape of European research, especially for Spanish studies, seems to offer less encouraging prospects. It is clear that European research in Digital Humanities lags behind its counterpart in the Anglo-American world, for reasons that are not easy to overcome. Matthew Kirschenbaum's concept of the '.txtual condition' in the digital age does indeed apply to many European writers––Spanish ones in particular––and I will refer to three of them in the second part of my study: 'In the specific domain of the literary, a writer working today will not and cannot be studied in the future in the same way as writers of the past, because the basic material evidence of their authorial activity—manuscripts and drafts, working notes, correspondence, journals—is, like all textual production, increasingly migrating to the electronic realm' (Kirschenbaum 2013: par. 4). However, very little, if any, scholarly attention has been paid since to this change, which will affect in crescendo four branches of literary studies––analytical bibliography, philology, scholarly editing and interpretive studies––when it comes to the literary production of the twenty-first century. It is useless to bemoan the situation. It is much more interesting to try to understand the causes and to examine the difficulties, or perhaps the resistances, that will have to be overcome in Europe in order to create the digital humanities community that Kirschenbaum has called for.

2 Digital forensics: a transnational discipline?

In the context of Anglo-American academia and research, authors, textual scholars, editors and, above all, cultural and memory institutions work hand in hand to meet the '.txtual Condition' (Kirschenbaum 2013: par. 38). Regarding European countries where English is not the official, but rather a second language, the delay of GLAM institutions (galleries, libraries, archives, and museums), the lack of digital edition projects comparable to the pioneering work in the Anglo-American area,6 and the scant curiosity of researchers about the impact of the digital transformation on literary ways of writing––a curiosity that Kirschenbaum exhibits in Track Changes (2016)––cannot be properly explained unless we make explicit how an apparently transnational discipline––the adaptation of computer forensics methods in archival science and philology––is rooted in the specific philological tradition of Anglo-American analytical bibliography and textual criticism. This is probably the key reason for the success of this approach.

Media archaeology […] offers one set of critical tools for coming to terms with the .txtual condition. Another, of course, is to be found in the methods and theoretical explorations of textual scholarship, the discipline from which McGann launched his ongoing program to revitalize literary studies by restoring to it a sense of its roots in philological and documentary forms of inquiry. As I've argued at length elsewhere, the field that offers the most immediate analog to bibliography and textual criticism in the electronic sphere is computer forensics, which deals in authenticating, stabilizing, and recovering digital data. […] Digital forensics is the point of practice at which media archaeology and digital humanities intersect (Kirschenbaum 2013: par. 31).

6 In Mechanisms, Kirschenbaum cites, as a representative sample, The Electronic Beowulf, The Canterbury Tales Project, The William Blake Archive and The Rossetti Archive (2008: 16, note 26).
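What 'authenticating' and 'stabilizing' digital data means in archival practice can be made concrete with an elementary illustration. The sketch below is not drawn from Kirschenbaum's text; it shows one standard building block behind the forensic practices quoted above, a cryptographic checksum recorded at the moment of acquisition, and the file name in it is purely hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_fixity(path: Path) -> str:
    """Compute a SHA-256 checksum of a born-digital file.

    Archives record such 'fixity' values when a file is acquired:
    if even one byte of the draft later changes, the checksum
    changes, so the preserved state can be authenticated."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical draft file from a writer's working folder.
print(sha256_fixity(Path("ElDorado_draft_3.doc")))
```

Checked against a stored value, such a digest distinguishes the deposited state of a file from any silently altered copy, which is what 'stabilizing' amounts to at the level of single documents.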
The double philological root to which Kirschenbaum refers is undoubtedly at the center of the success and fruitful development of this research paradigm within the digital humanities in his field of research. However, it might be more straightforward to say that the analogue precursor of the adaptation of computer forensics as a tool in born-digital philology was McKenzie's 'heterodox bibliography or sociology of texts' and McGann's 'New Textualism or Modern Textual Criticism'. The reference to Jerome McGann, who is given the honor of having revitalised literary studies, may be seen as proof of this assertion.

In textual criticism and scholarly editing of modern English literature texts––the stress on 'modern' is essential here––the names Donald McKenzie, 'heterodox' bibliographer (Darnton 2003: 43) and book historian, and Jerome McGann, American critic and philologist of modern texts, are often mentioned together and may seem synonymous with the paradigm shift that took place in literary disciplines during the mid-1980s (Greetham 2013: 37; Sutherland 2013: 57; Shillingsburg 1996: 24). McGann's and McKenzie's books A Critique of Modern Textual Criticism (McGann 1983, 1st ed.; 1991, 2nd ed.) and Bibliography and the Sociology of Texts (McKenzie 1986, 1st ed.; 1999, 2nd ed., the result of the Panizzi Lectures, McKenzie 1985), published within a narrow time frame, have contributed to this misconception. The fact that the two authors sought to distance themselves almost at the same time––though they did it regarding different corpora and interests––from the 'Greg-Bowers-Tanselle theory', which at the time was the predominant paradigm in the field of textual criticism and scholarly editing of premodern texts (Lernout 1996), may also have contributed to the shakeup of the discipline.7

A survey of the academic reception of the two scholars, not only in the Anglo-American sphere but also in European research, reveals, however, that things are not simple at all, and that neither their names nor their proposals are interchangeable. McKenzie is an undisputed authority in Anglo-American bibliographical research. However, it is the name and the work of Jerome McGann that has become the most emblematic reference among the representatives of New Textualism. His book A Critique of Modern Textual Criticism became part of the school's 'canon' (Greetham in Shillingsburg 1996: vii). The later The Textual Condition (1991) is one of the most frequently cited works among researchers interested in modern scholarly editing, such as Scholarly Editing in the Computer Age (Shillingsburg 1996), The Fluid Text: A Theory of Revision and Editing for Book and Screen (Bryant 2002) and The .txtual Condition (Kirschenbaum 2013), to mention just three important works by McGann's followers.

7 In his assessment of Anglo-American textual criticism, Lernout (1996) suggests that McGann calls into question Greg's 'base text' theory and the concept of 'authorial aim' later introduced by Bowers in the context of his philological study of manuscripts and modern texts. McKenzie, however, tries to widen the field of analytical bibliography with his new definition of 'text' and his reflection on the history of the production and diffusion of print.
In the part of Europe where English is not a lingua franca, however––particularly in Romance-speaking Europe (e.g. France, Italy, Spain)––and with the exception of English literature studies, the work of McGann is hardly known, while 'the second McKenzie' (Willison 2002: 204) enjoys an international reputation throughout Europe, although more among book historians than among modern philologists. The French book historian Roger Chartier is the one who mainly disseminated McKenzie's work and ideas in the European context.8 The absence of a modern philology in France and Spain at the time allowed Chartier to leave out the dominant bibliographical facet of the New Zealander's work by skipping the outset of his reflection in order to highlight the second key idea9 underlying McKenzie's Panizzi Lectures: the material and social dimensions of text, which led to his own history of books, reading and readers. In this way, Chartier did away with the strong link between librarianship and bibliography, between archives and editing, in McKenzie's work, and diluted his very early concerns about the emergence of new technologies that were substantially affecting the understanding of text and its circulation (Vauthier 2018a, b).

These lectures were conceived and prepared, not as a text destined for print, but as lecture occasions. The challenge, as I saw it, was to sketch an extended role for bibliography at a time when traditional book forms must share with many new media their prime function of recording and transmitting texts.

Thus we read in the preface to the first edition of his book (1986: ix). In the second edition, following a review of McGann's (1988) response and––thanks to Chartier's work––the unexpected international reception of the book in Europe (McKenzie 1986: 23; 1999: 6), he resumes:

The familiar historical processes by which, over the centuries, texts have changed their form and content have now accelerated to a degree which makes the definition and location of textual authority barely possible in the old style. Professional librarians, under pressure from irresistible technological and social changes, are redefining their discipline in order to describe, house, and access sounds, static and moving images with or without words, and a flow of computer-stored information (1999: 1).

8 McKenzie's book, Bibliography and the Sociology of Texts, was translated into French in 1991, into Italian in 1998 and into Spanish in 2005. In all three cases, the translations were accompanied by a substantial prologue by Chartier, who has channeled the author's reception, particularly in France and then in Spain, in the somewhat exclusive direction of the history of the book (Vauthier 2018a).
9 These three ideas are: 1. 'an extended role for bibliography', as shown in the first lines of the preface. In the second paragraph, McKenzie specifies: 'there were two other considerations which it seemed timely to voice' (1986: ix, italics mine). 2. The acknowledgment of historical bibliography as a discipline in itself: 'Historical bibliography (as distinct from descriptive and analytical bibliography and stemmatics) has gained acceptance as a field of study'; and 3. The essential instability of the text and the impossibility of fixing it for good: 'Definitive editions have come to seem an impossible ideal' and 'each version has some claim to be edited in its own right' (1986: 2). These three ideas clearly show McKenzie's breadth of vision as bibliographer and book historian. Chartier is a book historian, not a philologist, and consequently only refers to the first and second ideas.
These two dimensions––the impact of new technologies and the necessary renovation of traditional bibliography––explain the honored place that Donald McKenzie now occupies among Anglo-American scholars of the modern text, particularly in the works of McGann and Kirschenbaum.10 Although in The .txtual Condition (2013: par. 7, 31 and 41) and in Mechanisms (2008: 9) Kirschenbaum admits his debt to McGann, in the opening pages of his collective report Digital Forensics and Born-Digital Content in Cultural Heritage Collections he asserts very clearly that the necessary connections and interactions between the world of archives and digital forensics stem from McKenzie's work, particularly from his early attention to new technologies:

We maintain that such parallels are not coincidental, but rather evidence of something fundamental about the study of the material past, in whatever medium or form. As early as 1985, D. F. McKenzie, in his Panizzi lectures, explicitly placed electronic content within the purview of bibliography and textual criticism, saying, 'I define texts to include verbal, visual, oral, and numeric data, in the form of maps, prints, and music, of archives of recorded sound, of films, videos, and any computer-stored information, everything in fact from epigraphy to the latest forms of discography' (1999, 13) (Kirschenbaum et al. 2010: 5).

In the same way, McGann's recent popularity among modern philologists may be a consequence of the fact that he knew how to minimize his first book's precedence over McKenzie's Panizzi Lectures, which would be published two years later, and, consequently, could make McKenzie 'the Hero of Our Own Time', that is, the Hero of the Scholarly Edition.

D. F. McKenzie became The Hero of Our Own Time not because he discovered the sociology of the text – we've known about that for a long time. He became The Hero because he knew that the idea of the social text had to be realized as a scholarly edition. Such an edition would be addressing and answering some key – basically philological – questions. Could one develop a model for editing books and material objects rather than just the linguistic phenomena we call texts? To pose that question, as McKenzie did, was to lay open the true dimensions of what he was after: a model for editing texts in their contexts (McGann 2013: 281–282, italics mine).
After having clarified the intrinsic alignment between the adaptation of digital forensics as a philological and archival-scientific method, on the one hand, and the traditions of theory and practice of modern scholarly editing, textual criticism and analytical bibliography in Anglo-American academia and research, on the other, I turn the scope of this survey back to Europe.

10 It is very interesting to note that in the section 'The History of the Book' of his review of 'Textual Scholarship', Marcus does not mention McKenzie, but only Chartier and Darnton. Instead, he mentions McKenzie along with McGann in the section 'Textual Scholarship in the Present' (2010).

Unlike the Anglo-American context, where authors, textual scholars, publishers and, above all, cultural and memory institutions work hand in hand to meet the '.txtual Condition', the European cultural and research landscape features neither such a clear nor such a unanimously shared strategy among the involved parties.11 Moreover, there is no such thing as a standardized 'continental' or 'European theory of scholarly editing', nor are there language-specific models for scholarly editions (Lernout 2002, 2013; Vauthier 2018a, b). Even TEI encoding and TEI-based editions still face difficulties in establishing a standard model for digital scholarly editions (Marcus 2009: 93–94), which may complicate the long-term preservation of editions. Last and above all, the idealistic understanding of the modern text that prevails among European philologists and an individualist or romantic concept of authorship have been the main factors which impeded studies focused on the materiality of the textual media, on the graphic dimension of prints and books, and on non-authorised or posthumous12 versions of texts (Vauthier 2017; Vauthier 2018a, b).

11 Kirschenbaum's article illustrates how the invaluable legacy of an author, editor and educator like Deena Larsen, a pioneer of electronic writing, to an institution (the MITH) is not a result of chance, but of a friendship that unites the writer to the center and its researchers. That is to say, the same scenario as the one at the origin of the legacy or the sale of working manuscripts of contemporary writers to memory institutions is repeated: Louis Aragon at the Institut des Textes et Manuscrits Modernes in Paris, Miguel Ángel Asturias at the Bibliothèque Nationale de France, Friedrich Dürrenmatt in Switzerland. In all three cases, the legacy was made with the explicit desire for exploration and evaluation using the latest editing techniques.
12 The geneticists—members of both the French and German schools—have put a lot of emphasis on the materiality and the graphic substance of the drafts, of the avant-texte, a characteristic that they have not fully granted to the text itself (Lebrave 2009; Mahrer 2017; Reuss 2005). In the same way, even among those who declare themselves interested in the process, the death of the author remains an insurmountable frontier (Lebrave 2009; Mahrer 2017) or remains clearly on the side of the history of reception (Reuss 2005). And that is what the Anglo-Americans question with the ideas of 'versioning' (Reiman 1987), the 'fluid text' (Bryant 2002), etc.

An article penned by Rüdiger Nutt-Kofoth illustrates this point, which allows me to further detail and expand my above claims regarding the lack of scholarly reception of McGann's work in Europe. In Editionsphilologie als Mediengeschichte (Scholarly Editing as Media History, Nutt-Kofoth 2006)––the simplicity of the title should be noted as meaningful––a German literary scholar and specialist in scholarly editions invites his colleagues to stop focusing solely on the 'linguistic' dimension of the text and instead turn to the concept of the 'bibliographical orientation'13 of Peter Shillingsburg (2006: 19), the representative of the New Anglo-American Textualism, which, in recent years, has made great efforts to build bridges between the scholars of German and English literature.14

13 'Based in the bibliographical studies of D. F. McKenzie, this orientation enlarges the definition of text to include all aspects of the physical forms upon which the linguistic text is written. This approach does not admit to any parts of the text or of the physical medium to be considered non significant and therefore emendable. […] all aspects of the physical object that is the book that bear clues to its origins and destinations and social and literary pretentions […] are text to the bibliographic orientation' (Shillingsburg 1996: 23–24). Two recent books of Italian philology (Cadioli 2012; Italia 2013) also draw attention to the importance of the works of McGann and even more of Peter Shillingsburg.
14 As the German editor of Ulysses, Hans Walter Gabler, did in his time, and as is still done by the Belgian Anglicists Geert Lernout and Dirk van Hulle (Lernout 2013: 74–75).
It is too early to see whether his colleagues will follow this invitation,15 although it may seem unlikely that it will be possible to put an end to the debate about the issue of textual versions that opposes the scholars of German and English literature—an issue that hinges on questions of materiality on the one hand and 'authorial intentionality' on the other (Shillingsburg 1996: 99–100).16 More instructive may be Patrick Sahle's work (2013) on the typology of digital scholarly editions and on the definition of the term 'text',17 in which the historian reflects upon its polysemy and, instead of one definition, proposes a dynamic wheel of terminological perspectives on the term 'text'. In this way, he intends to overcome static definitions that construe 'text' in antagonistic terms.

15 That invitation is the same that Margarita Santos Zas and I made when editing Valle-Inclán (2017).
16 In From Gutenberg to Google, Shillingsburg lists the main works of the polemic (2006: 173–174), and Vauthier analyzes some of the editorial implications of the two paradigms (2017 and 2018a).
17 This polysemy is activated, or reactivated, if we consider McKenzie, through the problems of encoding—that is, through what the new technologies make editors see.

After this overview, in which some light was shed on the specificity of modern textual criticism and scholarly editing of modern texts both in European traditions and in the Anglo-American research context, it is necessary to return to memory institutions and to the urgent issue of the long-term preservation and curatorship of writers' private digital archives.

3 'I unpack my digital library and show you my digital desktop'

In their studies of librarianship and digital curatorship, Becker (2014) and Weisbrod (2015) highlighted the challenges and deficits that research and memory institutions, archives and research in European countries where English is not the main language spoken––they take libraries and literary archives in Germany, Austria and Switzerland as examples––need to address in terms of long-term preservation, curatorship and scholarly appreciation of born-digital heritage and digital culture. Both books seek to understand how writers write, how they organise their working process, and how they organise and preserve their documents in the digital era.
Yet the question arises to what extent authors have an interest in, and are willing to receive, support from memory institutions to ensure the long-term preservation of their literary and personal digital archives. Additionally, Dirk Weisbrod complemented the empirical part of his doctoral thesis with in-depth expert interviews with archivists and directors of memory institutions. Both authors in their conclusions put emphasis on the need for archivists to establish contact between memory institutions and likely donors or depositors of private digital archives as early as possible, in order to make writers aware of the need, and of the possibilities in place, to preserve their published or ongoing work, for instance in an institutional archive cloud (Weisbrod 2015: 416–453, here 423). In the course of his argument, Weisbrod coins the neologism 'präkustodiale Intervention' (pre-custodial intervention), which refers to the intervention of archivists with possible donors or depositors as a preliminary measure to ensure long-term preservation of and access to their archives. This conclusion is very much in line with the institutional collaboration with writers advocated by Kirschenbaum.

From the more modest academic perspective of a scholar of contemporary Spanish literature, who is not directly connected to particular memory institutions and who does not feel inclined to acquire digital forensic skills anytime soon, I still find it important to scrutinize the implications of the digital media turn for the scholarly edition and interpretation of twenty-first-century literature.

It was Jean-Louis Lebrave's research program18 that guided my steps when examining the hybrid dossier génétique of Robert Juan-Cantavella's transmedial, born-digital novel El Dorado (Vauthier 2014, 2016). This research program, proposed to French critique génétique by one of its pioneers, directed scholarly attention to the changes that the arrival of personal computers on authors' writing desks meant for their ways of working. Having studied the dossier génétique of El Dorado, it became very clear to me that the critique génétique and scholarly editing of twenty-first-century literature will depend on the state of preservation of private digital archives, and that these disciplines will have to focus on the question of digital versions and variants––and on the complexity of the problem of versions (Lebrave 2011: 145). Consequently, I contacted three writers of the 'New Spanish Narrative' to start a survey about their way of working in the digital age.19 Without being aware of it at the time—given that I formulated my questions based on my years-long practice as a scholarly editor of avant-textes and modern Spanish texts, not aiming at the interviewees' way of writing—I happened to collect data about their methods of organising their work on the computer and, at the same time, about how they ensured the preservation of their creative work.

18 'It would be a much greater matter of urgency to mobilize the energy for approaching two crucial questions for the future of genetic criticism. The first is about really knowing how writers appropriate the computer, and what the effects of this appropriation are on writing. The second concerns the way in which geneticists will be able to construct real scientific objects based on data of a new type stored in computer memories' (Lebrave 2010: 155).
19 The mention of the personal relationship and friendship is necessary, since access to author files must be approved by the authors and/or their beneficiaries. In the case of digital archives, the question of the trust placed in the researcher and of the confidentiality of the documents to which they may have access is more crucial than ever. I sent the questionnaire 'I unpack my digital library and show you my digital desktop' during the Christmas period of 2017, and the answers came between 26 December and 25 January, allowing me to request additional information.
Despite the relatively small sample of three writers, the data collected is relevant in the context of the methodological framework of the qualitative survey (Heigham and Croker 2009). Qualitative surveys gather generic information, illustrate general trends, and may seek answers to research questions that cannot be operationalised and addressed in quantitative surveys, or to questions where the personal relationship and the interaction between interviewee and researcher may play a key role. In short, my survey responds to the research question formulated by Becker in her conclusion: 'It would be interesting to have a closer look at the youngest generation of writers with respect to their ways of writing' (2014: 70).

Nocilla Dream (2006) by Agustín Fernández Mallo (1967), El Dorado (2008) by Robert Juan-Cantavella (1976) and Alba Cromm (2010) by Vicente Luis Mora (1970) are among the most representative inter- and transmedial works of the New Spanish Narrative. All of them meet the definition of the exonovel, a neologism coined by Fernández Mallo.20

20 In addition, Agustín Fernández Mallo and Vicente Luis Mora have also written articles and/or essays that focus on the impact of new technologies, and they maintain blogs of literary criticism.
I will postpone dealing with the private digital archive, the submerged part of the iceberg— the digital files poised to possibly disappear—and first address the work’s digital representation in the public sphere. Although the texts seem to be independent from the “cinematic poetics of their provenance”, what would happen if the readers of the Nocilla trilogy––Dream (2006), Experience (2008), Lab (2009)––had no longer access to the movie Proyecto Nocilla?21 Accordingly, what would happen to our understanding of the work if Cantavella’s Punk Journalism website––available on punkjournalism.com––that complements the novel El Dorado, the URL of which already no longer corresponds to the one mentioned on the back cover, ceased to exist? Even so, if the two texts are autonomous, the parodistic weblog will not stop laying open its deck of cards, revealing to its reader parts of the documentary (digital photos, cutouts of scanned texts, etc.) and critical material (articles of the fictive character published in the press) that the author’s alter ego Trebor Escargot used to write his road movie, a remake of Fear and Loathing in Las Vegas (1998),22 inspired by Hunter S. Thompson’s homonymous 1971 novel (Vauthier 2014, 2016). Furthermore, how could we not think about the implications for our understanding of Alba Cromm (2010) if we know that the author’s logbook “Alba Cromm y la vida de los Hombres” (“Alba Cromm and the Life of Man”), which Vicente Luis Mora wrote parallel to his novel and to which the novel refers, and also know that this logbook “is only accessible through the Internet Archive search engine” (Ilasca 2015: 3)? These preliminary observations may be sufficient to illustrate why it is essential to understand how writers imagine the future of their work, which they develop, almost exclusively, in digital media. Despite the fact that all of the authors’ answers to the various survey blocks are interesting and relevant with regard to the issue, I will not document them in full. It would be impossible for several practical reasons, mainly because some of them are very long and some contain confidential information. Even without having access to the authors’ computers or, in this case, their files,23 but having seen some of their 21 The film is available in the writer’s blog “El hombre que salió de la tarta”: “Proyecto Nocilla, la película” http://fernandezmallo.megustaleer.com/proyecto-nocilla-la-pelicula/ (accessed 13/2/19). 22 El Dorado is based on both Thompson’s novel and the film directed by Terry Gilliam, which stars Jonny Depp as Raoul Duke. 23 The work on El Dorado was realized through the examination of the material collected in a USB key that the author gave me in 2011, along with personal documents (DVD, press, logbook, etc.) kept in a backpack (Vauthier 2014, 2016). 38 International Journal of Digital Humanities (2019) 1:29–46 http://punkjournalism.com http://fernandezmallo.megustaleer.com/proyecto-nocilla-la-pelicula/ screenshots, it is clear that “access to someone else’s computer is like finding a key to their house, with the means to open up the cabinets and cupboards, look inside the desk drawers, peek at the family photos, see what’s playing on the stereo or TV, even sift through what’s been left behind in the trash” (Kirschenbaum 2013: par. 3). 
This is the natural and understandable reason for the cautious reluctance of the authors to deposit or donate their private digital archives––and surely one of the greatest challenges for the effort of building future born-digital archives and pre-custodial interventions. To give an overview, three quarters of the questions asked were related to what Weisbrod describes as “ways of administrating and managing work on the computer” (2015: 391–394, 524–525), one quarter was about the “methods to ensure preservation of the archives” (2015: 383–390, 523–523). To be more precise, I was interested in the following topics: & how do authors organize their digital work and when do they start organizing their materials; & the metadata and criteria they use to organize their work (date (timestamps), file name, file type, extension, title, etc.) and their possible variations according to textual genre (narrative, essay, poetry, academic work, etc.); & the time and naming schemes according to which they create and name versions, the timing and regularity when they do so; & the possible metadiscursive component (“notes de régie”) in their work and the way they use visual queues for marking up certain digital writing operations (color, strikethrough, bold, track changes, etc.); & the way of documenting their work in the digital environment (type of used sources and consulted documents) and, if applicable, the way this “external” material is stored; & the possibility of recycling documents, versions, own and/or “external” texts and the concrete way this is done (duplication of documents, copy and paste, etc.); & the use of the operating system’s virtual recycle bin; & the preservation of digital files and hardware (self-archiving): what do they keep (own and/or external documents, draft versions, final versions etc.)? When does the author self-archive their born-digital materials? Where is the archive stored (cloud, hard disk, hard copy)?; & their possible representation of a digital library of literary authors that would replace traditional, paper-based archives, and their willingness to deposit or donate their born-digital archives. Due to my interest in the writing process, on the one hand, and in scholarly editing of both avant-textes and authorised texts, on the other, I was especially interested in the way the authors document their own working materials and even more in the manage- ment of possible versions of their work. Moreover, their answers to my questions regarding their willingness and interest to deposit or donate their digital files to archives and about their self-archiving practice seemed somewhat unexpected to me, if not alarming. In the following section, before commenting on them altogether and coming to a conclusion, I will reproduce the answers they gave to the first block of questions–– documentation and version-management– and I will outline the answers to the second block––archive preservation. International Journal of Digital Humanities (2019) 1:29–46 39 4 “Version” Staying within the methodological context of a philological study that does not utilise digital forensics (Ries 2010), during the interviews, I used the term “version” in the sense of “text” that slightly differs from another text, saved by the author before or after the first text. 
A comparison between both texts would show variation, that is, it would allow a researcher to reconstruct the writing process.24 I did not try to establish a new definition of the term 'version' in the context of this study that would allow one to include, in a strict sense, the textual versions contained in automatically saved backup or temporary files (von Bülow 2003: 3). Nor did I take into account the definition of the term that Kirschenbaum refers to, for whom 'versioning is a hallmark of electronic textual culture––as a thriving industry of content management systems, file comparison utilities, and so-called version control or concurrent versions systems, […]', that is, Concurrent Versions Systems (CVS) (2008: 197). I will argue, however, that the issue of 'textual identity' still needs to be at the center of scholarly interest. With this clarification, my question and the answers by the three authors are documented below.25

24 Technically, it would perhaps have been more accurate to speak about the 'state of a text', to distinguish this rewriting, typical of the avant-texte, from another possible rewriting, posterior to a first publication.
25 I translated my questions and their answers into English for this documentation.

Bénédicte Vauthier: What criterion has to be met in order for you to save a new version or to duplicate the document on which you are currently working? What do you do with previous 'versions'? How do you name them? How often do you save what you wrote?

Agustín Fernández Mallo: I create a new file version if the novel is very advanced and it seems that I can open a path that will radically change things. I usually keep the name of the original file and simply add a number at the end: '[...] 2', '[...] 3', etc. Sometimes I specify the reason for the change in order to remember it: '[…]SUBSTITUTIONdistortion'. I keep the previous versions, even if I consider them complete failures—you already know my opinion about garbage and about how it can be recycled. Years may pass until I am able to see that something I had discarded was waiting for its natural place somewhere else. (email of 10 January 2018)

Robert Juan-Cantavella: The criterion is not very scientific. I save a new version every time I think that I have made many changes to a document, just in case I may have to review later once more what I discarded. Sometimes a long time passes (months), sometimes very little (days), depending on the changes made. When I return to a version, I usually save a copy of the previous version when I start, because I do not have everything under control; I keep the copy just in case. Previous versions are saved in a dedicated folder. I usually name it 'out' or 'ant', although it can be a different name. (email of 14 January 2018)

Vicente Luis Mora: [...] When I spoke about versions, I was referring to the same base document with some changes; sometimes with many changes, sometimes with fewer. The base document, however, is always the same and therefore keeps the same creation date.26 Each one of these digital copies that you have is a backup copy of the same document in progress, in continuous change: I enter the document and I add new things, correct or delete some of the old ones, so the document is not the 'same' anymore when I finish. This is why I said it is a version, while you called it a 'changed draft'.
When I create a digital copy, it is because I think that I have made enough changes in the original to save it separately, and the chronological order is not marked by the date of the file, but by [...] [the date of the backup]. [This latest file] [...] contains the most recent version of each of the [texts]. [...] As I sometimes change minor things here and there in the document that are difficult to remember, and I hate the 'track changes' mechanism of [Microsoft] Word, this is the only solution for me not to get lost: successively saving multiple backups of the same document while it transforms towards its final 'gestalt'. It is possible that between these copies there are only a few variants, but all of them together are the writing-polishing process of the novel, the demanding toil of writing, which [...] includes even the proofreading. (email of 24 January 2018)

26 On my initiative, the author returned to the response he had given me in the first place—'I do not usually make different versions' (26.12.2017)—which contradicted the status of a digital file to which I was given access. Specifically, he refers to the 28 'digital versions' of the novel Alba Cromm that he sent me in 2014 through a cloud. I gave up studying them because I did not find a satisfactory way of exploring a genetic dossier composed of versions which, in addition, as the author suggests, all have the same date of creation.
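The footnote on the 28 versions of Alba Cromm points to a concrete methodological obstacle: when the file system's dates no longer discriminate between copies, the most elementary way of ordering a digital dossier fails. The following sketch, a naive illustration with hypothetical file names rather than a method used in this study, shows that approach and its limit: modification times yield a genetic chronology only if copying, emailing and cloud transfer have left them intact.

```python
from datetime import datetime
from pathlib import Path

def order_versions(folder: str, pattern: str = "*.doc") -> None:
    """Print draft files oldest-first by modification time.

    If the backups were copied or synced in ways that reset the
    timestamps (as with the 28 versions of Alba Cromm), every file
    shows the same date and this ordering becomes meaningless."""
    files = sorted(Path(folder).glob(pattern),
                   key=lambda p: p.stat().st_mtime)
    for f in files:
        stamp = datetime.fromtimestamp(f.stat().st_mtime)
        print(f"{stamp:%Y-%m-%d %H:%M}  {f.name}")

# Hypothetical folder of draft versions.
order_versions("AlbaCromm_versions")
```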
They mostly move material to the trash folder that may have turned up in the phase of documentary research, excerpts, early notes and drafts and that is not regarded as useful any longer (downloads, photos, etc.) or which is considered “exter- nal” to their work. Agustín Fernández Mallo and Vicente Luis Mora said that they keep their old computers when buying a new one. Vicente Luis Mora said that once when he was abroad (in the USA), he disassembled a laptop “in order to puverise all its main components one by one with a hammer” as a measure of destroying his data. In addition to laptops, he also has a personal desktop computer which serves as a “method of general physical backup of everything.” The answers referring to their interest or willingness to deposit or donate their digital archive to a library that ensures their preservation cannot be easily summarized, nor do they show an unified tendency. Agustín Fernández Mallo said that he imagined “a library in which the digital and the analogue are perfectly intermeshed, what I call Postdigitalism: I organize the folders thinking about my personal organization and nothing else.” Robert Juan- Cantavella imagines “a library that is accessible from computers. I do not know whether one would be required to enter a physical space (a building) to go and consult them. I would not donate the digital working documents of my books to a library or to any other type of institution. When I finish a work, I do not organize the materials thinking about any later external research consultation. There are usually no documents that I want to eliminate more than others.” As for Vicente Luis Mora, he seemed to doubt that there will be such libraries “when I will be older”, imagining, in addition, a careful process of selection of “writers who wish to be included in their archive”. However, he expressed his concern about the idea that “textgenetic researchers like me” could dig into his computer and into the drafts of his works, with which, even when finished, he is usually not satisfied. Hence, he declares: “you may be able to understand my feelings about the materials I have put aside. I guess, I will make many things disappear that would interest you, although I will keep others because the love for the work that took place during the consecutive drafts does not allow me to get rid of them.” Having documented these answers by the authors, I would like to conclude by returning to the question of born-digital and the scholarly edition. 6 .txtual editing Anyone who knows anything about digital files and is familiar with the concerns of writers ––and even more those of their relatives––about the idea that researchers will search through their drafts as they please, have access to private materials, potentially reveal well-guarded or forgotten secrets, will understand certain fears triggered by the idea of delivering not just previously selected drafts and prints, but also the key to their digital “home” to unknown philologists, who would use forensic methods in order to access those digital secrets. Regardless whether out of fear, lack of confidence or interest, if the artists of the 20th and twenty-first century do not deposit or donate their digital archives to professional memory institutions or take curatorial measures in order to preserve them, an important 42 International Journal of Digital Humanities (2019) 1:29–46 resource for understanding the works of this era would be lost. 
The works would end up existing only in printed and published form, making them appear curiously singular or decontextualised if we think of the heterogeneous modern archives that may consist of drafts, notes, galley proofs, prints, annotated books, correspondence, photos, etc. The apparent reluctance towards the archives of the future on the part of those authors who turned to and embraced the new media and their technologies, to the extent of even becoming strongholds of the worldview of a connected society, seems somewhat puzzling. If these authors did not give researchers access to the materials and traces of the creation of their works, geneticists and philologists would have to, like other critics, turn in the future to texts published in book format, e.g. in the form of works reedited and republished by the authors themselves. The majority of scholars, who do not seem to feel much curiosity for the unpublished, archived part of the work, usually accept this situation and base their work on the texts that circulate in the public realm. Without reiterating the interpretation problem posed by transmedia works here, it is obvious that failing to apply curatorial measures would risk losing the published, and even more these works' unpublished parts and materials, rendering their historical record incomplete, historically inaccurate and potentially incomprehensible.

In cases where authors give philologists and textual geneticists access to the folders of one or more of their works, e.g. via a pen drive or their cloud account containing a complete record of unaltered documents that could belong to the constellation of the works, the challenge for the researcher with standard user skills will be to determine how many possible or actual textual states or versions of the work exist. Even if we content ourselves with the versions saved voluntarily by the authors, we can see that they do not hesitate to duplicate the most complete version of the text in order to avoid regrets in case they have to "come back" to a previous one. Although this duplication is not merely mechanical, it is, from a critique génétique perspective, a fundamentally different process compared to the isolated revision and the revision by rewriting of a text in the analogue medium. In addition to this challenge, there are two other problems to be addressed: first, as the authors do not see their desktop full of drafts, they tend to avoid disposing of their things, which raises the issue of textual garbage and recycling that some of them have inscribed at the center of their work; this is, for instance, the case with Fernández Mallo (2009: 105–119) and Mora (2007: 29–31, 184–188). The second difficulty is related to the size of the digital files of narrative works: their systematic analysis is impossible without the aid of text collation tools such as those included in Juxta, MEDITE, CollateX or iTeal. I would like to highlight a conclusion drawn by Lebrave at the end of his review of Kirschenbaum's and Ries's work: "It is very likely that genetic forensics has to renounce being a poetic of processes and instead will content itself with being a poetic of transitions between textual states" (2010: 145). I think this is accurate.
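The collation bottleneck just mentioned can be made concrete with a small sketch. The following lines only illustrate the principle behind tools such as CollateX; they are written with Python's standard difflib module rather than with any of the tools named above, and the file names are hypothetical stand-ins for two saved states of the same chapter.

import difflib  # Python standard library, no external dependencies

# Hypothetical inputs: two successive saved states of the same chapter.
state_a = open("novela_2014_v01.txt", encoding="utf-8").read().split()
state_b = open("novela_2014_v02.txt", encoding="utf-8").read().split()

# SequenceMatcher aligns the two word sequences and reports the
# insertions, deletions and replacements that separate them.
matcher = difflib.SequenceMatcher(a=state_a, b=state_b)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag != "equal":
        print(tag, " ".join(state_a[i1:i2]), "->", " ".join(state_b[j1:j2]))

# The similarity ratio gives a rough measure of how close two "versions"
# are, which helps to separate near-duplicate backups from substantive
# textual states before any genetic interpretation begins.
print("similarity:", round(matcher.ratio(), 3))

Run over every pair of files in a dossier such as the twenty-eight versions of Alba Cromm, a measure of this kind performs exactly the preliminary sorting of transitions between textual states that Lebrave describes.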
However, as I do not want to give in to pessimism (Lebrave 2010: 145), I hope that the unexpected multiplication of "versions" or "states" of a text with which a researcher is confronted when accessing a digital archive will prove an invitation for them to address the question of "textual identity" under a new digital perspective. Faced with "different states of what we can suppose to be the same text, with all the epistemic difficulties posed by the problem of simultaneously identical and different texts" (Ganascia and Lebrave 2009: 74, [italics mine]), it is time to stop supposing and start investigating this theoretical issue. However, it is necessary to investigate it before launching a digital collation tool (Mahrer 2017: 36–37) or before making available to the reader or user all the versions of a text to be edited (Bryant 2002: 87). To argue that, beyond their differences, two texts that can be compared, which is to say textually aligned, must be considered together in a genetic perspective (Mahrer 2017: 36–37), or that "a version, like any text of a work, is effectively an approximation of the attempt to achieve the work" (Bryant 2002: 86), is equivalent to solving the problem that was to be elucidated "in favor of identity" (Reuss 1990: 5–10).

References

Azancot, N. (2007). La generación Nocilla y el Afterpop piden paso. El mundo, hemeroteca de El Cultural, 25.07. Retrieved from http://www.elcultural.es/version_papel/LETRAS/21006/La_generacion_Nocilla_y_el_afterpop_piden_paso (Accessed 2/13/19).
Barker, J. (2011). No place like home: Virtual space, local places and Nocilla fictions (Doctoral Dissertation, University of British Columbia).
Becker, S. (2014). Born-digital-Materialien in literarischen Nachlässen. Auswertung einer quantitativen Erhebung. Berliner Handreichungen zur Bibliotheks- und Informationswissenschaft. Berlin: Staats- und Universitätsbibliothek. Retrieved from https://edoc.hu-berlin.de/handle/18452/2749 (Accessed 2/13/19).
Bryant, J. (2002). The fluid text: A theory of revision and editing for book and screen. Ann Arbor: University of Michigan P.
Cadioli, A. (2012). Le diverse pagine. Il testo letterario tra scrittore, editore, lettore. Il Saggiatore.
Calles Hidalgo, J. (2011). Literatura de las nuevas tecnologías. Aproximación estética al modelo literario español de principios de siglo (2001–2011). Salamanca: Ediciones Universidad Salamanca (Col. Vitor).
Darnton, R. (2003). The heresies of bibliography. The New York Review of Books, May 29, 43–45.
del Pozo Ortea, M. (2012). Hacia un reencantamiento posthumanista: poesía, ciencia y nuevas tecnologías (Doctoral Dissertation, Massachusetts).
Fernández Mallo, A. (2009). Postpoesía. Hacia un nuevo paradigma. Anagrama.
Fernández Mallo, A. (2012). Topological time in Proyecto Nocilla [Nocilla project] and Postpoesía [post-poetry] (and a brief comment on the exonovel). Hybrid Storyspaces: Redefining the Critical Enterprise in Twenty-First Century Hispanic Literature, Hispanic Issues On Line, 9, 57–75.
Ganascia, J. G., & Lebrave, J.-L. (2009). Trente ans de traitements informatiques des manuscrits de genèse. In O. Anokhina & S. Pétillon (Eds.), Critique génétique: concepts, méthodes, outils (pp. 68–82). Paris: IMEC.
Greetham, D. (1996). Foreword. In P. Shillingsburg (Ed.), Scholarly editing in the computer age: Theory and practice (pp. vii–xvi). Ann Arbor: University of Michigan P.
Greetham, D. (2013). A history of textual scholarship. In N. Fraistat & J.
Flanders (Eds.), The Cambridge companion to textual scholarship (Cambridge Companions to Literature, pp. 16–41). Cambridge: Cambridge University Press. https://doi.org/10.1017/CCO9781139044073.002.
Heigham, J., & Croker, R. A. (2009). Qualitative research in applied linguistics: A practical introduction. London: Palgrave Macmillan.
Ilasca, R. (2015). ¿Sueñan los escritores con obras electrónicas? La experiencia transmedial en Alba Cromm de Vicente Luis Mora. Texto digital, 11(1), 209–225.
Italia, P. (2013). Editing Novecento. Salerno.
Kirschenbaum, M. G. (2008). Mechanisms: New media and the forensic imagination. Cambridge, MA: MIT Press.
Kirschenbaum, M. G. (2013). The .txtual condition: Digital humanities, born-digital archives, and the future literary. Digital Humanities Quarterly, 7(1), 1–43. Retrieved from http://www.digitalhumanities.org/dhq/vol/7/1/000151/000151.html (Accessed 2/13/19).
Kirschenbaum, M. G., Ovenden, R., & Redwine, G. (2010). Digital forensics and born-digital content in cultural heritage collections. Washington D.C.: Council on Library and Information Resources.
Lebrave, J.-L. (2009). Manuscrits de travail et linguistique de la production écrite. Modèles linguistiques, 59, 13–21.
Lebrave, J.-L. (2010). L'ordinateur, Olympe de l'écriture? Genesis, 31, 159–161.
Lebrave, J.-L. (2011). Computer forensics: la critique génétique et l'écriture numérique. Genesis, 33, 137–147.
Lernout, G. (1996). La critique textuelle anglo-américaine: une étude de cas. Genesis, 9, 45–65. A version of this article in English is available at http://www.antwerpjamesjoycecenter.com/genesis.html (Accessed 7/15/18).
Lernout, G. (2002). Genetic criticism and philology. Text, 14, 53–75.
Lernout, G. (2013). Continental editorial theory. In N. Fraistat & J. Flanders (Eds.), The Cambridge companion to textual scholarship (pp. 61–78). Cambridge: Cambridge UP.
Literatura Electrónica Hispánica. (2018). Fundación Biblioteca Virtual Miguel de Cervantes (Ed.): Biblioteca Virtual Miguel de Cervantes. Retrieved from http://www.cervantesvirtual.com/bib/portal/literaturaelectronica/obras.html (Accessed 7/15/18).
Mahrer, R. (2017). La plume après le plomb. Poétique de la réécriture des œuvres déjà publiées. Genesis, 44, 17–38.
Marcus, L. S. (2009). Textual scholarship. In Women editing/editing women: Early modern women writers and the new textualism (pp. 75–101). Newcastle upon Tyne: Cambridge Scholars Publishing.
McGann, J. (1983). A critique of modern textual criticism. Chicago: University of Chicago Press.
McGann, J. (1988). Theory of texts. London Review of Books, 10(4), 20–21.
McGann, J. (2013). Coda: Why digital textual scholarship matters; or, philology in a new key. In N. Fraistat & J. Flanders (Eds.), The Cambridge companion to textual scholarship (Cambridge Companions to Literature, pp. 274–288). Cambridge: Cambridge University Press. https://doi.org/10.1017/CCO9781139044073.014.
McKenzie, D. F. (1986). Bibliography and the sociology of texts. London: The British Library.
McKenzie, D. F. (1991).
La bibliographie et la sociologie des textes. Paris: Cercle de la Librairie.
McKenzie, D. F. (1998). Bibliografia e sociologia dei testi. Milan: Sylvestre Bonnard.
McKenzie, D. F. (1999). Bibliography and the sociology of texts. Cambridge: Cambridge UP.
McKenzie, D. F. (2005). Bibliografía y sociología de los textos. Madrid: Akal.
Mora, V. L. (2007). Circular. Las afueras. Córdoba: Berenice.
Nutt-Kofoth, R. (2006). Editionsphilologie als Mediengeschichte. Editio, 20, 1–23.
Pantel, A. (2012). Mutations contemporaines du roman espagnol. Agustín Fernández Mallo et Vicente Luis Mora (Doctoral Dissertation, Montpellier).
Pérez, J. A. (2015). Digital storytelling in Spanish: Narrative techniques and approaches (Doctoral Dissertation, University of California, Santa Barbara).
Reiman, D. H. (1987). "Versioning": The presentation of multiple texts. In Romantic texts and contexts (pp. 167–180). Columbia: University of Missouri P.
Reuss, R. (1990). "Michael Kohlhaas und Michael Kohlhaas". Zwei deutsche Texte, eine Konjektur und das Stigma der Kunst. Berliner Kleist-Blätter, 3, 3–43.
Reuss, R. (2005). Text, Entwurf, Werk. Text. Kritische Beiträge, 10, 1–12.
Ries, T. (2010). "die geräte klüger als ihre besitzer": Philologische Durchblicke hinter die Schreibszene des Graphical User Interface. Überlegungen zur digitalen Quellenphilologie, mit einer textgenetischen Studie zu Michael Speiers ausfahrt st. Nazaire. Editio, 24, 149–199.
Sahle, P. (2013). Digitale Editionsformen. Zum Umgang mit der Überlieferung unter den Bedingungen des Medienwandels. 3 vols. Norderstedt: Schriften des Instituts für Dokumentologie und Editorik.
Saum-Pascual, A. (2012). Mutatis Mutandi. Literatura española del nuevo siglo XXI (Doctoral Dissertation, Riverside, University of California).
Shillingsburg, P. (1996). Scholarly editing in the computer age: Theory and practice. Ann Arbor: University of Michigan P.
Shillingsburg, P. (2006). From Gutenberg to Google: Electronic representations of literary text. Cambridge: Cambridge UP.
Sutherland, K. (2013). Anglo-American editorial theory. In N. Fraistat & J. Flanders (Eds.), The Cambridge companion to textual scholarship (Cambridge Companions to Literature, pp. 42–60). Cambridge: Cambridge University Press. https://doi.org/10.1017/CCO9781139044073.003.
Vauthier, B. (2014). Tanteos, calas y pesquisas en el dossier genético digital de El Dorado de Robert Juan Cantavella. In M. Kunz & S. Gómez Rodríguez (Eds.), Nueva narrativa española (pp. 311–345). Barcelona: Linkgua.
Vauthier, B. (2016). Genetic criticism put to the test by digital technology: Sounding out the (mainly) digital genetic file of El Dorado by Robert Juan-Cantavella. Variants, 12-13, 163–186. Retrieved from http://journals.openedition.org/variants/353 (Accessed 24/07/18).
Vauthier, B. (2017). Éditer des états textuels variants. Genesis, 44, 39–55.
Vauthier, B. (2018a). Donald McKenzie: historiador del libro y filólogo. Revista Hispánica Moderna, in press.
Vauthier, B. (2018b). Critique génétique y filologías del texto moderno.
Nuevas perspectivas, sobre 'el texto', a partir de Ramón del Valle-Inclán. Ínsula, 861, pp. 11–15, in press.
Vauthier, B., & Santos Zas, M. (Eds.) (2017). Un día de guerra (Visión estelar). La Media Noche. Visión estelar de un momento de guerra de Ramón del Valle-Inclán. Estudio y dossier genético. Santiago de Compostela: Biblioteca de la Cátedra Valle-Inclán/Servizo de Publicacións da Universidade, 3 vols. + DVD.
von Bülow, U. (2003). "Rice übt Computer, die Laune wird immer guter!": Über das Erschließen digitaler Nachlässe. Paper delivered at KOOP-LITERA Österreich, Literaturhaus Mattersburg, 8–9 May 2003. Retrieved from https://www.onb.ac.at/koop-litera/termine/kooplitera2003/Buelow_2003.pdf (Accessed 2/13/19).
Weisbrod, D. (2015). Die präkustodiale Intervention als Baustein der Langzeitarchivierung digitaler Schriftstellernachlässe (Doctoral Dissertation, Berlin). Retrieved from http://nbn-resolving.de/urn:nbn:de:kobv:11-100233595 (Accessed 2/13/19).
Willison, I. (2002). Don McKenzie and the history of the book. In J. Thomson (Ed.), Books and bibliography: Essays in commemoration of Don McKenzie (pp. 202–210). Wellington: Victoria UP.

work_4fy445hx5rggbah7jix4adiqyq ----

IL CAPITALE CULTURALE – Studies on the Value of Cultural Heritage, Supplementi 11 (2020): Patrimonio, attività e servizi culturali per lo sviluppo di comunità e territori attraverso la pandemia. eum edizioni università di macerata. Journal founded by Massimo Montella.

Paolo Clini, Ramona Quattrini, Umanesimo Digitale e Bene Comune? Linee guida e riflessioni per una salvezza possibile / Digital Humanities and Commons: Guidelines and Reflections for a Possible Salvation, «Il capitale culturale», Supplementi 11 (2020), pp. 157-175. ISSN 2039-2362 (online); ISBN 978-88-6056-670-6; DOI: 10.13138/2039-2362/2529.

Umanesimo Digitale e Bene Comune? Linee guida e riflessioni per una salvezza possibile (Digital Humanities and Commons: Guidelines and Reflections for a Possible Salvation)

* Paolo Clini, Full Professor of Drawing and Survey, DICEA Department, Università Politecnica delle Marche, via Brecce Bianche, 60131 Ancona, email: p.clini@univpm.it.
** Ramona Quattrini, fixed-term researcher in Drawing and Survey, DICEA Department, Università Politecnica delle Marche, via Brecce Bianche, 60131 Ancona, email: r.quattrini@univpm.it.
Some of the reflections and considerations presented in this essay are substantiated and validated by the applications developed and tested within the CIVITAS project, a Strategic University Project of Univpm.

Paolo Clini*, Ramona Quattrini**

Abstract
The pandemic crisis dramatically highlighted the fragility of culture and, in particular, of our tangible and intangible, artistic and historical heritage. A fragility determined substantially by the absence of the relations on which heritage lives in the historical succession of the societies that preserve and share it. In the days of Covid, when all museums, archaeological sites and places of culture were closed, there was an urgent need to reflect on how to keep these relationships alive through digital technologies. The article outlines theoretical and methodological reflections for a manifesto of good operative and scientific practices, starting from several experiences conducted in the field of Digital Humanities. The four closely connected steps on which to leverage for a conscious and sustainable digital supply chain are explained: scientific digitization, new forms of virtual interaction, measurement of public acceptance, training of new skills.

1. Introduction

Covid-19 has dramatically placed before our eyes the condition of total fragility of culture and, in particular, of our tangible and intangible artistic and historical heritage. A fragility determined essentially by the absence of relations. Heritage lives when it acts upon the societies that, in their historical succession, preserve and share it. This is its authentic value. In the days of Covid-19 we lived through an epochal moment. There were days in which all the museums of the world, all the archaeological sites, all the places of culture were closed. It had never happened before, not even during the war. An unacceptable silence, because it testified to the silence of our civilization. Unacceptable because from such situations man can die, and civilizations themselves can die. Today we are therefore forced to reflect, on the one hand, on how to prevent these dramatic situations and, on the other, on how to draw lessons and opportunities from these historical passages, which increasingly lead us to consider heritage as a Common Good, vital for the individual, for the community to which it belongs, and for the civilization that generated it.

Technique has always been an indispensable tool for carrying through the extraordinary journey generated by a creative act. Technique has taken on different names in the evolution of our artistic civilization. Today we call it digital: certainly one of the keys that can help us renew the role of technique in its symbiosis with artistic creation and lay the conditions for a new form of humanism. Digital humanism, precisely in reference to the techniques that allow us, today as in ancient and modern recurrences, to put the human being back at the center of creative and artistic processes.
This happens through processes that consist in the possibility of reproducing our heritage ever more faithfully, enriching, on the one hand, its value as artistic expression and, on the other, its capacity to circulate and spread within our civilization, in places and contexts far from where that heritage is located, thus guaranteeing its indispensable condition as a Common Good, without which even its physical placement in a specific museum site would be devoid of meaning. A democracy of Art that is essential to its own survival and to that of the civilization and civilizations that produce and receive it. As Walter Benjamin had well intuited almost a century ago, though unaware of the extraordinary possibility and acceleration that the digital would grant to this process.

2. Towards a manifesto

The programmatic documents that point to ICT, or the digital, as a solution or driving force for making heritage a Common Good are certainly innumerable: the most recent one that can be cited was actually written and published at the height of the pandemic crisis.1 The Europe Day Manifesto indeed calls for a leading role for European institutions in digital cultural heritage and points to the great potential they present for moving forward with new technologies such as artificial intelligence and machine learning, while pursuing humanistic and ethical principles. In hoping for an acceleration of the digital transformation, it immediately recalls the need to reduce the gap between the institutions that are digitally equipped and those that are not. The need to democratize access to our heritage, in order to support diversity, inclusiveness, creativity and critical engagement in education, goes hand in hand, as we shall see below, with the promotion of digital skills, in order to strengthen the role of our cultural institutions. Alongside this document, however, we cannot fail to cite other fundamental acts of this passage, among which we point out: the Charter of Siena,2 the London Charter,3 the Faro Convention,4 the New Agenda for Culture5 and, remaining within the national context, the Strategic Plan for the Digitization of Tourism6 and the Three-Year Plan for Digitization.7 These are documents that show the awareness, at the level of central authority, of the need to grasp the inevitable transformation that the digital generates on our heritage. Little, however, has been achieved in recent years. A cultural awareness of the transformation has never been followed by systematic action, including economic action, capable of leading our places of culture to truly transform themselves and to seize the richness of this change. Covid-19 was the blatant demonstration of this.

1 European Heritage Alliance 2020.
2 ICOM Italia 2016.
3 EPOCH 2009.
4 Europe Council Treaty Office 2005.
5 European Commission 2018.
6 Laboratorio per il Turismo Digitale (TDLab) 2014.
7 Direzione generale Musei – MIBACT 2019.

Indeed, in the face of this emergency, a race broke out to seek digital and virtual forms that were alternatives to, or substitutes for, the physical culture that had been cancelled. And so we realized how much a digitized painting is worth, or a museum capable of putting its collection online, or a performance that is not exhausted in the brief span of its expression. But we also understood that we were not ready.
Because a new and different way of continuing to make and communicate culture is built in times of peace, not of war. For years we have been talking about the digitization of heritage, but in the face of this emergency we saw that very little had been done, and that even the greatest museums in the world had no virtual representations of themselves worthy of the name. And in the end all the digital outcomes resolved into a more marked use of social media; outcomes, however, that have nothing to do with the real dimension of the digitization of heritage. And so we understood that virtual or digital culture, whatever we want to call it, is not a patch we sew onto a torn garment, but an extraordinary opportunity for the beauty that our civilizations have produced, and continue to produce, to reach every person living on this earth, people who live in places and conditions that will never allow them to physically enjoy that work. We discovered that our heritage is not scientifically digitized, that we do not have the skills to manage these processes of transformation, that we do not know how to use the most up-to-date means for the fruition of digital heritage, that an encounter and empathy between heritage and user are lacking, and that we lack the data that would really allow us to understand how to create a deep and personal relationship with visitors in search of ever more personal experiences, precisely through the potential of the digital and of the new means of fruition. Many of these aspects are clearly highlighted and photographed in the Final Report8 of NEMO (Network of European Museum Organisations), published in July 2020. Let us say, then, that much propaganda has been made about the Digital in recent years, something that today Covid-19 perhaps interrupts or will allow us to overcome.

How is it possible to make the new digital humanism effective and real? How is it conceivable to use the new techniques, by now sufficiently reliable and pervasive, to recreate the symbiosis between art and technique? How can we move with a sure step in protecting the relations between heritage and people, if not by trying to develop concrete operational chains for plans of digital transformation of our places of culture and of real digital engagement of our heritage? A manifesto of good operational and scientific practices can only start from a digital chain founded on four steps, intimately connected with one another, as illustrated in figure 1, which seem to be precisely the great gaps revealed in these months of emergency:

– the scientific digitization of heritage;
– the new forms of digital fruition: in support of the analogue, on site, and substitutive or alternative, remotely;
– the measurement and monitoring of the public for whom the character of heritage as a Common Good is intended;
– a profound revision of the competences and training paths for the management of these chains and of their continuous renewal.

On these steps and reconnections (Fig. 1) we can found a new digital humanism.

8 NEMO, Szogs 2020.
And it is on each of these levers that, in our research paths and with the activities of the Distori Heritage group,9 through the integration of historical-humanistic and technological approaches, we have acted, enhancing and structuring competences in the field of the Digital Humanities (DH) and also seeking to define, through concrete cases, good practices to help write a digital manifesto/chain.

2.1 Digitizing scientifically

Starting from the research activities concerning the application of new survey and digital documentation technologies to heritage in its various forms, from archaeology to paintings, statues, architecture and landscape, various scientifically grounded forms and modes of digitization have been tested and validated. The cornerstone of these digitizations is undoubtedly three-dimensional acquisition, based mainly on point clouds. Beyond the by now undisputed and necessary integration of techniques, that is, the use in the acquisition phase of photographs, panoramas, and point clouds from Terrestrial Laser Scanning (TLS) and from photogrammetry, also with the aid of remotely piloted aircraft (RPA), the quality and versatility of digital facsimiles based on point clouds is demonstrated above all by the numerous applications that result from them (Fig. 2). We may cite here, for example, the digitizations of the archaeological finds of the Museo Archeologico Nazionale delle Marche, which later converged in a digital library that enables intimate interaction with the finds and various types of in-depth exploration, with a view to also testing ICT-based learning.10 The library also received an award from ICOM in the category of interpretative installations, earning the bronze medal at the AVICOM2019 prize held in Shanghai.11 As regards three-dimensional digitization in the field of historical built heritage, the examples are innumerable: from the surveys of the Palladian buildings that later converged in the models of the Palladio Library,12 to the recent acquisition of the Ducal Palace of Urbino and the Galleria Nazionale delle Marche,13 which we might call all-encompassing: from the single painting to the monumental complex of the palace (Fig. 3). This multiscalar character of the acquisitions, already foreseen in the aims of the CIVITAS project14 (ChaIn for excellence of reflectiVe Societies for dIgitization of culTural heritAge and museumS. A pilot case in Palazzo Ducale at Urbino), sets out to address the various challenges of a great museum housed in a historic building of the greatest value, as happens in the vast majority of cases, by developing good practices of digital transformation for it. Moreover, complete and meaningful digitizations are also capable of opening up highly articulated management paths through HBIM platforms,15 as well as an exploitation of the 3D digital facsimiles of historical heritage based on artificial intelligence, an approach that certainly promises very fruitful outcomes in terms of sustainability for this type of data.

9 www.distori.org.
10 Clini et al. 2018.
12 Gaiani et al. 2015.
13 Clini et al. 2020a.
14 Clini et al. 2020b; Nespeca 2018.
15 Quattrini et al. 2016.
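As a purely illustrative complement, the integration of TLS and photogrammetric point clouds described above can be sketched with the open-source Open3D library. This is a minimal sketch under assumed conditions (two hypothetical input files of the same object and a rough initial alignment), not the actual pipeline used in the case studies.

import numpy as np
import open3d as o3d  # open-source library for 3D data processing

# Hypothetical inputs: a laser-scanner (TLS) cloud and a photogrammetric
# cloud of the same object, both already expressed in metres.
tls_cloud = o3d.io.read_point_cloud("tls_scan.ply")
photo_cloud = o3d.io.read_point_cloud("photogrammetry.ply")

# Downsample both clouds to a common resolution before registration.
tls_down = tls_cloud.voxel_down_sample(voxel_size=0.02)
photo_down = photo_cloud.voxel_down_sample(voxel_size=0.02)

# Iterative Closest Point (ICP) refines the alignment of the
# photogrammetric cloud onto the metrically reliable TLS reference.
result = o3d.pipelines.registration.registration_icp(
    photo_down, tls_down, 0.05, np.identity(4),
    o3d.pipelines.registration.TransformationEstimationPointToPoint())

# Apply the estimated transformation and merge: the combined cloud is
# the raw material of the "digital facsimile".
photo_cloud.transform(result.transformation)
o3d.io.write_point_cloud("facsimile_merged.ply", tls_cloud + photo_cloud)

The scientific value of a facsimile, of course, lies not in these few calls but in the control of error, resolution and colour fidelity at every step of the chain.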
Finally, as regards the scale of the landscape, understood in its most recent sense as a unicum of natural and anthropized territory dotted with historical assets and artefacts, whether of recognized value or not, the experiments concerning its digitization have made extensive use, in addition to morphometric data, of low-cost acquisitions that have proved highly effective. See, for example, the effective use of panoramic photographs and 360° videos, employed both for the documentation of the En route Landscape&Archaeology conference16 and for storytelling and mapping, including for tourism purposes. We report here two good practices of which we were the originators, namely the portal of the DCE Distretto Culturale Evoluto della Via Flaminia17 and the Marcheology portal. In the first case, a cloud-based platform connected to pilot sites allowed the valorization of an entire linear infrastructure and of the landscape that characterizes it, making extensive use of databases of existing historical and naturalistic information. These were then integrated by dedicated data-acquisition campaigns (photographic and three-dimensional), using cutting-edge and/or standard technologies (photogrammetry, laser scanners, 360° cameras, HD photos, RPAS, etc.). In this way it was possible to carry out a survey of the entire landscape of the Via Flaminia, from Fano to the Scheggia Pass. The other case study is Marcheology,18 born from the MUSST Archeogate project: an archaeology portal, created in collaboration with the Polo Museale delle Marche with the aim of communicating the uniqueness of the archaeological heritage and of strengthening its identity value by creating a network among places of culture. The innovative technological assets created were: a multichannel platform for the fruition of the assets with highly portable Virtual Reality (VR) technologies, in close contact with a network of public and private actors of local tourism. This technological-cultural system engages the community and roots in it the importance of the archaeological heritage for its own memory. In this case too, immersive multimedia and 3D contents were used to narrate particularly significant sites and to demonstrate the ease of new forms of representation for the web. Indeed, the spread of VR usable through headsets has democratized immersive visualization in 360° mode, and the platform guaranteed the integration of scalable contents thanks to dedicated development. The real novelty, in a portal with a tourist slant, is the integration of the 3D digital library, which gathers a curated selection of the most representative pieces of the archaeological collections. The 3D models, created through digital photogrammetry, can be navigated thanks to their insertion on the open Sketchfab platform, which offers an excellent 3D viewer, light and intuitive.

16 Clini et al. 2016.
17 Clini et al. 2019.
18 The portal was also listed among the success links of the Culture@home initiative of DG Connect.
19 The pilot sites, chosen in the 5 provinces, are: the Museo Archeologico "A. Vernarecci" of Fossombrone, the Paleontological Collection of Serrapetrona, the archaeological area of Suasa, the Museo Archeologico "G. Allevi" of Offida, the "Helvia Ricina" archaeological area at Villa Potenza, Macerata, the archaeological area of Potentia at Porto Recanati, and the Museo Archeologico Nazionale delle Marche.
The richness and quality of the digital contents guarantee the engagement of users, but also provided initial training for the expert archaeologists who collaborated in the selection of the contents and in their design.

2.2 Experiencing

The creation of the three-dimensional digital facsimiles described so far then triggers the second lever, namely the creation and subsequent validation, as we shall see in the next section, of unprecedented modes of access to knowledge offered by the potential of ICT. From this perspective it is possible to promote and disseminate the knowledge and valorization of Cultural Heritage, which should not be seen as separate from the needs of its management and conservation. The methodologies are not confined to individual and strictly academic research, but tend to maximize interdisciplinary collaboration extended to the dimension of the territory, to its companies and above all to its cultural institutions, in order to provide them with tested and mature fruition solutions that are truly effective and responsive to the needs and characteristics of cultural assets. It is not, therefore, a matter of chasing emerging technology, but of verifying its effectiveness and portability in a sphere as delicate and fragile as that of culture. From this point of view, the mission of universities and of DH research laboratories takes concrete form as a bridging action, useful for the development of a digital literacy that is more necessary than ever, as we shall see in the fourth element of our path.

The recent pandemic crisis highlighted how, given the impossibility of letting people experience the physical dimension of places, attempts were made to surrogate it, sharpening a break in the modes of access to and fruition of Cultural Heritage (CH), from on-site to exclusively online, which multiplied still unstructured attempts at "going online" and caught institutions unprepared, above all on the level of strictly digital contents. This unpreparedness is in some ways disconcerting, if one thinks of the many experiences that have followed one another at the national and international level but that evidently were not lived as an integral part of the life of the museum. As far as our work is concerned, the time is ripe to generalize and to provide standards on the subject of Virtual Museums. This is what is actually being done within the Interreg IT-HR REMEMBER project,20 in which we are coordinating the implementation of 8 Virtual Museums for the tangible and intangible heritage of the ports between Italy and Croatia. In this role, a core methodology for the realization of the museums has been developed, with particular reference to the combination of contents, technological development and current hardware, in order to obtain digital experiences scalable to different types of users and target groups. Starting from successful experiences such as that of V-MUST,21 we have sought to provide answers in terms of technical feasibility and of qualitative and economic standards.
Ma l’altra faccia della medaglia, come vedremo nel prossimo paragrafo, è cercare di soddisfare le aspettative dei visitatori22, e su questo c’è tanto da analizzare e sperimentare. Si è fatto tesoro delle esperienze pregresse in materia di prodotti virtuali divulgativi, ma soprattutto di esperienze interdisciplinari di Learning by interacting23, in particolare della lunga collaborazione e sperimentazione sviluppata per i dipinti della Galleria Nazionale delle Marche. L’applicazione Ducale24, in tutte le sue varie versioni, si incardina sulla possibilità di sfruttare la Augmented Reality (AR) per sviluppare esperienze coinvolgenti ed educative intorno alle opere d’arte, stando all’interno del Museo. Tuttavia il caso studio forse più efficace nella presente trattazione è un progetto, finalizzato proprio durante il lockdown e che ha portato a un primo prototipale risultato di Museo Virtuale: la Pinacoteca Civica di Ancona “F. Podesti”. In esso ci si è proposti di dotare il museo di uno strumento virtuale che si configurasse come alternativo nei casi eccezionali di chiusura totale e complementare nei casi di flussi limitati di visitatori. L’approccio si è configurato non solo come risposta contingente all’emergenza sanitaria ma come occasione per produrre valore aggiunto 20 . 21 . 22 Pescarin 2014. 23 Clini 2017a. 24 Clini 2017b. 165UMANESIMO DIGITALE E BENE COMUNE? LINEE GUIDA E RIFLESSIONI PER UNA SALVEZZA POSSIBILE rispetto al museo on-site in termini di narrazione inedita e di documentazione dei contenuti. Questo lavoro è consistito nell’attivazione di una riflessione interdisciplinare, congiunta trai ricercatori e esperti (storici e operatori didattici) del Museo, che ha condotto il gruppo di lavoro alla messa a sistema di una progettualità più ampia rivolta alla creazione di un prodotto digitale dalle spiccate potenzialità interattive ed esperienziali. Si è così realizzato un Virtual Tour25 con panoramiche, immagini ad alta risoluzione (Fig. 4), modelli 3D e testi descrittivi, anche con commento audio, ma che fondasse un laboratorio di sperimentazione permanente atto a potenziarlo e aggiungere strumenti rivolti alla cura della relazione con le persone. In particolare i temi da approfondire nel futuro saranno: a) la accessibilità e l’inclusività, con strumenti di mediazione per vari target; b) l’intrattenimento, con un vero e proprio palinsesto virtuale; c) la fruizione multidisciplinare, integrando contenuti anche sui restauri, come già sperimentato altrove26. La grande sfida di questo approccio è tendere a dimostrare che la tecnologia possa entrare nel mondo della fruizione dei beni culturali realizzando supporti che, a partire da una corretta digitalizzazione del patrimonio quale nuova forma di tutela e conservazione, implementino la funzione educativa mettendo in atto nuovi sistemi di interazione e possibili servizi rivolti al nuovo pubblico virtuale. 2.3 Misurare In considerazione di quanto fin qui esposto, dopo aver assistito negli anni a applicazioni ICT abbandonate dalle istituzioni dopo breve tempo o non manutenute perché percepite come inefficaci, nei prossimi anni è auspicabile che gli amministratori di siti archeologici e musei intraprendano un approccio completamente digitale per gestire e comunicare i loro beni. Sebbene tecnologie all’avanguardia siano disponibili anche senza ingenti investimenti, mancano ancora metriche e possibilità concrete di misurare l’accettazione da parte dei pubblici: pochi lavori si stanno concentrando sul feedback dell’utente. 
Recently, a paper was accepted for publication27 that tests different multimedia experiences from the users' point of view in order to evaluate their engagement. The work presents a workflow for the study and analysis of users' quantitative and qualitative satisfaction with different applications dedicated to archaeology, on three different scales: landscape, museum and archaeological artefact. The results demonstrate that the proposed approach provides practitioners and art curators with significant data for analysing the user experience and, consequently, for modifying or improving their own offer.28

27 Quattrini et al. 2020.
28 Recently the Università Politecnica delle Marche organized with ICOM a webinar entitled "Misurare per crescere ed innovare. I musei italiani nelle nuove sfide digitali dopo Covid", currently available online.

The first research experiments in this direction have proved very promising (Fig. 5) and have also suggested starting joint research projects with companies, in a perspective of technology transfer. One example is the regional project C.O.ME. (change your museum – analysis of behavior, emotions and reactions of museum visitors), from which the MeMus system29 was later born, currently being tested at Palazzo Buonaccorsi in Macerata and at the Museo Omero in Ancona. The objective of the project was to improve the museum visit experience through the development and testing of a new automatic system for monitoring museum audiences. The monitoring system makes use of tracking technologies based on video streams and on the analysis of the behaviour of users and visitors, and it worked with an interdisciplinary approach, combining museological and museographic knowledge with competences in the field of technologies applied to retail intelligence and neuromarketing.
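By way of illustration only, the core of the video-based counting on which systems of this kind rest can be sketched with the open-source OpenCV library; this is a generic pedestrian-detection loop under an assumed file name, not the MeMus code.

import cv2  # OpenCV, open-source computer-vision library

# Hypothetical input: recorded video from a camera in a museum room.
capture = cv2.VideoCapture("museum_room.mp4")

# Stock HOG-based people detector, a stand-in for the more refined
# models a production system would use.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

frame_index, counts = 0, []
while True:
    ok, frame = capture.read()
    if not ok:
        break
    frame_index += 1
    if frame_index % 25 == 0:  # sample about once per second at 25 fps
        boxes, _ = hog.detectMultiScale(frame, winStride=(8, 8))
        counts.append(len(boxes))  # people detected in this frame

capture.release()
print("average visitors in view:", sum(counts) / max(len(counts), 1))

Aggregated over rooms and opening hours, time series of this kind are what allow dwell times and attraction metrics to feed the decision-support systems discussed here.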
This first phase of technology transfer is only one part of the journey; a second, complementary phase has in fact proved necessary, in which the university can continue its research and carry out a personalization and specialization of the measurement methods to suit the multifaceted cultural realities of the territories. Openness towards the socio-economic context through the valorization and transfer of knowledge, together with initiatives of socio-cultural and educational value, are among the strengths of the approach presented here. In unforeseen situations such as those arising from the serious crisis of the global pandemic, the promotion of a shared digital strategy developed over the years offers opportunities for growth for the cultural sector, highlighting how new forms of knowledge are able to activate innovative processes and opportunities for contact with users.

2.4 Training

Coming to the last element of our operational chain, it must be said that the training of new professional figures in the field of Digital Cultural Heritage is also the most significant element for giving future sustainability to digitization and innovation in an entire economic sector. Until we have professionals prepared to tackle digital transformation plans, organically embedded in cultural institutions, the whole sector will not be able to experience its new humanism. So on the one hand there is a need to train these figures, still perceived as hybrid; on the other, there is a need for a rejuvenation and strengthening of staff. On the first front, universities and research centres, in particular the groups working in Digital Cultural Heritage, are doing a great deal. In particular, it is necessary to contribute to the refinement of the expertise needed to operate in a sector in continuous updating and evolution. In the numerous collaborations already under way, this work has also proved to respond to the Third Mission of the University: our approach supports and focuses decision-making processes with important institutions, encouraging processes of construction and sharing of strategic directions: from the choice of the most suitable technologies according to the type of heritage, to the aspects of management and sustainability of the tools deployed in the medium-long term. The good practices and initiatives put in place have proved to be a starting point for necessary discussions envisaging, in accordance with the principles of resilience and sustainability, the enabling of massive survey campaigns and their integration into the policies of public institutions, with the objective of establishing new visions for the access to and understanding of cultural heritage. We may cite, for example, the Massive Open Online Course (MOOC) in Digital Cultural Heritage, developed in collaboration with DiCultHer30 and made available to secondary school teachers for their training, or the Micromaster under construction on the valorization and communication of European cultural landscapes and their railway heritage with the aid of ICT, within the RailToLand project.31 The project, an Erasmus+ Key Action 2 (Strategic Partnerships for Higher Education), aims more generally to explore the social and educational value of the European cultural landscape as a common heritage and as a catalyst of the processes of consolidation of European identity, through innovative and virtual forms of peer-to-peer learning and of Design by Thinking. Other training activities, institutionalized in the university curriculum, are developed through thesis laboratories within the degree course in Building Engineering-Architecture of Univpm, which includes a specific curriculum in Fruition and Management of cultural and architectural heritage.

3. Conclusions

The new digital humanism, as it has been explored and outlined so far, constitutes a founding concept that comes from afar, because it has always been grounded in the concept of the reproducibility of the work of art. This concept has lived on the union that every civilization has always sought between its possible science and technique and ever-eternal art. Today the extraordinary possibilities of the digital, not only to reproduce a work of art perfectly but also to bring out new artistic, aesthetic and narrative meanings in it, allow us to rewrite a new definition of digital humanism and to trace the lines of a new manifesto. How are the four levers described above implemented, and how are they linked to one another? What do technologies allow us to achieve today? How can we put the human being back at the center of their heritage? How can the new experiences of augmented and immersive reality profoundly transform the relationship between people and heritage? How is it possible today to think of a heritage dematerialized from its physical reality and available everywhere and for everyone?
Come l’umanesimo digitale può realizzare quel concetto di democratizzazione dell’arte che quasi un secolo addietro Walter Benjamin32 leggeva esattamente nella possibilità della riproducibilità dell’opera d’arte? Tale riflessione generò al contempo un equivoco da cui però è giunto il momento di uscire. Va superato quello che è senz’altro un gigantesco luogo comune, ovvero che un’opera vada vista dal vivo e che questa esperienza sia molto più ricca e coinvolgente del vederne una riproduzione virtuale. Se questo è ancora il livello del dialogo con le istituzioni culturali possiamo tranquillamente alzare bandiera bianca. Ben più complesso e foriero di innovazione per la nostra civiltà è questo umanesimo digitale che cerca nella riproducibilità del Patrimonio nuove forme di arte e di democrazia. La missione è dare credibilità scientifica, fattibilità tecnica, sostenibilità economica e dignità istituzionale a varie tipologie di servizi culturali al momento non presenti ma indispensabili per valorizzare il patrimonio come bene comune. Da qui, rafforzando il ruolo delle comunità di patrimonio virtuali, vanno forniti alle istituzioni museali strumenti chiave nell’implementazione di Piani Strategici di Trasformazione Digitale, necessari ad aggiornare la gestione di contenuti culturali di qualità incardinati su artefatti digitali 3D. Ma è tutto da costruire. Come dimostrato dalla frenesia digitale che il Covid-19 ha generato nei mesi drammatici del lockdown e dai risultati imbarazzanti di tale frenesia. Questa è la condizione per ristabilire la cultura e il patrimonio come luogo delle relazioni. Questa è la condizione per non sentire mai più l’urlo assordante del silenzio della nostra civiltà. Riferimenti bibliografici Benjamin W. (1935), Das Kunstwerk im Zeitalter seiner technischen Reproduzierbarkeit. Frankfurt/Main: Suhrkamp, 1935, trad.it.: L’opera d’arte nell’epoca della Sua Riproducibilità tecnica, Torino: Einaudi, 2013. 32 Benjamin 1935. 169UMANESIMO DIGITALE E BENE COMUNE? LINEE GUIDA E RIFLESSIONI PER UNA SALVEZZA POSSIBILE Clini P., Frapiccini N., Quattrini R., Nespeca R. (2018), Toccare l’arte e guardare con altri occhi. Una via digitale per la rinascita dei musei archeologici nell’epoca della riproducibilità dell’opera d’arte, in Ambienti digitali per l’educazione all’arte e al Patrimonio, a cura di Alessandro Lugini, Milano: Franco Angeli, pp. 97-113, . Clini P., Frontoni E., Martini B., Quattrini R., Pierdicca R. (2017b), New Augmented Reality applications for Learning by Interacting, «Archeomatica», 8, n. 1, pp. 28-33. Clini P., Frontoni E., Quattrini R., Pierdicca R., Puggioni M. (2019), Archaeological Landscape and Heritage. Innovative Knowledge-Based Dissemination and Development Strategies in the Distretto Culturale Evoluito Flaminia NextOne, «Il capitale culturale», n. 19, pp. 211-235. Clini P., Galli, Quattrini R. (2016), Landscape & Archaeology, «SCIRES-IT SCIentific RESearch and Information Technology», 6, pp. 1-6. Clini P., Quattrini R., Bonvini P., Nespeca R., Angeloni R., Mammoli R., Dragoni A.F., Morbidoni C., Sernani P., Mengoni M. (2020b), Digit (Al) Isation in Museums: Civitas Project-AR, VR, Multisensorial and Multiuser Experiences at the Urbino’s Ducal Palace, in Virtual and Augmented Reality in Education, Art, and Museums, Guazzaroni G., Pilai A.S. (eds.), Hershey: IGI Global, pp. 194-228, DOI: 10.4018/978-1-7998-1796-3.ch011. Clini P., Quattrini R., Frontoni E., Pierdicca R., Nespeca R. 
(2017a), Real/Not Real: Pseudo-Holography and Augmented Reality Applications for Cultural Heritage, in Handbook of Research on Emerging Technologies for Digital Preservation and Information Modeling, Ippolito A., Cigola M. (eds.), Hershey: IGI Global, pp. 201-227, DOI: 10.4018/978-1-5225-0680-5.ch009.
Clini P., Quattrini R., Nespeca R., Angeloni R., Mammoli R. (2020a), Digital Facsimiles of Architectural Heritage: New Forms of Fruition, Management and Enhancement. The Exemplary Case of the Ducal Palace at Urbino, in Graphical Heritage, Congreso Internacional de Expresión Gráfica Arquitectónica, Cham: Springer, pp. 571-582.
Direzione generale Musei – MIBACT (2019), Piano Triennale per la Digitalizzazione e l'Innovazione dei Musei.
EPOCH (2009), The London Charter.
Europe Council Treaty Office (2005), Council of Europe Framework Convention on the Value of Cultural Heritage for Society – Faro Convention.
European Commission (2018), A New European Agenda for Culture, Bruxelles.
European Heritage Alliance (2020), Europe Day Manifesto.
Gaiani M., Apollonio F., Clini P., Quattrini R. (2015), A Mono-Instrumental Approach to High-Quality 3D, Reality-Based Semantic Models. Application on the Palladio Library, in Digital Heritage, Vol 1/2, IEEE Computer Society.
ICOM Italia (2016), Carta di Siena 2.0.
Laboratorio per il Turismo Digitale (TDLab) (2014), Piano Strategico per la Digitalizzazione del Turismo Italiano, Roma.
NEMO, Szogs N. (2020), Digitisation and IPR in European Museums.
Nespeca R. (2018), Towards a 3D digital model for management and fruition of Ducal Palace at Urbino. An integrated survey with mobile mapping, «SCIRES-IT – SCIentific RESearch and Information Technology», 8, n. 2, pp. 1-14.
Pescarin S. (2014), Museums and Virtual Museums in Europe: reaching expectations, «SCIRES-IT – SCIentific RESearch and Information Technology», 4, n. 1, pp. 131-140.
Quattrini R., Clini P., Nespeca R., Ruggeri L. (2016), Misura e Historical Information Building: sfide e opportunità nella rappresentazione di contenuti 3d semanticamente strutturati, «DISEGNARE CON...», 9, pp. 1-11.
Quattrini R., Gasparetto F., Angeloni R., D'Alessio M. (2019), Modelli digitali per comunicare il patrimonio e l'intervento di restauro, «Archeomatica – Tecnologie per i Beni Culturali», 10, n. 3, pp. 24-27.
Quattrini R., Pierdicca R., Paolanti M., Clini P., Nespeca R., Frontoni E. (2020), Digital Interaction with 3D Archaeological Artefacts: evaluating user's behaviours at different representation scales, «Digital Applications in Archaeology and Cultural Heritage», e00148.

Appendix

Fig. 1. The digital, the new heritage. Towards a manifesto for digital cultural heritage as a Common Good and for a new form of art: the four main levers.

Fig. 2. The digital, the new heritage. Science and Technique to found a new civilization. The acquisition chain leading to the creation of 3D digital facsimiles.

Fig. 3. The digital, the new heritage. Places of art everywhere and for everyone. Studiolo del Duca, Palazzo Ducale di Urbino.
Immersive Virtual Reality experience developed within the CIVITAS Project, based on the creation of the very-high-resolution 3D facsimile through the integration of laser scanning and photogrammetry, based on polarized-light shots.

Fig. 4. The digital, the new heritage. What the eyes do not see. Madonna col Bambino (Madonna and Child) by Carlo Crivelli, gold and tempera on panel, 21x15 cm, datable to 1480. The results of the High Resolution (HR) acquisition of the painting and an enlargement of it.

Fig. 5. The digital, the new heritage. An art tailored to each person. The technologies for measuring users' behaviour and appreciation in museum spaces and for creating decision-support systems for cultural institutions.

work_4ojs2dcanfbc7dnknzsvb2hpmi ----

REVIEW ESSAY
Digital Humanities and the Discontents of Meaning

Michael A. Fuller*
University of California, Irvine
*Corresponding author. E-mail: mafuller@uci.edu
(Received 23 July 2019; revised 5 December 2019; accepted 25 February 2020)
Journal of Chinese History (2020), 4, 259–275. doi:10.1017/jch.2020.13
© Cambridge University Press 2020. Open Access article distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/).

The digital humanities offer more than just a set of tools.
The application of software to assist in the analysis of large collections of data does not just expand the volume of material we can incorporate in our work, it also expands how we in the humanities understand the nature of meaning. The recent scholarly turn to the expanded modes of analysis made possible by DH is not just the "latest new thing" but gives the scholarly community a way to articulate and respond to long-standing doubts about the epistemological grounding in the practice of the humanities. Even more importantly, I believe that this broadening of inquiry afforded by DH is intrinsic to the humanistic project itself. In this essay, I seek in particular to connect the implicit conceptual substructure behind the architectural logic of the digital humanities to key strains of hermeneutic thought that have established a basis for exploring the question of how we are to understand the vast, variegated world of historical human experience that is the object of our humanistic inquiries across disciplines.

The Retreat of Meaning

How do the digital humanities provide an epistemological model for thinking about the human? I would argue that the digital humanities live within the impasse of the failure of language that has long shadowed humanistic study, but offer compelling ways of thinking about meaning within that impasse. This reorientation becomes clearer when we consider the digital humanities within the context of the epistemological predicament of the humanities in the past half century.

The failure of language—the simple truism that words are just words, signifiers not securely attached to things—has long presented an epistemological problem in the humanistic interpretation of texts. In the past fifty years in particular, scholars have intensely debated the question that if all we have are words—and if our mode of thinking about these words is through systems of yet more words—how can we reach beyond words to the things themselves? Of course not all our understanding of the past relies on words: we have the substantial resources of artifacts from fragments of textile to pottery and metal castings, to human bones, to extensive archeological sites. Still, many aspects of the past—and in particular, the intellectual, affective, and aesthetic dimensions—remain inaccessible without relying on textual corpora. Without a theory of reference to connect texts firmly to the world, the meanings we can justifiably draw from textual evidence become suspect.
The story of the unmooring of language is well-known: structuralism gave us a synchronic account of meaning in language. Its logic of mutual differentiation among signifiers offered important insights into the working of texts, but it also led to the vision of language as a closed, endless mirroring of words. Substantive links between signifiers and a signified world outside of language dissolved into an infinite deferral of meaning. Although the problem of the retreat of reference has deep roots in the Western philosophical tradition, with the vanishing of reference to ground meaning, we have been forced to confront the question: if the world does not shape the structure of language, then what does?

One answer derives from the isolated subjectivity of the individual reader. Words and texts mean to me what they mean to me, without access to larger structures other than those I supply. This is how we read unreflectingly most of the time, and its connoisseurship and inner world of responses can be very satisfying. This certainly is how most of my students read, and they are happy within an asserted subjectivity of meaning which admits no further analysis. However, a second set of responses has evolved. What is often collectively deemed the "hermeneutics of suspicion" unravels this isolated subjectivity of meaning and posits the individual as a node within systems of difference shaped by the structuring of power. Compelling critiques of institutional racism and sexism and the ways in which they are embedded in and structure our shared discourses offer ample evidence to support a vision of the ideological structuring of meaning. Critical theory both in literary studies and in the social sciences offers modes of revealing the duplicity and evasions of texts and showing that texts that claim to assert truths can be revealed as relying on concealed predications. While such analyses remain a crucial check on complacency in scholarly as well as broader public discourse, by their nature their reflections remain within and demonstrate in myriad ways the failures of language rather than offering deeper knowledge of the human. Aware of the limits of critique, scholars in critical theory increasingly have come to echo Bruno Latour's query in "Why Has Critique Run Out of Steam? From Matters of Fact to Matters of Concern."1 Indeed, the movement toward "postcritique" discussed by Rita Felski and others is an important index of the discontents of meaning and of a desire in humanistic disciplines to develop perspectives and methods that move beyond the impasses of failures of language.2

1 Bruno Latour, "Why Has Critique Run Out of Steam? From Matters of Fact to Matters of Concern," Critical Inquiry 30.2 (2004), 225–48.
2 For example, see Rita Felski, The Limits of Critique (Chicago: University of Chicago Press, 2015) and Elizabeth S. Anker and Rita Felski, eds., Critique and Postcritique (Durham: Duke University Press, 2017).

The Challenge of Scientific Insights into the Human

While scholars in the humanities have been sharpening their modes of critical analysis, a second, less visible trend outside of the humanities has been rapidly gaining strength in the pragmatic study of language, perception, and cognition that treats the structuring of human experience as an object of scientific inquiry. The success of neural networks and ever more pervasive forms of artificial intelligence recently has caught the public imagination and seemingly rendered scholarship in the humanities yet more irrelevant to the exploration of human meaning. Thus, humanistic scholarship—either in the traditionalist mode of individual sensibility or in the contemporary mode of social critique—has little standing to speak to the larger patterns and deeper meanings of human experience.
Other disciplines—now largely in the sciences—are stepping in to provide insights into the human. Geoffrey Harpham, the former director of the National Humanities Center, lamented in 2006: "One of the most striking features of contemporary intellectual life is the fact that questions formerly reserved for the humanities are today being approached by scientists in various disciplines such as cognitive science, cognitive neuroscience, robotics, artificial life, behavioral genetics and evolutionary biology."3 This story of the crisis in the humanities has been presented—and lamented—many times and in innumerable variations in recent years and is in no way new. However, as I suggested earlier, it is a story that may have a happy—even if unlooked for—ending in which DH plays a part. Harpham continues in his essay, "Science and the Theft of Humanity," which I quote at length:

Humanists, who have been only partially aware of the work being done by scientists and other nonhumanists on their own most fundamental concepts, must try to overcome their disciplinary and temperamental resistances and welcome these developments as offering a new grounding for their own work. They must commit themselves to be not just spectators marveling at new miracles, but coinvestigators of these miracles, synthesizing, weighing, judging and translating into the vernacular so that new ideas can enter public discourse.

They—we—must understand that while scientists are indeed poaching our concepts, poaching in general is one of the ways in which disciplines are reinvigorated, and this particular act of thievery is nothing less than the primary driver of the transformation of knowledge today. For their part, those investigating the human condition from a nonhumanistic perspective must accept the contributions of humanists, who have a deep and abiding stake in all knowledge related to the question of the human.

We stand today at a critical juncture not just in the history of disciplines but of human self-understanding, one that presents remarkable and unprecedented opportunities for thinkers of all descriptions. A rich, deep and extended conversation between humanists and scientists on the question of the human could have implications well beyond the academy. It could result in the rejuvenation of many disciplines, and even in a reconfiguration of disciplines themselves—in short, a new golden age.

I share both Harpham's optimism and his call for us humanists to look to the sciences to provide at least partial grounding for our work. In particular, engagement with the neuroscience of memory, emotion, language, and selfhood can deepen humanistic reflection on the patterns of human experience.

3 Geoffrey Harpham, "Science and the Theft of Humanity," American Scientist No. 4 (July–August 2006), 296–98, cited in Geoffrey Rockwell and Stéfan Sinclair, Hermeneutica: Computer-Assisted Interpretation in the Humanities (Cambridge, MA: MIT Press, 2016), 20.
However, there is a yet broader and more profound conceptual shift at work, of which whatever neuroscience can tell us is just a part. Words and texts are traces of human action and accordingly participate in the broader patterns of life as humans live it. The sciences can help us with crucially important creaturely dimensions of experience—help us understand the biological mechanisms of memory, affect, and language production—but situating and understanding texts within the human world built upon these basic processes return us to the humanistic discipline of hermeneutics, of which, I argue below, the digital humanities are a modern technological incarnation. The scholars who shaped modern Western hermeneutics, born in the aftermath of Kant's Copernican revolution, confronted the problem of understanding the religious, philosophical, and literary legacy of the past without access to timeless essences. Instead, they had to develop theories and methods to allow them to extract understanding from the totality of the evidence at hand. We confront the same problem—with similar hopes—in the practice of the digital humanities. Thus I turn next to describe the anti-foundational approach to language in DH, connect it to Wittgenstein's proposal of linguistic meaning defined through usage, and then trace Wittgenstein's model for understanding back to the hermeneutic lineage through Wilhelm Dilthey to Friedrich Schleiermacher. Having considered theoretical models for the broad integration of data from lived experience in the hermeneutic tradition, I return to propose that these integrative models implicitly shape the emerging approaches to the digital humanities and in fact complement the new approaches to thinking about human perception, memory, emotion, selfhood and meaning being developed through research in neuroscience and evolutionary biology.

The Digital Humanities

The digital humanities play a critical role in the gradual opening up of the humanities to the broader interpretation of the human because they embody and articulate a different understanding of the nature of meaning. I begin by returning to the issue of textual meaning, in particular to the central problem of how words mean: even if we stay for the moment in Saussure's structuring of signifiers through mutual differentiation, the paradigms of the digital humanities do not find language either condemned to the infinite regress of critical theory or an order built upon ideology.

Topic modeling, one of the most familiar techniques in the digital humanities, provides a clear example of the modeling of meaning in DH.
Explaining the concept presents a challenge because the very phrase "topic modeling" all too easily misleads those who are without the technical knowledge of what these "topics" are and the mathematics by which they are derived. Without that frame, people seem to assume that the "words" in topic-modeling systems rely in some way on the semantic structure of language. Instead, in topic modeling, words as signifiers are not only cut off from any possible signified content but also from the entire system of mutual differentiation that defines the signifiers of a language. I believe that a brief introduction to the basic elements of linear algebra upon which topic modeling is built will go a long way to help clarify the logic of meaning as defined within the set of conceptual structures associated with topic modeling.

In topic modeling, one begins with a collection of texts. In the most common approach, the order of the words does not matter, and each document is considered simply an unordered "bag of words." Moreover, the words in the texts are meaningless tokens, just strings of bytes. The goal of topic modeling then is to build a system of mutual differentiation relying only on the collection of the bag of words in the corpus to be analyzed.

This mathematized version of meaning and structure is unfamiliar to most humanists, but grasping it is a key to seeing how the digital humanities synthesize the vast corpora of data into a new world of empirically organized human connections. My aim at this juncture is to explain the basic mathematics in topic modeling to demystify the structure of meaning defined by topic modeling and related paradigms.

Texts as Matrices of Meaning

The first version of a structured representation of words within the particular domain of selected texts is simply a large matrix with the dimensions determined by the number of documents and the total number of different words in that collection of documents.4 The value for each element in the matrix is the frequency ( f ) of a given word (w_i) in a given document (d_j). Thus with D documents and N different words, we have a matrix:

$$M = \begin{bmatrix} f(w_1 d_1) & \cdots & f(w_N d_1) \\ \vdots & \ddots & \vdots \\ f(w_1 d_D) & \cdots & f(w_N d_D) \end{bmatrix}$$

Any particular document is defined as an array of words, and any word is defined as its frequency in the array of documents. If we had a thousand documents with ten thousand different words, we would have a matrix with 10 million entries (1,000 documents x 10,000 words). In topic modeling, one looks for a way to change this very large matrix into the product of two smaller matrices such that there are K number of topics (t_k) where each document can now be described as an array of topics, and each topic can be described as an array of words:

$$M \approx \mathrm{DocumentTopics}_{D \times K} \times \mathrm{TopicsWords}_{K \times N}$$

4 A matrix is a mathematical formalism from the discipline of linear algebra. An m x n matrix is a set of numbers arranged in m rows, with each row having a set of n numbers. The n values all line up to produce n columns, each with m values. Thus the following is a 2 x 3 matrix: $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$. It has 2 rows, (1,2,3) and (4,5,6), and three columns, (1,4), (2,5), and (3,6). The sets of numbers for the rows and columns in turn define vectors, which define positions in multidimensional spaces. The row-vectors in the example are 3-dimensional, while the column-vectors are 2-dimensional. An m x n matrix defines a specific way of mapping n-dimensional vectors into m-dimensional space (i.e., turn n-dimensional vectors into m-dimensional vectors).
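As an illustration only (the article itself contains no code), the following minimal Python sketch builds such a document-term matrix from an invented toy corpus; the corpus and variable names are my own assumptions, not part of the article.

```python
# A minimal sketch (not from the article): building the D x N
# document-term matrix described above, where entry [j][i] is the
# frequency f of word w_i in document d_j. The toy corpus is invented.
from collections import Counter

documents = [
    "the river flows past the mountain",
    "the poet watches the river",
    "mountain mist hides the poet",
]

bags = [Counter(doc.split()) for doc in documents]           # unordered "bags of words"
vocabulary = sorted({word for bag in bags for word in bag})  # the N distinct words

M = [[bag[word] for word in vocabulary] for bag in bags]     # D rows, N columns

for row in M:
    print(row)
```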
The advantage of this factoring of the matrix into two separate matrices is that if one has 100 topics, then the DocumentTopics matrix has 100,000 elements (100 topics x 1,000 documents) and the TopicsWords matrix has 1,000,000 entries (100 topics x 10,000 words), for a grand total of 1,100,000 entries. Finding a matrix of 100 topics allows us to condense the initial data by a factor of about 10.

So far this looks like a mathematical trick, but what does a "topic" then mean, given the math that defines it? The words in the set of documents are not randomly distributed. They have an internal logic within the collection of documents, and "topics" capture the regularities in the appearance of the words. Words cluster together, and the topics represent those clusters. People working with topic modeling stress that the mathematical tools that find the two topic matrices are "semantically naïve": they know nothing about the meaning of the words; they just manipulate them. Topic modeling, however, does construct a new version of meaning for the tokens (words) in the system. While Mallet, the standard package used in the humanities for topic modeling, uses LDA (Latent Dirichlet Allocation) which is based on Bayesian probability, another, simpler approach uses a form of non-negative matrix factoring called PLSA (Probabilistic Latent Semantic Analysis).5

5 Wikipedia is in fact a perfectly good source for learning the basics of these models; see "Probabilistic latent semantic analysis," https://en.wikipedia.org/wiki/Probabilistic_latent_semantic_analysis; "Non-negative matrix factorization," https://en.wikipedia.org/wiki/Non-negative_matrix_factorization; and "Latent Dirichlet allocation," https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation.
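The article names Mallet's LDA and PLSA as the usual tools; purely as an illustration of the factoring itself, here is a sketch using scikit-learn's NMF (non-negative matrix factorization, the family to which PLSA belongs). scikit-learn and the random stand-in counts are my assumptions, not the article's workflow.

```python
# A sketch, not the article's method: factoring a document-term matrix
# M into DocumentTopics (D x K) and TopicsWords (K x N). Random counts
# stand in for a real corpus; the scale is kept small here, but the
# arithmetic matches the 1,000 x 10,000 example above.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
D, N, K = 200, 1000, 20
M = rng.poisson(0.1, size=(D, N)).astype(float)

model = NMF(n_components=K, init="nndsvda", max_iter=400)
doc_topics = model.fit_transform(M)    # D x K: each document as an array of topics
topic_words = model.components_        # K x N: each topic as an array of words

print(doc_topics.shape, topic_words.shape)  # (200, 20) (20, 1000)
```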
The Semantics of Matrices

Where does the semantics—the assignment of meaning—come in? Recall that the columns in the new matrix define the topics as arrays of words:

$$\mathrm{topic}_i = ( f(t_i w_1), \ldots, f(t_i w_j), \ldots, f(t_i w_N) ) \quad \text{[column vector]}$$

At the same time, however, the rows in the matrix define the words through the topics in which they participate:

$$\mathrm{word}_j = ( f(t_1 w_j), \ldots, f(t_i w_j), \ldots, f(t_K w_j) ) \quad \text{[row vector]}$$

Since the topics themselves are mathematical constructs, defining the meaning of words as a vector of the weights they contribute to defining the array of topics may seem extremely abstract. However, we then take a next step of comparing the similarity of the words as defined by their role in the topics. The simplest approach is to take the normalized dot product of the two word vectors:6

$$\mathrm{similarity}(\vec{w}_i, \vec{w}_j) = \frac{\vec{w}_i \cdot \vec{w}_j}{\lVert \vec{w}_i \rVert \, \lVert \vec{w}_j \rVert}$$

If the two word vectors are identical, the similarity = 1, and if they have no overlap at all, the value is 0. One then can use these similarity values to generate a hierarchical clustering analysis visualized as a tree graph (dendrogram) with branches that split in ever finer groupings to represent the clustering of words that share similarities in meaning. The point to stress here is that this clustering of words by similarity is relative to a specific collection of texts. A different set of texts would produce a different clustering.

This move from a collection of texts as bags of words to the combination of (1) the texts defined as arrays of topics and (2) the topics defined as arrays of words, and then to a hierarchical clustering of the words based on their similarity is a form of distributional semantics in which the meanings of words are defined through the patterns of their usage within a corpus. Even though the words remain a system of signifiers, this analytic approach from the digital humanities (originating in linguistics) takes us very far away from the enclosed world of poststructuralist analysis and much closer to the structuring of meaning in a textual corpus. That is, the algorithms here, seemingly a set of mathematical functions, in fact embody and articulate a model for meaning defined through usage.

6 The dot product of two vectors is a single number (a scalar) calculated by taking the sum of each value in the first vector multiplied by the corresponding value in the second vector, i.e. $a \cdot b = \sum_{i=1}^{m} a_i b_i$. The norm (the length) of a vector is the square root of the dot product of the vector times itself, i.e., $\lVert a \rVert = \sqrt{a \cdot a}$. A vector defined as $a / \lVert a \rVert$ is a unit-vector (i.e., with a length of 1) in the direction of a.
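A sketch of the similarity-and-clustering step described above (again my illustration: the word labels and random topic weights are invented stand-ins for a real factorization, and numpy and scipy are assumed dependencies):

```python
# A sketch of the clustering described above: each word is a vector of
# topic weights; words are compared by the normalized dot product
# (cosine similarity) and then grouped by hierarchical clustering.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(1)
words = ["river", "mountain", "mist", "poet", "court", "examination"]
word_vectors = rng.random((len(words), 20))   # each word as K = 20 topic weights

unit = word_vectors / np.linalg.norm(word_vectors, axis=1, keepdims=True)
similarity = unit @ unit.T        # 1.0 = identical direction, 0.0 = no overlap

tree = linkage(word_vectors, method="average", metric="cosine")
dendrogram(tree, labels=words, no_plot=True)  # set no_plot=False to draw the tree

print(np.round(similarity, 2))
```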
Meaning as Usage and the Hermeneutics of "Forms of Life": Wittgenstein and Dilthey

Some authors relate this meaning-as-usage to Ludwig Wittgenstein's famous dictum "For a large class of cases of the employment of the word 'meaning'—though not for all—this word can be explained in this way: the meaning of a word is its use in the language."7 Thus Wittgenstein, like distributional semantics, assigns meaning according to usage.

Wittgenstein arrived at his view of language when he confronted the failure of the more substantive model of traditional philosophy. In the model of logic inherited from Frege but with roots leading back to Plato, words refer to objects, and the truth of propositions relies on the correctness of the relationships they describe in the world. Wittgenstein rejected this appeal to reference to ground meaning and instead came to argue that the meaning of language comes simply and modestly from how humans use language. Wittgenstein's account, in other words, is anti-foundational: it rejects the possibility that human access to objects can serve as the foundation of knowledge and of language. Wittgenstein, encountering the failure of reference to provide meaning, developed an empirical response based on actual experience. He proposed language-games as the locus of meaning. Facing the same failure of reference, we now have turned to the digital humanities' ability to survey the vast corpora that document the human use of language.

Like previous scholars, however, I suggest that in assigning meaning to usage Wittgenstein was echoing—and perhaps drawing on—an earlier, broader tradition of interpretation in German hermeneutics from Schleiermacher to Dilthey. This hermeneutic tradition is of great significance to our understanding of the digital humanities as modes of exploring meaning.8 The specific link between Wittgenstein and Dilthey in particular is in the concept of "forms of life" that Wittgenstein introduces—but does not expand on—in the Philosophical Investigations, the central work of his late career. Wittgenstein sees communication as possible because people are participating in the same language-games, but the question arises of how it is possible that people are playing the language-game in the same way, since rules cannot possibly specify all the variations allowable in a language-game. Agreement becomes possible, Wittgenstein argues, because people share a "form of life":

Here the term "language-game" is meant to bring into prominence the fact that the speaking of language is part of an activity, or a form of life.9

"So are you saying that human agreement decides what is true and what is false?"—It is what human beings say that is true and false; and they agree in the language they use. That is not agreement in opinions but in forms of life.10

7 Ludwig Wittgenstein, Philosophical Investigations, translated by G.E.M. Anscombe (Oxford: Basil Blackwell, 1976), 21.
8 See Karl-Otto Apel, "Wittgenstein and the Problem of Hermeneutic Understanding," in Ludwig Wittgenstein—Critical Assessments, Vol. 4, ed. Stuart Shanker (London: Routledge, 1986) and Nicholas F. Gier, Wittgenstein and Phenomenology: A Comparative Study of the Later Wittgenstein, Husserl, Heidegger, and Merleau-Ponty (Albany: SUNY Press, 1981).
9 Wittgenstein, Philosophical Investigations, 11.
10 Wittgenstein, Philosophical Investigations, 88.
This grounding of agreement in shared forms of life resembles Wilhelm Dilthey's stress on the role of "objective mind" in providing a basis for mutual understanding. The citation here is long but important:

I have shown how significant the objective mind is for the possibility of knowledge in the human studies. By this I mean the manifold forms in which what individuals hold in common have objectified themselves in the world of the senses. In this objective mind, the past is a permanently enduring present for us. Its realm extends from the style of life and the forms of social intercourse to the system of purposes which society has created for itself and to custom, law, state, religion, art, science and philosophy. For even the work of genius represents ideas, feelings and ideals commonly held in an age and environment. From this world of objective mind the self receives sustenance from earliest childhood. It is the medium in which the understanding of other persons and their life-expressions takes place: For everything in which the mind has objectified itself contains something held in common by the I and the Thou. Every square planted with trees, every room in which seats are arranged, is intelligible to us from our infancy because human planning, arranging and valuing—common to all of us—have assigned a place to every square and every object in the room. The child grows up within the order and customs of the family which it shares with other members and its mother's orders are accepted in this context. Before it learns to talk, it is already wholly immersed in that common medium. It learns to understand the gestures and facial expressions, movements and exclamations, words and sentences, only because it encounters them always in the same form and in the same relation to what they mean and express. Thus the individual orientates himself in the world of objective mind. This has an important consequence for the process of understanding. Individuals do not usually apprehend life-expressions in isolation but against a background of knowledge about common features and a relation to some mental content.11

Dilthey's argument is that we humans manifest our internal intentions in our actions and change the phenomenal world based on them. Although those intentions are not knowable in themselves, the world into which we are born and in which we live is shaped by the long history of intentional human structuring, and we learn to speak, act and think through the mediation of these humanly shaped forms. These are precisely Wittgenstein's forms of life. Because we share these forms, we understand one another. And if we seek to understand people from a different time or place, we need to understand the context of "objective mind" through which they thought and wrote. This is the hermeneutic project for Dilthey.

Dilthey's approach of seeking the totality of the mind-built world in which a person lived—a world of explicit traces in the sensory realm that mediate intentionality—was, like Wittgenstein's, a response to skepticism and the failure of meaning.
As Dilthey explained:

Today hermeneutics enters a context in which the human studies acquire a new, important task. It has always defended the certainty of understanding against historical skepticism and wilful subjectivity; first when it contested allegorical interpretation, again when it justified the great Protestant doctrine of the intrinsic comprehensibility of the Bible against the scepticism of the Council of Trent, and then when, in the face of all doubts, it provided theoretical foundations for the confident progress of philology and history by Schlegel, Schleiermacher and Boeckh. Now we must relate hermeneutics to the epistemological task of showing the possibility of historical knowledge and finding the means for acquiring it.12

Focusing on what is given, "the expression of what is expressed" in Dilthey's words, and grasping its meaning in a disciplined re-living through one's own experience makes understanding possible:

In such understanding, the realm of individuals, embracing men and their creations, opens up. The unique contribution of understanding in the human studies lies in this; the objective mind and the power of the individual together determine the mind-constructed world. History rests on the understanding of these two.13

Dilthey's goals here are very broad: he seeks to understand all of human experience through careful reflection on the manifest sedimentation of human intentions "from the style of life and the forms of social intercourse to the system of purposes which society has created for itself and to custom, law, state, religion, art, science and philosophy." More crucially, he asserts that only by such a broad reflection can one hope to understand another person or another time. Dilthey's model for humanistic interpretation extends to aspects of human social organization that go far beyond the humanities and sees human production as drawing its material and meaning from all forms of the "objective mind." This understanding of human production—and thus of the humanities—as part of a broader matrix of meaning provides the underpinnings for the project of the digital humanities to search out regularities from the world of materials in which the objects of our inquiries are embedded. However, Dilthey's hermeneutics is so abstract that it does not offer concrete methodological models. For humanists, the model of Friedrich Schleiermacher, whose work on textual interpretation Dilthey extended and generalized, is more directly relevant.

11 Wilhelm Dilthey, Draft for a Critique of Historical Reason, translated in Kurt Mueller-Vollmer, ed., The Hermeneutics Reader: Texts of the German Tradition from the Enlightenment to the Present (New York: Continuum Press, 1985), 155.
12 Dilthey, Draft for a Critique of Historical Reason, 162.
13 Dilthey, Draft for a Critique of Historical Reason, 158.

Schleiermacher and Textual Hermeneutics

Schleiermacher, considered the father of both modern hermeneutics and modern Protestantism, was primarily a theologian, but he was deeply interested in the problem of understanding.
The pressing task for him was to be sure that he understood the New Testament correctly, but he also was an innovative and acclaimed translator of Plato whose translations are still used today in Germany.14 The story of the reasons behind his approach to hermeneutics gets very complicated very quickly, but it is nonetheless worth exploring, for the epistemological problematic that drove Schleiermacher's approach to understanding and the solutions he proposed for finding knowledge within that problematic are directly relevant to the situation of the humanities today and the role of digital humanities in reasserting the claims of humanistic knowledge.

Schleiermacher was part of the group of early German Romantic writers who were endeavoring to find ways to respond to Immanuel Kant's critical philosophy. Kant argued that not only do we not have access to objects in the world; we do not have inner access to our own self as the ground for experience. All we can know is within a phenomenal realm that is shaped in a priori ways by categories of perception we bring to the world to make experience possible. Among the major categories Kant proposed were subject-and-object, time-and-space, and cause-and-effect. We have no right to assume that these categories are actually part of the world, but we cannot experience the world without them. The post-Kantian early Romantic writers essentially worked within this epistemological critique that called metaphysically grounded foundational knowledge into question. Schleiermacher actually went Kant one better. While Kant asserted the necessity of the particular a priori categories of his analysis, Schleiermacher considered them to be as shaped by the same constraints of time and place as all other provisional human knowledge. Thus, while Schleiermacher was a theologian, he was a post-Kantian theologian whose approach to the religious understanding of the New Testament complemented his hermeneutic approach to texts.

For Schleiermacher, the religious element in human experience lay in the capacity to have intuitions about unity that preceded any conceptual understanding of what that unity might be. This capacity is without any additional specific content. Thus, to understand any particular form of religious practice, one cannot bring any presumed content to one's observations and instead must rely on an understanding of the logic of practice within the particular community. The question, then, is how one understands another human community, once one sets aside access to universal truths.

14 For the connection between Schleiermacher's efforts as a translator and his hermeneutic theorizing, see, for example, Theo Hermans, "Schleiermacher and Plato, Translation and Hermeneutics," in Friedrich Schleiermacher and the Question of Translation, eds. Larisa Cercel and Adriana Serban (Berlin: De Gruyter, 2015), 77–106.
In particular, how is one to understand the Christianity as given in the New Testament without recourse to received dogma? Schleiermacher's hermeneutics provided his answer.

For Schleiermacher, hermeneutics—the art of understanding—had two components. He asserted:

5. As every utterance has a dual relationship to the totality of the language and the whole thought of its originator, then all understanding also consists of the two moments: of understanding the utterance as derived from language, and as a fact in the thinker.15

Schleiermacher elaborated on these two moments:

5.3. According to this, each person, on the one hand, is a location in which a given language forms itself in an individual manner; on the other, their discourse can only be understood via the totality of the language. But then the person is also a spirit which continually develops, and their discourse is only one act of this spirit in connection with the other acts.16

15 Friedrich Schleiermacher, Hermeneutics and Criticism and Other Writings, translated and edited by Andrew Bowie (Cambridge: Cambridge University Press, 1998), 8.
16 Schleiermacher, Hermeneutics and Criticism, 8–9.
Schleiermacher, working within strong prohibitions against foundational knowledge of either the self or the world, turned to synthesizing the patterns of the details of what we can know of the middle realm, the world as given to human experience. In reading texts, Schleiermacher required a difficult, disciplined synthesis of broad and deep knowledge and intuitions binding the disparate forms of information into that moment of synthesis. His approach here entirely recasts the distinction between close reading and the various methodologies in the digital humanities that come under the rubric of distant reading.18 Each text is abuzz with patterns—patterns of lan- guage usage as well as the patterns of historical and social interaction that shaped the author in its writing. The modes of distant reading powerfully search through textual corpora on a scale that humans cannot hope to match, and provide a background of linguistic behavior for the texts we read. At the same time, as Schleiermacher pointed out, the particular texts we engage are also moments in human experience. On the one hand, the authors writing them are embedded in the structures of their society, culture, and language, and on the other, their writings diverge from the givenness of these struc- tures and reflect particular intentions at a particular time and place. We need sensitivity to discern these divergences. Thus close reading remains vital, but given the growing availability of distant readings, the demands placed on close reading change. Close 17Schleiermacher, Hermeneutics and Criticism, 11. 18For a useful account of distant reading, see S. Jänicke, G. Franzini, M. F. Cheema, and G. Scheuermann, “On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges,” in Eurographics Conference on Visualization, eds. R. Borgo, F. Ganovelli, and I. Viola (EuroVis) (2015), www.informatik. uni-leipzig.de/~stjaenicke/Survey.pdf. 270 Michael A. Fuller h tt p s: // d o i.o rg /1 0. 10 17 /j ch .2 02 0. 13 D o w n lo ad ed f ro m h tt p s: // w w w .c am b ri d g e. o rg /c o re . C ar n eg ie M el lo n U n iv er si ty , o n 0 6 A p r 20 21 a t 01 :2 8: 40 , s u b je ct t o t h e C am b ri d g e C o re t er m s o f u se , a va ila b le a t h tt p s: // w w w .c am b ri d g e. o rg /c o re /t er m s. https://www.informatik.uni-leipzig.de/~stjaenicke/Survey.pdf https://www.informatik.uni-leipzig.de/~stjaenicke/Survey.pdf https://doi.org/10.1017/jch.2020.13 https://www.cambridge.org/core https://www.cambridge.org/core/terms reading opens out when we see the text as the locus of synthesis for all the myriad pat- terns disclosed by distant readings of which the text is a part (the modern version of Schleiermacher’s project). Consider some of the options offered by distant reading. We already have seen how distributional semantics can provide hierarchical clustering for the language in a corpus of texts. We can apply this approach to particular genres of history, literature, and reli- gious and philosophical discourse within a given period to see if there are distinctive patterns of usage for each genre of which we need to be aware as we read. We can com- pare usage within a genre across time, with obvious examples like the scholarly notes (biji 筆記) from the Tang to the Southern Song. Stylometry for Classical Chinese is still a work in progress, but we can sharpen its techniques as we explore the rise and complex dispersions of genres. 
Here I would cite Paul Vierthaler's exploration of various subgenres of Chinese fiction in "Fiction and History: Polarity and Stylistic Gradience in Late Imperial Chinese Literature."19 Similar approaches can be applied to the whole range of informal writings in the Song or between early and later biji as the genre develops. Having aggregated data, we can see the distinctiveness of particular texts as identified by their divergence from the collective metrics.

Another important development in distant reading is the effort to identify intertextuality, as in the work on Latin texts by Walter Scheirer and others in their essay "The sense of a connection: Automatic tracing of Intertextuality by meaning."20 (I confess I am astonished at how well their approach works given the state of the tools they bring to it.) This question of intertextuality is vitally important for the reading of Chinese literati texts, in particular, since so much of the connoisseurship in the close reading of Chinese poetry and prose from the Song dynasty onward is in the identification of allusive reference.21 In the tradition, this search for allusions appears to be an effort to assert mastery and control the meaning of texts, but it relies on a rather arbitrary methodology. I will be very interested to see what large-scale intertextuality studies turn up and how those results will complicate close reading.

What I present here is just a very partial list of the role of distant reading in giving us important information about word-usage, genre, and intertextuality. There is much more that I have not seen and yet more in the offing, where scholars are still playing with the possibilities of current tools and learning new approaches to tagging texts that then can be used to greatly extend the power of the basic techniques we now have. These are exciting times, and it is my hope that we never will be able to look at texts the same way again. As we read and interpret, they will be deeper and more demanding because of the work of the digital humanities.

19 Paul Vierthaler, "Fiction and History: Polarity and Stylistic Gradience in Late Imperial Chinese Literature," Cultural Analytics May 23, 2016, DOI: 10.22148/16.003.
20 Walter Scheirer, Christopher Forstall, and Neil Coffee, "The Sense of a Connection: Automatic Tracing of Intertextuality by Meaning," Digital Scholarship in the Humanities 31.1 (2016), 204–17.
21 Donald Sturgeon, as part of his important collection of digital texts, has created a platform for looking at text reuse that will be an important tool for exploring intertextuality in pre-modern Chinese texts. See his discussion of the tool in Donald Sturgeon, "Digital Approaches to Text Reuse in the Early Chinese Corpus," Journal of Chinese Literature and Culture (JCLC) 5.2 (2018), 186–213. From this special issue of JCLC devoted to digital humanities, also see Yi-long Huang and Bingyu Zheng, "New Frontiers of Electronic Textual Research in the Humanities: Investigating Classical Allusions in Chinese Poetry through Digital Methods," JCLC 5.2 (2018), 411–37.
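The systems cited above are far more sophisticated, but a toy baseline conveys the basic mechanics of tracing text reuse; the n-gram approach and the sample lines below are my own illustration, not the cited authors' methods.

```python
# A toy baseline, not the cited systems: flagging possible text reuse
# between two passages by the overlap of their character n-grams.
# Character 3-grams are a simple starting point for unsegmented
# Classical Chinese; the sample lines are invented for illustration.
def ngrams(text, n=3):
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def reuse_score(a, b, n=3):
    """Jaccard overlap of the two passages' character n-gram sets."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return len(ga & gb) / len(ga | gb) if (ga or gb) else 0.0

passage_a = "明月松間照清泉石上流"
passage_b = "明月松間照人閒桂花落"
print(round(reuse_score(passage_a, passage_b), 3))
```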
The China Biographical Database (CBDB) as Hermeneutic "Technical" Interpretation

The digital techniques I have mentioned so far all center on what Schleiermacher called the "grammatical" mode of hermeneutics. In contrast, Schleiermacher's "technical" moment focuses on the question of "how the author arrived at the thought from which the whole developed, i.e. what relationship does it have to his whole life and how does the moment of emergence relate to all other life-moments of the author."22 Thus the central challenge of the psychological component of hermeneutics is how to understand the larger life patterns within which an individual (or an era) lives.

The goal of presenting a systematic analysis of the "human sphere" informing life in premodern China drives the China Biographical Database project, where I am the chief data architect. Our goal is to systematically collect what data we can on the key structures shaping social experience in pre-modern China. We are acutely aware of data that we cannot collect and the limits of what we offer. For example, parish records in England allow historians to inventory people's worldly goods, but there are no corresponding sources of information for China. The information we have also is strongly biased toward the elite stratum that shows up in histories, including local gazetteers. Careful sifting of Buddhist and Daoist records allows us to know something about the lives of monks, but we know next to nothing about merchants, farmers, and artisans throughout Chinese history. These lacunae seriously distort what we can know of social experience in pre-modern China. Fortunately, it turns out that most of the extant authors from pre-modern China were from the elite stratum about which we can say a good deal.

The extant historical record allows us to track a range of important social and institutional systems that structured elite life in pre-modern China. Kinship relations, of course, come first, and then social relations, and locality. The examination system and one's place within the imperial bureaucracy also loomed large in the lives of many of the authors we read. When Harvard inherited the database from Robert Hartwell, all of these components already were included in his data structures. I cleaned these up a bit, but the only significant addition I made to the types of data was that of "social institutions," since we discovered that such entities as private academies and temples were institutions around which members of the elite stratum formed communities to achieve collective goals. The major innovation I brought to the functionality of the database was my realization that we could exploit the hybrid nature of the system on which CBDB ran. Hartwell created the initial database in dBase, an old database programming language. When I reconfigured the database in FoxPro, a close cousin of dBase, I realized that we could exploit the initial kinship and social relationship information we had for individuals by recursively searching through them. That is, we start with the kinship information for an individual, and then we add all the kinship information for that person's kin, and then we add the kinship information for all the newly discovered people, and so on, until the kinship distances reach a limit set by the end user. This sort of recursive search is relatively easy to set up in a procedural programming language like FoxPro or VBASIC, which Microsoft Access uses as its back-end programming language. It is much harder to build into SQL, Structured Query Language. In any case, I set the system to loop through social relations data in the same way to build social networks that could be exported to Social Network Analysis packages like Gephi. And in addition, we allowed the system to mix and match: to pull in all the social relations of kin, all the kin of people in one's social network, and every other possible combination of kinship and social relationship.

22 Schleiermacher, Hermeneutics and Criticism, 107.
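The recursion Fuller describes was implemented in FoxPro/VBASIC inside the database; the following re-sketch in Python shows the same idea, breadth-first expansion of a kinship table out to a user-set distance, over an invented toy table rather than the actual CBDB schema.

```python
# A re-sketch in Python of the recursive search described above (the
# original lived in FoxPro/VBASIC): start from one person, add that
# person's kin, then the kin of the kin, and so on, stopping at a
# user-set kinship distance. The toy kinship table is invented.
from collections import deque

kin_table = {                       # person -> directly recorded kin
    "Liu Kezhuang": ["kin A", "kin B"],
    "kin A": ["Liu Kezhuang", "kin C"],
    "kin B": ["Liu Kezhuang"],
    "kin C": ["kin A", "kin D"],
}

def kinship_network(start, max_distance):
    """Return everyone within max_distance kinship links of start."""
    found = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if found[person] == max_distance:
            continue                 # distance limit reached on this branch
        for relative in kin_table.get(person, []):
            if relative not in found:
                found[relative] = found[person] + 1
                queue.append(relative)
    return found

print(kinship_network("Liu Kezhuang", max_distance=2))
# {'Liu Kezhuang': 0, 'kin A': 1, 'kin B': 1, 'kin C': 2}
```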
This sort of recursive search is relatively easy to set up in a procedural program- ming language like FoxPro or VBASIC, which Microsoft Access uses as its back-end programming language. It is much harder to build into SQL, Structured Query Language. In any case, I set the system to loop through social relations data in the same way to build social networks that could be exported to Social Network Analysis packages like Gephi. And in addition, we allowed the system to mix and match: to 22Schleiermacher, Hermeneutics and Criticism, 107. 272 Michael A. Fuller h tt p s: // d o i.o rg /1 0. 10 17 /j ch .2 02 0. 13 D o w n lo ad ed f ro m h tt p s: // w w w .c am b ri d g e. o rg /c o re . C ar n eg ie M el lo n U n iv er si ty , o n 0 6 A p r 20 21 a t 01 :2 8: 40 , s u b je ct t o t h e C am b ri d g e C o re t er m s o f u se , a va ila b le a t h tt p s: // w w w .c am b ri d g e. o rg /c o re /t er m s. https://doi.org/10.1017/jch.2020.13 https://www.cambridge.org/core https://www.cambridge.org/core/terms pull in all the social relations of kin, all the kin of people in one’s social network, and every other possible combination of kinship and social relationship. Although Hartwell designed his database to allow him to look at the connection between office-holding and kinship, I don’t think he quite realized what an extraordi- narily powerful database he had developed. Schleiermacher asserted the relevance of the totality of social structures impinging on the individual. Because CBDB has tables rep- resenting the components of kinship, social network, social status, office-holding, and locality in an individual’s life, it allows a scholar to explore their interactions in a cor- ollary to Schleiermacher’s proposed methodology. That is, a simplified version of the CBDB structure looks like Figure 1. That is, people are at the center, and through them one can link all the additional components of social organization. We can ask questions like “Was the role of medical officer hereditary, that is, were medical officers the sons or nephews of medical officers, and did the families of medical officers marry their children to one another?” (Figure 2). We can ask yet more complicated questions. Were officials from Fujian more likely to develop local kinship networks than were officials from Zhejiang? Did patterns differ depending on rank, and did the patterns change over time? This adds the dimension of locality (Figure 3). When I was reading the writings of Liu Kezhuang 劉克莊 (1187–1269) from Fujian, this question was of considerable importance. Indeed, in my recent book on Southern Song poetry, being aware of the interactions among locality, kinship, social networks, participation in the examination system, and office holding compelled me to rethink the usual understanding of the “Rivers and Lakes” poets and realize that the usual story was wrong. There clearly were large networks of men from important local line- ages who participated in the examination system as proof of their elite status but who had little hope to actually succeed and little interest in serving more than would be required to confirm their tax exemption. Instead, they traveled from patron to patron talking and writing and, frequently enough, joining in protest against the current impe- rial administration. Reading their poetry in this context allows one to develop a more nuanced understanding of their work. Figure 1. 
The Digital Humanities and the Connectedness of Meaning in Human Experience

Schleiermacher and Dilthey were circumspect and methodical in their efforts to allow contemporary readers to understand the lived significance of texts of the past. The postulates they needed to justify their methods were—keeping with their Kantian model—fairly minimal: people in the past were biologically similar to people today, and people acted with motives.23 An understanding of all else that is built upon the basic hardware—from what we now consider the history of affect, to religious and philosophical systems, to literary history—must be constructed from a careful, reflective consideration of the texts (and other artifacts) that survive. And there was no absolute certainty in the conclusions they could draw, given the pastness of the past and the limitations of our sensibilities and the data that remains. This is precisely our current situation, except that the new digital technologies allow us to extend the range of our data and our ability to organize it in ways that they could not have imagined.

The sort of empirical results for individuals and for larger aggregations of people that are derived from network analyses and other forms of statistical analysis are not a reduction of people and texts to numbers but a hermeneutically compelling way to discover the large-scale patterns of the social world that informed people's lives. These results provide contexts for reading and thinking, and they put demands on reading. My understanding of the writings of the large community of "Rivers and Lakes" poets of early thirteenth-century China became profoundly different once I incorporated the myriad factors—from the evolving nature of local elites, to the examination system and the role of printing, to the details of the Daoxue networks (and their arguments)—that shaped the historical context for poetry at the time.

Figure 2. Querying the Relation of Office and Kinship Networks
Figure 3. Querying the Relation of Place, Office and Kinship Networks

23 Dilthey's model for historical knowledge requires the postulate that people act with motives, either as 意, intentions, or as 情, feelings rather than randomly. This assumption is a corollary to Kant's postulate of the purposiveness of Nature as a totality behind the aesthetic judgments that are the first step toward the empirical. Intentions, however, apply to the "life-world" of human experience rather than to the larger phenomenal realm.
The model of CBDB and access to large repositories of digital texts were both crucial to reconceiving how the texts I studied were connected and where their meaning resides: within the texts, certainly, but texts, the digital humanities shows us, are not just singularities; they radiate outward and are traces of moments of experience at the intersections of complex multidimensional patterns that the digital humanities can partially—if never fully—restore. We live within epistemological impasses, but with the help of the digital humanities, we, like Schleiermacher, Dilthey, and Wittgenstein, are turning to the large world of human experience to see what we can learn about who we are. We as humanists have much to contribute to this project that we share with scholars pursuing other forms of inquiry. We have a great future if we open ourselves to the challenges of this shared endeavor and learn to see the methodologies for large-scale analysis as integral to our inquiry into the human.

Cite this article: Fuller MA (2020). Digital Humanities and the Discontents of Meaning. Journal of Chinese History 4, 259–275. https://doi.org/10.1017/jch.2020.13

work_4ry2oq4ivzdcnnphnfbaamfpia ----

Digitizing Patterns of Power – Cartographic Communication for Digital Humanities

Karel Kriz, Alexander Pucher, Markus Breier
University of Vienna, Department of Geography and Regional Research, Vienna, Austria; karel.kriz@univie.ac.at, alexander.pucher@univie.ac.at, markus.breier@univie.ac.at

Abstract: The representation of space in medieval texts, the appropriation of land and the subsequent installation of new structures of power are central research topics of the project "Digitizing Patterns of Power" (DPP). The project focuses on four regional case studies: the Carolingian eastern Alps, the Morava-Thaya border region, the historical region of Macedonia, and historical Southern Armenia. DPP is a multidisciplinary project, conducted by the Institute for Medieval Research (IMAFO) of the Austrian Academy of Sciences in cooperation with the University of Vienna, Department of Geography and Regional Research. It is part of an initiative to promote digital humanities research in Austria. DPP brings together expertise from historical and archaeological research as well as cartography and geocommunication to explore medieval geographies.
The communication of space, time and spatial interconnectivity is an essential aspect of DPP. By incorporating digital cartographic expertise, relevant facts can be depicted in a more effective visual form. Optimal cartographic visualization of the base data as well as of the historical and archaeological information in an interactive map-based online platform is an important feature. However, the multidisciplinarity of the project presents the participants with various challenges. The disciplines involved, among them cartography, archaeology and history, each have their own approaches to relevant aspects of geography and geocommunication. This paper treats geocommunication characteristics and approaches to interactive mapping in a historical and archaeological context within a multidisciplinary project environment. The fundamental challenges of cartographic communication within DPP will be presented. Furthermore, recent results on the communication of historical topographic, as well as uncertain thematic, content will be demonstrated.

Keywords: Cartographic Communication, Topographic Maps, Cartographic Visualization, Spatial Uncertainty, Digital Humanities, Multidisciplinary

1. Introduction

The perception, depiction and organization of spaces and places in the Middle Ages encompass an interdisciplinary research field which helps to understand historical processes and relations within the medieval period. The representation of space in medieval texts, the appropriation of land and the subsequent installation of new power-structures are central research topics of the project "Digitizing Patterns of Power" (DPP). These patterns of power, established in space and time, are the research focus of this interdisciplinary project. The research questions are the domain of historical scholarship, but the phenomena are to a large extent spatial phenomena. The representation and analysis of spatial phenomena are core competences of cartography and geographical information science.

DPP is a multidisciplinary project, conducted by the Institute for Medieval Research (IMAFO) of the Austrian Academy of Sciences in cooperation with the University of Vienna, Department of Geography and Regional Research (IfGR). It is part of the program "Digital Humanities: long term projects on cultural heritage", an initiative of the Austrian Academy of Sciences to promote digital humanities research in Austria. It started in January 2015 and will end in December 2018.

The aim of DPP is the development of a generalizable workflow from the digitization of a specific corpus of textual and archaeological evidence to the analysis and visualization of data with the help of digital tools. Cartography and geocommunication are vital parts in the representation and visualization of the historical landscape and the underlying data. (Bodenhamer et al. 2010, Gregory & Ell 2007) The creation of project-specific base maps from free geodata, the visualization of the uncertainty inherent in historical data and the development of methods of interactive geocommunication to create a sustainable online presentation of the data and results of the research are central for DPP.

2. Case Studies and historical research questions

The project will focus on four case study regions, from which its historical research questions are drawn:
• The Carolingian eastern Alps (8th/9th century)
• The March/Morava – Thaya/Dyje border region (7th – 11th century)
• The historical region of Macedonia (12th – 14th century)
• Historical southern Armenia: the rise and fall of Vaspurakan (5th – 11th century)

Fig. 1. The case studies of DPP

Although located in different parts of Europe and Asia Minor as well as being in different timeframes, these regions share a common basis of mountainous ecologies (1), their position on the peripheries of imperial spheres and the specific framework provided by these conditions for the emergence and dynamics of political and socioeconomic structures.

(1) The Morava/Thaya region is not a mountainous region, but it serves as a comparison to the other areas of research in order to elaborate on terrain-specific or terrain-independent developments.

One research topic is the appropriation of space through creating "places of power" and possible underlying strategies. Is there a correlation between the increase in density of sites – e.g. settlements, fortifications, churches and monasteries, market places etc. – and an intensifying need for control over the land and its gradual appropriation? Another issue is the interplay between built and natural environment. In the eastern Alps and their surroundings, different structures were established in the late Roman Empire, the Carolingian expansion in the 8th and 9th century and the medieval internal colonization of the Eastern Alps region starting from the 10th century. Ecclesial institutions vied for influence to control the trade and pilgrimage routes to the south. (Winckler 2012)

The Morava/Thaya region is and has always been a border region, not only today but also during the medieval age. The political and social entities on both sides of the border have left certain patterns of power in the landscape. Due to the lack of historical sources in this case study, the focus will be on archaeological sources. (Eichert 2012)

In the territory of today's Former Yugoslav Republic of Macedonia (FYROM), research is conducted on the transformation of the region from a Byzantine province into an area of military and political expansion by the Serbian medieval empire. Impacts on settlement patterns, the re-distribution of landed property, the interplay between resident population and nomads (Popović 2014/2015), and the establishment of new infrastructure are of interest.

Fig. 2. Target areas within the case study Macedonia

In the historical region of southern Armenia, research focuses on the region around Lake Van and on the emergence of the principality of the noble house of Arcruni in the period between the end of the ancient Armenian monarchy (428) and the Seljuk conquest of Armenia (1020-1070). Textual evidence as well as archaeological data provides further input for a comprehensive analysis and visualization of the construction of an early medieval polity, both in the narrative and in space, within the specific ecology of the Van region at the crossroads between Byzantium and the Islamic World. (Preiser-Kapeller 2010/2012)

The project builds on information and data gathered by the project partners from the Academy of Sciences. Data from various previous projects is incorporated, but new data is also acquired for DPP.
The data gathered for DPP comprises archaeological and historical sources. Archaeological entities include artefacts, monuments, settlements and burial sites. Historical information is extracted from written sources like charters, chronicles and travel reports. This information is geotagged and entered in a common database.

3. Base Maps

To provide a background for the historical information, a specific base map has to be created which suits the needs of the historians and archaeologists. This is a critical task, since ideally the map should represent the landscape at the timeframe appropriate for the research question. However, there are some difficulties to this undertaking. For one, obtaining geodata of the medieval landscape is very difficult to near impossible. Not only have man-made features like settlements and land use changed; natural features like the courses of rivers, coastlines and the extent of lakes have also changed during history. Although there has been research on the historical courses of some rivers, there is no comprehensive and consistent data available for all relevant regions. Furthermore, the project spans a timeframe of nine centuries, from the 5th to the 14th century. This is a huge timespan, with the earliest period nearly as far from the latest as the latest is from the present day. In a time before river regulations, the courses of the rivers would have changed a lot over a timespan of nine centuries. Overall, it was decided that the base map of DPP is based on current geodata, which will serve as a viable approximation.

Another point of discussion with historians was the inclusion of current international borders and cities in the base map. From a cartographic point of view, they serve as means of orientation, since most map users are familiar with the rough shape of the countries and the location of the major cities. They provide a frame of reference from the map to the real world. For medievalists, on the other hand, as was evident in personal discussions, these features are a distraction, since they did not have any meaning during the Middle Ages. It was argued that these features are not only a distraction, but that current international borders would promote a nation-centric historiography, which is hard to overcome for students of history or the general public.

For DPP, therefore, two base maps will be available. The default base map is without current borders and cities, focusing on relief, waterbodies and land use. An alternative version is available which includes the international borders, important cities and the according labels. Both base maps are created from free geodata, using GTOPO30, Natural Earth waterbodies and UMD global land cover data for the lower zoom levels (5–7), and SRTM, OSM and Hansen global forest data for the higher zoom levels (8–11).

Fig. 3. Base map without modern features. Zoom level 5 (left) and zoom level 11 (right)

Fig. 4. Base map with modern features. Zoom level 5 (left) and zoom level 11 (right)

4. Spatial Uncertainty of Historical Data

DPP builds on various historical data. This includes data from archaeological excavations, data extracted from historical written sources and secondary data from other sources, like old maps.
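To make the two-basemap setup concrete, here is a minimal sketch in Python using the folium library (a Python wrapper around Leaflet, which the DPP application itself uses via JavaScript). The tile URLs are hypothetical placeholders for project-rendered tiles; only the zoom range (5–11) and the two switchable layers come from the paper.

import folium

# Hypothetical tile endpoints; DPP renders its own tiles from
# GTOPO30/SRTM relief, Natural Earth waterbodies, OSM and land cover.
PLAIN = "https://tiles.example.org/dpp_plain/{z}/{x}/{y}.png"
MODERN = "https://tiles.example.org/dpp_modern/{z}/{x}/{y}.png"

# Start over the Eastern Alps case study, with no default tiles.
m = folium.Map(location=[46.8, 14.8], zoom_start=7, tiles=None)

folium.TileLayer(tiles=PLAIN, name="Base map (no modern features)",
                 attr="DPP", min_zoom=5, max_zoom=11).add_to(m)
folium.TileLayer(tiles=MODERN, name="Base map (modern borders and cities)",
                 attr="DPP", min_zoom=5, max_zoom=11, show=False).add_to(m)

# Radio buttons let the user flip between the two base maps.
folium.LayerControl(collapsed=False).add_to(m)
m.save("dpp_basemaps.html")

The design point is that the "plain" medieval-friendly map is the default, while the modern-reference map is opt-in, mirroring the compromise described above.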
Due to the nature of the data sources, the data quality – especially the accuracy and certainty – varies greatly. This is true for the spatial as well as the temporal aspect. Data gathered during archaeological surveys are very precise in the spatial aspect: they are recorded with surveying techniques or at least GPS. The precision of the temporal aspects varies, depending on the applicable method of dating. Data extracted from written sources, on the other hand, is much more imprecise and uncertain. (Jänicke & Wrisley 2013) The references in the written sources are often very vague; e.g. a location was in the vicinity of a town, or situated within an administrative unit, or nearby a geographical feature. This is further complicated by the fact that the entity to which the location is referenced can itself be located only vaguely. The exact extent of historical administrative units or areas of influence is very hard to determine. Such historical entities were often not clearly defined even during the time they existed. It is one of the aims of this project to use the available data to reconstruct these areas.

Toponyms often change, and even if the names stay (more or less) the same, the location or extent of the entities change. Settlements grow or shrink over time, and sometimes change location. An example would be a village situated in a river valley which is then destroyed by a flood. It could be rebuilt a little further up the slope of the valley. The name would be retained, but the location has changed. In some cases, the written sources contain references to places with two or more possible locations; i.e. the written source gives the name of the village with no additional information, and there are two villages with the same name. It is therefore unclear to which of the two the source refers.

All these uncertainties make it difficult to give exact coordinates to the events and locations. The historical research questions of DPP, however, make it necessary to record the level of uncertainty of the data in the database. Furthermore, the uncertainty will be represented scale-dependently in the map-based application. Approximation methods, like assigning the data to the center of the current administrative unit or guessing where the location was most likely, are not desirable for this project. How best to handle uncertainty in such an environment is one of the main cartographic research questions of this project. Although various approaches to uncertainty visualization exist (e.g. MacEachren 2005, Reuschel & Hurni 2011), these approaches have to be adapted for use in an interactive application with many different data entities.

5. Geocommunication

The map-based application serves as a tool for research itself. By enabling the user to combine various datasets and results of database queries, spatial relations can be explored. However, it is not the aim of DPP to create a full-fledged WebGIS. DPP focuses on optimal representation of the data and its uncertainty as well as on usability, especially for non-GIS experts, and performance (Kriz 2013). The application should guide the user through the data, allowing them to query the database and show various data layers over a purpose-made base map.
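One simple way to make such uncertainty explicit in the data is to store a plausible positional radius and open dating bounds with each feature, which a client can then render scale-dependently, for example as a circle whose size reflects the positional uncertainty. The sketch below builds a GeoJSON record along these lines; the property names and values are illustrative, not DPP's actual schema.

import json

# A charter names a village only "in the vicinity of" a town, so we
# record a candidate point plus an explicit uncertainty envelope
# instead of pretending to exact coordinates.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [21.43, 41.35]},
    "properties": {
        "name": "unlocated village near Skopje",
        "location_basis": "charter: 'in the vicinity of' the town",
        "position_uncertainty_m": 15000,  # plausible radius around the point
        "date_earliest": 1282,            # attested no earlier than ...
        "date_latest": 1355,              # ... and no later than
        "alternative_locations": 2,       # homonymous candidate sites
    },
}

with open("uncertain_places.geojson", "w", encoding="utf-8") as f:
    json.dump({"type": "FeatureCollection", "features": [feature]}, f,
              ensure_ascii=False, indent=2)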
The prototype of the application, which was finished in February 2017, offers a basic query interface for the database and navigation. The available zoom levels are from 5 to 11. The aim is to provide base maps for the whole extent of the case studies up to zoom level 13. For selected hotspots, even higher zoom levels are considered. Data entities are displayed as dots and are clustered, because there are areas with very high data density. Polygon and uncertainty representation is not yet available in the prototype, but will be an important feature of the final application. The links between various data entities like places, events, actors and signs of power can be followed via hyperlinks, which allows users to explore the spatial relations between the entities. The final application will also offer a full-text search.

The functionality will include the possibility to query the database via text-based input as well as by interacting with the map. It will be possible to use the results as a starting point for browsing the data, switching between the map view and the database view. The map view will provide spatial aspects and selected thematic information of the data, whereas the database view will provide access to the full information and relations to other data entities. A time slider allows the user to gain insight into the temporal aspects.

Fig. 5. Prototype of interactive application

With these tools, the data can be explored in its spatial, temporal and thematic aspects to help in answering the historical research questions. Cartographic applications are not only used as a research tool in DPP. The map-based application will also serve as a platform for communicating the results of DPP to a wider audience. To keep the application accessible, ease of use and a clearly structured functionality are key requirements. However, the application should not be simplistic, because of the complex thematic content.

Fig. 6. Clustering in the DPP application

To communicate key results of the project to the public, so-called "Story Maps" will be included in the application. These "Story Maps" are predefined views of the data, consisting of database queries, which are complemented with a detailed description of the topic shown and information about its significance for the historiography.

6. Data structure

The database system is the technical backbone of the project DPP. "OpenATLAS", an object-oriented database system established during previous research at the Institute for Medieval Research, is used to create a common data pool. It joins data from archaeological as well as historical sources and uses classes and properties from the CIDOC Conceptual Reference Model (CIDOC-CRM, Le Boeuf et al. 2015). Originally created with cultural heritage management in mind, it is being updated to meet the requirements of DPP. It can map historical and archaeological entities like sites, features, stratigraphic units and finds, as well as documents, events and actors, which can be persons or institutions. Furthermore, the relations between these entities and spatial and temporal information are modelled. Metadata, connections to bibliographical resources, image data, textual content, online resources, administrative units and record restrictions like copyright or licensing of various datasets can be recorded. The database is connected to the interactive map-based online application via PHP-based server code. The application itself is programmed in JavaScript, making use of the Leaflet library.
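The clustering-plus-hyperlink pattern described above can be sketched in a few lines; here again in Python via folium, which drives the same Leaflet machinery the DPP client uses. The sample records and the "/entity/{id}" URL pattern are invented for the example.

import folium
from folium.plugins import MarkerCluster

# Invented sample records; in DPP these would arrive as GeoJSON
# from the PHP backend.
ENTITIES = [
    {"id": 101, "name": "Hilandar metochion", "lat": 41.02, "lon": 21.80},
    {"id": 102, "name": "fortress above the pass", "lat": 41.10, "lon": 21.72},
]

m = folium.Map(location=[41.05, 21.76], zoom_start=9)
cluster = MarkerCluster(name="DPP entities").add_to(m)

for e in ENTITIES:
    # The popup mimics the hyperlinked info window: a link back into
    # the database view for the entity (URL pattern is hypothetical).
    html = f'<b>{e["name"]}</b><br><a href="/entity/{e["id"]}">full record</a>'
    folium.Marker([e["lat"], e["lon"]], popup=folium.Popup(html)).add_to(cluster)

folium.LayerControl().add_to(m)
m.save("dpp_entities.html")

At low zoom the cluster plugin aggregates dense areas into counts; zooming in breaks the clusters apart, which is how the prototype copes with areas of very high data density.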
Communication between the server-side database access and the client-side application is handled via GeoJSON, a geospatial dialect of JSON (JavaScript Object Notation). However, the database's object-oriented structure, complex data relations and dynamic elements make visualizing the entities and their relations in an interactive application challenging.

According to the open-data policies of the involved institutions, the data collected during the project will be provided to the public. Geodata will be accessible via Web Feature Service (WFS). In this way, expert users can use the data within their own GIS and conduct analyses and queries which are not possible in the online application. The software developed for this project is based on open-source components and will be available to similar projects.

7. Conclusion and Outlook

DPP is a multidisciplinary research project which explores the benefits of state-of-the-art geocommunication technologies for historical research. The focus of the cartographic efforts of the project lies on inherent cartographic issues which have until now hardly been considered in similar projects. DPP uses high-quality base maps, which are created specifically for this project, with additional layers relevant for the research questions. Story maps and database query functions allow researchers as well as the interested public to browse the data, explore the spatial relations of the entities of the case studies, and see the results of the research. The uncertainty of the various entities is also modelled in the database and will be represented in the map view.

However, these issues also represent challenges when building a complex system with various possible interactive elements. The uncertainty of the data is very inhomogeneous. Therefore, concepts to represent the uncertainty in all its aspects at various scales are explored. Furthermore, when designing such a system, usability has to be considered. The aim is to create a flexible system where the user can conduct her or his own queries and combinations of layers. However, it should also be easy to use for non-GIS experts.

As of February 2017, the application is in its prototype stage, offering basic functionality. While the database structure is already mapped in the application, uncertainty representation is not yet implemented. Besides implementing advanced functionality, uncertainty representation is the focus of the remainder of the project. The project and the software developed within it are designed with possible extensions in mind, providing a basic framework for similar projects.

8. References

Bodenhamer J., Corrigan J., Harris T. M. (eds.) (2010) The Spatial Humanities: GIS and the Future of Humanities Scholarship. Bloomington and Indianapolis.

Eichert S. (2012) Frühmittelalterliche Strukturen im Ostalpenraum. Studien zu Geschichte und Archäologie Karantaniens. In: Forschung und Kunst 39, Klagenfurt.

Gregory I. N. & Ell P. S. (eds.) (2007) Historical GIS. Technologies, Methodologies and Scholarship. New York.

Jänicke S. and Wrisley D. J. (2013) Visualizing Uncertainty: How to Use the Fuzzy Data of 550 Medieval Texts? In: Proceedings of the Digital Humanities 2013.

Kriz K. (2013) Maps and Design – Influence of Depiction, Space and Aesthetics on Geocommunication.
In: Kriz K., Cartwright W., Kinberger M. (Eds.), Understanding Different Geographies. Berlin, 9-24.

Le Boeuf P., Doerr M., Ore C. E., Stead S. (eds.) (2013) Definition of the CIDOC Conceptual Reference Model. ICOM/CIDOC CRM Special Interest Group. http://www.cidoc-crm.org/sites/default/files/cidoc_crm_version_6.2.1.pdf, accessed 23/02/2017.

MacEachren A. et al. (2005) Visualizing Geospatial Information Uncertainty: What We Know and What We Need to Know. In: Cartography and Geographic Information Science, Vol. 32, No. 3, pp. 139-160.

Popović M. (2014) Vlachen in der historischen Landschaft Mazedonien im Spätmittelalter und in der Frühen Neuzeit. In: Romanen und ihre Fremdbezeichnungen im Mittelalter: Walchen, Vlachen, Waliser [in press].

Popović M. (2015) Das Kloster Hilandar und seine Weidewirtschaft in der historischen Landschaft Mazedonien im 14. Jahrhundert. In: ΠΕΡΙΒΟΛΟΣ – Mélanges offerts à Mme Mirjana Živojinović, Tome I. Belgrade 2015, 215-225.

Preiser-Kapeller J. (2010) erdumn, ucht, carayut´iwn. Armenian Aristocrats as Diplomatic Partners of Eastern Roman Emperors, 387-884/885 AD. Armenian Review 52 (2010) 139-215.

Preiser-Kapeller J. (2012) Networks of Border Zones – Multiplex Relations of Power, Religion and Economy in South-Eastern Europe, 1250-1453 CE. In: Proceedings of the 39th Annual Conference of Computer Applications and Quantitative Methods in Archaeology, "Revive the Past". Amsterdam, 381-393.

Reuschel A. and Hurni L. (2011) Mapping Literature: Visualisation of Spatial Uncertainty in Fiction. In: The Cartographic Journal, Vol. 48, No. 4, pp. 293-308.

Winckler K. (2012) Die Alpen im Frühmittelalter. Die Geschichte eines Raumes in den Jahren 500 bis 800. Wien.

Proceedings of the International Cartographic Association, 1, 2017. https://doi.org/10.5194/ica-proc-1-62-2017 | © Authors 2017. CC BY 4.0 License.

work_4ubshnppnvhytfurhaxn6g3f3e ----

Expanding the Librarian's Tech Toolbox: The "Digging Deeper, Reaching Further: Librarians Empowering Users to Mine the HathiTrust Digital Library" Project

D-Lib Magazine, May/June 2017, Volume 23, Number 5/6

Harriett Green and Eleanor Dickson
University of Illinois at Urbana-Champaign
{green19, dicksone} [at] illinois.edu
https://doi.org/10.1045/may2017-green

Abstract

This paper provides an overview of the IMLS-funded project "Digging Deeper, Reaching Further: Librarians Empowering Users to Mine the HathiTrust Digital Library," and explains how the project team developed a curriculum and workshop series to train librarians on text mining approaches and tools, in order to address the recognized skills gap between the needs of researchers pursuing digital scholarship and the services that librarians are traditionally trained to provide.
Keywords: HathiTrust Digital Library Project, Text Mining

1 Introduction

The roles of librarians are transforming as a growing number of researchers and instructors integrate data into their work and scholarship. As the Association for Research Libraries' Strategic Thinking and Design Initiative Report predicts, "In 2033, the research library will have shifted from its role as a knowledge service provider within the university to become a collaborative partner within a rich and diverse learning and research ecosystem." [1] This futurist declaration frames how librarians increasingly are encountering new research questions and scholarly needs oriented around data and digital technologies — needs that push the boundaries of the current skillsets, knowledge, and service scope of librarians and archivists today. Recent initiatives such as the Library of Congress's "Collections As Data" forum and the IMLS-funded "Always Already Computational: Collections as Data" project recognize the essential role of libraries and archives today in providing and curating much of the data being used in this new, emergent research.

In light of the "computational turn" [2] across the disciplines and in libraries themselves, how can libraries prepare to support data-driven research? The Digging Deeper, Reaching Further: Libraries Empowering Users to Mine the HathiTrust Digital Library Resources (DDRF) project aims to develop and disseminate a curriculum for librarians to build competence in skills and tools for digital scholarship that they can then incorporate into research services at their home institutions.

2 Background

Digital scholarship centers and research commons are emerging in more and more libraries as part of revised service models to address the research needs of digital humanities and data-driven scholarship. Still, not all academic libraries have (or need) centralized services, and even when they do, librarians from many different departments in the library and areas of expertise are being drawn into digital scholarship support [3]. Studies document how these dynamic, data-driven changes in how scholars pursue research often involve deeper collaboration between librarians and disciplinary researchers [4], and what the Research Libraries UK's Re-Skilling for Research report called "a more proactive model of engagement with researchers." [5] Services such as research collaborations with faculty [6], building new models for scholarly communications and publishing in digital humanities [7], and offering tiered support services for digital scholarship projects encompassing digitization, multi-media publishing, and software development [8] are becoming increasingly standard in libraries. The recently published volumes Digital Humanities in the Library: Challenges and Opportunities for Subject Specialists [9] and Laying the Foundation: Digital Humanities in Academic Libraries [10] feature multiple case studies of new services and programs in academic libraries that address contemporary research needs in the area of digital humanities specifically.

But these rapidly growing areas of digital scholarship research, and the responding changes in library services and infrastructure, also highlight the key challenges that librarians face in gaining skills that enable them to engage with digital scholarship work [11]. Some centers have responded by offering training programs for librarians at their institutions to become more familiar with digital tools and methods.
Notable efforts at the University of Maryland [12], Indiana University [13], and Columbia University Libraries' Developing Librarian program exemplify programs that re-skill librarians, especially subject librarians, to participate in new service models and meet the growing demand for digital scholarly support. National and international initiatives to train those across the academy, from students and faculty to librarians, in strategies for incorporating digital methods and tools into research have proliferated in recent years. Programs such as the Humanities Intensive Learning and Teaching (HILT) institute prepare attendees, who include librarians, to engage in digitally-intensive research. Other recent professional development opportunities for librarians on topics in digital scholarship have included the Digital Humanities Institute for Mid-Career Librarians at the University of Rochester and the Data Science and Visualization Institute for Librarians at North Carolina State University, as well as the Association of Research Libraries' newly-launched Digital Scholarship Institute.

Our DDRF project aims to share and build upon the goals of many of these training initiatives, which are to address the recognized skills gap between the needs of scholarly research with computational tools and the services that librarians are traditionally trained to provide. Notably, these training initiatives for librarians employ a "train-the-trainer" model, by which librarians learn a new skillset that they, in turn, can introduce to local scholars. The newly released findings of the IMLS-funded Mapping the Landscapes: Continuing Education and Professional Development Needs for Libraries, Archives and Museums [14] attest in particular to the need for digital scholarship skills, as they note that of the core competency areas for professional development highlighted in their survey, "intermediate to advanced technology skills, digital collection management and digital preservation competency areas received the highest percentage of respondents indicating a need for significant improvement."

DDRF aims to empower librarians — especially those without local training programs — to become active in digital scholarship on their campuses. As such, our project seeks to build this capacity in support of the Institute of Museum and Library Services (IMLS) National Digital Platform initiative. Funded by a 2015-2018 IMLS Laura Bush 21st Century Librarian grant award, DDRF is a partnership among five institutions: the University of Illinois at Urbana-Champaign, Indiana University Bloomington, Lafayette College, Northwestern University, and the University of North Carolina at Chapel Hill. Librarians and specialists from the partner institutions have been collaborating to develop a curriculum and training mechanism focused on preparing library and information professionals to engage in text analysis and to build core skills for supporting data-driven research. This project leverages the expertise of the HathiTrust Research Center, jointly based at the University of Illinois at Urbana-Champaign and Indiana University Bloomington. Many of the hands-on activities and examples presented in the curriculum are drawn from the workshops, tools, and research services provided by the HathiTrust Research Center for text analysis research [15]. The curriculum will be released as an open educational resource at the end of the grant.
3 Project update

We have drafted, delivered, and revised the initial version of the DDRF text analysis curriculum using an iterative instructional design process. Our process drew upon the inspiration and examples offered by other effective open training initiatives, including Software Carpentry, Data Carpentry, and Library Carpentry [16], as well as the New England Collaborative Data Management Curriculum [17]. The DDRF curriculum aims to be skills-oriented and centered on specific real-world use cases, as we describe later in the paper. The suite of teaching materials includes slide decks, instructor guides, and participant handouts. We continue to refine the materials after each iterated pilot workshop, with the aim of teaching the final curriculum at regional and national workshops across the U.S. during 2017 through 2018.

Through the pilot workshops, we have learned that the skill needs of librarians around digital scholarship are varied and individually driven. The five project partners represent colleges and universities with diverse constituents and approaches to supporting digital scholarship. As such, each partner institution has encountered unique experiences teaching the same curriculum to their different audiences, which have ranged from cohorts of public services librarians working in undergraduate-centered communities to information science researchers and librarians at large research universities. The richness of this participant diversity has meant that the project partners are able to provide feedback on the efficacy of the training materials for different audiences. Our experience teaching the workshops to date has influenced our approach to instructional design and curriculum development, both of which have also been shaped by participant feedback through formal assessment.

3.1 Instructional design

The multistage instructional design process applied in this project began in fall 2015 with definitions of learning goals and objectives for the curriculum. This stage involved identifying the requisite skills and knowledge that librarians from different areas of expertise need to support text analysis research, and how to build a training program that would address those requirements. This process established a benchmark for the curriculum that project partners were able to reference as the materials took shape. As a part of iterating on the teaching materials, we have refined the learning goals and objectives based on feedback and teaching experiences.

Our learning goals and objectives address librarian-specific competencies for engaging with digital scholarship, and we developed them with the approach of seeing text analysis tools and methods as a digital scholarship service supported by the library. We do not expect the learner to become an expert over the course of several hours, nor for the learner to necessarily formulate their own research project. Instead, we focus on fostering awareness of, and the ability to communicate about, key tools and methods in text analysis. Additionally, the goals map to five training modules that follow the text analysis workflow, from finding textual data to managing and analyzing it, and align with key points at which a librarian might be involved in the research process (Table 1). Each module incorporates skills-based competencies that are developed through hands-on activities. A sample reference question that could be addressed using text analysis threads through the modules and guides hands-on activities and discussion.
Where appropriate, the activities align with HathiTrust Research Center tools and services.

Table 1: Learning and Skill-Building Goals for DDRF Curriculum

Module: Introduction
  Primary learning goal: Understand what text analysis is and how scholars are using it in their research.
  Skills developed: Recognize research questions that may lend themselves to text analysis methods.

Module: Gathering Textual Data
  Primary learning goal: Differentiate the various ways textual data can be acquired and evaluate textual data providers.
  Skills developed: Build a textual dataset and run a web scraping script.

Module: Working with Textual Data
  Primary learning goal: Distinguish cleaning and/or manipulating data as a part of the text analysis workflow.
  Skills developed: Clean text data files using a Python script and/or OpenRefine.

Module: Analyzing Textual Data
  Primary learning goal: Recognize the advantages and constraints of web-based text analysis tools and programming solutions.
  Skills developed: Run a web-based text analysis algorithm and extract token frequencies from a dataset.

Module: Visualizing Textual Data
  Primary learning goal: Identify data visualization as a component of data-driven analysis.
  Skills developed: Practice exploratory data analysis using different tools for visualization.

We chose to use a modular format for the curriculum, so that the workshops could be adjusted for different settings. Some modules have been further broken down into "beginner" and "advanced" lessons, improving the flexibility of the teaching materials. In the second round of pilot workshops, we found that the partner institutions were interested in rearranging the content to suit their audiences. Some chose to teach the modules in order from one to five, while others taught the beginner lessons of multiple modules before moving on to the advanced lessons.

3.2 Teaching

We have now taught several iterations of the curriculum via pilot workshops at each of the partner institutions. The workshops have been open to librarians, library paraprofessionals, and students in library and information science departments. We have seen strong interest in the workshops from across the library: for all of the fall 2016 workshops combined, 32% of attendees self-reported as reference librarians, 21% as technical services librarians, 21% as "other" types of librarians, and 16% as digital humanities or digital scholarship librarians.

Between each round of pilot workshops, the project team reviewed and updated the curriculum, based both on the attendees' evaluations and in part on the experiences of the partner instructors teaching the materials. The feedback from the project partners has revealed that it can be challenging to learn, digest, and teach materials that others have developed. In such cases, instructors found it helpful to team-teach the workshop so that the instructor team was better able to grasp the materials and answer attendee questions. Making it easier for others to pick up and teach the curriculum is one of our goals for the coming year. To this end, we are drafting in-depth instructor guides for each module that define vocabulary terms, outline the key points that should be addressed, and provide a slide-by-slide script from which the presenter can read.

An important component of our strategy thus far has been to limit technological barriers to participating in the workshop. The activities deployed in several of the modules involve the participants executing Python programs to complete a task. Properly setting up a programming environment can take considerable time, especially in a workshop setting and when using machines in a computer lab.
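As an illustration of the kind of hands-on task the "Analyzing Textual Data" module describes (this is a generic sketch, not the actual DDRF teaching script), extracting token frequencies from a text file takes only a few lines of Python; the filename "sample_volume.txt" is a placeholder for a workshop dataset:

import re
from collections import Counter

def token_frequencies(path, top_n=20):
    """Lowercase the text, tokenize on alphabetic runs, and count."""
    with open(path, encoding="utf-8") as f:
        tokens = re.findall(r"[a-z]+", f.read().lower())
    return Counter(tokens).most_common(top_n)

if __name__ == "__main__":
    for token, count in token_frequencies("sample_volume.txt"):
        print(f"{token}\t{count}")

Because a script like this relies only on the standard library, it runs unchanged in a browser-based environment such as PythonAnywhere, which is what makes it workable on unconfigured lab machines.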
When possible, we have explored web-based tools for programming, such as PythonAnywhere, that allow participants to complete activities no matter what their operating system and without configuring their computer. We came to this decision by evaluating our learning goals and determining which aspects of the code-based activity were most important to meeting our objectives. We determined that streamlining the technical activities through web-based programming platforms lowers the cognitive load of learning a new concept, and allows attendees to focus on what happens when they run a script as opposed to the nuances of their programming environment.

While we have attempted to simplify the steps to successful completion of each activity, we have also learned to value creative and critical thinking in the hands-on sections. After the first rounds of workshops, project partners reported that they wished there were more opportunities in the curriculum for open-ended inquiry. They also reflected on the importance of play and experimentation for those learning digital scholarship competencies. The first iterations of the activities were straightforward, and we are exploring ways to make them more playful as a means of reinforcing the concepts in the activities [18]. We have also incorporated discussion questions into the most recent version of the teaching materials. We hope such discussion will provoke critical reflection on the skills and competencies addressed in each module within the context of the learning goals, as well as provide space for attendees to connect the workshop's content to their own teaching and learning.

3.3 Assessment

Following each workshop, participants complete an assessment form. From the assessment feedback, the project team has been able to glean that librarian learners appreciate learning by doing, and that they prefer depth over breadth of content in a workshop.

Attendee feedback shows that librarians value experiential learning. Responses gathered in the assessment form often related to the hands-on activities. For example, one workshop attendee wrote that they were "intimidated" before coming to the workshop because it would teach programming concepts, but that "the structure of the workshop which allowed us to focus on the conceptual capabilities of using Python and scripts to do text mining was very useful and interesting." Additionally, others noted that there should be even more time devoted to skill-based learning. One wrote, "When I sign up for a workshop, I expect that most of the time will be actual hands on activities." Current work on the curricular materials is focused on further developing the scope of the hands-on sections of each module to allow learners the opportunity to understand the process happening in each activity, in addition to fostering experimentation and discourse as mentioned above.

Workshop feedback also reflected that early pilot workshops were too short relative to the amount of content we tried to teach. One attendee advised us to "Make it longer, with more time for exploring data. [There is] not nearly enough time to really dig deep." We intend for future workshop sessions to be longer and anticipate they will be less rushed. We are also devising ways to create paths through the content for shorter workshops: by highlighting key points for each module in the aforementioned expanded instructor's guide, we aim for instructors to feel empowered to condense content as needed for use in abbreviated workshops.
4 Next steps and conclusion

We continue to incorporate the feedback and assessment received into our curricular and programmatic development of the project materials, and strive to keep in mind the various user groups and skill levels of librarians and information professionals today. Given the initial response to our workshops, we know that our colleagues are actively seeking training and instruction in these emergent skillsets for digital scholarship and data science. The next year will see a series of regional and national workshops where we will present the curriculum to larger, more diverse audiences from across North America. Through these workshops, we will gather additional responses from the librarian community that will allow us to refine the curriculum into a final open educational resource.

Our project is motivated by the potential to build new and interactive communities of practice in libraries around digital scholarship. Library and information professionals today, across areas of expertise, must grapple with questions such as: What technical and social infrastructures do libraries need to build or re-think in order to support digital scholarship? How do we provide librarians with the skillsets and knowledge needed to respond to new research and teaching needs? How can libraries anticipate the data-driven research of the future? The more that libraries proactively equip their staff to engage in more data-intensive research and teaching — in addition to developing new spaces and service models — the richer the future looks for the changing role of libraries and archives in higher education.

References

[1] Association for Research Libraries. (2016). Strategic thinking and design initiative: Extended and updated report. Washington, DC: Association for Research Libraries.

[2] Berry, D. M. (2011). The computational turn: Thinking about the digital humanities. Culture Machine 12.

[3] Mulligan, R. (2016). SPEC Kit 350: Supporting digital scholarship. Washington, DC: Association for Research Libraries.

[4] Green, H. E. (2014). Facilitating communities of practice in digital humanities: Librarian collaborations for research and training in text encoding. Library Quarterly 84(2), 219-234. https://doi.org/10.1086/675332

[5] Auckland, M. (2012). Re-skilling for research: An investigation into the role and skills of subject and liaison librarians required to effectively support the evolving information needs of researchers. London: Research Libraries UK.

[6] Alexander, L., Case, B., Downing, K., Gomis, M. & Maslowski, E. (2014). Librarians and scholars: Partners in digital humanities. EduCause Review; Nowviskie, B. (2013). Skunks in the library: A path to production for scholarly R&D. Journal of Library Administration 53(1), 53-66. https://doi.org/10.1080/01930826.2013.756698

[7] Coble, Z., Potvin, S., & Shirazi, R. (2014). Process as product: Scholarly communication experiments in digital humanities. Journal of Librarianship and Scholarly Communication 2(3), eP1137. https://doi.org/10.7710/2162-3309.1137

[8] Vinopal, J. & McCormick, M. (2013). Supporting digital scholarship in research libraries: Scalability and sustainability. Journal of Library Administration 53(1), 27-42. https://doi.org/10.1080/01930826.2013.756689

[9] Hartsell-Gundy, A., Braunstein, L. & Golomb, L. eds. (2015). Digital humanities in the library: Challenges and opportunities for subject specialists. Chicago: Association of College and Research Libraries.

[10] Gilbert, H. & White, J. eds. (2016).
Laying the foundation: Digital humanities in academic libraries. Lafayette, IN: Purdue University Press.

[11] Posner, M. (2013). No half measures: Overcoming challenges to doing digital humanities in the library. Journal of Library Administration 53(1), 43-52. https://doi.org/10.1080/01930826.2013.756694

[12] Munoz, T. & Guiliano, J. (2014). Making digital humanities work. Digital Humanities 2014 Conference Abstracts, EPFL-UNIL, Lausanne, Switzerland, 8-12 July 2014. 274-275.

[13] Courtney, A., M. Dalmau, & C. Minter. (2014). Research Now: Cross training for digital scholarship. Poster presented at 2014 DLF Forum.

[14] Drummond, C., Skinner, K., Pelayo, N., & Vukasinovic, C. (2016). Self identified library, archives, and museum professional development needs 2016 edition: Compendium of 2015-2016 Mapping the Landscapes project findings and data. Atlanta: Educopia Institute.

[15] Downie, J.S., Furlough, M., McDonald, R.H., Namachchivaya, B., Plale, B.A., & Unsworth, J. (2016). The HathiTrust Research Center: Exploring the full-text frontier. EduCause Review, May 2, 2016.

[16] Baker, J. et al. (2016). Library Carpentry: software skills training for library professionals. LIBER Quarterly 26(3), 141-162. https://doi.org/10.18352/lq.10176

[17] Lamar Soutter Library, University of Massachusetts Medical School. New England Collaborative Data Management Curriculum; Kafel, D., Creamer, A. T. & Martin, E. R. (2014). Building the New England Collaborative Data Management Curriculum. Journal of eScience Librarianship 3(1): e1066. https://doi.org/10.7191/jeslib.2014.1066

[18] For more about the concept of play in digital pedagogy, see Sample, M. (2016). Play. In Digital pedagogy in the humanities: Concepts, models, and experiments. New York: Modern Language Association.

About the Authors

Harriett Green is the interim Head of Scholarly Communication and Publishing, English and Digital Humanities Librarian, and associate professor, University Library, at the University of Illinois at Urbana-Champaign. Her research and publications focus on usability of digital humanities resources, digital pedagogy, digital publishing, and humanities data curation. She is Principal Investigator for the IMLS-funded "Digging Deeper, Reaching Further: Libraries Empowering Users to Mine the HathiTrust Digital Library" project.

Eleanor Dickson is the Visiting HathiTrust Research Center Digital Humanities Specialist at the University of Illinois at Urbana-Champaign. She supports outreach and training for the HathiTrust Research Center, as well as local digital humanities research at Illinois.

Copyright © 2017 Harriett Green and Eleanor Dickson

work_4wxdbbpa6rhljnyb3t35fa4xzy ----

AusCinemas Presentation
Long session presentation at the Digital Humanities Australasia Conference, March 2012

ABSTRACT: As part of our current ARC project "Mapping the Movies", Dr. Mike Walsh and I are developing a geodatabase of Australian cinemas, covering the period from 1948 to 1971 and based on a consistent dataset found in the trade journal Film Weekly, providing basic information on the ownership, location and capacity of approximately 4,000 venues. A principal purpose of the database is to provide an opportunity for crowdsourcing information about the venues from other material available on the Web and from the interested public. We expect to engage the interest of organisations devoted to the history and preservation of cinemas, and of school teachers developing local history projects under the national curriculum.
The information gathered will include details of screening programs, photographs and digitised newspaper reports. Funded by an eResearch SA Summer Scholarship, we are developing a set of templates for the collection of crowdsourced data and extending the website to manage and use the additional information. A broader aim of the project is to develop a generic open source geodatabase for use by digital humanities researchers who want to map relatively small-scale datasets. The system is focused around a database structure that supports the definition of objects with metadata, allowing additional objects to be added to the system without the need to significantly change the underlying database structure. The system is designed for easy implementation and management, needing high-level IT skills for only brief periods in the establishment of a project: to define objects in the database and in the programming code, and to customise the user interface to meet a project's specific needs. The paper will describe the evolution of the research project, and demonstrate the website.

[Slide 1: Opening screen]

This project forms part of the output for an ARC Discovery project involving Deb Verhoeven, Mike Walsh, Kate Bowles, Colin Arrowsmith and Jill Matthews, called Mapping the Movies: the Changing Nature of Australia's Cinema Circuits and their Audiences. This project was a continuation of a previous Discovery project, Regional Markets and Local Audiences: Case Studies in Australian Cinema Consumption, 1928–1980. Both projects are contributions to an emerging international trend in research into cinema history, one that has shifted its focus away from the content of films to consider their circulation and consumption, and to examine the cinema as a site of social and cultural exchange. This shared effort has engaged contributors from different points on the disciplinary compass, including history, geography, cultural studies, economics, sociology and anthropology, as well as film and media studies. Their projects have examined the commercial activities of film distribution and exhibition, the legal and political discourses that craft cinema's profile in public life, and the social and cultural histories of specific cinema audiences. Many of their projects have been collaborative, facilitated by computational analysis and the opportunities for quantitative research offered by databases and Geographical Information Systems, which allow for the compilation of new information about the history of cinema exhibition and reception in ways that would previously have been too labour intensive to undertake. Having achieved critical mass and methodological maturity, this body of work has now developed a distinct identity, to which we have given the name 'the new cinema history'. In calling this work cinema history, we are deliberately distinguishing it from a film history that has been predominantly constructed as a history of production, producers, authorship and individual films most commonly understood as texts, and that has been predominantly evaluative, classificatory or curatorial in its remit.
Methodologically, this practice of film history has often struggled to place films into a wider historical context; its most common approach has been to treat films as involuntary testimony, bearing unconscious material witness to the mentalité or zeitgeist of the period of their production. The idea that films, along with other forms of mass or popular culture, are 'eloquent social documents' reflecting the flow of contemporary history has been an implicit assumption of much writing about cinema, but explanations of how 'the film-making process taps some reservoir of cultural meaning' have remained relatively unformulated and untheorised, little advanced from Siegfried Kracauer's proposal in 1947 that some movies, or some 'pictorial or narrative motifs' reiterated in them, might be understood as 'deep layers of collective mentality which extend more or less below the dimensions of consciousness'. Versions of this proposition have encouraged historians to treat films as historically symptomatic and to examine the 'unconscious' of a filmic text to reveal the biases, tastes or secret fears of the cultural moment in which it was produced. Instinctively reaching for metaphor and allusion as clues, this mode of analysis turns the movies themselves into proxies for the missing historical audience, in the expectation that an interpretation of film content will reveal something about the cultural conditions that produced it and attracted audiences to it. Such analyses pay little attention to their actual modes of circulation at any time, and risk ascribing to individual films a representational significance that may be disproportionate to their capacity for historical agency.

This symptomatic film history has also largely been written without acknowledging the transitory nature of any individual film's exhibition history. Motion picture industries require audiences to cultivate the habit of cinemagoing as a regular and frequent social activity. From very early in their industrial history, motion pictures were understood to be consumables, viewed once, disposed of and replaced by a substitute providing a comparable experience. The routine change of programme was a critical element in the construction of the social habit of attendance, ensuring that any individual movie was likely to be part of a movie theatre audience's experience of cinema for three days or less, with little opportunity to leave a lasting impression before it disappeared indefinitely. Sustaining the habit of viewing required a constant traffic in film prints, ensuring that the evanescent images on the screen formed the most transient and expendable element of the experience of cinema.

Oral histories with cinema audience members consistently tell us that the local rhythms of motion picture circulation and the qualities of the experience of cinema attendance were place-specific and shaped by the continuities of life in the family, the workplace, the neighbourhood and community. Stories that cinemagoers recall return repeatedly to the patterns and highlights of everyday life, its relationships, pressures and resolutions. Only the occasional motion picture proves to be as memorable, and it is as likely to be memorable in its fragments as in its totality.
New cinema history takes these facts as its premise, and focuses its attention on the questions that surround the social history of the experience of cinema rather than the histories of its ephemeral products. By doing so, it becomes possible to engage scholars from more diverse disciplinary backgrounds in this emerging field. Cinema has become a matter of historical interest to researchers who have not been schooled in the professional orthodoxy that the proper business of film studies is the study of films. From the perspective of historical geography, social history, economics, anthropology or population studies, the observation that cinemas are sites of social and cultural significance has as much to do with the patterns of employment, urban development, transport systems and leisure practices that shape cinema's global diffusion as it does with what happens in the evanescent encounter between an individual audience member and a film print. New cinema history uses quantitative information, articulated through the apparatus of databases, spatial analysis and geovisualisation, to advance a range of hypotheses about the relationship of cinemas to social groupings, in the expectation that these hypotheses must be tested by other, qualitative means.

[Slide 2: Adelaide]

The Mapping the Movies project has begun an investigation into the significance of Australian cinemas as sites of social and economic activity, focusing on the period from 1950 to 1970. This period covers a major change in the number, nature and geographic distribution of cinemas in Australia, and one reason for focusing on it is that there is a conventional explanation for those changes in the appearance of television as a functional alternative to cinema. From the perspective that I've outlined, we want to ask questions about the persuasiveness of that explanation, and to consider a range of other factors that might have contributed to the relative decline in cinema attendance over the period. The long-term aim is to combine archival, social and spatial data with oral histories to construct a GIS database of cinema venues and their neighbourhoods, creating maps of distribution practices and audience movements in order to analyse the responsiveness of cinemas and their audiences to social and cultural change. Of course, this has turned out to be a far more ambitious agenda than one grant can achieve, and the part of the project that I want to discuss today might be considered an initial enabling device for the larger project. In one sense, the project is also an attempt to address an issue raised by Alan Liu in his keynote address: the historical parallel to the debate over close and distant reading, which lies in the relationship between microhistories and larger-scale social or cultural history. How many microhistories do you need to make a general historical statement? At one level, the cinema history we are discussing describes a highly localised activity, involving individual sites and the individuals attached to them. But these individuals were also part of a globally-organised supply chain, the profitability of which was dependent on the predictability of their behaviour.
[Slide 3: Film Weekly records]

The primary information source for our initial dataset comes from the annual trade publication, the Film Weekly Yearbook, which contains a listing of cinema exhibition venues in Australia, with minimal information about their location, seating capacity and ownership.

[Slide 4: Record extraction into a simple exportable Excel database file]

We have extracted the information initially into a series of spreadsheets, geocoded each of the venues, and generated a map based on Google Maps technology.
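As a rough sketch of what this extraction-and-geocoding step can look like in practice — the file name, column names and geocoding service here are illustrative assumptions, not the project's actual pipeline:

```python
# Illustrative sketch: geocode Film Weekly venue rows from a spreadsheet export
# and write the JSON a map frontend can load. File and column names are
# hypothetical.
import csv
import json
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="auscinemas-sketch")

venues = []
with open("film_weekly_venues.csv", newline="") as f:
    for row in csv.DictReader(f):
        query = f"{row['venue_name']}, {row['locality']}, {row['state']}, Australia"
        location = geolocator.geocode(query)
        if location is None:
            # Flag for manual geocoding rather than guessing a position.
            venues.append({**row, "needs_manual_geocoding": True})
            continue
        venues.append({
            "name": row["venue_name"],
            "seats": row.get("seating_capacity"),
            "year": row.get("yearbook_year"),
            "lat": location.latitude,
            "lng": location.longitude,
        })

with open("venues.json", "w") as f:
    json.dump(venues, f, indent=2)
```

One file of this kind per yearbook would give the Time Slider described below a year-by-year series of venue sets to animate.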
[Slide 5: Website frontend based on Google Maps technology]

It's worth saying two things about the underlying data at this point, just to highlight what I think is an instance of a wider debate. This project uses the Film Weekly data as a consistent dataset; this is industry-sourced data, which existed for industry use. Its virtues are its volume, its national coverage, and its consistency. What we also know about it, from the other research in our project, is that its data is not always accurate. It doesn't, for example, capture the closure date of cinemas with any accuracy – closure is simply recorded by a cinema's absence from the list in a given year. Within the project, we have had long discussions about how to use this data, and whether to integrate it with the project's main database, CAARP, which has retained a higher level of exactitude, and a much greater level of detail in the data we've stored in it, but does so for smaller areas and narrower periods of time. Our solution has been to maintain a separation between the two datasets, but to allow the AusCinemas site to access CAARP data, and for CAARP to have the capacity to ingest AusCinemas data when we're sure of its reliability. This also, of course, means that AusCinemas will grow from its base data, and in the process distort the consistency of the original dataset. This is an inevitable consequence of the research, and of the crowdsourcing aspect of our project, with which we hope to generate a collection of microhistories which will correct, amplify and complicate the picture we can create from the existing data. A quick tour of what the site does.

[Slide 6: More website frontend screenshots]

Venue data is linked to a set of markers, which represent Locality Type and Cinema Type. Clicking on a marker opens an Information Window which displays all available Film Weekly data and additional linked resources. (We must get prettier icons!)

[Slide 7: Website features: Browse, Search, Time slider and Contribute]

You can search or browse the data, and select a set of venues to display. You can then examine what happens to that set of venues over time by using the Time Slider, either manually or as an automation.

[Slide 8: Brisbane]

[Slide 9: Contribute form]

The crucial bit of all this for the development of the project is the Contribute form, which is how we plan to gather crowdsourced information from the general public, from local historical societies and cinema preservation groups, and potentially from school local history projects. I suspect that we have some lessons to learn from the papers yesterday by Donelle McKinley and Mia Ridge in the session on successful crowdsourcing, but this is our current version. The aim is to collect images, stories, clippings, personal histories, information about screenings, and more generally accounts of the role and function of the cinema in the community, which will augment the work that we will do with students in harvesting information from Trove, Picture Australia and elsewhere. This is also likely to take us outside the boundaries of our initial period of 1948–71, and this will involve a number of revisions and reiterations of the site.

We also have a range of questions to develop as the project grows beyond its current users. One of our original intentions was to develop the geodatabase as a generic piece of software for use by digital humanities researchers who want to map relatively small-scale datasets: a database structure that supports the definition of objects with metadata, allowing additional objects to be added without significant changes to the underlying structure, designed for easy implementation and management, and requiring high-level IT skills only briefly during the establishment of a project, to define objects in the database and in the programming code and to customise the user interface.

• What do other researchers want to use the site for, and how do we make the site more useful to a broader range of users, at a variety of levels of use?
• How do we get people to contribute? How much further can we simplify and clarify the contribution process?
• How closely do we monitor the reliability of contributor-supplied information? How far can we automate input processes to reduce monitoring costs but ensure reliability?
• What do we do when the money runs out?
• Can this system be picked up by others and readily used, or has it become too intertwined with our own data?

[Slide 10: Case Study 1] [click on coloured horizontal arrows to move to linked slides]
[Slide 11: Case Study 1] [slide linked to red arrow on #9, little yellow arrow to go back to #9]
[Slide 12: Case Study 1] [slide linked to green arrow on #9, little yellow arrow to go back to #9]
[Slide 13: Case Study 1] [slide linked to blue arrow 1 on #9, little yellow arrow to go back to #9]
[Slide 14: Case Study 1] [slide linked to blue arrow 2 on #9, little yellow arrow to go back to #9]
[Slide 15: Case Study 1] [slide linked to purple arrow on #9, little yellow arrow to go back to #9]
[Slide 15: Case Study 2] [click on coloured horizontal arrows to move to linked slides]
[Slide 16: Case Study 2] [slide linked to red, green and blue arrows on #15, little yellow arrow to go back to #15]
[Slide 17: Case Study 2] [slide linked to dark pink arrow on #15, little yellow arrow to go back to #15]
[Slide 20: Questions for the Future]

work_4zemnol56vfmjps3drenrec2xu ---- Dynamic digital human models for ergonomic analysis based on humanoid robotics techniques

HAL Id: hal-01171409
https://hal.archives-ouvertes.fr/hal-01171409
Submitted on 29 Jul 2015
To cite this version: Giovanni De Magistris, Alain Micaelli, Jonathan Savin, Clarisse Gaudez, Jacques Marsot. Dynamic digital human models for ergonomic analysis based on humanoid robotics techniques. International Journal of the Digital Human, Inderscience, 2015, 1 (1), pp.81–109. DOI: 10.1504/IJDH.2015.067135. hal-01171409.

Dynamic digital human models for ergonomic analysis based on humanoid robotics techniques

Giovanni De Magistris* and Alain Micaelli
CEA, LIST, LSI, rue de Noetzlin, Gif-sur-Yvette, F-91190, France
E-mail: giovanni_demagistris@hotmail.it
E-mail: alain.micaelli@cea.fr
*Corresponding author

Jonathan Savin, Clarisse Gaudez and Jacques Marsot
Institut National de Recherche et de Sécurité (INRS), rue du Morvan, CS 60027, Vandoeuvre-lès-Nancy, F-54519, France
E-mail: jonathan.savin@inrs.fr
E-mail: clarisse.gaudez@inrs.fr
E-mail: jacques.marsot@inrs.fr

Abstract: Digital human models can be used for biomechanical risk factor assessment of a workstation, and for work activity design when there is no physical equipment that can be tested using actual human postures and forces. Yet using digital human model software packages is usually complex and time-consuming. A challenging aim therefore consists in developing an easy-to-use digital human model capable of computing dynamic, realistic movements and internal characteristics in quasi-real time, based on a simple description of future work tasks, in order to achieve reliable ergonomics assessments of various work task scenarios. We developed such a dynamic digital human model, automatically controlled in force and acceleration, inspired by human motor control and based on robotics and physics simulation. In our simulation framework, the digital human model's motion was controlled by real-world Newtonian physical and mechanical laws. We also simulated and assessed experimental insert-fitting activities according to the occupational repetitive actions (OCRA) ergonomic index. Simulation led to satisfactory results: experimental and simulated ergonomics evaluations were consistent, and both joint torques and digital human model movements were realistic and coherent with human-like behaviours and performances.

Keywords: digital human model; DHM; human learning; dynamic control; ergonomic analysis.

Reference to this paper should be made as follows: De Magistris, G., Micaelli, A., Savin, J., Gaudez, C. and Marsot, J. (xxxx) 'Dynamic digital human models for ergonomic analysis based on humanoid robotics techniques', Int. J. Digital Human, Vol. X, No. Y, pp.xxx–xxx.

Biographical notes: Giovanni De Magistris received his MS in Mechatronics Engineering from Politecnico di Torino, Italy, and from École Nationale Supérieure de Mécanique et des Microtechniques, Besançon, France.
He received his PhD in Robotics and Mechanics in 2013 from the Université Pierre et Marie Curie, Paris, France. He joined the Systems and Technologies Integration Laboratory, French Atomic Energy Commission, in 2010. His current research interests include automatic control, whole-body control for virtual humans, and human behaviours.

Alain Micaelli received his degree in Engineering and PhD in Automatic Control and Signal Processing in 1979 and 1982, respectively, from the École Nationale Supérieure des Télécommunications, Paris, France, and from the University of Paris-Sud. He joined the French Atomic Energy Commission in 1982 and has been involved in several national and international projects dealing with teleoperation and mobile robotics. He is currently a Research Director in the field of automatic control. His research interests include the control of manipulators, telemanipulators, mobile robots, virtual reality and, more specifically, the virtual manikin.

Jonathan Savin is a graduate Engineer from the National Superior School of Physics (Télécom Physique, Strasbourg, France), and holds a Master in Mechanics and Engineering (with a biomechanics specialisation). After working for various companies as a Software Development Engineer, he joined the French Research and Safety Institute for the Prevention of Occupational Accidents and Diseases (INRS) as a project leader. His work focuses mainly on the integration of occupational risks during the initial design phases for work equipment, and specifically on assessing emerging design tools and techniques in this field (digital humans, virtual reality and augmented reality).

Clarisse Gaudez is a Medical Doctor in charge of research at INRS. She received her MD in 2001 from the University of Clermont-Ferrand and her PhD in Biomechanics in 2004 from the University of Paris VI. She also received her university degree in Applied Ergonomics in 1999. She joined INRS in 2004. Her research interests are related to the prevention of work-related musculoskeletal disorders and the prevention of occupational accidents induced by movement disturbance. Her work focuses on motor control, movement variability, and the gestures and strategies used by employees. She has provided research protocol expertise for foreign counterparts of INRS and for international journals.

Jacques Marsot is a graduate Engineer from ENIB (Belfort National School of Engineering). After ten years in industry as a Designer and Project Leader, he joined the French Research and Safety Institute for the Prevention of Occupational Accidents and Diseases (INRS) in 1993. He is currently in charge of the 'Design Engineering of Safe Systems' Laboratory. The activities of this laboratory focus, for example, on the ergonomic design of work equipment, the drawing up of methodologies to design safer machines, and the design of digital manikins or virtual environments to assess the potential strains related to working situations.

1 Introduction

Digital human models (DHMs) can be used to improve workplace conditions that affect both worker safety and business success. The goal of this paper is to improve DHMs so that they can be used to help workstation designers reliably identify and assess workplace-induced musculoskeletal disorders (MSDs) and other biomechanical risk factors in the early stages of design.
Due to computer technology progress in the past 20 years, DHMs were developed for computer-aided design (CAD) software tools in order to analyse workstation ergonomics (Claudon et al., 2006). Their diffusion has grown widely with the concept of the 'digital factory' (Arndt, 2006). Indeed, many CAD software tools now integrate DHMs, which allow designers, design offices (Haesen, 2009) and consultants (Urbatic Concept France, 2007) to represent and virtually simulate operators in order to evaluate future workstation ergonomics (reach areas, physical performances, time analysis). Today, DHMs used for design purposes are mentioned in different standards as potential tools for biomechanical risk factor assessment from the early stages of design (AFNOR, 2005, 2007). Used even before the achievement of a physical workstation prototype, DHMs thus contribute to the application of safety principles in the early stages of design, in conformity with European standards (European Union, 2006). Regarding this topic, there is abundant scientific literature about DHMs implementing ergonomics methods or industrial standards [for example, RULA (MacAtamney and Cortlett, 1993), OCRA (Occhipinti, 1998) and EAWS (Schaub et al., 2012)] for workstation ergonomics assessment in the early stages of design (Jayaram et al., 2006; Annarumma et al., 2008; Berlin et al., 2009). They may also play a positive role in communication and coordination between the different stakeholders of a project (designers, users, decision-makers, health and safety staff). The use of DHMs hence makes sense for occupational risk prevention, for instance to reduce the risk of MSD occurrence, which represents a major proportion of declared occupational disorders (Sjogaard et al., 1995; Bernard, 1997). In 2011, MSDs in the USA represented a third of occupational disorders involving days away from work (Bureau of Labor Statistics, 2011); in France, they are the number one disorder compensated by social security, amounting to roughly 80% of recognised occupational disorders and more than nine million lost workdays (CNAM-TS, 2012). Actually, European designers must satisfy technical directive 2006/42/EC on machinery (European Union, 2006) and related harmonised standards. This directive deals with a priori risk assessment. By successive iterations, designers must obtain the lowest possible residual risk level (this is the integrated prevention concept) (AFNOR, 2010). Regarding the MSD issue, the directive requirements were intensified in the early 2010s: many standards related to physical risk assessment were added in the last five years, for instance NF EN 1005-5 (AFNOR, 2007) for MSD risk assessment related to high-rate repetitive tasks. These standards are partly based on the identification and counting of the operator's technical actions (design data) and partly on assessments of biomechanical risk factors (postures, efforts, repetitive movements, task durations or other parameters). In industry, one can find two classical ergonomic assessment modalities. The first is coarse: it is based on observations of the operator's activities through video recording and questionnaires, and it is used to qualitatively evaluate efforts and postures. The second is more precise: it requires advanced metrology means (force sensors or surface electromyography systems to quantitatively measure the operator's exertions; motion capture systems to measure the operator's joint positions).
However, the second modality can hamper, and thus modify, the execution of the analysed tasks (since the operator is equipped with sensors). Moreover, it requires expert skills in biomechanics and physiology. In addition, neither modality can be used in the early design stages, because both require prototypes of future workstations (Badler et al., 1993; Morrisey, 1998; Zhang et al., 2000). DHMs offer a complementary approach which relies on simulation. Although DHMs may cost more than the other two modalities in the early design stages of projects, they allow the examination of multiple design scenarios, even when physical mock-ups or prototypes are not available. As a result, they can rapidly lead to significant overall cost and development time reductions (Chaffin, 2001). When using DHMs, actual measurements using actual workstations are often eventually carried out to test final physical mock-ups or prototypes before the final products are produced and used.

The focus of our research, then, was to develop DHMs capable of performing simulated work tasks with dynamically consistent motions, behaviours and internal characteristics (positions, velocities, accelerations and torques), based on a simple description of future work tasks, in order to achieve realistic ergonomics assessments of various work-task scenarios in the early stages of the design process. To obtain these goals, we developed dynamic DHMs which were automatically controlled in force and acceleration (De Magistris et al., 2013a), and which were inspired by human motor control (Todorov and Jordan, 1998) and based on physics simulation. In our simulation framework, DHM motions were controlled by real-world Newtonian physical and mechanical laws, and the applied forces and torques were calculated by automatic control techniques. Our controller also handled multiple simultaneous tasks (balance, contacts, manipulations) in real time.

This paper consists of eight sections. The second and third sections describe existing DHMs and their limitations. The fourth section presents the advantages of our DHMs. The fifth section describes an application case which was used to experimentally validate our DHMs, and also outlines the principles of our dynamic DHM controls. The sixth section describes outcomes and, in particular, compares experimental and simulation results. Finally, the seventh and eighth sections present a discussion of the issues raised by the approach and its future prospects.

2 A review of DHMs

Around 1960, the scientific literature presented the first DHMs. Initially, DHMs allowed graphical representation of a human in static conditions (for visualisation purposes, users could primarily change the postures and anthropometric dimensions of the DHMs). Since then, dramatic improvements have been made: DHMs were first integrated into CAD software tools, and have since even been integrated into virtual reality systems (Chedmail et al., 2002). Today, designers can have their projects tested by future workers in complete 'scale 1' virtual environments. Various academic and industrial laboratories are still trying to enrich DHM functions and behaviours.
For example, they are trying to upgrade them with advanced calculation units which can simulate realistic human-computer interactions (Pouliquen, 2006), and with real-time joint effort calculations and optimisations of the operator's movements according to physical capabilities and the actual risks present in the work environment (Shahrokhi and Bernard, 2009). However, in this paper, we are primarily interested in the DHM tools which are used for workstation design. The main functions of such tools are:

1 The digital operator's representation in 3D CAD software tools. For this purpose, the DHMs can be feminine and masculine models characterised by anthropometrical or biomechanical factors (angular range of motion, maximum exertion, etc.), which are selected by the designers from anthropometrical or biomechanical databases.

2 Simulation of postures and/or activity sequences (gestures, posture changes, object prehensions, motions, etc.). In other words, the DHMs can be animated by direct and/or inverse kinematic control, manually, from a database of predefined movements, or by motion analysis systems.

3 Verification of anthropometric prescriptions, collision detection and task execution time calculation. In other words, the operator's reach area and visual field are realistically and accurately included in the functionalities of the DHMs.

4 Assessment of biomechanical and/or physiological constraints through the use of common ergonomic indices. Various ergonomic evaluators such as the revised equations from NIOSH (1991), the RULA method (MacAtamney and Cortlett, 1993), OWAS (Karu et al., 1977), EAWS (http://inderscience.metapress.com/content/m850j18564428m27/) and the Snook tables (Liberty Mutual Tables, http://www.ccad.uiowa.edu/vsr/research/standard-ergonomicassesments/liberty-mutual/) are used to evaluate the simulated tasks completed by the DHMs.

Table 1 sums up the main characteristics and functionalities of DHMs used by the main industrial or academic stakeholders in the CAD field. Table 1 shows that there are two groups of DHMs: on the one hand, those used in generic industrial contexts [for example, Jack (Badler, 1997), DELMIA (http://www.3ds.com/products-services/delmia), SAMMIE CAD (Porter et al., 2004) and ERGOMAN – PROCESS ENGINEER (Schaub et al., 1997)]; and, on the other, those designed for a specific type of application [for example, RAMSIS (Seidl, 2004), BHMS (Rice, 2004), MAN3D (Monnier, 2004) and SantosHuman (VSR Research Group, 2004; Vignes, 2004)]. Table 1 includes the DHMs' names and their producers. Columns three to ten are: 'animation method' (AM), where 'direct' (D) stands for direct kinematics and 'inverse' (I) for inverse kinematics; 'field of view' (FOV), the extent of the observable world seen at any given moment by the DHM; 'reach area' (RA), the objects and parts of the workstation that the DHM can reach; 'collision detection' (CD), the detection of contacts between two or more objects or between humans and objects; 'static effort' (SE), human efforts in static conditions; 'ergonomic assessment indices' (EAI), the types of workstation assessments; 'motion capture' (MC), optional modules that can record the movements of objects or people; and 'methods time measurement' (MTM), the calculation of standard times in which a worker can complete a task.
Table 1 Main characteristics and functionalities of DHMs

Model                    | Society              | AM  | FOV | RA  | CD  | SE  | EAI                 | MC  | MTM
Jack                     | Siemens PLM Software | D/I | Yes | Yes | Yes | Yes | NIOSH, OWAS, RULA   | Yes | Yes
Delmia                   | Dassault Systèmes    | D/I | Yes | Yes | Yes | Yes | RULA                | No  | No
ErgoMan Process Engineer | Dassault Systèmes    | D   | Yes | Yes | No  | Yes | NIOSH, Snook tables | No  | Yes
Ramsis                   | Human Solutions      | D/I | Yes | Yes | Yes | Yes | No                  | No  | No
SAMMIE CAD               | SAMMIE CAD Limited   | D/I | Yes | Yes | Yes | No  | NIOSH, RULA         | No  | No
BHMS                     | Boeing               | D   | Yes | Yes | Yes | No  | No                  | No  | No
MAN3D                    | IFSTTAR              | D/I | Yes | Yes | Yes | Yes | No                  | Yes | No
SantosHuman              | SantosHuman Inc.     | D/I | No  | Yes | Yes | Yes | NIOSH, Snook tables | Yes | No

3 DHM limitations

When using DHMs integrated into CAD software tools during the early stages of design, one must consider various limitations, which are clearly identified in the related literature:

1 DHM posture set-up is complex. It can be completely subjective when controlled by the designers, whether this task is carried out directly, using a keyboard, or experimentally, using a puppet (Yoshizaki et al., 2011). It can also rely on optimisation algorithms (Chaffin, 1997; Center for Ergonomics, 2004), computer techniques (Zhang et al., 2010), experimental data from motion capture devices (Wang, 2008; Fritzsche et al., 2011) or even the operating procedures given by process/method engineers (Kuo and Wang, 2009).

2 Most CAD DHMs only consider one or a few static postures; they thus neglect constraints related to maintaining postures or balance (Lämkull et al., 2009), and therefore cannot calculate complete movements, with acceleration and inertia effects.

3 DHM calculations for some ergonomic indices are not reliable, especially those that rely on relative exertions. In fact, when DHMs calculate such relative exertions, they compare physics-based simulated exertions to built-in maximal effort databases which are generally incomplete or approximate. The resulting errors in maximal exertions consequently lead to erroneous ergonomic indices (Savin, 2011).

4 Interpretation of simulation results with DHMs requires real ergonomic skills and substantial knowledge of MSD emergence mechanisms and related scientific fields (ergonomics, biomechanics, physiology, etc.) (Dukic et al., 2007). Actually, ergonomics processes require transverse cooperation between the different stakeholders (design, manufacturing, prevention) in a proactive ergonomic approach to design (Falck and Rosenqvist, 2012).

For these reasons, biomechanical risk factor assessment based upon DHM simulations can lead to underestimation: some studies report that up to 40%–50% of simulations classify risk lower than expected (Lämkull et al., 2009; Savin, 2011), which can lead to unsuitable design choices (Malchaire, 2011).

4 Realistic simulation of human motion

The aim of our work was to address some of the DHM limitations described in Section 3:

1 We aimed to simplify and accelerate DHM simulation and animation processes. Our DHM allowed the autonomous simulation of motions (not only postures), given only minimal information about future work tasks.
2 We aimed to give DHMs autonomous, objective and realistic behaviours from a global movement standpoint (postures, trajectories), as well as to improve the data used to quantitatively define such behaviours (positions, velocities, accelerations, efforts, etc.), in order to calculate reliable ergonomic assessment indices: our DHM took account of dynamics, balance control and contact forces, which are not considered by current DHMs in CAD software tools.

Previous works (Colette et al., 2007, 2008; Mansour et al., 2011; Liu et al., 2011) showed that humanoid robotics techniques can be used to simulate autonomous motions. We therefore tried to create DHM control algorithms that were more objective and reliable than those of the DHMs currently integrated into CAD software tools, by simultaneously considering both human behaviours and Newtonian mechanics laws.

5 Development and validation processes

Figure 1 presents our development and validation process for the dynamic DHM command laws, which were based on humanoid robotics techniques and inspired by human motor control laws. The process relied, on the one hand, on the development of a simulation tool and, on the other, on the validation of the tool by comparison of experimental and simulated data and the corresponding ergonomic evaluations.

Figure 1 Development and validation process

5.1 Human subject experiments

In the framework of our work, we chose to complete an experimental case study. We chose to focus on repetitive manual assembly tasks, which can present significant MSD risks (Kilbom and Persson, 1987) when performed daily as main work activities. Assembly tasks being very diverse, we selected an insert-fitting activity, which is a common work task, especially in the automobile or appliance industries. Moreover, the Laboratory for Biomechanics and Ergonomics (LBE) of the Institut National de Recherche et de Sécurité (INRS) has previously studied this task, which consists of positioning and clamping metallic inserts on a plastic dashboard for a vehicle, both in a laboratory and in industrial settings (Gaudez, 2008). The institutional ethics committee approved our experiments. Eleven healthy right-handed subjects (nine males and two females) took part in the case study [age = 29.4 ± 9.2 years (mean ± standard deviation), height = 177.7 ± 10.3 cm, body mass = 75.9 ± 9.3 kg]. The subjects gave written consent before the experiments, and they filled out a health questionnaire. They already had some experience of insert-fitting from their careers.

In De Magistris et al. (2013a), we described experiments conducted by agents of the INRS, which consisted of reproducing an insert-fitting workstation (see Figure 2). In this paper, we used data from these experiments as test data for our model validation process. Figure 2 shows the experimental set-up for the human subject experiments in this paper.

Figure 2 Human subject experiment (see online version for colours)

The subjects were asked to clip ten inserts [see Figure 3(a)] into related supports [see Figures 3(b) and 3(c)] using two methods: using only the fingers, or using a hand-held tool that meets specific ergonomic criteria (NST-n168, 1998) (see Figure 4).

Figure 3 Insert and support: (a) insert; (b) support; (c) insert placed on the support (see online version for colours)
Figure 4 Hand-held tool that meets specific ergonomic criteria (see online version for colours)

The subjects used four different strategies to fit the inserts using only their fingers:

1 four subjects picked up the ten inserts one by one and clipped them onto the supports using only their right hands
2 four subjects picked up the ten inserts one by one from the table with their left hands, then transferred them to their right hands, which they then used to clip the inserts onto the supports
3 two subjects picked up the ten inserts all at once from the table with their left hands, then picked up the inserts from their left hands with their right hands, which they then used to clip the inserts onto the supports
4 one subject picked up the inserts one by one from the table with the right hand, transferred them to the left hand to position them properly, then transferred them back to the right hand, which was used to clip the inserts onto the supports.

For the remainder of this paper, we only analysed the first two strategies. The last two strategies were not analysed because the number of subjects was not sufficient for meaningful analysis. When using the hand-held tool, all of the subjects used the same strategy: they all picked up the ten inserts one by one with their left hands and placed the inserts on the tool, which they held in their right hands. They then clipped the inserts onto the supports using only the tool.

5.2 DHM simulation

5.2.1 Human body dynamics model

In this paper, the DHM was divided into two separately articulated rigid-body branches, which were used to model the human body and the hands. The human body was kinematically modelled as a set of articulated rigid bodies organised into a redundant tree structure (see Figure 5). The rigid bodies were characterised by their degrees of freedom (dof). Depending on the functions of the corresponding human segments, each articulation of the tree was modelled by a number of revolute joints. The human body was modelled and controlled at two separate levels, to reduce complexity: the first level was used to control the body, the second to control the hands. The body model comprised 39 joint dof and 6 root dof, with 8 dof for each leg and 7 dof for each arm (see Figure 5). The root was not controlled. The DHM dimensions were based on the subjects' anthropometries (Hanavan, 1964). The control of the body model is described in more detail later.

Figure 5 (a) Body model with skinning and collision geometry (b) Hand model with skinning and collision geometry (see online version for colours)

Each hand model had 20 dof (see Figure 5). We used a proportional-derivative controller to control the joint positions θ, where the desired positions θd correspond to an opened/closed hand and to different preset grasps.
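A minimal sketch of such a joint-space proportional-derivative law follows; the gain values and the grasp target are illustrative assumptions, not the study's tuned parameters:

```python
import numpy as np

def pd_hand_torques(theta, theta_dot, theta_d, kp=5.0, kd=0.5):
    """Joint-space PD law: tau = Kp * (theta_d - theta) - Kd * theta_dot.

    theta, theta_dot and theta_d are 20-vectors, one entry per hand dof;
    kp and kd are illustrative gains.
    """
    return kp * (np.asarray(theta_d) - np.asarray(theta)) - kd * np.asarray(theta_dot)

# Drive the hand from an open posture toward a hypothetical preset pinch grasp:
open_hand = np.zeros(20)
pinch_grasp = np.full(20, 0.3)   # illustrative joint targets in radians
tau = pd_hand_torques(open_hand, np.zeros(20), pinch_grasp)
```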
The dynamics model of a robot, which we used to control our DHM, was a second-order system:

$M\dot{T} + NT + G_r = L\tau + \sum_j J_{c_j}^t W_{c_j}^r + \sum_k J_{e_k}^t W_{e_k}^r$ (1)

In equation (1), M is the generalised inertia matrix, NT stands for the centripetal and Coriolis forces, $\dot{T}$ and $T = [\dot{q}_1 \cdots \dot{q}_{ndof}\; V_{root}]^t$ are respectively the acceleration and velocity vectors in generalised coordinates, $G_r$ is the gravity force, $\tau = [\tau_1 \cdots \tau_{ndof}]^t$ is the joint torque vector, $L = [0_{(ndof,6)}\; I_{ndof}]^t$ is the matrix used to select the actuated dof, and $W = [\Gamma\; F]^t$ denotes the external wrenches (see Figure 6), where Γ is the moment and F the force. The superscript r denotes 'real' wrench values in a simulation. The subscript c stands for non-sliding contacts at known fixed locations, such as the contacts between the feet and the ground. The subscript e stands for unknown contacts with the environment.

Figure 6 DMU scenario with wrenches: com (centre of mass) for balance, head following end effector (EE) movement, thorax avoiding large movements, c (contacts) for non-sliding contacts, lhand (left hand) and rhand (right hand) end effectors for performing handling tasks (see online version for colours). Note: The virtual object at the top right is a model of an insertion.

5.2.2 Human-like dynamic DHM controls

To control DHM movements and postures, we developed a multi-objective DHM controller. Common control techniques are based on pure stiffness compensation of internal and external disturbances. In this paper, we used a controller which combined both feed-forward and feed-back techniques (De Magistris et al., 2013a), and which was inspired by human motor control principles. This is the novelty of our DHM, since previous DHMs were generally based on kinematic control techniques. In addition, when a multi-body system contacts another object, it is important to make the limbs more compliant to avoid 'contact instabilities' (Hogan, 1990). We therefore also noted that feed-forward control was needed to properly control DHMs as mechanical systems. A number of studies have shown that the nervous system uses internal representations to anticipate the consequences of dynamic interaction forces. In particular, Lackner and Dizio (1994) demonstrated that the central nervous system (CNS) is able to predict centripetal and Coriolis forces, and Gribble and Ostry (1999) demonstrated compensation of interaction torques during multijoint limb movements. These studies suggested that the nervous system has sophisticated anticipatory capabilities. We therefore also developed an accurate internal representation, or inverse model, which we used to control body dynamics in object-filled environments. Based on the notion underlying the acceleration-based control method (Abe et al., 2007; Colette et al., 2008) and the Jacobian-transpose (JT) control method (Pratt et al., 1996; Liu et al., 2011; De Magistris et al., 2011, 2013b), we developed a combined anticipatory feed-forward and feed-back control system. This controller was formulated as two successive quadratic programming (QP) problems with multiple dof, which were used to simultaneously solve all the constraint equations. The first QP problem was the feed-forward control and the second QP problem was the feed-back control. This computational optimisation framework was first described in De Magistris et al. (2013a, 2014).
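The two-QP scheme itself is beyond a short sketch, but the role equation (1) plays in it can be illustrated with a heavily simplified, unconstrained stand-in: given a desired generalised acceleration produced by the task objectives and a known contact wrench, solve the dynamics for the actuated joint torques in a least-squares sense. All quantities are assumed to be supplied by the simulator; a real implementation would instead solve the two constrained QPs described above.

```python
import numpy as np

def torques_from_desired_acceleration(M, NT, G_r, L, Jc, Wc, Tdot_des):
    """Simplified inversion of equation (1) for the joint torques tau.

    M:        generalised inertia matrix (n x n)
    NT:       centripetal/Coriolis force vector (n,)
    G_r:      gravity force vector (n,)
    L:        actuated-dof selection matrix (n x ndof)
    Jc, Wc:   one contact Jacobian (6 x n) and its wrench (6,)
    Tdot_des: desired generalised acceleration (n,)

    Solves L @ tau = M @ Tdot_des + NT + G_r - Jc.T @ Wc in the
    least-squares sense (the root dof are unactuated, so lstsq rather
    than an exact solve).
    """
    rhs = M @ Tdot_des + NT + G_r - Jc.T @ Wc
    tau, *_ = np.linalg.lstsq(L, rhs, rcond=None)
    return tau
```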
To simulate the tasks described in De Magistris et al. (2013a), several objectives were identified and prioritised:

1 Centre of mass (com). The dynamic controller maintained the DHM's balance by imposing that the horizontal-plane projection of the centre of mass (com) lay within a convex support region (Bretl and Lall, 2008).
2 Thorax. We observed that the subjects' thorax orientation varied very little during the experimental tasks. We therefore controlled the DHM's thorax orientation to stay as close as possible to its initial orientation.
3 Posture. We specified one of the DHM's joint positions as a reference position for the entire simulation, to obtain more realistic movements and to avoid singularities.
4 End effectors (EE). This feature performed the specific manipulation tasks.
5 Head. During the experimental tasks, we observed that the head followed the movements of the end effector performing the predominant manipulation task. We therefore specified this as our head objective.
6 Contact force. We set the contact force to zero for a regularisation of the QP problem.
7 Gravity compensation. We included a gravity compensation objective to make the target tracking control independent of gravity compensation.

5.2.3 Human-like movements

The DHM's movements were characterised by the initial and final points of the trajectories (positions and orientations) and their durations. We also included imposed way-points in order to avoid potential obstacles (for instance, a table edge), because the DHM's controller did not include collision avoidance. For the calculation of desired trajectories and velocities, we took into account experimental studies of human movements found in the literature. Previous papers showed that voluntary movements obey the following three major psycho-physical principles:

• Hick-Hyman's law: the average reaction time $RT_{ave}$ of a real human depends on the logarithm of the number n of probable choices (Hyman, 1953):

$RT_{ave} = d \log_2(n + 1)$ (2)

• Fitts' law: the movement times of a real human depend on the logarithms of the relative accuracies (the ratios between movement amplitudes and target dimensions) (Fitts, 1954):

$D = g + z \log_2(2\Upsilon / P)$ (3)

where D is the duration time, $\Upsilon$ is the amplitude, P is the accuracy, and g and z are empirically determined constants.

• Kinematic invariance: the hand movements of a real human have a bell-shaped speed profile in straight reaching movements (Morasso, 1981). For more complex trajectories (i.e., handwriting), a 2/3 power law can be used to predict the correlation between movement speed and trajectory curvature (Morasso and Mussa-Ivaldi, 1982), described as:

$\dot{s}(t) = Z_s R^{1 - \frac{2}{3}}$ (4)

where $\dot{s}(t)$ is the tangential velocity, R is the radius of curvature and $Z_s$ is a proportionality constant, also called the 'velocity gain factor'. For a real human, more complex trajectories can be divided into overlapping basic trajectories similar to reaching movements.

Such spatio-temporal invariant features of normal movements can be explained by a variety of criteria related to maximum smoothness, such as the minimum jerk criterion (Flash and Hogan, 1985) or the minimum torque-change criterion (Uno et al., 1989). As a result, for the DHM, we implemented a modified minimum jerk criterion with via-points, which was used to calculate trajectories. The methods we used were first presented in De Magistris et al. (2013a), and they were originally inspired by the work of Todorov and Jordan (1998). In order to calculate trajectories for both rotations and translations, we only needed to find the start, intermediate and end points X, the start and end velocities V, and the start and end accelerations A. Intermediate times $T_P$ were found using a nonlinear simplex method, which minimised jerk over all possible passage times.
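A minimal sketch of the basic rest-to-rest minimum-jerk segment (without the via-points used in the paper), whose closed form yields the bell-shaped speed profile mentioned above:

```python
import numpy as np

def min_jerk_segment(x0, xf, tf, n=100):
    """Rest-to-rest minimum-jerk trajectory between scalars x0 and xf.

    x(t) = x0 + (xf - x0) * (10 s^3 - 15 s^4 + 6 s^5), with s = t / tf;
    its derivative gives the characteristic bell-shaped velocity.
    """
    t = np.linspace(0.0, tf, n)
    s = t / tf
    x = x0 + (xf - x0) * (10 * s**3 - 15 * s**4 + 6 * s**5)
    v = (xf - x0) * (30 * s**2 - 60 * s**3 + 30 * s**4) / tf
    return t, x, v

t, x, v = min_jerk_segment(0.0, 0.4, 1.2)   # e.g., a 40 cm reach in 1.2 s
```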
5.2.4 Contacts

The simulations were completed using the XDE-Core physics simulation module developed by CEA-LIST. The simulation module managed the entire simulation in real time, including collision resolution, contact constraints and friction effects, which were modelled in compliance with Coulomb's friction law:

$\|f_{xy}\| \le \mu \|f_z\|$ (5)

where $\|f_{xy}\|$ is the tangential contact force, μ is the dry friction factor and $\|f_z\|$ is the normal contact force. The central component of the XDE-Core physical engine was a generalised virtual mechanism (GVM) mechanical calculation module. The GVM module was used to manage multi-body systems and rigid or deformable contacts, and its mechanical formalisms were based upon Lie groups (Merlhiot, 2009). It also used interactive and real-time algorithms. The XDE-Core physical engine also included a component which detected multiple impacts. This module used a discrete local minimum distance (LMD) algorithm, which did not need to calculate the global distances between objects. The module could detect multiple impacts by only calculating the local minimum distances between one or two points on the surfaces of each of the objects. If one of these distances was zero, the module assumed there was an impact. The module also dilated the meshes used to represent objects, which rounded off edges and gave the module the ability to tolerate minor defects in the meshes. Previous related papers have verified that the performance of the XDE-Core physical engine is good when used in simulation contexts for DHMs in virtual reality environments (Mansour et al., 2011; Liu et al., 2011).

5.2.5 Digital mock-up

The digital mock-up (DMU) scenario (see Figure 6) reproduced the experimental environment and ensured geometric similarity. The inputs used to build the DMU scenario were the workplace spatial organisation (x, y and z dimensions), the insert and tool descriptions (x, y, z positions and weights) and the initial DHM position. The virtual object at the top right of Figure 6 is a model of an insertion with force $F_{ins} = K_{ins} \cdot x_{ins} = 60\,N$.

5.2.6 Task models

To model an insert-fitting task, we proceeded just like work situation designers in design or methods offices: the task was divided into motion sequences or elementary postures. For a simulation, the insert-fitting task was modelled with the finite state machine (FSM) in Figure 7. The FSM used terminology taken from design methods [for example, MTM – methods time measurement (Maynard et al., 1948)] and ergonomic evaluations [for example, OCRA – occupational repetitive actions (Occhipinti, 1998)]. In Figure 7, the states for the one-handed (right hand) insert-fitting task were:

1 WAIT: At the beginning of the insert-fitting task, the DHM's body was in a vertical position, with its arms resting at its sides.
2 REACH: The DHM's hand took a prehension position and reached for an insert. The DHM's eyes followed the motion of the DHM's right hand.
3 GRASP: The DHM closed the fingers of its right hand and took the insert.
4 POSITION: The DHM's right hand moved toward an insertion point. The DHM's eyes followed the DHM's right hand.
5 PLACE: The DHM fitted the insert into a vacant support.
6 RELEASE: The DHM's fingers on its right hand opened.
7 REACH: The DHM's body returned to its initial position.
8 WAIT: At the end of an insert-fitting task, the DHM waited in its initial position. The DHM's body was in a vertical position, with its arms resting at its sides.

Figure 7 The states for an insertion task (one-handed at the top; two-handed at the bottom)

Figure 7 also shows the states for the two-handed (right hand and left hand) insert-fitting task. The states for the insert-fitting task with a tool were the same as the states for the two-handed insert-fitting task, except that the RELEASE LHAND and GRASP RHAND states were replaced by a PLACE LHAND state, because the DHM placed an insert on the tool with its left hand.
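A minimal encoding of the one-handed cycle of Figure 7; `dhm.execute` is a hypothetical hook standing in for the controller's objective switching, and the target labels are illustrative:

```python
# One-handed insert-fitting cycle of Figure 7 as an ordered state machine.
ONE_HANDED_CYCLE = [
    ("WAIT",     "initial_posture"),
    ("REACH",    "insert_pickup_point"),
    ("GRASP",    "close_right_hand"),
    ("POSITION", "insertion_point"),
    ("PLACE",    "apply_insertion_force"),   # F_ins = 60 N in the DMU scenario
    ("RELEASE",  "open_right_hand"),
    ("REACH",    "initial_posture"),
    ("WAIT",     "initial_posture"),
]

def run_insertions(dhm, n_inserts=10, cycle=ONE_HANDED_CYCLE):
    """Step a DHM controller through n_inserts fitting cycles.

    dhm.execute(state, target) is assumed to set the end-effector, head
    and posture objectives for the current state and to return once the
    state's completion condition is met.
    """
    for _ in range(n_inserts):
        for state, target in cycle:
            dhm.execute(state, target)
```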
6 Results

To obtain the results presented in this section, we implemented our DHM on a PC (12 M cache, 2.53 GHz processor, 24 GB of RAM). With a simulation step of 0.01 s, the joint torques were calculated in quasi-real time (the computation durations were 1.5 times the simulation duration).

6.1 Trajectories

To compare trajectories from the real human experiments $X^h$ and from the simulations $X^s$, we analysed the trajectories of the 'POSITION' steps of the finite state machines for four subjects. We chose to compare six experimental medial clip insertion trajectories to six simulated trajectories. We did not compare trajectories for the first and last two clips. The experimental trajectories and the simulated trajectories had different start and end points, and they had different durations. For this reason, we needed to apply a set of elementary affine transformations to compare these trajectories.

6.1.1 Spatial transformations

1 Translation. The first step of the spatial transformations consisted of matching the experimental and simulated start points, by translating the trajectories to their start points:

$X^s = \begin{pmatrix} x^s(t) - x^s(0) \\ y^s(t) - y^s(0) \\ z^s(t) - z^s(0) \end{pmatrix}, \quad X^h = \begin{pmatrix} x^h(t) - x^h(0) \\ y^h(t) - y^h(0) \\ z^h(t) - z^h(0) \end{pmatrix}$ (6)

where the time $t \in [0, \ldots, t_f]$.

2 Rotation between vectors. The second step of the spatial transformations consisted of a rotation, used to rotate the experimental trajectory to match the simulated trajectory, by calculating the cross product of, and the angle between, the end-point vectors $X^s(t_f) = [x^s_f\; y^s_f\; z^s_f]$ and $X^h(t_f) = [x^h_f\; y^h_f\; z^h_f]$. The cross product was calculated as the determinant of a formal matrix:

$X^s(t_f) \times X^h(t_f) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x^s_f & y^s_f & z^s_f \\ x^h_f & y^h_f & z^h_f \end{vmatrix}$ (7)

Using Sarrus' rule, the cross product was expanded to:

$X^s(t_f) \times X^h(t_f) = (y^s_f z^h_f - z^s_f y^h_f)\,\mathbf{i} + (z^s_f x^h_f - x^s_f z^h_f)\,\mathbf{j} + (x^s_f y^h_f - y^s_f x^h_f)\,\mathbf{k} = u_x \mathbf{i} + u_y \mathbf{j} + u_z \mathbf{k}$ (8)

where $u_x = y^s_f z^h_f - z^s_f y^h_f$, $u_y = z^s_f x^h_f - x^s_f z^h_f$ and $u_z = x^s_f y^h_f - y^s_f x^h_f$ are the components of the unit vector $u = (u_x, u_y, u_z)$. The angle between the trajectory vectors was calculated as the arccosine of the scalar product of the trajectories:

$\theta = \arccos\big(X^s(t_f) \cdot X^h(t_f)\big)$ (9)

3 Homothetic transformation. The third step of the spatial transformations was used to match the positions in the experimental trajectory and the simulated trajectory. A homothetic transformation with scale factor λ was used to match the positions:

$\lambda = \frac{X^s(t_f)}{X^h(t_f)}$ (10)
4 Rotation to obtain coplanarity of three key-points. The fourth step of the spatial transformations was used to obtain coplanarity of three key-points in the experimental and simulated trajectories. We calculated the vector $C = (C_x, C_y, C_z)$ of the segment $OX^h$ defined by the start and end points of each trajectory, and a rotation about $OX^h$ was used to obtain coplanarity (see Figure 8).

Figure 8 Rotation to obtain coplanarity of three key-points

To obtain coplanarity of the trajectories $X^h$ and $X^s$, we rotated the simulated trajectory onto the experimental trajectory by γ:

$\gamma = \arccos\left(\frac{\vec{AC} \cdot \vec{BC}}{\|\vec{AC}\|\,\|\vec{BC}\|}\right)$ (11)

where A and B are respectively the projections of C onto $X^h$ and $X^s$.

6.1.2 Distances between trajectories

Various methods from the previous literature can be used to calculate the distances between two trajectories, which are defined by two non-empty point sets. One common method consists of calculating the Hausdorff distance between the two sets of points. This method can be used to measure the 'closeness' of two non-empty point sets which are subsets of a metric space. It assigns a scalar score, or distance, to the two trajectories, which measures the similarity between them (Chen et al., 2011). This distance is defined as:

$d_H(X^h, X^s) = \max\Big\{ \sup_{x \in X^h} \inf_{y \in X^s} d(x, y),\; \sup_{y \in X^s} \inf_{x \in X^h} d(x, y) \Big\}$ (12)

where sup is the supremum and inf is the infimum. The results in this section present both the Hausdorff distances and the average distances between the experimental trajectories and the simulated trajectories. The average distances were calculated using the following definition:

$d_{ave}(X^h, X^s) = \underset{x \in X^h}{\operatorname{mean}}\; \inf_{y \in X^s} d(x, y)$ (13)

6.1.3 Results

Table 2 shows the distances between the experimental trajectories and the simulated trajectories, for the four experimental subjects, at the 'POSITION' steps of the finite state machines (see Figure 7). The distances are given for the six central one-handed insertion tasks. Figure 9 compares the experimental trajectories and the simulated trajectories, for all subjects and for all of the right-handed insertion tasks.

Table 2 Right wrist trajectories – distances between trajectories for all insertions

Subject | Hausdorff distance | Average distance
1       | (1.1 ± 0.3) cm     | (0.5 ± 0.1) cm
2       | (1.5 ± 0.4) cm     | (0.7 ± 0.2) cm
3       | (1.7 ± 0.5) cm     | (0.9 ± 0.3) cm
4       | (2.6 ± 1.2) cm     | (1.5 ± 0.6) cm

Figure 9 Right wrist trajectories for all subjects (see online version for colours)

6.2 Velocities

To compare the trajectories, which were completed with different velocities, a dimensionless normalised time was used:

$\bar{t} = \frac{t}{t_f}$ (14)

where $t_f$ is the duration of the rigid body motion. Multiplying V(t) by $t_f$ and substituting t by $\bar{t}$ makes the velocities independent of the time scale (Schutter, 2010). Figure 10 shows the resulting velocities for the experimental trajectories and the simulated trajectories, at the 'POSITION' step of the finite state machines (see Figure 7), for the four subjects and the right-handed insertion tasks.

Figure 10 Right wrist velocities for all subjects (see online version for colours)
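Equations (12) and (13), used in the trajectory comparisons above, translate directly into a few lines of array code; a sketch over (n, 3) point arrays:

```python
import numpy as np

def _pairwise(A, B):
    """Euclidean distances between every point of A (n,3) and B (m,3)."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def hausdorff_distance(A, B):
    """Symmetric Hausdorff distance of equation (12)."""
    D = _pairwise(A, B)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def average_distance(A, B):
    """Mean nearest-neighbour distance of equation (13), from A to B."""
    return _pairwise(A, B).min(axis=1).mean()
```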
6.3 Torque analysis

The torques calculated during the simulations were suitable for human performance capabilities. For example, the maximum simulated value of the right elbow flexion torque was about 22 N·m, and about 5 N·m for the wrist torque (see Figure 11). The torque values calculated during the simulations were always smaller than the maximum admissible torques at the elbow and wrist joints [the maximum torques at the elbow are approximately 70 N·m for men and 35 N·m for women (Askew et al., 1981), and approximately 8.05 N·m for flexion at the wrist and 6.53 N·m for extension at the wrist for both men and women (Ciriello et al., 2001)].

Figure 11 Simulated torques: (a) right elbow torque (b) right wrist torque (see online version for colours)

6.4 Ergonomic assessment

The OCRA ergonomic index was used to evaluate both the experimental insertion tasks and the simulated insertion tasks (Occhipinti, 1998; European Union, 2006). Tables 3 to 5 show the OCRA assessments. The OCRA index is the ratio between the total number of observed technical actions (ATA) and the total number of recommended technical actions (RTA), for each upper limb:

$OCRA = \frac{ATA}{RTA}$ (15)

$ATA = F \cdot D = \frac{NTC \cdot 60}{CT} \cdot D$ (16)

where F is the frequency of technical actions per minute, NTC is the number of technical actions in a cycle, CT is the cycle time in seconds, and D is the evaluated net duration of the repetitive task during the work shift in minutes; and

$RTA = CF \cdot PoM \cdot ReM \cdot AdM \cdot FoM \cdot D \cdot RcM \cdot DuM$ (17)

where CF is a frequency constant related to the number of technical actions per minute (in this paper, 30 actions per minute); PoM, ReM and FoM are multiplier factors, with values ranging between 0 and 1, selected according to the characteristics of the posture (PoM), repetitiveness (ReM) and force (FoM); AdM is a risk factor for additional elements; DuM is a duration multiplier; and RcM is selected based upon the ability to recover.
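Equations (15)–(17) reduce to a small function; as a rough check, plugging in the experimental means from Table 3 yields a value of the same order as the reported 7.25 (means of ratios and ratios of means differ, so exact agreement is not expected). The multiplier values are the analyst's inputs:

```python
def ocra_index(ntc, ct, d, cf=30.0,
               pom=1.0, rem=1.0, adm=1.0, fom=1.0, rcm=1.0, dum=1.0):
    """OCRA index for one upper limb, equations (15)-(17).

    ntc: technical actions per cycle; ct: cycle time (s);
    d: net duration of the repetitive task (min); the multipliers
    in [0, 1] encode posture, repetitiveness, force, recovery, etc.
    """
    f = ntc * 60.0 / ct                                 # actions per minute
    ata = f * d                                         # eq. (16)
    rta = cf * pom * rem * adm * fom * d * rcm * dum    # eq. (17)
    return ata / rta                                    # eq. (15)

# One-handed task, experimental means from Table 3:
print(ocra_index(ntc=3, ct=3.89, d=0.39, fom=0.64, pom=0.55, rem=0.7))
# -> about 6.3, in the same 'risk' band as the reported 7.25 +/- 3.70
```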
Table 5 OCRA index values for the insertion tasks with a tool

|            | Human experiment (LUL) | Human experiment (RUL) | DHMs (LUL)   | DHMs (RUL)   |
|------------|------------------------|------------------------|--------------|--------------|
| NTC        | 2                      | 2                      | 2            | 2            |
| CT         | 3.99 ± 0.37            | 3.99 ± 0.37            | 4.14 ± 0.35  | 4.14 ± 0.35  |
| F          | 30.28 ± 2.88           | 30.28 ± 2.88           | 29.17 ± 2.56 | 29.17 ± 2.56 |
| D          | 0.40 ± 0.04            | 0.40 ± 0.04            | 0.41 ± 0.03  | 0.41 ± 0.03  |
| ATA        | 12                     | 12                     | 12           | 12           |
| CF         | 30                     | 30                     | 30           | 30           |
| FoM        | 1                      | 0.87 ± 0.14            | 1            | 0.87 ± 0.14  |
| PoM        | 0.61 ± 0.16            | 0.52 ± 0.04            | 0.61 ± 0.16  | 0.51 ± 0.03  |
| ReM        | 0.7                    | 0.7                    | 0.7          | 0.7          |
| AdM        | 1                      | 1                      | 1            | 1            |
| RPA        | 5.12 ± 1.43            | 3.76 ± 0.74            | 5.30 ± 1.44  | 3.82 ± 0.61  |
| DuM        | 1                      | 1                      | 1            | 1            |
| RcM        | 1                      | 1                      | 1            | 1            |
| RTA        | 5.12 ± 1.43            | 3.76 ± 0.74            | 5.30 ± 1.44  | 3.82 ± 0.61  |
| OCRA       | 2.49 ± 0.61            | 3.30 ± 0.67            | 2.40 ± 0.57  | 3.23 ± 0.61  |
| Risk level | Very low risk          | Very low risk          | Very low risk| Very low risk|

Note: Mean ± standard deviation for all subjects.

7 Discussion

Tables 3 to 5 show that the OCRA assessments for the experimental and simulated insertion tasks were consistent with each other. The results also show that the simulated torques were compatible with human performance capabilities. This is particularly important, since common DHMs may compute joint torques and/or working postures that are sometimes not compatible with human performance capabilities, which can lead to erroneous ergonomic assessments (Lämkull et al., 2009; Savin, 2011). Although we obtained similar trajectories and speed profiles in the experiments and in the simulations, the differences were more noticeable for small subjects (see Figure 10). This could be due to the fact that the horizontal distances (distances between the subjects and the insert sockets, or the ranges between the extreme insert sockets) were not adjusted according to the subjects' sizes (instead, the heights of the table were set to 90% of the elbow–ground height, in accordance with the European standard for standing work activities which require normal vision and precision (European standard EN ISO 14738:2008, 2008)).

8 Conclusions and future work

In this paper, we introduced dynamic DHMs which were controlled in force and acceleration. We used the DHMs to simulate experimental insert fitting tasks in quasi-real-time, and we used the simulated postures, times and exertions to calculate OCRA index-based ergonomic assessments. Given limited information on the scenarios (typically initial and final operator positions and clipping forces), the simulated ergonomic assessments were still in the same risk levels as the OCRA assessments for the corresponding experimental results. In addition, the simulated trajectories were similar to the experimental trajectories. These encouraging results show that our DHMs could be used to overcome some of the limitations of common DHMs. For the simulations, we explicitly specified the types of grasps (palmar, pinch, full-handed) and the orientations of the objects in the subjects' hands, based on the final orientations (the objects were attached to the subjects' hands). In the future, in order to generalise the DHMs, prehension functions need to be added to our kinematic model. To avoid excessively increasing the complexity of our kinematic model, with 20 segments and 28 additional dof per hand (Miyata et al., 2005), we need to replace the wrist model with an end-effector whose characteristics (number of joints, types, rotational and translational ranges) can be used to model the observed dof for each type of grasp (Miller et al., 2003). As a result, the end effector will have more dof in pinch mode than in full-handed grasp mode.
In this article, we used several controller parameters. To improve the controller, the task weights could be automatically modified to reduce complex tuning (Salini, 2012). In addition, to take account of obstacles in the workstation, path-planning (Escande, 2008; Toussaint et al., 2007; Lamarche, 2009), obstacle avoidance and self-collision avoidance (Stasse et al., 2008) features could also be added. In the long term, the contributions in this paper, and in future work, will result in the creation of an ergonomic evaluation software tool, consisting of dynamic DHMs with human-like behaviours and characteristics, that can be integrated into CAD software tools.

References

Abe, Y., Silva, M.D. and Popovic, J. (2007) 'Multiobjective control with frictional contacts', in Proceedings ACM SIGGRAPH/EG Symposium on Computer Animation, Aire-la-Ville, Switzerland, pp.249–258.
AFNOR (2005) NF EN 1005-4 – sécurité des machines – performance physique humaine – partie 4, La Plaine St-Denis, AFNOR.
AFNOR (2007) NF EN 1005-5 – sécurité des machines – performance physique humaine – partie 5: appréciation du risque relatif à la manutention répétitive à fréquence élevée, La Plaine St-Denis, AFNOR.
AFNOR (2010) NF EN ISO 12100: Sécurité des machines – principes généraux de conception – appréciation du risque et réduction du risque, La Plaine St-Denis, AFNOR.
Annarumma, M., Pappalardo, M. and Naddeo, A. (2008) 'Methodology development of human task simulation as PLM solution related to OCRA ergonomic analysis', in Computer-Aided Innovation (CAI) – IFIP 20th World Computer Congress, Proceedings of the Second Topical Session on Computer-Aided Innovation, Milano, Italy, pp.19–29.
Arndt, F. (2006) 'The digital factory planning and simulation of production in automotive industry', Informatics in Control, Automation and Robotics I, pp.27–29.
Askew, L.J., An, K.N., Morrey, B.F. and Chao, E.Y. (1981) 'Functional evaluation of the elbow. Normal motion requirements and strength determinations', Orthop. Trans., pp.5–304.
Badler, N. (1997) 'Virtual humans for animation, ergonomics, and simulation', in Proceedings of the IEEE Workshop on Non-rigid and Articulated Motion, pp.28–36.
Badler, N., Phillips, C. and Webber, B. (1993) Simulating Humans: Computer Graphics, Animation, and Control, Oxford University Press, Oxford.
Berlin, C., Örtengren, B., Lämkull, D. and Hanson, L. (2009) 'Corporate-internal vs. national standard – a comparison study of two ergonomics evaluation procedures used in automotive manufacturing', International Journal of Industrial Ergonomics, Vol. 39, No. 6, pp.940–946.
Bernard, B. (1997) Musculoskeletal Disorders and Workplace Factors – A Critical Review of Epidemiologic Evidence for Work-Related Musculoskeletal Disorders of the Neck, Upper Extremity, and Low Back, NIOSH, CDC (Centers for Disease Control and Prevention).
Bretl, T. and Lall, S. (2008) 'Testing static equilibrium for legged robots', IEEE Transactions on Robotics, Vol. 24, No. 4, pp.794–807.
Bureau of Labor Statistics (2011) Non Fatal Occupational Injuries and Illnesses Requiring Days Away from Work, 2010, US Department of Labor, Bureau of Labor Statistics, Washington, DC.
Center for Ergonomics (2004) Energy Expenditure Prediction Program, University of Michigan, College of Engineering, Ann Arbor, MI.
Chaffin, D. (1997) 'Development of computerized human static strength simulation model for job design', Human Factors and Ergonomics in Manufacturing, Vol. 7, No. 4, pp.305–322.
Chaffin, D. (2001) 'Digital human modeling for vehicle and workplace design', in SAE, Warrendale, PA.
Chedmail, P., Maille, B. and Ramstein, E. (2002) 'Etat de l'art sur l'accessibilité et l'étude de l'ergonomie en réalité virtuelle', Mécanique & Industries, Vol. 3, pp.147–152.
Chen, J., Wang, R., Liu, L. and Song, J. (2011) 'Clustering of trajectories based on Hausdorff distance', in International Conference on Electronics, Communications and Control (ICECC), pp.1940–1944.
Ciriello, V.M., Snook, S.H., Webster, B.S. and Dempsey, P. (2001) 'Psychophysical study of six hand movements', Ergonomics, Vol. 44, No. 10, pp.922–936.
Claudon, L., Daille-Lefèvre, B. and Marsot, J. (2006) 'La révolution du numérique: un atout pour concevoir des postes de travail plus sûrs', Hygiène et Sécurité du Travail – Note Documentaire ND 2282, Vol. 210, pp.5–13.
CNAM-TS (2012) TMS: développer des plans de prévention durables pour réduire la progression du risque.
Colette, C., Micaelli, A., Andriot, C. and Lemerle, P. (2007) 'Dynamic balance control of humanoids for multiple grasps and non coplanar frictional contacts', in 7th IEEE-RAS International Conference on Humanoid Robots, Pittsburgh, PA, pp.81–88.
Colette, C., Micaelli, A., Andriot, C. and Lemerle, P. (2008) 'Robust balance optimization control of humanoid robots with multiple non coplanar grasps and frictional contacts', in Proceedings of the IEEE International Conference on Robotics and Automation, Pasadena, USA, pp.3187–3193.
De Magistris, G., Micaelli, A., Andriot, C., Savin, J. and Marsot, J. (2011) 'Dynamic virtual manikin control design for the assessment of the workstation ergonomy', in 1st International Symposium on Digital Human Modeling, Lyon.
De Magistris, G., Micaelli, A., Evrard, P. and Savin, J. (2014) 'A human-like learning control for digital human models in a physics-based virtual environment', The Visual Computer, pp.1–18, DOI: 10.1007/s00371-014-0939-0.
De Magistris, G., Micaelli, A., Evrard, P., Andriot, C., Savin, J., Gaudez, C. and Marsot, J. (2013a) 'Dynamic control of DHM for ergonomic assessments', International Journal of Industrial Ergonomics, Vol. 43, No. 2, pp.170–180.
De Magistris, G., Micaelli, A., Savin, J., Gaudez, C. and Marsot, J. (2013b) 'Dynamic digital human model for ergonomic assessment based on human-like behaviour and requiring a reduced set of data for a simulation', in 2nd International Digital Human Model Symposium, Ann Arbor, USA.
Dukic, T., Ronang, M. and Christmansson, M. (2007) 'Evaluation of ergonomics in a virtual manufacturing process', Journal of Engineering Design, Vol. 18, No. 2, pp.125–137.
Escande, A. (2008) Planification de points d'appui pour la génération de mouvements acycliques: application aux humanoïdes, PhD thesis, Université d'Evry-Val d'Essonne, France.
European Standard EN ISO 14738:2008 (2008) Safety of Machinery – Anthropometric Requirements for the Design of Workstations at Machinery, ISO.
European Union (2006) 'Directive on machinery – Directive 2006/42/EC of the European Parliament and of the Council of 17 May 2006 on machinery', Official Journal of the European Union.
Falck, A. and Rosenqvist, M. (2012) 'What are the obstacles and needs of proactive ergonomics measures at early product development stages? – An interview study in five Swedish companies', International Journal of Industrial Ergonomics, Vol. 42, No. 5, pp.406–415.
Fitts, P. (1954) 'The information capacity of the human motor system in controlling the amplitude of movement', Journal of Experimental Psychology, Vol. 47, No. 6, pp.381–391.
Flash, T. and Hogan, N. (1985) 'The coordination of arm movements: an experimentally confirmed mathematical model', Journal of Neuroscience, Vol. 5, No. 7, pp.1688–1703.
Fritzsche, L., Jerndrusch, R., Leidholdt, W., Bauer, S., Jäckel, T. and Pirger, A. (2011) 'Introducing ema (editor for manual work activities) – a new tool for enhancing accuracy and efficiency of human simulations', in V.G. Duffy (Ed.): Digital Production Planning, Digital Human Modeling, Lecture Notes in Computer Science, Vol. 6777, pp.272–281.
Gaudez, C. (2008) 'Upper limb musculo-skeletal disorders and insert fitting activity in automobile sector: impact on muscular stresses of fitting method and insert position on part', Computer Methods in Biomechanics and Biomedical Engineering, Vol. 11, Supplement 001, pp.101–102.
Gribble, P. and Ostry, D. (1999) 'Compensation for interaction torques during single and multi-joint limb movement', Journal of Neurophysiology, Vol. 82, No. 5, pp.2310–2326.
Haesen, B. (2009) 'Betterlift – introducing a semi-automatic exhaust manipulator to reduce a high absenteeism rate', in Assessment, Elimination and Substantial Reduction of Occupational Risks, European Agency for Safety and Health at Work (EU-OSHA), pp.36–43.
Hanavan, E. (1964) 'A mathematical model of the human body', Wright-Patterson Air Force Base, Report No. AMRL-TR-102, pp.64–102.
Hogan, N. (1990) 'Mechanical impedance of single- and multi-articular systems', in J.M. Winters and S.L. Woo (Eds.): Multiple Muscle Systems: Biomechanics and Movement Organization, Springer-Verlag.
Hyman, R. (1953) 'Stimulus information as a determinant of reaction time', Journal of Experimental Psychology, Vol. 45, No. 3, pp.188–196.
Jayaram, U., Jayaram, S., Shaikh, I., Kim, Y. and Palmer, C. (2006) 'Introducing quantitative analysis methods into virtual environments for real-time and continuous ergonomic evaluations', Computers in Industry, Vol. 57, No. 3, pp.283–296.
Karhu, O., Kansi, P. and Kuorinka, I. (1977) 'Correcting working postures in industry: a practical method for analysis', Applied Ergonomics, Vol. 8, No. 4, pp.199–201.
Kilbom, A. and Persson, J. (1987) 'Work technique and its consequences for musculoskeletal disorders', Ergonomics, Vol. 30, No. 2, pp.273–279.
Kuo, C. and Wang, M. (2009) 'Motion generation from MTM semantics', Computers in Industry, Vol. 60, No. 5, pp.339–348.
Lackner, J. and Dizio, P. (1994) 'Rapid adaptation to Coriolis force perturbations of arm trajectory', Journal of Neurophysiology, Vol. 72, No. 1, pp.299–313.
Lamarche, F. (2009) 'Topoplan: a topological path planner for real time human navigation under floor and ceiling constraints', Computer Graphics Forum, Vol. 28, No. 6, pp.649–658.
Lämkull, D., Hanson, L. and Ortengren, R. (2009) 'A comparative study of digital human modelling simulation results and their outcomes in reality: a case study within manual assembly of automobiles', International Journal of Industrial Ergonomics, Vol. 39, No. 2, pp.428–441.
Liu, M., Micaelli, A., Evrard, P., Escande, A. and Andriot, C. (2011) 'Interactive dynamics and balance of a virtual character during manipulation tasks', in IEEE International Conference on Robotics and Automation, Shanghai, China, pp.1676–1682.
McAtamney, L. and Corlett, E. (1993) 'RULA: a survey method for the investigation of work-related upper limb disorders', Applied Ergonomics, Vol. 24, No. 8, pp.91–99.
Malchaire, J. (2011) Guide: classification des méthodes d'évaluation et/ou de prévention des risques de TMS.
Mansour, D., Micaelli, A. and Lemerle, P. (2011) 'A computational approach for push recovery in case of multiple noncoplanar contacts', in International Conference on Intelligent Robots and Systems (IROS), pp.3213–3220.
Maynard, H., Stegemerten, G. and Schwab, J. (1948) Methods-time Measurement, McGraw Hill Book Company, New York.
Merlhiot, X. (2009) 'Extension of a time-stepping compatible contact determination method between rigid bodies to deformable models', in Proceedings of the Multibody Dynamics ECCOMAS Thematic Conference.
Miller, A., Knoop, S., Christensen, H. and Allen, P. (2003) 'Automatic grasp planning using shape primitives', in IEEE International Conference on Robotics and Automation, Vol. 2, pp.1824–1829.
Miyata, N., Kouki, M., Mochimaru, M., Kawachi, K. and Kurihara, T. (2005) 'Hand link modelling and motion generation from motion capture data based on 3d joint kinematics', in Proceedings SAE International, Iowa.
Monnier, G. (2004) Simulation de mouvements humains complexes et prédiction de l'inconfort associé – application à l'évaluation ergonomique du bouclage de la ceinture de sécurité, PhD thesis, Institut National des Sciences Appliquées de Lyon.
Morasso, P. (1981) 'Spatial control of arm movements', Experimental Brain Research, Vol. 42, No. 2, pp.223–227.
Morasso, P. and Mussa-Ivaldi, F. (1982) 'Trajectory formation and handwriting: a computational model', Biological Cybernetics, Vol. 45, No. 2, pp.131–142.
Morrisey, M. (1998) 'Human-centric design', Mechanical Engineering, Vol. 120, No. 7, pp.60–62.
NIOSH (1991) Work Practice Guide for Manual Lifting, Technical report 81-122, Department of Health and Human Services, San Francisco, California.
NST-n168 (1998) Ergonomie des outils à main – problématique et état de l'art, Note scientifique et technique de l'INRS.
Occhipinti, E. (1998) 'OCRA, a concise index for the assessment of exposure to repetitive movements of the upper limbs', Ergonomics, Vol. 41, No. 9, pp.1290–1311.
Porter, J.M., Case, K., Marshall, R., Gyi, D. and Sims, R. (2004) 'Beyond Jack and Jill: designing for individuals using Hadrian', International Journal of Industrial Ergonomics, Vol. 33, No. 3, pp.249–264.
Pouliquen, M. (2006) Proposition d'un modèle de main pour la simulation des interactions homme-machine dans un environnement virtuel: application à la prévention des accidents aux mains, Vol. NTS 263, p.268.
Pratt, J., Torres, A., Dilworth, P. and Pratt, G. (1996) 'Virtual actuator control', in IEEE International Conference on Intelligent Robots and Systems, pp.1219–1226.
Rice, S. (2004) 'Boeing human modeling system', in N.J. Delleman, C.M. Haslegrave and D.B. Chaffin (Eds.): Working Postures and Movements: Tools for Evaluation and Engineering, pp.462–465, CRC Press.
Salini, J. (2012) Dynamic Control for the Task/posture Coordination of Humanoids: Toward Synthesis of Complex Activities, PhD thesis, University of Pierre and Marie Curie.
Savin, J. (2011) 'Digital human manikins for work-task ergonomic assessment: which degree of confidence and using constraints?', Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture, Vol. 225, pp.1401–1409.
Schaub, K., Caragnano, G., Britzke, B. and Bruder, R. (2012) 'The European Assembly Worksheet', Theoretical Issues in Ergonomics Science.
Schaub, K., Landau, K., Menges, R. and Grossmann, K. (1997) 'A computer-aided tool for ergonomic workplace design and preventive health care', Human Factors and Ergonomics in Manufacturing, Vol. 7, No. 4, pp.269–304.
Schutter, J.D. (2010) 'Invariant description of rigid body motion trajectories', ASME Journal of Mechanisms and Robotics, Vol. 2, No. 1, p.9.
Seidl, A. (2004) 'The RAMSIS human simulation tool', in N.J. Delleman, C.M. Haslegrave and D.B. Chaffin (Eds.): Working Postures and Movements: Tools for Evaluation and Engineering, pp.445–450, CRC Press.
Shahrokhi, M. and Bernard, A. (2009) 'A framework to develop an analysis agent for evaluating human performance in manufacturing systems', CIRP Journal of Manufacturing Science and Technology, Vol. 2, No. 1, pp.55–60.
Sjogaard, G., Sejersted, O.M., Winkel, J., Smolander, J., Jorgensen, K. and Westgaard, R. (1995) 'Exposure assessment and mechanisms of pathogenesis in work-related musculoskeletal disorders: significant aspects in the documentation of risk factors', in O. Svane and C. Johansen (Eds.): Work and Health. Scientific Basis of Progress in the Working Environment.
Stasse, O., Escande, A., Mansard, N., Miossec, S., Evrard, P. and Kheddar, A. (2008) 'Real time self-collision avoidance task on HRP-2 humanoid robot', in IEEE International Conference on Robotics and Automation, Pasadena, USA, pp.3200–3205.
Todorov, E. and Jordan, M. (1998) 'Smoothness maximization along a predefined path accurately predicts the speed profiles of complex arm movements', Journal of Neurophysiology, Vol. 80, No. 2, pp.697–714.
Toussaint, M., Gienger, M. and Goerick, C. (2007) 'Optimization of sequential attractor-based movement for compact behaviour generation', in IEEE International Conference on Humanoid Robots.
Uno, Y., Kawato, M. and Suzuki, R. (1989) 'Formation and control of optimal trajectory in human multijoint arm movement: minimum torque-change model', Biological Cybernetics, Vol. 61, No. 1, pp.89–101.
Urbatic Concept France (2007) 'Une approche innovante de l'ergonomie', Ergonoma Journal, Vol. 48, No. 7, pp.22–23.
Vignes, R. (2004) Modeling Muscle Fatigue in Digital Humans, Tech. rep., Center for Computer-Aided Design, The University of Iowa.
VSR Research Group (2004) Technical Report for Project Virtual Soldier Research, Tech. rep., Center for Computer-Aided Design, The University of Iowa.
Wang, X. (2008) Contribution à la simulation du mouvement humain en vue d'applications en ergonomie, Mémoire de HDR, Université Claude Bernard – Lyon I, Report No. 5-2008.
Yoshizaki, W., Suguira, Y., Chiou, A., Hashimoto, S., Inami, M., Igarashi, T., Akazawa, Y., Kawachi, K., Kagami, S. and Mochimaru, M. (2011) 'An actuated physical puppet as an input device for controlling a digital manikin', in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, USA, pp.637–646.
Zhang, B., Horvath, I., Molenbroek, J. and Snijders, C. (2010) 'Using artificial neural networks for human body posture prediction', International Journal of Industrial Ergonomics, Vol. 40, No. 4, pp.414–424.
Zhang, X., Nussbaum, M. and Chaffin, D. (2000) 'Back lift versus leg lift: an index and visualization of dynamic lifting strategies', Journal of Biomechanics, Vol. 33, No. 6, pp.777–782.
work_522o6sdt2jfpxejcre57oeradm ---- An Optical Character Recognition Software Benchmark for Old Dutch Texts on the EYRA Platform

Mirjam Cuper (1), Dr. Adriënne Mendrik (2), Maarten van Meersbergen (2), Tom Klaver (2), Pushpanjali Pawar (2), Dr. Annette Langedijk (3), Lotte Wilms (1) – (1) National Library of the Netherlands (KB), (2) The Netherlands eScience Center, (3) SURF

Digitized collections of printed historical texts are important for research in Digital Humanities. However, acquiring high-quality machine readable texts using currently available Optical Character Recognition (OCR) methods is a challenge. OCR quality is affected by old fonts, old printing techniques, bleedthrough of the ink, paper quality, old spelling, multiple columns and so on. It is unclear which OCR methods perform best. Therefore, we are currently in the process of setting up a benchmark to enable the evaluation of the performance of OCR software on old Dutch texts. The benchmark is being set up on the EYRA benchmark platform (eyrabenchmark.net) developed by The Netherlands eScience Center and SURF.

For the pilot version of the benchmark a data set containing 2055 Dutch book pages (1630-1796) and 1024 Dutch newspaper pages (1618-1945) is made available by the National Library of the Netherlands (KB). This data set contains both scanned pages (OCR method input data) and machine readable text (ground truth that can be used to assess the quality of the OCR method output). This data set is split into training and validation data. The training data can be downloaded and used by algorithm developers to train their OCR algorithms or tune their workflows (pre-processing, layout segmentation, character recognition, post-processing). The EYRA platform offers algorithm developers the opportunity to submit their OCR algorithm or workflow to the EYRA platform in a docker container. The docker container will, in turn, be run on the validation data in the cloud on the Dutch national infrastructure of SURF. The advantage of this set-up is that it prevents over-tuning on the validation data and therefore provides a fair comparison of the performance of the OCR methods. Also, if new validation data is available and added to the benchmark later on, the OCR methods can easily be re-run on the new data.

Various metrics could be used to assess the performance of the OCR methods in comparison to the ground truth. In the pilot we will use the most commonly used metrics (Character Error Rate and Word Error Rate). However, we are planning to add more metrics later on, that address different aspects of the OCR method performance. The EYRA platform uses Observable (observablehq.com) to visualize algorithm results on the platform, to gain more insight into algorithm performance. These visualizations can easily be integrated in a journal paper, which promotes replication of result visualizations. Furthermore the OCR benchmark provides an easy way for OCR method developers to compare their method to other existing methods, by providing the data, metrics, ground truth and algorithms for comparison, replicating algorithm validation in the experiment and results section of a journal paper. For the National Library of the Netherlands, this benchmark provides a way to gain insight into the performance of OCR methods and to select the best available OCR method for their problem of digitizing old Dutch texts. This in turn will provide higher quality digitized texts for Digital Humanities research.
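Since the pilot's metrics are Character Error Rate and Word Error Rate, here is a minimal Python sketch of their standard definitions (Levenshtein edit distance divided by reference length); this is illustrative and not the EYRA platform's actual scoring code:

```python
def edit_distance(ref, hyp):
    # Levenshtein distance between two sequences, via dynamic programming.
    m, n = len(ref), len(hyp)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution
        prev = cur
    return prev[n]

def cer(reference, hypothesis):
    # Character Error Rate: character-level edit distance / reference length.
    return edit_distance(reference, hypothesis) / max(len(reference), 1)

def wer(reference, hypothesis):
    # Word Error Rate: word-level edit distance / number of reference words.
    r, h = reference.split(), hypothesis.split()
    return edit_distance(r, h) / max(len(r), 1)

print(cer("Lake District", "Lake Distriot"))       # 1 substitution / 13 chars
print(wer("the old dutch texts", "the old dutch text"))  # 1 word error / 4 words
```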
work_53isusw3bzhjpk7xlatqqjuqy4 ---- DigitCult | Scientific Journal on Digital Cultures. Published 30 June 2019. Correspondence should be addressed to Federico Meschini, Università per stranieri di Perugia/École normale supérieure de Paris. Email: fmeschini@gmail.com. DigitCult, Scientific Journal on Digital Cultures is an academic journal of international scope, peer-reviewed and open access, aiming to value international research and to present current debate on digital culture, technological innovation and social change. ISSN: 2531-5994. URL: http://www.digitcult.it. Copyright rests with the authors. This work is released under a Creative Commons Attribution (IT) Licence, version 3.0. For details please see http://creativecommons.org/licenses/by/3.0/it/. DigitCult 2019, Vol. 4, Iss. 1, 3–20. DOI: 10.4399/97888255263182 (http://dx.doi.org/10.4399/97888255263182).

Documenti, medialità e racconto. Di cosa parliamo quando parliamo di Digital Scholarship. [Documents, Mediality and Narration. What We Talk About When We Talk About Digital Scholarship.]

Abstract

Digital scholarship consists of both research and publishing methodologies and practices based on the digital paradigm. It has, therefore, a strategic role in the scholarly landscape since it is a meeting place between humanities and hard sciences. Starting from the etymology of scholarship – and the dynamic relationship between the different meanings conveyed – this article focuses on the relationship between digital humanities and digital scholarship. An important aspect of this relationship is the implication by electronic publishing of a pluralistic view of text: these different meanings are a bridge between contiguous but more often than not non-communicating disciplines, such as humanities and media studies. The concluding reflections focus on the constituent elements of digital scholarship and their possible combinations and on the relationship between textual and visual language and their use in scholarly communication.
Federico Meschini, Università per Stranieri di Perugia / École Normale Supérieure de Paris

Computers and Scholars

The term scholarship has no immediate translation in Italian, even though its etymological origins go back to the Latin schola – which "designates both the concept and the place of study"1 – in turn derived from the Greek skholḗ, but destined to take on more meanings than the latter, particularly from the Middle Ages onwards. Schola is the root of scholastĭcus, characterised by a double sense, nominal and adjectival, where the latter indicates everything related to school, a meaning later carried into Italian2. The Latin noun form was itself polysemic, as it initially referred to the teacher, to the learner and to a generic status of erudition, thus implying the activity of research together with that of teaching, without clearly distinguishing them, since both were characterised by a continuous process of study. From here, together with scholāris3, we pass through Old and Middle English with scōlere to arrive at the modern scholar, the agent of scholarship. The suffix -ship transforms the characteristics of the individual into something generalised and relative to the activity itself, referring both to the process, and therefore to the conditions necessary to carry it out, and to the product, together with the supports that make it available. While in Italian the adjectival form of the word scientifico covers the humanities and the exact sciences alike – themselves not always clearly distinguished in the classical and medieval world, just as there was no sharp separation between research and teaching – in the Anglophone context it is scholarly that plays this role of inclusive label, whereas scientific is limited to the hard sciences alone4. Digital Scholarship is therefore, as frequently happens in the digital redefinition of activities developed in and tied to an analogue dimension, a two-edged label, as effective and incisive as it is hard to define. Melanie Schlosser, in the blog Digital Scholarship @ The Libraries5, describes it as "research and teaching that is made possible by digital technologies, or that takes advantage of them to ask and answer questions in new ways" (2012). Schlosser, however, later prefers to use what Abby Smith Rumsey writes on the subject: "Digital scholarship is the use of digital evidence and method, digital authoring, digital publishing, digital curation and preservation, and digital use and reuse of scholarship" (Rumsey 2011, 2). This latter explanation is more effective and complete for two reasons: the first is the presence of the digital not only at a technological level, but at a methodological and epistemological one6; the second is the description of the entire research cycle, in which each phase is declined according to this new modality and the last one reconnects to the first, thus enacting a virtuous circularity.

1 http://www.treccani.it/enciclopedia/schola_%28Enciclopedia-Italiana%29/.
2 http://www.treccani.it/vocabolario/scolastico1/. The noun form, on the other hand, survives in Italian only with reference to Scholastic philosophy, http://www.treccani.it/vocabolario/scolastico2/.
3 Initially synonyms, scholāris and scholastĭcus took on over time a symmetrical meaning, as they came to indicate respectively the carrying out and the result of the learning process (Quinto 2001, 48-49).
4 The German language, exploiting its agglutinative character, starts from a general and inclusive concept, wissenschaft, from which derive naturwissenschaft, the sciences of nature, and geisteswissenschaft, those of the spirit.
5 http://library.osu.edu/blogs/digitalscholarship/. The very title of the blog highlights the strategic role of libraries – easily generalisable even though it refers specifically to Ohio State University – in the redefinition of research activity, expanding the traditional activities of acquisition, preservation and dissemination, particularly as regards the support needed, and not only technological and infrastructural support, in the creation of digital resources. The last post of the blog, from December 2016, recalls the task of libraries as the collaborative space of discussion needed to answer the question "what is digital scholarship and what should libraries be doing to support it" over the 2012-2016 period. With this task concluded and having moved on to the next questions, "What are we doing to support digital scholarship?" and "How can we continue to improve and evolve our digital scholarship program?" (Ibid.), the baton was passed to the official blog of the Research Commons initiative – http://library.osu.edu/researchcommons/ – likewise run by the university libraries, with the aim of providing services for the various phases of the research process, including data discovery, management and visualisation.
6 Not by chance, just a few lines later Rumsey writes (my italics) "The goals of scholarly production remain intact, but fundamental operational changes and epistemological challenges generate new possibilities for analysis, presentation, and reach into new audiences".

The effectiveness of this definition is due to its having been elaborated over an almost decade-long path: Abby Smith Rumsey was the director of the Scholarly Communication Institute (SCI)7 – a venue for shared reflections, held annually at the University of Virginia Library – which, in a first phase from 2003 to 2011 and then from 2012 to 20138, had the task of identifying and proposing strategies to advance scholarly communication, particularly in the humanities, on the basis of the ever wider diffusion of the digital paradigm. After analysing a whole series of specific topics, including research centres in the humanities and visual studies9, of the nine reports produced by the SCI the last two focus, starting from their very titles, on a new-model scholarly communication. The starting point is that the traditional division of processes and roles in scholarly communication is no longer able to respond adequately to the changes caused by the digital revolution and by the transformation under way in academia, in particular the crisis of the humanities, and that it is therefore necessary to develop a model "enacted by individuals and groups playing multiple and overlapping roles" (Rumsey 2010, 8).

7 http://uvasci.org.
8 As a demonstration of how widely this need was felt and shared, in 2011 FORCE11 – http://www.force11.org – also started in Europe, an acrostic of Future Of Research Communication and Escholarship, with aims entirely similar to those of the SCI and promoter of both a conference and a summer school on these same themes.
9 http://uvasci.org/institutes-2003-2011/.
The proposal on the path to follow in order to face these changes adequately focuses on several aspects: the new genres and models of scholarly production; business and copyright models; an adequate recognition of the different professional profiles and roles necessary for the development and growth of the scholarly community; the creation of digital infrastructures shared among publishers, libraries and research centres, so as to facilitate the exchange and diffusion of knowledge; the development of adequate training paths able to provide new generations of humanities scholars with the technological, communicative and managerial skills required in this new system; possible funding from institutions and private bodies, obtained by demonstrating the strategic value of the humanities in this new information landscape (Rumsey 2011, 24-26). It is easy to identify a concatenation among all these different topics, since the change in the nature of the document propagates to and influences the other aspects, both cognitive-formative and socio-economic, and is in turn influenced by them. Precisely on this documentary aspect the two reports contain some relevant observations, in which the discussion continually moves from the level of content to that of expression and vice versa. As far as the former is concerned, the role of the monograph is called into question: as an argument whose very motivation lies in its extension, in being a longform, it is judged still relevant and strategic, but in need of some kind of evolution; in particular, the presence of non-textual content, and consequently a greater granularity, must be considered (Riva 2017). This last aspect is clearly reflected in the structure of the monograph, which must somehow be made explicit: "Monographs are structured like trees, with a long central line or trunk from which many branches lead off and from there, ever smaller branches are spawned" (Rumsey 2010, 15). This simile foregrounds the argumentative consistency that characterises a monograph, without however neglecting the possible hooks to further extensions, left as a task to the reader. On the Web the scenario is diametrically opposite since, immersed in an apparently boundless graph in which the nodes constitute the various informative contents, it is the user who must each time create their own linear and maximally consistent path, selecting among the numerous available options: "The book is the anti-open-Web." (Rumsey 2011, 12). It should be stressed, however, that it is precisely familiarity with this tree structure of the monograph, acquired through a continuous process of study, that allows a scholar to move with agility within an information space, creating the necessary connections as needed and evaluating their validity, first of all against what is already part of their own knowledge: "Perhaps we are so familiar with the monograph form that we no longer notice that few scholars read long-form arguments from the first page to last, in that order. Rather, they move in well-worn paths that run between introductory, reference, citation, and index materials, all centering around the core narrative presentation." (Ivi)
It should be noted, however, that the possibility of making the structure explicit in the digital monograph risks favouring those who already possess this ability and, on the contrary, disadvantaging those who have yet to develop it. This aspect is judged fundamental in the doctorate, the phase of initiation into research. Not by chance, in relation to the reflection on the form of the monograph, the discussion immediately moves to the thesis, the final product of the research doctorate: "what do those digital genres tell us about the 'dissertation-as-proto-book' as the most appropriate preparation for a career of productive scholarship?" (Ibidem, 11). To call the product into question means to question the process. The excessive specialisation that currently characterises the humanities is criticised, as it runs a strong risk of sectorialism and isolation, with a consequent difficulty in disseminating research results and having them make an actual impact; certainly this specialisation is present in the exact sciences too, but there it is at least partly mitigated by group work, a feature often present in the Digital Humanities. What, then, can be the purpose, and the related final product, of a research doctorate, replacing a monograph centred on the vertical accumulation of knowledge on ever more specific topics? The question to answer is how "The dissertation is meant to demonstrate capacity in relation to some body of knowledge [...] and demonstrate capacity as well. Capacity for what is the question now." (Ibidem, 12). One possible answer is the capacity to work directly on structure, the capacity to create connections, "the ability to navigate the online environment and to disseminate knowledge to an audience" (Ivi), to be no longer a lonely scholar but a node of knowledge. On this principle several alternatives can take the place of the monograph: "Can we imagine that a new-model dissertation would be a translation, a collection of essays, original digital objects, or curatorial projects?" (Ivi). It is evident how the very concept of horizontal relation, of juxtaposition – rather than vertical relation, of specialisation – lies at the heart of the options listed, whether linguistic, conceptual, codical or thematic. The reference to the digital object explicitly brings the discourse from the plane of content to that of expression, and to the mixing of heterogeneous communicative codes. A first reflection that is made is, as often happens, of a dichotomous type. "There are two models of multimedia argument: in one, argument is carried by prose and punctuated by media as illustration; in the other, the medium itself bears the burden both of presentation and argumentation" (Rumsey 2010, 15).
The difference underlying this opposition concerns the different role and weight assumed, from time to time, by the various types of content: central line or secondary branch, denotative or connotative, informative or narrative, figure or ground, and – the step after the opposition – the interaction that comes to be established between these two different modalities. Immediately after this statement, two questions are addressed which also appear opposed but are in fact connected. The first, already partly addressed, is how essential linearity is to the development of an argument: multicodicality and granularity inevitably lead to calling linear progression into question, if only because of the possibility of organising the various contents according to their type and of making the existing relationships explicit10. The second point concerns the need, in using a medium, and consequently a communicative code, to possess "basic technical proficiency and literacy skills" (Ivi). This latter type of competence leads to the grammar used by an expressive means and therefore to a diegetic approach; this seems to be called into question by non-linearity, but in fact the latter on the one hand entails the presence of multiple paths – and therefore narratives – and on the other the construction by the user of an autonomous path starting from the single nodes. The presence of a diegetic aspect in this latter case too is confirmed both by the fractal nature of narrative, therefore enclosed also in granular contents (Yorke 2014, 145-147), and by closure (McCloud 1993, 63), the capacity of a user to fill autonomously, and more or less consciously, the missing spaces so as to create a whole that meets the requirements of homogeneity and coherence.
If this were not possible, one would in any case fall into the categories of the anti-plot and of incoherent realities, which nevertheless have their raison d'être in their opposites, the classical plot and coherent realities, and vice versa (McKee 1997, 44-58). The diegetic factor, together with its organisation into elements that can be composed and decomposed as needed, takes on further importance when one moves beyond the synchronic aspect, of juxtaposition, between heterogeneous codes, and considers the diachronic one, fundamental in the development of the longform. At the level of labels, this corresponds to the definitions of multimedia and transmedia and to their two different prefixes, with the latter indicating the presence of a path, and therefore a narration, within a varied media arrangement. Reflecting on the transposition of a long-form argument, originally conceived for a print monograph and later destined for digital publication, Massimo Riva summarises several of the topics proposed here (my italics): "rethinking my book as a digital monograph compelled me to shift the weight of my argument from the written to the visual component, embedding as much of my argument in the latter. At the same time, this also required a substantial shift in my writing strategy [...] investing the written text with a new crucial function: supporting the visualizations (in the shape of captions or internal annotations), on the one hand, and providing a narrative frame which allows the reader to connect the various visualizations among themselves, and follow a path toward some theoretical and methodological conclusions" (Riva 2017, 69). The role of support and narrative contextualisation played by the textual component is all the more necessary the more the non-textual part is characterised by synchronic fruition. Beyond images, of course, this holds – more than for audio and video content – above all for computational content, in which computation is functional not only to the plane of expression (for example, the playback of a video) but to that of content, constituted by a set of possible discrete states, the result of the underlying processing and of users' interaction11. These considerations, on the relationship between the narrative aspect, the mixing of heterogeneous communicative codes, granularity, non-linearity and the fundamental role of technology, finally provide further elements on what digital publishing shares with those expressive forms characterised by these same traits, such as comics or cinema (Posner 2016); the latter can therefore be a precious resource as regards both the relationships between the different codes within single informative blocks (McCloud 1993, 153-155) and the overall narrative balance.

10 Thanks to the D3.js library, the Scalar platform – for the creation of enriched publications, and frequently cited in the two SCI reports – allows different modes of content visualisation (grid, tree, radial or aggregated graph), thus showing the relationships based on the underlying model, which is composed of: single iconographic, sound or audiovisual objects; annotations relating to the objects or to portions of them; pages containing both text and one or more objects; paths able to organise the pages linearly; and tags to group the pages according to a set-theoretic principle (Sayers and Dietrich 2013). The digital edition of Jason Mittell's book on complexity in serial television narration (2015), created precisely with Scalar – http://scalar.usc.edu/works/complex-television/ – extends the print version by presenting the relevant portions of the primary sources to which the print edition refers. Leaving aside the function of extending specific points of the source text through granular content, in itself already strongly hypertextual, this edition faithfully reproduces the original table of contents; therefore, even imagining an integral version including the contents of both editions, both Scalar's model and its visualisation options allow and encourage a non-linear fruition starting from a linear basis. This confirms how this modality is strictly tied to knowledge of the underlying structure, either because it is formalised, as in this case and more generally in digital publications, or because it is empirically extrapolated by an expert reader.
11 An example of computational content related to narrative – by now clearly among the main themes of the reflections contained here – is Hedonometer, a sentiment analysis tool: applied to about 1,300 texts from Project Gutenberg – http://hedonometer.org/books/v1/ – it analysed their emotional arcs, thus verifying their adherence to one of the six basic plots (Reagan et al. 2016).
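Footnote 11 describes the sliding-window approach behind Hedonometer. The sketch below is only illustrative: the actual tool averages crowd-rated word happiness scores from the labMT lexicon over much larger windows, while the tiny VALENCE dictionary and the window sizes here are invented for the example.

```python
import numpy as np

# Toy valence lexicon; the real Hedonometer uses the crowd-rated labMT word list.
VALENCE = {"joy": 8.2, "love": 8.0, "happy": 8.3,
           "death": 1.5, "grief": 2.0, "war": 1.8}

def emotional_arc(words, window=1000, step=200):
    """Average valence of the words in a sliding window over the text."""
    scores = []
    for start in range(0, max(len(words) - window + 1, 1), step):
        chunk = words[start:start + window]
        vals = [VALENCE[w] for w in chunk if w in VALENCE]
        scores.append(float(np.mean(vals)) if vals else float("nan"))
    return np.array(scores)

# Usage: emotional_arc(open("book.txt").read().lower().split())
```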
Media and Humanities

The Wikipedia page on digital scholarship takes up Rumsey's definition, with which it opens its initial paragraph, closing with a decidedly vague attempt – which we will try here to bring into better focus – to establish a relationship with humanities computing: "Digital scholarship has a close association with digital humanities, though the relationship between these terms is unclear"12. On the one hand it is obvious how the digital humanities, in their various declensions, constitute the côté of digital scholarship in the humanities, well before the latter term spread on a large scale; on the other, certain aspects of the former extend strategically to the latter as a whole, in particular everything pertaining to the concept of electronic publication: "Digital humanities is not a unified field but an array of convergent practices that explore a universe in which print is no longer the exclusive or the normative medium in which knowledge is produced and/or disseminated."13 A first proof, at a theoretical level, of this thesis is the explicit reference to the digital humanities in the reflections on the evolution of the scholar, in relation to the use of the computer as a cognitive tool: "One potentially rich space for action for the media studies professor is in a third variant of the digital humanities, the multimodal scholar. [...] She aims to produce work that reconfigures the relationships among author, reader, and technology while investigating the computer simultaneously as a platform, a medium, and a visualization device." (McPherson 2009, 120) A point worth dwelling on in this statement concerns the implicit variants, and the related previous generations, underlying the multimodal scholar. The first refers to that group of scholars engaged, in the footsteps of Father Roberto Busa and his Index Thomisticus, in activities directly tied to computational practices, for which the disciplinary label was, not by chance, humanities computing (Hockey 2004). Characteristics ascribable to this variant are a historical/cultural tradition of several decades, relatively uniform despite its disciplinary heterogeneity, balanced as it was both by the limited size of the scholarly community and by the shared reflection on the centrality of the computational tool in the various practices, itself the object of theoretical/methodological considerations. All this must moreover be situated in a context in which the poor usability of user interfaces pushed towards a knowledge of the underlying technology, from the commands of text-based operating systems to the instructions of programming languages: this mixture of theoretical and technological aspects could not but favour, besides interdisciplinary dialogue, a strong sense of cohesion. Finally, the main distinction within this first generation can be summarised in the dialogical opposition between the qualitative aspect on the one hand, such as text encoding, and the quantitative one on the other, including text analysis (Gigliozzi 1987).
The second and more recent generation identifies itself mainly with the use of the communication tools typical of Web 2.0 – blogs and wikis first of all – as an alternative and complement to the traditional venues of academic publishing, thus extending that editorial heterodoxy which in the first generation was limited, for both pragmatic and cultural reasons, to those research products that could not be reduced to a typographic dimension without distorting their essence, such as textual databases or digital scholarly editions14. It is with the progressive establishment and diffusion of this second generation that the shift takes place, not without criticism, from humanities computing to digital humanities (Vanhoutte 2013), a transition officially sanctioned in the mid-2000s thanks also to the publication – singularly and significantly, in print – of A Companion to Digital Humanities by Blackwell (Schreibman et al. 2004a). The more than evident intention is to define and at the same time expand a field in continuous change, owing to an intertwining of cultural and technological factors, by including practices and consequently disciplines in which the emphasis is on the communicative and multimedia aspect. What comes to be outlined is something varied and heterogeneous which nevertheless "remains deeply interested in text, but […] has redefined itself to embrace the full range of multimedia" (Schreibman et al. 2004b, xxiii). The opening is clearly towards media studies, thus arriving at the third variant described by McPherson, while keeping text, the heart of the traditional humanities, as the privileged focal point. Whether because of the ever growing specialisation of the different fields or because of the greater heterogeneity of this new scenario, the process of harmonisation between the humanities on one side and media disciplines on the other, despite their common redefinition based on the computational/digital paradigm, has not been and is not automatic or linear, and is still characterised by a certain tension. Following the publication of McPherson's article, in January 2010 an exchange of tweets between Matthew Kirschenbaum, Stephen Ramsay and Mark Sample went as far as speaking of a probable feud between the two camps and a consequent "turf war", a decidedly evocative and effective image, so much so that it was later taken up again to describe a possible, pessimistic scenario in the relationship between the humanities and the digital humanities caused by a failed integration between them (Hayles 2012).

12 http://en.wikipedia.org/wiki/Digital_scholarship.
13 "A Digital Humanities Manifesto", http://manifesto.humanities.ucla.edu/2008/12/15/digital-humanities-manifesto/.
14 Gino Roncaglia, describing a situation of discontinuity in the practices of electronic publishing applied to non-fiction, writes: "The field of digital scholarly editions is partly an exception, though it is tied to a set of tools and problems that differ from the idea of 'enrichment' of the text." (Roncaglia 2018, 3). It is also true that the continuous reflections on the nature and model of the edition, and in particular on the concept of data model (Witt 2018), inevitably bring these two paradigms together.
While this last question remains open today and will remain so for some time – since, despite an ever greater contamination and diffusion of computational practices and tools (Stella 2018), one cannot fail to notice a corresponding reaction of entrenchment in conservative positions – the relationship with media studies has changed altogether. It should be noted that in this field there does not seem to be, or at least not to the same degree, that diffidence which characterises many humanities scholars (Tomasin 2017), whether because of the relative youth of the discipline or because of an intrinsic interest in the mechanisms underlying any cognitive/communicative tool. Publications such as The Arclight Guidebook to Media History and the Digital Humanities (Acland and Hoyt 2016) or The Routledge Companion to Media Studies and Digital Humanities (Sayers 2018), again in print, show that this relationship exists and, despite the inevitable disciplinary declensions, presents strategic points of contact15, in particular on the nature of digital documents and their expressive possibilities: "Just as the codex was an improvement over the papyrus scroll […] the digitally mediated "page" offers yet another paradigm shift in the processes of writing and reading. The digital page yields a new axis of depth—a page that layers to other pages, can be seen next to other pages, and can include moving images, still images, sounds." (Friedberg 2009, 150) The reference in the Blackwell companion to the centrality of text, cited above, makes it possible to deepen this relationship between the mediological and the humanistic approach. A first obligatory step is the by no means trivial definition of the concept of text. Patrick Sahle, in his reflections on digital scholarly editions, has elaborated a pluralistic theory of it, in which the text can take on different meanings depending on the point of view assumed, and can therefore be interpreted equally as an idea, a work, a linguistic code, a version, a document and a visual sign (Sahle 2013, III 45-49). Despite the parallelism with other formalisms or approaches16, Sahle's innovation is the organisation of these different interpretations in a wheel, with relationships that are not hierarchical but circular and diametrical.

15 In particular, in the first volume the chapter by Eric Hoyt (2016) focuses on the creation of digital collections of primary sources, on the use and development of programs for their analysis and, finally, on the writing of books and articles containing the results obtained – themes connected to those addressed here; in the second, the chapter Futures of the Book (Bath et al. 2018) analyses the relationship between the printed book and the computer, going beyond the "integrated" view that sees the former completely replaced by the latter.
16 An immediate comparison is with the FRBR model, although the two are not totally isomorphic (Pierazzo 2014, 67). Using a semiotic approach the correspondence is complete, provided that the levels of content and expression are aligned with the other sublevels, considering them therefore as intermediate stages, since the idea corresponds to the substance of content, the work to content, the linguistic code to the form of content, the version to the form of expression, the document to expression and the visual sign to the substance of expression (Barthes, 1964, pp. 105-106).

Figure 1. Patrick Sahle's text wheel.
In media studies the focus falls on those facets tied to materiality and to the plane of expression – document, version and visual sign – which are directly connected to sociological aspects (McKenzie 1985) and are in turn the trait d'union with the multicodical dimension; thanks to the circular structure, however, there is a direct connection, by juxtaposition or opposition, to the facets tied to the plane of content, in a commixture that cannot but recall the intertwining of elements and roles traditionally considered separate that Rumsey underlines. Not by chance, Franz Fischer uses Sahle's text wheel to describe which aspects of textuality are represented in a digital critical edition and how they are connected to one another: "the boundaries between the different aspects are constantly in a state of flux and thus the diplomatic transcription also represents one particular version of the text" (Fischer 2010). Likewise the adjective multimodal – besides referring to the heterogeneous materiality of supports together with their technological and functional characteristics, and to the polysemic definition of scholar seen earlier, in which the distinction between humanities and exact sciences, research and teaching, teacher and learner is decidedly loose – fits this pluralistic vision well. On the basis of this same approach it is possible to reread statements such as McPherson's on the study and use of the computer as, at one and the same time, platform, medium and visualization tool: the computational device cannot prescind from the linguistic aspect, which underlies code and programming languages, and in Sahle's wheel this instance lies directly opposite that of the visual sign; similarly the medium, understood as the union of expression and content, is represented by the diametrical opposition between document and work, and these two interpretations are juxtaposed, respectively, with those of sign and of linguistic code. The opposition between document and work also explains a phenomenon described as apparently contradictory by Rumsey and yet seen as fundamental to the evolving role of libraries: if on the one hand they act as a "trusted conservator and long-term steward of humanities scholarship", on the other they are "a force for innovation and a neutral meeting ground of people from different disciplines and professions to collaborate and experiment" (Rumsey 2010, 21). The first aspect is clearly tied to their traditional function with respect to the document and its physical dimension, while the second acquires fuller sense when read through the lens of the work.

Note: Using a semiotic approach the correspondence is complete, provided the levels of content and expression are aligned with the other sublevels and treated as intermediate stages: the idea then corresponds to the substance of content, the work to content, the linguistic code to the form of content, the version to the form of expression, the document to expression and the visual sign to the substance of expression (Barthes 1964, 105-106).
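Since the wheel's relations are purely positional, they can be captured in a toy data structure. The sketch below is illustrative only – the circular ordering of the six facets is inferred from the oppositions and juxtapositions described above, and the helper functions are invented names, not part of any existing library:

```python
# A toy model of Sahle's text wheel: six facets in circular order, so that
# diametrical opposition and juxtaposition fall out of simple arithmetic.
WHEEL = ["idea", "work", "linguistic code", "version", "document", "visual sign"]

# Barthes's semiotic correspondences, as summarized in the note above.
SEMIOTIC = {
    "idea": "substance of content",
    "work": "content",
    "linguistic code": "form of content",
    "version": "form of expression",
    "document": "expression",
    "visual sign": "substance of expression",
}

def opposite(facet: str) -> str:
    """Facet diametrically opposed on the wheel (three positions away)."""
    i = WHEEL.index(facet)
    return WHEEL[(i + 3) % len(WHEEL)]

def neighbours(facet: str) -> tuple[str, str]:
    """Facets juxtaposed with the given one (adjacent on the wheel)."""
    i = WHEEL.index(facet)
    return WHEEL[(i - 1) % len(WHEEL)], WHEEL[(i + 1) % len(WHEEL)]

# The oppositions named in the text: code vs. visual sign, document vs. work.
assert opposite("linguistic code") == "visual sign"
assert opposite("document") == "work"
```

Opposition here is simply a rotation by half the wheel, which is precisely what makes the relations circular and diametrical rather than hierarchical.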
If this last concept is inflected in the sense of story – and this holds for fiction and non-fiction alike, humanistic and scientific (Lolli 2018) – it is the library that provides that meeting place between heterogeneous actors whose interactions create new relations, connections and states of knowledge with respect to those previously existing: the same type of change that lies at the heart of the nature of story (Yorke 2014). Returning to the relationship between digital scholarship and digital humanities from which we started, a further confirmation of their close relation, this time a pragmatic one, comes from the various centres born out of significant experiences in humanities computing, often with an academic library providing the necessary institutional support: significant examples are the Scholars' Lab at the University of Virginia, a direct evolution of the Electronic Text Center,[17] and the Center for Digital Scholarship at Brown, born from the Scholarly Technology Group.[18] Both centres were indispensable points of reference in the 1990s and early 2000s for the development of the Text Encoding Initiative standard for the digital encoding of texts, particularly in the literary and linguistic fields. Although this standard was conceived and developed mainly for documents and texts belonging to the cultural heritage, its versatility and progressive diffusion, together with the availability of software tools for the XML format on which it is currently based, have extended its use to other sectors as well, scientific publishing in particular (Holmes and Romary 2011). Scholarly publishing, together with the scholarly communication of which it is a subset, was the main field in which the digital component made its effects felt from the very beginning, well before the computer became a device for consuming content, above all in the creation and formatting of editorial products destined for print production. This happened both at a generalist level, with page description languages – among them Adobe's PostScript, used by desktop publishing programs – and at a specialist one, through procedural and descriptive markup languages, represented in the first case by TeX and LaTeX and in the second by SGML first and XML later (Coombs et al. 1987). Subsequently, the open access movement was made possible in practice by the creation of software infrastructures such as platforms for institutional repositories and journals, together with the related metadata standards and protocols (Guerrini 2010). Although open access content continues to be almost exclusively of a traditional kind, such as articles in PDF or datasets,[19] new modes of scholarly communication, based on alternative formats as well as alternative channels, are emerging and spreading. As regards formats, attention is now centred on the computational essay (Somers 2018), in which data and code become an integral and active part of the publication together with the narrative argumentation: the document thereby becomes a true digital edition, since it includes its constitutive components: the operational logic, the user interface and the structured data (Meschini 2018, 65).[20] As for channels and modes of dissemination, we must instead dwell on the other extreme of the document–story axis.

[17] http://scholarslab.lib.virginia.edu/; http://dcs.library.virginia.edu/digital-stewardship-services/etext/.
[18] http://library.brown.edu/create/cds/; http://xml.coverpages.org/stgover.html.
[19] For an overview of the main formats used for datasets, both textual such as CSV and binary, see the Library of Congress guide Format Descriptions for Dataset Formats – http://www.loc.gov/preservation/digital/formats/fdd/dataset_fdd.shtml. To this must of course be added the Linked Data initiative and the various serializations of the RDF standard in its different syntaxes, among them XML, JSON and Turtle.
[20] The structured data must also include the metadata relating to the structure of the document which, together with the other two components, allow the user to navigate it non-linearly (Venerandi 2018).
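The distinction between procedural and descriptive markup mentioned above is easiest to see in a concrete fragment. What follows is a minimal sketch (the toy TEI document and everything in it are invented for illustration, not drawn from any real edition) in which a TEI-encoded text is parsed with Python's lxml library; descriptive markup of this kind records what textual components are – a title, a technical term – and defers any decision about how they should be displayed to later processing:

```python
# A minimal sketch of descriptive (TEI/XML) markup, parsed with lxml.
# The tiny document below is illustrative, not drawn from any real edition.
from lxml import etree

TEI_NS = "http://www.tei-c.org/ns/1.0"  # the TEI namespace

sample = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A toy encoded text</title></titleStmt>
      <publicationStmt><p>Unpublished sketch.</p></publicationStmt>
      <sourceDesc><p>Born digital.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>The <term>document</term> and the <term>story</term> meet here.</p>
    </body>
  </text>
</TEI>
"""

tree = etree.fromstring(sample.encode("utf-8"))
# Descriptive markup names *what* things are (titles, terms), not how they
# should look; rendering decisions are deferred to later processing.
for term in tree.iter(f"{{{TEI_NS}}}term"):
    print(term.text)
```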
The "unclear relationship" between digital scholarship and digital humanities from which we started now acquires greater sharpness: if the essential elements of the humanities are the document and the story, together of course with their dialogic relation, then their re-mediation becomes strategic for the relationship with, and cross-fertilization by, the exact sciences. If this is immediate and intuitive for the document and the digital datum, and for everything that unites and harmonizes these two extremes – the edition, the archive or the digital library – the same cannot be said, and it must therefore be analysed with greater care, for that heterogeneous ensemble defined as digital storytelling (Alexander 2017).

Transmedia Scholarship

A first example of this new mode of dissemination displays significant traits of that form of transmedia storytelling which Henry Jenkins defines as corporate, characterized by a top-down approach (Jenkins 2006, 18). The project Why We Post, a comparative anthropological study of the use of social media, made its results available by taking its cue from its very object of study. Daniel Miller, digital anthropologist and coordinator of the project, states that the holistic approach adopted in the research had in some way to be present in the modes of dissemination as well, and refers moreover to the relationship between the publication of results and large-scale popularization, with education as the mediating element:

"We realized that the conventional way of research dissemination, the books, the journal articles, in some ways that is a little narrow [...] you have many different audiences out there who could be interested in those results, so then you think about how to create a range."[21]

On the basis of these principles the Why We Post website[22] presents interesting characteristics from a structural and medial point of view. The first informative content available is a video of about four minutes presenting the project as a whole,[23] which sets out its aims, the methodology used, based on comparative fieldwork, the main results obtained and, finally, the modes of dissemination adopted, stressing the intention to move from a project of "global research" to a phase of "global education" and inviting viewers to share the social media content produced. The following section of the site, called Discoveries,[24] relies on a progressive and multicodical – or, if one prefers, multimodal – approach to illustrate the results.

[21] CCSCS public talk | Daniel Miller: Why We Post: The anthropology of social media, http://youtu.be/1r_8a9hub78.
[22] http://www.ucl.ac.uk/why-we-post.
[23] http://youtu.be/0jA5B32MP98.
[24] http://www.ucl.ac.uk/why-we-post/discoveries/.
The starting point is a single period, often consisting of a single sentence, stating a principle that frequently contradicts the various received ideas about social networks. These range from the idea that social media do not automatically make people more individualistic or the world more homogeneous, to the creation of new intermediate spaces between the public and the private sphere, to the non-linear relationship between the use of a platform and its underlying technology, to the fact that it is users who shape social media and not the other way round, up to the educational or even privacy-related possibilities offered to those who previously had no access to other kinds of information sources. Each of these claims is developed and argued through a series of discrete contents, roughly six or seven, each relating to one of the countries in which the research was carried out; these contents are accompanied by a short text, whose role is more one of introduction than of contextualization, and consist mainly of a video of a few minutes, with interviews or moments of daily life, or of a Story, a short narrative of just over a thousand characters in which the general claim is inflected and exemplified through a concrete case.

Figure 2. The Discoveries section of the Why We Post website.

The purpose of this section is to provide a general introduction to the results of the research, supported by a careful selection of the documentation collected; it thereby fulfils, with due proportion, the function theorized by Robert Darnton in describing the possible pyramidal, layered structure of an electronic book – or, more generally, of a digital edition – in which, immediately after a high-level description of the topic in question, "The next layer could contain expanded versions of different aspects of the argument, not arranged sequentially as in a narrative, but rather as self-contained units that feed into the topmost story" (Darnton 1999). A consequence of the absence of a sequential narrative arrangement is the greater presence of this narrative dimension within the individual discrete units, as shown by the naming of the stories or by the documentary videos. Although the correspondence is not total, and above all not formalized, Darnton's layered structure finds further echoes in the documental and medial arrangement of Why We Post. The project's YouTube channel[25] contains all the videos produced during the research, and thus corresponds to the third level, "composed of documentation, possibly of different kinds, each set off by interpretative essays" (ibid.), even though the interpretative essays are to be found, together with the theoretical component of the fourth level, in one of the two remaining informational blocks of the project: eleven monographs published open access and available as PDF and in HTML through the digital platform of the University Press of University College London.[26]

[25] http://www.youtube.com/user/whywepost.
[26] http://www.uclpress.co.uk/collections/series-why-we-post. Of the eleven monographs, nine are single-authored, each focusing on one of the sites where the fieldwork was carried out; see http://www.ucl.ac.uk/why-we-post/research-sites. The remaining two are multi-authored and cut across the individual studies: in particular How the World Changed Social Media (Miller et al. 2016), written by nearly all the researchers involved in the project, is at once a sort of introduction and a general recapitulation, since it reports and summarizes in linear, argumentative form the contents distributed across the various discrete informational units. All the monographs have a markedly popularizing register, above all in the language used: this explains, together with their open access availability in several languages and the general interest of the subject, the high number of downloads, in the order of tens of thousands (Costa et al. 2016). In the perspective of the intermingling of qualitative aspects, here tied to the ways contents are structured, and quantitative ones, six of these monographs are available through Topicgraph, http://labs.jstor.org/topicgraph/; this tool, developed within JSTOR's Reimagining the Monograph project (Humphreys et al. 2018), graphically visualizes the distribution of the main subjects contained in a document through an analysis based on topic modeling (Brett 2012). Finally, for a complete list of the publications produced during the project, including the articles published in scholarly journals, see http://www.ucl.ac.uk/why-we-post/about-us/publications/.
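The technique behind Topicgraph can be sketched in a few lines. The example below is a minimal illustration using scikit-learn's LatentDirichletAllocation; the four-document corpus and all parameter values are invented, and this shows the general method rather than Topicgraph's own pipeline:

```python
# Minimal topic-modelling sketch (illustrative corpus; not Topicgraph's code).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "social media platforms shape everyday communication",
    "fieldwork interviews document everyday family life",
    "open access monographs reach readers in many languages",
    "platforms and algorithms mediate public and private spheres",
]

# Bag-of-words term counts, the usual input for LDA.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# For each inferred topic, list its most heavily weighted terms
# (get_feature_names_out requires scikit-learn >= 1.0).
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[::-1][:5]]
    print(f"topic {i}: {', '.join(top)}")
```

Applied page by page, term weights of this kind are what allow a tool to plot where in a monograph each subject is concentrated.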
Finally, the last of the main blocks encompasses at once the fifth and, in part, the sixth level of this structure, the pedagogical and dialogic element between authors and users: Why We Post: the Anthropology of Social Media is an online course available on the FutureLearn platform, run by a consortium formed mainly of English universities. Compared with the other major MOOC providers, Coursera and EdX among them, the courses on FutureLearn – and Why We Post is certainly no exception – are more concise and essential: the learning contents consist of short articles and videos, plus optional external materials for further study, distributed, as is by now general practice, across the weeks over which the course runs, with an estimate of the hours of weekly commitment required.[27] This description of the documental and medial spectrum of the project shows a natural tendency towards a certain correspondence, albeit not an explicit one, with the levels described by Darnton as that spectrum expands. Again according to Jenkins, the corporate mode, imposed from above, is opposed by the grassroots one, characterized by bottom-up development. Although not wholly identifiable with this latter paradigm, mainly because of the participation of cultural institutions rather than individual users, the next example shares several of its characteristics. The most prominent scientific event of April 2019, at least in media terms, was the live broadcast of the image of the shadow of the black hole at the centre of the galaxy Messier 87 by the Event Horizon Telescope project.[28]
Thanks also to coverage by the major online newspapers, on 10 April this image was among the most shared on social media platforms, and the official video of the press conference on YouTube records almost one million three hundred thousand views.[29] If this was fairly predictable, as was the almost immediate reuse of the image as a meme, the same could certainly not be said of the post published on her own Facebook profile by Katie Bouman, the young researcher who played a prominent role in developing the algorithm for processing the data received and transforming them into a visual representation. In the post, published to coincide with the event, Bouman shows a photograph of herself, visibly enthusiastic, as she watches the result being obtained.[30]

[27] http://www.futurelearn.com/courses/anthropology-social-media. In the specific case of Why We Post the course runs for five weeks, with three hours of commitment per week. Unlike the other platforms, FutureLearn grants access to courses only in the periods in which they are actually running. As part of its business model, once the scheduled weeks have passed it is possible to keep accessing the contents of a course, otherwise free, only on payment of a fee, which also entitles the learner to an official certificate of participation once the course is completed. There is also a preview of the contents, with clear promotional aims. For Why We Post this preview consists of two articles and a video. The articles are on Twitter – http://www.futurelearn.com/courses/anthropology-social-media/10/steps/371143 – and on users' caution in using social media for topics such as politics in certain contexts – http://www.futurelearn.com/courses/anthropology-social-media/10/steps/371175 – while the video focuses on memes – http://www.futurelearn.com/courses/anthropology-social-media/10/steps/371161. Both articles are about five thousand characters long and are divided into paragraphs, so as to ease reading by exploiting both structure and presentation. The didactic aim is reflected in the language, which uses non-specialist terminology and linear syntactic constructions. The same characteristics are found in the video, to which must be added both the dialogic and narrative dimension brought by the explicit presence of a teacher and a strong use of the visual channel, thanks to images that concretely instantiate the topic under discussion. The FutureLearn course ran twice in 2016, in February and June, three times in 2017, in January, June and October, and only once in 2018, in August, the period in which publications on the related Facebook and Twitter channels ceased. The videos are available in a dedicated playlist on the project's YouTube channel and, although no new edition has so far been officially announced, the course is permanently available on UCLExtend, the Moodle-based eLearning platform of University College London, where it has been translated into other languages, among them Italian, Chinese and Spanish, so as to make it more widely accessible. It should of course be stressed that the technological and conceptual difference between the two platforms, which are heterogeneous information systems at the level of data model, operational logic and user interface, makes using the same contents two de facto different experiences, particularly as regards the mode of learning, more group-based and collaborative in the one case, autonomous and individual in the other.
[28] http://eventhorizontelescope.org/.
[29] National Science Foundation/EHT Press Conference Revealing First Image of Black Hole, http://youtu.be/lnJi0Jy692w.

Figure 3. The post published by Katie Bouman during the reconstruction of the black-hole image.

The virality of this photograph, which in turn became a meme, focused the attention of social media on the young researcher, with the well-known effects of idealization, personalization and polarization of opinions. A further post was needed, in which Katie Bouman made clear that the result obtained had been possible only thanks to teamwork.[31] Nevertheless, on that same 10 April a Wikipedia page about Bouman was created,[32] and her videos on YouTube, both popularizing, such as a 2017 TED Talk, and scientific, including one made expressly by the California Institute of Technology a few days later, totalled a high number of views.[33] In terms of transmedia theory, Bouman's original post played, however involuntarily, the role of rabbit hole, an entry point into the narrative universe. Admittedly, in this specific case the exceptional nature of the event may raise some doubts about how far it can be adapted and generalized, but similar entrances were present in Why We Post too, in the form of articles published in The Economist or reports by the BBC. What must follow the rabbit hole is the construction of a path in which, at every node, users are prompted to continue, so as to satisfy their narrative and informational needs, however induced. It goes without saying that this approach can also be transposed to scholarly communication, ideally including both the corporate and the grassroots dimension. Returning to the comparison between Darnton's approach and Jenkins's, the differences at first sight seem far from few. The former concentrates on the documental and informational aspect, neglecting if not rejecting outright the narrative factor, which lies at the heart of the latter's reflections together with the medial one. In reality, as we have seen, story and information are closely linked, as are the documental and the medial nature of what is published. Darnton speaks of "documentation, possibly of different kinds" (1999) and Jenkins of how "each medium makes its own unique contribution" (2007), and the pyramidal structure can be considered a particular case of the "unified and coordinated entertainment experience" (ibid.).

[30] http://www.facebook.com/photo.php?fbid=10213321822457967.
[31] http://www.facebook.com/photo.php?fbid=10213326021042929. Despite the response it received, this post did not go viral like the previous one and moreover attracted the attention of trolls and haters, marked by an antiscientific and discriminatory attitude towards the young scientist.
[32] http://en.wikipedia.org/wiki/Katie_Bouman.
[33] More than 3 million views for the TED video – http://www.youtube.com/watch?v=BIvezCVcsYs – and about 175,000 for the Caltech one – http://www.youtube.com/watch?v=UGL_OL3OrCE. The total reached by the second video is striking, despite the numerically lower figure, since it was expressly addressed to a highly specialized audience.
The main difference between the two lies in the formalization of relations. For the one it is a point of departure, for the other a point of arrival: for Jenkins the systematic, and certainly not random, dispersion of what is a coherent whole is a fundamental aspect of transmedia storytelling, since this phase is matched by a subsequent process of reconstruction and decoding at the level both of the message and of the medium, with far from negligible aspects of collective intelligence. All this is added value for the entertainment industry, usually viewed negatively as far as formative and educational processes are concerned, processes which are, on the contrary, at the heart of the academic and library world to which Darnton belongs. The ideal user of the layered book either already possesses information literacy skills or is aware of their importance, while the same cannot be said of the user of a work of fiction. This is why Darnton is interested in the product, in having relations already made explicit so as to serve the information needs of the user,[34] while Jenkins is interested in the process, and consequently in the indirect development of those same skills through the act of making relations explicit, so as to satisfy a need that is at once informational and emotional. In this light, and reconnecting with the original definition of scholarship from which we started, in which the distinction between teaching and research was blurred, an interesting development concerns whether and how the principles of transmedia storytelling can be adapted to digital scholarship, along the lines of what has already been done on the educational side with transmedia education (Jenkins 2010). Particularly relevant here are factors such as spreadability and drillability, which answer on the one hand the already mentioned needs for granularity and ease of circulation of contents, and on the other the passage from granularity to the longform. Immersion and extraction can represent the presence of computational components together with the possibility of freely reusing the data, or part of them, in third-party applications. The principle of seriality can be controversial: in a positive sense it points to highlighting the progression of a research path together with the use of different media, such as the recording of a conference presentation or a promotional video,[35] which also brings in the dimension of performance; in a negative sense it becomes the phenomenon known as "salami science", in which the results of a piece of research are fragmented so as to obtain the largest possible number of publications, an extremely pragmatic solution to the Darwinian opposition publish or perish.

[34] For this purpose the semantic web and ontologies are particularly well suited. The SPAR ontologies – http://www.sparontologies.net/ – are centred on the domain of scholarly publishing and can be used to express Darnton's model, in particular the part relating to the description of documents, with the necessary extensions where required.
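Note 34's suggestion can be made concrete. The following is a minimal sketch, assuming rdflib is available, of how two layers of a Darnton-style publication and the explicit relation between them might be expressed with the SPAR vocabularies; all the resource URIs under example.org are invented, and the choice of fabio:Expression, fabio:Dataset and cito:citesAsDataSource is one plausible modelling among several:

```python
# Minimal sketch: describing a layered publication with the SPAR ontologies
# (FaBiO/CiTO) via rdflib. All example.org resource URIs are invented.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, RDF

FABIO = Namespace("http://purl.org/spar/fabio/")
CITO = Namespace("http://purl.org/spar/cito/")
EX = Namespace("http://example.org/pub/")

g = Graph()
g.bind("fabio", FABIO)
g.bind("cito", CITO)
g.bind("dcterms", DCTERMS)

# Top layer: the narrative monograph.
g.add((EX.monograph, RDF.type, FABIO.Expression))
g.add((EX.monograph, DCTERMS.title, Literal("A layered study")))

# Lower layer: a dataset the argument rests on; CiTO makes the relation
# between the two layers explicit and machine-readable.
g.add((EX.dataset, RDF.type, FABIO.Dataset))
g.add((EX.monograph, CITO.citesAsDataSource, EX.dataset))

print(g.serialize(format="turtle"))
```

Relations made explicit in this way are exactly the kind of pre-built path that, in the terms used above, serves the information needs of Darnton's ideal reader.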
Similarly, continuity vs. multiplicity and subjectivity are not simple to apply, mainly because of the internal coherence required of a publication, particularly in the exact sciences; in the first case the already mentioned factors of granularity and non-linearity can offer at least a partial solution, while in the second the annotation mechanism, such as the one implemented by Hypothes.is,[36] allows other users to extend the base text, thereby adding multiple perspectives. Developing this aspect further, and moving beyond the dimension of the single document, a concept map can make it possible to arrange the publications on a given subject spatially, together with any semantic relations of affinity or divergence between them, thereby realizing worldbuilding.

[35] The StoryCurve project, based on processing and visualizing the relationship between récit and histoire in films that do not follow chronological order, such as Pulp Fiction or Memento, presents on its site – http://storycurve.namwkim.org/ – an introductory text, a scientific article (Kim et al. 2018) together with its supplementary materials in PDF, StoryExplorer – http://storyexplorer.namwkim.org/ – the tool developed, together with its source code and the data of the various films, and finally a short video of about three minutes in which the main themes of the project are presented in an effective and engaging way. The presence in the research group of two members from Disney's R&D division very probably played no small role in the making of that video, given their awareness of the importance of the communicative aspect.
[36] http://web.hypothes.is/.

Combinations and conclusions

The numerous declinations of digital scholarship are, case by case, combinations, with variable modes and values, of principles that we have tried to identify here, and which can be classified as relations both of contrast and of interaction: between a document-centric and a data-centric arrangement, between the mono- and the multicodical, the individual and the collective, the informational and the narrative (denotative and connotative), the static and the dynamic (the latter including both the interactive and the computational sense), the synchronic and the diachronic (temporal and spatial), the short form and the long form. In this way it becomes possible to group together practices as heterogeneous as the various digital platforms for publishing articles, monographs, miscellanies and edited collections,[37] chrono- and georeferenced maps,[38] and, finally, video essays. To conclude with a reflection on this last category: at first sight it may seem reductive to use audiovisual language in place of textual language, which demands greater participation on the part of the user, with fundamental consequences at the cognitive level (Wolf 2008).
Yet it should not be forgotten that the suprasegmental and performative dimension of verbal language, and the synaesthetic approach of the double visual/auditory channel, can carry considerable force in effectively conveying even concepts that are not strictly narrative. Moreover, visual grammar has a level of expressiveness comparable to that of text, albeit of a different nature; learning to decode it can consequently be a significant component of that more general activity which is education in complexity (Roncaglia 2018). Naturally, the use of video essays is all the more widespread and mature the more the visual medium is itself the object of analysis.[39] In the exact sciences, by contrast, their role seems to shrink to a secondary, paratextual contribution, as with video abstracts, or to a mimetic approach centred on the reproduction of conferences and lectures, available on the various YouTube channels or on dedicated portals such as VideoLectures.[40] An exception is educational material, where an ever greater care for the communicative factor can be observed, as in the CrashCourse channel,[41] where the development of the contents, which range from computer science to biology and statistics all the way to mythology and the history of theatre, is supported by a creative team responsible for every aspect of each episode, from the script to the virtual sets and the use of music, drawings and animation. It should be stressed, however, that the ever greater speed with which information circulates on the one hand, and the equally growing interdisciplinarity of science on the other, are reducing the time interval between the research process and the educational and popularizing one. Paradoxically, or perhaps precisely significantly, digital scholarship displays the distinctive traits of the etymology of scholarship set out at the beginning of these reflections, in which research and teaching, the humanities and the exact sciences, are not sharply distinguished and separated. For this reason the concluding thought is that, even more than the digital humanities, owing to the greater involvement of the hard sciences, digital scholarship is a privileged and strategic meeting place within the entire landscape of science and knowledge, since it is constituted by content-related, technological and communicative factors that interact continuously and seamlessly.

[37] Literary Studies in the Digital Age: An Evolving Anthology (Price and Siemens 2013) – http://dlsanthology.mla.hcommons.org/ – is an anthology on the digital humanities whose chapters are published progressively and can be commented on by users. The digital edition of Debates in the Digital Humanities (Gold 2012) – http://dhdebates.gc.cuny.edu/ – also provides for user comments, but adds a computational aspect as regards structure: it shows graphically how heavily each passage has been highlighted by readers, making it possible to identify at a glance the points judged most salient, a possibility which, however, must not come at the expense of reading, understanding and assessing the argument as a whole; furthermore, the passages, annotations and comments are available in JSON format for further processing.
[38] See in particular the Spatial History Project of Stanford University – http://web.stanford.edu/group/spatialhistory/ – centred on the spatiotemporal visualization of historical and cultural data so as to better understand how phenomena evolve, and Neatline, a tool created by the Scholars' Lab – http://neatline.org/ – for building narratives based on maps and timelines.
[39] As in the YouTube channel Every Frame a Painting – http://www.youtube.com/user/everyframeapainting/ – devoted to cinematic language. Another significant example is the channel The Art of Story – http://www.youtube.com/channel/UCnGfFb0Ouo0i92MFx7mqZLg/ – which offers a video course on storytelling, created from a series of face-to-face lectures and later adapted into a monograph (Skelter 2018). The incisiveness of this course, compared with many others on similar subjects, lies precisely in its use of the medium, at both the synchronic and the diachronic level. In the first case, images and text effectively connote and inflect the contents of the audio track: in discussing the two different types of writer, those who trust their instinct and those who rigorously follow an outline, the labels used are, respectively, the berserker, the Norse warriors famous for their uncontrollable fury, and the assassin, characterized by calm and rigour, while short clips from Kurosawa's films depicting the two styles are shown. In the second case, every time a topic is illustrated, for example the distinction in a dialogue between text, subtext and context, it is immediately followed by a film scene centred on that theme, with any didactic additions made through captions or voice-over. The same care in balancing explanation and examples, a more explanatory phase and an illustrative, in-depth one, is found in the overall structure of each video.
[40] http://videolectures.net.
[41] http://www.youtube.com/user/crashcourse/.

References

Acland, Charles R. and Eric Hoyt (eds.). The Arclight Guidebook to Media History and the Digital Humanities. Falmer, UK: REFRAME Books, 2016. http://projectarclight.org/book.

Alexander, Bryan. The New Digital Storytelling: Creating Narratives with New Media. Santa Barbara, CA: Praeger, 2017.

Barthes, Roland. "Éléments de sémiologie". Communications 4 (1964): 91-135. http://www.persee.fr/doc/comm_0588-8018_1964_num_4_1_1029.

Bath, John, Alyssa Arbuckle, Constance Crompton, Alex Christie and Ray Siemens, with the INKE Research Group. "Futures of the Book". In The Routledge Companion to Media Studies and Digital Humanities, edited by Jentery Sayers, 336-344. Abingdon-on-Thames: Routledge, 2018.

Brett, Megan R. "Topic Modeling: A Basic Introduction". Journal of Digital Humanities 2.1 (2012). http://journalofdigitalhumanities.org/2-1/topic-modeling-a-basic-introduction-by-megan-r-brett.

Costa, Elisabetta, Daniel Miller, Laura Haapio-Kirk, Nell Haynes, Tom McDonald, Jolynna Sinanan, Razvan Nicolescu, Juliano Spyer, Shriram Venkatraman and Xinyuan Wang. "Why We Post: Taking anthropology to the world". Anthropology News 57.9 (2016): e44-e47. doi:10.1111/AN.122.

Coombs, James H., Allen H. Renear and Steven J. DeRose. "Markup Systems and the Future of Scholarly Text Processing". Communications of the ACM 30.11 (1987): 933-947. doi:10.1145/32206.32209.

Darnton, Robert. "The New Age of the Book". New York Review of Books, 18 March 1999. http://www.nybooks.com/articles/1999/03/18/the-new-age-of-the-book/.
Fischer, Franz. "The pluralistic approach. The first scholarly edition of William of Auxerre's treatise on liturgy". Computerphilologie 10 (2010): 151-168. http://computerphilologie.tu-darmstadt.de/jg08/fischer.html.

Friedberg, Anne. "On Digital Scholarship". Cinema Journal 48.2 (2009): 150-154. http://www.jstor.org/stable/20484457.

Gigliozzi, Giuseppe (ed.). Studi di codifica e trattamento automatico di testi. Roma: Bulzoni, 1987.

Gold, Matthew K. (ed.). Debates in the Digital Humanities. Minneapolis, MN: University of Minnesota Press, 2012.

Guerrini, Mauro. Gli archivi istituzionali. Milano: Editrice Bibliografica, 2010.

Hayles, N. Katherine. "How We Think: Transforming Power and Digital Technologies". In Understanding Digital Humanities, edited by David M. Berry, 42-66. London: Palgrave Macmillan, 2012. doi:10.1057/9780230371934_3.

Hockey, Susan. "The History of Humanities Computing". In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens and John Unsworth, 1-19. Oxford: Blackwell, 2004. doi:10.1002/9780470999875.ch1.

Holmes, Martin and Laurent Romary. "Encoding Models for Scholarly Literature: Does the TEI Have a Word to Say?". In E-Publishing and Digital Libraries: Legal and Organizational Issues, edited by Ioannis Iglezakis, Tatiana-Eleni Synodinou and Sarantos Kapidakis, 88-110. Hershey, PA: IGI Global, 2011.

Hoyt, Eric. "Curating, Coding, Writing: Expanded Forms of Scholarly Production". In The Arclight Guidebook to Media History and the Digital Humanities, edited by Charles R. Acland and Eric Hoyt, 347-373. Falmer, UK: REFRAME Books, 2016. http://projectarclight.org/book.

Humphreys, Alex, Christina Spencer, Ronald Snyder, Laura Brown and Matthew Loy. "Reimagining the Digital Monograph. Design Thinking to Build New Tools for Researchers". Journal of Electronic Publishing 21.1 (2018). doi:10.3998/3336451.0021.102.

Jenkins, Henry. Convergence Culture: Where Old and New Media Collide. New York: NYU Press, 2006.

Jenkins, Henry. "Transmedia Storytelling 101". Confessions of an Aca-Fan (2007). http://henryjenkins.org/blog/2007/03/transmedia_storytelling_101.html.

Jenkins, Henry. "Transmedia Education: the 7 Principles Revisited". Confessions of an Aca-Fan (2010). http://henryjenkins.org/blog/2010/06/transmedia_education_the_7_pri.html.

Kim, Nam Wook, Benjamin Bach, Hyejin Im, Sasha Schriber, Markus Gross and Hanspeter Pfister. "Visualizing Nonlinear Narratives with Story Curves". IEEE Transactions on Visualization and Computer Graphics 24.1 (2018): 595-604. doi:10.1109/TVCG.2017.2744118.

Lolli, Gabriele. Matematica come narrazione. Bologna: Il Mulino, 2018.

McCloud, Scott. Understanding Comics. Northampton, MA: Kitchen Sink Press, 1993.

McKee, Robert. Story: Substance, Structure, Style, and the Principles of Screenwriting. New York: Regan Books, 1997.

McKenzie, Donald F. Bibliography and the Sociology of Texts. Cambridge: Cambridge University Press, 1985.

McPherson, Tara. "Introduction: Media Studies and the Digital Humanities". Cinema Journal 48.2 (2009): 119-123. http://www.jstor.org/stable/20484452.

Meschini, Federico. Reti, memoria e narrazione. Archivi e biblioteche digitali tra ricostruzione e racconto. Viterbo: Sette Città, 2018.

Miller, Daniel, Elisabetta Costa, Nell Haynes, Tom McDonald, Razvan Nicolescu, Jolynna Sinanan, Juliano Spyer, Shriram Venkatraman and Xinyuan Wang. How the World Changed Social Media. London: UCL Press, 2016. http://www.uclpress.co.uk/products/83038.
Mittell, Jason. Complex TV: The Poetics of Contemporary Television Storytelling. New York: NYU Press, 2015.

Pierazzo, Elena. Digital Scholarly Editing: Theories, Models and Methods. Abingdon-on-Thames: Routledge, 2015.

Posner, Miriam. "How Is a Digital Project Like a Film?". In The Arclight Guidebook to Media History and the Digital Humanities, edited by Charles R. Acland and Eric Hoyt, 184-194. Falmer, UK: REFRAME Books, 2016. http://projectarclight.org/book.

Price, Kenneth M. and Ray Siemens (eds.). Literary Studies in the Digital Age: An Evolving Anthology. New York: Modern Language Association, 2013. doi:10.1632/lsda.2013.

Quinto, Riccardo. Scholastica. Storia di un concetto. Padova: Il Poligrafo, 2001.

Reagan, Andrew J., Lewis Mitchell, Dilan Kiley, Christopher M. Danforth and Peter Sheridan Dodds. "The emotional arcs of stories are dominated by six basic shapes". EPJ Data Science 5.31 (2016). doi:10.1140/epjds/s13688-016-0093-1.

Riva, Massimo. "An Emerging Scholarly Form: The Digital Monograph". DigitCult - Scientific Journal on Digital Cultures 2.3 (2017): 63-74. doi:10.4399/97888255099087.

Roncaglia, Gino. "Experimenting with New Forms of Academic Writing". DigitCult - Scientific Journal on Digital Cultures 3.3 (2018): 1-4. doi:10.4399/97888255208971.

Roncaglia, Gino. L'età della frammentazione. Roma-Bari: Laterza, 2018.

Rumsey, Abby Smith. "Scholarly Communication Institute 8: Emerging Genres in Scholarly Communication". Charlottesville, VA: University of Virginia Library (2010). http://uvasci.org/institutes-2003-2011/SCI-8-Emerging-Genres.pdf.

Rumsey, Abby Smith. "Scholarly Communication Institute 9: New-Model Scholarly Communication: Road Map for Change". Charlottesville, VA: University of Virginia Library (2011). http://uvasci.org/institutes-2003-2011/SCI-9-Road-Map-for-Change.pdf.

Sahle, Patrick. Digitale Editionsformen. Norderstedt: Books on Demand, 2013.

Sayers, Jentery and Craig Dietrich. "After the Document Model for Scholarly Communication: Some Considerations for Authoring with Rich Media". Digital Studies/Le Champ Numérique 3.2 (2013). doi:10.16995/dscn.237.

Sayers, Jentery (ed.). The Routledge Companion to Media Studies and Digital Humanities. Abingdon-on-Thames: Routledge, 2018.

Schlosser, Melanie. "Defining Digital Scholarship". Digital Scholarship @ The Libraries (2012). http://library.osu.edu/blogs/digitalscholarship/2012/12/12/welcome-to-digital-scholarship-the-libraries/.

Schlosser, Melanie. "Closing this Blog". Digital Scholarship @ The Libraries (2016). http://library.osu.edu/blogs/digitalscholarship/2016/12/06/closing-this-blog/.

Schreibman, Susan, Ray Siemens and John Unsworth (eds.). A Companion to Digital Humanities. Oxford: Blackwell, 2004a. http://www.digitalhumanities.org/companion.

Schreibman, Susan, Ray Siemens and John Unsworth. "The Digital Humanities and Humanities Computing: An Introduction". In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens and John Unsworth, xxiii-xxviii. Oxford: Blackwell, 2004b. doi:10.1002/9780470999875.fmatter.

Skelter, Adam. The Lost Art of Story: The Anatomy of Chaos Transcripts. 2018.

Somers, James. "The Scientific Paper Is Obsolete. Here's what's next". The Atlantic, 5 April 2018. www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/.
Stella, Francesco. Testi letterari e analisi digitale. Roma: Carocci, 2018.

Tomasin, Lorenzo. L'impronta digitale. Cultura umanistica e tecnologia. Roma: Carocci, 2017.

Vanhoutte, Edward. "The Gates of Hell. History and Definition of Digital | Humanities | Computing". In Defining Digital Humanities, edited by Melissa Terras, Julianne Nyhan and Edward Vanhoutte, 119-156. Farnham: Ashgate Publishing, 2013.

Venerandi, Fabrizio. "Notes for a 'Digital Native Writing'". DigitCult - Scientific Journal on Digital Cultures 3.3 (2018): 5-6. doi:10.4399/97888255208972.

Witt, Jeffrey C. "Digital Scholarly Editions and API Consuming Applications". In Digital Scholarly Editions as Interfaces, edited by Roman Bleier, Martina Bürgermeister, Helmut W. Klug, Frederike Neuber and Gerlinde Schneider, 219-247. Norderstedt: Books on Demand, 2018. http://kups.ub.uni-koeln.de/9118/.

Wolf, Maryanne. Proust and the Squid: The Story and Science of the Reading Brain. Thriplow, UK: Icon Books, 2008.

Yorke, John. Into the Woods: A Five-Act Journey into Story. London: Penguin Books, 2014.

work_54dwrq365vesphbu5mqkx56u4m ----
The academic book and its digital dilemmas

Paul Spence
paul.spence@kcl.ac.uk
King's College London, UK

Author pre-print version, after peer review, 2018. Published in Convergence: The International Journal of Research into New Media Technologies as: Spence, P. (2018). The academic book and its digital dilemmas. Convergence, 24(5), 458-476. doi:10.1177/1354856518772029. Copyright © 2018. Reprinted by permission of SAGE Publications.

Abstract

The future of the academic book has been under debate for many years now, with academic institutional dynamics boosting output, while actual demand has moved in the opposite direction, leading to a reduced market which has felt like it is in crisis for some time. While journals have experienced widespread migration to digital, scholarly monographs in print form have been resilient and digital alternatives have faced significant problems of acceptance, particularly in the arts and humanities. Focusing in particular on the arts and humanities, this article asks how, and under what conditions, the digitally mediated long-form academic publication might hold a viable future. It examines digital disruption and innovation within humanities publishing, contrasts different models, and outlines some of the key challenges facing scholarly publishing in the humanities. This article examines how non-traditional entities, such as digital humanities research projects, have performed digital publishing roles and reviews possible implications for scholarly book publishing's relationship to the wider research process. It concludes by looking at how digital or hybrid long-form publications might become more firmly established within the scholarly publishing landscape.

Introduction

In his article "Scholarship: beyond the paper" in Nature a few years ago, Jason Priem argued that "we are witnessing the transition to … another scholarly communication system – one that will harness the technology of the Web to vastly improve dissemination" (2013: 437). While such arguments are not new, and impassioned claims about the transformative powers of digital technology in publishing have often proven to be premature or unrealistic, it seems clear that our relationship to scholarly publication is susceptible to change at every level of its existence, from conception to final reception, and beyond, as a result of digital mediation. Whereas academic journals have experienced many changes already, predictions of the imminent demise of print in academic publishing have proven to be misplaced, particularly in the Arts & Humanities (and to some extent in the social sciences), where the print monograph continues to hold significant cultural and symbolic value.
Discussions about the future of the academic book face a series of contradictory dynamics: the enduring cultural value of the book for some scholarly sectors, which however currently rests on an economic model that seems untenable; the preference for print for some kinds of reading versus the enormous potential in digital discovery and annotation; and the concerns of many publishers, keen to engage with digital agendas and yet anxious to avoid the pitfalls experienced by the music industry. In any case, there seems to be little doubt that further (and substantial) change is coming. In her exploration of the impact of digital on the academic market, Frania Hall calls the monograph "the scholarly publisher's next challenge" (2013: 76). The enduring importance of deep, reflective reading currently better suited to reading in print form and fears about the effect of digital migration have deferred major transformations, but sooner or later the scholarly monograph is likely to undergo a much closer engagement with (and transformation through) digital social mediation, data-driven dynamics and network effects. Focusing in particular on the arts and humanities (although many of its arguments are applicable to scholarly book publishing in other fields), this article asks how, and under what conditions, the digitally mediated long-form academic publication might hold a viable future. It examines digital disruption and innovation within humanities publishing, contrasts different models, and outlines some of the key challenges facing scholarly publishing in the humanities.

Debating the future of the academic book

Academic publishing was already "at the crossroads" in 2005, notes Thompson, by which time a steady increase in outputs, fuelled by the pressure to publish (to get onto, or move up, the academic ladder), stood in stark contrast to the actual market for academic books (2005: 175). Thompson points to important regional differences, for example between the U.S. markets, dominated by university presses whose mission was often underwritten by their institutions, and UK-based academic publishing, where the larger university presses like OUP and CUP had achieved greater market diversification, had greater global reach, and thus were less financially vulnerable to the immediate effects of a downturn in book sales. Nevertheless, the reality was that the field as a whole was "thinning out" (2005: 165), and everyone now operated in a restricted economic space, where digitally mediated innovation seemed tempting, but had so far been largely elusive. In recent years there have been numerous reports, publications and initiatives examining the current state and future of the academic book. These have been especially visible in, although not limited to, regions of the world where scholarly publishing is highly developed in commercial or infrastructural terms, such as the United Kingdom or North America, and in many countries these debates are part of processes of reflection dating back decades. Special issues in academic journals on publishing have examined this from different perspectives: as part of wider reviews of the scholarly publishing landscape;1 through calls to rethink the University Press;2 with a particular focus on digital publishing for the humanities and social sciences;3 and as calls to 'disrupt' the existing scholarly landscape as a whole.4 A series of initiatives in the United States, many of them funded by the Andrew W.
Mellon foundation, have attempted to address the particular challenges facing University Presses there, from policy and infrastructural perspectives, as described by Anthony Watkinson in his report on 'The Academic Book in North America' for the Academic Book of the Future project (2016). Many of these have produced reports and have left traces in scholarly journals, offering various proposals on how to address what is widely seen as a 'crisis' in scholarly book publishing and covering a wide range of issues including business models, Open Access, infrastructure and the relationship of University Presses to their local library and faculty (Brown et al., 2017; Elliott, 2015). More recently, the UK's Arts & Humanities Research Council, in collaboration with the British Library, invited "collaborative proposals to explore the Academic Book of the Future in the context of open access publishing and the digital revolution".5 The result of this was the two-year 'Academic Book of the Future'6 project, led by Dr Samantha Rayner at University College London (UCL) and colleagues at UCL and King's College London, which initiated a community coalition and a series of activities that formally ended in September 2016. Of particular note is the Academic Book Week, which has evolved into a self-sustaining event beyond the life of the project.7

1 Special issue of Nature exploring transformations in scientific publishing: https://www.nature.com/news/the-future-of-publishing-a-new-page-1.12665
2 Special issue of the Journal of Electronic Publishing, Volume 13 Issue 2, on 'Reimagining the University Press' (Fall 2010) or special issue of Learned Publishing, Volume 29, on 'The University Press Redux'.
3 Special issue of the Journal of Scholarly Publishing, Volume 48 Issue 2, on 'Digital Publishing for the Humanities and Social Sciences'.
4 Special issue of the Journal of Electronic Publishing, Volume 19 Issue 2, on 'Disrupting the Humanities: Towards Posthumanities'.
5 http://www.ahrc.ac.uk/funding/opportunities/archived-opportunities/academicbookofthefuture/
6 https://academicbookfuture.org/
7 https://acbookweek.com/
Digital mediations

In his examination of the state of digital scholarship, and its affordances and limitations, Weller explores how digital technology is transforming scholarly communications as a whole, underlining some dynamics of digital culture which profoundly influence the future of the academic book in digital form (2011). The combined effect of the transition from information scarcity to information abundance, debates about copyright and networked interactions, or user-generated, mobile and mutable content – to name just a few factors – has fundamentally altered many areas of human life in the last twenty years or so, and these provide a context with which discussions of academic book publishing have still not fully engaged, in particular in those areas (such as the humanities) where wider engagement with digital practices is still undergoing negotiation. For some, the globally networked, digital and open cultures which have emerged as a result of the World Wide Web seem to point to a target of sorts for scholarly publishing, whereby geographic, institutional and social divides can be resolved through digital infrastructures which, moreover, enable scholarship to be more fully integrated with wider knowledge structures, thus facilitating wider public engagement: "[D]igital Humanities scholarship … promises to expand the constituency of serious scholarship and engage in a dialogue with the world at large" (Burdick et al., 2012: 26). These digital transformations are both facilitated and complicated by processes of disintermediation, globalization and media convergence (Phillips, 2014: xiii-xiv) and by competing dynamics between popular and commercial interests in the digital space, or between 'open' and proprietorial 'walled garden' approaches to digital infrastructure.

8 http://www.ahrc.ac.uk/newsevents/news/the-academic-book-of-the-future1/

Publishing as a whole has seen many instances of digital innovation, from "interactive digital products experimenting with narrative structures", innovative funding/pricing models, aggregation models or user-generated content, to new entrants in publishing (Hall, 2014). Geolocation, Virtual Reality, Linked Data, data-driven analysis and Artificial Intelligence are just some of the many opportunities for content, but how can these work for the scholarly monograph? While scholarly publishing has arguably experimented 'digitally' more than other sectors like trade publishing, in part due to anxiety over its future, many argue that scholarly monographs are the least amenable to digital transformation, at least with regards to content (Thompson, 2012: 348-50). Some argue for the ongoing primacy of print in scholarly book publishing – which will "draw on digital capabilities" but in a "subordinate", non-"disrupter" role (Esposito, 2017) – while others argue that 'digital' holds the key to understanding the future, and that our thinking on this subject should "rip off the physical covers of the 'book' and move swiftly into the digital realm" (Pinter, 2016: 40). One barrier to engagement is the fact that the stakeholders and participants in scholarly publishing are highly heterogeneous, representing often radically different starting points, which influence the variety of responses to digital transformation.

'Print first' or 'digital first'?
In L’édition électronique, Dacos and Mounier broadly divide visions of digital publishing into two: one strand which understands it as a simple substitution from print to screen, with no fundamental change in the overall concept or apparatus of publishing (a position they suggest was already hard to sustain, even in 2010); and another, which views digital publishing as part of a "new era" of knowledge production, a "revolution" in text comparable to the arrival of the printing press and its effects on humanity. Tellingly, the latter view contemplates "the disappearance of the book as we know it" (Dacos and Mounier, 2010: 3-4, my translation). Applying this division to long-form digital publications we have: those which effectively follow print models to produce what are, basically, digital remediations of the printed book; and those whose processes, functionality, forms and/or formats are fundamentally different, because they are conceived for digital. The division is not watertight, since each "digital book" may draw on traditional or disruptive models to differing degrees, but, as a general principle, it is a useful point of comparison in the current landscape. The first model – long-form publications simulating the print book, with, at best, modest application of digital affordances – dominates the digital output of long-form academic publications at present. Electronic text has existed in publishing since the 1970s, and publishers (and publishing) played a key role in the development of electronic markup standards such as XML, but digital innovations have generally been received with caution, and even where there is a dual print-digital workflow, the conceptual models for publication, design parameters, publishing systems, editing flows, supporting infrastructures and wider expectations of the scholarly community are still largely predicated on the print model by default. The current general consensus around what constitutes an eBook, moreover, is a far more limited, and print-centric, view than that which circulated in its early history (and which pointed to an altogether more ambitious concept of the 'electronic book'). These less ambitious – to use Mrva-Montoya's phrase, 'tradigital' – books (Mrva-Montoya, 2015), in PDF or EPUB format, have been easier to produce because they do not fundamentally undermine existing models, and as a result they represent a limited engagement with digital modes and affordances. In a similar vein, Prescott, in asking if we are "doomed to a world of PDFs?", expresses concern that "the future publishing landscape is a bleak one" and argues that the scholarly environment it is supposed to serve is "less media rich" now than it was a few decades ago (Prescott, 2015). Even the EPUB format, which is (by default) flowable and in theory allows for rich, interactive publications – more like websites than books – is, argues McGuire, constrained by the application of DRM and device/platform-specific restrictions (2012: 115-6) which, in their current implementations, severely limit digitally mediated interactivity across books. We are still far from the modular, highly structured, dynamically interactive, 'crowd collaborative', social and networked views of the academic book which digital culture and technology might allow for. To re-appropriate language used by Craig Mod, the first vision responds to the question "How do we change books to make them digital?", whereas the second asks "How does digital change books?" (Mod, 2012: 95).
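The 'single-source' principle behind digital-first structured markup can be made concrete with a brief sketch. The following Python fragment is illustrative only: the XML element names are loosely TEI-flavoured assumptions of my own, not a claim about any particular publisher's schema. It renders one structured source to HTML; a separate renderer could, in principle, consume the same source for a print (PDF) pipeline.

```python
# Minimal sketch of single-source publishing: one structured XML source
# rendered to HTML. Element names ('chapter', 'head', 'p') are illustrative,
# loosely TEI-flavoured assumptions, not any publisher's actual schema.
import xml.etree.ElementTree as ET

SOURCE = """
<chapter n="1">
  <head>Debating the future of the academic book</head>
  <p>Academic publishing was already at the crossroads in 2005.</p>
  <p>Digitally mediated innovation seemed tempting, but remained elusive.</p>
</chapter>
"""

def to_html(xml_text: str) -> str:
    # Parse the source once; a PDF pipeline could reuse the same parsed tree.
    chapter = ET.fromstring(xml_text)
    parts = [f"<h1>{chapter.findtext('head')}</h1>"]
    parts += [f"<p>{p.text}</p>" for p in chapter.findall("p")]
    return "\n".join(parts)

print(to_html(SOURCE))
```

The point of the sketch is simply that presentation (HTML here, PDF elsewhere) is derived from, rather than baked into, the source – the inversion of the 'print first' default described above.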
The first model presupposes moderate change to the current landscape: the publisher model adapts to 'digital', but otherwise stays broadly the same; the second implies a much more radical transformation in models for scholarly dissemination. At present, academic book publishing has largely stayed with the first model for a number of reasons. The enduring attachment of many scholars to physical books and the preference for reading print is a key factor, although this will probably change as reading technology improves, wider reading habits evolve, and viable alternative models of the 'book' emerge in digital form. While publishers are increasingly starting to look at digital-first systems and workflows to produce both digital and physical books, a paradigmatic shift which challenges the assumption that a 'print-like' object will be developed first (or perhaps even at all) means that changes in author perceptions are likely to take longer. For now, at least, authors and editors "have relatively little experience in enriching their texts to take advantage of the opportunities opened up by digital technologies" (Jubb, 2017: 35), although again this is likely to evolve. Similarly, scholarly monographs, particularly in the humanities and social sciences, are likely to remain broadly 'linear' in the short term, even if complementary non-linear modes are slowly emerging over time. In spite of all these caveats, a digital transformation in academic book production seems inevitable. Bhaskar argues that the arrival of the "digital network means, over the long term, that there can be no such thing as business as usual" for publishing as a whole (Bhaskar, 2013: 76), and looking at the study habits and practices of our students today (as opposed to the habits and practices of those teaching them), it seems highly improbable that, in ten or twenty years, the scholarly media ecology will remain unchanged. How might a digital long-form publication which could truly rival the printed academic book emerge? At present, we are very much at the stage of experimentation. There are many challenges of technical sustainability and preservation, education and training, not to mention effective business models and integration into the wider fabric of scholarly communications. But perhaps the most serious challenge is to explore how the digital long-form publication might become an effective vehicle for scholarly argument and interpretation to rival the print monograph. I now turn to a research field within the humanities which has a track record in research into new models and frameworks for digital publication.

The digital humanities and scholarly publishing

The 'digital humanities' is a transdisciplinary field with a history of experimentation with, and critique of, the interactions between computational tools and methods, digital culture and the humanities (often straying into the social sciences) stretching back over 50 years.
Digital humanists have been involved in numerous publishing-related initiatives, including: the Academic Book of the Future project (where the host departments in the two co-coordinating institutions both have a long-standing history in 'DH'9); many of the Mellon-funded North American initiatives mentioned earlier; various digital publishing tools and frameworks, whether general purpose (Scalar10 and Manifold11), function/technology-specific (TAPAS12) or field-specific (Papyri.info13 and Perseids14); markup frameworks (XML15 and TEI16); and the production of multiple digital editions, resources, databases and other forms which either qualify as, or occupy the same intellectual space as, long-form publications.

9 Disclaimer: I work for one of them
10 http://scalar.usc.edu/scalar/
11 http://manifold.umn.edu/
12 http://tapasproject.org/
13 http://papyri.info/
14 http://sites.tufts.edu/perseids/
15 https://www.w3.org/XML/
16 http://www.tei-c.org/index.xml

In spite of this activity, scholarly book publishing has not featured particularly prominently as a topic (except as a by-product of other scholarly activities, such as editing) in many of the better known digital humanities publications. To take just one example, in the first edition of the landmark Blackwell Companion to Digital Humanities (Schreibman et al., 2004), books and publishing do feature, but generally in relation to some other topic such as electronic markup (Renear, 2004) or electronic scholarly editing (Smith, 2004). On one level this is hardly surprising; the field's proximity to these themes is clear from the copious literature which it has produced on markup and scholarly editing as significant areas of both study and practice. Later volumes, including the substantially revised second edition of the Blackwell Companion (Schreibman et al., 2016), come closer to addressing the current state (and future) of publishing, although they still tend to address the issue within wider discussions about subjects such as scholarly communications or digital scholarship. In spite of this general preference for focussing on wider scholarly frameworks over publishing, and thus on 'digital resources' rather than 'digital publications', researchers in the digital humanities have often addressed issues relating to publishing, and how they fit into wider discussions about the future of the academic book. What follows is a short review of four common themes within the 'DH' view on publishing.

• Modelling and publishing. In their review of 'digital publishing [as] seen from the digital humanities', Blanke, Pierazzo and Stokes locate publishing close to another of DH's historic areas of strength, namely 'modelling'. For them, publishing "needs to be understood as a range of modelling activities that aim to develop and communicate interpretations" – perhaps symbolically, one of their subheadings is "[n]ot publishing but modelling" (2014: 17).
The implied venue for this kind of modelling activity is the non-narrative-based publication of digitised content, most commonly published in scholarly editions or archive-based publications, but the article raises important wider questions about what we consider to be "faithful reproduction" and proposes that we free ourselves from "skeuomorphic representations" of non-digital content in a digital environment, which apply to all kinds of publication (Blanke et al., 2014: 19, 26).

• Process versus product. In a very different vein, in her chapter 'Scholarly Publishing in the Digital Age' Kathleen Fitzpatrick reflects on her experience with Media Commons – which she also used for the preparation of her monograph Planned Obsolescence (Fitzpatrick, 2011) – as an experiment in networked scholarly publishing which aimed to facilitate social editing, community creation, public engagement and peer review. The richer interactions between peers which this editing/publishing model enables place the focus less on the final outcomes of research publishing ("the product") and more on "the process" (Fitzpatrick, 2015: 459-460), which draws attention to publishing as part of a wider research ecosystem.

• Scholarly research infrastructure. Digital humanities research has often been involved in "building" scholarly infrastructure – both for critical interpretation and as a community-building exercise – resulting in publishing functions which are embedded within wider scholarly research systems. This is evident, for example, in Crane et al.'s early call to build "the infrastructure for ePhilology". The digital resource/publication argued for in that case: can be disseminated to anyone, anywhere, at any time; is hypertextual, facilitating connection between scholarly narrative and supporting evidence; can be dynamically remixed for different people/uses; is capable of learning by itself through "documents that learn from each other", using machine-generated information from external datasets; is able to "learn from their human readers" by analysing their digital habits; and is customisable to individual users and their settings (Crane et al., 2008). Many of these attributes may become desirable for scholarly publications of the future, but does this describe a digital resource, or a publication, or potentially both? As publication, in this scenario, increasingly merges into a larger research infrastructure, it becomes more important to establish clear dividing lines between research and publication, a topic I will return to later.

• Re-thinking the Academy. Finally, it is not uncommon to see the digital humanities invoked to support more radical re-alignments of the scholarly landscape – for Cathy Davidson, "DH is … about realigning traditional relationships between disciplines, between authors and readers, between scholars and a general public, and, in other ways, re-envisioning the borders and missions of twenty-first century education" (Davidson, 2015: 134).

That gives some sense of how the digital humanities views publishing; in what ways does it actually perform publishing functions or roles? With a few notable exceptions (Fitzpatrick, 2011), this does not generally involve discussions about publishing mission or sustainability.
Digital humanists are frequently involved in "building" resources, and as such these typically have many of the following attributes: they are experimental; they combine text with other media in dynamic interplay; they involve interdisciplinary, multi-author, inter-institutional collaboration; they are networked; they are closely connected to communities of practice (not just digital humanities, but also, say, epigraphers, or early modernists); they encourage curation, Open Access and sharing; they may be conceived with public engagement in mind.17

17 I do not for a moment intend to suggest here that digital innovation is limited to the digital humanities. There are many new media, digital arts and electronic literature experiences in relation to publishing which deserve a fuller treatment, but which I do not analyse in detail here for reasons of space.

It is clear from all of this that, in many ways, the digital humanities are already deeply involved in some publishing practices, including those which produce long-form publications, but also that their role is poorly defined precisely because of their range, a point I will expand on later. I will now outline the key challenges I believe we need to address in order to connect the different visions around digitally-mediated long-form publishing in the humanities.

Projections of the digitally mediated academic book

What projections exist for digital futures of the book, and what criteria are used to describe them? Kapaniaris et al. present a spectrum based on degrees of interaction, ranging from eBooks in PDF form at one end, to book apps at the other (2013). A report by an Emory working group to the Mellon Foundation on 'The Future of the Monograph in the Digital Era …' presents a print/digital continuum from traditional print-based books to digital only and identifies four models: (a) print monographs, (b) digital long-form publications "with a strong resemblance to print monographs", (c) significantly enhanced long-form publications in digital form and (d) long-form publications which are conceived, and can only realistically operate, digitally (Elliott, 2015). Enhancements, in this definition, might include images, sound, or references to other content and complex navigational structures. Key criteria for dividing categories might be whether or not the work is linear or non-linear, and whether it is 'stable' or 'updateable'. At the more interactive end of the spectrum, it is not always clear how to distinguish between a digitally enhanced eBook and other text-based electronic resources, and even where that distinction is clear, the "complex relationship" which the university press system (and indeed scholarly publishing as a whole) "maintains … to the plethora of electronic research and reference databases that are ever-more essential to supporting scholarship" (Lynch, 2010) is often an obstacle to differentiation between scholarly 'publications' and supporting 'resources'. There is also some overlap here with debate regarding the future of other scholarly forms, such as the journal article, and it may be necessary to take a wider view across the full range of possible scholarly outputs. For example, Breure et al. suggest a similar taxonomy based on a spectrum which distinguishes between: text-driven and image-driven interfaces; linear and non-linear dynamics; and limited multimedia support or visual narratives sustained by full immersion/interactivity connected to research datasets (Breure et al., 2011).
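The classification criteria running through these taxonomies (degree of enhancement, linearity, stability) can be made explicit in a small data model. The sketch below is purely illustrative – the class and field names are my own shorthand for the four Emory/Mellon models, not terminology drawn from Elliott or Breure et al.:

```python
# Illustrative data model for the print/digital continuum described above
# (after Elliott, 2015). Names and structure are my own shorthand.
from dataclasses import dataclass
from enum import Enum

class Model(Enum):
    PRINT_MONOGRAPH = "a"        # traditional print monograph
    PRINT_LIKE_DIGITAL = "b"     # digital, strong resemblance to print
    ENHANCED_DIGITAL = "c"       # significantly enhanced long-form digital
    BORN_DIGITAL = "d"           # conceived for, and only operable in, digital

@dataclass
class LongFormPublication:
    title: str
    model: Model
    linear: bool   # linear vs non-linear narrative
    stable: bool   # fixed text vs updateable

example = LongFormPublication("A hybrid monograph", Model.ENHANCED_DIGITAL,
                              linear=True, stable=False)
print(example.model.name, "linear" if example.linear else "non-linear")
```

Even this toy model shows why the categories blur: a 'linear but updateable' publication sits awkwardly between models (b) and (c), exactly the differentiation problem Lynch identifies.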
This may be equally relevant to books and journals, and everything in-between. One key outcome of the Andrew W. Mellon Foundation's strategic investment in long-form scholarly publishing, which began in 2013, has been the development of a set of features to describe the "monograph of the future" (understood to be digital and open access) which are ambitious in scope and which very much favour an 'enhanced' view of the academic book. In this formulation, the academic book should be: "fully interactive and searchable online" with primary and other sources; portable across reader applications; able to support usage metrics which protect user privacy; updated, managed and preserved digitally; economically sustainable and amenable to device-neutral user annotations, while meeting scholarly standards of rigour, able to function within existing systems of professional recognition and marketable as an object belonging directly to its reader (Waters, 2016). This is an ambitious 'wish-list', implemented in part across a number of its funded research projects, and still in need of further testing and debate, but it provides important material for thought on how to develop new publishing models and infrastructure, and whether they are most effectively instantiated at institutional, national, commercial or disciplinary levels. How is the book changing as a 'system' for creating and disseminating knowledge? In order to understand that properly, we need to better understand how digitally mediated academic long-form publications work, or might work, and how they affect knowledge production 'systems'. Writing from a book design perspective, Craig Mod argues that we need to contemplate the book, not as a fixed object, but as a combination of systems: a pre-artefact system (conception, authoring and editing); the system of the artefact itself ('the published book'); and a post-artefact system ("the space in which we engage with the artefact"). Digital culture disrupts all of these systems: the pre-artefact system is no longer limited to interactions between author and editor and may include other forms of co-creation and 'community' editing; the book itself can be manifested in multiple forms, each with a different set of affordances; and the post-artefact system may include "digital marginalia", namely comments, notes and interactions between an (in our case scholarly) community around a piece of writing (Mod, 2012: 90-92) and, in this sense, 'digital' functions as "scaffolding between the pre- and post-artefact systems" (Mod, 2012: 102). Despite the challenges, and while there is significant variation across disciplines and geographies, scholarly communications have been, and continue to be, transformed by digital culture and technology. Thanks to social media effects, public/private and formal/informal boundaries are no longer as clear as they used to be. Research objects increasingly circulate in digital form or through digital channels and "[i]n the Web era, scholarship leaves footprints" (Priem, 2013: 438). Our expectations about how we gather information (speed, access, broader interpretations of what constitute 'valid' sources) and then process/disseminate it (the sharing economy, collective intelligence and online publication modes) have been dramatically changed by digital culture.
The pervasive influence of social media on dissemination in today's society, where the smartphone often constitutes the primary mode of access to information (and, for companies, a crucial means of accessing information on user/reader behaviour), is another element altering the knowledge landscape, creating new structures and signifiers of symbolic value. These factors have so far still not had a major impact on scholarly outputs, but it is very unlikely these outputs will remain unaffected in future. Research ecologies in some disciplines, for example in the arts and humanities, still depend very much on 'print' era models, but this is increasingly being contested (Kelly, 2012), even if the path of progression is by no means clear yet. Given all of this, we might expect more mutual overlap in debates about the future of 'research' and 'publishing' respectively: many of the discussions around research ecosystems and infrastructure seem to treat publishing as an afterthought, or merely as a 'digital button' to press to produce output, while much of the debate around the future of publishing takes little account of evolving scholarly communication cycles and research ecosystems. We need to better understand the 'digital book' (or its alternatives) as intellectual systems, but also how they fit into wider knowledge and research systems, including those which operate beyond the Academy.

Long-form publications, networked scholarship and new knowledge objects

Digital publications have often raised interesting questions, but they do not, as yet, constitute coherent and readily identifiable modes of scholarly expression and, as such, their location in existing scholarly communication circuits remains under-articulated. One early attempt to articulate a 'digital' future for scholarly content was Darnton's pyramid, which envisaged knowledge being represented in different layers, including (top to bottom): (1) a concise view of a topic; (2) supporting argument arranged in chunked and non-sequential form; (3) documentation and its accompanying analysis; (4) theoretical discussion; (5) pedagogical materials; and (6) interactions between authors and readers (Darnton, 1999). Early visions of this type were sometimes criticised as being utopian or techno-deterministic in character. Nevertheless, increasing evidence of a 'networked research cycle' (Weller, 2011: 56) in some areas of academia suggests changes in the research process that will start to effect greater changes in how publications are conceived and produced. This implies, as I have noted, a change in focus from 'product' to 'process', but this greater connection between research and publication ecosystems points towards two effects. On the one hand, it theoretically makes it possible to produce publications faster, and with a greater connection between analysis and evidence (data; models; visualisations); on the other, in some cases it makes it harder to see the distinction between ongoing research and stable research outputs. Brown et al. believe that publishing will look "very different" in the future, and now that the online mediation of journals is well established, they "believe the next stage will be the creation of new formats … ultimately allowing scholars to work in deeply integrated electronic research and publishing environments that will enable real-time dissemination, collaboration, dynamically-updated content, and usage of new media." (Brown et al., 2007: 4).
But these new formats are unlikely to evolve merely on the grounds of technological possibility and affordance; if they do develop in any significant way, they will likely grow from scholarly need, grounded in changes in the way that we produce knowledge. One thing which stands out from many of the reports produced about the future of the book is that, while there is abundant literature on practical aspects (such as Open Access or business models), and a good understanding of how academic structures (validation/promotion systems or research evaluation programmes) drive expectations about format, there are relatively few studies regarding how digital publication actually facilitates or encourages new forms of knowledge production. In his 'Theses on the Epistemology of the Digital', Alan Liu explores how 'the digital' affects our understanding of what knowledge consists of, and how it potentially transforms its systems of production and dissemination. It introduces new knowledge objects (such as 'algorithm', 'multimedia' and 'data') and challenges the preference for "acts of rhetoric and narrative" in some (often humanities-based) disciplines (Liu, 2014). It also increasingly encourages us to question whether a monograph, or even a book in the more general sense, is always the best way to communicate a given argument. By this logic, if we stop looking at digital books as, necessarily, simple digital mediations of a print original and take full advantage of the communicative capacity of the digital medium, we are better placed to find critical arguments which can only be made digitally and which make better use of the digital space as a site of creativity, co-creation and generative knowledge. How well are we currently placed to commit to such challenges? Where I work, in the humanities, there are different opinions regarding the level of engagement of researchers with the theoretical or practical aspects of digital culture and technology. Whereas some argue that today's humanities researchers are "well versed in modern digital practices" (Deegan, 2017: 32), others argue that, by their inability to engage with digital innovation nearly as fluidly as they typically engage with print monographs, "the Arts and Humanities are not embracing the culture of transformation that these fields pretend to embody" (O'Sullivan, 2017: 8). Smiljana Antonijević's wide-reaching ethnographic study of scholars across institutions in the US and Europe seemed to indicate that there remain both anxieties and practical barriers to full engagement of the humanities with the affordances of 'the digital', although generational differences exist (Antonijević, 2015: 44-49). Beyond the digital humanities, we can observe little evidence of humanities researcher involvement, or interest, in the design of the research and publication tools which they adopt, with the very real danger that "humanities scholars will develop the same consumer relationship to digital content that they have had to print" (Prescott, 2012: 6-7). This is part of a wider problem, in the humanities, linked to the fact that digital resources carry less prestige, which sets up a certain circular dynamic where digital resources are used to support research, but are then under-cited because of the preference for print (Hitchcock, 2013). Finally, it also takes us back to challenges which derive from the growing density of the media landscape and difficulties in delimiting new forms of publication within a broader, digitally mediated research ecosystem.
As we have seen, digital publishing blurs boundaries, and (at least potentially) replaces a finite set of publication types with a seemingly fluid spectrum populated with multiple 'publication points'. Distinctions between ongoing research and stable outputs, or between 'digital resource' and 'digital publication', are not always clear in this scenario, and some digital practitioners have been reluctant to sacrifice the flexibility in definition which the digital medium provides, but in many ways they would be better served by making clearer formal distinctions. The acts of maintaining dynamic digital resources and providing snapshots for evaluation/accreditation are not mutually exclusive, as those of us who have submitted digital outputs to the UK's Research Excellence Framework can attest. There is a wider set of questions around digital resources, and their 'equivalence' to the academic book, which is beyond the scope of this article, but issues such as preservation, stability of record and how to integrate knowledge objects such as evidentiary datasets or dynamic visualizations within digital long-form publications (either embedded or as external 'appendices') will be a key part of that discussion.

Rearticulating publishing forms

Definitions and categorisations of academic books are often illustrative of the competing claims and pressures on them. There are no universal definitions for the academic book, but Deegan's description of the book as a "long-form publication, a monograph, the result of in-depth academic research … making an original contribution" is a good starting point, and traditional distinctions with the shorter journal article (which is often more limited in scope) still stand, although, as she points out, they are "becoming increasingly blurred" (2017), and the emerging mini-monograph format (Palgrave Pivot and Stanford Briefs) adds to the erosion of the boundaries between forms. Her inclusion of an approximate word length for the monograph (80-100,000 words) is, of course, a print legacy, and we might question whether parameters of length (or indeed structure, format and use of non-textual media) will always be so significant, but for now, no other models constitute scalable alternatives in the scholarly mainstream. In part, this is a reflection of cultural status: monographs "are deeply woven into the way that academics think of themselves as scholars" (Deegan, 2017: 14), but this assumption, and the print model which accompanies it, is increasingly disputed – Pinter, for example, argues that in future the book will be defined more by its function than any other feature and that we will move beyond the "sunken investments in existing scaffolding" to engage with evolving new media ecologies (Pinter, 2016: 40). Many terms exist to describe digitally mediated forms of the long-form publication, including 'enhanced eBook', 'enhanced monograph', 'networked book' or 'book apps'.18 Digital terms are also notoriously fluid: originally the term 'eBook' covered more ambitious visions of the book in electronic form, but it has been largely appropriated, as a result of commercial usage, to represent remediated print content in EPUB or PDF formats with relatively limited functionality. There is also an important point to make about the formulation of terms.
Print-based terms at least loosely describe, or stand in as signifiers for, their scholarly purpose – the monograph, a single-authored piece of research; the edited collection, bringing together different writing about a given theme; or the scholarly edition, providing a critical interpretation of a given work – whereas terms used for new digital long-form publication types merely imply something about the format or functionality – it is 'enhanced' or 'networked' (we are rarely told to what purpose) – or, in the case of 'book app', they offer information about its delivery platform. What is more, at its core the language used for these 'new' forms is resolutely tied to print – the terms used simultaneously seek to appropriate the cultural baggage of the print book and to liberate themselves from it at the same time – which helps to explain the conceptual challenges in making them viable alternatives to the printed book in the short term. Digital forces us to think about distinctions in form, content, platform or device which are either not relevant or not negotiable for the printed book, and it is unlikely that we will see stable terms emerge in the short term to describe these new instances of the 'book' (or its partial replacement). Nevertheless, until stable terms for new scholarly publishing concepts arise, it may remain harder for them to gain traction beyond the margins, and so this requires attention. As we have already seen, a vast array of terminology for digital outputs exists, and these have been fuelled in part by the nature of digital affordances themselves (which may influence new 'fashions' in digital research), but also in large part by the pressure to present new forms as being 'innovative'. I would also contend that the terms used so far for long-form digital publications and/or other research outcomes have generally had more to do with cultural and political context than any substantive element related to functionality or cultural representation. The cultural baggage of common words such as 'archive', 'edition' or 'database' varies according to sector and locale.19 Some have argued for the symbolic force of the 'database' (Manovich, 2002), while the concept of 'archive' has considerable currency in many areas of the humanities, although their relation to publication seems unclear.

18 See also (Drucker, 2008) for earlier terms such as "expanded book", the "hyper-book" or "the book emulator".
19 Ken Price is unusual in giving serious attention to "the genres we are now working in" as he explores various terms in relation to his experience on the Whitman project (Price, 2009).

In their projection of possible new cultural forms which might be generated by the digital humanities, Burdick et al. suggest new terms such as 'augmented editions', 'animated archive' or 'database documentaries' (2012: 35, 47, 54); these have the virtue that they provide meaning to otherwise overused and ambiguous terms, but the question is whether or not these, or the many other terms currently in circulation, will have the coherence and consensus to be adopted more broadly. To some extent, stable terms will emerge organically over time, and it would be counter-productive to overly force the issue, but greater discussion among the various constituencies of scholarly publishing would surely be beneficial for all.
A crucial aspect of this conversation will be to find greater alignment between the terminology used at different stages of the scholarly communications cycle, in particular around validation and promotion processes. So, whereas 'enhanced monograph' seems to be used by various academics and people involved in discussions about the future of publishing, it does not appear, for example, anywhere in the extensive list of admissible output collection formats used in the last UK Research Excellence Framework exercise (REF 2014),20 where we see, under the list of admissible 'digital artefacts', the terms 'software', 'website content', 'digital or visual media' and 'research datasets and databases'. Moreover, a clear boundary still does not really exist between, on the one hand, innovative/experimental forms and, on the other, stable forms worthy of inclusion as outputs equivalent to the journal article or monograph. While the experimental, 'laboratory' function of much work typically carried out in the digital humanities will continue to be important in pushing the boundaries of scholarly communications (and a fundamental part of the research agenda of that field), we also need to establish clearer genres, descriptors and/or labels around digital publications across the spectrum (from 'short form' to 'long form') so that they can be evaluated fairly. In 'Imagining a University Press System to Support Scholarship in the Digital Age' Lynch argues for greater standardization and for 'templates' (2010), which would fix particular genres, facilitating scholarly validation, circulation and credit systems. Thomas III actually goes on to tentatively propose terminology we might use to this purpose: Interactive Scholarly Works (ISWs), which by his definition are more "tightly defined" digital outputs combining archives, tools and argument; digital projects or Thematic Research Collections (TRCs),21 which cover more "capacious" outputs drawing together heterogeneous tools, models and datasets in open-ended, multi-author research collaborations; and digital narratives, which are born-digital works of highly structured and interpretative scholarly narrative (Thomas III, 2016: 531-2). While we might argue about the precise division or nomenclature, the need for clearer categorisation of digital works – for formal publishing and evaluation purposes – and a more consistent terminology seems clear. This is, moreover, a conversation which needs to include a wide range of actors, and to be multi-disciplinary and global in outlook. It is also to be hoped that discussions around terms which affect both academic standing and career advancement will become less national and more global over time. While these differences in terminology exist, digital alternatives to the book will continue to be undermined by difficulties in formal academic validation.

20 http://www.ref.ac.uk/about/guidance/submittingresearchoutputs/
21 After Carole Palmer's proposed use of the term (Palmer, 2004).
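One way to see the alignment gap described above is to lay the two vocabularies side by side. The mapping below is entirely hypothetical – my own illustration, not a crosswalk defined by Thomas III or by REF 2014 – pairing his proposed genres with the nearest admissible 'digital artefact' labels quoted earlier:

```python
# Hypothetical crosswalk between Thomas III's proposed genres and the nearest
# REF 2014 'digital artefact' labels quoted in the text. Illustration only;
# no official mapping of this kind exists.
GENRE_TO_REF_LABEL = {
    "Interactive Scholarly Work (ISW)": "website content",
    "Thematic Research Collection (TRC)": "research datasets and databases",
    "digital narrative": "digital or visual media",
}

for genre, label in GENRE_TO_REF_LABEL.items():
    print(f"{genre:38s} -> {label}")
```

The very awkwardness of the pairings (an ISW is plainly more than 'website content') makes the case for clearer, evaluation-ready genre labels.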
Making 'print' and 'digital' work together

Part of the answer may lie in gaining a better understanding of how print and digital work together. How does scholarship function differently in the digital environment – what is lost, what is gained, and how does this influence choices about digital and print channels? We are only just starting to understand the answer to these questions, but we need to identify which aspects of scholarly communication are better served by digital or print, and how they might fit together better in future. The recent recovery of print versus eBook sales in trade publishing22 suggests a broader 'cooling' of public attitudes towards 'digital' reading after a period of high expectations (and sometimes hyperbole) for digital formats, and in scholarly publishing, numerous sources seem to confirm that print publications hold enduring significance for academic researchers (Wolff-Eisenberg et al., 2016), especially in areas like the humanities and social sciences where narrative-based argument is at the core (Deegan, 2017). Academic books are a key feature of the publishing landscape, particularly in the humanities and social sciences, for a number of reasons, which include their cultural symbolism, ability to communicate a coherent and sustained narrative, phenomenological resonances/power, readability, and, finally, underlying academic credit and promotion mechanisms (Deegan, 2017). By contrast, 'digital' mediations of the book have faced significant problems of acceptance for a number of reasons, and so are generally limited to eBook remediations of print monographs, special cases (such as digital scholarly editions) or new media experiments. That said – and while early enthusiasm (and at times proselytism) regarding the potential of digital technology to transform academic book publishing has waned as the practical limitations have become more apparent – the major challenges of sustainability in current models of supply and demand (Jubb, 2017: 5), along with wider questions about how 'the academy' should re-adjust to new modes of knowledge production, mean that it nonetheless seems inevitable that 'digital' will play a significant part in re-thinking its future. Dunleavy, speaking from a social sciences perspective, has argued for a 'new renaissance' of books based on emerging realities such as the digital reading list, which favours chunkable content which can easily be downloaded, annotated or added (by students) and which can be added to at the last minute, on demand (by lecturers). Highlighting the growing awareness that it may not be practical to continue marketing books as single entities, he argues that the book may be better thought of as part of a large, high-quality library which can be navigated, rather in the way that we navigate journal collections (Dunleavy, 2012). In this scenario, print and digital need to work together as part of a seamless experience, allowing users to experience content as they prefer, on paper or on screen. It is to be expected, then, that 'digital' and 'print' may be seen as less oppositional in future. The recent reader survey by the Oxford University Studies in the Enlightenment confirmed what we already know from various sources: that readers "seek portability and immediate accessibility of scholarly resources" and yet do not generally favour 'digital only' access. Rather, they prefer hybrid print-digital access, according to the kind of activity they are carrying out.

22 https://www.theguardian.com/books/2017/mar/14/ebook-sales-continue-to-fall-nielsen-survey-uk-book-sales
We are still far from having stable and sustainable business models for hybrid long-form publications, but from a scholarly perspective the requirement is clearly there.

Conclusions

In earlier times, digital publishing was sometimes presented as making publishing simpler in some way: whether through the immediacy and potential global reach of posting content to the Web or through the promise of 'single-source publishing' which often accompanied the early proposition of XML for editing/publishing. Far from simplifying publishing, digital culture and technology have made it far more complex in many respects, with new content types, more technical formats, competing workflows and hugely divergent business models. There are clearly many advantages to moving content into digital-first workflows, and this may become more common in future even in scholarly book publishing, but the adoption barriers are significant, and the increasing use of mobiles and tablets has only complicated things further (McIlroy, 2015). This is likely to make more adventurous long-form digital publications harder to sustain in business terms, in the short term, and yet from a scholarly perspective, this shift towards a richer range of outputs has already started, and it is something which needs to be understood properly and integrated into the current publishing landscape. As the recent study of arts and humanities outputs submitted to the UK's Research Excellence Framework showed, monographs carry great weight, but there is also greater variation in research outputs, with the suggestion that scholars (in the arts and humanities) are more likely to see digital media as "central to their research output and scholarly experience" (Tanner, 2016: 12), even beyond more obviously receptive fields such as art and design, the performing arts, communication studies, new media studies or library and information management. We are also at a stage of intense contradiction in terms of geographic scope, where, on the one hand, the effects of a global network facilitate stronger connections between scholars around the world, while, on the other hand, digital media effects exacerbate historic geo-economic and social divides. While some aspects of academic publishing display global characteristics, debates about the future of the academic book are still largely operating along national lines, as the example of debates in the U.S. and the UK demonstrates, tied to local funding landscapes and systems of credit and evaluation. A book published digitally is, in theory, open to wider and more democratic dissemination systems, but in practice its fate is often firmly tied to national systems for academic validation, localised (and often inconsistent) licensing dynamics and unevenly stacked international knowledge flows. As Inefuku has argued, "[t]rue democratization and globalization of knowledge cannot exist without a critical examination of the systems that contribute to the production of scholarship", and initiatives to develop global publishing platforms need to involve Global South perspectives from the start (Inefuku, 2017). Redefining scholarly publishing so that it is genuinely inclusive, collaborative and based on true reciprocity will be an important part of the academic book of the future.
Various pieces of research, including the recent Academic Book of the Future project, have demonstrated the enduring appeal and importance of the long-form narrative-based scholarly monograph, while highlighting the ongoing challenges facing the academic book. In many fields, the academic book has been replaced by databases or side-lined as the currency of the journal article, dominant in the sciences, has grown, and some might argue that the digital mediation of the academic book has reached its limits. I have argued here that, while change may be slow, such a position is untenable in view of changing media expectations and habits. It is crucial, however, to gain greater common understanding of the motivation and dynamics which bind together (and sometimes separate) different actors in the scholarly book communication circuit, and of the way that relationships are changing. There are a number of different stakeholders involved in scholarly publishing – including academics (as authors and consumers), librarians, publishers, digital media companies, digital practitioners and wider publics – and discussion regarding the future of scholarly publishing "has too often failed to transcend the self-interest of individual groups of stakeholders" (Anthony Cond of Liverpool University Press, quoted in Samantha Rayner's preface to Deegan, 2017: 6). There does, nevertheless, appear to be a sense now that roles are changing, with, for example, publishers "shifting their position in the value chain, and redefining themselves as they go, into training and assessment, information systems, networked bibliographic data, and learning services" (Goldsworthy, 2015). Along with this, there is a growing awareness in some quarters that partnerships are going to be crucial in bridging the gaps which exist between different stakeholders. This includes the digital humanities. The digital humanities already plays a semi-informal role as "exploratory laboratory" for publishing, along the lines proposed by Svensson for its role in relation to the humanities more generally (Svensson, 2010), but if this role were more consistently negotiated with (and recognised by) other stakeholders (such as other humanities academics, publishers and libraries) it would benefit all involved. Initiatives such as the recent call for novel publications "blending cutting-edge technology with high quality scholarship" by the King's Digital Lab and Stanford University Press will help to redefine complex narrative argument within a digital or hybrid setting.23 It is perhaps understandable that a field which is constantly in transition – in part due to changes in digital culture and technology, and in part due to its fluid/unstable status within the Academy – should strive to make a wide set of claims influencing everything from policy to innovation, but I would like to argue here that both the digital humanities and publishing sectors would mutually benefit from greater analysis and clarity about the field's actual (and potential) contributions to debates about the future of publishing in the humanities. William G. Thomas III points out that the field has produced "innovative and sophisticated hybrid works of scholarship, blending archives, tools, commentaries, data collections and visualizations", but that many of these outputs have faced serious problems in terms of recognition, credit and absorption into the wider scholarly fabric (Thomas III, 2016: 525).
These gaps in understanding about the nature and status of new digital outputs constitute as much a problem for the humanities as a whole (and indeed scholarly publishing) as they do for the digital humanities. But what if these outputs were viewed (and recognised) more fully as part of the process of exploration in the ongoing transformation of scholarly publishing in the humanities? I have proposed here a vision of the academic book in the humanities which is globally inclusive, shaped by actual scholarly needs (rather than by the histories of print or web technologies), re-articulated for current media landscapes, more closely aligned to emerging research ecosystems and with greater integration of the needs of the different stakeholders. It is possible to imagine digital long-form arts and humanities publications developing in a number of different ways in future. Firstly, and although I have not had space to contemplate it properly here, the concept of 'publishing the archive' will increasingly be important, especially around chunked book content. This seems likely to manifest itself in how established publishers find new ways to make digital assets which are currently 'book-bound' available as part of self-managed or aggregated online platforms. Nor have I addressed content managed by galleries, libraries and museums, which naturally connects to many areas in the humanities thematically. Secondly, new 'digital' forms will develop and stabilise which will contain their own network-native systems of knowledge formation, academic certification and filtering. These will take a lot longer to emerge, because they depend on a level of critical digital literacy, and consensus around media effects, in the humanities which it will take time to develop. The third route will involve moving beyond digital simulation of print monographs, or concepts of 'enhanced' monographs, to hybrid publications which aim to take full advantage of the affordances of each medium. This mixed ecology provides many challenges – not least how we apportion different roles and functionality to the 'print' and 'digital' manifestations of a particular 'book' – but also many opportunities in fully integrating complex scholarly argument into a potentially more connective, participatory and visually expressive medium.

23 https://www.kdl.kcl.ac.uk/blog/call-expressions-interest-your-novel-idea-publication/

References

Antonijević, S. (2015) Amongst Digital Humanists: An Ethnographic Study of Digital Knowledge Production. Basingstoke, Hampshire; New York: Palgrave Macmillan.
Bhaskar, M. (2013) The Content Machine: Towards a Theory of Publishing from the Printing Press to the Digital Network. London; New York: Anthem Press.
Blanke, T. et al. (2014) Digital Publishing Seen from the Digital Humanities. Logos. 25 (2), 16–27.
Breure, L. et al. (2011) Rich Internet Publications: 'Show What You Tell'. Journal of Digital Information. 12 (1).
Brown, L. et al. (2017) Reimagining the Digital Monograph: Design Thinking to Build New Tools for Researchers, A JSTOR Labs Report. Available from: https://hcommons.org/deposits/item/hc:14411/ (Accessed 8 August 2017).
Brown, L. et al. (2007) University Publishing In A Digital Age. Available from: http://sr.ithaka.org/research-publications/university-publishing-digital-age (Accessed 31 August 2017).
Burdick, A. et al. (2012) Digital_Humanities. Cambridge, Mass.: MIT Press.
Crane, G. et al. (2008) 'ePhilology: When the Books Talk to Their Readers', in Susan Schreibman & Ray Siemens (eds.) Companion to Digital Literary Studies. Blackwell Companions to Literature and Culture. Oxford: Blackwell Publishing Professional.
Dacos, M. & Mounier, P. (2010) L'édition électronique. Paris: La Découverte.
Darnton, R. (1999) The New Age of the Book. Available from: http://www.nybooks.com/articles/1999/03/18/the-new-age-of-the-book/ (Accessed 31 August 2017).
Davidson, C. (2015) 'Why Yack Needs Hack (and Vice Versa): From Digital Humanities to Digital Literacy', in Patrik Svensson & David Theo Goldberg (eds.) Between Humanities and the Digital. Cambridge, Massachusetts: MIT Press. pp. 131–144.
Deegan, M. (2017) The Academic Book of the Future Project Report: A Report to the AHRC and the British Library, London. Available from: https://academicbookfuture.org/end-of-project-reports-2/ (Accessed 3 July 2017).
Drucker, J. (2008) 'The Virtual Codex from Page Space to E-space', in Susan Schreibman & Ray Siemens (eds.) Companion to Digital Literary Studies. Blackwell Companions to Literature and Culture. Oxford: Blackwell Publishing Professional. pp. 216–232.
Dunleavy, P. (2012) Ebooks herald the second coming of books in university social science. LSE Review of Books. Available from: http://blogs.lse.ac.uk/lsereviewofbooks/2012/05/06/ebooks-herald-the-second-coming-of-books-in-university-social-science/ (Accessed 27 November 2017).
Elliott, M. (2015) The Future of the Monograph in the Digital Era: A Report to the Andrew W. Mellon Foundation. Journal of Electronic Publishing 18 (4).
Esposito, J. (2017) The Multifarious Book. The Scholarly Kitchen. Available from: https://scholarlykitchen.sspnet.org/2017/08/01/the-multifarious-book/ (Accessed 9 August 2017).
Fitzpatrick, K. (2011) Planned Obsolescence: Publishing, Technology, and the Future of the Academy. New York: New York University Press.
Fitzpatrick, K. (2015) 'Scholarly Publishing in the Digital Age', in Patrik Svensson & David Theo Goldberg (eds.) Between Humanities and the Digital. Cambridge, Massachusetts: MIT Press. pp. 457–466.
Goldsworthy, S. (2015) The future of scholarly publishing. Available from: http://blog.oup.com/2015/11/future-scholarly-publishing/ (Accessed 31 August 2017).
Hall, F. (2014) Digital Convergence and Collaborative Cultures. Logos 25 (4), 20–31.
Hall, F. (2013) The Business of Digital Publishing: An Introduction to the Digital Book and Journal Industries. London; New York: Routledge.
Hitchcock, T. (2013) Confronting the Digital: Or How Academic History Writing Lost the Plot. Cultural and Social History 10 (1), 9–23.
Inefuku, H. W. (2017) Globalization, Open Access, and the Democratization of Knowledge. EDUCAUSE Review. Available from: http://er.educause.edu/articles/2017/7/globalization-open-access-and-the-democratization-of-knowledge (Accessed 30 July 2017).
Jubb, M. (2017) Academic Books and their Futures: A Report to the AHRC & the British Library. Available from: https://academicbookfuture.org/end-of-project-reports-2/ (Accessed 31 August 2017).
Kapaniaris, A. et al. (2013) Digital Books Taxonomy: From Text E-Books to Digitally Enriched E-Books in Folklore Education Using the iPad. Mediterranean Journal of Social Sciences. 4 (11), 316.
Kelly, J. (2012) An Ecology for Digital Scholarship. Available from: http://ihrdighist.blogs.sas.ac.uk/2012/12/10/67/ (Accessed 31 August 2017).
Author biography

Paul Spence is a Senior Lecturer in Digital Humanities at King's College London. His research currently focuses on digitally mediated knowledge creation, digital publishing, global perspectives on digital scholarship and the potential interplay between modern languages and digital culture. He was joint creator of the multi-platform publishing framework xMod (since renamed as Kiln, http://kcl-ddh.github.io/kiln/), and now leads the 'Digital Mediations' strand on the Language Acts and World-making project (https://languageacts.org/).

work_56benodsq5auzdmyqtvffdu3i4 ---- Modelling East Asian Calendars in an Open Source Authority Database | Semantic Scholar
DOI: 10.3366/ijhac.2016.0164 · Corpus ID: 52294922

Modelling East Asian Calendars in an Open Source Authority Database
@article{Bingenheimer2016ModellingEA, title={Modelling East Asian Calendars in an Open Source Authority Database}, author={Marcus Bingenheimer and Jen-Jou Hung and Simon Wiles and Boyong Zhang}, journal={Int. J. Humanit. Arts Comput.}, year={2016}, volume={10}, pages={127-144} }
Marcus Bingenheimer, Jen-Jou Hung, Simon Wiles, Boyong Zhang. Published 2016. Computer Science, Geography. Int. J. Humanit. Arts Comput.

This paper discusses issues concerning the creation of conversion tables for East Asian (Chinese, Japanese, Korean) and European calendars and describes the development of an open source calendar database as part of the history of converting East Asian calendars. East Asian calendars encode both astronomical and political cycles. As a result, date conversion must in practice rely on complex look-up tables and cannot be done merely algorithmically. We provide a detailed overview of the history…

work_5dgfdi4j3vba5b6ingyrf57jda ---- The Value of Plurality in 'The Network with a Thousand Entrances' | Semantic Scholar
DOI: 10.3366/ijhac.2017.0190 · Corpus ID: 52293706

The Value of Plurality in 'The Network with a Thousand Entrances'
@article{Siemens2017TheVO, title={The Value of Plurality in 'The Network with a Thousand Entrances'}, author={R. Siemens and A. Arbuckle and Lindsey Seatter and Randa El Khatib and Tracey El Hajj}, journal={Int. J. Humanit. Arts Comput.}, year={2017}, volume={11}, pages={153-173} }
R. Siemens, A. Arbuckle, Lindsey Seatter, Randa El Khatib, Tracey El Hajj. Published 2017. History, Computer Science. Int. J. Humanit. Arts Comput.

This contribution reflects on the value of plurality in the 'network with a thousand entrances' suggested by McCarty (http://goo.gl/H3HAfs), and others, in association with approaching time-honoured annotative and commentary practices of much-engaged texts. The question is how this approach aligns with tensions, today, surrounding the multiplicity of endeavour associated with modeling practices of annotation by practitioners of the digital humanities. Our work, hence, surveys annotative…
work_5dmtdy72qfah3lh44wo2o44uo4 ---- BIROn - Birkbeck Institutional Research Online

Eve, Martin Paul (2017) Review of Composition, Creative Writing Studies and the Digital Humanities by Adam Koehler. The Review of English Studies 68 (287), pp. 1032-1034. ISSN 1471-6968. Downloaded from: http://eprints.bbk.ac.uk/id/eprint/18231/

Review of Koehler, Adam, Composition, Creative Writing Studies and the Digital Humanities (London: Bloomsbury Academic, 2017)

Author's original version, accepted for publication in Review of English Studies, published by Oxford University Press.

What does it mean to "write" in the digital age? As Matthew Kirschenbaum has shown us in recent days, technologies of word processing made the transition from business environments to creative writing with an unforeseen and paradigm-altering swiftness.[1] N. Katherine Hayles has also demonstrated how the process of publishing print books has been a digital-first endeavour for quite some time.[2] For the majority of people who write in the world today, digital technologies are an indispensable part of the process. Yet, how do we conceive of digital writing as different from other forms of production? Is simply using a word processor enough of a mediation to call writing "digital"? Or should we be interested in e-literatures that more fundamentally harness the potentially radical possibilities of the digital space but that involve various new types of labour (coding, design, digital preservation)? We never used to insist that those writing with pencils should have taken part in and understood the constitution of those inscription tools. That said, among other practices, various schools of concrete poetry in the twentieth century – most notably those that gathered in the network around Hansjörg Mayer – broke down these binary barriers between tools and products in what Bronaċ Ferran has called a "typoetical revolution". The affordances of the digital are certainly different. But are radical works in this space still "writing"? If so, what kinds of writing and from what types of spaces? These are the sorts of questions that sit behind Adam Koehler's Composition, Creative Writing Studies, and the Digital Humanities.
Specifically, Koehler is motivated to address matters of shifting disciplinarity in the era of the digitally-mediated writing subject, working between the spaces of composition and creative writing, as the book's title might imply. It is here, indeed, that one encounters the first particularity of Koehler's work: there is a strong North American slant to his angle. Those outside this academic system may be less familiar with what is meant by "composition" in the senses used in Koehler's monograph. In the UK, for example, the only place that you will find a module on "composition studies" is within a music department. Certainly, Koehler's book could have used an additional contextualisation of this field for readers outside the space, although the comprehensive literature review of the role of creativity and imagination in writing instruction goes some way towards this (23-35). Indeed, the first chapter after Koehler's introduction felt, to this reader, like a plunge into the deep end. On the other hand, many more scholars elsewhere will be familiar with the rise of the creative writing programmes that Koehler charts; whether they be through courses in their own departments or in the study of contemporary fiction, as noted by Mark McGurl and others.[3]

If, then, Koehler's approach to composition felt too sudden for me, his discussion of digital creative writing appeared over-rehearsed. Moving through all the seminal big-name figures, from Jackson's "Patchwork Girl", through to Hayles's medial ecologies, up to Egan's Twitter fiction, the charitable way to characterise this would be to say that Koehler's scholarship is thorough. However, to my ears it sounded a little too much like a story that I have heard many times before. If the first two chapters here left me a little adrift, in the third Koehler's book comes into its own, and his work on the recent Kenneth Goldsmith controversy is up-to-the-minute and relevant (80-85). It also demonstrates the fresh ways in which Koehler considers artists to be "digital". For, in this case, the definition of digital is shaped by a type of identity politics that is mediated by the technologies of social media; a post-identity politics, in some ways. This broader framing of the politics of the digital, even when a white poet then reads aloud – in analogue – from a poorly considered aesthetic work appropriating a black man's death, is productive and politically persuasive. It is also an excellent analysis of the ways in which different disciplinary spaces, across creative-critical boundaries, interact, merge, meld, and seep in their practices while still remaining distinct in their politics.

Another highlight of this work, for me, was the patient and steady assault on Jonathan Franzen's continued arguments against digital practices (103-117). Although one may always flinch upon reading of how turning to Heidegger will clarify a problem, this section was well-informed and philosophically astute on the ways in which "technologies" of writing stretch back a long way. Indeed, the ways in which we define "technology" are important, and Koehler cogently frames our strange naturalisation of technologies from bygone eras, as though their re-enchantment will somehow protect us as talismans against the new. That new, in Koehler's framing, is a set of practices – "nonlinearity", "intertextuality", "genre shifting", "appropriation" – that act as markers of a "techno-cultural shift" (136).
In all, though, I have to confess, I do not think that I am the target audience for this work. For the new media ecologies that Koehler describes in Composition, Creative Writing Studies, and the Digital Humanities felt, to me, curiously devoid of digital specificities. Could we not take the above traits, for instance, and situate them amid any number of past literary moments? Romanticism, Modernism, or Postmodernism? We do have a discussion of Twitter fiction, certainly, but what is specifically digital about such a writing practice that was not already somehow encapsulated by Oulipo's constraint-based techniques? Yes, Koehler poses a set of interesting questions about these practices and the rise of composition and creative writing alongside one another; the "fenceless neighbours" to which he turns. But actual engagement with specific underlying digital technologies, their affordances, and consequences, seemed lighter to me. I also wondered why there was not greater discussion in the book of studies on disciplinarity itself. Surely some of the emergent work in the field of critical university studies would at least have merited a mention here?

Perhaps, however, I am just expecting too much from a book that is aiming to cover a lot of ground. Its ambition to synthesize three huge fields into a narrative of co-genesis was always going to be tricky. Composition, Creative Writing Studies, and the Digital Humanities, then, attempts that task, and I feel it doesn't quite get there. It does, though, provide fertile ground for further exploration and points towards a set of self-questioning practices that are and that will remain crucial to the spaces of composition and creative writing. In fostering these questions and holding up a cruel glass, and not because it comes to any definite resolutions, Koehler's book undertakes an important task.

[1] Matthew G. Kirschenbaum, Track Changes: A Literary History of Word Processing (Cambridge, MA: The Belknap Press of Harvard University Press, 2016).
[2] N. Katherine Hayles, How We Think: Digital Media and Contemporary Technogenesis (Chicago: University of Chicago Press, 2012), p. 6.
[3] Mark McGurl, The Program Era: Postwar Fiction and the Rise of Creative Writing (Cambridge, MA: Harvard University Press, 2009).

work_5lgvhggzxnhfvkvyq7hd263ofu ---- MTO 20.1: Shaffer, Open Peer Review

A Proposal for Open Peer Review
Kris P. Shaffer
Music Theory Online, Volume 20, Number 1, February 2014. Copyright © 2014 Society for Music Theory.
KEYWORDS: MTO, open access, open peer review, publishing, curation, blogs
Received January 2014

[1] Throughout its history, Music Theory Online has leveraged new digital technologies to increase access to high-level scholarship and facilitate discussions of published materials inside the journal itself. (See, for example, the commentaries in Volumes 0.2, 1.1, 13.3, and 16.4.) New technologies and publishing practices have developed in the past few years that can carry these practices even further, in great service to the academic community. One such publishing practice is open peer review, exemplified by publications like the Journal of Digital Humanities, Digital Humanities Now, Digital Humanities This, American History Now, and the book Web Writing: Why and How for Liberal Arts Teaching and Learning (Dougherty et al. 2014). While multiple practices exist that can be considered manifestations of open peer review, open peer review in its fullest sense takes the scholarly discussion that traditionally follows publication—such as the discussion threads contained in the early issues of MTO—and moves it pre-publication, rendering it part of the review process.
Open peer review ensures high visibility for the best work, extensive vetting by the scholarly community pre-publication, and a timely publication process, all the while maintaining high standards for peer-reviewed publication.

[2] Music theorists would benefit from having an open peer-review journal, and MTO is best situated to be that journal. MTO is also well situated to experiment with open peer review without committing its entire future to such a model. In this article, I will explain open peer review in more detail by following an example article through the process of review and publication, commenting on the potential benefits of this process along the way. I conclude with a proposal for how MTO might experiment with the open peer-review model in order to gauge its potential for our field more precisely.

Following the Process

[3] One of the first articles to appear in the Journal of Digital Humanities is Trevor Owens's "Defining Data for Humanists: Text, Artifact, Information or Evidence?" This article began as a chapter for the book Writing History in the Digital Age, itself an open peer-review project (Nawrotzki and Dougherty, 2013). Writing History in the Digital Age was undertaken by co-editors Jack Dougherty (Trinity College, Connecticut) and Kristen Nawrotzki (Pädagogische Hochschule Heidelberg), in collaboration with the University of Michigan Press. This project is available from UMP as a print book, a downloadable ebook, and an open-access web book as part of their series of digitalculturebooks that explore novel publication models like open peer review and simultaneous print and open-access publication. For that project, Owens co-authored a chapter with Frederick W. Gibbs called "The Hermeneutics of Data and Historical Writing." The first version of this chapter was posted to the project website in the fall of 2011. On this website, readers can comment on the chapter as a whole, or on specific paragraphs in the chapter. The chapter received 35 total comments—12 on the chapter as a whole, and 23 on specific passages. Some of these comments we would recognize as typical peer-review summaries (accept/reject/revise with comments on how best to revise); others were directed at improving specific elements in the chapter.

[4] Owens and Gibbs replied to some comments directly, and ultimately composed a revision of this chapter, which appeared on the project website in the spring of 2012. At this stage in the publication process, the chapter moved from review mode to copyediting mode. The chapter on the website includes a link to a document in Google Drive where any reader can comment (though comments ended up limited to the two authors and two editors), but only the document owners can change the text in response to those comments. The final version of this document was submitted to the publisher for inclusion in the print and electronic book editions (Gibbs and Owens 2013). However, the open peer-review process for Owens's and Gibbs's chapter did not end with the publication of Writing History in the Digital Age.

[5] In December 2011, in the middle of the open peer-review process for this chapter, Owens wrote on his blog:

We were asked to clarify what we saw as the difference between data and evidence.
We will help to clarify this in the paper, but it has also sparked a much longer conversation in my mind that I wanted to share here and invite comments on. As I said, this is too big of a can of worms to fit into that paper, but I wanted to take a few moments to sketch this out and see what others think about it.

[6] What followed in that blog post was the first version of Owens's essay, "Defining Data for Humanists: Text, Artifact, Information or Evidence?" This essay sparked the interest of digital humanists on the internet, and on February 2, 2012, it was re-posted on Digital Humanities Now, an open-access website that aggregates content on the open web related to the field of digital humanities, or DH. In addition to simply aggregating or re-posting the blog post, DHNow's editors labeled the post "Editor's Choice," meaning that their team of regular editors and temporary "editors-at-large" agree that it is worth particular attention from the DH community. Though DHNow does include some form of peer review (albeit without comments, or even the idea of "submission"), it is not a journal, simply a quality-controlled aggregator that attempts to help DH scholars find good content on the open web. Comments and responses to items posted on DHNow are posted on the original sites (controlled by the authors), and journal publication is left to others.

[7] DHNow is one of several publications produced by the Roy Rosenzweig Center for History and New Media. Another is the Journal of Digital Humanities. As stated on the journal's website:

The Journal of Digital Humanities selects content from the Editors' Choice pieces from Digital Humanities Now, which highlights the best scholarship—in whatever form—that drives the field of digital humanities field [sic] forward. The Journal of Digital Humanities provides three additional layers of evaluation, review, and editing to the pieces initially identified by Digital Humanities Now.

[8] Owens's blog post that appeared in DHNow's Editor's Choice stream was picked up by JDH in just such a manner. After a round of revisions, it appeared in Volume 1.1 of JDH in Winter 2011 (Owens 2011)—before the book chapter completed its editorial process. Though this final editing process did not take place on the open web, we can compare the blog post to the version published in JDH and see that most of the changes made during this final review stage were small. The most substantial changes were additions: a new introduction to fit the new medium of publication, a new paragraph engaging existing research not cited in the original, and a conclusion—something often missing from a blog post, but essential to a polished publication.

[9] This open peer-review process produced several unique benefits: (1) It produced two publications in highly respected media instead of one. Having two publications is an obvious benefit for the author, but since the extra publication allowed Owens to address a significant issue that he and Gibbs could not address fully in the original chapter, the extra publication benefits the field as well. (2) The open peer-review process allowed a greater number of scholars to provide refining input to the authors before final publication. (3) It generated a collaborative environment in which to make the chapter and article better.
(4) In addition to the advantages of the open review process, both the book and the journal processes included the biggest advantages of traditional publishing: working with peer reviewers, editors, and publishers to ensure high quality content and to make the work available to a wide readership.

[10] Open peer review and its advantages are not exclusive to the field of digital humanities. For instance, Empirical Musicology Review employs a "public peer review" model and publishes commentaries alongside articles in the journal, with the goal of "allowing readers to witness a scholarly conversation." (See this description of Empirical Musicology Review's peer-review process.) Open peer review can benefit music theorists, as well, and MTO is well positioned to incorporate it into a viable publication model for music theory. With the advantages of open peer review in mind, I will now outline a proposal for an open peer-review experiment for MTO.

A Proposal for Open Peer Review in MTO

[11] For the past year, the SMT Networking Committee (of which I am a member) has been discussing the possibility of creating a website that will curate content of interest to music theory scholars available on the open web. We have built a prototype for this curation site using open-source content management software called Pligg. Pligg allows a community to register its members, and all registered members can nominate content for inclusion on the site. Once a member submits a link to an online resource and a brief description of the resource, it appears on a page available to members that contains the most recently submitted content. Members can follow the link to the original resource, comment on the original page, comment within Pligg's members area regarding its merit for inclusion, and vote thumbs-up or thumbs-down on Pligg. Once a submitted resource reaches a certain threshold—five more thumbs up than down, for example—it is automatically moved to the front page, which is viewable by anyone, members and non-members alike. This proposed curation site is based on the model of DHThis, which is similar to DHNow, but with a more transparent and community-driven review process.

[12] Such a curation site would be a valuable tool in itself. As SMT members have moved more of their online discussion and resource sharing off of smt-talk, discussions have become more fragmented. This curation site could serve as a valuable portal to help music scholars find good content and online discussions, a portal that is flexible enough to deal with the changing landscape of online social interaction. As more music scholars are blogging about their research, and especially their teaching, and more music instructors are posting course materials on the open web, this curation site could serve as a valuable portal to good research, research-in-progress, and pedagogical tips and materials. Such a site would also allow, potentially, for a greater diversity of voices within the music theory community, as a small sub-community could easily share their materials. (If that community is active and enthusiastic enough, they could readily help each other's work reach the main page. And if a community abused the process, algorithms for front-page inclusion could be adjusted or users could be suspended.)
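To make the front-page promotion rule described in paragraph [11] concrete, here is a minimal sketch of that logic. Pligg itself is PHP software; this Python rendering, its function name, and the exact five-vote margin are illustrative assumptions, not Pligg's actual code or API:

```python
# Sketch of the front-page promotion rule described in [11].
# Pligg is PHP software; this Python version is illustrative only,
# and the names and the five-vote margin are assumptions.

def should_promote(up_votes: int, down_votes: int, margin: int = 5) -> bool:
    """Promote a nominated resource once thumbs-up exceed thumbs-down by `margin`."""
    return (up_votes - down_votes) >= margin

# A submission stays in the members-only area until enough members weigh in:
print(should_promote(up_votes=7, down_votes=1))  # True  -> moves to the public front page
print(should_promote(up_votes=6, down_votes=3))  # False -> remains under review
```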
[13] However, just as DHNow is not the final stage in a peer-review process, this music theory curation site need not be the final stage in a peer-review process. I propose that MTO take on a function similar to the Journal of Digital Humanities. MTO's editors can periodically mine the front page of this curation site, as well as the comment threads, for articles of interest, to find blog posts and essays that are of great interest to the music theory community, that receive significant positive feedback, and that have improved as a result of the feedback received. MTO's editors can seek out simply the best content, or the best content related to a pre-identified theme. MTO's editors (as well as organizers of conferences and symposia) can also identify themes that emerge on the curation site, and use those themes to organize special issues.

[14] Once content from the curation site has been identified for potential inclusion in MTO, I envision one of two primary options for the final stage of review. The first option would follow JDH: accept the article pending revision, and assign a small team of editors and/or referees to work collaboratively, and non-anonymously, with the author to bring the article up to publication standard. The second option would follow something similar to the Writing History in the Digital Age model: accept the article pending revision, and make the article available for public comment pre-publication, so that the author can revise the article based on (good) comments received, followed by a final copyediting pass before publication on MTO. In both cases, the editor would retain the final decision about inclusion in the journal.

A Community Mind-shift

[15] Such a publication model would both produce and require a mind-shift from the music theory community. It would require a greater comfort than our field has traditionally shown with sharing ideas publicly that may not be camera-ready. It would require individuals to overcome a greater technological hurdle than with traditional publishing. Since this open peer-review model would be based around pre-published material on the web, it would generally lead to shorter publications than traditional publishing. Articles that work well for open peer review tend to be shorter, much more targeted, and have a much shorter bibliography than traditional articles, even when they represent the same level of research quality. Lastly, just as authors would need to be more comfortable sharing ideas that may not be camera-ready, reviewers need to be more comfortable sharing their comments, including critical ones, on the open web for such a project to work. Blind peer review protects reviewers as much as authors. Open peer review may be a non-starter for those used to providing their comments anonymously. Regardless, open peer review does require a major shift in thought here, and it may only work in sub-communities (such as the FlipCamp community, or within a department or a school of thought) where a strong collegiality already exists.

[16] Open peer review would also produce a mind-shift among music theory scholars: scholarship would be less about individuals working in isolation, and more about collaboration. That, in turn, means that working and talking openly with other scholars becomes a primary means of participating in scholarly discourse. (That is the spirit behind Empirical Musicology Review's review process and open access policy.) For my part, that would be a welcome and exciting shift in the way we as music theorists work.

[17] Open peer review has much to offer our scholarly community.
While the proposal I have outlined may not be the ultimate destination for SMT or MTO, it is a process that has proven valuable enough to other humanistic disciplines that it is worth attempting in ours. Regardless of the end result, I hope that we as a scholarly society can be among the leaders of those increasing access to good research and teaching materials and facilitating meaningful scholarly discourse about music, both within our society and without.

Kris P. Shaffer
University of Colorado–Boulder
College of Music
301 UCVB
Boulder, CO 80309
kris.shaffer@gmail.com

Works Cited

Dougherty, Jack, Jason B. Jones, Dina Anselmi, and Tennyson O'Donnell, eds. 2014. Web Writing: Why and How for Liberal Arts Teaching & Learning. Ann Arbor, Mich.: University of Michigan Press.

Gibbs, Frederick W. and Trevor J. Owens. 2013. "The Hermeneutics of Data and Historical Writing." In Writing History in the Digital Age, ed. Kristen Nawrotzki and Jack Dougherty. Ann Arbor, Mich.: University of Michigan Press.

Nawrotzki, Kristen and Jack Dougherty, eds. 2013. Writing History in the Digital Age. Ann Arbor, Mich.: University of Michigan Press.

Owens, Trevor. 2011. "Defining Data for Humanists: Text, Artifact, Information or Evidence?" Journal of Digital Humanities 1, no. 1 (accessed January 10, 2014).

Copyright © 2014 by the Society for Music Theory. All rights reserved.

work_5pp37f763jaxdfpobju6nolski ---- DNA and 普通話 (Mandarin): Bringing introductory programming to the Life Sciences and Digital Humanities

doi: 10.1016/j.procs.2015.05.458

Mark D. LeBlanc (Computer Science) and Michael D.C. Drout (English)
Wheaton College, Norton, MA, USA
{mleblanc, mdrout}@wheatoncollege.edu

Abstract

The ability to write software (to script, to program, to code) is a vital skill for students and their future data-centric, multidisciplinary careers. We present a ten-year effort to teach introductory programming skills in domain-focused courses to students across divisions in our liberal arts college.
By creatively working with colleagues in Biology, Statistics, and now English, we have designed, modified, and offered six iterations of two courses: "DNA" and "Computing for Poets". Larger percentages of women have consistently enrolled in these two courses vs. the traditional first course in the major. We share our open source course materials and present here our use of a blended learning classroom that leverages the increasing quality of online video lectures and programming practice sites in an attempt to maximize faculty-student interactions in class.

Keywords: programming, interdisciplinary, multidisciplinary, genomics, bioinformatics, digital humanities, blended learning

Procedia Computer Science, Volume 51, 2015, Pages 1937–1946. ICCS 2015 International Conference On Computational Science. Selection and peer-review under responsibility of the Scientific Programme Committee of ICCS 2015. © The Authors. Published by Elsevier B.V.

1 Introduction

Teaching novices to understand, predict, and solve data-rich problems by writing software is an interdisciplinary endeavor. Whereas "computational science is the new scientific field emerging from the fusion of mathematics and information technology" (Koumoutsakos, 2014), there exists a heightened need for departments to offer creative and appropriate courses that acknowledge the need for students to be exposed to interdisciplinary teams and computationally-rich problems. While programming is not computational thinking, introductory courses that teach problem solving via scripting are important course offerings that "teach programming to enhance computational thinking" (Falkner, 2014).

One goal is to offer programming courses that match the passions of rising computational scientists. We present a ten-year multidisciplinary effort to iteratively design and teach two introductory programming courses, one for students in the life sciences and the other for students in the humanities. These two introductory programming courses are both offered every other year in addition to our traditional introduction to computer science course that is offered each semester. After commenting on the spirit behind a computational thread between DNA (the language of life) and Mandarin (here just an example of a digitized corpus in most any language), we discuss the larger academic framework of these two interdisciplinary courses and present enrollment data that speaks to our efforts to increase the percentage of women who are exposed to computational science. Finally, we present details of each course, including a discussion of how our use of blended learning is helping us maximize faculty-student interactions during class and is helping to challenge the traditional notion of how faculty and students define and recognize "classroom time".

2 A note on DNA and 普通話 (Mandarin)

Counting character and word n-grams is an important step in many computational explorations of texts where vectors of token frequencies are used as "stock" in analyses of those texts; for example, an unsupervised cluster analysis of segments from a novel written by two authors. Of course, it turns out that counting "words" (motifs) in DNA (e.g., counting frequencies of motifs in a sliding window of every four nucleotides, each token referred to as a 4-mer) to detect regions of horizontal transfer is algorithmically similar to counting character n-grams in languages with little white spacing (e.g., counting instances of every four contiguous characters in The Dream of the Red Chamber, one of the four classical Chinese novels written in Mandarin).
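As a minimal sketch (in Python, the language both courses teach), the same sliding-window count serves a string of nucleotides and a string of Mandarin characters alike; the two sample strings and the function name below are invented for illustration:

```python
from collections import Counter

def ngram_counts(text: str, n: int) -> Counter:
    """Count every contiguous n-character token in a sliding window over `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

# The identical routine serves both domains:
dna = "ATGCGATATGCG"               # a toy nucleotide sequence
print(ngram_counts(dna, 4))        # 4-mers, e.g., 'ATGC' occurs twice here

mandarin = "紅樓夢紅樓夢"           # a toy string; no white space between "words"
print(ngram_counts(mandarin, 4))   # character 4-grams
```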
Although counting tokens is only one of many introductory techniques in text mining, we have found that teaching students to write software to count and store "words" for future analysis sparks their interest in computational experiments, whether that interest be in microbial genomics or text mining in a foreign language such as Mandarin.

3 Connecting Across Campus

The Wheaton College curriculum is centered on "Connections," pairs of linked courses that connect significantly different disciplines. Wheaton is a residential, liberal arts campus of 1600 students where courses are linked across any two of six academic areas: creative arts, humanities, history, math and computer science, natural sciences, and social sciences. Each course in a pair of connected courses may be taken in either order, and the two need not be taken in consecutive semesters (LeBlanc et al. 2009). Our Computer Science program has established a suite of six connected courses, two of which are discussed in this paper and listed in Table 1.

Course Name | Connected to Area | Connected With Courses
Computing for Poets | English, Digital Humanities | J.R.R. Tolkien or Anglo-Saxon Literature
DNA | Life Sciences, Philosophy | Bio-Ethics or Ethics

Table 1. Two domain-focused, introductory programming courses for students in the humanities and life sciences. Two English courses are connected to the Poets course, and Philosophy's two ethics courses are connected to the DNA course. Ethics is also connected to other Biology courses.

Unlike models that rely on "courses outside of computer science" (Furst et al. 2007), connected courses involve multidisciplinary faculty in the design of computer science courses. Humanities students, typically from the highly enrolled English program, who are taking either "Anglo-Saxon Literature" or "J.R.R. Tolkien" courses are encouraged to consider completing a connection by subsequently enrolling in "Computing for Poets" (COMP 131). Given the explosion of digitized texts and our own ongoing research (cf. Lexomics), this course connection is rich with opportunity for creative problem solving. Likewise, the revolutions in personalized medicine and genome sequencing are generating a new thirst among students to computationally consider how "the stuff of life" is data. The "DNA" course (COMP/BIO 242) is cross-listed as a computer science or biology course and counts as a 200-level elective in bioinformatics, biology, or computer science. The "DNA" course is aptly connected with Philosophy's Bio-Ethics or Ethics courses; the fast-paced changes in genomic medicine alone keep us all on our toes regarding the implications of the scientific advances. Overall, our experience supports the recommendation that faculty from other departments take an active role in course development and delivery (Guzdial, 2009).
We agree with Cooper and Cunningham (2010) that offering opportunities for problem solving and introductory programming with a specific context over an entire semester is an important element, whether the context is "genomics" for life science students or "text mining" for digital humanities students. Introductory programming courses with a focus on media have been very successful (cf. Guzdial, 2003). Union College offers multiple perspectives when approaching CS1 (Barr, 2012), as do our courses that reach two vibrant and broad audiences: the life sciences and digital humanities. The team-taught course at Harvey Mudd College (Dodds et al., 2012) that integrates the first semesters of BIO1 and CS1 is more ambitious than our DNA course described here but is similar in spirit in that the entire semester is focused on modules that apply computing to genomics problems. Bioinformatics-centric courses are more likely to appear at the upper level (cf. Tjaden, 2007), whereas the courses described here assume no prior programming experience. A growing number of CS1 courses offer "data-centric" assignments with topics and applications that vary on each assignment (e.g., Anderson et al. 2015), whereas the DNA and Poets courses discussed here maintain a fixed context. Perhaps most distinguishing is that the two courses presented here are introductory programming courses. Although a rich collection of web-based bioinformatics and text mining tools are available and are integrated during final projects, the focus is on teaching good introductory programming skills to mix, mash, and morph scientific data within the context of an existing scholarly and scientific passion.

4 Towards gender balance

Increasing the percentage of women who enroll in introductory computer science courses has been an ongoing focus and continual challenge. In twenty offerings from 2000 to the present, our traditional first course in the computer science major (CS1, "Robots, Games, and Problem Solving"), which has been oversubscribed each semester for the last few years, enrolls on average just over one-third female students (35.7%). As at many institutions, this course serves potential computer science majors and acts as an elective, primarily for mathematics and science majors. With an overall campus female-male student ratio of almost two to one and the relatively high proportion of women majoring in the life sciences and humanities, we knew we had positive recruitment potential if we could provide more options. Since 2004 we have offered two additional introductory programming courses: "DNA" for students in the life sciences and a course called "Computing for Poets" (hereafter Poets) for students in the humanities. The two courses are separately offered as an alternative to our introductory (CS1) course, but specifically target the rapid need for and reliance on computational thinking due to the revolutionary changes in the bioinformatics and digital humanities spaces. Over the last decade, DNA has been offered six times and Poets has been offered five times since 2004. As shown in Figure 1, our two introductory, interdisciplinary courses (DNA and Poets) have consistently enrolled more women than the traditional offerings of CS1. In all six offerings of DNA and in all five offerings of Poets since 2004, women enroll in higher percentages than 19 of the 20 offerings of CS1.
Since 2000, women enroll in CS1 at an average of 36%. In contrast, from 2004 to the present, on average 51% of the students in DNA and 58% of the students in Poets were women. Maintaining female enrollment percentages above 50% in introductory computer science courses is a significant and refreshing outcome. Greater gender balance in the classroom has numerous benefits, including the benefits that the male students receive from working more closely with women early in their careers.

Figure 1. Percentage of Women Enrolling in three different introductory computer science courses: CS1 (introductory course for majors) and two focused offerings, one for students in the life sciences (DNA) and the other for students in the digital humanities (Poets). The moving trend line shows the 4-per-year average percentage for our traditional course (CS1).

5 The Digital Humanities

The time has never been better for computational science to impact the Humanities. The Digital Humanities represents a growing subculture on many campuses, and the glut of digitized texts, many in languages from around the globe that match scholars' areas of expertise, has radically altered what it means to be a scholar of texts. Computational explorations of texts, sometimes referred to as computational stylistics, is a subfield within the Digital Humanities and one where students need exposure to and practice with searching and analyzing large digitized corpora. The Computing for Poets course discussed next is approaching its sixth iteration to do just that.

5.1 Computing for Poets (COMP 131)

The use of computers to manage the storage and retrieval of written texts creates new opportunities for scholars of ancient and other written works. Recent advances in computer software, hypertext, and database methodologies have made it possible to ask novel questions about a poem, a story, a trilogy, or an entire corpus. The Poets course exposes students to leading markup languages (HTML, CSS, XML) and teaches computer programming as a vehicle to explore and "data mine" digitized texts. Programming facilitates top-down thinking and practice with computational thinking skills such as problem decomposition, algorithmic thinking, and experimental design, topics that humanities students in our experience rarely see. Programming on and with texts introduces students to rich new areas of scholarship including stylometry and authorship attribution. The course has no prerequisites other than a love of the written (and digital) word; no previous computer programming experience is required. A learning objective for students in this course is to articulate how computational analyses of digitized texts enable both a "close reading" of a single text as well as a "distant reading" of many texts across time (Moretti, 2013). The goal for each student is to master enough programming to modify digitized texts to help in a computational experiment that explores a question about a text or set of texts. For example, the assignments in Fall 2013 (see Table 2) asked students to write and extend Python scripts over the semester that analyze texts and store results in Excel-ready output files to facilitate subsequent analyses, documentation, and scientific writing. The number, pace, and level of difficulty of assignments and labs in the course are coordinated to help most students conduct an introductory text mining experiment in the last three weeks of the course.
Assignment | Short Description
a1: Website – Ngrams | Build a website with results from Google's Ngram Viewer to investigate how the frequency of words or phrases has changed over time as appearing in books from 1800 to 2012
a2: Deforming Poetry | Write a program to help the reader more easily read poems that have been deformed in various methods, e.g., read a poem backwards
a3: Regex Play | Use regular expressions to solve some of Will Shortz' word puzzles; Shortz is National Public Radio's (NPR) puzzle master
a4: Tall Elves | Conjecture: Tolkien wanted his readers to fully appreciate that his elves were large, thus he used the word "tall" (or other variants such as "big", "giant", "large", etc.) in close proximity to the name of an elf (e.g., "Legolas", "Galadriel", or even the generic word "elf"). Write a script(s) to generate data that will help experimentally verify if this conjecture is true or false.
a5: Only in the Poetry | Considering the entire digitized corpus of Old English (Anglo-Saxon) poetry and prose texts, write a script to determine which, if any, words appear only in the poetry?

Table 2. Five programming assignments in Computing for Poets (Spring 2014 semester)

Developing programming assignments with an expert scholar in these spaces is critical when attempting to focus student attention on the current level of scripting practice and excitement that comes from studying the original texts. The Poets course is "connected" with two courses in English: J.R.R. Tolkien (ENG 259) and Anglo-Saxon Literature (ENG 208). For example, in Anglo-Saxon studies, the relationship between Old English poems has been a vexed question for nearly 150 years. Most of the poetry is anonymous and exists only as tenth-century copies in manuscripts (some of it is assumed to be much earlier). We have only three named authors of poetry in the Anglo-Saxon period, and the majority of the prose is anonymous. Thus for years scholars of Old English have struggled to divine relationships between texts based on vocabulary, meter, and style. These results have been at best contentious and at worst completely unsuccessful. As students learn more and more scripting in the course, we set up and run experiments using the entire Anglo-Saxon corpus. Some of the questions asked by undergraduates may never have been asked before. In the case of the connection with the course on J.R.R. Tolkien, students participate in the design and execution of an experiment across multiple texts, e.g., The Lord of the Rings trilogy, which inevitably leads to follow-up interests with other texts: What about the Silmarillion? Should we ask this computational question on The Hobbit? A sketch of the kind of script that assignment a4 calls for appears below.
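As a taste of the scripting these assignments require, the following minimal sketch shows one way to attack the proximity test behind assignment a4 ("Tall Elves"). The word lists, window size, sample sentence, and function name are our own illustrative assumptions, not the assignment's required solution:

```python
import re

# Illustrative word lists and window size (assumptions, not the official solution).
ELF_WORDS = {"legolas", "galadriel", "elf", "elves"}
SIZE_WORDS = {"tall", "big", "giant", "large"}

def proximity_hits(text: str, window: int = 8):
    """Yield (elf_word, size_word, token_position) wherever a size adjective
    falls within `window` tokens of an elf reference."""
    tokens = re.findall(r"[a-z']+", text.lower())
    for i, tok in enumerate(tokens):
        if tok in ELF_WORDS:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if tokens[j] in SIZE_WORDS:
                    yield tok, tokens[j], i

sample = "Legolas was tall, taller even than the giant gate of the elves."
for elf, size, pos in proximity_hits(sample):
    print(elf, "near", size, "at token", pos)
```

Tallying hits of this kind across the trilogy, and writing the counts to an Excel-ready output file, is exactly the sort of experiment the assignment asks students to design.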
It has not escaped our notice that our open reliance on online reading and practice materials has forced us as instructors to critically consider the use of class time, for example, the time spent "writing notes on the board." In-class problem solving sessions begin with brief discussions that map a problem at hand to the programming language control structures and/or data structures of the day, followed by pair-programming opportunities (switching the student typing every 20 minutes) to refactor scripts to open and read from files in multiple languages, e.g., Mandarin, Latin, or Middle English.

5.3 Final Projects

From the initial day of class, students begin work on a final semester project to design an experiment on a set of digitized texts of their choice. In our experience, scholars who might like to perform computational analysis in their areas of expertise and/or wish to teach their students how to do so become discouraged too early in the game. The Lexos software developed by our Lexomics Project provides a simple, web-based workflow for text processing, statistical analysis, and visualization. Situated within a clean and simple interface, Lexos consolidates the common yet frustrating pre-processing operations that are needed for subsequent analysis, e.g., a cluster analysis of segments from multiple novels. Student programming in final projects is on an "as needed" basis, the focus being the use of a computational method in a small experiment. Some recent undergraduate topics for final projects are listed in Table 3. A complete syllabus and sets of programming assignments and other course materials are available at our Lexomics website (http://lexomics.wheatoncollege.edu/).

Table 3. Sample undergraduate final projects in Computing for Poets (the last three iterations of student topics are available at http://wheatoncollege.edu/lexomics/computing-poets):

All of Caesar is Divided into Five Parts, but Who Wrote What? A Look at the Various Authors of the Complete Works of Julius Caesar

Not So Elementary: Did Sir Arthur Conan Doyle Write All of the Sherlock Holmes Canon?

Found in Translation: A Comparative Lexomic Analysis of Three Translations of Beowulf

Variation and Influence: Using Dendrograms to Identify Variations in Style and Influence in Tolkien's Lord of the Rings

Can Bias Be Counted? Political Vocabulary in the News

6 Reaching the Life Sciences

The generation, storage, analysis, and visualization of bioinformatics data is happening at a dizzying pace, and the next generation of scientists faces great challenges in a post-sequenced world (Macarthur, Wired). Since 1998, our Wheaton College Genomics Research Group has influenced collaborative teaching between biology and computer science, and our collaborations have in turn shaped directions for research (LeBlanc and Dyer, 2003; 2004; Dyer et al. 2007).

6.1 DNA (COMP/BIO 242)

An amazing blend of science, computing, and mathematics emerges when considering the molecule "Deoxyribonucleic Acid" (DNA). DNA is the blueprint of life for all organisms on Earth. Its distinctive and beautiful physical nature, a double helix of four bases, maps onto its functionality as a bearer of information, generation after generation.
Fully sequenced genomes, including the human genome and hundreds of microbial genomes, have become the starting point for attempts to answer a wide range of biological and quantitative questions. A goal of the course is to enhance computational thinking via introductory programming as applied to the wealth of genomic data. A particular focus is on the exciting convergence of personalized medicine and the ongoing human microbiome projects. Table 4 lists a set of learning objectives, what we hope become our students' "take away stories".

Table 4. Learning objectives ("take away stories") for the cross-listed course DNA (COMP/BIO 242):

(0) You are at a cocktail party and the topic of genomes comes up. You are able to recall significant phrases, terms, and techniques, and your understanding of the main ideas and concepts enables you to lead the conversation for a while … which causes your friends to raise their eyebrows.

(1) You learn to identify and classify problems that are candidates for a computer to handle; this is the start of "computational thinking".

(2) You demonstrate the ability to think algorithmically, breaking what originally seems like an overly complicated problem into a series of smaller, manageable tasks.

(3) You learn to craft creative solutions by "writing software" ("to program", "to code", "to script").

(4) You appreciate the importance of microbes and the Human Microbiome Project.

(5) You design experiments to first solve small computational tasks (e.g., one gene sequence) and then scale your solutions to very large sets of data (e.g., all genes in a genome).

(6) You learn to move around and perform some work in the Linux (Unix) operating system.

(7) You learn to professionally document your software and produce quality summaries, graphs, and reports of your computational methods and results.

(8) You begin to appreciate the (soon to be) revolution in personalized medicine, including knowing your way around a personalized report from the personal genomics company "23andMe".

(9) You feel empowered to evaluate the ethical implications of your work and learn to appraise, critique, and defend your own work as well as the work of others.

This course is part of the connection "Genes in Context" with Philosophy 111 (Ethics) or Philosophy 241 (Bio-Ethics). Throughout the semester in the DNA course, students are exposed to the ethical aspects of living in a post-genomic world and the increasing use and challenges of sequenced genomes as applied to "personalized medicine". Students access, explore, and discuss the professor's genomic report as obtained from the saliva-based DNA service (in Fall 2013 we obtained one of the last reports to include likelihoods of having certain diseases, just prior to the FDA's decision to halt such "interpreted" genomic profiles). In addition, students watch the 1997 movie GATTACA together, a bioethics professor leads a discussion of "Designer Babies", and students produce one-minute YouTube "commercials" of companies currently promoting and selling medical profiles based on individual genomes. The commercials are framed from one of two points of view: (i) from the point of view of the company (e.g., 23andMe) or (ii) from a consumer advocacy point of view.
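The flavor of the early programming work, in which take-away stories (1) through (5) are exercised on DNA treated as a string, can be suggested with a minimal sketch; the sequence below is invented for illustration, not real genomic data.

    # A minimal sketch of treating DNA as a string in a four-letter alphabet
    # and reporting nucleotide proportions. The sequence is invented.
    sequence = "ATGCGATACGCTTGAGCTAACGT"

    total = len(sequence)
    for base in "ACGT":
        proportion = sequence.count(base) / total
        print(f"{base}: {proportion:.2f}")

    # The same loop scales from one short sequence to an entire genome
    # read from a file, which is the "small task first, then scale" pattern.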
Not unlike the Poets course, the programming assignments are paced so that students may conduct a final project experiment and then share their methods and results, both orally and in writing.

Table 5. Five programming assignments in DNA (Fall 2013 semester):

a1 (Playing with DNA): Working with DNA as a language: a string of characters in a four-letter alphabet.

a2 (Chargaff's Numbers): For any sequence or entire genome, report the proportions of A, C, G, T nucleotides.

a3 (Gene Finder): Simulate transcription and translation on strings and evaluate "appropriate" reading frames.

a4 (Motif Finder): Build and apply regular expressions to find relevant regulatory motifs upstream of genes.

a5 (Comparative Genomics): Use a "bag of words" to keep track of motif frequencies to assign a "genomic signature" to sections of a genome.

6.2 Blended Learning

The Fall 2013 offering of the DNA course was our most significant blended learning trial to date. As in the Poets course, students use an online Python interactive textbook (Miller and Ranum, 2014) and spend time outside of class completing the Code Academy practice exercises. In addition, students watch at least five lectures outside of class on biological topics from a MOOC (Massive Open Online Course), here Udacity's "Tales of the Genome" online course. The rationale for using this "outside" material is two-fold. First, the quality of these terminology-rich lectures is very good and improving; for example, students are encouraged to fill in a template of a concept map during a lecture, where each concept map links to and from other lectures. The online material contains minimal "talking head" time, presenting instead a series of "whiteboard" illustrations punctuated by just-in-time quizzes. A completed concept map is provided at the end of the lecture. The second and related rationale for using Udacity's materials is that class time is too precious to spend lecturing on basic biological processes, for example transcription and translation, especially in a one-semester programming course where time for problem solving is at a premium.

7 Reaching the entire academy

Teaching programming to introduce and enhance computational thinking is a vital contribution to the academy. The life sciences' genomic revolution and the digitization of books and manuscripts present unique opportunities for exposing a wider audience to computational science. We present two introductory programming courses over a ten-year period that target students in the life sciences and humanities, each attracting gender-balanced enrollments that exceed traditional computer science introductory offerings. In particular, we remain convinced that learning to think algorithmically and to encode those ideas in software is a vital competency for today's undergraduate. New opportunities for teaching computing to the growing constituencies that can benefit from introductory programming have led us to experiment with alternative, blended learning uses of class time, specifically fewer minutes lecturing and more hours solving problems in class. We advocate for more creative experimentation from faculty with how they use class time, including an increase in the infusion of "outside" course materials (e.g., MOOC lectures) from the growing palette of good instructional materials available. Programming is not an end-all for computational science.
Yet, the academy faces a number of new audiences who will benefit from the ability to script in the midst of their data-driven world. How we reach them in introductory courses can make all the difference. This work was funded in part by the National Endowment for the Humanities (NEH) and a Google CS Engagement Award.

References

Anderson, R., Ernst, M.D., Ordóñez, R., Pham, P., and Tribelhorn, B. (2015). A Data Programming CS1 Course. Proceedings of SIGCSE'15 Symposium on Computer Science Education. Kansas City, MO (Mar. 2015), 150-155.

Barr, V. (2012). Create two, three, many courses: An Experiment in Contextualized Introductory Computer Science. JCSC 27(6) (June 2012), 19-25.

Code Academy: Python course (accessed 12/1/2014). (http://www.codecademy.com/en/tracks/python).

Cooper, S. and Cunningham, Z. (2013). Success in Introductory Programming: What Works? Communications of the ACM, 56(8) (August 2013), 34-36.

Dodds, Z., Libeskind-Hadas, R., and Bush, E. (2012). BIO1 as CS1: Evaluating a Crossdisciplinary CS Context. ITiCSE'12, July 3–5, 2012, Haifa, Israel, 268-272.

Downey, S., Drout, M., Kahn, M., and LeBlanc, M.D. (2012). "Books Tell Us": Lexomics and Traditional Evidence for the Sources of Guthlac A. Modern Philology 110, 153-181.

Dyer, B.D., Kahn, M., and LeBlanc, M.D. (2007). Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria. Archaea 2, 159–167.

Falkner, N. (2014). Computational Thinking for All. Keynote address at SIGCSE Technical Symposium on Computer Science Education, Atlanta, GA.

Furst, M., Isbell, C., and Guzdial, M. (2007). Threads™: How to Restructure a Computer Science Curriculum for a Flat World. Proceedings of the 38th SIGCSE Symposium on Computer Science Education. Covington, KY (Mar. 2007), 420-424.

Guzdial, M. (2003). A Media Computation Course for Non-Majors. ITiCSE 2003 Conference Proceedings, 104-108.

Guzdial, M. (2009). Teaching Computing to Everyone. Communications of the ACM, 52(5) (May 2009), 31-33.

Koumoutsakos, P. (2014). The Arrow of Computational Science. Abstract of public lecture given at ETH Zurich, Switzerland, June 2, 2014.

LeBlanc, M.D. and Dyer, B.D. (2003). Teaching Together: A three-year case study in genomics. The Journal of Computing in Colleges, 18(5), 85-95.

LeBlanc, M.D. and Dyer, B.D. (2004). Bioinformatics and Computing Curricula 2001 – Why Computer Science is well positioned in a post-genomic world. ACM SIGCSE Bulletin, 36(4).

LeBlanc, M.D., Gousie, M., and Armstrong, T. (2010). Connecting Across Campus. Proceedings of the 41st SIGCSE Technical Symposium on Computer Science Education, Milwaukee, WI.

Lexomics Research Group. http://wheatoncollege.edu/lexomics/computing-poets/

Macarthur, D. (accessed 12/09/2014). Why biology students should learn to program. Wired: Genetic Futures. http://www.wired.com/2009/03/why-biology-students-should-learn-how-to-program/

Miller, B. and Ranum, D. (accessed 12/10/2014). How to Think Like a Computer Scientist. Runestone Interactive Project (http://interactivepython.org/courselib/static/thinkcspy/index.html).

Moretti, F. (2013). Distant Reading. Verso.

Tales of the Genome: An Introduction to Genetics for Beginners. Udacity. (https://www.udacity.com/course/bio110).

Tjaden, B. (2007). A multidisciplinary course in computational biology. The Journal of Computing in Colleges 22(6), 80-87.
Related URLs

Computing for Poets: http://wheatoncollege.edu/lexomics/computing-poets/
DNA: http://wheatoncollege.edu/genomics/dna/
Lexomics Research: http://lexomics.wheatoncollege.edu
Genomics Research: http://genomics.wheatoncollege.edu

work_5udx6sbgkrgo7o3xbnmwclfid4 ---- On the Value of Narratives in a Reflexive Digital Humanities

Fan, Lai-Tze. 2018. "On the Value of Narratives in a Reflexive Digital Humanities." Digital Studies/Le champ numérique 8(1): 5, pp. 1–29. DOI: https://doi.org/10.16995/dscn.285

Lai-Tze Fan, Lingnan University, HK (laitzefan@LN.edu.hk)

This paper returns to the relationship of "narrative versus database" (an argument originally made by Lev Manovich in 2001) as one that can be further addressed. A specific issue persists in text analysis research in the digital humanities: the difficulty of representing the figurative meaning of narratives through digital tools. Towards an accommodation, this paper adopts a narratological framework in order to propose alternative models of content management and organization that more closely resemble figurative meaning making in human language. These alternative models therefore better allow for the computational representation of figurative elements that N. Katherine Hayles describes as "the inexplicable, the unspeakable, the ineffable" of narrative literature. This paper argues that the construction of figurative meaning through paradigmatic substitution (as part of an imaginary vocabulary that is drawn from in the process of meaning making) is difficult to account for in the relational database—arguably still the most culturally prominent database model. By focusing on NoSQL ("no" or "not only" Structured Query Language) databases, this paper explores how layers of figurative meaning can be represented together through these flexible and non-relational models. In particular, the ability of non-relational databases to group together multiple values—encouraging their association, comparison, and juxtaposition—can be analyzed as a computational albeit imprecise counterpart to the formation of paradigmatic and figurative meaning. Thus, towards accounting for a word, image, or idea's layers of meaning as expressed in literature, this paper offers a study of the limitations of digital tools and their critical negotiation with humanities research and reflection.
Keywords: Narrative; database; reflexive; figurative meaning; paradigmatic; language; digital tools; relational; non-relational; NoSQL; narratology; humanities advocacy

Introduction

As a field of scholarship, the digital humanities are increasingly important to understand and develop, as they are uniquely attuned to the wide-ranging impact of digital media and culture. Yet, there remains a discrepancy between the epistemological underpinnings of the humanities and digital technologies and culture. On the one hand, we live in an information age that privileges technological progress and that is tasked with the creation, storage, and management of large amounts of data. On the other, our (western) traditional methods of interpreting information are grounded in humanities philosophy—through theoretical, interpretive, and reflexive methods of understanding history, tradition, culture, and storytelling. The epistemological differences between digital technologies and the humanities are in one way exemplified by the relationship between the database and the traditional narrative.
There is a discursive history of thinking about the database and narrative in terms of opposition, most notably beginning with Lev Manovich's article "Database as Symbolic Form" (1999) and its expansion in the foundational The Language of New Media (2001), in which he calls database and narrative "natural enemies," even expressing "surprise" that narrative still exists in new media (2001, 225; 2001, 228). [Footnote 1: Manovich himself has since further explored the nuances of this relationship in his work on software studies, most recently in the 2012–2015 Mellon research project on big data called "Tools for the Analysis and Visualization of Large Image and Video Collections for the Humanities."] Since such statements, the relationship between narrative and database has been examined to reveal more complexity. In the areas of digital humanities literary research and digital narratives in particular, narratives and databases are often analyzed in terms of their dynamism, as digital tools can be used to store, manage, represent, share, and create narrative literature—for instance, through electronic literature. Yet, many of these digital tools and methods have limitations that are at the crux of Manovich's original argument—namely the problem of juggling the ludic depth of literature with the qualities of precision, efficiency, and "knowing" that are dictated by rigid data management models and systems both on-screen and behind the screen. Given that machinic operations are designed to produce outcomes, quantify data, and otherwise offer answers, is it possible for methods of quantification to represent, for instance, the depth or affect of a metaphor? Databases in particular, as computational structures of content management, may struggle to store let alone re-present figurative meaning in literature. As this paper will show, this difficulty stems from the broader limitations of digital tools for representing the semiotic depth that is foundational to paradigmatic meaning making through human language. In using digital tools and methods to represent literature, then, digital humanists must ask whether the methodological prowess and scope of digital tools risk any loss of literary- and humanistic-based reflection and interpretation. I am not in this sense the first to inquire into the wheres and whens of the "H" in DH. [Footnote 2: In addition, for whom is the "H" in DH? Further inquiry into this question can address critical issues surrounding (the politics of) representation in the digital humanities, as explored by scholars such as Adeline Koh and Anne Cong-Huyen.] This does not imply that the digital humanities are not humanistic; rather, it refers to scholarship in literary studies, media studies, and the digital humanities that calls for investigative analysis that can account for more reflexive and interpretive ways of thinking. For instance, in response to Manovich's 2001 statements, N. Katherine Hayles (2012) contends that any scientific and engineering research presented through data and facts requires narrative for "the interpretation of the relations revealed by database queries" (2012, 182). Narratives are necessary to articulate the contexts and implications of any data- or fact-based research, including: background information; relations between groups; examinations of patterns in statistics; possible applications and their outcomes; and alternative methodologies that had been or could be attempted. The use of explanation in these examples illustrates the praxis and necessity of narrative forms and training even for research that is grounded in data, presenting a significant case for the value of reflecting upon narratives and narrative representation.
This includes digital humanities projects and texts that are digitized, born-digital, and digitally informed. I argue that the identification of the limitations and affordances of digital tools and methodologies for literary analysis only reminds us of the value of two modes of inquiry in a humanistic digital humanities:

• Humanistic thinking: reflexive and interpretive modes of inquiry in which humanities scholars and students are trained. These modes uniquely position us to ask whether the use of quantitative digital methodologies and tools (which participate in a discourse of "efficient" and "precise" methodological prowess) risks any of the priorities and responsibilities of the larger humanities project.

• Narratological thinking: an understanding of the linguistic play and semiotic depth of language as it is used to construct works of narrative literature. Narratological thinking requires a consideration of literary elements such as plot, theme, imagery, poetics, medium/media, and intertext.

Narratological thinking is, in this sense, a mode of inquiry that is necessary to understanding how figurative meaning functions as a unique and vital quality of meaning making in general, including how we communicate with each other by offering information in the form of stories. Together, these modes of inquiry as applied to the digital humanities encourage the critical comparison, juxtaposition, interpretation, and reflection of digital tools and research—a critique that is a necessarily ongoing endeavour in the still-nascent stages of development for the digital humanities as an academic field. Applying these two modes of inquiry to the analysis of specific database models that are popular for structuring, managing, and representing data reveals that the discussion of "narrative versus database" is not over. In fact, these database models point to an issue that continues to be a topic of inquiry and even skepticism in digital humanities text analysis projects: that, whether qualitative or quantitative, digital tools are not always capable of capturing the essence of what makes literary texts "literary" in the first place—including the elements of figurative meaning that Hayles describes as "the inexplicable, the unspeakable, the ineffable" of narrative literature (2012, 179). What humanistic and narratological modes of inquiry reveal, then, is the need for alternative models of content management that better accommodate for the literary. Towards such an accommodation, this paper proposes that digital text analysis projects can utilize NoSQL or non-relational database models—an approach to content (as data) management that more closely resembles the paradigmatic dimensions of meaning making in human language and that therefore begins to address elements of figurative meaning that carry so much literary "weight" and semiotic depth through
This alternative content management model is especially pertinent, I show, to address contemporary forms of narrative literature that mediate the impact of digital structures and representation on how we read, write, and think of literature itself today. Seeking methods of representing figurative meaning is only one way that humanistic and narratological thinking can encourage reflexivity and interpretation in the digital humanities. In this sense, I offer this paper that explicitly focuses on figurative meaning as only the start of a broader study on the dynamic between digital and narratological meaning making. Terms that I map throughout as a part of this ongoing comparison, juxtaposition, interpretation, and reflection may be aligned with my earlier descriptions of the epistemological underpinnings of a computational information age and (western) humanistic philosophies, whereby “database” and “narrative” as network nodes may branch out to include the quantitative and qualitative, data and interpretation, and the literal and the figurative. These terms, much like their nodal roots, are not to be considered in opposition, but rather, as in connection and thus conversation along with other existing epistemological modes of knowledge. The main difference I wish to illustrate is the wait and weight of the humanities: its position to inquire beyond that which is “known” and its critical negotiation with that which claims to know. Database versus Narrative: The Known and the Unknown/ Indeterminate The “narrative versus database” discussion emerges from Manovich’s description in The Language of New Media of the rise of a “computerization of culture,” in which the database plays a key role as a symbolic form and significant cultural form (2001, 43). While many scholars have sought to reframe the relationship between narrative and database to reveal more complexity (discussed further below), it remains the case that there are aspects of literary narrative that are not accounted for or represented by all digital tools, simply because of the ways in which these tools are designed to manage content. Some database models are rigid in their parameterization of content and others are more flexible. It is therefore necessary to distinguish that while Manovich and Fan: On the Value of Narratives in a Reflexive Digital Humanities 7 Hayles identify many models of databases and content management, each focuses on the relational database, which has been and arguably predominantly still is the database form of cultural choice (Dourish 2014, n.p.). Relational databases resemble the format of spreadsheets such as those seen in MS Excel, as both resemble print- based forms such as the index, their table structures remediating analogue methods of information organization that existed long before the digital computer was invented. It is perhaps this transferability of and therefore established literacy in more familiar cultural forms that attribute to the relational database’s continued popularity as a database form. For her own choice, Hayles explains in 2012 that the relational database has “almost entirely replaced the older hierarchical, tree, and network models” and also object-oriented database models (176). The relational database is composed of one or more tables (with rows and columns) that are drawn from for their data, a structure that is dictated by its programming language, SQL (Structured Query Language). 
SQL offers a rigid form of data organization through which content is dictated by the model of the table: if one requests data from a relational database, one must specify its database location; in reverse, any changes to the database structure or hierarchy of organization are also expressed in the code. It can be said that this rigidity is unavoidable, given why the general database was developed in the first place. The influx of digital devices offers a bounty of data that has become our blessing and our curse, as we try to find the "best" ways to manage and access data, typically through the methods of structured languages, programs, and databases. This leads to the creation of databases as "collections of items on which the user can perform various operations: view, navigate, search" (Manovich 2001, 234), and many computer operations function through the operations of requesting, adding, deleting, and updating data. Two types of potential incommensurability between the narrative and database emerge: the structural/formal and the semiotic. Manovich's initial observations of the discrepancy between narrative and database involve a consideration of how content is maintained and managed differently among distinct cultural forms. Specifically, it is the amount of information collated in digital culture that presents a conundrum of structure and form: data storage and management in computational devices allows for a massive amount of content to be stored, often resulting in efforts to mass archive and digitize that began in the early 1990s as a trend Manovich describes as "storage mania" (2001, 234). In contrast, narrative cannot contain all information, nor does it traditionally try to. As defined by narratology scholars such as Mieke Bal (2009) and David Herman (2009), a literary narrative is defined by its dynamic movement between markers of time (the beginning and end of a trajectory), composed of what Joseph Tabbi and Michael Wutz (1997) describe as "the progression of a central protagonist from a beginning through a middle toward an end that progressively diminishes possibilities and so represents that character's fate" (14). The traditional narrative follows a cause-and-effect model, certainly a model of meaning making in which a linear pathway is developed in the mind of the reader and in which not all trajectories are mapped. For these reasons, Manovich argues of database and narrative that "each claims an exclusive right to make meaning out of the world" (2001, 225). As the database is a dynamic body of information with no beginning or end, he asks, "how can one keep a coherent narrative or any other developmental trajectory through the material if it keeps changing?" (2001, 221). But more complicated is the question of language-based semiotic content when it is stored and represented as data. Given the imposed rigid structure of the relational database in addition to its ability to edit content, the question of narrative versus database requires that we—and by this I mean digital humanists, but also computer scientists who work in linguistics-informed areas such as NLP (Natural Language Processing)—further negotiate computational semantics of content organization (computer-specific meaning making) in relation to human languages and the semiotic construction of meaning through language.
It is in part such a negotiation upon which Manovich draws in order to anchor some of his juxtapositions between narrative and database, particularly through a delineation of the differing functions of paradigmatic and syntagmatic dimensions of each. These dimensions are important because they function as a core aspect of human language—the logic of syntagmatic grammar and paradigmatic substitution through which we form semiotic meaning in sentences. Here, Manovich argues that "in the case of a written sentence, the words that comprise it materially exist on a piece of paper, while the paradigmatic sets to which these words belong only exist in the writer and reader's minds," and in contrast, in the database, the "paradigmatic dimension" has "material existence" (2001, 230–1). He thus imagines, Hayles describes, that "the paradigmatic possibilities are actually present in the columns and rows, while the syntagmatic progress of choices concatenated into linear sequences by SQL commands is only virtually present" (2012, 180). Hayles disagrees with the idea that databases possess, much less relay, paradigmatic meaning in this way. As content management tools such as the relational database may abstract content (such as text in words and clauses) into individual rows and columns, they force content (and any generation of content through a "transition" across rows of cells) to follow the organizational schema dictated by the database's structure and organization. So, while all content is materially present in the relational database, Hayles stresses that "in neither rows nor columns does [the paradigmatic dimension's] logic of substitution obtain; the terms are not synonyms or sets of alternative terms but different data values" (2012, 180). Her observation of the limits of this model of content management for paradigmatic and syntagmatic meaning making reveals that the way relational databases encourage us to interpret data is not how human language works, how humans make meaning out of language or narrative, or how humans construct meaning through narrative. Hayles' distinction matters to a discussion of figurative meaning because of the formation of figurative meaning through a paradigmatic set of associated meanings. Figurative meaning, which can be described as the association of a signifier (as a word, image, or idea) with potential metaphors, similes, analogies, tropes, and metonymies, is constructed through the paradigmatic dimension—an imaginary set of affiliations that are shaped through composition of, and encounter and practice with, cultural texts and objects. Figurative meaning can therefore only be constructed through a logic of substitution such that a subject can associate a signifier with a set of related meanings—a process of exploratory and imaginary substitutions that
Paradigmatic sets of literal and figurative meaning are thus different albeit related: for example, a paradigmatic set of literal meanings for the word “red” may include the synonyms “crimson,” “rose,” “carmine,” “cherry,” “scarlet,” and “vermilion,” while a paradigmatic set of figurative meanings may include “passion,” “lust,” “rage,” “fever,” and “violence.” The limits of the relational database for representing paradigmatic meaning in literature can be narrowed down to an aspect of what makes literature “literary” in the first place—and one characteristic is its depth of meaning beyond the literal and through the figurative that necessitates qualitative and reflexive analytical methods rooted in literary study, such as close reading. In particular, Hayles proposes that the epistemological differences between database and narrative are rooted in their differing “worldviews” through the element of indeterminacy, as narratives reach for it and databases are designed to avoid it. The element of indeterminacy is attributed as a quality of the literary character of narratives and also encourages close reading for an interpretive exploration of a text’s layers of meaning. Hayles juxtaposes narratives and databases through the indeterminate in this way, arguing that: Narratives gesture toward the inexplicable, the unspeakable, the ineffable, whereas databases rely on enumeration, requiring explicit articulation of attributes and data values … databases in themselves can only speak that which can explicitly be spoken. Narratives, by contrast, invite in the unknown, taking us to the brink signified by Henry James’s figure in the carpet, Kurtz’s ‘The horror, the horror,’ Gatsby’s green light at pier’s end, Kerouac’s beatitude, Pynchon’s crying of Lot 49. (2012, 179) In this string of examples, the figurative is indeterminate insofar as it provokes imagination and a depth of possible meanings: the single image of Jay Gatsby’s green light captures (at the same time that it overwhelms) the character’s yearning for a system of ideals that are epitomized in the character Daisy. His yearning is metaphorized in the unreachable light, the hue of which also represents envy. If Fan: On the Value of Narratives in a Reflexive Digital Humanities 11 there is a way to quantify the depth and affect of these layers of the indeterminate through figurative meaning, we have not necessarily yet found it. On Limits and the Value of Humanistic and Narratological Thinking Digital humanists have actively attempted to ameliorate such differences by drawing upon both digital and humanistic methodologies and philosophies. For instance, Hayles proposes to “locate digital work within print traditions, and print traditions within digital media, without obscuring or failing to account for the differences between them” (7). She has sought to address larger-scale ideas of difference between the logics of meaning making in the humanities and the digital through the specific media with which they are associated and through which they often work. This is the basis of her proposal of a “media-specific analysis” in 2004’s “Print is Flat, Code is Deep: The Importance of Media-Specific Analysis.” Building on the need for media-specific analysis, one of her central arguments in How We Think (2012) is that we require three modes of reading in an era in which “print is no longer the default medium of communication,” naming these modes as close reading, hyper reading, and machine reading (2012, 249). 
As the identification of literary studies with the practice of close reading risks pushing digital reading “to the margins as not ‘really’ reading or at least not compelling or interesting reading,” Hayles examines the value of hyper reading as a necessary method for today’s scholar to engage with all the materials and resources that are made available today (2012, 60).3 Drawing upon James Sosnoski, she also offers examples of hyper reading texts through search queries, filtering with keywords, skimming, hyperlinking, fragmenting, “pecking” (“pulling out a few items from a longer text”), and juxtaposing (a comparative method of reading across, for instance, several open browser tabs and windows) (2012, 61). 3 In regard to the number of books that can hypothetically be read in a single lifetime, Hayles cites Gregory Crane’s argument “that the upward bound for the number of books anyone can read in a lifetime is twenty-five thousand (assuming one reads a book a day from age fifteen to eighty-five)” (2012, 27). Fan: On the Value of Narratives in a Reflexive Digital Humanities 12 A “synergy” or “recursive feedback loop” among close, hyper, and machine reading is thus necessary in an era in which our understanding of communication must take into consideration the specific “affordances and limitations” of individual media systems, as Marie-Laure Ryan describes (2004, n.p.).4 We need methods of reading that are specific to interpreting and scrutinizing the minute of individual texts (close reading), methods of reading that can account for enormous collections of digitized texts (hyper reading), and methods of reading that can process computer code of varying degrees of abstraction (machine reading) (Hayles 2012, 58-72). Hayles’ tripartite model of reading, then, shows that hyper reading and machine reading, which are digitally informed, can also be applied to methods of interpretation and by extension to reading narrative. In this sense, digital methods of information and content engagement can make room or account for narrative forms and narratological thinking. The development of the digital humanities has also seen a surge in literary text analysis projects that take quantitative, data-based, or algorithmic approaches to literary research, representation, and analysis. Twelve years ago, Franco Moretti published Graphs, Maps, Trees: Abstract Models for Literary History (2005), a fascinating re-approach to literary study whereby the data visualization of hundreds of literary texts’ narrative content (through graphs, maps, and trees) allows us to grasp larger trends in literary history through a method he calls “distant reading.” Six years ago, Google Books and Harvard physicists attempted to quantify the English language through a database: drawing upon millions of digitized literary texts, they mapped patterns in the literary usage of words through a method called “culturomics,” whereby language is proven to reflect cultural atmospheres and change (Michel et al. 2011). In the past five years, and with increasing urgency and interest, digital humanists and literary scholars have expanded methods of database 4 For an excellent example of recursive and comparative reading in action, see Reading Project: A Collaborative Analysis of William Poundstone’s Project for Tachistoscope {Bottomless Pit} (2015) by Jessica Pressman, Mark C. Marino, and Jeremy Douglass. 
Fan: On the Value of Narratives in a Reflexive Digital Humanities 13 analysis to consider the computational representation and potential quantification of narrative. Yet, if literature possesses a quality of the indeterminate and if the objective of the database is to avoid the indeterminate, we must question of limits of digital representation itself for analyzing aspects of the literary. The identification of these limitations occurs through two crucial modes of inquiry I describe in the introduction: humanistic thinking and narratological thinking. While relational databases are useful for counting instances, exploring degrees of relationships, visualizing patterns and shifts, and so forth, in the data itself there is little reflexive meaning; as Hayles notes, it needs to be formed through interpretation (2012, 179). When examining databases, meaning and humanistic reflection come in at another layer, in part through additional information and in part through the interpretation of data. A narratological approach to digital text analysis may allow us to expand upon approaches to literary intertext as paratext that is significant to a work’s larger corpus—in write-ups, commentary, footnotes, endnotes, appendices, forwards, afterwards, glosses, and so forth—and to think of extensive metadata itself as an accompanying narrative about a text and its contexts. In particular, by examining descriptive metadata that articulates examples of data content and application, we may construct comprehensive narratives of the processes of content production, management, access, and reception, shaping narratives about the trajectory (the cause-and-effect) of digital humanities projects, tools, and research. To see how humanistic and narratological thinking aid in the identification of the limitations of digital tools for representing literary text, I will discuss a text analysis project that reflects upon these limitations: Network Theory, Plot Analysis (2011), which comes out of Stanford University’s Literary Lab. The Literary Lab, co-directed by Franco Moretti and Mark Algee-Hewitt, houses several collaborative projects that, upon completion, are published on the Lab’s website as research pamphlets. In the project pamphlet of Network Theory, Plot Analysis, Moretti analyzes narratives through the quantification of literary elements and variables. These and similar projects re-visit key ideas and principles of narrative hermeneutics through Fan: On the Value of Narratives in a Reflexive Digital Humanities 14 their mediations of narrative data. At the same time, Moretti’s use of data storage and visualization methods to represent literary features is paired with his reflexive mode of literary criticism, which observes that his methodology may be unable to capture what he calls the “weight” of narrative (2011, 2). The pamphlet’s write-up, which is itself a form of metadata and an integral part of the project’s paratext, is thus revealed to be necessary to understanding the data visualizations. It takes up a storytelling mode to speak to the project’s struggle to negotiate narrative factors with network diagrams. 
Also, it considers this struggle in a way that retains and captures the humanistic inquiry of a digital humanities that is critically reflexive of its own tools and methodologies.5 The pamphlet utilizes network theory in order to visualize relationships between narrative characters, including in 5 This article was written, reviewed, and revised for publication before allegations of intimidation and sexual assault were made against Moretti by several former students (Liu and Knowles 2017). According to these reports, the unproven allegations are under investigation by Stanford University as this article goes to press (editorial and authorial note). Figure 1: Network Theory, Plot Analysis. Literary Lab, Stanford University. 2011. Fan: On the Value of Narratives in a Reflexive Digital Humanities 15 Shakespeare’s Hamlet, the analysis on which I will focus. The research question for a network data visualization such as Figure 1 may be “who speaks to whom and how often?” whereby the most loquacious characters (here, Hamlet, Claudius, and Horatio) occupy more central positions in the network and minor speaking parts are on the outskirts. The data that corresponds to and generates this network could (but does not necessarily) take the form of a relational database, as it is an excellent tool for methodological tasks such as counting the frequency of something. We may say that the parameters of this relational database are also defined by the same research question, “who speaks to whom and how often?” such that characters’ names could be charted on both X and Y axes of a relational database, and their direct encounters could be ticked off. If the research question is intent on studying the frequency and relations of dialogue, a content analysis through the relational database is most apt; however, if the research question inquires more deeply into character relations and dynamics in the plot, then a relational database’s corresponding visualization is not as Figure 2: Network Theory, Plot Analysis. Literary Lab, Stanford University. 2011. Fan: On the Value of Narratives in a Reflexive Digital Humanities 16 clear. For example, Figure 2 is a visualization of deaths in Hamlet. The “region of death” in red illustrates the group of people who kill each other off at the end of Hamlet. The research question may be: “who dies and what is their relationship to other people who also die?” and the relational database parameters that produce the corresponding network may involve two layers: one table for character encounters and one for character deaths. The resulting data visualization allows researchers to compare characters’ encounters in dialogue to the actual people who die, and by doing so, we arrive at an analysis of the significance of certain relationships. The data visualizations developed by the network theory framework noticeably sway us toward the idea that the characters who engage more frequently in dialogue are also the ones that die, especially the ones that die together in the play’s final scene; yet, such a observation might discount the significance of characters such as Polonius and Ophelia, who die earlier in the play. While the pamphlet possesses 57 data visualizations that are derived from the data of corresponding databases, some narrative thinking appears to be necessary to define the parameters of the databases. 
The pamphlet is also necessary to reflect upon how data visualizations encourage us to analyze narrative aspects of Hamlet, especially compared to traditional narrative hermeneutic techniques. The visualizations may be more analytically interesting than charts of data alone, but it is arguable that their main function in the Network Theory, Plot Analysis pamphlet is to complement Moretti's explanation of the processes of the research. We may view this explanation as a narrative itself in the following structure: an original text was studied for features of plot (a narrative of text analysis); multiple narrative views were negotiated according to specific research questions that were derived from narratological thinking (a narrative of data analysis); and this exploration revealed discrepancies between narratological thinking and representing narrative through digital tools (a narrative of quantified text analysis). It is these discrepancies that allow Moretti to identify and ruminate upon a significant idea: the uniqueness and complexity of what he calls narratological "weight" appears to elude his network visualizations. The "weight" of certain events for plot development, for instance, can be difficult to quantify, especially in a way that is easily encoded for data management, graphing, and visualization purposes. To show the possible implications of attempting to quantify this literary weight, Moretti discusses the clustering and positioning of character encounters (see Figure 3).

Figure 3: Network Theory, Plot Analysis. Literary Lab, Stanford University. 2011.

The significance of Hamlet, Claudius, and Horatio is spatially represented by the fact that they occupy central positions in the data visualization. In comparison, the ghost has few lines of dialogue and is therefore on the outskirts of the diagram, equated in spatial significance with characters such as "Gravedigger" or "Norwegian Captain." In fact, the scene between Hamlet and the ghost is of fundamental importance to the rest of the narrative, as it is the ghost who inspires Hamlet's theory that Claudius killed his father and thus his revenge plot. Yet, as the data visualization is unable to represent this weight, in this way, network theory risks reducing and abstracting the plot (Moretti 2011, 3). Matthew Jockers' work in text analysis and plot visualization in Macroanalysis: Digital Methods and Literary History (2013) mollifies this specific issue, building on early work by Kurt Vonnegut in plot diagramming to capture the significance of chronological plot events in a linear series of crests and dips. His work is related to that of the larger research team of Novel TM (to which Jockers belongs), a transnational research project on text mining the novel that is led by Andrew Piper. While the issues of representation that Network Theory, Plot Analysis identifies are being actively tackled, I find equal value in reflections on certain limitations of data-based digital representations and analyses of literature. Reflexive inquiries into digital humanities analysis, tools, and research production produce a text that is able to weave between this reflexive critique and the media-specific analytical affordances offered by digital media.
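The Vonnegut-inspired "crests and dips" approach mentioned above can be suggested with a toy sketch; the segment scores below are invented, and the method shown is a generic simplification rather than Jockers' actual implementation.

    # A toy sketch of the "crests and dips" idea: score each chronological
    # segment of a narrative and trace the running total as a plot arc.
    # The scores are invented; real work derives them from the text itself.
    segment_scores = [0.1, 0.4, -0.2, -0.6, 0.3, 0.8, -0.9]

    running = 0.0
    for i, score in enumerate(segment_scores, start=1):
        running += score
        bar = "#" * int((running + 1) * 10)
        print(f"segment {i}: {running:+.2f} {bar}")

Even in miniature, the arc captures something the encounter network cannot: where in the chronological sequence a story turns.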
In particular, what the observations reveal is that in the processes of both close and hyper reading, we should not make generalizations about content based on the data visualizations or their corresponding databases, as additional knowledge is often needed. In the case of Hamlet, additional knowledge about the play's specific narrative is helpful to effectively analyze the visualizations. And with regard to databases and data visualizations for plots that are not so well known or accessible as Hamlet (for instance, rare texts, texts that are hard to access, or texts that are subject to copyright), it is through additional information and commentary that many of these difficulties are fleshed out. The metadata, here as a formal write-up, is crucial to clarifying where database forms and digital tools can fail, especially through their discrepancies with literary form, content, and hermeneutics. There is critical value in surprises, hiccups, obstacles, and failures, towards "failing better" so to speak.

Representing Paradigmatic Meaning in Non-Relational Databases

Identifying the limitations of these forms and tools critically gestures towards seeking alternative methods of content management that better accommodate and represent on-screen: literary weight, figurative meaning, narrative forms, and linguistic play. In this sense, greater commensurability between the database and narrative (and between the unique cultures or "worldviews" of meaning making to which each belongs) may be met through the design or at least imagining of digital tools that can represent the indeterminate in figurative meaning. There is no "one-size-fits-all" computational tool for content management and representation, requiring that a digital humanities researcher, teacher, or student who is offered multiple possibilities for content mediality and mediation weigh the pros and cons, the affordances and limitations, of various digital tools. Returning to the earlier differentiation between paradigmatic sets of literal and figurative meaning—where a paradigmatic set of literal meanings for the word "red" differs from a paradigmatic set of its figurative meanings—a digital tool that offers a paradigmatic approach to content management would better accommodate and account for both these literal and figurative paradigmatic sets, particularly if it is flexible enough to allow for set editing and expansion. As with when a subject mentally searches for synonyms and metaphors out of their existing vocabularies and can also expand those vocabularies through training and reference, in the same way, a database model with a paradigmatic approach to content could store sets of meanings as imaginary possibilities that could be expanded. To be clear: the relational database also allows for this expansion, because one can theoretically add to it forever. However, the difference between the relational database and a database model with a paradigmatic approach is in its structure: the latter offers a non-relational—that is, non-rigid—schema for storing, organizing, receiving, and engaging with content-as-data. Relational databases are useful for when one chooses parameters and variables that are likely to be content-rich, or when one has the time, reason, and occasion to go through different possible transitions across rows of cells. These methods work best with small amounts of data; however, if a researcher or teacher is trying to organize or interpret an enormous number of cells and transitions, and if many of these cells do not have values, then the relational database fails on the counts of digital scalability, memory, and speed. What happens when we move away from the relational database, or if we at least incorporate other forms of database? In this regard, I do not refer to other traditional databases with SQL encoding models such as attribute-value, network, or hierarchical databases, but to a more recent paradigm of information organization: publicly introduced in 2009, and with particular significance starting around 2012, computer scientists have begun to embrace the NoSQL ("no" or "not only" SQL) movement, pushing for non-relational databases (Dourish 2014, n.p.). NoSQL is a database model that takes on several
What happens when we move away from the relational database, or when we at least incorporate other forms of database? In this regard, I do not refer to other traditional databases with SQL encoding models, such as attribute-value, network, or hierarchical databases, but to a more recent paradigm of information organization. Publicly introduced in 2009, and gaining particular significance around 2012, the NoSQL ("no" or "not only" SQL) movement has been embraced by computer scientists pushing for non-relational databases (Dourish 2014, n.p.). NoSQL is a database model that takes on several formats, including a document style that allows data to be organized in groups and a graph model that can resemble a network.6

6 It is entirely possible that the graph format for the Network Theory, Plot Analysis pamphlet was used because of the objective of generating network visualizations. I have analyzed these visualizations as if they were constructed through relational databases only because the accompanying observations about the potential limitations of network models for text analysis occur through, and therefore serve to underline, issues of rigidity in relational databases.

NoSQL was developed so that programmers could code and alter data more easily through less rigid schemas. As one aspect of this effort, content is less abstract and less isolated in individual cells, often offering textual context in a way that can be read as metadata. NoSQL organizes data into a flow-chart form, with keys that can be defined by any consistent variable, such as a list of course codes or a series of dates. The particular trait, and thus particular format, of NoSQL databases that I want to focus on is the document-oriented database's ability to group together multiple values for each key (called a key-value or attribute-value store). Whereas in the relational database each individual cell contains one value, in a NoSQL database each key can contain a group of values (see Figure 4). For instance, for the database "ENG_101" (a course called "ENG 101"), each student could have associated values such as "name," "major," and "student id."

Figure 4: A relational database compared to a document-oriented non-relational database.

Rather than being structured in relational tables, the key-value model of data management, especially through document-oriented databases, can organize content to more closely resemble the paradigmatic dimension of language. Having associated values grouped together would allow multiple values to be read together as a set of paradigmatic words or associations, so that the values of the key "red" can contain "passion," "anger," "fever," and so forth, thereby offering an embodied version of a paradigmatic logic of substitution (see Figure 5). Additional values can also be added to the group of values through client reading and writing (user engagement).

Figure 5: Document-oriented database containing the key "red".
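A minimal sketch of this key-value grouping, using plain Python dictionaries as a stand-in for a real document-oriented engine; the keys "ENG_101" and "red" follow the examples above, while the remaining details are invented for illustration.

```python
# Documents group values under a key; there is no fixed schema,
# so an absent attribute is simply absent, not a NULL cell.
course_db = {
    "ENG_101": [
        {"name": "A. Student", "major": "English", "student_id": 101},
        {"name": "B. Student", "major": "History"},  # no student_id field
    ],
}

paradigm_db = {
    "red": ["passion", "anger", "fever"],  # a paradigmatic set of substitutions
}

# "Client reading and writing": the set grows without touching any schema.
paradigm_db["red"].append("revolution")
print(paradigm_db["red"])
# -> ['passion', 'anger', 'fever', 'revolution']
```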
Applying Non-Relational Databases to Narratives: Towards Representing Figurative Meaning

The layers of meaning in a figurative text, and the weight they carry to tell a story, find a computational albeit imprecise counterpart in the paradigmatic approaches to content in non-relational databases. Such alternative content management models are important as we consider emerging forms of literature that increasingly trouble how we think of "narrative" and that highlight the difficulty that many digital tools have with capturing or supporting elements of figurative meaning. One particularly important shift in re-thinking "narrative" that has also reframed its relationship with the database as a dynamic is the cultural practice and advent of narratives that are digitally composed and informed: hypertexts, born-digital narratives, and online texts that do not necessarily embody a linear cause-and-effect form. Through the introduction of intermedial and transmedial (and thus trans-spatial and trans-temporal) qualities to cultural texts, digital narratives foster conversations about what it means to read and write creatively in and across various media platforms. For instance, variances have been identified between analogue and digital forms of textuality, together with a complexity in their interrelations (see Ryan 2003; Hayles 2004; Morris 2006; Hayles 2008; and so forth). One premise in studies of new media, electronic literature, and literary studies is that in a so-called digital era, narrative persists, and often, that narrative resists by mediating its medium-specificity and the unique material circumstances of its meaning making (see Fitzpatrick 2002; Goldsmith 2011).

Digital narratives are especially unique because of their representational duality: on-screen narrative content always corresponds to what is behind the screen, namely computational methods of storing and managing content as data. The implications of this duality for digital writing lead Alan Liu (2004a) to identify a deviation of rigid database, markup schema, and encoding formats from the textual practices of older communication forms such as print. He argues that such rigid factors can confine writing, and today's creative writer, to the structure and content dictated by the database format and its behavioural parameters; this effect potentially leads to the author's "surrender[ing] the act of writing to that of parameterization" (59). These literary shifts, and the attempts to grasp them in interpretive analysis and digital tools, prompt a humanistic and narratological inquiry into the place of the digital relative to older literary genres and styles that are difficult to represent because of their layers upon layers of meaning. How, for instance, can we use a database to represent the relationship between image and text in the graphic novel? How can we represent the technique of literary stream-of-consciousness and the thematics of disorientation and fragmentation that might provoke it? In terms of ontology, how would we visualize temporality versus duration, or how could we visualize reference and memory?
Towards addressing some of these questions, I will briefly discuss a NoSQL approach to representing the properties of literary weight and semiotic depth in the construction of a world and its specifically defined ontology: its properties of time, space, and narrative movement, which I describe as an "imaginary ontology." Databases could be described as imaginary ontologies insofar as they create defining parameters of being within which things are, and through which events (which is to say, the phenomenological reception and mediation of content) can emerge. I am specifically interested in literary texts that trouble what we mean by "narrative" or "book" through their mediation of the computerization of culture and representation, such as Mark Z. Danielewski's House of Leaves (2000). House of Leaves is a novel that plays with the cultural form of the print-based narrative at the same time that it composes this narrative through a collection of technological mediations. In doing so, it reflects the changing face (and body) of literature in a cultural era when narrative and database engage in recursive dynamicism. In fact, it is the structure and format of the novel's mediations that give House of Leaves its digital character and that characterize it as a kind of database itself. Others have described the labyrinthian and networked structure of the novel through its text, intertext, and paratext, as the novel offers multiple perspectives on the same events (see Hamilton 2008; Pressman 2006). It has at least four narrative trajectories, it employs various literary and mediated techniques to represent them, and it offers associated ideas, characters, and other information in a multi-linear way. This structure is comparable to the way content (as data) is stored in a database, as data must be hashed together from discrete locations to create digital objects and images. The scattering of texts, images, and symbols about the pages of House of Leaves means that the narrative is only formed through the compositing of fragments of information, and this is also what makes House of Leaves inherently literary. The reader is told at the beginning of the novel that the central object of the plot (the house) cannot physically exist; however, the plot revolves around the mystery of the house and its mediations by other characters. That is: the content of the novel itself does not exist without mediations (Hansen 2004, 628). Expanding this series of mediations further, the novel's narrative and meanings are constructed by the reader's mediations of textual fragments into a formed "story" through their narratological thinking. The reader's reflection upon and engagement with such fragments, with the semiotic depth of texts and images as well as with layers of mediation, draws out their paradigmatic as well as literary meanings. The agential subject's mediations and narratological thinking, House of Leaves shows, remain central to the construction of meaning, whether the content is stored in a database or presented as a book of literature. To this effect, the novel offers a recursive feedback loop between narrative and database that is intent on encompassing the reader's mediations of the novel itself (what Mark B. N. Hansen describes as "copies with a difference" (2004, 618)) and that legitimizes them as part of the novel's (para)textual corpus.
House of Leaves is notably difficult to represent as data. To represent the intermediality, multimodality, and multi-linearity of the text in a relational database would result in a large collection of tables, many of which would be filled with empty values. A researcher might then have to compare the data in dozens of different visualizations while also addressing the database tables to which those visualizations refer. A digital text analysis project on House of Leaves through a non-relational database would ideally:

1. represent the layers of meaning in its narrative content, including through literary elements such as metaphor and trope;
2. represent the novel's discrete methods and instances of mediating the same narrative idea, space, or event; and
3. represent the various ways in which each of these methods and instances overlap and interact with each other.

For example, consider that the narrative event in which Will Navidson and his friends enter a labyrinthian hall occurs as a documentary scene, and that the reader does not have access to this footage but, instead, to the character Zampanò's textual mediation of the cinematic moment. Zampanò's text is also accompanied by 1) his footnotes on this labyrinth event, 2) the character Johnny Truant's footnotes on Zampanò's text, and 3) a comic book depiction of the scene in the appendices of the novel. To organize and encode the labyrinth event with a document-oriented NoSQL database, one possibility would be to list all of these mediations under the key "labyrinth," along with their associations with the novel's figurative themes and metaphors (see Figure 6 and the sketch at the end of this section). This database could thus be set up so that a search for "labyrinth metaphor" could return the values "haunting," "monstrosity," and "uncanny," allowing a reader to piece together these literary associations. The reader could also search for "labyrinth text" to discover the other ways in which the hallway scene is represented in the novel: "Zampano's manuscript," "Zampano's footnotes," "Johnny's footnotes," and "comic."

Figure 6: Document-oriented database containing the key "labyrinth".

This is only one example of the way that a paradigmatic approach to content management and representation can better account for figurative meaning in literary narrative, especially vis-à-vis digitally informed shifts in how we think of narrative creation, creativity, and engagement. Mark Z. Danielewski's current endeavour The Familiar is a proposed 27-volume experimental book project that functions as an imagined layering of different characters over different times. It would be very difficult to represent this feat of temporal relativism in a digital humanities project, or arguably even in traditional literary hermeneutics, without further consideration of how it has been influenced by contemporary computational models of intertextual and networked content organization.
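Before moving on, here is a minimal sketch of the "labyrinth" document described above, again using plain Python dictionaries as a stand-in for a document-oriented store; the grouped values follow the examples given in this section, while the search helper is hypothetical.

```python
# One document groups the labyrinth event's figurative associations
# and its discrete mediations, so faceted searches return whole sets.
house_of_leaves = {
    "labyrinth": {
        "metaphor": ["haunting", "monstrosity", "uncanny"],
        "text": ["Zampano's manuscript", "Zampano's footnotes",
                 "Johnny's footnotes", "comic"],
    },
}

def search(db, key, facet):
    """Return the grouped values stored under key/facet, if any."""
    return db.get(key, {}).get(facet, [])

print(search(house_of_leaves, "labyrinth", "metaphor"))
# -> ['haunting', 'monstrosity', 'uncanny']
print(search(house_of_leaves, "labyrinth", "text"))
# -> ["Zampano's manuscript", "Zampano's footnotes",
#     "Johnny's footnotes", 'comic']
```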
Feedback Loops: Between the "Known" and the "Unknown"

In academic discourse, as with digital tools, variously flexible methods of representing shifting concepts of narrative play, and of negotiating them with our expectations of literary form, genre, and convention, can be posed as alternative modes of creative and critical inquiry, particularly in the next steps of the (digital) humanities' project. By this, I do not mean to imply that the digital can match or account for all aspects of the literary; the separate togetherness of media-specific analysis, and of media-specific analytical affordances, initiated my inquiry into the narrative and the database in the first place, and also encouraged me to draw upon Hayles' proposal of more synergistic approaches that resemble a "recursive feedback loop" between the digital humanities and the traditional humanities (2012, 32). Where we might move on from here is a return to a question posed in the introduction: a consideration of what aspects of the humanities may be at risk in the use of tools and text forms with distinct epistemological worldviews. After pondering this with a focus on figurative meaning, I also find value in inverting the question. What aspects of the humanities and the literary might be upheld through the dynamic of a recursive feedback loop between the humanities and the digital, the narrative and the database? For one, reflexive observations reveal that research questions, surprises, and limitations are an origin or catalyst for a recursive feedback loop, as they necessitate a back-and-forth negotiation between what functions well (what we "know") and what asks us to pause and think (what we do not "know"). The ultimate gesture of this negotiation might be understood in the vein of what Alan Liu calls "the ethos of the unknown": a political mode that is rooted in humanistic philosophy (and in advocacy for this philosophy) by way of varying degrees of the critical infiltration, hacking, and implosion of the systems of post-industrial information culture (2004b, 9; 2004b, 294). It is such post-industrial systems that help to shape the expansive scope and rigidity of computational data management in the name of efficiency, function, and performance. Machines and databases prefer axioms, answers, and the determinate over reflections, interpretations, and the indeterminate, bringing this paper full circle in its consideration of the place of humanistic and narratological thinking amongst the digital. Machines are designed to function as asked, such that they are meant to resist the indeterminate. This paper has sought to study their limitations in a way that is not meant to move in the direction of quantifying the figurative and determining the indeterminate; rather, it moves towards database forms (and content representation forms) that are more ludic, like language and like all the things we can say in a single word, a phrase, a look, a light. I have sought to highlight the value of the indeterminate and the unknown in necessitating an ongoing comparison, juxtaposition, interpretation, and reflection of tools and work. In between the known and unknown (or, what is in between that which functions as defined, rigid, and expected and that which requires us to ponder, interpret, critique, remember, and return) is the weight and wait of the question, where even that which is determined can start to fall apart.

Acknowledgements

Thank you to computer scientist Andrew Qu Yang for his suggestions on alternative database forms and content management in the research stage of this paper.

Competing Interests

The author has no competing interests to declare.

References

Bal, Mieke. 2009. Narratology: Introduction to the Theory of Narrative. Toronto; Buffalo, NY: University of Toronto Press.
Danielewski, Mark Z. 2000. House of Leaves. New York, NY: Pantheon Books.
Dourish, Paul. 2014. "No SQL: The Shifting Materialities of Database Technology." Computational Culture: A Journal of Software Studies 4: n.p. Accessed April 3, 2015. http://computationalculture.net/article/no-sql-the-shifting-materialities-of-database-technology.
Fitzpatrick, Kathleen. 2002. "The Exhaustion of Literature: Novels, Computers, and the Threat of Obsolescence." Contemporary Literature 43(3): 518–559. DOI: https://doi.org/10.2307/1209111
Goldsmith, Kenneth. 2011. "Revenge of the Text." In Uncreative Writing. New York, NY: Columbia University Press.
Hayles, N. Katherine. 2004. "Print is Flat, Code is Deep: The Importance of Media-specific Analysis." Poetics Today 25(1): 67–90. DOI: https://doi.org/10.1215/03335372-25-1-67
Hayles, N. Katherine. 2008. "Future of Literature: Print Novels and the Mark of the Digital." In Electronic Literature: New Horizons for the Literary. Notre Dame, IN: University of Notre Dame.
Hayles, N. Katherine. 2012. How We Think: Digital Media and Contemporary Technogenesis. Chicago, IL: University of Chicago Press.
Herman, David. 2009. Basic Elements of Narrative. Chichester, UK; Malden, MA: Wiley-Blackwell. DOI: https://doi.org/10.1002/9781444305920
Jockers, Matthew. 2013. Macroanalysis: Digital Methods and Literary History. Urbana, IL: University of Illinois Press.
Liu, Alan. 2004a. "Transcendental Data: Toward a Cultural History and Aesthetics of the New Encoded Discourse." Critical Inquiry 31: 49–84. DOI: https://doi.org/10.1086/427302
Liu, Alan. 2004b. The Laws of Cool: Knowledge Work and the Culture of Information. Chicago, IL: University of Chicago Press. DOI: https://doi.org/10.7208/chicago/9780226487007.001.0001
Liu, Fangzhou, and Hannah Knowles. 2017. "Harassment, Assault Allegations against Moretti Span Three Campuses." Stanford Daily, November 16, 2017. https://www.stanforddaily.com/2017/11/16/harassment-assault-allegations-against-moretti-span-three-campuses/.
Manovich, Lev. 1999. "Database as a Symbolic Form." Convergence: The Journal of Research into New Media Technologies 5(2): 80–99. DOI: https://doi.org/10.1177/135485659900500206
Manovich, Lev. 2001. The Language of New Media. Cambridge, MA: MIT Press.
Michel, Jean-Baptiste, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray, The Google Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak, and Erez Lieberman Aiden. 2011. "Quantitative Analysis of Culture Using Millions of Digitized Books." Science 331: 176–182. DOI: https://doi.org/10.1126/science.1199644
Moretti, Franco. 2005. Graphs, Maps, Trees: Abstract Models for a Literary History. London, UK; New York, NY: Verso.
Moretti, Franco. 2011. Network Theory, Plot Analysis. Literary Lab, Stanford University. Accessed March 7, 2015. http://litlab.stanford.edu/LiteraryLabPamphlet2.pdf.
Morris, Adalaide. 2006. "New Media Poetics: As We May Think/How to Write." In New Media Poetics: Contexts, Technotexts, and Theories, edited by Adalaide Morris and Thomas Swiss. Cambridge, MA: MIT Press.
Pressman, Jessica. 2006. "House of Leaves: Reading the Networked Novel." Studies in American Fiction 34(1): 107–128. DOI: https://doi.org/10.1353/saf.2006.0015
Pressman, Jessica, Mark C. Marino, and Jeremy Douglass. 2015. Reading Project: A Collaborative Analysis of William Poundstone's Project for Tachistoscope {Bottomless Pit}. Iowa City, IA: University of Iowa Press.
Ryan, Marie-Laure. 2003. "On Defining Narrative Media." Image and Narrative 6: n.p. Accessed March 27, 2015. http://www.imageandnarrative.be/inarchive/mediumtheory/marielaureryan.htm.
Tabbi, Joseph, and Michael Wutz. 1997. "Introduction." In Reading Matters: Narrative in the New Media Ecology, edited by Joseph Tabbi and Michael Wutz. Ithaca, NY: Cornell University Press.

How to cite this article: Fan, Lai-Tze. 2018. "On the Value of Narratives in a Reflexive Digital Humanities." Digital Studies/Le champ numérique 8(1): 5, pp. 1–29. DOI: https://doi.org/10.16995/dscn.285

Submitted: 29 October 2017. Accepted: 29 October 2017. Published: 27 March 2018.

Copyright: © 2018 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.

Digital Studies/Le champ numérique is a peer-reviewed open access journal published by Open Library of Humanities.
work_5vjwfih7szgbplasjgajnmvn2y ---- SSRN-Deckblatt-neu-neu-brunschwig

Max Planck Institute for European Legal History, Research Paper Series No. 2018-03 · http://ssrn.com/abstract=3126043
www.rg.mpg.de
Published under Creative Commons cc-by-nc-nd 3.0

Colette R. Brunschwig*
Perspektiven einer digitalen Rechtswissenschaft: Visualisierung, Audiovisualisierung und Multisensorisierung (Perspectives of a Digital Legal Science: Visualization, Audiovisualization, and Multisensorization)

Abstract

The importance of visual, audiovisual, and multisensory media is growing. The latter are hybrid media that address not only the senses of sight and hearing but also other senses, such as the senses of smell, touch, and movement. In view of this media development, both the digital humanities and what might, as a working hypothesis, be called a digital legal science are investigating visualization, audiovisualization, and multisensorization. So far, the two fields have largely gone their separate ways, without influencing each other, even though, as far as visualization, audiovisualization, and multisensorization are concerned, they grapple in part with similar problems and questions. The present essay aims, and this is intended as its innovative contribution, to bring the two fields closer together with regard to these research subjects and thereby to contribute to interdisciplinary insight. It does so, on the one hand, by way of reflections prompted by the centenary conference of the Zentralbibliothek Zürich (Switzerland); on the other hand, the essay goes far beyond the themes of that conference.

* Research Associate, University of Zurich, Faculty of Law, Center for Legal History Research, Department of Legal Visualization, Rämistrasse 74/52, 8001 Zürich, CH, colette.brunschwig@rwi.uzh.ch; https://www.ius.uzh.ch/de/research/units/zrf/abtrv/brunschwig.html. An English translation of this text is available online at https://forhistiur.de/2019-05-brunschwig/abstract/.
1. Prelude

On the occasion of its centenary, the Zentralbibliothek Zürich (university library, hereinafter ZBZ) issued invitations to a conference. This multi- and interdisciplinary meeting was intended to offer a forum for discussing digital research data in the present and in the future. The conference title was "Die Bibliothek vernetzt: Infrastrukturen für Forschungsdaten in den Geisteswissenschaften" (The Networked Library: Infrastructures for Research Data in the Humanities).1

1 Cf. Universität Zürich. "Fachtagung anlässlich des 100-Jahr-Jubiläums der ZB." Accessed July 14, 2017. http://www.bibliothek-vernetzt.uzh.ch/de.html. So that readers can orient themselves more quickly, I cite websites and abstracts in full in the footnotes; for this reason, these sources do not appear again in the bibliography.

Speakers from a variety of fields reported on their activities, their knowledge, and their experience, describing the developments and challenges they face in view of the ongoing digital transformation.

I attended this conference in my role as a research associate at the Faculty of Law of the University of Zurich (Department of Legal Visualization), taking in the information presented from the perspective of a legal scholar who deals primarily with the topics of "visualization," "audiovisualization," and "multisensorization."2 These subjects can mainly be located in the foundational legal disciplines, which include, for example, legal history, legal theory, legal sociology, legal psychology, and legal informatics.3

2 Cf. Brunschwig, Visualisierung von Rechtsnormen (2001); id., "Multisensory Law and Therapeutic Jurisprudence" (2012); id., "Law Is Not or Must Not Be Just Verbal and Visual in the 21st Century" (2013); and id., "Multisensory Law" (2016).

3 Legal informatics can be regarded as a foundational legal discipline insofar as it deals with tools and methods and, in doing so, transcends the problems, questions, and topics of legal doctrine. If I see it correctly, it has become fashionable today to use the terms legal technology or legal tech for this part of legal informatics. At present, a great deal of money is also being made with this buzzword, or under this label.

Fig. 1

What is to be understood by "visualization"? On the one hand, the term denotes the process of visualizing; on the other hand, the product that emerges from this process. As far as the visualization process is concerned, it matters, from a legal perspective,

– who visualizes: lawyers, law students, legal laypersons such as designers and computer scientists, and so forth;
– why and for what purpose something is visualized;
– which legal contents are visualized, for example norms of legal enactments, parts of judicial rulings and decisions, contents of legal scholarship (research and teaching), contents of state legal practice and of private legal practice (e.g., contracts), legally relevant facts, and so forth;
– with which media visualization takes place: analogue and/or digital media (hardware and software);
– with which semiotic code visualization takes place;
– which visualization methods are applied.

With regard to visualization as a product, the following considerations arise, again from a legal perspective:

– Which legal contents actually appear in verbo-visual or visual form?
– In which medium does the visual product manifest itself (mediality)?
– In which semiotic code does it appear (codality)?
– Which kind of perception does the visualization address, and/or what kind of perception is it itself capable of in certain circumstances, for instance in the case of a camera connected to the visualization (modality)?
– Who are the recipients of the visualization: lawyers, law students, legal laypersons?
– What effects does the visualization have on its recipients? In other words, how do the recipients experience the visualization, and how do they behave towards it?

These considerations on "visualization" can be transferred, mutatis mutandis, to the research subjects of "audiovisualization" (animations, videos, films) and "multisensorization" (virtual realities, games, humanoid robots).
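As a purely hypothetical sketch, the process and product questions above can be read as a metadata schema for legal visualizations; the following Python dataclass, with field names of my own invention rather than any established standard, shows one way such a schema might be recorded for an individual visualization.

```python
# Hypothetical schema for describing a legal visualization along the
# dimensions listed above (producer, purpose, content, mediality,
# codality, modality, audience).
from dataclasses import dataclass, field
from typing import List

@dataclass
class LegalVisualization:
    producer: str            # lawyer, law student, designer, ...
    purpose: str             # why and for what purpose
    content: str             # e.g., statutory norms, contract clauses
    medium: str              # mediality: analogue or digital tool
    semiotic_code: str       # codality: e.g., diagram, pictogram
    modalities: List[str] = field(default_factory=lambda: ["visual"])
    audience: List[str] = field(default_factory=list)  # recipients

example = LegalVisualization(
    producer="legal designer",
    purpose="explain a contract clause to laypersons",
    content="termination clause",
    medium="digital (vector graphics)",
    semiotic_code="flowchart",
    audience=["laypersons"],
)
print(example)
```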
1.1 Background

From the standpoint of the theory of science, legal science counts as a humanities discipline and as a social science. Within the present framework, it is neither necessary nor possible to substantiate this statement; instead, I refer to a few publications that set out reasons for it.4

4 On legal science as a humanities discipline, cf., e.g., Obermayer, "Rechtswissenschaft als Geisteswissenschaft" (1987); Kretschmer, Rechts- als Geisteswissenschaft (2007); and Balkin and Levinson, "Law and the Humanities," 156. On legal science as a social science, cf., e.g., Büllesbach, "Rechtswissenschaft und Sozialwissenschaft," 401, 403.
Mutatis mutandis erscheint es ebenso sinnvoll wie notwendig, derartige Fragen in anderen rechtlichen Grundlagenfächern zu stellen. Während der Tagung begegnete ich den Vorträgen mit zwei grundlegenden Fragen: 1. In- wiefern sind deren Inhalte bedeutsam für die rechtlichen Grundlagenfächer, insbesondere was die Themen „Visualisierung“, „Audiovisualisierung“ und „Multisensorierung“ angeht? 2.  Inwiefern könnten meine Erfahrungen und mein Wissen etwas zu dem Vorgetragenen beitragen, und zwar mit Fragen, Informationen und Einschätzungen? Während ich die vor- liegende Arbeit zu Papier brachte, brach eine dritte Frage auf: Welche über die Tagung hi- nausgehenden Reflexionen lassen sich aus rechtswissenschaftlicher Perspektive anstellen, ohne dabei den Fokus auf die Visualisierung, Audiovisualisierung und Multisensorisierung aus den Augen zu verlieren? Die „multisensorische“ Ausrichtung der aufgeführten Kernfragen rechtfertigt sich letzt- lich nicht durch meine persönlichen Forschungsvorlieben. Vielmehr legen die DH selber eine derartige epistemologische Orientierung nahe. Sich auf McPherson berufend, präsen- tiert Svensson eine Typologie von DH, zu der man die multimodal humanities zählt: „The multimodal [meine Hervorhebung] humanities bring together scholarly tools, databases, networked writing and peer-to-peer commentary while also leveraging the potential of the visual and aural media that are part of contemporary life.“6 Zur Multimodalität oder Multisensorik bemerken überdies Burdick et al. unter dem Titel „What defines the Digital Humanities now?“: And the notion of the primacy of text is being challenged. Whereas the initial wavers of computational humanities concentrated on everything from word frequency studies and textual analysis (classificati- on systems, mark-up encoding) to hypertext editing and textual database construction, contemporary Digital Humanities marks a move beyond a privileging of the textual, emphasizing graphical methods of knowledge production and organization, design as an integral component of research, transmedia crisscrossings, and an expanded concept of the sensorium of humanistic knowledge.7 Solche nicht-verbozentrischen Perspektiven der gegenwärtigen DH bringen es mit sich, dass „some of the major sectors Digital Humanities research extend outside the traditional core of the humanities to embrace quantitative methods from the social and natural sciences as well as techniques and modes of thinking from the arts.“8 Der vorliegende Aufsatz gliedert sich nach dem Ablauf der ZBZ-Tagung. Ohne einen An- spruch auf Vollständigkeit zu erheben, zielt er darauf ab, die aufgeworfenen Schlüsselfragen zu beantworten. 6 Svensson, „The Landscape of Digital Humanities,“ Note 14. 7 Burdick et al., Digital_Humanities, 122. 8 Burdick et al., Digital_Humanities, 122. Colette R. Brunschwig 6 Max Planck Institute for European Legal History Research Paper Series No. 2018-03 2. Forschungsprojekte der Universität Zürich: Anforderungen an Services und Infrastrukturen 2.1 Lavater-Briefedition Die Handschriftenabteilung, eine Spezialsammlung der ZBZ, besitzt über 20 000 Briefe Johann Caspar Lavaters.9 Die Vortragenden Ursula Caflisch-Schnetzler und Barbara Naumann verfolgen das Ziel, diese Briefe zu digitalisieren und in einer historisch-kritischen Edition zu erschliessen. Dabei planen die beiden Wissenschaftlerinnen auch eine analo- ge Edition. 
Gemäss ihren Angaben bezieht sich dieser Briefwechsel auf religiöse, philoso- phische, pädagogische, literarische und naturwissenschaftliche Themen. Ausserdem lassen sich aus den digitalisierten Briefen Informationen extrahieren, die über Lavaters Kommu- nikationsnetzwerk und seine Strukturen Auskunft geben. Deswegen visualisieren die Wis- senschaftlerinnen diese kommunikativen Netzwerke in Form von Graphen (graphs). Diese Graphen setzen sich aus Kanten (Linien, edges) und Knoten (nodes) zusammen.10 Der von den Referentinnen gezeigte Graph breitet sich über eine Karte (geographic map)11 aus, auf der man europäische Länder erkennt. Der Graph präsentiert viele dyadische Beziehungen, die Lavater offenbar mit seinen Korrepondenzpartnern hatte.12 In den DH ist es heute üblich, Visualisierungen zu produzieren und/oder bereits existie- rende Visualisierungen zu verwenden, sie zu beschreiben, zu interpretieren und zu evaluie- ren.13 Caflisch-Schnetzlers und Naumanns Visualisierungsbestrebungen lassen sich somit in diesen wissenschaftlichen Kontext einordnen. Hier stellen sich Fragen, wie beispielsweise: Was wird visualisiert? Mit welchen Methoden (methods) und Medien (tools) wird visualisiert? Welche Arten von Visualisierungen lassen sich ausmachen? Wie lassen sie sich beschreiben, interpretieren und evaluieren? Caflisch-Schnetzlers und Naumanns Kombination aus graph and geographic map mag Betrachtende dazu veranlassen zu fragen, was diese visuellen Medien tatsächlich zu leisten vermögen – und was eben nicht. Die grosse Anzahl der Linien lässt einen eine Vorstellung davon bekommen, mit wie vielen Akteuren Lavater korrespon- dierte. Es fehlen indessen Knotenattribute, wie z. B. das Geschlecht, die Nationalität, der wis- 9 Vgl. Universität Zürich, Deutsches Seminar. „Edition Johann Caspar Lavater.“ Zugriff am 14. Juli 2017. http://www.lavater.uzh.ch/de.html. 10 Vgl. Stegbauer und Rausch, Einführung in NetDraw, 4–10. 11 Zum Einsatz von Geographic Information Systems (GIS) in den DH vgl. z. B. Murrieta-Flores, Donald- son und Gregory, „GIS and Literary History,“ Noten 1–39. 12 Folie 5 von Caflisch-Schnetzlers und Naumanns Präsentation zeigt diesen Graphen. Zusammen mit anderen Präsentationen, die anlässlich der ZBZ-Tagung gezeigt worden sind, lässt sich diese PowerPoint-Präsentation auf der Konferenzwebsite herunterladen: Vgl. Universität Zürich. „Fachtagung anlässlich des 100-Jahr-Jubiläums der ZB.“ Zugriff am 14. Juli 2017. http://www.bibliothek-vernetzt.uzh. ch/de.html. 13 Vgl. z. B. Verbert, Katrien, „On the Use of Visualization for the Digital Humanities.“ Konferenzabstract, dh2015.org, [s.d.]. Zugriff am 14. Juli 2017. http://dh2015.org/abstracts/xml/VERBERT_Katrien_On_ the_Use_of_Visualization_for_t/VERBERT_Katrien_On_the_Use_of_Visualization_for_the_Dig.html, und Bubenhofer, „Drei Thesen zu Visualisierungspraktiken in den Digital Humanities,“ 351–55. http://www.bibliothek-vernetzt.uzh.ch/de.html http://www.bibliothek-vernetzt.uzh.ch/de.html http://dh2015.org/abstracts/xml/VERBERT_Katrien_On_the_Use_of_Visualization_for_t/VERBERT_Katrien_On_the_Use_of_Visualization_for_the_Dig.html http://dh2015.org/abstracts/xml/VERBERT_Katrien_On_the_Use_of_Visualization_for_t/VERBERT_Katrien_On_the_Use_of_Visualization_for_the_Dig.html Colette R. Brunschwig 7 Max Planck Institute for European Legal History Research Paper Series No. 2018-03 senschaftliche Hintergrund. Solche Attribute würden es einem erlauben, Näheres über die besagten Akteure zu erfahren. 
Die Kanten sind linien- und nicht pfeilförmig, so dass nicht erkennbar wird, ob die Beziehung zwischen Lavater und seinem jeweiligen Adressaten ein- seitig, also asymmetrisch war oder wechselseitig, mithin symmetrisch. Alle Linien sind gleich dünn. Sie informieren folglich nicht darüber, wie intensiv die jeweilige Briefbeziehung sich gestaltete. Es fragt sich darum, ob die Referentinnen noch andere Visualisierungen kreiert haben, welche die hier fehlenden Informationen enthalten. Aus Anlass des 200-jährigen Jubiläums der ZBZ könnten künftige Rechtshistoriker die E-Mail-Kommunikation von Rechtswissenschaftlern (key players in the field) aus den Jahren 2000 bis 2017 untersuchen. Ich denke z. B. an Rechtswissenschaftler, welche heute die Visu- alisierung und Audiovisualisierung rechtlicher Inhalte erforschen. Sollte diese E-Mail-Kom- munikation noch nicht gelöscht, ja abrufbar sein, könnten Rechtshistoriker der Zukunft fragen, mit wem die besagten Rechtsakteure in Verbindung standen, aus welchen Ländern diese Partner stammten und welche Themen sie miteinander diskutierten. Ich könnte mir vorstellen, dass – hundert Jahre später – Rechtshistoriker auch Audio-Elemente in ihre Graphen und Kanten einbinden werden. 2.2 Variantengrammatik des Standarddeutschen Christa Dürscheid und Don Tuggener stellten dem Auditorium ihr Projekt „Varianten- grammatik des Standarddeutschen“ vor. Es hat zum Ziel, „auf der Grundlage eines Korpus von knapp 600 Millionen Wortformen die Variation in der Grammatik des Deutschen zu erfassen“.14 Die Resultate, welche die Forschung der beiden Sprachwissenschaftler zeitigen wird, werden „für die Grammatikforschung und den Sprachunterricht relevant sein und ei- nen praktischen Nutzen für alle Personen haben, die Auskünfte zur Standardsprachlichkeit grammatischer Varianten wünschen.“15 Soweit ich erkennen kann, dürfte das zur Debatte stehende Projekt für das Thema „Vi- sualisierung, Audiovisualisierung und Multisensorisierung rechtlicher Inhalte“ nichts „her- geben“. Es fragt sich allerdings, ob nicht die Rechtslinguistik (Sprache und Recht (philo- logischer Ansatz) bzw. Recht und Sprache (rechtswissenschaftlicher Ansatz))16 dem zur Debatte stehenden Projekt etwas abgewinnen könnte: Wie sind z. B. Rechtstexte (Gesetze, 14 Dürscheid, Christa und Don Tuggener. „Abstract.“ In Kurzbiographien und Abstracts zu den Tagungsbeiträ- gen, hrsg. v. Zentralbibliothek Zürich, 3. Zürich: [kein Verlag], 2017, 3. Zu diesem Forschungsprojekt: Universität Zürich, Universität Salzburg und Universität Graz. „Variantengrammatik des Standarddeut- schen.“ Zugriff am 14. Juli 2017. http://www.variantengrammatik.net. 15 Ibid. 16 Das Zentrum für Rechtssetzungslehre (Universität Zürich) veranstaltet rechtslinguistische Kolloquien: Vgl. Universität Zürich, Rechtswissenschaftliches Institut, Zentrum für Rechtsetzungslehre. „Weiterbil- dung.“ Zugriff am 14. Juli 2017. https://www.rwi.uzh.ch/de/oe/ZfR/weiterbildung.html. Die Philosophi- sche Fakultät der Universität zu Köln bietet einen Studiengang „Europäische Rechtslinguistik“ an: Vgl. Universität zu Köln, Philosophische Fakultät, Europäische Rechtslinguistik. „Europäische Rechtslinguis- tik: Das Konzept.“ Zugriff am 14. Juli 2017. http://erl.phil-fak.uni-koeln.de/11925.html. Colette R. Brunschwig 8 Max Planck Institute for European Legal History Research Paper Series No. 2018-03 Verordnungen, Verträge etc.) auszulegen, die in Deutschland, Österreich und in der deutsch- sprachigen Schweiz ihren Ursprung haben? 
Schliesslich drängt sich die Frage auf, ob sich die Rechtslinguistik im Jahre 2117 als selbständiges rechtliches Grundlagenfach an deutsch- sprachigen Rechtsfakultäten etabliert haben wird oder ob diese Disziplin weiterhin ein ihr unwürdiges Schattendasein fristen wird. 2.3 Capturing Multilingual Discourses of Switzerland Es fällt auf, dass der Titel des Referates „Capturing Multilingual Discourses of Switzerland“ in englischer Sprache erscheint. Das Projekt des Computerlinguisten Martin Volk hat den Zweck, mehrsprachige Texte, die in der Schweiz verfasst worden sind, zu digitalisieren, mit linguistischen Informationen anzureichern sowie in einem XML-Format verfügbar zu ma- chen.17 Während des Vortrages wurde dieses Vorhaben anhand der mehrsprachigen Jahrbü- cher des Schweizerischen Alpenclubs18 exemplifiziert. Mittels automatischer Inhaltsanalysen wurden die besagten Texte für die Zeit von 1925 bis 2009 daraufhin untersucht, welche zen- tralen Themen darin diskutiert, welche sprachlichen Kenntnisse bei der Leserschaft voraus- gesetzt worden sind, und inwiefern die Satzkomplexität sich verändert hat. Was die thema- tische Fokussierung der fraglichen Texte anbelangt, sind Säulendiagramme erstellt worden. Diese Diagramme veranschaulichen durch auf der x-Achse vertikal positionierte „Säulen“, wie häufig gewisse Themen in einer gewissen Zeitspanne diskutiert worden sind. Schweizer Gesetze sind bekanntlich mehrsprachig. Mehrsprachigkeit gilt auch für die Regesten in Urteilen des Schweizerischen Bundesgerichtes, in denen der Urteilsinhalt möglichst präzise und knapp wiedergegeben wird. Ich könnte mir vorstellen, dass Volks Forschungsansatz sich als fruchtbarer Boden für drei- oder viersprachiges Schweizer Recht erweisen könnte, ganz zu schweigen vom mehrsprachigen europäischen Recht. Im Jahre 2117 könnten juristische und philologische Rechtslinguisten die Frage beantworten, inwie- fern der computerlinguistische Ansatz in den letzten 100 Jahren für die mehrsprachigen Schweizer Gesetze, die multilingualen Regesten des Schweizerischen Bundesgerichtes sowie für die Mehrsprachigkeit des europäischen Rechts fruchtbar gemacht werden konnte. 2.4 Wissensportal „Bildungsgeschichte Schweiz“ Christina Rothen und Thomas Ruoss präsentierten ihr Projekt „Bildungsgeschichte Schweiz“19. Die beiden Erziehungswissenschaftler untersuchen, wie sich die Schulstufen 17 Vgl. Volk, Martin. „Abstract.“ In Kurzbiographien und Abstracts zu den Tagungsbeiträgen, hrsg. v. Zentral- bibliothek Zürich, 3–4. Zürich: [kein Verlag], 2017. 18 Vgl. Schweizer Alpen-Club SAC. „Startseite.“ Zugriff am 14. Juli 2017. http://www.sac-cas.ch/. Zum Jahr- buch des Schweizerischen Alpenclubs vgl. z. B. Zentralbibliothek Zürich. „e-rara.“ Zugriff am 14. Juli 2017. http://www.e-rara.ch/zuz/content/titleinfo/7265103. 19 Vgl. Universität Zürich. „Bildungsgeschichte Schweiz.“ Zugriff am 14. Juli 2017. http://www.bildungs ge- schichte.uzh.ch/de.html. http://www.bildungsgeschichte.uzh.ch/de.html http://www.bildungsgeschichte.uzh.ch/de.html Colette R. Brunschwig 9 Max Planck Institute for European Legal History Research Paper Series No. 2018-03 („Vorschule“, „obligatorische Schule“, „Mittelschulen“) sowie „Lehrerinnen- und Lehrerbil- dung“ während des 19. und 20. Jahrhunderts in den verschiedenen Schweizerischen Kanto- nen einerseits entwickelten, andererseits veränderungsresistent waren. 
Swiss statutes are, as is well known, multilingual. Multilingualism also applies to the headnotes (Regesten) in judgments of the Swiss Federal Supreme Court, in which the content of the judgment is rendered as precisely and concisely as possible. I could imagine that Volk's research approach might prove fertile ground for trilingual or quadrilingual Swiss law, to say nothing of multilingual European law. In the year 2117, legal and philological legal linguists could answer the question of the extent to which the computational-linguistic approach has, over the past 100 years, been made fruitful for the multilingual Swiss statutes, the multilingual headnotes of the Swiss Federal Supreme Court, and the multilingualism of European law.

2.4 The Knowledge Portal "Bildungsgeschichte Schweiz"

Christina Rothen and Thomas Ruoss presented their project "Bildungsgeschichte Schweiz" (History of Education in Switzerland).19 The two educational researchers examine how the school levels ("pre-school," "compulsory school," "secondary schools") and "teacher education" developed in the various Swiss cantons during the 19th and 20th centuries, and how, conversely, they resisted change.

19 Cf. Universität Zürich. "Bildungsgeschichte Schweiz." Accessed July 14, 2017. http://www.bildungsgeschichte.uzh.ch/de.html.

Such a question may perhaps prompt legal scholars in the year 2117 to reflect on how research on the visualization, audiovisualization, and multisensorization of legal contents developed, or resisted change, in the European and Anglo-American regions in the period from 2017 to 2117, above all in institutional terms.

2.5 Lives in Transit: Steamship Passages in the Late 19th and Early 20th Century World

Martin Dusinberre, English-speaking full professor of global history at the University of Zurich, addressed the question: "What does it mean to tell global history through the digital humanities?" This global historian researches the ship passage "as a period of transit between two places [...] in which social orders and relations [...] are renegotiated."20 Being in transit as a ship passenger, he noted, affects one's emotions, psychopathology, and physiology.21 His topic, he said, is meant to initiate an exchange between maritime history, the history of medicine, and digital history. If I understood the speaker correctly, he is not yet really in a position to formulate concrete questions that would help him integrate the two research subjects just mentioned into digital history.

20 Dusinberre, Martin. "Abstract." In Kurzbiographien und Abstracts zu den Tagungsbeiträgen, ed. Zentralbibliothek Zürich, 4–5. Zurich: [no publisher], 2017.
21 Cf. ibid.

Digital history has developed into an independent research discipline with connections both to the DH and to "analogue" historiography.22 Nevertheless, it "is, for the time being, a path, not a state."23 This means that its problems, questions, and methods are in flux.24 For this reason, Dusinberre cannot be blamed for the fact that, as he frankly admitted coram publico, he is not yet ready to disclose which components of the DH, or of digital history, will concretely flow into his project. With his approach of "history through the digital humanities," this researcher is, of course, not alone, as emerges from the results of a Google search for precisely this phrase:

22 On digital history, cf., e.g., Seefeldt and Thomas, "What Is Digital History?" [n.p.]. There is also a blog entitled "Digitale Geschichtswissenschaft": cf. Lässig, Simone. "Digitale Geschichtswissenschaft: Das Blog der AG Digitale Geschichtswissenschaft im VHD." Accessed July 14, 2017. http://digigw.hypotheses.org/. The Association of German Historians has set up its own working group on digital history; the association's website presents information on this working group: cf. Verband der Historiker und Historikerinnen Deutschlands. "Arbeitsgruppen." Accessed July 14, 2017. http://www.historikerverband.de/arbeitsgruppen/ag-digitale-gw.html.
23 Schmale, Digitale Geschichtswissenschaft, 37.
24 On possible components of digital history, cf., e.g., Schmale, Digitale Geschichtswissenschaft, 61 ff.
It is reasonable to suppose, therefore, that these historians might begin to encourage their apprentices to represent the past through similar visual/ aural/kinesic environments.25 Dieses Zitat von Staley leicht abwandelnd, könnte Dusinberre seinen Forschungsgegen- stand mit Hilfe einer interaktiven virtuellen Umgebung (interactive virtual reality) multi- sensorisieren, vorausgesetzt, dass der digitale Historiker über die erforderlichen finanziellen Mittel verfügt, um eine solche Umgebung zu kreieren. Dabei wäre zu versuchen, das Schiff abzubilden sowie die sich an Bord entwickelnden sozialen Relationen der Passagiere zu vi- sualisieren. Was deren emotionale und physische Befindlichkeit betrifft, wäre es unter Um- ständen möglich, sie mit Hilfe ausgewählter Personen in der virtual reality darzustellen.26 Hundert Jahre später dürfte sich die Geschichtswissenschaft darüber klar geworden sein, inwiefern solche digitalen Multisensorisierungen einen zusätzlichen Erkenntnisgewinn ge- bracht haben. Bezüglich der global legal history fragt es sich, inwiefern sie through the digital humanities betrieben werden könnte. 3. Europäische Netzwerke und Forschungsservices 3.1 DARIAH-DE „DARIAH“ bedeutet „Digitale Forschungsinfrastruktur für die Geistes- und Kulturwissen- schaften“. DARIAH-DE erstreckt sich auf Deutschland, während DARIAH-EU europäisch ausgerichtet ist. Das Referat über DARIAH berührte die Themen „Visualisierung“, „Audio- visualisierung“ und „Multisensorisierung“ nicht. Darum erlaube ich mir, die Lesenden auf die DARIAH-Website zu verweisen. Ich habe auf dieser Website im Suchfeld rechts oben den Begriff „Visualisierung“ einge- geben und 44 Treffer erhalten (Stand 10. März 2017). Es würde sich lohnen, die einzelnen „hits“ genauer anzuschauen, z. B. im Hinblick darauf, welche Inhalte von welchen Akteuren visualisiert werden und welche Art von Visualisierungen produziert worden sind. Mit den Suchbegriffen „Audiovisualisierung“, „Multisensorisierung“, „Rechtsgeschichte“, „Rechts- theorie“ und „Rechtsinformatik“ habe ich jeweils die Fundmenge Null erzielt. „Multimedia“ hat einen Treffer gemacht, „Design“ hingegen sieben. Im Sommer 2015 ist die Schweize- rische Akademie der Geistes- und Sozialwissenschaften (SAGW) Kooperationspartnerin von 25 Staley, „Digital Historiography,“ 1. 26 Vgl. z. B. Staley, „Digital Historiography,“ 1–4, und Kheraj, „The Presence of the Past,“ [s.p.]. Colette R. Brunschwig 11 Max Planck Institute for European Legal History Research Paper Series No. 2018-03 DARIAH geworden.27 Die SAGW betreut den Schwerpunkt „Wissenschaft im Wandel“.28 Darin laufen Projekte, wie etwa jenes, das den Titel „Digital Humanities: Infrastrukturen, Forschungsprojekte, Netzwerke“ trägt.29 Zusammen mit der Alliance of Digital Humanities Organizations (ADHO)30 und der Forschungsinfrastruktur für Sprachressourcen in den Geistes- und Sozialwissenschaften (CLARIN)31 steuert DARIAH dazu bei, dass sich die DH konsolidieren, d. h. institutionalisie- ren. Einen solchen „Institutionalisierungsbeitrag“ leisten auch Fachzeitschriften, wie z. B. die Zeitschrift für Digitale Geisteswissenschaften (ZfdG),32 digital humanites quarterly (dhq)33 und das Journal of Digital Humanities (JDH)34 sowie Konferenzen,35 Forschung36 und Lehre.37 3.2 CLARIN-D „CLARIN“ steht für „Common Language Resources and Technology Infrastructure“.38 Auf ihrer Webseite findet sich eine Rubrik mit dem Titel „Auffinden“. Scrollt man darauf, öff- nen sich drei Sub-Menüs mit den Titeln „VLO: Suche nach Ressourcen“,39 „FCS-Suche in 27 Vgl. 
3.2 CLARIN-D

"CLARIN" stands for "Common Language Resources and Technology Infrastructure."38 Its website contains a section entitled "Auffinden" (Find). Scrolling over it opens three sub-menus entitled "VLO: Suche nach Ressourcen,"39 "FCS-Suche in Ressourcen,"40 and "Referenzressourcen."41 It would lead too far to query these three sections with the same keywords that I used with regard to DARIAH-DE. It is worthwhile, however, to explore the CLARIN website further. One notices, for example, that no legal-science working group has (yet) been brought into being.42 From my point of view, it would make sense to found an international working group on the "foundational legal disciplines." Such a step would create an incentive for the law faculties of the German-speaking countries (Germany, Austria, and Switzerland) to join CLARIN.

38 Cf. Wikipedia. "CLARIN." Accessed July 14, 2017. https://de.wikipedia.org/wiki/CLARIN.
39 "VLO" stands for Virtual Language Observatory. Cf. CLARIN-D. "Auffinden: VLO: Suche nach Ressourcen." Accessed July 14, 2017. http://de.clarin.eu/de/auffinden/vlo-suche-nach-ressourcen.
40 The acronym "FCS" means "Federated Content Search." Cf. CLARIN-D. "Auffinden: FCS: Suche in Ressourcen." Accessed July 14, 2017. http://de.clarin.eu/de/auffinden/fcs-suche-in-ressourcen, in conjunction with CLARIN. "FCS." Accessed July 14, 2017. https://www.clarin.eu/glossary/fcs.
41 On the reference resources, cf. CLARIN-D. "Auffinden: Referenzressourcen." Accessed July 14, 2017. http://de.clarin.eu/de/auffinden/referenzressourcen.
42 Cf. CLARIN-D. "Facharbeitsgruppen." Accessed July 14, 2017. http://de.clarin.eu/de/facharbeitsgruppen.

4. National Research-Oriented Infrastructures under Construction: Infrastructures and Services for Linguistic Projects (Session 2)

4.1 Overview

In the second half of the conference, one had to choose one of the three sessions on offer. On the basis of my questions (section 1.2 above), I chose Session 2.
4. Research-Oriented National Infrastructures under Construction – Infrastructures and Services for Linguistic Projects (Session 2)

4.1 Overview

In the second half of the conference, one had to choose between the three sessions on offer. On the basis of my questions (section 1.2 above), I chose Session 2. Referring to the "University Research Priority Program (UFSP) Language and Space" and its labs,43 this smaller-scale session promised to go in the direction of my interests. Viewed as a whole, Session 2 touched on the following questions: Which infrastructures and services for linguistic projects already exist? To whom are these infrastructures and services open, and for how long are they designed to run? It was emphasized that, for fixed-term digital projects, it is crucial to ensure from the outset that their funding remains secured once the initial funders' support has ended. As a rule, what counts then is the "good will" of the institution that has so far hosted the project.

The UFSP Language and Space has the task of contributing innovative approaches by providing research infrastructures and supporting interested actors, including students and researchers.44

So far we have dealt with still (static) visualizations. I would therefore now like to address the moving (dynamic) visualizations, and possibly also audiovisualizations, of the VideoLab.45 I nevertheless do not want to neglect to underline the importance and significance of the GISLab46 and the CorpusLab.47 Lawyers with an interest in legal linguistics in particular should be encouraged to engage thoroughly with the work of these two labs and to ask which of their problems, questions, and findings could be transferred – mutatis mutandis – to legal scholarship.

43 See Universität Zürich. "UFSP Sprache und Raum." Accessed July 14, 2017. http://www.spur.uzh.ch/de.html. On the university research priority programs, see Universität Zürich. "Forschung: Forschungsschwerpunkte." Accessed July 14, 2017. http://www.uzh.ch/de/research.html.
44 See Universität Zürich. "Fachtagung anlässlich des 100-Jahr-Jubiläums der ZB: Präsentationen." Accessed July 14, 2017. http://www.bibliothek-vernetzt.uzh.ch/de.html [slide 5 of the presentation by Derungs, Kesselheim, and Samardžić].
45 On the VideoLab, see Universität Zürich. "UFSP Sprache und Raum: Laboratorien." Accessed July 14, 2017. http://www.spur.uzh.ch/de/departments/videolab.html.
46 On the GISLab, see Universität Zürich. "UFSP Sprache und Raum: Laboratorien." Accessed July 14, 2017. http://www.spur.uzh.ch/de/departments/gislab.html.
47 On the CorpusLab, see Universität Zürich. "UFSP Sprache und Raum: Laboratorien." Accessed July 14, 2017. http://www.spur.uzh.ch/de/departments/korpuslab.html.

4.2 The VideoLab

The talk on the VideoLab bore the title "Open Sensors: from sensors to data." Klaus Wolfgang Kesselheim, the head of this lab, explained to us which media he works with to capture visual and audiovisual data: ordinary video cameras, action cameras, miniature cameras, omnidirectional cameras (recording devices "capable of capturing images from all directions, over a range of 360 degrees both horizontally and vertically"48), and eye trackers (devices able to record and analyze eye movements).

48 Wikipedia. "Omnidirektionale Kamera." Accessed July 14, 2017. https://de.wikipedia.org/wiki/Omnidirektionale_Kamera.
The speaker made clear that the VideoLab investigates how people create spatial preconditions for their interactions and how they activate elements in their spatial surroundings to pursue their interactional goals, anchoring their interactions at the same time in a temporal-spatial and situational context. As examples for which research data already exist, Kesselheim cited a dental practice and the interior of a church. Further information can be found on the VideoLab website:

Unlike other methods, such as questionnaire studies, video recordings permit studying the behavior of the participants while they are carrying out their everyday interaction. And, unlike field notes, for example, video recordings make it possible to repeatedly watch the interaction and to scrutinize even the smallest details of the temporal and spatial organization of the event.49

It would be quite conceivable to install such media in legal contexts in order to collect visual and audiovisual data, for example in a law firm or in the rooms in which the Swiss Lawyers' Day, a conference of the Swiss Lawyers' Association, takes place. One might be able to win over sociologists of law with an interest in the sociology of science for such projects, or psychologists of law with an interest in communication and media psychology, as well as legally trained legal linguists. Nevertheless, serious obstacles would stand in the way of such an undertaking: apart from the fact that my profession is bound by rules of secrecy (attorney-client privilege, official secrecy), lawyers, by professional habit, do not like to show their hand – least of all to a video camera. It seems to me that members of my guild tend to lay their cards on the table only where doing so appears appropriate and advantageous in the situation at hand. Last but not least, problems of personality rights and data protection would likely stand in the way of such visual and audiovisual recordings.

49 Universität Zürich. "UFSP Sprache und Raum: Laboratorien." Accessed July 14, 2017. http://www.spur.uzh.ch/de/departments/videolab.html.
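To give a concrete sense of the recording side of such data collection, here is a minimal Python sketch using OpenCV that captures a short clip from an attached camera. The device index, codec, resolution, and file name are illustrative assumptions, and any real deployment, in a law office more than anywhere, would additionally require the consent and data-protection safeguards just discussed.

```python
# Minimal sketch, assuming OpenCV (cv2): record ~10 seconds of video from
# the first attached camera into an .mp4 file. All parameters are examples.
import cv2

cap = cv2.VideoCapture(0)                    # device index 0: first camera
fourcc = cv2.VideoWriter_fourcc(*"mp4v")     # codec for .mp4 output
out = cv2.VideoWriter("session.mp4", fourcc, 25.0, (640, 480))

frames = 0
while frames < 250:                          # 250 frames ~ 10 s at 25 fps
    ok, frame = cap.read()
    if not ok:                               # camera unavailable or stream ended
        break
    out.write(cv2.resize(frame, (640, 480)))  # match the writer's frame size
    frames += 1

cap.release()
out.release()
```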
5. Funders

5.1 Questions for the Funders

True to the motto "intellectual nourishment first," the ZBZ conference turned to Switzerland's central funding institutions – the Swiss National Science Foundation (SNF) and the SAGW – only as it drew to a close. I approached the two talks with two questions: 1. How do the SNF and the SAGW differ with respect to the duration of their funding measures? 2. Do the two institutions currently fund legal research projects concerned with the visualization, audiovisualization, and/or multisensorization of legal content? Those seeking other kinds of information on these talks will find the two speakers' slides helpful.50 Regarding the presentation "EURESEARCH, Europäische Programme – Horizon 2020," I would direct interested readers to the informative website.51

5.2 SNF and SAGW

Brigitte Arpagaus, head of the humanities division and deputy head of the Humanities and Social Sciences department at the SNF, explained that researchers can choose their topics freely. SNF grants run from one to four years; its funding is intended as seed funding ("Anschubfinanzierung").

The SNF website has a search field. The keyword "Visualisierung" entered there yielded 37 results (as of July 14, 2017), while the keywords "Audiovisualisierung," "Multisensorisierung," and "legal design" came up empty. I examined the 37 hits as to whether they relate to the visualization of legal content. As far as I can tell, no such connection can be established.

Beat Immenhauser, deputy secretary general of the SAGW, reported that the division of labor between the SNF and the SAGW has been clarified. The SNF funds, for example, editions with a duration of less than ten years, while the SAGW is responsible for editions whose duration exceeds ten years.

The SAGW's current priority areas are "Sprachen und Kulturen" ("Languages and Cultures"), "nachhaltige Entwicklung" ("Sustainable Development"), and "Wissenschaft im Wandel" ("Science in Transition"). My efforts to find, within these three priority areas, an ongoing project on the visualization, audiovisualization, and/or multisensorization of legal content have been in vain. The DH are located in the priority area "Wissenschaft im Wandel." Legal scholarship cannot be found under the heading "DH."52 Only the project "Digitalisierung der Sammlung Schweizerischer Rechtsquellen (SSRQ)" figures under the rubric "Geschichte" ("History"), and there under the sub-rubric "Netzwerke" ("Networks"); the Rechtsquellenstiftung of the Swiss Lawyers' Association runs this undertaking.53 Evidently the DH have not (yet) arrived in Swiss legal scholarship. This is regrettable, especially since the DH would form a deep, capacious vessel able to accommodate the visualization, audiovisualization, and/or multisensorization of legal content as subjects of research and teaching.

We scholars are left with too little time to read, to reflect, and to write.

50 See Universität Zürich. "Fachtagung anlässlich des 100-Jahr-Jubiläums der ZB: Präsentationen." Accessed July 14, 2017. http://www.bibliothek-vernetzt.uzh.ch/de/Praesentationen.html.
51 See EURESEARCH: Swiss guide to European research & innovation. "Home." Accessed July 14, 2017. https://www.euresearch.ch/de/.
That information floods us is a permanent problem. Let us be honest: we have reached the point where we can no longer cope with the flood. Many therefore consider it advisable to confine their own research to the narrowest possible question intra muros and to make sure from the outset that the relevant literature remains reasonably manageable. Whoever nevertheless dares to enter a magna terra incognita intra et extra muros runs the risk that the four-year funding period becomes too tight and that the epistemological excursion, starved of supplies, never reaches its destination. I consider the visualization, audiovisualization, and multisensorization of legal content to be such a magna terra incognita. Why? The foundational legal disciplines have so far dealt with this research subject not at all or only marginally,54 let alone the doctrinal legal disciplines. Scholars working on the visualization, audiovisualization, and multisensorization of legal content accordingly have no choice but to undertake longer expeditions into humanities and social science territories that are usually unfamiliar. While there, legal scholars must examine the findings they encounter as to how far these can be applied – mutatis mutandis – to the visualization, audiovisualization, and multisensorization of legal content. Further magnae terrae incognitae are likely to emerge both inside and outside legal scholarship.

In view of these "naked" facts, the question arises whether the SNF could not commit itself, at least in part, to funding fewer projects but over a period longer than "merely" four years – possibly also in consultation with the SAGW. It cannot be ruled out that the legal bases on which the two funding institutions operate would have to be amended for this purpose. They are found in the Federal Act on the Promotion of Research and Innovation (FIFG).55

6. Coda

6.1 Scholarly Yield

The ZBZ conference produced a considerable scholarly yield – even from the perspective of legal scholarship. Visualization, audiovisualization, and multisensorization belong to the core "business" of the DH. It was therefore possible to connect certain contents of individual talks fruitfully with the visualization, audiovisualization, and multisensorization of legal content. Perhaps alluding to Gryphius's poem "Es ist alles eitel" ("You see, wherever you look, only vanity on earth. / What this man builds today, that man tears down tomorrow; / Where cities now stand, a meadow will be, / on which a shepherd's child will play with the herds. […]"),56 von Matt, emeritus professor of modern German literature at the University of Zurich, writes:

54 As far as the German-speaking world is concerned, the works of, for example, Lachmayer, Röhl, Boehme-Nessler, and Hilgendorf stand out. While Lachmayer and Hilgendorf place the visualization of legal content at the center, the other authors also include the audiovisualization of legal content in their research. Encouraged by Kenney's theories of multisensory communication and multisensory media, my humble self continues to explore, in addition, the multisensorization of legal content. These multisensory theories should, moreover, be made fruitful for legal theory. In the English-speaking world, Austin, Goodrich, Katsh, Sherwin, Feigenson, and Spiesel, for example, have rendered services to legal visualization and legal audiovisualization. So as not to inflate the present footnote, individual works by the named authors are listed in the bibliography.
55 See Schweizerische Eidgenossenschaft. "Startseite: Bundesrecht." Accessed July 14, 2017. https://www.admin.ch/opc/de/classified-compilation/20091419/index.html.
56 Gryphius, "Es ist alles eitel," 109.
Entire professorial chairs live off the production of semiotic concepts and the demolition of their predecessors. Students believe in them, orient themselves by them, write dissertations about them, and must one day take note that no one cares a straw anymore about their scholarly doctrine of salvation.57

I hope that my own information and assessments have demolished nothing and picked nothing apart, but rather have gathered some of the opening blossoms of the DH for a digital legal scholarship.

57 von Matt, Sieben Küsse, 11.

6.2 Jurisprudentia semper reformanda est

Members of the foundational legal disciplines – legal historians, legal theorists, sociologists of law, psychologists of law, and legal linguists – should engage with the DH. My appeal is directed especially at those members of the foundational legal disciplines who already occupy themselves with the visualization, audiovisualization, and/or multisensorization of legal content, or intend to do so. These are research subjects that could be given the label "multisensory legal design" where the production of legal visualizations, legal audiovisualizations, and legal multisensorizations is concerned.

I would like to motivate digital humanists – in interdisciplinary dialogue – to draw the attention of actors in the foundational legal disciplines to the treasures the latter could dig for in the DH, particularly with a view to the visualization, audiovisualization, and multisensorization of legal content.

I would like to give this text a fictional ending by pretending that I have already died: "dead-alive" as I would then be, I would write the ZBZ a letter on the occasion of its 200th anniversary. Perhaps it is not seemly to end a scholarly text in a manner resting on a fiction. Still, such a form of expression will grant me a measure of fool's license.
Moreover, I will be able to express certain things more clearly and distinctly than I could in the conventional "outlook" that would otherwise be appropriate here. The fictionality of this retrospective "out-look" also makes it possible to span a longer period than a traditional outlook does. Susskind, for instance, ventures to look ahead "only" to the year 2035 in order to describe tomorrow's lawyers.58

58 See, e.g., Susskind, Tomorrow's Lawyers, 149 f.

6.3 A Fictitious Letter from the Hereafter

The Hereafter, January 2117

Dear staff and humanoid robots of the Zentralbibliothek Zürich,

Thank you for your letter at the beginning of this year. I gather from your message that you plan to celebrate the 200th anniversary of the ZBZ with a conference in the assembly hall of the University of Zurich. If I have understood you correctly, you are contacting deceased scholars who in 2017 belonged to various faculties of the University of Zurich. You wish to learn from us departed ones more about earlier conditions at the Alma Mater Turicensis. I feel honored that, as far as the Faculty of Law is concerned, your choice fell on me. You ask me to answer questions concerning my faculty anno Domini 2017:

1. Who was newly appointed to the "foundational disciplines" group of the Faculty of Law of the University of Zurich in 2017? On which research subjects did the new appointees place their main emphasis? To what extent was there a connection between their research priorities and the DH?

2. Did the new appointees pursue "pure" foundational legal research and teaching? In other words, did their research and teaching convey fundamental insights into the development of law, its structures, functions, and effects? Or did their scholarly activities rather yield knowledge applicable in legal practice?

3. What did you hope to achieve with your essay "Perspektiven einer digitalen Rechtswissenschaft: Visualisierung, Audiovisualisierung und Multisensorisierung"?

You will forgive me if the answers developed from my otherworldly perspective occasionally seem somewhat eccentric. Souls of the deceased, in whose company I now live, tend to view things sub specie aeternitatis.

6.3.1 New Appointees in 2017 and Their Activities

José Luis Alonso, full professor of Roman law, juristic papyrology, and private law, and Elisabetta Fiocchi, assistant professor of legal history, took up their offices in the first half of 2017.59 When I was preparing my essay "Perspektiven einer digitalen Rechtswissenschaft: Visualisierung, Audiovisualisierung und Multisensorisierung," I did not know the new appointees personally. I gathered information about them from their websites (as of April 2017).

59 See Universität Zürich. "News: Alle Artikel / Archiv." Accessed July 14, 2017. http://www.uzh.ch/news/info/berufungen/index.php.
Incidentally, I wonder whether the Faculty of Law of the University of Zurich has since seen to it that the "history" of its emeriti professors' web presences is archived. Alonso and Fiocchi will presumably have entered well-deserved retirement by now.

Alonso's curriculum vitae stated: "The kernel of my research to the present date has been private law, particularly obligations and real securities. My work in these fields arises from an interest in the structure of legal institutions in ancient legal thought and practice." Elsewhere, the Romanist and juristic papyrologist noted:

In the last decade, my attention has turned to the legal practice of the papyri, an interest nurtured at the department of Papyrology of the University of Warsaw, home to the leading publication in the field, the Journal of Juristic Papyrology. It is, in my view, urgent to reconstruct the bond between papyrology and Roman law: […]. Papyrologists, in particular, have been left abandoned to their own forces, without the assistance of legally trained experts, facing an enormous mass of documents whose nature is prevalently legal.

Fiocchi was at that time involved in the digitization project "Natural Law in Italy," part of the project "Natural Law 1625–1850." She was a co-founder of the project "Natural Law and Law of Nations across the Ocean: Domingo Muriel and his Rudimenta Iuris Naturae et Gentium (1791)." It would exceed the scope of my letter to describe Fiocchi's activities beyond the topics just mentioned.

Fiocchi's digitization project could be connected with the DH. At the end of the twentieth century and the beginning of the twenty-first, many historically oriented legal scholars digitized their sources. For this reason I assume that Alonso likewise scanned legal-historical sources, such as books on juristic papyrology, or had them scanned by his staff. Beyond that, no connection between these scholars and the DH could (yet) be discerned in 2017.

6.3.2 A Yardstick for Scholarship at the Beginning of the 21st Century

To be quite honest, I do not wish to presume to answer your second question, especially since, at the time I wrote the essay "Perspektiven einer digitalen Rechtswissenschaft: Visualisierung, Audiovisualisierung und Multisensorisierung," I was familiar neither with the publications nor with the teaching of Alonso and Fiocchi. Be that as it may, it is my conviction that scholarship should be able to unfold freely within the bounds of law, ethics, and morality.

6.3.2.1 Demands on Scholarship at the Beginning of the 21st Century

In 2017 I discovered the message of Drew Gilpin Faust, the President of Harvard University and Lincoln Professor of History.60 Since I do not know whether her statement is still retrievable on the internet for you, I quote excerpts from it.

60 See Harvard University. "The Harvard Campaign: President's Message." Accessed July 14, 2017. http://campaign.harvard.edu/presidents-message.
It began: "WE UNDERTAKE THE HARVARD CAMPAIGN AT A MOMENT WHEN HIGHER EDUCATION IS BEING CHALLENGED TO REINVENT ITSELF, […]." This campaign "calls upon us to articulate and affirm the fundamental values and purposes of higher education in the rapidly changing environment of a global and digital world – a world filled with promise for improving lives [my emphases], a world in which talent recognizes no boundaries, a world in which creativity and curiosity will fuel the future."61

In the further course of her programmatic statement, Faust specified the demands that the task of "scholarship" poses: "We must harness the power of One Harvard62 to advance discovery and learning across fields, disciplines, and our broad range of Schools to change knowledge and to change the world [my emphases]." Under "Advancing Meaning, Values, and Creativity,"63 the President called, on the one hand, for historically oriented scholarship: "[…], Harvard must reinforce the significance of transcending the immediate and instrumental to explore and understand what humans have thought, done, and been, and thus to imagine where they might best seek to go."64 On the other hand, the historian postulated: "We must offer more prominence to innovation and hands-on discovery65 inherent in engineering, the arts, and design [my emphasis], as well as to experiential learning beyond the classroom."66 She closed with: "[…] Universities are unique in their commitment to the long term, to uniting the wisdom of the past with the urgency of the present and the promise of the future [my emphases]. […]."67

Žižek, a Slovenian philosopher and cultural critic of the late 20th and early 21st centuries, demanded in the feuilleton of the Neue Zürcher Zeitung of Saturday, March 25, 2017:

What we should reject here is the basic premise of this discourse: 'Students must feel safe in their classrooms.' No, they must not. Rather, they must learn to leave their comfort zone, to confront openly all the humiliations and injustices of real life, and to fight against them.68

61 Ibid.
62 See Harvard University. "The Harvard Campaign: Aspirations: Advancing the Power of Integrated Knowledge." Accessed July 14, 2017. http://campaign.harvard.edu/aspiration/advancing-power-integrated-knowledge.
63 See Harvard University. "The Harvard Campaign: Aspirations: Advancing Meaning, Values, and Creativity." Accessed July 14, 2017. http://campaign.harvard.edu/aspiration/advancing-meaning-values-and-creativity.
64 See Harvard University. "The Harvard Campaign: President's Message." Accessed July 14, 2017. http://campaign.harvard.edu/presidents-message.
65 See Harvard University. "Aspirations: Advancing Innovation and Hands-On Discovery." Accessed July 14, 2017. http://campaign.harvard.edu/aspiration/advancing-innovation-and-hands-discovery.
66 See Harvard University. "The Harvard Campaign: President's Message." Accessed July 14, 2017. http://campaign.harvard.edu/presidents-message.
67 Ibid.
68 Žižek, "Das Leben ist nun einmal krass," 43.
For me, there was a connection between Faust's and Žižek's demands and statements in individual legal publications. Thus Volpato remarked in 1991:

One feature about legal outputs is their unabashed textuality. It is (still) uncommon to see advice, advocacy or judgments presented as videos, animations, graphs, or simulations. In informatic terms there is a strong rigidity about which channels and codes are appropriate and a resistance to testing the communication efficiency of trying something else. In many instances, more information would be conveyed through these non-textual 'channels'.69

In 1995, Katsh picked up Volpato's train of thought: "The digital lawyer will both see things differently and see different things since he or she will have some expertise in employing graphical and other nontextual capabilities to describe, characterize, and represent conflict, […]."70 And in 2008, Brinktrine/Schneider took the view: "Decisive for lawyers' communicative competence is the ability to explain legal questions to other people in word, writing, and image."71

6.3.2.2 Visualization, Audiovisualization, and Multisensorization of Legal Content, Measured against the Demands of Faust, Žižek, and Brinktrine/Schneider

I set myself the task of measuring the visualization, audiovisualization, and multisensorization of legal content against the demands of Faust, Žižek, and Brinktrine/Schneider. From your vantage point, it may be instructive to read the provisional answers I developed back then.

When individual legal scholars began, at the end of the 20th and the beginning of the 21st century, to research the visualization, audiovisualization, and/or multisensorization of legal content, they were reacting to the advancing digital transformation of society, the economy, scholarship, and the state. As a result of digitization, it had become easier to communicate visually, audiovisually, indeed multisensorially – also in legal contexts.72 This non-verbocentric research subject carried within it the promise of improving certain areas of legal life, among them the communication between legal experts and lay people and the stress-laden work of legal actors (judges, attorneys, prosecutors, police officers, etc.).

Improving expert-layperson communication. I myself was convinced that the visualization and audiovisualization of legal content would enable legal actors to explain and illustrate legal concepts and problems, as well as the course of legal proceedings, better to lay people, for example in attorney-client communication and in communication between judges and litigants.73 In addition, I encouraged attorneys and judges to motivate lay people not only to describe the facts of their case verbally but also to draw these legally relevant facts, or even to record them audiovisually with their smartphones.

69 Volpato, "Legal Professionalism and Informatics," 215.
70 Katsh, Law in a Digital World, 174.
71 Brinktrine and Schneider, Juristische Schlüsselqualifikationen, 19.
72 See Feigenson and Spiesel, Law on Display, 2 f.
73 See in this connection Rambow and Bromme, "Was Schöns 'reflective practitioner' durch die Kommunikation mit Laien lernen könnte," 245–63.
These lay people, I advised, should further be spurred on to sketch the legal concepts and problems affecting their case and to visualize their goals and ideas, whether in digital or analog form. In this way, I suggested, attorneys and judges could check whether the lay people had actually understood the legal content concerning them. Multisensory legal communication would make it easier for lay people to take necessary decisions sensibly, for instance in family law disputes (separation, divorce), in employment law, company law, and other conflicts touching other fields of law, in criminal proceedings, and so on.74

Reducing the stress of legal actors. Attorneys and judges alike were under stress, for various reasons: time pressure; permanent, at times emotionally burdensome conflicts into which these legal actors were drawn; long working days; "juggling" several cases; enormous pressure to close cases; sitting and working for hours in front of a screen; information overload; telephone calls of all kinds; and more.75 This increasingly visible malaise triggered reactions. Especially in the United States, publications appeared that aimed to alleviate the worst distress, advising those affected on how to reduce work-related stress: "Yoga for Lawyers,"76 mindfulness for lawyers,77 and "A Lawyer's Guide to the Alexander Technique,"78 to name only three examples tailored specifically to the needs of these legal actors. In the electronic bookstore of the American Bar Association (ABA),79 I looked under "Topics" at the rubric "Professional Interests." Besides the books just mentioned, it contained further publications on the topics "Lawyer Wellness," "Mentoring," and "Work/Life Balance."80 Some of these subjects were also presented audiovisually on the internet, primarily on YouTube.81

74 See Brunschwig, "Multisensory Law and Therapeutic Jurisprudence" (2012).
75 On the stress of attorneys and judges, see, e.g., Love and Martin, Yoga for Lawyers, IX, 2, 7 ff., and Cho and Gifford, The Anxious Lawyer, 1–8.
76 See Love and Martin, Yoga for Lawyers, 21 ff.
77 See, e.g., Rogers, The Six-Minute Solution, 10 ff.; Riskin, "Mindfulness in the Heat of Conflict," 131 ff., and Cho and Gifford, The Anxious Lawyer, 61 ff.
78 See Krueger, A Lawyer's Guide to the Alexander Technique, 27 ff. and 51 ff.
79 See ABA American Bar Association. "Publishing." Accessed July 14, 2017. http://www.americanbar.org/aba.html.
80 See ABA Shop. "Topics: Professional Interests." Accessed July 14, 2017. https://shop.americanbar.org/ebus/default.aspx.
81 See, e.g., Rogers, Scott L. "Mindfulness Exercise: Order in the Cortex." YouTube video, 7:59. April 17, 2012. https://www.youtube.com/watch?v=-KB-OyDx7Tw.
The search for corresponding German-language legal literature and fitting legal audiovisualizations on the World Wide Web regrettably proved fruitless. When I visited the website of the Swiss Bar Association, I did at least discover, in the detailed program of the 9th Anwaltskongress 2017 (theme: "Schulterschluss der Akteure der Gerichtsbarkeit"),82 a reference to a presentation by Gabriele Hofmann-Schmid.83 Her talk was entitled "Stressbewältigung und Zeitmanagement: Wie Sie es schaffen, Ihre guten Vorsätze in die Tat umzusetzen."84 Otherwise I found nothing pertinent, although I would have wished that the Swiss Bar Association, like the ABA, had provided such literature references.

US legal education did not remain untouched by the developments described in legal research. Publications on "mindfulness for law students" were issued in the USA, while (as yet) no publications on "yoga for law students" or "Alexander Technique for law students" appeared there. Law schools did, however, offer their students courses in mindfulness (mindfulness in law)85 as well as yoga courses (yoga for law students).86 In part, YouTube videos on these topics were also made available for law students.87 Notwithstanding these positive developments in US legal education, the Faculty of Law of the University of Zurich offered its students nothing comparable. One may object that it fell to the Academic Sports Association of the University of Zurich (ASVZ) to satisfy the wellness wishes and expectations of students of all faculties, so that special offerings for law students would have been superfluous, indeed excessive. On the one hand, this claim is probably true; on the other hand, it would nevertheless have been sensible if individual law professors had motivated and guided their students to perform specific physical exercises.

82 See SAV Schweizerischer Anwaltsverband. "Weiterbildung: Anwaltskongress." Accessed July 14, 2017. https://www.sav-fsa.ch/de/weiterbildung/anwaltskongress.html.
83 On Gabriele Hofmann-Schmid, see Gabriele Hofmann-Schmid. "Legal Coaching." Accessed July 14, 2017. http://www.legalcoaching.ch/.
84 SAV Schweizerischer Anwaltsverband. "Weiterbildung: Anwaltskongress." Accessed July 14, 2017. https://www.sav-fsa.ch/de/weiterbildung/anwaltskongress.html.
85 See, e.g., University of Miami School of Law. "Mindfulness in Law Program." Accessed July 14, 2017. http://www.law.miami.edu/academics/mindfulness-in-law-program; University of California: Berkeley Law. "Mindfulness in Legal Education." Accessed July 14, 2017. https://www.law.berkeley.edu/students/mindfulness-at-berkeley-law/resources/mindfulness-in-legal-education/, and Harvard Law School. "Wellness." Accessed July 14, 2017. http://hls.harvard.edu/dept/dos/wellness/. See also Rogers and Jacobowitz, Mindfulness & Professional Responsibility, 3 ff. and 25 ff.
86 See, e.g., University of Miami School of Law. "Yoga for Law Students." Accessed July 14, 2017. http://www.law.miami.edu/news/2011/june/yoga-law-students; The University of British Columbia: Peter A. Allard School of Law. "Yoga for Law Students." Accessed July 14, 2017. http://www.allard.ubc.ca/events/yoga-law-students-1, and Harvard Law School. "Wellness." Accessed July 14, 2017. http://hls.harvard.edu/dept/dos/wellness/.
87 See, e.g., Rogers, Scott L. "Mindfulness for Law Students." YouTube video, 5:12. October 7, 2014. https://www.youtube.com/watch?v=cEU5I1pYPYY.
These exercises would have accompanied the students later into their professional lives, making it easier for them to keep their own stress from arising in the first place or at least to reduce it to a tolerable level.88

Because legal scholars researched, and in part taught, the visualization, audiovisualization, and multisensorization of legal content, traditionally verbocentric legal knowledge underwent a decisive change. This "multisensory" change took place as, above all, theoretical, methodological, and historical insights from non-legal disciplines found their way into legal publications. Situated at the periphery of established legal discourse, these publications manifested themselves as monographs, essays, or at first only as journal articles and blog postings. For the production of legal visualizations, insights from visual design (visual communication) were received;89 for the production of legal audiovisualizations (legal videos, legal films), insights from audiovisual design (audiovisual communication, film studies); and for the production of legal games (legal gamification)90 and legal virtual realities,91 insights from game design92 and interaction design (e.g., legal actor-robot interaction) were also decisive.93 For the analysis and evaluation of legal visualizations, insights from art history and from media and communication studies became important; for the reception of legal audiovisualizations (legal videos, legal films), insights from film, media, and communication studies as well as from popular culture studies.

There were (as yet) no degree programs for visual legal designers and audiovisual legal designers – neither at the Faculty of Law of the University of Zurich, nor at the Zurich University of the Arts (ZHdK), nor at the ZHAW School of Management and Law in Winterthur. Nevertheless, it seemed to me only a matter of time before legal videos would be produced systematically and on a larger scale – for the internet, possibly even for video walls on administrative buildings, courthouses, parliaments, and so on.

The "Janus face" of legal scholarship. Whoever wished to engage with the visualization, audiovisualization, and multisensorization of legal content not only "synchronically" but also "diachronically" had to take on the visual,94 audiovisual,95 and multisensory legal tradition.

88 See Brunschwig, "Multisensory Law," 161 f., with further references.
89 See, e.g., Brunschwig, Visualisierung von Rechtsnormen, 80 ff.; Hagan, "Prof. Jay Mitchell on Visual Design for Lawyers," [n.p.]; Haapio and Passera, "Visual Law," [n.p.], and Salo and Haapio, "Robo-Advisors and Investors," 441–48.
90 See, e.g., Kimbro, "New legal gamification," [n.p.].
91 See, e.g., Baksi, "Virtual reality helps students to master criminal law," [n.p.].
92 See Lines, "Using game-design pedagogies to embed skills in the law curriculum," [n.p.], and Martin, "A Simulation Game to Help People Prep for Court," [n.p.].
93 See Hagan, "Make Interactive Visuals with D3," [n.p.].
94 See, e.g., Kocher, Zeichen und Symbole des Rechts (1992).
95 See, e.g., Delage, La Vérité par l'image (2006) [Caught on Camera (2013); title of the English translation].
This meant that legal scholars with such interests were required to study primarily legal-historical literature, which illuminated past sensory legal phenomena. Hibbitts, a US-American legal historian and legal informatics scholar, noted in 1992:

In the twelfth and thirteenth centuries, the immediate European progenitors of our culture turned increasingly to writing to help preserve information and customary lore that had been primarily perpetuated and celebrated in sound, gesture, touch, smell, and taste. Once this corpus was inscribed, and thus removed from its original multisensory context, it slowly but indubitably became the creature of the medium [i.e., written text] that claimed to sustain it.96

Three years later, the Swiss legal historian Carlen, well acquainted with the multisensory legal tradition, remarked in his preface to the volume "Sinnenfälliges Recht":

The first aim of the present collection, which may also serve as something of a scholarly account rendered, is to awaken and foster the understanding that the old law was strongly sensory and plastic; one had to see and hear, to clarify law as an intellectual order symbolically and to embody it in emblems. Law was to enter the senses, to be sense-perceptible. In this way it became conscious to the people, who for their part knew how to imprint their thinking and feeling on the law. [...] Perhaps it is good to recall sense-perceptible law to an abstract law, and to a law that has moved far away from the "people."97

If one wished, it was thus possible to anchor the visualization, audiovisualization, and multisensorization of legal content in the legal-historical past. The subject thereby acquired a double face, looking forward and backward at the same time. This idiosyncratic Janus head turned, on the one hand, to what had long since been thought and done regarding the visualization, audiovisualization, and multisensorization of legal content. On the other hand, it occupied itself with what was currently being thought and done in this respect, and with what should be considered and done along these lines. Whether one looks back into the past, regards the present, or gazes into the future, certain constant core questions are treated. For linguistic simplicity, I formulate them in the present tense, without meaning to exclude past and future: Which legal contents are visualized, audiovisualized, and multisensorized? What kinds of legal visualizations, legal audiovisualizations, and multisensorizations are produced?

96 Hibbitts, "Coming to Our Senses," 875.
97 Carlen, "Vorwort," XVI.
Why and to what end are legal contents visualized, audiovisualized, and multisensorized? What effects do the visual, audiovisual, and multisensory legal products have?

The high standing of art history and design. I have already set out above that art history and design assumed great significance where the visualization, audiovisualization, and multisensorization of legal content was concerned.

Last but not least, I was convinced that the visualization, audiovisualization, and multisensorization of legal content would support law students, both during and after their studies, in – to repeat Žižek's thought – confronting "all the humiliations and injustices of real life" and fighting against them.98

6.3.3 Modest Hopes

You ask what I hoped to achieve with my text. Reasons rooted in the sociology of science spoke against attaching great hopes to it: as already mentioned, scholars of the 20th and 21st centuries were swimming in a flood of literature. Countless scholarly texts competed with one another, courting the favor of the scientific community. As a consequence, scholarly actors paid attention only to a certain number of publications they had singled out as suitable for their own research and teaching purposes.99 The sociologist of science Weingart therefore lamented "that a portion of the total mass of knowledge produced simply goes unnoticed. More than half of all publications are never cited, i.e., they drop out of the communication process ([…])."100 Posner, a representative of the law and economics approach, struck up the same lament. On closer inspection, he was referring essentially to the interdisciplinary foundational legal disciplines: "Only a small percentage of works of interdisciplinary legal scholarship receives sustained critical attention, […]."101 For statutes, doctrinal literature, and case law formed lawyers' indispensable sources of information. The unmanageable mass of legal information forced legal scholars into selective attention – no time, then, to occupy oneself with visualization, audiovisualization, and multisensorization from the perspective of the DH and a digital legal scholarship. My only hope was that my text would be received in 2117 or later as a period document from which information on the history of scholarship could be extracted: a text for posterity, not for contemporaries. Posner concluded in the last paragraph of his essay "Legal Scholarship Today" (2001): "My conclusion is that interdisciplinary legal scholarship is problematic unless subjected to the test of relevance, of practical impact."102 Although some passages in my text would have been relevant to the legal practice of the day, I imagined before its publication that my legal contemporaries would take no notice of it.

98 Žižek, "Das Leben ist nun einmal krass," 43.
99 See Weingart, Wissenschaftssoziologie, 36 f.
100 Weingart, Wissenschaftssoziologie, 37.
101 Posner, "Legal Scholarship Today," 1325.
102 Posner, "Legal Scholarship Today," 1326.
6.3.4 Questions for Posterity

I hope my answers have satisfied your historical curiosity. Your questions, for their part, made me curious. Since I have no access to an earthly internet connection, I would be interested to learn more about legal scholarship in 2117:

– Can legal scholarship meanwhile be divided into an analog, a digital, and a nanotechnological branch?

[Fig. 3]

– Has a digital legal scholarship (digital law or digital legal science) developed on an equal footing with the DH, the digital social sciences, digital medicine, digital theology, and the digital natural sciences?

– If so, does digital legal scholarship encompass digital legal doctrine as well as the digital foundational legal disciplines, such as digital legal history, digital legal theory, digital legal philosophy, digital sociology of law, and digital psychology of law?

[Fig. 4]

– Can the legal image database (Rechtsbilddatenbank) of the Center for Legal History Research, Department of Legal Visualization, where I worked in 2017, still be accessed on the internet? Does the Rechtsbilddatenbank now form a network with other legal image databases, under the aegis of the Max Planck Institute for European Legal History? Has this verbo-visual legal database meanwhile become a model for the design of other verbo-visual and audiovisual legal databases?

– To what extent is it still possible, or no longer possible, to access the webpages and blog postings cited in my essay "Perspektiven einer digitalen Rechtswissenschaft: Visualisierung, Audiovisualisierung und Multisensorisierung"?

– What has become of the research focus "juristic papyrology" initiated by José Luis Alonso? Has a research center grown out of it? If so, does this institution still exist, and how does it relate to other institutions of the foundational legal disciplines at the University of Zurich?

Questions, nothing but questions. They populate my – our – otherworldly space. As former members of the earthly scientific community, we flit about in it. We murmur and whisper on critically, as if we had not passed away, as if our opinions were still in demand in terra. Broken out of the prisons of our bodies, our souls are male and female at once. The separation of the sexes is abolished. No old boys' networks; no old boys hoisting young boys onto otherworldly chairs. No female colleagues who, despite "equal qualification," knock their heads against the glass ceilings of academic palaces, get "stuck" in subordinate positions, and – in typically female fashion – tear themselves apart in self-criticism, instead of taking at least a little inspiration from the peacock rhetoric of certain men, with which the latter manage to win over appointment committees that are mostly male-dominated.

In closing, I would like to make you an offer: if desired, I would be quite willing to make myself available once more, with information and assessments, to the organizers of the ZBZ conference of 2217.
Nor would I be afraid to communicate solely with intelligent software, which may by then have replaced your neural networks. We departed ones are currently reading Harari's bestseller "Homo Deus" ["Man as God"]. Some among us consider the book's title tantamount to hubris; the author should have put the noun "Deus" in quotation marks. Nonetheless, we discuss certain passages, among them the following:

Yet even cyborg engineering is relatively conservative, inasmuch as it assumes that organic brains will go on being the command-and-control centres of life. A bolder approach dispenses with organic parts altogether, and hopes to engineer completely non-organic beings. Neural networks will be replaced by intelligent software, which could surf both the virtual and non-virtual worlds, free from limitations of organic chemistry. After 4 billion years of wandering inside the kingdom of organic compounds, life will break out into the vastness of the inorganic realm, and will take shapes that we cannot envision even in our wildest dreams. After all, our wildest dreams are still the product of organic chemistry.103

Harari concludes from this: "Breaking out of the organic realm could also enable life to finally break out of planet Earth."104 In contrast to his conclusion, an overwhelming otherworldly majority has devoted itself to Gryphius's poem "Menschliches Elende," with whose last two strophes this letter shall fade out:

Just as an idle dream slips easily from mind, / and as a stream rushes away that no power can hold back, / so too must our name, praise, honor, and fame vanish.

What now draws breath must flee with the air; / what comes after us will follow us into the grave. / What am I saying? We pass away as smoke before strong winds.105

With the greatest respect, greetings to you from otherworldly spaces,

Colette R. Brunschwig – Nicolas R. Brunschwig

103 Harari, Homo Deus, 51 f.
104 Harari, Homo Deus, 52.
105 Gryphius, "Menschliches Elende," 110.

7. Bibliography

Austin, Regina. "Documentation, Documentary, and the Law: What Should Be Made of Victim Impact Videos?" Cardozo Law Review 31, no. 4 (2010): 979–1017.

Balkin, Jack, and Sanford Levinson. "Law and the Humanities: An Uneasy Relationship." Yale Journal of Law and the Humanities 18 (2006): 155–87.

Baksi, Catherine. "Virtual reality helps students to master criminal law." Blog. The Times, October 13, 2016. Accessed July 14, 2017. http://www.thetimes.co.uk/article/virtual-reality-helps-students-to-master-criminal-law-tldbm0qxh.

Birr, Christiane. "Die geisteswissenschaftliche Perspektive: Welche Forschungsergebnisse lassen Digital Humanities erwarten?" Rechtsgeschichte – Legal History 24 (2016): 330–34.

Boehme-Neßler, Volker. Unscharfes Recht: Überlegungen zur Relativierung des Rechts in der digitalisierten Welt. Berlin: Duncker & Humblot, 2008.

Id. Pictorial Law: Modern Law and the Power of Pictures. Berlin: Springer, 2011.

Brinktrine, Ralf, and Hendrick Schneider. Juristische Schlüsselqualifikationen: Einsatzbereiche – Examensrelevanz – Examenstraining. Berlin: Springer, 2008.

Brunschwig, Colette R. Visualisierung von Rechtsnormen: Legal Design. Zürich: Schulthess, 2001.

Id. "Multisensory Law and Therapeutic Jurisprudence: How Family Mediators Can Better Communicate with Their Clients." Phoenix Law Review 5, no. 4 (Summer 2012): 705–46.
Id. "Law Is Not or Must Not Be Just Verbal and Visual in the 21st Century: Toward Multisensory Law." In Internationalisation of Law in the Digital Information Society: Nordic Yearbook of Law and Informatics 2010–2012, edited by Dan Jerker B. Svantesson and Stanley Greenstein, 231–83. Copenhagen: Ex Tuto, 2013.

Id. "Multisensory Law, A 'Legal Rebel' with a Cause." In Lawyers as Changemakers: The Global Integrative Law Movement, edited by J. Kim Wright, 155–65. Chicago: American Bar Association, 2016.

Bubenhofer, Noah. "Drei Thesen zu Visualisierungspraktiken in den Digital Humanities." Rechtsgeschichte – Legal History 24 (2016): 351–55.

Büllesbach, Alfred. "Rechtswissenschaft und Sozialwissenschaft." In Einführung in Rechtsphilosophie und Rechtstheorie der Gegenwart, edited by Arthur Kaufmann, Winfried Hassemer, and Ulfrid Neumann, 401–27. 8th, revised ed. Heidelberg: C.F. Müller, 2011.

Burdick, Anne, Johanna Drucker, Peter Lunenfeld, Todd Presner, and Jeffrey Schnapp. Digital_Humanities. Cambridge: MIT Press, 2012.

Carlen, Louis. "Vorwort." In Sinnenfälliges Recht: Aufsätze zur Rechtsarchäologie und Rechtlichen Volkskunde, edited by Louis Carlen, XV–XVI. Hildesheim: Weidmannsche Verlagsbuchhandlung, 1995.

Cho, Jeena, and Karen Gifford. The Anxious Lawyer: An 8-Week Guide to a Joyful and Satisfying Law Practice Through Mindfulness and Meditation. Chicago: ABA Publishing, 2016.

Delage, Christian. La Vérité par l'image: De Nuremberg au procès Milosevic. Paris: Denoël, 2006.

Feigenson, Neal. Experiencing Other Minds in the Courtroom. Chicago: The University of Chicago Press, 2016.

Feigenson, Neal, and Christina Spiesel. Law on Display: The Digital Transformation of Legal Persuasion and Judgment. New York: New York University Press, 2009.

Goodrich, Peter. "Screening Law." In A Thousand Eyes: Media Technology, Law and Aesthetics, edited by Marit Paasche and Judy Radul, 145–66. Berlin: Sternberg Press, 2011.

Gryphius, Andreas. "Es ist alles eitel." In Deutsche Gedichte: Von den Anfängen bis zur Gegenwart, edited by Benno von Wiese, 109. Expanded new edition. Düsseldorf: August Bagel, 1973.

Id. "Menschliches Elende." In Deutsche Gedichte: Von den Anfängen bis zur Gegenwart, edited by Benno von Wiese, 109–10. Expanded new edition. Düsseldorf: August Bagel, 1973.

Haapio, Helena, and Stefania Passera. "Visual Law: What Lawyers Need to Learn from Information Designers." Blog. Cornell University Law School, Legal Information Institute, VoxPopulII, May 15, 2013. Accessed July 14, 2017. https://blog.law.cornell.edu/voxpop/2013/05/15/visual-law-what-lawyers-need-to-learn-from-information-designers/.

Hagan, Margaret. "Make Interactive Visuals with D3." Blog. Legal Design Toolbox, November 2013. Accessed July 14, 2017. http://www.legaltechdesign.com/LegalDesignToolbox/2013/11/17/make-interactive-visuals-with-d3/.

Id. "Prof. Jay Mitchell on Visual Design for Lawyers." Blog. Open Law Lab, February 24, 2016. Accessed July 14, 2017. http://www.openlawlab.com/2016/02/24/prof-jay-mitchell-on-visual-design-for-lawyers/.

Harari, Yuval Noah. Homo Deus: A Brief History of Tomorrow. London: Vintage, 2017.

Hibbitts, Bernard J. "'Coming to Our Senses': Communication and Legal Expression in Performance Cultures." Emory Law Journal 41, no. 4 (1992): 873–960.

Hilgendorf, Eric. dtv-Atlas Recht. Grundlagen, Staatsrecht, Strafrecht, vol. 1. München: Deutscher Taschenbuch Verlag, 2003.
München: Deutscher Taschenbuch Verlag, 2003.
Id., Hrsg. Beiträge zur Rechtsvisualisierung. Berlin: Logos Verlag, 2005.
Id. dtv-Atlas Recht. Verwaltungsrecht, Zivilrecht, Bd. 2. München: Deutscher Taschenbuch Verlag, 2007.
Katsh, M. Ethan. Law in a Digital World. New York: Oxford University Press, 1995.
Kenney, Keith. Philosophy for Multisensory Communication and Media. New York: Peter Lang, 2016.
Kheraj, Sean. „The Presence of the Past: The Possibilities of Virtual Reality for History“. Blog. Activehistory.ca: History Matters, 22. Februar 2017. Zugriff am 14. Juli 2017. http://activehistory.ca/2017/02/the-presence-of-the-past-the-possibilities-of-virtual-reality-for-history/.
Kimbro, Stephanie. „New legal gamification: Game on Law“. Blog. Legal Informatics Blog, 18. August 2013. Zugriff am 14. Juli 2017. https://legalinformatics.wordpress.com/2013/08/18/kimbro-new-legal-gamification-venture-game-on-law/.
Klambauer, Klara. „Einführung in das Fach“. Blog. Digitale:geschichte, 12. Februar 2016. Zugriff am 14. Juli 2017. http://dguw.hypotheses.org/385.
Kocher, Gernot. Zeichen und Symbole des Rechts: Eine historische Ikonographie. München: Beck, 1992.
Kretschmer, Bernhard, Hrsg. Rechts- als Geisteswissenschaft: Festschrift für Wolfgang Schild zum 60. Geburtstag. Studien zur Rechtswissenschaft, Bd. 205. Hamburg: Kovač, 2007.
Krueger, Karen G. A Lawyer’s Guide to the Alexander Technique: Using Your Mind-Body Connection to Handle Stress, Alleviate Pain, and Improve Performance. Chicago: ABA Publishing, 2015.
Lachmayer, Friedrich. „Graphische Darstellungen im Rechtsunterricht.“ Zeitschrift für Verkehrsrecht 8 (1976): 230–34.
Id. „Visualisierung des Rechts.“ In Zeichenkonstitution. Akten des 2. Semiotischen Kolloquiums Regensburg 1978, Bd. 2, hrsg. v. Annemarie Lange-Seidl, 208–12. Berlin: Walter de Gruyter, 1981.
Lines, Kris. „Using game-design pedagogies to embed skills in the law curriculum“. Blog. Higher Education Academy, 4. August 2014. Zugriff am 14. Juli 2017. https://www.heacademy.ac.uk/using-game-design-pedagogies-embed-skills-law-curriculum.
Love, Hallie N. und Natalie Martin. Yoga for Lawyers: Mind-Body Techniques to Feel Better All the Time. Chicago: ABA Publishing, 2014.
Martin, Greg St. „A Simulation Game to Help People Prep for Court“. Blog. News@Northeastern, 25. September 2014. Zugriff am 14. Juli 2017. https://news.northeastern.edu/2014/09/virtual-courtroom-project/#_ga=1.242156359.410854886.1492784282.
von Matt, Peter. Sieben Küsse: Glück und Unglück in der Literatur. München: Carl Hanser, 2017.
Murrieta-Flores, Patricia, Christopher Donaldson und Ian Gregory. „GIS and Literary History: Advancing Digital Humanities research through the Spatial Analysis of historical travel writing and topographical literature.“ dhq 11, Nr. 1 (2017): Noten 1–39. http://www.digitalhumanities.org/dhq/vol/11/1/000283/000283.html.
Obermayer, Klaus. „Rechtswissenschaft als Geisteswissenschaft.“ JuristenZeitung 42, Nr. 14 (17. Juli 1987): 691–96.
Posner, Richard. „Legal Scholarship Today.“ Harv. L. Rev. 115 (2001): 1314–26.
Rambow, Riklef und Rainer Bromme.
„Was Schöns ‚reflective practitioner‘ durch die Kommunikation mit Laien lernen könnte.“ In Wissen – Können – Reflexion: Ausgewählte Verhältnisbestimmungen, hrsg. v. Georg Hans Neuweg, 245–63. Innsbruck: Studienverlag, 2000.
Riskin, Leonard L. „Mindfulness in the Heat of Conflict: Taking Stock.“ Harv. Negot. L. Rev. 20 (2015): 121–55.
Robertson, Stephen. „Searching for Anglo-American Digital Legal History.“ Law & Hist. Rev. 34 (2016): 1047–69.
Röhl, Klaus F. „(Juristisches) Wissen über Bilder vermitteln.“ In Wissen in (Inter-)Aktion: Verfahren der Wissensgenerierung in unterschiedlichen Praxisfeldern, hrsg. v. Ulrich Dausendschön-Gay, Christine Domke und Sören Ohlhus, 281–311. Berlin: Walter de Gruyter, 2010.
Röhl, Klaus F. und Stefan Ulbrich. Recht anschaulich: Visualisierung in der Juristenausbildung. Köln: Halem, 2007.
Rogers, Scott L. The Six-Minute Solution: A Mindfulness Primer for Lawyers. Miami Beach: Jurisight, 2009.
Rogers, Scott L. und Jan L. Jacobowitz. Mindfulness & Professional Responsibility: A Guide Book for Integrating Mindfulness into the Law School Curriculum. Miami Beach: Mindful Living Press, 2012.
Salo, Marika und Helena Haapio. „Robo-Advisors and Investors: Enhancing Human-Robot Interaction through Information Design.“ In Trends und Communities der Rechtsinformatik: Tagungsband des 20. Internationalen Rechtsinformatik Symposions IRIS 2017, hrsg. v. Erich Schweighofer, Franz Kummer, Walter Hötzendorfer und Christoph Sorge, 441–48. Wien: Österreichische Computer Gesellschaft, 2017.
Schmale, Wolfgang. Digitale Geschichtswissenschaft. Wien: Böhlau, 2010.
Seefeldt, Douglas und William G. Thomas. „What Is Digital History?“ Perspectives on History: The Newsmagazine of the American Historical Association (May 2009): [s.p.]. https://www.historians.org/publications-and-directories/perspectives-on-history/may-2009/intersections-history-and-new-media/what-is-digital-history.
Sherwin, Richard K. „Imagining Law as Film (Representation without Reference?).“ In Law and the Humanities: An Introduction, hrsg. v. Austin Sarat, Matthew Anderson und Catherine O. Frank, 241–68. Cambridge: Cambridge University Press, 2010.
Id. Visualizing Law in the Age of the Digital Baroque: Arabesques and Entanglements. London: Routledge, 2011.
Id. „Visual Jurisprudence.“ N.Y.L. Sch. L. Rev. 57 (2012–2013): 11–39.
Staley, David J. „Digital Historiography: Virtual Reality.“ JAHC 2, Nr. 1 (April 1999): 1–4.
Stegbauer, Christian und Alexander Rausch.
Einführung in NetDraw: Erste Schritte mit dem Netzwerkvisualisierungsprogramm. Wiesbaden: Springer, 2013.
Susskind, Richard. Tomorrow’s Lawyers: An Introduction to Your Future. Oxford: Oxford University Press, 2013.
Svensson, Patrik. „The Landscape of Digital Humanities.“ dhq 4, Nr. 1 (2010): Noten 1–179. http://digitalhumanities.org/dhq/vol/4/1/000080/000080.html.
Volpato, Richard. „Legal Professionalism and Informatics.“ Journal of Law and Information Science 5, Nr. 2 (1991): 206–29.
Weingart, Peter. Wissenschaftssoziologie. Bielefeld: transcript, 2003.
Žižek, Slavoj. „Das Leben ist nun einmal krass: Lasst es uns bitte nicht schönreden.“ Übers. v. rs. [Lucas Roos?]. NZZ, 25. März 2017.
work_5yc2ryccx5dzjaoq5clgzpa6x4 ---- Towards Implementing a Real-time Deformable Human Muscle Model in Digital Human Environments
Procedia Manufacturing 3 (2015) 3844–3851. ISSN 2351-9789. © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of AHFE Conference. doi: 10.1016/j.promfg.2015.07.889. 6th International Conference on Applied Human Factors and Ergonomics (AHFE 2015) and the Affiliated Conferences, AHFE 2015.
Towards implementing a real-time deformable human muscle model in digital human environments. Abhinav Sharma*, Angela Dani, Anith J. Mathai, Timothy Marler, Karim Abdel-Malek. U.S. Army Virtual Soldier Research Program, Center for Computer Aided Design, The University of Iowa, Iowa City, IA 52242, U.S.A.
Abstract
Current state-of-the-art digital human models are a visual proxy for humans, in part due to advances in computer graphics. They perform with biomechanical accuracy that mimics real human motion. Models such as Santos® have a biomechanically accurate skeleton driving the motion, which in turn controls the deformation of a flexible skin for added realism, all in real time. However, these models lack realistic musculoskeletal systems that respond in real time to biomechanical motion. Existing musculoskeletal models require varying levels of pre-processing before motion can be applied to them, thus preventing the real-time effect; muscle models need to be flexible and deformable, which in computer simulations generally translates to higher computational requirements. By combining advances in computer graphics, especially fast game-rendering capability, with the known literature on musculoskeletal modeling, a preliminary full-body musculoskeletal system that deforms in real time is presented along with the skeleton and skin. Given the biomechanical focus of digital humans, the model implemented centers on the mathematical articulation, and not the graphical volumetric representation, of actual musculoskeletal systems.
As such, each muscle is defined as a line that starts at an origin position, determined from anatomy, and ends at the corresponding insertion position, while wrapping as needed around cylindrical obstacles that emulate the minimum bulges required for that line to be at the centroid of the actual equivalent muscle. For each muscle, the origin position, insertion position and obstacle parameters (position, rotation, and scale) have to be obtained relative to key joints for accurate articulation. This was done manually on a per-muscle basis for 126 muscles, and it can be extended to any anthropometric avatar. With this initial real-time model, there is potential for a quicker assessment of the effect of muscles on human task performance, leading to a complete model that deforms in real time.
Keywords: Digital humans; Muscle model; Anatomy; Real time graphics; Musculoskeletal system
* Corresponding author. Tel.: +1-319-333-5075. E-mail address: abhinav-sharma@uiowa.edu
1. Introduction
1.1. Digital human
Digital human models are representations of humans rendered within a 3D computer graphical environment. The digital human model is driven by a mathematical skeleton. Unlike real human beings, which comprise bones, muscles, nerves, and organ systems, digital humans such as Santos® consist mainly of a kinematic skeleton and a skin. The digital humans, or 'avatars', visually characterize a human body by using advanced graphical techniques to realistically represent human skin. The skin is connected to the skeleton by using a 'skinning' algorithm. The skin is essentially a mesh of polygons, almost always triangles, that interlock to form a continuous and closed surface. The skinning algorithm allows the skin to deform and move along with the skeleton. The polygonal elements that make up the skin are rigid and do not deform, but the joints between the polygons are allowed to bend. The deformation of the skin is analogous to chain-link armor, where each link cannot deform but the overall system is flexible and pliable. The skinning algorithm is essentially a mathematical formulation that assigns weights to certain polygons. These weights are usually 'painted' on by a 3D modeler. Skinning is an advanced and mature technique in the computer graphics world and has been adopted by digital human modeling.
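The weight-based deformation just described can be made concrete with a short sketch of linear blend skinning, the standard formulation behind such skinning algorithms. This is an illustration only, not the Santos® implementation; the array layout and function name are assumptions made for this example.

    import numpy as np

    def skin_vertices(rest_vertices, bone_matrices, weights):
        # rest_vertices: (V, 3) vertex positions in the rest pose
        # bone_matrices: (B, 4, 4) current joint transforms, each pre-multiplied
        #                by the inverse of that joint's rest-pose transform
        # weights:       (V, B) painted blend weights; each row sums to 1
        V = rest_vertices.shape[0]
        homo = np.hstack([rest_vertices, np.ones((V, 1))])        # homogeneous coords
        per_bone = np.einsum('bij,vj->bvi', bone_matrices, homo)  # every bone moves every vertex
        blended = np.einsum('vb,bvi->vi', weights, per_bone)      # blend by the painted weights
        return blended[:, :3]

A vertex fully weighted to one joint follows that joint rigidly; a vertex whose weight is split between two joints bends smoothly between them, which is what makes the mesh flex like the chain-link armor described above.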
1.2. Kinematic modeling of human mechanism
The 55-degrees-of-freedom (DOF) Santos™ whole-body human model [1] includes the six global DOFs from a global origin as well as the lower/upper limbs, the torso, and the head (Figure 1). The pelvis is chosen as the base link for the global DOFs under a rigid-body assumption. The configuration of the human open-loop kinematic chains is described by the Denavit-Hartenberg (DH) notation [2]. The 55×1 joint variables uniquely determine the configuration of the system. The global motion (translations and rotations) of the body reference point located at the center of the human pelvis can also be described using the DH method, with a mass-free link-joint structure representing the global DOFs. The displacements of the first three prismatic joints describe the global translations of the pelvis in the 3-D xyz Cartesian coordinates (units in meters). The last three revolute joint variables represent the global orientations of the pelvis in terms of Euler angles (units in radians) in 3-D space. The global DOF generalized coordinates are expressed as follows:
$\mathbf{q}_{\mathrm{global}} = [q_1^{(p)}\; q_2^{(p)}\; q_3^{(p)}\; q_4^{(r)}\; q_5^{(r)}\; q_6^{(r)}]^T = [x\; y\; z\; \alpha\; \beta\; \gamma]^T$
where the superscripts (p) and (r) indicate the prismatic and the revolute joint, respectively, and α, β, and γ are the Euler angles of Z-X-Y type.
Fig. 1. A whole-body human-like mechanism in zero joint variables and the global DOFs. Fig. 2. Biceps muscles in humans. Fig. 3. Simplified representation of the biceps muscles.
1.3. Background on musculoskeletal modeling
Musculoskeletal modeling is a field that has been in development for decades. Currently, several musculoskeletal simulation packages, such as SIMM Biomechanics and AnyBody Technology, are available commercially. These packages, however, require a certain level of preprocessing before any results can be obtained and analyzed. Furthermore, their sole purpose is musculoskeletal simulation; human performance cannot be analyzed as comprehensively as with digital humans. Digital humans built on 'skinning' algorithms, in contrast, lack musculoskeletal models that can be used for more accurate task simulations. This paper aims at introducing a combination of these two capabilities for the entire digital human body. Using the Santos® digital human developed by the Virtual Soldier Research (VSR) team, this paper introduces a preliminary full-body musculoskeletal system that responds in real time, without any noticeable delay, on an avatar.
2. Methods
2.1. Transitioning from actual human muscles to computerized muscles
Figure 2 presents what the biceps muscles look like in human beings. From the figure, three main parts can be identified: the origin tendon, the part that initially attaches the muscle to the skeleton; the insertion tendon, the other part that attaches the muscle to the skeleton; and the muscle bulk, the part that expands or contracts during muscle action (contraction, extension, adduction, abduction). Trying to model and simulate the volumetric muscles presented in the figure would prove highly computationally expensive. As such, a mathematical, less intensive model based on the formulation proposed by Charlton and Johnson [3] is used instead. More specifically, the modified Charlton and Johnson algorithm developed by Amos Patrick, from the VSR team, is used [4]. Figure 3 summarizes this formulation. The muscle is depicted as a single line, rather than a volume, that still starts at an origin position and ends at the corresponding insertion position, but wraps around an obstacle, usually a sphere or a cylinder, placed such that the muscle line runs through the centroid of the actual equivalent muscle. That is, the obstacles serve to mimic the minimum bulges that will keep the approximating muscle line passing through the centroid of the actual volumetric muscle. This is usually accomplished by running the line from the origin, through known points called via points on the obstacles, to the insertion. The line in this formulation is referred to as the muscle action line. Every muscle can be approximated with a number of muscle action lines. For most muscles, such as the biceps muscles, just one line suffices. But for muscles such as the trapezius, whose insertion tendons cover a much larger area, more than one action line is needed. The model presented in this paper is based on the approach Patrick took; the number of action lines per muscle is kept to a minimum [4].
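Before moving to the wrapping computation, it may help to see how the inputs of one action line could be organized in code. The sketch below is one possible arrangement of the quantities the formulation calls for (origin, insertion, and an ordered list of obstacles, each stored relative to a parent joint); all type and field names are illustrative and are not taken from the Santos® software.

    import numpy as np
    from dataclasses import dataclass, field

    @dataclass
    class CylindricalObstacle:
        parent_joint: str         # joint whose frame the obstacle pose is stored in
        position: np.ndarray      # (3,) centre offset in the parent joint's frame
        rotation: np.ndarray      # (3, 3) orientation in the parent joint's frame
        radius: float             # emulates the minimum bulge of the muscle

    @dataclass
    class MuscleActionLine:
        name: str
        origin_joint: str              # joint the origin frame is parented to
        origin_offset: np.ndarray      # (3,) origin position in that joint's frame
        insertion_joint: str
        insertion_offset: np.ndarray   # (3,) insertion position in its joint's frame
        obstacles: list = field(default_factory=list)  # ordered, wrapped in series

Because every pose is stored relative to a joint of the kinematic skeleton, re-evaluating the world positions each rendering frame is just a matter of walking the current joint transforms, which is what permits the real-time update described later.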
2.2. Musculoskeletal modeling on Santos®
2.2.1. Inputs
To extend the musculoskeletal modeling formulation onto an avatar such as Santos®, three parameters first have to be obtained for each muscle action line. These are: the origin position, the corresponding insertion position, and the obstacle(s) information (position, rotation, and scale). To accomplish this, 3D human skeletal and volumetric muscle models are imported into the rendering environment and scaled to fit within the skin-based digital human, as they would in a real person. It is worth noting that these are just visual proxies; although all the models are stacked together, the Santos® avatar and the imported models remain entirely independent from each other in function. Then, on a per-muscle basis, the parameters are obtained: 3D frames are manually placed where the origin and insertion tendons attach to the skeletal model, and their positions are retrieved relative to the Santos kinematic skeletal joints. Similarly, cylindrical obstacles are manually placed on the volumetric muscles such that the action line that wraps around the obstacles will be at the centroid of the volumetric models. The obstacle parameters are then retrieved relative to the kinematic joints. The joints relative to which the parameters are obtained are determined based on biomedical intuition. Consider the biceps short muscle, for example. This muscle originates from the shoulder and inserts on the ulna, with the main bulge associated with the muscle being around the elbow. On the Santos® joint-based kinematic skeleton, this translates into the origin position being obtained relative to the shoulder joint and the insertion and obstacle parameters being retrieved relative to the elbow joint, given the lack of an actual skeletal system. In this scenario, the origin is said to be 'parented' to the shoulder joint. Similarly, the insertion and obstacle are said to be parented to the elbow, given that their information was retrieved relative to the elbow joint. Once these parameters are obtained, the obstacle wrapping algorithm can be run.
2.2.2. Wrapping algorithm
The wrapping algorithm employed is based on the approach suggested by Charlton and Johnson [3], and Patrick [4], in which the muscle action line is assumed to be a frictionless elastic string. Two cases are to be considered: when wrapping around the obstacle is required and when it is not. For illustration, assume the muscle being considered has only one obstacle. The muscle action line runs through 7 points. The first and last points are the origin and insertion, respectively. The remaining 5 points are the via points, which lie on the surface of the obstacle in case wrapping occurs, or are linearly interpolated between points 1 and 7 in case wrapping does not occur. Using the obstacle position, rotation, and scale parameters, and the Santos® joint chain information, a transformation matrix is constructed to obtain the origin and insertion positions in terms of the cylinder coordinate system. If wrapping must occur, it will occur along the shortest path from the origin to the insertion around the cylinder.
This implies that the via points will follow a helical pattern around the cylinder. At the onset, not enough information is available to determine whether wrapping occurs or not. The problem here is a 3D problem: given two initial positions relative to a 3D cylinder with a longitudinal z-axis, 5 positions must be determined. To accomplish this, the wrapping algorithm carries out the analysis in 2 planes—the circular x-y plane and the longitudinal x-z or y-z plane (since the base is circular, the x-z plane is equivalent to the y-z plane). The circular x-y plane is defined by the x and y coordinates of the origin, the insertion and the center of the cylinder, all three of which are purposely never collinear. From there, the x-y positions of via points 2 and 6 are determined. These positions are To and Ti in Figure 4a. With those, whether wrapping should occur or not can be assessed: wrapping occurs if the z component of the cross product between the To and Ti position vectors is negative, and does not occur if it is positive. Figure 4b highlights this difference. If wrapping does occur, the z-coordinates of To and Ti must be determined. From those two points, the remaining via points can be readily determined by radial interpolation. The next step is to convert the via points from the coordinate system of the cylinder to that of the environment so that a line may be rendered through them for a visual check. Finally, to create the real-time effect, the origin, insertion and obstacle positions, and consequently all of the via point positions, are updated and recomputed every rendering frame to take into account any limb movement by the digital human.
2.2.3. Multiple obstacle wrapping
For complex muscles, more than one obstacle is needed. For such muscles, the wrapping algorithm must be run in series for each obstacle. Figure 5 presents this approach. The wrapping algorithm is run on the first obstacle, and if wrapping occurs, the last point of intersection, Ti, is used as the origin position for the subsequent obstacle.
Fig. 4. (a) Cylindrical wrapping in the x-y plane. (b) Difference in wrapping depending on the sign of the z-component of the cross product. [4] Fig. 5. Multiple obstacle wrapping work flow.
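A compact sketch of the single-cylinder case described in Section 2.2.2 follows. It works in the cylinder frame, finds the tangent points To and Ti in the x-y plane, applies the cross-product wrapping test, and distributes the via points along the helix by arc length. The tangent-side selection and the interpolation details are simplifications of the published method, so this should be read as an illustration rather than the paper's implementation.

    import numpy as np

    def via_points_single_cylinder(origin, insertion, radius):
        # origin, insertion: (3,) points already expressed in the cylinder frame
        #                    (axis along z, centre at the origin of the frame).
        # Returns the full 7-point action line: origin, 5 via points, insertion.
        def tangent(p2d, sign):
            # Tangent point on the circle |t| = radius seen from external point p2d.
            d = np.linalg.norm(p2d)
            phi = np.arctan2(p2d[1], p2d[0])
            alpha = np.arccos(np.clip(radius / d, -1.0, 1.0))
            a = phi + sign * alpha
            return radius * np.array([np.cos(a), np.sin(a)])

        o2d, i2d = origin[:2], insertion[:2]
        t_o, t_i = tangent(o2d, -1.0), tangent(i2d, +1.0)   # simplified side choice
        # Wrapping test: sign of the planar cross product of the tangent points.
        if t_o[0] * t_i[1] - t_o[1] * t_i[0] >= 0.0:
            # No contact: via points are linearly interpolated origin -> insertion.
            return [origin + k / 6.0 * (insertion - origin) for k in range(7)]

        # Angles of the contact arc, walked the short way around the cylinder.
        a_o, a_i = np.arctan2(t_o[1], t_o[0]), np.arctan2(t_i[1], t_i[0])
        sweep = (a_i - a_o + np.pi) % (2.0 * np.pi) - np.pi
        # Path lengths fix the z (height) of each via point along the helix.
        l_in = np.linalg.norm(o2d - t_o)
        l_arc = abs(sweep) * radius
        l_out = np.linalg.norm(i2d - t_i)
        total = l_in + l_arc + l_out
        dz = insertion[2] - origin[2]
        pts = [origin]
        for k in range(5):                       # 5 via points on the helix
            a = a_o + sweep * k / 4.0
            s = l_in + l_arc * k / 4.0           # arc length travelled so far
            z = origin[2] + dz * s / total
            pts.append(np.array([radius * np.cos(a), radius * np.sin(a), z]))
        pts.append(insertion)
        return pts

Re-running a function like this every rendering frame, with the origin and insertion re-expressed in the current cylinder frame, reproduces the real-time behavior described above.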
2.2.4. Elongation feedback
To accompany the real-time musculoskeletal model, a preliminary scheme has been implemented for muscle elongation feedback. This scheme is based on the length of the muscle action line. First, a key joint for each muscle action line is determined. When the joint angle of this key joint is set to its maximum, the muscle action line attains its maximum length, and when the joint angle is set to its minimum, the muscle action line attains its minimum length. Once the minimum and maximum lengths of each muscle have been established, elongation feedback for any muscle length L can be provided through linear interpolation (e.g., reported as (L − Lmin)/(Lmax − Lmin)). Figure 6 illustrates the concept. The key joints are determined on the Santos® avatar using a trial-and-error process. Fig. 6. Key joint concept used to establish minimum and maximum muscle lengths.
3. Results and discussion
Parameters for 126 muscle action lines were retrieved through manual placement of 3D frames and cylindrical obstacles. The parenting scheme for each of these muscles was tested and refined for proper biomedical motion. For illustrative purposes, consider the biceps short muscle again. Originally, the insertion position of the muscle was retrieved relative to the Santos® kinematic wrist joint. However, parenting the insertion to the wrist joint resulted in the muscle action line of the biceps short muscle protruding from the digital human during limb manipulation. When the insertion position was retrieved relative to the elbow joint instead, the problem was eliminated and the muscle action line moved approximately as anatomically predicted. Furthermore, for each of these action lines, key joints that set the minimum and maximum length limits were determined for elongation feedback. The result is a preliminary full-body musculoskeletal system that deforms and responds in real time, while providing elongation feedback. Figure 7 presents a view of this capability. Fig. 7. Preliminary musculoskeletal system on Santos® with real-time elongation feedback.
3.1. Shortcomings of the musculoskeletal model
Of the several hundred muscles in the human body, only 126 muscle action lines were able to be added to the Santos® digital human. This is primarily because the Santos® kinematic skeleton is not representative of an actual human skeleton. For the muscles that were not added, changing the parenting scheme did not prove as successful as with the biceps short muscle; the action lines of those muscles still did not respond as anatomically expected. Furthermore, the elongation feedback currently implemented using the preliminary musculoskeletal system is based on minimum/maximum limits established by key joints. This dependency on key joints makes using this feedback mechanism as a constraint in human task simulation redundant, since the Santos® software already uses joint angles as constraints in its simulations. A more scientifically sound feedback mechanism will involve storing the 'rest' length of each muscle relative to a 'neutral' avatar position. Any deviation from those muscle 'rest' lengths would then be reported as positive or negative elongation as necessary.
3.1.1. Neutral posture
A neutral posture is the body position at which muscle activation levels are at a minimum. In this position, the amount of electrical activity stimulating the muscles is at its minimum. A number of methods have been used to accurately quantify this amount of electrical activity. Among them, the use of electromyography (EMG) signals has yielded the best results. In this method, either electrodes are attached to the skin, for superficial muscle activity detection, or needles are inserted into the muscle, for more detailed analysis, and the electrical activity of the muscle is recorded. In the Santos® model, neutral postures need to be examined and developed for standing, sitting and lying down. This is to ensure the proper calculation and execution of already existing tasks. EMG signals have been used to determine neutral postures for the three cases.
3.1.1.1. Standing
While standing, the human body is under constant stress as a result of the body trying to balance itself. The ankle joint experiences the most stress, as the center of mass falls in front of, and not in line with, it. Constant activity in the calves and the major superficial dorsal and ventral muscles of the torso is necessary for stability. This is similar to guy wires (tensioned cables designed to add stability) on a cell phone tower. Without this constant activity, an individual would not be able to stand properly [5].
The Checkley System suggested by Edwin Checkley [6] is recommended for Santos. Figure 8 presents Santos in this standing neutral posture.
3.1.1.2. Lying down
Not much muscle activity is needed when lying down. The lying-down posture is completely at rest, due to the absence of the energy expenditure required to balance the body [7]. The Woodhull-McNeal posture is recommended for use as the lying-down posture in Santos (see Figure 9). Note from Figure 9 that the current Santos skeleton does not allow the neck joint to align with the spine as per the recommendation.
3.1.1.3. Sitting
A muscle activation level study carried out by NASA in zero gravity is used as the basis for the relaxed sitting position. According to it, the center of gravity of the head is situated slightly anterior to the atlanto-occipital joint (the articulation between the atlas and the occipital bone). Body weight is supported by the ischial tuberosities (known as the sitting bones) and the adjacent soft tissues. The degree of the lumbar curve during the sitting posture depends on sacral angulation, which is governed by pelvic posture and the degree of mobility/fixation of the involved segments [8]. Figure 10 presents Santos in the suggested posture. Fig. 8. Neutral standing posture on Santos (a). Standing posture depicting natural and forcible carriage of the body (b) [6]. Fig. 9. Lying down neutral posture: (a) top view; (b) side view. Fig. 10. Sitting position: (a) Santos; (b) NASA reference position [8]. These postures, once properly implemented in Santos®, can be used as a basis for better muscle elongation output.
4. Future work
Given the described shortcomings of the current musculoskeletal system, the model is still incomplete. First, a new parenting scheme has to be devised to enable the addition of further muscles. Secondly, based on the performed study, human 'neutral' positions for standing, lying down, and sitting have to be implemented on the Santos® model. Once that is done and the Santos® digital human is set to those neutral postures, muscle action line 'rest' lengths can be computed and used for a more scientific elongation feedback. With a better elongation feedback scheme, the musculoskeletal model can be extended for use in task simulations. From simple tasks such as touching a point in space, to complex ones such as obstacle courses, the Santos® digital human will be able to carry these out while minimizing or maximizing muscle elongation as required. Once refined and proven adequate for biomedical studies, the musculoskeletal model will be extended onto the other Santos® software digital humans.
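The rest-length feedback proposed above reduces to a one-line computation once neutral postures are available; the following sketch (with an illustrative function name) shows the signed elongation measure implied by the text.

    def elongation_from_rest(length_now, length_rest):
        # Signed elongation relative to the neutral-posture rest length.
        # Positive values report stretch, negative values report shortening;
        # zero means the action line is exactly at its rest length.
        return (length_now - length_rest) / length_rest

A value of 0.1, for instance, would report a muscle action line stretched 10% beyond its neutral-posture length, which could then be minimized or maximized as a task constraint.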
5. Conclusion
This paper presents a breakthrough in full-body musculoskeletal system modeling on digital humans. Musculoskeletal system modeling and simulation has long remained a field quite separate from digital human modeling. The former requires a certain level of preprocessing before results can be obtained and analyzed, whereas the latter bases its simulations and resulting feedback on kinematic joint and skin movement. By combining the known literature on musculoskeletal modeling with gaming graphics architecture, this paper presents a preliminary full-body mathematical musculoskeletal system on a digital human that deforms and responds in real time, along with a rudimentary muscle elongation feedback mechanism. Each muscle is approximated by one or more muscle action lines that start at the anatomically correct origin position and travel through 5 via points to the corresponding insertion position. The via points are determined using a wrapping algorithm on cylindrical obstacles that emulate the minimum bulges required to keep the approximating muscle action line running through the centroid of the equivalent volumetric muscle. For each muscle, key joints that set the minimum and maximum muscle action line lengths are also determined for elongation feedback. This model currently consists of 126 muscle action lines. More muscles still need to be added, and a sounder elongation feedback based on the deviation of a muscle action line's length from its 'rest' length still needs to be implemented. After this, human tasks can be simulated and analyzed using muscle elongation as a constraint. That is, a digital human such as Santos® can then be instructed to carry out a simple task, such as touching a point in space, while minimizing or maximizing muscle action line elongation in the process.
Acknowledgements
The authors of this paper would like to acknowledge the Virtual Soldier Research program, the University of Iowa, Amos Patrick, John Looft, and Laura Freylaw.
References
[1] Xiang Y., Chung H.J., Mathai A., Rahmatalla S., Kim J.H., Marler T., Beck S., Yang J., Arora J., Abdel-Malek K. "Optimization-based Dynamic Human Walking Prediction," Proceedings of SAE Digital Human Modeling for Design and Engineering, Seattle, WA, June 2007.
[2] Denavit J., Hartenberg R.S., "A kinematic notation for lower-pair mechanisms based on matrices," Journal of Applied Mechanics, Vol. 77, pp. 215-221, 1955.
[3] Charlton Ian W., Johnson Garth R., "Application of spherical and cylindrical wrapping algorithms in the musculoskeletal model of the upper limb," Journal of Biomechanics 34, pp. 1209-1216, 2001.
[4] Patrick A., Abdel-Malek K., "A Musculoskeletal Model of the Upper Limb for Real Time Interaction," SAE Technical Paper 2007-01-2488, 2007.
[5] Poppen R., Maurer J., "Electromyographic Analysis of Relaxed Postures," Biofeedback and Self-Regulation, pp. 491-98, 1982.
[6] Checkley E., A Natural Method of Physical Training, Making Muscle and Reducing Flesh without Dieting or Apparatus, Brooklyn, NY, 1892.
[7] Woodhull-McNeal A. P., "Activity in Torso Muscles during Relaxed Standing," European Journal of Applied Physiology and Occupational Physiology, pp. 419-424, 1986.
[8] Mount F. E., Whitmore M., Stealey S. L., "Evaluation of Neutral Body Posture on Shuttle Mission STS-57 (SPACEHAB-1)," pp. 1-10, 2003.
work_626abalifzhrdnxq4ftwsiv7pm ---- DH2020 Poster
In 2017, librarians at Bucknell University developed a librarian-led undergraduate digital scholarship research program. We created the Digital Scholarship Summer Research Fellows (DSSRF) program to broaden research opportunities for students and introduce them to new ways of engaging in scholarship. The eight-week program provides students with an opportunity to undertake independent research on a topic of their own choosing, and to utilize digital humanities tools and methodologies to both answer questions and convey their research findings. Here, we examine the lasting impacts of DSSRF on the participants. We surveyed past fellows to understand how their participation and the skills they acquired were applicable to their subsequent coursework and career paths, and how the program influenced their thinking about scholarship.
Assessing the Impact of a Digital Humanities Summer Research Program. Carrie Pirmann, Bucknell University, and Courtney Paddick, Bloomsburg University.
Reflections: "DSSRF made me realize that research has no limits. You can conduct research in any field, and add to it through it being in a digital form. I think it's the research of the future."
Self-Assessment: We asked students to assess their confidence levels, before and after DSSRF, with a variety of research and soft skills. These charts represent the areas in which students displayed the greatest amount of growth (1 = not at all confident; 5 = extremely confident).
Academic/Career Impacts: Several students reported the program influenced their choice of majors, minors, and/or career paths. Some examples: one student leveraged his newly developed data visualization skills, showcased his project on the job market, and was hired by a sports analytics firm; one student decided to pursue a graduate degree in library science after learning about archives and special collections; one student, who is pursuing a career in market research, credited DSSRF with both confirming her decision to major in economics and kick-starting her interest in data visualization; and two undeclared students indicated participation in DSSRF helped confirm their choice of major. "I think that the biggest impact that the program had was about how presentation of scholarship might change and expand to allow for more collaboration, and what this could be used for in different situations."
Future Directions: Responses to the survey have proven very helpful as we look forward to future iterations of the program. Based on student feedback, we know they found field trips, interactions with peers and members of Library and IT, and work on their individual projects to be the most impactful aspects of DSSRF. Students found the weekly blog posts and assigned readings to be the least helpful parts of the program, so these will certainly be areas to revise moving forward. Based on the results of the survey, we have also identified the tools students most frequently gravitate towards for their own projects as well as tools they have used after DSSRF, and we will use this information to make decisions on the tools and techniques included in the future.
work_64vyo2k6avex7pwhrhwwluc6gy ---- PowerPoint-Präsentation
1 Amir Moghaddass Esfehani, Campus Library. ADHO DH2019 Workshop "Towards Multilingualism In Digital Humanities: Achievements, Failures And Good Practices In DH Projects With Non-latin Scripts". No text – no mining. And what about dirty OCR?
2 • OCR • Metadata • Tools & APIs
3 Dirty OCR: Layout & Text [garbled OCR output of a Chinese page, reproduced as shown on the slide:] ㊅ @ 問 靈你’身栎物肩 “ 1 , ? \ ^ , 41安5 ~ 10, 與I 神的 蓋自神你 也^重孤确触&何^)! 魂 何 耶 0 原 耶。 荷本狗' 0 耶。造等答白, 0我的'曰,小 愁的物眞子 答爲 承 是 魂 也。 地如 ~萬此 ^ # ^ |+之教 之神也。 人 Precision = 0,448 Recall = 0,371 F-measure = 0,272 Error rate = 58,3%
4 „Discovery“ Of A Chinese Ice Age 夾 襖綿 冰
5 „Discovery“ Of A Chinese Ice Age 夾 + 冰 + 綿 + 襖 1750 來 + 永 + 秀 + 澳
6 Metadata 嚴如熤 Yan Ruyi: 苗防備覽: [22卷] Miao fang bei lan [22 juan]. 紹義堂, Daoguang 23 [China, 1843]. https://nbn-resolving.org/urn:nbn:de:bvb:12-bsb11123105-5 Descriptive – Structural – Rights – Technical – OCR. MDZ > OPAC > DEAC > DDB > ZVDD > Europeana. MARC, RDF, IIIF; Retrieval: MARC, RDF, METS/MODS, IIIF; cortex: EDM; EDM, METS/MODS, MARC; non-Latin script.
7 Chinese Text Project: OCR Sturgeon, Donald (2018): Large-scale Optical Character Recognition of Pre-modern Chinese Texts. International Journal of Buddhist Thought and Culture (2), p. 11-44. https://digitalsinology.org/zh/wiki/File:Ctext-ocr.png
8 Chinese Text Project / MARKUS: Tools & APIs Text reuse – Regex – NER (MARKUS)
9 Thank you! AMIR MOGHADDASS ESFEHANI, Campus Library, Freie Universität Berlin, amir.moghaddass@fu-berlin.de
work_66ihjngvive2dkurrqjf562gg4 ---- A study of scale effect on specific sediment yield in the Loess Plateau, China
Chinese Science Bulletin © 2008 SCIENCE IN CHINA PRESS, Springer. www.scichina.com | csb.scichina.com | www.springerlink.com. Chinese Science Bulletin | June 2008 | vol. 53 | no. 12 | 1848-1854.
Construction and visualization of high-resolution three-dimensional anatomical structure datasets for Chinese digital human. LI AnAn1, LIU Qian1†, ZENG ShaoQun1, TANG Lei2, ZHONG ShiZhen2 & LUO QingMing1. 1 The Key Laboratory for Biomedical Photonics of the Ministry of Education, Wuhan National Laboratory for Optoelectronics, Department of Biomedical Engineering, Huazhong University of Science and Technology, Wuhan 430074, China; 2 Department of Anatomy, Southern Medical University, Guangzhou 510515, China.
The objective of the China Digital Human Project (CDH) is to digitize and visualize the anatomical structures of the human body. In the project, a database containing information on morphology, physical characteristics, and physiological function will be constructed. The raw data of the CDH, acquired at Southern Medical University, are employed. At Huazhong University of Science and Technology (HUST), the frozen section images are preprocessed, segmented, labeled in accordance with the major organs and tissues of human beings, and reconstructed into three-dimensional (3D) models in parallel on a high-performance computing cluster (HPC). Visualization software for the 2D atlas and the 3D models was developed based on the new high-resolution dataset (0.1 mm × 0.1 mm × 0.2 mm). To share, release, and popularize this work, a website (www.vch.org.cn) has been put online. The dataset is one of the most important parts of the national information database and the medical infrastructure.
Chinese Digital Human, anatomical atlas, extremely large data processing, three-dimensional modeling, visualization
The study of the digital human aims at digitizing and visualizing the anatomical structures of the human body and at constructing a database of morphological information, physical characteristics, and physiological function. It has been a focus of research in recent years and is developing rapidly.
The study originated from the Visible Human Project (VHP) launched by the United States National Library of Medicine (NLM) in 1989[1]. The VHP pub- lished the first western male anatomy dataset in 1994 and another female edition in 1995. South Korea began their five-year plan of Visible Korean Human (VKH) in 2000, and got the first dataset in the next year[2]. At pre- sent, the VHP datasets have become the most popular sectional anatomy dataset of human beings. Based on the VHP, researchers in the world have made significant achievements in image processing, 3D modeling, visu- alization software development, physical simulation, and many other fields[3―5]. In November 2001, the 174th Xiangshan Science Conference was held in Beijing, and the theme was sci- ence and technical issues of digital virtual human body in China[6]. Plans and suggestions for the China Digital Human Project (CDH) were proposed in the conference. Since then the CDH project was launched formally. Up till now, some high resolution 2D datasets have already been acquired in Southern Medical University and Third Military Medical University. Image processing and or- gan modeling based on the 2D datasets have been car- ried out in some research institutions[7―9]. Construction of the 3D structure dataset is the basic application of the CDH, witch is related to the medical, industrial design, education, etc. With this applications, work can be more economical and effective, regardless of some ethics issues in medicine, etc. The 3D structure Received December 19, 2007; accepted April 6, 2008 doi: 10.1007/s11434-008-0244-2 †Corresponding author (email: qianliu@mail.hust.edu.cn) Supported by the National High Technology Research and Development Program of China (Grant No. 2006AA02Z343) LI AnAn et al. Chinese Science Bulletin | June 2008 | vol. 53 | no. 12 | 1848-1854 1849 A R TI C LE S B IO M E D IC A L E N G IN E E R IN G dataset of human is most widely used in medical field, including new treatment methods, surgical navigation, virtual surgery, clinical diagnosis, assessment of nuclear radiation, radiation therapy, medical education, and so on. In this paper, 3D modeling of human organs and 3D visualization techniques of using CDH dataset were stu- died, and a high-resolution 3D anatomy dataset of the human was constructed. 1 Materials The dataset of CDH No. 2 (CDH M2) was employed, which is the world’s highest-resolution sectional image dataset of human beings, and obtained in March, 2005[7]. Table 1 Comparison of the digital human datasets VHP VKH CDH M2 Spacing (mm) 1 0.2 0.2 Total of section 1878 8590 9320(8952) Image size 2048×1216 3040×2008 4080×5440 Pixel depth (Bit) Non-digital(color) 24(color) 24(color) The original data of CDH M2 was derived from a male body without any physical injury. 9320 horizontal sectional images (4080 × 5440 × 24 bit) were obtained using the frozen section milling and digital imaging techniques. The file size with the RAW (original image data storage format) format reached about 260 GB (Giga Byte, Giga is 109), and the spatial resolution was about 0.1 mm × 0.1 mm × 0.2 mm. 2 Methods The study begins from the raw data, and Figure 1 is the flow diagram of the whole work. Image processing is the beginning of the study, involving image registration, image compression, image segmentation, etc (Figure 2). The next step is 3D reconstruction, and it was also considered as ‘the level of understanding’ of the image processing. Visualization is based on the 2D image dataset and 3D modeling dataset. 
In the study, the volume of the data is huge: the aver- age task of data processing amounts to 100 GB level, the complete set amounts as high as TB (Tera Byte, Tera is 1012) level. So it is too large to be treated with the tradi- tional methods and hardware, or more efficient data processing methods. And better data processing methods and hardware are needed. In this study, computing is realized on an HPC, which has 17 computing nodes. The nodes have two CPU (Intel Xeon 2.4 GHz) each, and connect with each other using the InfiniBand high-speed switching technology. In short, the total computing ca- pacity of the device is about 100 billion times. Figure 1 Flow diagram of the work. 2.1 Imaging preprocessing Imaging preprocessing of the study includes the release of RAW files, registration, extraction for the region of interest (ROI), and lossless image compression. The purpose of the operation is to minimize the space and exposure errors from the acquisition process, reduce the redundancy data and get rid of some difficulties for later processing. At first, RAW files were released as TIFF (Tagged Image File Format) format without compression. It has to be mentioned that the RAW format is a kind of equipment related data packets rather than image format, and requires a certain tool for decoding. After that, images were registered by using spatial transformation in the environment of Mathworks Mat- lab[10]. As the relative position between camera and spe- cimen is unstable in the process of image acquisition, it is necessary to make registration to eliminate the devia- tions. In order to reduce the difficulty in image registra- tion, 4 copper cables were embedded in the corpse in the direction of the lying body. The cables used as registra- tion markings for section planes were clear and regular. The markings’ positions provide information and the cause of deviations, which can be simplified into three spatial transformations: displacement, rotating and zooming. For each image, only one space-transform ma- trix can be founded and used in inverse transform. Till now, we had not obtained a suitable result as yet, and the ROI extraction and lossless image compression 1850 LI AnAn et al. Chinese Science Bulletin | June 2008 | vol. 53 | no. 12 | 1848-1854 Figure 2 Image processing for the section No. 2305. (a) Original section; (b) imaging preprocessing; (c) segmentation and identification. were needed. The purpose we did it in this way was to remove redundant data and make the storage and trans- mission for massive data achievable. Here, the ROI re- fers to the region within the contours of the human body. And on the contrary, the non-region-of-interest only contains embedded reagent, color cards, container, and any other non-human materials. The proportion of ROI area and total area is defined as image utilization. The average image utilization of the CDH M2 dataset is only 15%. In other words, 85% of the data is redundant. An effective way to increase the utilization is to fill the non-region-of-interest with black color and to use image compression algorithm. The PNG (Portable Network Graphic Format) is a good choice for lossless image compression, and it has a very good performance in the network transmission and displaying speed. The JPEG/JPG (Joint Photographic Experts Group) format is also a good way if the high-resolution is not very neces- sary, and the compression ratio can reach 1%. 
2.2 Segmentation and identification of the organs Segmentation is the base of 3D modeling, and it sepa- rates the image into regions of different meanings based on sectional image and anatomical knowledge. Accuracy and speed are the bottleneck for segmentation. There are two kinds of segmentation methods, automatic segmen- tation and interactive segmentation. Automatic segmentation method is a good idea, but it requires a high degree of image contrast which can hardly be reached under normal circumstances. There are only a limited number of organs and tissues suitable for the use of the automatic methods in CDH M2 dataset, such as the cartilage, artery, body contour and red bone marrow. Here, ITK (Insight Toolkit) is used to segment the above-mentioned several targets. Practice has proved that the human-machine-interactive (HCI) can effec- tively improve the effectiveness of automatic segmenta- tion. However, ITK is unable to provide visualization and graphical user interface (GUI). Therefore, other two toolkits were added: Visualization Toolkit (VTK) and Fast Light Toolkit (FLTK, a kind of GUI toolkits). An interactive image segmentation approach based on the Adobe Photoshop software is the most important method for us. Photoshop has many advantages, such as the powerful image segmentation features, massive data processing capability, batch processing and easiness in operation. Although the interactive method can enable us to get better results than the automatic methods, but it requires enormous workload, and moreover, the opera- tors have to have sufficient knowledge of anatomy. Therefore, image segmentation becomes the most diffi- cult task in the study. In order to ensure the objectivity and accuracy, some anatomy experts were invited. Every result from the segmentation will have a unique identification, which is the file name of the images, and also will be recorded into database. MeSH (Medical Subject Headings)[11] was used as the standard for nam- ing system. The name is unique, and it has its own hier- archy consisting of physiological system classification, organ classification, structural position of the physiology, original serial number and so on. Among them, the physiological system classification means nervous sys- tem, movement system, circulatory system, etc. The structural position of the physiology means head, chest, belly, etc. The organ classification means heart, kidney, spleen, etc. The new results were kept by the JPEG/JPG format, which had a high compression ratio. And the color depth LI AnAn et al. Chinese Science Bulletin | June 2008 | vol. 53 | no. 12 | 1848-1854 1851 A R TI C LE S B IO M E D IC A L E N G IN E E R IN G and image size had no change. 2.3 3D modeling of the massive data 3D reconstruction is the most common modeling method, which expands the 2D information into 3D space, and makes the information more intuitive and vivid. Because of the massive data problem, parallel algorithm actual- ized by using the MPI (Message Passing Interface) and VTK has to be used for reconstruction. Massive data processing poses the major difficulty in 3D modeling. For the complete structure of the body (such as skin), at least 100 GB memory in a single PC is needed. Therefore, a single PC is unable to finish such a task without sampling. Some small organs can be recon- structed on a workstation, such as vertebra and testicle. Regularly, the memory consumption is proportional to the number of voxel. 
So it can lighten the burden for each computing unit through splitting the large data into small chips (Figure 3(c)). In this work, HPC was used for re- construction. The approach could enable the high resolu- tion of models to support big organs at a high speed. Figure 3 3D reconstruction of the heart. (a) Original surface rendering; (b) triangle mesh topology of surface model; (c) schematic diagram of parallel computing. ①―④, Four different gray-scale regional represen- tatives of different computing nodes output. All the models only have the contour information by using the marching cube algorithm, and the structure of the data is polygonal[12] as shown in Figure 3(b). It’s a good way to reduce the costs in visualization. In addi- tion, the lower-resolution models can be obtained by reducing the number of polygonal. The 3D models are kept as the private format of the VTK with binary mode. The other commonly format of the 3D model can be transformed through the IO inter- faces of VTK or some other business software. The bi- nary mode is so important that it can economize the sto- rage space, and also increase the efficiency of reading and writing. 2.4 Visualization of the CDH datasets Visualization is a process of translating the data into graphics and display by using the computer graphics and image processing technology, and it is often combined with interactive and stereoscopic display technology. To meet different purposes and datasets, some software were developed: stand-alone version of 2D atlas browser, stand-alone version of 3D model browser, automatic demonstration system with stereo, fictitious operation system, web version of 2D atlas browser, remote anat- omy teaching system, etc (Figure 4). VTK and Mercury Open Inventor (OIV) were used to develop the stand-alone software, and the development environment was Microsoft Visual Studio 6. OIV is greatly advantageous over VTK in 3D stereo display, automatic engine and development cycle, but it has such disadvantages as high cost of secondary development and the underlying algorithm. JavaScript and Virtual Reality Modeling Language Figure 4 Visualization of CDH datasets. 1852 LI AnAn et al. Chinese Science Bulletin | June 2008 | vol. 53 | no. 12 | 1848-1854 (VRML) were used to realize the web-based visualiza- tion, and the web page was developed by using HTML (Hyper Text Markup Language). The remote client can use the resources from server by accessing to relevant web pages on internet. 3 Results and discussion 3.1 Construction of the datasets Two kinds of datasets were constructed, that is, 2D im- age datasets and 3D model datasets. The two datasets can be further divided into smaller ones, called sub- datasets. The 2D image datasets include several sub-datasets, which are the interim results after preprocessing, regis- tration, segmentation and compression. Below are some useful sub-datasets: the original dataset before decom- pression (A1), the original dataset after decompression (A2), the images after registration (A3), the lossless compressed dataset without background (A4), the com- pressed dataset without background (A5) and the results of segmentation (A6) (Table 2). 
Table 2 Statistics of 2D image datasets Dataset Total section Size (GB) Format Depth (Bit) Resolution A1 9320 260 RAW − A2 8952 552 TIFF 24 A3 8952 552 TIFF 24 A4 8952 40.8 PNG 24 A5 8952 2.79 JPEG 24 A6 >100000 >40 JPEG 8 4080×5440 Pixel Table 3 Tissues and organs that have been completely segmenteda) Physiological systems Name of tissues or organs Locomotor skeletal muscle, bones (a total of 200, not including six ossicles), cartilage (ribs, ears, thyroid cartilage, etc.) Digestive salivary glands, pharynx, esophagus, stomach, intestine, liver, gallbladder, pancreas Respiratory bronchus, lung Urogenital kidney, adipose capsule of kidney, attached testicle, bladder, ureter, sponginess, prostate, spermaduct, sper- matophore gland, testicle, urethra ball-gland, urethra Circulatory heart, coronary artery, vein, artery, spleen, thymus Nervous gray matter, white matter, cerebellum, brainstem, spinal cord Endocrine pituitary, adrenal gland, thyroid Other body contour, eyeball, lacrimal gland, tongue, ocular a) Some organs may have symmetric structure, not listed separately, such as lung and kidney. The 3D model datasets were all developed from the original 3D model which was generated from the VTK program. Up to now, at least 260 models have been re- constructed that belong to the locomotor system[14], cir- culatory system[10], nervous system, etc (Table 3 and Fig- ure 5). The dataset of the original 3D model has a higher resolution (0.1 mm × 0.1 mm × 0.2 mm) than the VIP-MAN developed by Xu et al.[5] with a resolution of 0.33 mm × 0.33 mm × 1 mm. Depending on the format conversion and polygonal decimation, there were 4 dif- ferent 3D model datasets: the original dataset (B1), the simplified version of B1 (B2), the VRML models (B3) and the OIV models (B4) (Table 4). Table 4 Statistics of 3D model datasets Dataset Fatherdataset Size (MB) Format Coding Visualization platform B1 A6 950.2 VTK ASCII VTK B2 B1 105.7 VTK Binary VTK B3 B2 293.8 WRL ASCII VRML, OIV, VTK, 3DMax B3 252.4 IV ASCII OIV B4 B3 104.7 IV Binary OIV 3.2 Several visualization methods The concrete realization of the visualization needs some application programs, including the stand-alone version of 2D atlas browser called CDH Atlas, the stand-alone version of 3D model browser called Clairvoyance Man, the automatic demonstration system with stereo called CDH 3DProjector, the remote anatomy teaching system, etc (Figure 6). The CDH Atlas is a powerful sectional anatomy atlas of the human body. Based on A4 and A6 datasets, it pro- vides the atlas on three orthogonal planes. The software also provides some useful tools like magnifier, ruler, area calculator, marker, organ probe, etc. And it becomes substitution of traditional anatomical atlas. The Clairvoyance Man based on the B4 dataset is a 3D model browser with highly interactive. The operation is simple and easy. The user can self-define the color, light, position and the models’ components. And it pro- vides the introduction of each organ or tissue on the right side of the interface. The content is so abundant and lively that it is very suited to science popularizing and anatomy teaching. The CDH 3DProjector based on the B4 dataset is an automatic demonstration system, and needs the support of 3D stereo equipment. Figuratively speaking, the software is like a stage, the models are just like some actors, and the operation of the software is a drama. The LI AnAn et al. Chinese Science Bulletin | June 2008 | vol. 53 | no. 
The audience can feel reality and immersion by using the stereographic projection and glasses. The CDH 3DProjector is a good choice for exhibitions, and is suitable for other 3D models as well.

The remote anatomy teaching system is a web site for 3D model browsing and online education. It is very similar to the Clairvoyance Man in function: a complete person can be assembled by dragging and dropping with the mouse, like building with toy bricks.

4 Prospect

The development of digital human research needs support from other fields, such as computer science, information science, physics, mathematics, medicine, the national defence industry, the sports industry, the aerospace industry, automatics, etc. The plan proposed at the 174th Xiangshan Science Conference involves four stages: the visible human, the physical human, the physiological human and the intelligent human. The visible human is our goal at the present stage.

The visible human is a long-term and difficult project, especially the image segmentation. The VHP has spent some twenty years constantly improving the quality of its image processing and constructing ever more detailed models. The CDH M2 is superior to the VHP in some respects, but this brings greater difficulties in image processing. 2005 was a year of full challenge for us: about fifty organ models were reconstructed within three months through the hard work of 24 graduate students. These fifty models are the basis of our study. Just like the VHP datasets, the CDH M2 datasets may give good support to other researchers.

At HUST, further study is continuing. Our present goal is to improve the performance of the image processing and parallel computation, to construct more models, and to develop more practical visualization software. It is our expectation that this achievement might inspire as many applications as possible. In addition, we are starting the study of the second stage (the physical human), with some representative projects including fictitious operation, mechanics simulation, radiation simulation, etc. In a word, the technology of the digital human has a bright future with the progress of other technologies[16]. It is not a dream that we will be able to mould an actual person in the computer.

Figure 5. Portfolio effects of the 3D structural model of the human (B4). From back to front: combination of the respiratory and urogenital systems; the locomotor system; combination of the nervous and digestive systems and the body contour; the circulatory system. The three planar projections are the side view, top view and front view of the 3D model.

Figure 6. Different visualization methods (Chinese edition only). (a) The user interface of the CDH Atlas, currently displaying a sagittal image; the location of the mouse is the salivary gland. (b) The user interface of Clairvoyance Man. (c) The web page of the teaching system for human anatomy; the current model is the urogenital system.

We thank all those who participated in the CDH Project from Huazhong University of Science and Technology. We also thank Prof. Yu Y T and Prof. Li Z H for their guidance in anatomy.

1 Spitzer V M, Ackerman M J, Scherzinger A L, et al. The visible human male: A technical report. J Am Med Inform Assoc, 1996, 3(2): 118–130
2 Chung M S, Kim S Y. Three dimensional image and virtual dissection program of the brain made of Korean cadaver. Yonsei Med J, 2000, 41: 299–303
3 Pommert A, Höhne K H, Pflesser B, et al. Creating a high-resolution spatial/symbolic model of the inner organs based on the Visible Human. Med Imag Anal, 2001, 5(3): 221–227
4 Robb R A. Virtual endoscopy: development and evaluation using the Visible Human Datasets. Comput Med Imag Graphics, 2000, 24: 133–151
5 Xu X G, Chao T X, Bozkurt A. VIP-MAN: An image-based whole-body adult male model constructed from color photographs of the Visible Human Project for multi-particle Monte Carlo calculations. Health Phys, 2000, 78(5): 476–487
6 Li Z H. Science and technology issues of the digital virtual human body in China: Summary of Xiangshan Science Conference No. 174. Chin Basic Sci (in Chinese), 2002, 3: 35–38
7 Tang L, Zhong S Z, Li Z Y, et al. Establish high resolution image dataset of Chinese digital human male. J Med Biomech (in Chinese), 2006, 21(3): 179–182
8 Tang L, Yuan L, Huang W H, et al. Data collecting technology on Virtual Chinese Human. Chin J Clin Anat (in Chinese), 2002, 20(5): 324–326
9 Zhang S X, Liu Z J, Tang L W, et al. Number one of Chinese digitized visible human completed. Acta Acad Med Mil Tert (in Chinese), 2002, 10: 1231–1232
10 Wang W J, Liu Q, Gong H, et al. 3D reconstruction of cardiovascular system on Virtual Chinese Male-No.1. J Med Biomech (in Chinese), 21(3): 198–202
11 Medical Subject Headings. Bethesda (MD): National Library of Medicine, 2006
12 Liu Q, Gong H, Luo Q M. Parallel visualization of visible Chinese human with extremely large datasets. In: Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, 2005 Sep 1–4, Shanghai, China. Shanghai: IEEE, 2005: 5172–5175
13 Liang F, Lu Q, Zeng S Q. A parallel volume rendering algorithm based on MPI. Comput Eng (in Chinese), 2005, 31(13): 171–173
14 Li A A, Liu Q, Gong H, et al. High quality 3D skeleton system modeling of Virtual Chinese Human male No.1. Chin J Clin Anat (in Chinese), 2006, 24(3): 292–294
15 Zhang G Z, Liu Q, Luo Q M. Monte Carlo simulations for external neutron dosimetry based on the visible Chinese human phantom. Phys Med Biol, 2007, 52: 7367–7383
16 Zhong S Z. Actualities and prospects of research on digitized virtual human. Med J PLA (in Chinese), 2003, 28(5): 385–388
work_66pmf7vgszbbxdjkrrmwuowmom ---- Artnodes

Artnodes is an e-journal promoted by the Universitat Oberta de Catalunya which analyses the intersections between Art, Science and Technology. Online ISSN: 1695-5951. DOI: http://dx.doi.org/10.7238/issn.1695-5951

Current issue: No 27 (2021): Node 27 «Arts in the Time of Pandemic» (Guest Editors: Laura Benítez & Erich Berger)

NODE 27. Arts in the Time of Pandemic (Guest Editors: L. Benítez, E. Berger)
First Response – Laura Benítez Valero, Erich Berger
Non/Living Queerings, Undoing Certainties, and Braiding Vulnerabilities: A Collective Reflection – Marietta Radomska, Mayra Citlalli Rojo Gómez, Margherita Pevere, Terike Haapoja
Staying in Touch: A Case Study of Artistic Research during the COVID-19 Lockdown – Louise Mackenzie, Robertina Šebjanič, Karolina Żyniewicz, Isabel Burr Raty, Dalila Honorato
The trauma of the inert. Notes for a new parasitology – Ivan Flores Arancibia, Begonya Saez Tajafuerce
The boundaries that constitute us: Parasite, pandemic life, and crises of vulnerability – Won Jeon
Imaginations of the evolution of bodies: Notes on the relationships between plants and humans – Mayra Citlalli Rojo Gómez
Unspeakable references — On infective states of words and images in the "SHUT DOWN 2020" project – Claudia Reiche, Brigitte Helbling
The journey in situ. How walking in confined spaces can boost imagination – Laura Apolonio
Between the living and the non-living. Ageism, ableism, LGBTQI*phobia and anti-racist artivism in the Spanish State during the first months of COVID-19 – Andrés Senra
The virtualities of art (or how art is, above all, virtual) – José Manuel Ruiz
Contemporary art and architectures of the infected (host-host) under the narrative of the tuberculosis sanatorium – Gloria Lapeña Gallego

Miscellany
Towards generative craftwork – Study case of the Pasto varnish (Colombia) artisanal technique and 3D printing – Carlos Córdoba-Cely
To learn from the fall, to deal with anguish: paradoxes of playful melancholy in «Gris» – Shaila García Catalán, Aarón Rodríguez Serrano, Marta Martín Núñez
Quantitative analysis of the journal Índice Literario (1932-1936) – Juana María González
The Video, the City, and the Spectator: The Architecture and Its Bodies in Front of a Video Camera – Lorenzo Lazzari
Discourses on artistic research in Flanders: non-scholarly perspectives on re-search in the arts – Florian Vanlee

work_67cfmbnqgbaoxchoamuceizf3e ---- Microsoft Word - HBEM_2012_1999_Burgess_Bruns_Hjorth.docx

Emerging Methods for Digital Media Research: An Introduction
Jean Burgess, Axel Bruns and Larissa Hjorth

Jean Burgess (PhD, Queensland University of Technology) is an Associate Professor of Digital Media Studies and Deputy Director of the ARC Centre of Excellence for Creative Industries & Innovation at Queensland University of Technology. Her research focuses on the uses, politics and methodological implications of social and mobile media platforms.

Axel Bruns (PhD, University of Queensland) is an Associate Professor in the Creative Industries Faculty at Queensland University of Technology in Brisbane, Australia, and a Chief Investigator in the ARC Centre of Excellence for Creative Industries and Innovation (http://cci.edu.au/). He is an expert on the impact of user-led content creation, or produsage, and his current work focuses on the study of user participation in social media spaces such as Twitter, especially in the context of acute events.

Larissa Hjorth (PhD, University of Melbourne) is an artist, digital ethnographer, and Senior Lecturer in the Games Program at the School of Media & Communication at RMIT University. Since 2000, Hjorth has been researching and publishing on gendered customizing of mobile communication, gaming, and virtual communities in the Asia–Pacific.

Now as in earlier periods of acute change in the media environment, new disciplinary articulations are producing new methods for media and communication research. At the same time, established media and communication studies methods are being recombined, reconfigured, and remediated alongside their objects of study. This special issue of JOBEM seeks to explore the conceptual, political and practical aspects of emerging methods for digital media research. It does so at the conjuncture of a number of important contemporary trends: the rise of a "third wave" of the Digital Humanities and the "computational turn" (Berry, 2011) associated with natively digital objects and the methods for studying them; the apparently ubiquitous Big Data paradigm—with its various manifestations across academia, business, and government—that brings with it a rapidly increasing interest in social media communication and online "behavior" from the "hard" sciences; along with the multisited, embodied, and emplaced nature of everyday digital media practice.
The issue contains seven articles that advocate for, reflect upon, or critique current methodological trends in digital media research. It ranges from a discussion of the emergence of a new wave of Digital Humanities (Niels Brügger and Niels Ole Finneman), the potential for digital media research of emerging approaches like Media Archaeology (Frédérick Lesage), and the role of language in research (Randy Kluver, Heidi Campbell and Stephen Balfour), to the ways Big Data is impacting upon content analysis (Seth C. Lewis, Rodrigo Zamith, and Alfred Hermida) and digital media methods (Merja Mahrt and Michael Scharkow), and the large-scale policy research potential of community media archives (Nicole Matthews and Naomi Sunderland).

The special issue begins with Randy Kluver, Heidi Campbell and Stephen Balfour's "Language and the Boundaries of Research", which argues that "data-driven research" has failed to engage with its increasingly internationalized context, especially in terms of its Anglophone or Western-centric focus. As Kluver et al. rightly identify, the field remains focused upon Western media as a placeholder for "global media." Here we are reminded of the importance of understanding digital media in context. While Big Data can often abstract the cultural, social, and linguistic nuances of digital media practice, there is a growing pool of researchers exploring interdisciplinary methods such as "ethno-mining" that use ethnography to critique Big Data (Anderson et al., 2009) and situate digital media as part of the complex dynamics of everyday life (Coleman, 2010).

In their review article "The Value of Big Data in Digital Media Research," Merja Mahrt and Michael Scharkow provide a critical survey of methodological approaches to media communication and how the field is being reconfigured in an age of Big Data. In particular, Mahrt and Scharkow focus upon the consequences of using Big Data at different stages of the research process, in dialogue with the traditions underpinning manual quantitative and qualitative approaches.
For Seth C. Lewis, Rodrigo Zamith, and Alfred Hermida in "Content Analysis in an Era of Big Data: A Hybrid Approach to Computational and Manual Methods," one can gain insight into content by blending computational and manual methods. Drawing on a case study of Twitter, Lewis et al. argue that a hybrid method of computational and manual techniques can provide both systematic rigor and contextual sensitivity.

This is followed by Anne Galloway's "Emergent Media Technologies, Speculation, Expectation and Human/nonhuman Relations", in which Galloway draws on her background as one of the earliest researchers to study ubiquitous computing to discuss the role of sociology in situating emergent media technologies as part of a cultural process involving a range of human and nonhuman actors. Here Galloway focuses upon the often-overlooked aspects of anticipation and expectation in the process of media practice and the production of imaginaries for and of the future. Drawing on the work of Bruno Latour, Galloway concludes with some thought-provoking questions for relationships between digital media methods and design.

For Niels Brügger and Niels Ole Finneman in "The Web and Digital Humanities: Theoretical and Methodological Concerns", there is a need for the Digital Humanities to understand the complex social, temporal, and spatial dimensions of the web. Using the case study of the real-time and archived web (as a dynamic depiction, not simply a copy of what was once online) to illustrate their point, Brügger and Finneman argue that the Digital Humanities is currently limited in its ability to capture the moving architecture of digital media. Frédérick Lesage complements this discussion by picking up on some aspects of the related field of software studies, as well as cultural analytics and media archaeology, in "Cultural Biographies and Excavations of Media: Context and Process." Lesage argues for a "cultural biography" approach to the study of software as media objects—as "things."

Nicole Matthews and Naomi Sunderland's "Digital Life Story Narratives as Data for Policy Makers and Practitioners: Thinking Through Methodologies for Large-scale Multimedia Qualitative Datasets" explores the role of community-based digital media narratives (e.g. via digital storytelling projects) in "amplifying marginalized voices in the public domain." It is clear from Matthews and Sunderland's piece that, despite the large numbers of these projects—and hence the depth of research potential in the stories they have produced—the effective deployment of this potential in social policy remains a missed articulation with political, ethical, and methodological dimensions.

References

Anderson, K., Rafus, D., Rattenbury, T., & Aipperspach, R. (2009). Numbers Have Qualities Too: Experiences with Ethno-Mining. http://www2.berkeley.intel-research.net/~tlratten/public_usage_data/anderson_EPIC_2009.pdf
Berry, D. (2011). The Computational Turn: Thinking About the Digital Humanities. Culture Machine, 12. Retrieved from http://www.culturemachine.net/index.php/cm/article/view/440/470
Coleman, G. (2010). Ethnographic Approaches to Digital Media. Annual Review of Anthropology, 39, 487–505.

work_6bfzseovxvc7xadhsheso6nubm ---- Microsoft Word - MTerras_Crowdsourcing in Digital Humanities_Final.docx

Crowdsourcing in the Digital Humanities. In Schreibman, S., Siemens, R., and Unsworth, J. (eds) (2016), A New Companion to Digital Humanities, pp. 420–439. Wiley-Blackwell.
http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1118680596.html
© Wiley-Blackwell, January 2016. Author's last version provided here with permission.

As Web 2.0 technologies changed the World Wide Web from a read-only to a co-creative digital experience, a range of commercial and non-commercial platforms emerged to allow online users to contribute to discussions and use their knowledge, experience, and time to build online content. Alongside the widespread success of collaboratively produced resources such as Wikipedia came a movement in the cultural and heritage sectors to trial crowdsourcing - the harnessing of online activities and behaviour to aid in large-scale ventures such as tagging, commenting, rating, reviewing, text correcting, and the creation and uploading of content in a methodical, task-based fashion (Holley 2010) - to improve the quality of, and widen access to, online collections. Building on this, within Digital Humanities there have been attempts to crowdsource more complex tasks traditionally assumed to be carried out by academic scholars, such as the accurate transcription of manuscript material. This chapter aims to survey the growth and uptake of crowdsourcing for culture and heritage, and more specifically, within Digital Humanities. It raises issues of public engagement and asks how the use of technology to involve and engage a wider audience with tasks that have been the traditional purview of academics can broaden the scope and appreciation of humanistic enquiry. Finally, it asks what this increasingly common public-facing activity means for Digital Humanities itself, as the success of these projects demonstrates the effectiveness of building projects for, and involving, a wide online audience.

Crowdsourcing: an introduction

Crowdsourcing – the practice of using contributions from a large online community to undertake a specific task, create content, or gather ideas – is a product of a critical cultural shift in Internet technologies. The first generation of the World Wide Web had been dominated by static websites, facilitated by search engines which only allowed information-seeking behaviour. However, the development of online platforms which allowed and encouraged a two-way dialogue rather than a broadcast mentality fostered public participation, the co-creation of knowledge, and community-building, in a phase which is commonly referred to as "Web 2.0" (O'Reilly 2005, Flew 2008). In 2006, an article in Wired Magazine discussed how businesses were beginning to use these new platforms to outsource work to individuals, coining the term "crowdsourcing" as a neologistic portmanteau of "outsourcing" and "crowd" (Howe 2006b):

Technological advances in everything from product design software to digital video cameras are breaking down the cost barriers that once separated amateurs from professionals. Hobbyists, part-timers, and dabblers suddenly have a market for their efforts, as smart companies in industries as disparate as pharmaceuticals and television discover ways to tap the latent talent of the crowd. The labor isn't always free, but it costs a lot less than paying traditional employees. It's not outsourcing; it's crowdsourcing… (ibid)

The term was quickly adopted online to refer to the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call.
This can take the form of peer-production (when the job is performed collaboratively), but is also often undertaken by sole individuals. The crucial prerequisite is the use of the open call format and the large network of potential laborers (Howe 2006c).

Within a week of the term being coined, 182,000 other websites were using it (Howe 2006a), and it rapidly became the word used to describe a wide range of online activities, from contributing to online encyclopedias such as Wikipedia, to tagging images on image-sharing websites such as Flickr, to writing on blogs, to proofreading out-of-copyright texts on Project Gutenberg, or contributing to open-source software. (An analogous term to crowdsourcing, Citizen Science, has also been used where the small-scale tasks carried out online contribute to scientific projects (Silvertown 2009).)

It is important to note here that the use of distributed (generally volunteer) labour to undertake small portions of much larger tasks, gather information, contribute to a larger project, or solve problems, is not new. There is a long history of scientific prizes, architectural competitions, genealogical research, scientific observation and recording, and linguistic study (to name but a few applications) that have relied on the contribution of large numbers of individuals to undertake a centrally managed task, or solve a complex problem (see Finnegan 2005 for an overview). For example, the Mass-Observation Project was a social research organisation in the United Kingdom between 1937 and the 1960s, which relied on a network of 500 volunteer correspondents to record everyday life in Britain, including conversation, culture, and behaviour (Hubble 2006). The difference between these projects and the modern phenomenon of crowdsourcing identified by Howe is, of course, the use of the Internet, the World Wide Web, and interactive web platforms as the mechanism for distributing information, collecting responses, building solutions, and communicating around a specified task or topic. There was an intermediary phase, however, between offline volunteer labour and the post-2006 "crowdsourcing" swell, where volunteer labour was used in conjunction with computers and online mechanisms to collect data. Brumfield (2013a) identifies at least seven genealogy projects, such as Free Births, Marriages and Deaths (FreeBMD, http://freebmd.org.uk/), Free Registers (FreeREG, http://www.freereg.org.uk/) and Free Census (FreeCEN, http://www.freecen.org.uk/), that emerged in the 1990s,
Daren Brabham (2013, p.45) proposes a useful typology, looking at the mostly commercial projects which exist in the crowdsourcing space, suggesting that there are two types of problems which can be best solved using this approach: information management issues and ideation problems. Information management issues occur where information needs to be located, created, assembled sorted, or analysed. Brabham suggests that knowledge discovery and management techniques can be used for   5   crowdsourced information management, as they are ideal for gathering sources or reporting problems: an example of this would be SeeClickFix (http://en.seeclickfix.com/) which encourages people to “Report neighborhood issues and see them get fixed” (SeeClickFix 2013). An alternative crowdsourcing approach to information management is what Brahbam calls “distributed human intelligence tasking”: when “a corpus of data is known and the problem is not to produce designs, find information, or develop solutions, but to process data” (Brabham 2013, p.50). The least creative and intellectually demanding of the crowdsourcing techniques, users can be encouraged to undertake repetitive “micro-tasks”, often for monetary compensation, if the task is for a commercial entity. An example of this would be Amazon’s Mechanical Turk (https://www.mturk.com/), which “gives businesses and developers access to an on-demand, scalable workforce. Workers select from thousands of tasks and work whenever it’s convenient” (Amazon Mechanical Turk, 2014) – although Amazon Turk has been criticised for its “unethical” business model, with a large proportion of its workers living in third world countries, working on tasks for very little payment (Cushing 2013). The second type of task that Brabham identified that is suited to crowdsourcing are ideation problems: where creative solutions need to be proposed, that are either empirically true, or a matter of taste or market support (Brabham 2013, p. 48-51). Brabham suggests that crowdsourcing is commonly used as a form of “broadcast search” to locate individuals who can provide the answer to specific problems, or provide the solution to a challenge, sometimes with pecuniary rewards. An example of an online platform using this approach is InnoCentive.com, which is predominantly geared towards the scientific community to generate ideas or reach solutions, for   6   research and development, sometimes with very large financial prizes: at time of writing, there were three awards worth $100,000 on offer. Brahbam suggests that an alternative crowdsourcing solution to ideation problems is “peer-vetted creative production” (ibid, p.49) where a creative phase is opened up to an online audience, who submit a large number of submissions, and voting mechanisms are then put in place to help sort through the proposals, hoping to identify superior suggestions. An example of this approach would be Threadless.com, a creative community that designs, sorts, creates, and provides a mechanism to purchase various fashion items (the website started with t-shirts, but has since expanded to offer other products). Since being coined in 2006, the term “crowdsourcing” is now used to cover a wide variety of activities across a large number of sectors: “Businesses, non-profit organizations, and government agencies regularly integrate the creative energies of online communities into day-to-day operations, and many organizations have been built entirely from these arrangements” (Brabham 2013, xv). 
Brabham's overall typology is a useful tool as it provides a framework in which to think about both the type of problem that is being addressed by the online platform and the specific crowdsourcing mechanism that is being used to propose a solution. Given the prevalence of the use of crowdsourcing in online communities for a range of both commercial and not-for-profit tasks, it is hardly surprising that various implementations of crowdsourcing activities have emerged in the cultural and heritage sector at large, and the Digital Humanities in particular.

The growth of crowdsourcing in cultural and heritage applications

There are many aspects of crowdsourcing that are useful to those working in history, culture, and heritage, particularly within Galleries, Libraries, Archives and Museums (GLAMs), which have a long history of participation with members of the public and generally have institutional aims to promote their collections and engage with as wide an audience as possible. However, "Crowdsourcing is a concept that was invented and defined in the business world and it is important that we recast it and think through what changes when we bring it into cultural heritage" (Owens 2012b). The most obvious difference is that payment to those who undertake tasks is generally not an option for host institutions, but also that "a clearly ethical approach to inviting the public to help in the collection, description, presentation, and use of the cultural record" needs to be identified and pursued (ibid). Owens (2012b) sketches out a range of differences between the mass crowdsourcing model harnessed by the commercial sector and the use of online volunteer labour in cultural and heritage organisations, stressing that "many of the projects that end up falling under the heading of crowdsourcing in libraries, archives and museums have not involved large and massive crowds and they have very little to do with outsourcing labor." Heritage crowdsourcing projects are not about anonymous masses of people; they are about inviting participation from those who are interested and engaged, and generally involve a small cohort of enthusiasts using digital tools to contribute (in the same way as they may have volunteered offline to organize and add value to collections in the past). The work is not "labour" but a meaningful way in which individuals can interact with, explore, and understand the historical record. It is often highly motivated and skilled individuals who offer to help, rather than those who can be described with the derogatory term "amateurs". Owens suggests that crowdsourcing within this sector is then a complex interplay between understanding the potentials for
The resulting transcriptions can both aid others in reading, but also in finding, text in the digitised archive. After the success of this project, and the rise of commercial crowdsourcing, other projects began to adopt crowdsourcing techniques to help digitise, sort, and correct heritage materials. In 2009 One of the earliest citizen science projects that is based on historical data, the North American Bird Phenology Program (http://www.pwrc.usgs.gov/bpp/) was launched to transcribe 6 million migration card observations collected by a network of volunteers “who recorded information of first arrival dates, maximum abundance, and departure dates of migratory birds across North America” between 1880 and 1970 (North American Bird Phenology Program, n. d). At time of writing, over a million cards have been transcribed by volunteers since launch, allowing a range of scientific research to be carried out on the resulting data (ibid).   9   Crowdsourcing in the heritage sector began to gather speed around 2010 with a range of projects being launched that asked the general public for various types of help via an online interface. One of the most successful of these is another combination of historical crowdsourcing, and citizen science, called Old Weather (http://www.oldweather.org/) which invites the general public to transcribe weather observations that were noted in ship’s log books dating from the mid 19th Century to the present day in order to “contribute to climate model projections and …improve our knowledge of past environmental conditions” (Old Weather 2013a). Old Weather launched in October 2010 as part of the Zooniverse (http://www.zooniverse.org/) portal of fifteen different citizen science projects (which had started with the popular gallery classification tool, Galaxy Zoo (http://www.galaxyzoo.org/), in 2009). The Old Weather project is a collaboration of a diverse range of archival and scientific institutions and museums and universities in both the UK and the USA (Old Weather 2013b), showing how a common digital platform can bring together physically dispersed information for analysis by users. At time of writing, over 34,000 logs and seven voyages have been transcribed (three times, by different users to insure quality control, meaning that over 1,000,000 individual pages have been transcribed by users (Brohan, P. 2012)), and the resulting data is now being used by both scientists and historians to understand both climate patterns and naval history (with their blog regularly updated with findings: http://blog.oldweather.org/). A range of other noteable crowdsourcing projects launched in the 2010 to 2011 period, showing the breadth and scope of the application of online effort to cultural heritage. 
These include (but are not limited to): Transcribe Bentham, which aims to transcribe the writings of the philosopher and jurist Jeremy Bentham   10   (http://blogs.ucl.ac.uk/transcribe-bentham/); the Victoria and Albert Museum’s tool to get users to improve the cropping of their photos in the collection (http://collections.vam.ac.uk/crowdsourcing/); The United States Holocaust Museum’s “Remember Me” project which aims to identify children in photographs taken by relief workers during the immediate aftermath of the second World War, to facilitate connections amongst survivors (http://rememberme.ushmm.org/); New York Public Library’s “What’s on the menu?” project (http://menus.nypl.org/), in which users can transcribe their collection of historical restaurant menus; and the National Library of Finland’s DigitalKoot project (http://www.digitalkoot.fi/index_en.html) which allowed users to play games which helped improve the metadata of their Historical Newspaper Library. The range and spread of websites that come under the crowdsourcing umbrella in the cultural and heritage sector continues to increase, and it is now a relatively established, if evolving, method used for galleries, libraries, archives and museums. A list of non-profit crowdsourcing projects in GLAM institutions is maintained at http://www.digitalglam.org/crowdsourcing/projects/. Considering this activity in light of Brabham’s typology, above, it is clear that most projects fall into the “information management” category (Brabham 2013), where an organisation (or collaborative project between a range of organisations) tasks the crowd with helping to gather, organise, and collect information into a common source or format. What is the relationship of these projects to those working in Digital Humanities? Obviously, many crowd-sourcing projects depend on having information – or things – to comment on, transcribe, analyse, or sort, and therefore GLAM institutions, who are custodians of such historical material, often partner with University researchers who   11   have an interest in using digital techniques to answer their Humanities or Heritage based research question. There is often much sharing of expertise and technical infrastructure between different projects and institutions: for example, the Galaxy Zoo platform which underpins Old Weather also is used by Ancient Lives (http://ancientlives.org/) to help crowdsource transcription of papyri, and Operation War Diary (http://www.operationwardiary.org/) to help transcribe First World War Unit Diaries. Furthermore, those working in Digital Humanities can often advise and assist colleagues in partner institutions and scholarly departments: the Transcribe Bentham project is a collaboration between University College London’s Library Services (including UCL’s Special Collections), The Bentham Project (based in the Faculty of Laws), UCL Centre for Digital Humanities, The British Library, and The University of London Computing Centre, with the role of the Digital Humanities centre being to provide guidance and advice with online activities, best practice, and public engagement. Another example of collaboration can be seen in events such as the CITSCribe Hackathon in December 2013, which “brought together over 30 programmers and researchers from the areas of biodiversity research and digital humanities for a week to further enable public participation in the transcription of biodiversity specimen labels” (iDigBio 2013). 
Crowdsourcing in the Digital Humanities can also be used to sort and improve incomplete data sets, such as a corpus of 493 non-Shakespearean plays written between 1576 and 1642 in which 32,000 partially transcribed words were corrected by students over the course of an eight week period using an online tool (http://annolex.at.northwestern.edu, see Mueller 2014), indicating how we can use crowdsourcing to involve Humanities students in the gathering and curating of corpora relevant to the wider Humanities community. Scholars in the Digital Humanities are well placed to research, scope and   12   theorise crowdsourcing activities across a wider sector: for example, the “Modeling Crowdsourcing for Cultural Heritage” project (http://cdh.uva.nl/projects-2013- 2014/m.o.c.c.a.html) based at the Centre for Digital Humanities and the Creative Research Industries Amsterdam, both at the University of Amsterdam, is aiming to determine a comprehensive model for “determining which types and methods of crowdsourcing are relevant for which specific purposes” (Amsterdam Centre for Digital Humanities 2013). As we shall see, below, Digital Humanities scholars and centres are investigating and building new platforms for crowdsourcing activities – particularly in the transcription of historical texts. In addition, Digital Humanities academics can help with suggestions on what we can do with crowdsourced information once collected: we are now moving into a next phase of crowdsourcing, where understanding data mining and visualisation techniques to query the volume of data collected by volunteer labour is necessary. Finally, there is the beginnings of a body of literature on the wider area of crowdsourcing, both across the Digital Humanities and the GLAM sector, and taken together these can inform those who are contemplating undertaking a crowdsourcing project for a related area. It should be stressed that it is often hard to make a distinction between what should be labelled a “GLAM sector” project and what should be labelled “Digital Humanities” in the area of crowdsourcing, as many projects are using crowdsourcing not only to sort or label or format historical information, but to provide the raw materials and methodologies for creating and understanding novel information about our past, our cultural inheritance, or our society. Following on from the success of the Australian Newspapers Digitisation Program which she managed, Holley (2010) brought issues of “Crowdsourcing: How and Why   13   Should Libraries Do it” to light, in a seminal discussion that much subsequent research and project implementation has benefited from. Holley proposes that there are a variety of potential benefits in using crowdsourcing within a library context (which we can also extrapolate to cover those working across the GLAM sector, and Digital Humanities). 
The benefits of crowdsourcing noted are that it can help to: achieve goals the institution would not have the resources (temporal, financial, or staffing) to accomplish itself; achieve these goals quicker than if working alone; build new user groups and communities; actively engage the community with the institution and its systems and collections; utilise external knowledge, expertise and interest; improve the quality of data which improves subsequent user search experiences; add value to data; improving and expanding the ways in which data can be discovered; gain an insight into user opinions and desires by building up a relationship with the crowd; show the relevance and importance of the institution (and its collections) by the high level of public interest in the project; build trust and encouraging loyalty to the institution; and encourage a sense of public ownership and responsibility towards cultural heritage collections (ibid). Holley also asks what the normal profile of a crowdsourcing volunteer in the cultural, heritage, and humanities sector is, stressing that from even early pilot projects the same makeup emerges: although there may be a large number of volunteers who originally sign up, the majority of the work is done by a small cohort of super users, which achieve significantly larger amounts of work than anyone else. They tend to be committed to the project for the long term, appreciate that it is a learning experience, which gives them purpose and is personally rewarding, perhaps because they are interested in it, or see it as a good cause. Volunteers often talk of becoming addicted   14   to the activities, and the amount of work undertaken often exceeds the expectations of the project (ibid). Holley argues that “the factors that motivate digital volunteers are really no different to factors that motivate anyone to do anything” (ibid), saying that interest, passion, a worthy cause, giving back to the community, helping to achieve a group goal, and contributing to the discovery of new information in an important area, are often reasons that volunteers contribute. Observations and surveys of volunteers by site managers noted various techniques that can improve user motivation, such as adding more content regularly, increasing challenges, creating a camaraderie, building relationships with the project, acknowledging the volunteer’s help, providing rewards, and making goals and progress transparent. The reward and acknowledgement process is often linked to progress reports, with volunteers being named, high achievers being ranked in publicly available tables, and promotional gifts (ibid). Holley provides various tips that have provided guidance for a variety of crowdsourcing projects, and are worth following by those considering using this method. The project should have a clear goal, a big challenge, report regularly on progress, and showcase results. The system should be easy and fun, reliable and quick, intuitive, and provide options to the user so they can choose what they work on (to a certain extent). The volunteers should be acknowledged, be rewarded, be supported by the project team, and be trusted. The content should be interesting, novel, focussed on history or science, and there should be lots of it (ibid). Holley’s paper was written just before many of the projects outlined above came on- stream, stressing the potential for institutions, and challenging institutional structures to be brave enough to attempt to engage individuals in this manner. 
By 2012, with   15   various projects in full swing, reports and papers began to appear about the nuances of crowdsourcing in this area, although “there is relatively little academic literature dealing with its application and outcomes to allow any firm judgements to be made about its potential to produce academically credible knowledge” (Hedges and Dunn 2012, p.4). Ridge (2012) explores the “Frequently Asked Questions about Crowdsourcing in Cultural Heritage”, noting various misconceptions and apprehensions surrounding the topic. As with Owens (2012a), Ridge agrees that the industry definition of crowdsourcing is problematic, suggesting instead that it should be defined as an emerging form of engagement with cultural heritage that contributes towards a shared, significant goal or research area by asking the public to undertake tasks that cannot be done automatically, in an environment where the tasks, goals (or both) provide inherent rewards for participation” (Ridge 2012). Ridge draws attention to the importance of the relationships built between individuals and organisations, and that projects should be mindful of the motivations for participating. Institutional nervousness around crowdsourcing is caused by worries that malicious or deliberately bad information will be provided by difficult, obstructive users, although Ridge maintains this is seldom the case, and that a good crowdsourcing project should have inbuilt mechanisms to highlight problematic data or users, and validate the content created by its users. Ridge returns again to the ethics of using volunteer labour, allaying fears about the type of exploitation seen in the commercial sector exploitation by explaining that   16   Museums, galleries, libraries, archives and academic projects are in the fortunate position of having interesting work that involves an element of social good, and they also have hugely varied work, from microtasks to co-curated research projects. Crowdsourcing is part of a long tradition of volunteering and altruistic participation (Ridge 2012). In a further 2013 post, Ridge also highlights the advantages of digital engagement via crowdsourcing, suggesting that digital platforms can allow smaller institutions to engage with users just as well as large institutions can, can generate new relationships with different organisations in order to work together around a similar topic in a collaborative project, and can provide great potential for audience participation and engagement (Ridge 2013). In fact, Owens (2012a) suggests that our thinking around crowdsourcing in cultural and heritage is the wrong way round: rather than thinking of the end product and the better data that volunteers are helping us create, institutions should focus on the fact that crowdsourcing marks a fulfilment of the mission of putting digital collections online: What crowdsourcing does, that most digital collection platforms fail to do, is offers an opportunity for someone to do something more than consume information… Far from being an instrument which enables us to ultimately better deliver content to end users, crowdsourcing is the best way to actually engage our users in the fundamental reason that these digital collections exist in the first place… At its best, crowdsourcing is not about getting someone to do work for you, it is about offering your users the opportunity to participate in public memory (ibid).   
17   The lessons learned from these museum and library based projects are important starting points for those in the Digital Humanities who wish to undertake crowdsourcing themselves. Crowdsourcing and Digital Humanities In a 2012 scoping study of the use of crowdsourcing particularly applied to Humanities research, 54 academic publications were identified that were of direct relevance to the field, and a further 51 individual projects, activities or websites were found which documented or presented some aspect, application, or use of crowdsourcing within humanities scholarship (Hedges and Dunn 2012). Many of these projects have crossovers with libraries, archives, museums, and galleries, as partners who provide content, expertise, or host project themselves, and many of them are yet to produce a tangible academic outcome. As Hedges and Dunn point out, at a time when the web is simultaneously transforming the way in which people collaborate and communicate, and merging the spaces which the academic and non-academic communities inhabit, it has never been more important to consider the role which public communities -connected or otherwise - have come to play in academic humanities research (ibid, p. 3). Hedges and Dunn (ibid, p.7) identify four factors that define crowd-sourcing used within humanities research. These are: a clearly defined core research question and direction within the humanities; the potential for an online group to add to, transform, or interpret data that is important to the humanities; a definable task which is broken down into an achievable workflow; and the setting up of a scalable activity which can be undertaken with different levels of participation. Very similar to the work done in the GLAM sector, the theme and research question of the project are therefore the   18   main distinguishing factors from other types of crowdsourcing, with Digital Humanities projects learning from other domains such as successful projects in citizen science, or industry. An example of such a project fitting into this Humanities Crowdsourcing definition, given its purview, is Transcribe Bentham (http://blogs.ucl.ac.uk/transcribe-bentham/), a manuscript transcription initiative that intends to engage students, researchers, and the general public with the thought and life of the philosopher and reformer, Jeremy Bentham (1748–1832), by making available digital images of his manuscripts for anyone, anywhere in the world, to transcribe. The fundamental research question driving this project is to understand the thought and writings of Bentham more completely – a topic of fundamental importance to those engaged in eighteenth or nineteenth century studies – given that 40,000 folios of his writings remain un- transcribed “and their contents largely unknown, rendering our understanding of Bentham’s thought—together with its historical significance and continuing philosophical importance—at best provisional, and at worst a caricature.” (Causer and Terras, Forthcoming 2014). The objectives of the project are clear, with the benefit to humanities (and law, and social science) research evident from the research objectives. Hedges and Dunn (2012, p.18 -19) list the types of knowledge that may be usefully created in Digital Humanities crowdsourcing activities, resulting in new understanding of Humanities research questions. 
These Digital Humanities crowdsourcing projects are involved in: making ephemera available that would otherwise not be; opening up information that would normally be accessible to distinct groups, giving a wider audience to specific information held in little known   19   written documentation, circulation of personal histories and diaries, giving personal links to historical processes and events, identifying links between objects, summarising and circulating datasets, synthesizing new data from existing sources, and recording ephemeral knowledge before it dissipates. Hedges and Dunn stress that an important point in these crowdsourcing projects is that they enable the building up of knowledge of the process of how to conduct collaborative research in this area, whilst creating communities with a shared purpose, which often carry out research work that go beyond the expectations of the project (p.19). However, they are keen to also point out that most humanities scholars who have used crowd-sourcing in its various forms now agree that it is not simply a form of cheap labour for the creation or digitization of content; indeed in a cost-benefit sense it does not always compare well with more conventional means of digitization and processing. In this sense, it has truly left its roots, as defined by Howe (2006) behind. The creativity, enthusiasm and alternative foci that communities outside that academy can bring to academic projects is a resource which is now ripe for tapping in to (ibid, p. 40). As with Owens’ thoughts on crowdsourcing in the GLAM sector (2012), we can see that crowdsourcing in the humanities is about engagement, and encouraging a wide, and different audience to engage in processes of humanistic enquiry, rather than merely being a cheap way to encourage people to get a necessary job done. Crowdsourcing and Document Transcription The most high profile area of crowdsourcing carried out within the humanities is in the area of document transcription. Although commercial optical character   20   recognition (OCR) technology has been available for over 50 years (Schantz 1982), it still cannot generate high quality transcripts of handwritten material. Work with texts and textual data is still the major topic of most Digital Humanities research (see the analysis by Scott Weingart of submissions to the Digital Humanities Conference 2014, which showed that of the 600 abstracts, 21.5% dealt with some form of Text Analysis, 19% were about literary studies, and 19% were about Text Mining (Weingart, S. 2013)). It is therefore no surprise that most Digital Humanities crowdsourcing activities – or at least, those emanating from Digital Humanities centres and or associated in some sense with the Digital Humanities community - have been involved in the creation of tools in which to help transcribe important handwritten documents into machine processable form. Ben Brumfield, in a talk presented in 2013, demonstrated that there were thirty collaborative transcription tools developed since 2005 (Brumfield 2013a), situating the genealogical sites, and those such as Old Weather and Transcribe Bentham, in a trajectory which leads to the creation of tools and platforms which people can use to upload their own documents, and manage their own crowdsourcing projects (reviews of these different platforms are available on Brumfield’s blog, at http://manuscripttranscription.blogspot.co.uk/, and at time of writing there are now forty collaborative tools for crowdsourcing document transcription). 
The first of these customizable tools was Scripto (http://scripto.org/), a freely available, open source platform for community transcription, developed in 2011 by the Center for History and New Media at George Mason University alongside their Papers of the United States War Department project (http://wardepartmentpapers.org/). Another web-based tool, T-PEN (Transcription for Paleographical and Editorial Notation, http://t-pen.org/TPEN/), coordinated by the Center for Digital Theology at Saint Louis University, provides an interface for working with images of manuscripts. Transcribe Bentham has also released a customizable, open source version of its MediaWiki-based platform (https://github.com/onothimagen/cbp-transcription-desk), which has since been used by the Public Record Office of Victoria, Australia (http://wiki.prov.vic.gov.au/index.php/Category:PROV_Transcription_Pilot_Project). The toolbar developed for Transcribe Bentham, which helps people encode various aspects of transcription such as dates, people, deletions, etc., has been integrated into the Letters of 1916 project at Trinity College Dublin (http://dh.tcd.ie/letters1916/). The platform the Letters of 1916 project uses is the DIYHistory suite, built by the University of Iowa, which is itself based on CHNM's Scripto tool. Links between crowdsourcing projects are common.

There are now a range of transcription projects online, ranging from those created, hosted, and managed by scholarly or memory institutions, to those entirely organised by amateurs with no scholarly training or association. A prime example of the latter would be Soldier Studies (http://www.soldierstudies.org/), a website dedicated to preserving the content of American Civil War correspondence bought and sold on eBay, to allow access to the contents of this ephemera before it resides in private collections. Although laudable, it uses no transcription conventions at all in cataloguing or transcribing the documents it finds (Brumfield 2013a).

The movement towards collaborative online document transcription by volunteers not only uncovers new, important historical primary source material, but it also "can open up activities that were traditionally viewed as academic endeavours to a wider audience interested in history" (Causer and Terras, Forthcoming 2014). Brumfield (2013a) points out that there are issues which come with this:

There's an institutional tension, in that editing of documents has historically been done by professionals, and amateur editions have very bad reputations. Well now we're asking volunteers to transcribe. And there's a big tension between, well how do volunteers deal with this [process], do we trust volunteers? Wouldn't it be better just to give us more money to hire more professionals? So there's a tension there.

Brumfield further explores this in another blog post (2013b), where he asks

what is the qualitative difference between the activities we ask amateurs to do and the activities performed by scholars… we're not asking "citizen scholars" to do real scholarly work, and then labeling their activity scholarship -- a concern I share with regard to editing. If most crowdsourcing projects ask amateurs to do little more than wash test tubes, where are the projects that solicit scholarly interpretation?
There is therefore a fear that, without adequate guidance and moderation, the products of crowdsourced transcription will be what Shillingsburg referred to as "a dank cellar of electronic texts" where "the world is overwhelmed by texts of unknown provenance, with unknown corruptions, representing unidentified or misidentified versions" (2006, p. 139). Brumfield (2013c) points out that Peter Robinson describes both the utopia and the dystopia of crowdsourced transcription: a utopia in which textual scholars train the world in how to read documents, and a dystopia in which hordes of "well-meaning but ill-informed enthusiasts will strew the web willy-nilly with error-filled transcripts and annotations, burying good scholarship in rubbish" (Robinson, quoted in Brumfield 2013c). To avoid this, Brumfield (2013c) suggests that partnerships and dialogue between volunteers and professionals are essential, to make methodologies for approaching texts visible, and to allow volunteers to become advocates "not just for the material and the materials they are working on through crowdsourcing project, but for editing as a discipline" (ibid).

Care needs to be taken, then, when setting up a crowdsourcing transcription project, to ensure that the quality of the resulting transcription is suitable to be used as the basis for further scholarly humanistic enquiry, if the project is to be useful over the longer term and for a variety of research. The methods and approaches for assuring the quality of transcribed content need to be ascertained: whether the project uses double-keying (where two or more people enter the same text to ensure its veracity), or moderation (where an expert in the field signs off the text into a database, agreeing that its content meets benchmarked standards). In addition, the format in which the data is stored needs to be structured to ensure that complex representational issues are preserved, and that any resulting data created can be easily reused and textual models can be understood, repurposed, or integrated with other collections. As Brumfield points out (2013a), Digital Humanities already has a standard for documentary scholarly editing in the Text Encoding Initiative guidelines (TEI 2014), which have been available since 1990 and provide a flexible but robust framework within which to model, analyse, and present textual data. However, only seven of the crowdsourcing manuscript transcription tools (out of the thirty then available) attempt to integrate TEI-compliant XML encoding into their workflow (Brumfield 2013a). Projects which have used TEI markup as part of the manuscript transcription process, such as Transcribe Bentham, have demonstrated that users can easily learn the processes of encoding texts with XML if clear guidance and instruction are given to them, and it is explained why they should make the effort to do so (Brumfield 2013a, Causer and Terras Forthcoming 2014). Brumfield (2013a) stresses that it is the responsibility of those involved in academic scholarly editing within the Digital Humanities to ensure that their work on establishing methods and guidelines for academic transcription is felt within the development of public-facing transcription tools, and that if we are engaging users so that they can build their own skillsets, we need to use our digital platforms to train them according to pedagogical and scholarly standards: "Crowdsourcing is a school. Programs are the teachers. We have to get it right" (Brumfield 2013d).
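To make the double-keying step described above concrete, the comparison at its core can be sketched in a few lines of Python. This is a minimal, hypothetical illustration – the sample transcripts, the similarity measure (Python's standard difflib module) and the threshold are our own assumptions, not part of any project described in this chapter – in which a pair of independently keyed transcripts that disagree beyond a set tolerance is flagged for an expert moderator:

import difflib

def needs_moderation(transcript_a, transcript_b, threshold=0.95):
    """Compare two independently keyed transcripts of the same page.

    Returns True if their similarity falls below the threshold,
    i.e. the pair should be referred to a moderator.
    """
    ratio = difflib.SequenceMatcher(None, transcript_a, transcript_b).ratio()
    return ratio < threshold

# Hypothetical volunteer transcriptions of the same manuscript page
keyed_once = "the greatest happiness of the greatest number"
keyed_twice = "the greatest happines of the greatest number, J. Bentham"

if needs_moderation(keyed_once, keyed_twice):
    print("Disagreement detected: refer to moderator")
else:
    print("Transcripts agree within tolerance")

A real workflow would of course operate on whole documents and feed flagged passages into a moderation queue; the point here is only that the veracity check itself is computationally straightforward, leaving the hard institutional questions – who moderates, and against which standards – untouched.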
Brumfield (2013c) also highlights that it is the responsibility of those working in document editing, and in the Digital Humanities, to release guides to editing and transcribing that are accessible to those with no academic training in this area, such as the computer programmers building transcription tools, if we wish the resulting interfaces to allow community-led transcription to result in high quality textual material.

Future Issues in Digital Humanities Crowdsourcing

We are now at a stage where crowdsourcing has joined the ranks of established digital methods for gathering and classifying data for use in answering the types of questions of interest to Humanities scholars, although there is much research that still needs to be done on user response to crowdsourcing requests, and on how best to build and deliver projects. There are also issues about data management, given that crowdsourcing is now reaching a mature phase in which a variety of successful projects have amassed large amounts of data, often from different sources within individual projects: the million pages from Old Weather, from different archives; over 3 million words transcribed by volunteer labour in the Transcribe Bentham project (Grint 2013), from both UCL and the British Library; approximately one and a half thousand letters transcribed in Soldier Studies (Soldier Studies 2014), which at a conservative estimate must give at least half a million words of correspondence from the American Civil War, culled from images of letters sold on eBay which are now in private hands. Issues are therefore arising about sustainability: what will happen to all this data, particularly with regard to projects that do not have institutional resources or affiliation for long-term backup or storage?

There are also future research avenues to investigate in cross-project sharing and amalgamation of data: one can easily imagine either centrally managed or federated repositories of crowdsourced information that contain all the personal diaries that have been transcribed, searchable by date, place, person, etc.; or all letters and correspondence that have been sent over time; or all newspapers that were issued on a certain date worldwide. Both legal and technical issues will come into play with this, as questions of licensing (who owns the volunteer-created data? to whom does the copyright belong?) and cross-repository searching will have to be negotiated, with the related costs of delivery mechanisms and platforms covered.

The question of the ethics of crowdsourcing also underlies much of this effort in the Humanities and the cultural and heritage sector, and projects have to be careful to work with volunteers, rather than exploit them, when building up these repositories and reusing and repurposing data in the future. Ethical issues come sharply into focus when projects start to pay (usually very little) for the labour involved, particularly when using online crowdsourcing labour brokers such as Amazon's Mechanical Turk (https://www.mturk.com/mturk/welcome), which has been criticised as a "digital sweatshop… critics have emerged from all corners of the labor, law, and tech communities. Labor activists have decried it as an unconscionable abuse of workers' rights, lawyers have questioned its legal validity, and academics and other observers have probed its implications for the future of work and of technology" (Cushing 2013).
The relationship between commerce and volunteers, payment and cultural heritage, resources and outputs, online culture and the online workforce, is complex. A project such as "Emoji-Dick" (https://www.kickstarter.com/projects/fred/emoji-dick) – which translated Moby Dick into Japanese emoji icons using Amazon's Mechanical Turk – is a prime example of what emerges when the lines of public engagement, culture, art, fun, low-paid crowdsourced labour, crowdfunding, and an internet meme collide. Institutions and scholars planning on tapping into the potential labour force crowdsourcing offers have to be aware of the problems in outsourcing such labour, often very cheaply, to low-paid workers, often in third world countries (Cushing 2013).

Returning to Brabham's typology of crowdsourcing projects, we can also see that although most projects that have used crowdsourcing in the Humanities are information management tasks, in that they ask volunteers to help enter, collate, sort, organise, and format information, there is also the possibility that crowdsourcing can be used within the Humanities for ideation tasks: asking big questions, and proposing solutions. This area is undocumented within Digital Humanities, although the Association for Computers and the Humanities (ACH) and the 4Humanities.org initiative have both used an open source platform, All Our Ideas (http://www.allourideas.org/), to help scope out future initiatives (ACH 2012, 4Humanities 2012). ACH also hosts and supports DH Questions and Answers (http://digitalhumanities.org/answers/), a successful community-based question-and-answer board for Digital Humanities issues, which falls within the ideation category of crowdsourcing. There is much scope within the Humanities in general to explore this methodology and ideation mechanism further, and to engage the crowd in both proposing, and solving, questions about the Humanities, rather than only using it to self-organise Digital Humanities initiatives.

Crowdfunding is another relatively new area allied to crowdsourcing which could be of great future benefit to Digital Humanities, and to Humanities projects in general. Only a few projects have been started to date within the GLAM sector, both for traditional collections acquisition and for digital projects: the British Library is attempting to crowdfund the digitisation of historical London maps (British Library 2014); the Naturalis Biodiversity Centre in Leiden is raising funds via crowdfunding to purchase a Tyrannosaurus rex skeleton (http://tientjevoortrex.naturalis.nl/); the Archiefbank of the Stadsarchief Amsterdam has raised 30,000 euros to digitise and catalogue the Amsterdam death registers between 1892 and 1920 (Stadsarchief Amsterdam 2012); and a campaign to crowdfund the £520,000 needed to buy the cottage on the Sussex coast where William Blake wrote "England's green and pleasant land" had been launched at the time of writing (Flood 2014). Micropasts (http://micropasts.org/), a project recently funded by the UK's Arts and Humanities Research Council and based at University College London and the British Museum, will be developing a community platform for conducting, designing and funding research into the human past: over the next few years this will be an area with much potential for involving those outside the academy in core issues within Humanities scholarship.
Crowdsourcing also offers a relatively agile mechanism for those working in Digital Humanities to respond immediately to important contemporary events, preserving and collating evidence, ephemera, and archive material for future scholarship and community use. For example, the September 11 Digital Archive (http://911digitalarchive.org), which "uses electronic media to collect, preserve, and present the history of the September 11, 2001 attacks in New York, Virginia, and Pennsylvania and the public responses to them" (September 11 Digital Archive), began as a collaboration between the American Social History Project at the City University of New York Graduate Center and the Center for History and New Media at George Mason University, immediately after the terrorist attacks. Likewise, the Our Marathon Archive (http://marathon.neu.edu/), led by Northeastern University, provides an archival and community space to crowdsource an archive of "pictures, videos, stories, and even social media related to the Boston Marathon; the bombing on April 15, 2013; the subsequent search, capture, and trial of the individuals who planted the bombs; and the city's healing process" (Our Marathon 2013). There is clearly a role here for those within the Digital Humanities with technical and archival expertise to respond to contemporary events by building digital platforms that will keep records for the future, whilst engaging with a community – and often a society – in need of sustained dialogue to process the ramifications of such events.

There is also potential for more sustained and careful use of crowdsourcing within both the university and the school classroom, to promote and integrate ongoing Humanities research aims, but also to "meet essential learning outcomes of liberal education like gaining knowledge of culture, global engagement, and applied learning" (Frost Davis 2012). There are opportunities for motivated students to become more involved and engaged with projects that digitize, preserve, study, and analyse resources, encouraging them to gain first-hand knowledge of humanities issues and methods, but also to understand the role that digital methods can play in public engagement:

Essential learning outcomes aim at producing students with transferrable skills; in the globally networked world, being able to produce knowledge in and with the network is a vital skill for students. Students also benefit from exposure to how experts approach a project. While these tasks may seem basic, they lay the groundwork for developing deeper expertise with practice so that participation in crowdsourcing projects may be the beginning of a pipeline that leads students on to more sophisticated digital humanities research projects. Even if students don't go on to become digital humanists, crowdsourced projects can help them develop a habit of engagement with the (digital) humanities, something that is just as important for the survival of the humanities. Indeed, a major motivation for humanities crowdsourcing is that involving the public in a project increases public support for that project (Frost Davis 2012).
Crowdsourcing within the Humanities will continue to evolve, and offers much scope for using public interest in the past to bring together data and build projects which can benefit Humanities research:

Public involvement in the humanities can take many forms – transcribing handwritten text into digital form; tagging photographs to facilitate discovery and preservation; entering structured or semi-structured data; commenting on content or participating in discussions, or recording one's own experiences and memories in the form of oral history – and the relationship between the public and the humanities is convoluted and poorly understood (Hedges and Dunn 2012, p. 4).

By systematically applying, building, evaluating, and understanding the uses of crowdsourcing within culture, heritage and the humanities, by helping develop the standards and mechanisms to do so, and by ensuring that the data created will be useable for future scholarship, the Digital Humanities can aid in creating stronger links between the public and humanities research. In turn, crowdsourcing then becomes a method of advocacy for the importance of humanities scholarship, involving and integrating non-academic sectors of society into areas of humanistic endeavour.

Conclusion

This chapter has surveyed the phenomenon of using digital crowdsourcing activities to further our understanding of culture, heritage and history, rather than simply identifying the activities of digital humanities centres, or of self-identified digital humanities scholars, which do so. This is in itself an important discussion to have about the nature of Digital Humanities research, its home, and its purview. Much of the crowdsourcing activity identified in the GLAM sector fits comfortably under the Digital Humanities umbrella, even if those involved did not self-identify with that classification: there is a distinction to be made between projects which operate within the type of area which is of interest to Digital Humanities, and those run by Digital Humanities centres and scholars. With that in mind, this chapter has highlighted various ways in which those working in Digital Humanities can help advise, create, build, and steer crowdsourcing projects working in the area of culture and heritage, both to add to our understanding of crowdsourcing as a methodology for humanities research, and to build up resulting datasets which will allow further humanities research questions to be answered. Given the current pace of development in the area of crowdsourcing within this sector, there is much that can be contributed from the Digital Humanities community to ensure that the resulting methods and datasets are useful, and reusable, particularly within the arena of document transcription and encoding. In addition, crowdsourcing affords vast opportunities for those working within the Digital Humanities to provide accessible demonstrators of the kind of digital tools and projects which are able to forward our understanding of culture and history, and also offers outreach and public engagement opportunities to show that Humanities research, in its widest sense, is a relevant and important part of the scholarly canon, to as wide an audience as possible.
In many ways, crowdsourcing within the cultural and heritage sectors is Digital Humanities writ large: indicating an easily accessible way in which we can harness computational platforms and methods to engage a wide audience to contribute to our understanding of society, and of our cultural inheritance.

Short Biographical Note

Melissa Terras is Director of UCL Centre for Digital Humanities, Professor of Digital Humanities in UCL's Department of Information Studies and Co-Investigator of the award-winning Transcribe Bentham crowdsourcing project (www.ucl.ac.uk/transcribe-bentham). Her research spans various aspects of digitisation and public engagement. You can generally find her on twitter @melissaterras.

Abstract

A recent movement in the cultural and heritage industries has been to trial crowdsourcing (the harnessing of online activities and behaviour to aid in large-scale ventures such as tagging, commenting, rating, reviewing, text correcting, and the creation and uploading of content in a methodical, task-based fashion) to improve the quality of, and widen access to, online collections. Building on this, within Digital Humanities there have been attempts to crowdsource more complex tasks traditionally assumed to be carried out by academic scholars, such as the accurate transcription of manuscript material. This chapter surveys the growth and uptake of crowdsourcing within Digital Humanities, raising issues which emerge when building projects for and with a wide online audience.

Keywords

Crowdsourcing, public engagement, digitisation, online participation, citizen science.

Further Reading

Brabham, D. C. (2013). Crowdsourcing. MIT Press Essential Knowledge Series. London, England: MIT Press.
Brumfield, B. (2013a). Itinera Nova in the World(s) of Crowdsourcing and TEI. Collaborative Manuscript Transcription Blog. http://manuscripttranscription.blogspot.co.uk/2013/04/itinera-nova-in-worlds-of-crowdsourcing.html. Accessed 17th January 2014.
Brumfield, B. (2013c). The Collaborative Future of Amateur Editions. Collaborative Manuscript Transcription Blog. http://manuscripttranscription.blogspot.co.uk/2013/07/the-collaborative-future-of-amateur.html. Accessed 28th January 2014.
Causer, T. and Terras, M. (Forthcoming 2014). "Crowdsourcing Bentham: beyond the traditional boundaries of academic history". Accepted, International Journal of Humanities and Arts Computing.
Flood, A. (2014). "Crowdfunding campaign hopes to save William Blake's cottage for nation". Guardian, 11 September 2014. http://www.theguardian.com/culture/2014/sep/11/crowdfunding-campaign-william-blake-cottage
Frost Davis, R. (2012). "Crowdsourcing, Undergraduates, and Digital Humanities Projects". http://rebeccafrostdavis.wordpress.com/2012/09/03/crowdsourcing-undergraduates-and-digital-humanities-projects/. Accessed 29th January 2014.
Hedges, M. and Dunn, S. (2012). Crowd-Sourcing Scoping Study: Engaging the Crowd with Humanities Research. Arts and Humanities Research Council. http://crowds.cerch.kcl.ac.uk/. Accessed 16th January 2014.
Holley, R. (2010). Crowdsourcing: How and Why Should Libraries Do It? D-Lib Magazine, 16 (2010). http://www.dlib.org/dlib/march10/holley/03holley.html. Accessed 17th January 2014.
Owens, T. (2012b). The Crowd and The Library. http://www.trevorowens.org/2012/05/the-crowd-and-the-library/. Accessed 16th January 2014.
Ridge, M. (2012). Frequently Asked Questions about crowdsourcing in cultural heritage. Open Objects blog. http://openobjects.blogspot.co.uk/2012/06/frequently-asked-questions-about.html. Accessed 18th January 2014.
Bibliography

ACH (2012). ACH Agenda Setting: Next Steps. Association for Computers and the Humanities Blog. http://ach.org/2012/06/04/ach-agenda-setting-next-steps/. Accessed 29th January 2014.
Amazon Mechanical Turk (2014). Amazon Mechanical Turk, Welcome. https://www.mturk.com/mturk/welcome. Accessed 16th January 2014.
Amsterdam Centre for Digital Humanities (2013). Modeling Crowdsourcing for Cultural Heritage. http://cdh.uva.nl/projects-2013-2014/m.o.c.c.a.html. Accessed 17th January 2013.
Brabham, D. C. (2013). Crowdsourcing. MIT Press Essential Knowledge Series. London, England: MIT Press.
British Library (2014). Unlock London Maps and Views. http://support.bl.uk/Page/Unlock-London-Maps. Accessed 29th January 2014.
Brohan, P. (2012). New Uses for Old Weather. Position Paper, AHRC Crowdsourcing Study Workshop, May 2012. http://crowds.cerch.kcl.ac.uk/wp-content/uploads/2012/04/Brohan.pdf. Accessed 29th January 2014.
Brumfield, B. (2013a). Itinera Nova in the World(s) of Crowdsourcing and TEI. Collaborative Manuscript Transcription Blog. http://manuscripttranscription.blogspot.co.uk/2013/04/itinera-nova-in-worlds-of-crowdsourcing.html. Accessed 17th January 2014.
Brumfield, B. (2013b). A Gresham's Law for Crowdsourcing and Scholarship. Collaborative Manuscript Transcription Blog. http://manuscripttranscription.blogspot.co.uk/2013/10/a-greshams-law-for-crowdsouring-and.html. Accessed 28th January 2014.
Brumfield, B. (2013c). The Collaborative Future of Amateur Editions. Collaborative Manuscript Transcription Blog. http://manuscripttranscription.blogspot.co.uk/2013/07/the-collaborative-future-of-amateur.html. Accessed 28th January 2014.
Brumfield, B. (2013d). In Van Zundert, J. J., Van den Heuvel, C., Brumfield, B., Van Dalen-Oskam, K., Franzini, G., Sahle, P., Shaw, R., Terras, M. (2013). Text Theory, Digital Documents, and the Practice of Digital Editions. Panel session, Digital Humanities 2013, University of Nebraska, Lincoln. July 2013.
Causer, T. and Terras, M. (Forthcoming 2014). "Crowdsourcing Bentham: beyond the traditional boundaries of academic history". Accepted, International Journal of Humanities and Arts Computing.
Cushing, E. (2013). "Amazon Mechanical Turk: The Digital Sweatshop." UTNE, January/February 2013. http://www.utne.com/science-and-technology/amazon-mechanical-turk-zm0z13jfzlin.aspx#axzz3DNzILSHI
Finnegan, R. (2005). Participating in the Knowledge Society: Research beyond University Walls. Houndmills-Basingstoke: Palgrave Macmillan.
Flew, T. (2008). New Media: An Introduction (3rd ed.). Melbourne: Oxford University Press.
Frost Davis, R. (2012). "Crowdsourcing, Undergraduates, and Digital Humanities Projects". http://rebeccafrostdavis.wordpress.com/2012/09/03/crowdsourcing-undergraduates-and-digital-humanities-projects/. Accessed 29th January 2014.
Grint, K. (2013). Progress Update, 24 to 30 August 2013. Transcribe Bentham Blog. http://blogs.ucl.ac.uk/transcribe-bentham/2013/08/. Accessed 29th January 2014.
Hedges, M. and Dunn, S. (2012). Crowd-Sourcing Scoping Study: Engaging the Crowd with Humanities Research. Arts and Humanities Research Council. http://crowds.cerch.kcl.ac.uk/. Accessed 16th January 2014.
Holley, R. (2010). Crowdsourcing: How and Why Should Libraries Do It? D-Lib Magazine, 16 (2010). http://www.dlib.org/dlib/march10/holley/03holley.html. Accessed 17th January 2014.
Howe, J. (2006a). Birth of a Meme. Crowdsourcing Blog, May 27th 2006. http://www.crowdsourcing.com/cs/2006/05/birth_of_a_meme.html. Accessed 17th January 2014.
Howe, J. (2006b). The Rise of Crowdsourcing. Wired Magazine, June 2006. http://www.wired.com/wired/archive/14.06/crowds.html. Accessed 17th January 2014.
Howe, J. (2006c). Crowdsourcing: a definition. Crowdsourcing Blog, June 2nd 2006. http://crowdsourcing.typepad.com/cs/2006/06/crowdsourcing_a.html. Accessed 17th January 2014.
Hubble, N. (2006). Mass-Observation and Everyday Life. Houndmills-Basingstoke: Palgrave Macmillan.
iDigBio (2013). CITScribe Hackathon. https://www.idigbio.org/content/citscribe-hackathon. Accessed 30th January 2014.
Mueller, M. (2014). "Shakespeare His Contemporaries: collaborative curation and exploration of Early Modern drama in a digital environment". Digital Humanities Quarterly, Volume 8, Number 3. http://www.digitalhumanities.org/dhq/vol/8/3/000183/000183.html
North American Bird Phenology Program (n.d.). About BPP. http://www.pwrc.usgs.gov/bpp/AboutBPP2.cfm. Accessed 17th January 2014.
Old Weather (2013a). Old Weather: Our Weather's Past, the Climate's Future. http://www.oldweather.org/. Accessed 17th January 2014.
Old Weather (2013b). Old Weather, About. http://www.oldweather.org/about. Accessed 17th January 2013.
O'Reilly, T. (2005). What is Web 2.0? 30th September 2005. http://www.oreilly.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html. Accessed 16th January 2014.
Our Marathon (2013). About The Our Marathon Archive. http://marathon.neu.edu/about. Accessed 28th January 2014.
Owens, T. (2012a). Crowdsourcing Cultural Heritage: The Objectives Are Upside Down. http://www.trevorowens.org/2012/03/crowdsourcing-cultural-heritage-the-objectives-are-upside-down/. Accessed 17th January 2014.
Owens, T. (2012b). The Crowd and The Library. http://www.trevorowens.org/2012/05/the-crowd-and-the-library/. Accessed 16th January 2014.
Ridge, M. (2012). Frequently Asked Questions about crowdsourcing in cultural heritage. Open Objects blog. http://openobjects.blogspot.co.uk/2012/06/frequently-asked-questions-about.html. Accessed 18th January 2014.
Ridge, M. (2013). Digital participation, engagement, and crowdsourcing in museums. London Museums Group blog. http://www.londonmuseumsgroup.org/2013/08/15/digital-participation-engagement-and-crowdsourcing-in-museums/. Accessed 18th January 2014.
Schantz, H. F. (1982). The history of OCR, optical character recognition. Manchester Center, Vt.: Recognition Technologies Users Association.
SeeClickFix (2013). Report non-emergency issues, receive alerts in your neighbourhood. http://en.seeclickfix.com/. Accessed 16th January 2014.
September 11 Digital Archive (2011). About the September 11 Digital Archive. http://911digitalarchive.org/about/index.php. Accessed 29th January 2014.
Shillingsburg, P. L. (2006). From Gutenberg to Google: Electronic Representations of Literary Texts. Cambridge: Cambridge University Press.
Silvertown, J. (2009). A new dawn for citizen science. Trends in Ecology & Evolution, 24(9), pp. 467–71.
Soldier Studies (2014). Civil War Voices, Home Page. http://www.soldierstudies.org/. Accessed 29th January 2014.
Text Encoding Initiative (2014). P5: Guidelines for Electronic Text Encoding and Interchange. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/. Accessed 29th January 2014.
Weingart, S. B. (2013). Submissions to Digital Humanities 2014. The Scottbot irregular. http://www.scottbot.net/HIAL/?p=39588. Accessed 28th January 2014.
4Humanities (2012). All Our Ideas: The Value of the Humanities. http://4humanities.org/2012/10/all-our-ideas-the-value-of-the-humanities/. Accessed 28th January 2014.
work_6ezeddlohnerzdeq7zfm6f2md4 ---- Abstract Machine - Geographical Information Systems (GIS) for literary and cultural studies: 'Mapping Kavanagh'

DOI: 10.3366/ijhac.2011.0005. Corpus ID: 2504582.
Travis, C. (2010). "Abstract Machine - Geographical Information Systems (GIS) for literary and cultural studies: 'Mapping Kavanagh'". Int. J. Humanit. Arts Comput., 4, pp. 17-37.

Abstract: Drawing upon previous theoretical and practical work in historical and qualitative applications of Geographical Information Systems (GIS), this paper, in Gilles Deleuze and Felix Guattari's terminology, conceptualizes GIS as 'an abstract machine' which plays a 'piloting role' which does not 'function to represent' something real, but rather 'constructs a real which is yet to come.' To illustrate this digital humanities mapping methodology, the essay examines Irish writer Patrick Kavanagh's novel…
work_6fe3wiohwvaknjow5afof3i6gi ---- Mining oral history collections using music information retrieval methods

Article (Accepted Version)
Webb, Sharon, Kiefer, Chris, Jackson, Ben, Baker, James and Eldridge, Alice (2017). Mining oral history collections using music information retrieval methods. Music Reference Services Quarterly, 20 (3-4), pp. 168-183. ISSN 1058-8167.
This version is available from Sussex Research Online: http://sro.sussex.ac.uk/id/eprint/71250/
The Version of Record of this manuscript has been published and is available in the Journal, Music Reference Services Quarterly, November 2017, http://dx.doi.org/10.1080/10588167.2017.1404307
Abstract

Recent work at the Sussex Humanities Lab, a digital humanities research program at the University of Sussex, has sought to address an identified gap in the provision and use of audio feature analysis for spoken word collections. Traditionally, oral history methodologies and practices have placed emphasis on working with transcribed textual surrogates rather than the digital audio files created during the interview process. This provides pragmatic access to the basic semantic content, but obviates access to other potentially meaningful aural information; our work addresses the potential for methods to explore this extra-semantic information by working with the audio directly. Audio analysis tools, such as those developed within the established field of Music Information Retrieval (MIR), provide this opportunity. This paper describes the application of audio analysis techniques and methods to spoken word collections. We demonstrate an approach using freely available audio and data analysis tools, which have been explored and evaluated in two workshops. We hope to inspire new forms of content analysis which complement semantic analysis with investigation into the more nuanced properties carried in audio signals.

Mining oral history collections using music information retrieval methods
Webb, S., Kiefer, C., Jackson, B., Baker, J. & Eldridge, A.
Music Reference Services Quarterly (Taylor and Francis)

1 Introduction

The Sussex Humanities Lab is a multidisciplinary research program tasked with embedding digital humanities into research and teaching practices across the University of Sussex. As a multidisciplinary team we have unique access to varied expertise and skills that enable us to carry out experimental work in an agile and proficient manner. One experimental project problematized the predominant approach within digital humanities – a largely text-based domain – of treating digital audio files as text.1 We applied Music Information Retrieval (hereafter MIR) techniques to oral history interviews in order to develop new approaches, complementary to text-based methods, for extracting semantic information from spoken word collections. As an established field with established methods, the MIR community provides the open source tools, code and libraries needed to work through our hypothesis – to treat audio as audio – and to establish its practical application to spoken word collections.
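To make this starting point concrete, the sketch below loads a digitized interview as a numerical signal – the 'audio as audio' representation on which all of the analyses discussed in this article operate. It is a minimal illustration assuming the freely available librosa Python library and a hypothetical file name:

import librosa

# Load a digitized oral history interview as a waveform.
# 'interview.wav' is a hypothetical file name; sr=None preserves
# the native sample rate of the recording.
y, sr = librosa.load("interview.wav", sr=None)

print(f"{len(y)} samples at {sr} Hz = {len(y) / sr:.1f} seconds of audio")

Everything that follows – feature extraction, clustering, event detection – operates on arrays of this kind rather than on a textual surrogate.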
Having established the potential utility of MIR techniques to problems in both oral history and the digital humanities, we developed a workshop framework that aimed at exploring the utility of this approach for a variety of humanities scholars.

1 Notable exceptions include Tanya Clement and Stephen McLaughlin, "Measured Applause: Toward a Cultural Analysis of Audio Collections," Cultural Analytics 1, no. 1 (2016), http://culturalanalytics.org/2016/05/measured-applause-toward-a-cultural-analysis-of-audio-collections/; Tanya Clement, Kari Kraus, Jentery Sayers and Whitney Trettien, ": The Intersections of Sound and Method," Proceedings of Digital Humanities 2014, Lausanne, Switzerland, https://terpconnect.umd.edu/~oard/pdf/dh14.pdf

Taking oral history collections from the University of Sussex 'Archive of Resistance' as a test case, we led two distinct groups, at two separate workshops, through the process of using MIR approaches to categorize, sort and discover audio collections. This process enabled us to:

- Build a set of Python workbooks that provide a conceptual and practical introduction to the application of MIR techniques (e.g. feature extraction and clustering) to spoken word collections.
- Work through, develop and amend use cases.
- Learn lessons, from two distinct communities and perspectives, about the potential benefits – or otherwise – of our approach.

Both workshops, the first at Digital Humanities 2016 (Krakow, July 2016) and the second at London College of Communication (March 2017),2 provided points of clarification and discussion that enabled us to identify areas that require further work. This article is therefore not a final report on our findings; instead, it is an attempt to capture the hypothesis and problem statement, the experimentation and methodology used, and our preliminary findings. It also describes a method for workshop facilitation that utilizes a) virtual environments to reduce setup time for participants and facilitators and b) Jupyter Notebooks to enable participants to run sophisticated and complex code in a supported learning environment.3

2 'Data-Mining the Audio of Oral History: A Workshop in Music Information Retrieval' at London College of Communication (March 2017) https://web.archive.org/web/20171003144121/http://www.techne.ac.uk/for-students/techne-events/apr-2015/data-mining-the-audio-of-oral-history-a-workshop-in-music-information-retrieval (accessed 3 Oct. 2017)
3 Thomas Kluyver, Benjamin Ragan-Kelley, Fernando Pérez, Brian Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica Hamrick, Jason Grout, Sylvain Corlay, Paul Ivanov, Damián Avila, Safia Abdalla, Carol Willing, Jupyter Development Team, "Jupyter Notebooks – a publishing format for reproducible computational workflows," Positioning and Power in Academic Publishing: Players, Agents and Agendas (2016): 87-90. doi: 10.3233/978-1-61499-649-1-87

This article proceeds in five parts. First, in order to provide some context to this work, we provide some background information on the Sussex Humanities Lab. Parts two, three, and four consider our hypothesis and motivations, the workshops we developed and the technologies we used. The fifth and final part outlines our preliminary findings, both from mining oral history collections using audio feature analysis and from delivering workshops on MIR in a digital humanities context.

1.2 Background

The authors are current or former members of the Sussex Humanities Lab (hereafter SHL): a four-year university program, launched in 2015 at the University of Sussex, which seeks to intervene in the digital humanities. It is a team of 31 faculty, researchers, PhD students and technical and management staff, working in a state-of-the-art space – the Digital Humanities Lab. SHL collaborates with a network of associates across and beyond the university, nationally and internationally, and is radically cross-disciplinary in its approach. The aim of SHL is to engage with the myriad of new and developing technologies, to explore the benefits these offer to humanities research, and to ask: what will technology do to the arts and humanities? To achieve this, SHL is divided into four named strands of activity: digital history and digital archiving; digital media and computational culture; digital technologies and digital performance; digital lives and digital memory. However, the intention is to make sure that our research crosses and links these strands, to develop fruitful methodological and conceptual intersections. The work described here grows from this multidisciplinary ethos, since the project combines the diverse interests and expertise of the authors. It stems from the inherently collaborative environment facilitated by SHL and is influenced by two strands in particular: digital history and digital archiving, and digital technologies and digital performance.

2 Hypothesis: problem statement and motivation

Oral history best practice publications and resources often focus on the application and use of digital methods and tools to create, store and manage audio, audio-visual, and subsequent text files. They recommend, for example, standards for file formats, metadata and text encoding, software for audio-to-text conversion, and database and content management systems. And whilst the privileged position of text has been challenged,4 the majority of oral history projects still rely on the creation of transcripts to carry out analysis using digital tools and methods. This focus on textual surrogates rather than audio sources denies – according to Alessandro Portelli – the 'orality of the oral source'.5 It also denies – or at least underplays – the inherently interpretative nature of transcription.
Of course, textual encodings or transcripts of oral history interviews do have advantages: they are easier to anonymize, distribute, store and retrieve than digital audio files, and there are established techniques for analyzing them as text and/or data. But as a consequence of this privileging of the "text", a significant proportion of oral history collections and the tools provided to navigate and analyze them do not support navigation or analysis of the digital audio files captured during interviews. Instead, they focus on how to record oral history interviews, the management of digital files and the creation of transcripts using both semi-automated audio-to-text tools and manual transcription.6

4 For example, Doug Boyd, "OHMS: Enhancing Access to Oral History for Free," Oral History Review 40, no. 1 (Winter-Spring 2013). doi:10.1093/ohr/oht031
5 Ronald Grele, "Oral History as Evidence," in History of Oral History: Foundations and Methodology, edited by Thomas L. Charlton, Lois E. Myers, and Rebecca Sharpless (UK, 2007), 69.
6 For example, 'Oral History in a Digital Age' http://ohda.matrix.msu.edu/

While using text surrogates is an established tradition, the oral history community is beginning to question the privilege of this text-based approach. This is evident, for example, in the UK's Oral History Society's 2016 conference call for papers, which stated that the 'auditory dimension of oral history [has] for decades [been] notoriously underused'.7 While this move is welcome, it is true that, as per Clement et al.'s 2014 survey of the field, currently 'there are few means for humanists interested in accessing and analyzing [spoken] word audio collections to use and to understand how to use advanced technologies for analyzing sound'.8 Moreover, these technologies have the potential to help resolve some of the backlog in archives and libraries of 'un-described, though digitized, audio collections'.9

It is from this context, therefore, that we decided to explore the potential for direct audio analysis of oral history interviews. This work represents a move towards analyzing oral content in the context in which it was created. It also challenges the privilege of text, as it focuses on extracting information from the audio signal directly. We are particularly interested in how such techniques could complement the semantic content obtained through manual or automated transcription. On the basis that comparable methods have been developed for digital recordings of music for some time, we explored the field of MIR for possible solutions. We are explicitly carrying out a study of computational techniques for the analysis of oral history records with the aim of extracting quantitative results to assist research. The MIR techniques that we use create quantitative information (i.e. timestamps that locate specific features and/or events within the audio) that could enhance and stimulate new directions in the qualitative research of others.

7 "Beyond Text in the Digital Age? Oral history, images and the Written Word" Oral History Society, 2016 Conference CFP: https://web.archive.org/web/20161214045140/http://www.ohs.org.uk/conferences/2016-conference-beyond-text-in-the-digital-age/ accessed 27th June 2017
8 Tanya E. Clement, David Tcheng, Loretta Auvil and Tony Borries, "High Performance Sound Technologies for Access and Scholarship (HiPSTAS) in the Digital Humanities," Proceedings of the Association for Information Science and Technology 51 (2014): 1-10. doi:10.1002/meet.2014.14505101042
9 Ibid.
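A simple illustration of the kind of quantitative information meant above: the sketch below segments an interview recording into non-silent regions, yielding the timestamps at which speech events begin and end. It is a minimal example under our own assumptions (a hypothetical file name; the top_db silence threshold would need tuning per collection), using librosa's built-in interval splitter:

import librosa

# Hypothetical interview recording
y, sr = librosa.load("interview.wav", sr=None)

# Split the signal into non-silent intervals; regions quieter than
# 40 dB below the signal's peak are treated as silence (a tunable
# assumption, not a fixed part of our method).
intervals = librosa.effects.split(y, top_db=40)

# Convert sample indices to timestamps in seconds
for start, end in intervals:
    print(f"speech event: {start / sr:7.2f}s - {end / sr:7.2f}s")

Timestamps of this sort can be attached to a recording as navigational metadata, or used to direct a researcher's listening, without any transcription having taken place.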
MIR draws from digital audio signal processing, pattern recognition, the psychology of perception, software system design, and machine learning to develop algorithms that enable computers to 'listen' to and abstract high-level, musically meaningful information from low-level audio signals. Just as human listeners can recognize pitch, tempo, chords, genre, song structure, etc., MIR algorithms – to a greater or lesser degree – are capable of recognizing and extracting similar information, enabling systems to perform extensive sorting, searching, music recommendation, metadata generation and transcription on vast data sets. Deployed initially in musicology research and more recently for automatic recommender systems, the research potential of MIR tools in non-musical audio data mining is being recognized but has yet to be fully explored in the humanities.

We chose to develop a one-day workshop related to this topic because the approach allowed us to explore our hypothesis and methods on different users, both expert and novice, from different disciplines, digital humanities and oral history, and to garner important, domain-specific feedback.
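As an indication of what such machine 'listening' looks like in practice, the sketch below estimates a fundamental frequency (pitch) contour for a voice recording – the kind of extra-semantic property (intonation, rising and falling delivery) that a transcript discards. It is a minimal illustration assuming librosa version 0.8 or later and a hypothetical file name; the frequency range chosen here is our own assumption, intended only to span typical adult speaking voices:

import librosa
import numpy as np

y, sr = librosa.load("interview.wav", sr=None)

# Estimate a pitch contour with the YIN algorithm; the fmin/fmax
# range (roughly 65-500 Hz) is an assumption covering typical
# speaking voices rather than a principled choice.
f0 = librosa.yin(y, fmin=65, fmax=500, sr=sr)

times = librosa.times_like(f0, sr=sr)
print(f"median pitch: {np.median(f0):.0f} Hz over {times[-1]:.1f} seconds of audio")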
Can this new information enhance what we know about an object and improve search and discoverability? 1.3. Can we detect the use of different recording devices as a means of clustering and classifying two temporally distinct data sets? 2. Context - the content (i.e. what type of content analysis can we carry out): 2.1. What descriptive metadata can we automatically extract from the digital audio file? For example, can we create a feature which distinguishes interviewer from interviewee? Could we use this to automatically detect a specific voice within a collection? 2.2. Can we reveal anything about the relationship or dynamic between interviewer and interviewee? For example, can we detect overlaps or interruptions by the interviewer? Can this reveal anything about gender roles and/or behaviors? http://dx.doi.org/10.1080/10588167.2017.1404307 The Version of Record of this manuscript has been published and is available in the Journal, Music Reference Services Quarterly, November 2017, http://dx.doi.org/10.1080/10588167.2017.1404307 8 2.3. Can we augment our ability to detect emotion by analyzing changes in rhythm, timbre, tone, tempo? Is it therefore possible to identify song, poetry, speech, crying, laughter, etc.? 2.4. Can we automatically cluster acoustically similar audio/material/objects? For which properties might this be most robust? 2.5. Can we use techniques from musical analysis to reveal structure in spoken audio, for example to pull apart different voices, and how might this be useful for oral history collections? 3. Context- the environment: 3.1. Can we detect any environmental features in the audio stream? What might this tell us about where the interview took place. 3.2. Can we use source separation, developed to separate parts (e.g. drums, vocals, keyboards in pop music), to pull apart intertwined ‘voices’ or ‘noises’. Can we use this to remove background noise that provides context to recordings? How might this affect the analysis of interviews? Enabling these kind of preprocessing and descriptive orientated steps affords new possibilities in oral history research and archival management. For example, these enable access to under described repositories such as the wealth of content created by the YouTube generation. This will enable new opportunities of empirical analysis and supporting qualitative research (e.g. gender studies). The first workshop, ‘Music Information Retrieval Algorithms for Oral History Collections’, was facilitated by the authors in July 2016 at the Digital Humanities 2016 http://dx.doi.org/10.1080/10588167.2017.1404307 The Version of Record of this manuscript has been published and is available in the Journal, Music Reference Services Quarterly, November 2017, http://dx.doi.org/10.1080/10588167.2017.1404307 9 conference.10 The workshops introduced participants to specialist software libraries and applications used by MIR researchers.11 All tools used in the workshop are freely available and cross-platform, meaning our examples are extendable, reusable and shareable. We used open source Python libraries for audio feature analysis, maths and machine learning. Additionally we packaged all dependencies in a Virtual Machine (VM) for ease of accessibility (see section 3.3). All Python code was written and executed in the Jupyter Notebook environment. 
Jupyter enabled us to develop pre-written examples that supported participants from various backgrounds: those new to coding could immediately engage with working examples, whilst those with more technical experience could edit and explore the code as they wished. In these exploratory sessions we worked with digital audio files from the 'Archive of Resistance': a growing collection of oral history content related to forms of resistance in history (for example, British Special Operations Executive operations during World War II) that is held at the Keep, an archive located near the University of Sussex.

3.1 Introduction to MIR and technologies used

During the last decade content-based music information retrieval has moved from a small field of study to a vibrant international research community12 whose work has increasing application across the music and sound industries.13 Driven by the growth of digital and online audio archives, tools developed in MIR enable musically-meaningful information to be extracted directly from a digital audio signal by computational means, offering an automated, audio-based alternative to text-based tagging (the latter of which is common to both spoken word and music collections).14 For example, digital audio files can be automatically described using high-level musical features such as melodic contour, tempo or even "danceability".15 These features are designed to enable automatic genre recognition or instrument classification, which in turn support archive management and recommender services. Applications of these methods in musical research and in industry include:16

- Music identification (commonly associated with software applications such as Shazam and SoundHound), plagiarism detection and copyright monitoring to ensure correct attribution of musical rights, and identification of live vs studio recordings for database normalization and near-duplicate results elimination.
- Mood, style, genre, composer or instrumental matching for search, recommendation and organization of musical archives.
- Music vs speech detection for radio broadcast segmentation and archive cataloguing.

Techniques are numerous and rapidly evolving, but most methods work by extracting low-level audio features and combining these with domain-specific knowledge (for example, that hip-hop generally has fewer beats per minute than dubstep) to create models from which more musically-meaningful descriptors can be built and – in turn – tempo, or melody, and ultimately genre, composer, etc. might be identified.

10 See http://dh2016.adho.org/workshops/
11 See 'Listening for Oral History', https://github.com/algolistening/MachineListeningforOralHistory, and 'Music Information Retrieval Algorithms for Oral History Collections', Zenodo (July 2016), https://zenodo.org/record/58336#.WdOghLzyt24
12 http://www.ismir.net/
13 http://the.echonest.com/
14 Downie, J. Stephen. "Music information retrieval." Annual Review of Information Science and Technology 37, no. 1 (2003): 295-340.
15 http://the.echonest.com/app/danceability-index/
16 Michael A. Casey, Remco Veltkamp, Masataka Goto, Marc Leman, Christophe Rhodes and Malcolm Slaney, "Content-based music information retrieval: Current directions and future challenges," Proceedings of the IEEE 96, no. 4 (2008): 668-696.
Low-level features are essentially statistical summaries of audio in terms of the distribution of energy across time or frequency range. Some features equate to perceptual characteristics such as the pitch, brightness or loudness of a sound; others, such as MFCCs (Mel-frequency Cepstral Coefficients), provide computationally powerful timbral descriptions but have less obvious direct perceptual correlates. Such low-level features can then be used to create methods to find sonically-salient events, such as an onset detector, to identify when an instrument or voice starts playing. This low-level information can then be combined with domain-specific knowledge – such as the assumption that note onsets occur at regular intervals in most music – to create a tempo detector. In turn, this might be used to inform musical genre recognition, in the knowledge – as above – that hip-hop generally has fewer beats per minute than dubstep. Just as these low-level features can be combined and constrained to create high-level information with many applications in engaging with and managing music archives, we are interested in the possibility that information of interest to historians and digital humanists might be discoverable in a digital audio file that would be missed by the analysis of semantic, textual surrogates alone. Whilst no off-the-shelf tools exist for such analysis yet, the open, experimental ethos of digital audio and machine learning research cultures means that there are many accessible software toolkits available which enable rapid experimentation.

3.2 Learning MIR in Jupyter Workbooks

Content-based MIR combines methods from digital signal processing, machine learning and music theory, which in turn draw upon significant perceptual, mathematical and programming knowledge and experience. Together these are skills that can take years to acquire. We wished to provide sufficient insight into the core concepts and techniques to inspire the imaginations of humanities researchers – with very mixed technical experience and interests – in a single-day workshop. Fortunately, many of the complex technical and conceptual underpinnings can be readily grasped with audio-visual illustration, especially if they can be interactively explored. We therefore chose a constructionist approach in the form of hands-on workshops where participants learned through exploring interactive workbooks containing a mix of text-based information, audio-visual illustration and executable, editable code. This meant participants could work through carefully designed examples and learn by editing and exploring the code, all without having to grasp the mathematical bases of the ideas.
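The sketch below illustrates the kind of compact, editable example this approach favors, layering a domain assumption on top of a low-level feature as in the onset-to-tempo chain described in section 3.1. It is a generic illustration of the technique using librosa rather than workshop code, and the file name is a hypothetical stand-in.

```python
import librosa

# Hypothetical stand-in for any music or speech recording
y, sr = librosa.load("recording.wav", sr=None)

# Low-level feature: frame-wise onset strength (rises in spectral energy)
onset_env = librosa.onset.onset_strength(y=y, sr=sr)

# Domain assumption layered on top: onsets recur at roughly regular
# intervals, so periodicity in the onset envelope yields a tempo estimate.
tempo = librosa.beat.tempo(onset_envelope=onset_env, sr=sr)
print(f"Estimated tempo: {tempo[0]:.1f} BPM")
```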
Figure 1: Screenshot of a workbook that introduces participants to some basic methods (reading and loading digital audio files).

Jupyter Notebooks were used to present example code in interactive workbooks which combined formatted (Markdown) text, executable code and detailed, dynamic audio-visual illustration. Jupyter provides a rich and supportive architecture for interactive computing, including a powerful interactive shell, support for interactive data visualization and GUI toolkits, flexible, embeddable interpreters and high-performance tools for parallel computing. For novice and expert users alike it offers an interactive coding environment ideal for teaching, learning and rapidly experimenting with and sharing ideas. Executable code was written in Python, a human-readable general-purpose programming language which is fast becoming the primary choice in data science, as well as in computer science education in general. A vibrant, active community of users contributes to the well-maintained open-source libraries which we used in the workbooks. These include: librosa (for music and audio analysis), matplotlib and IPython display (for visualisation), scikit-learn (for machine learning), and the SciPy and NumPy mathematical libraries (see Resources for a full list).

3.3 Sharing workbooks in a Virtual Machine: Reducing barriers to participation

Workshop participants were humanities scholars from a range of backgrounds, with differing levels of programming experience and computing knowledge. The requirement to install and configure the necessary collection of developer tools on a disparate selection of participant laptops had the potential to consume significant amounts of workshop time, increase the differences in participant progression through the schedule of activities, and diminish the amount of time available to explore MIR techniques. To avoid this, we created server- and virtual machine (VM)-based Python development environments for the workshop sessions. This approach reduced technological barriers to participation. VM images were created and distributed on USB memory sticks. Installation of the developer tools, sample digital audio files and a minimal host operating system (Lubuntu 32-bit) resulted in a VM image size of about 8GB. Oracle VM VirtualBox was selected as the technology to implement the VM; the software is free and cross-platform.17 The main drawbacks of the approach were the large amount of storage needed on user machines, and the requirement for authors to create content far enough ahead of the event that it could be distributed with the VM image. In response to the first drawback – the requirement of 8GB of available disk space is a barrier to adoption for some users – a server-based alternative was also developed.

17 https://www.virtualbox.org/
The local computing requirement for the server-based approach is a modern web browser to run the Jupyter Notebooks and a terminal program capable of implementing the network communication method used for the service, such as a Secure Shell (SSH) tunnel. The reason for using SSH is that it simplifies security concerns relating to the provision of unrestricted access to a Python development environment across the internet. Tunneling through SSH is not strictly necessary for secure access; in our case it provided a technique that did not impose restrictions on contributor code development methods. This is a trade-off between simplicity and restriction of user behavior.

4 Workbook content

The workbooks were designed to be taught across a full day. In the morning session, they were used to introduce participants to the key concepts of coding, digital audio and audio features. In the afternoon, they were used by participants to apply these ideas and methods to an illustrative example. Workbook One introduced basic Python and the Jupyter notebook, with interactive exercises to familiarize participants with navigating the environment, executing code, carrying out basic mathematical operations and getting help. Workbook Two introduced the fundamental practical tools and ideas necessary to work with digital audio. These included loading, playing and visualizing digital audio files, ways of understanding how audio is represented digitally, and ways to visualize and analyze the frequency content of audio files. Workbook Three used plotting and listening to develop an intuitive understanding of audio features, as well as introducing practical tools and existing libraries used to inspect digital audio files and extract audio features. The worked-through example in Workbook Three demonstrates how simple, low-level audio features (spectral bandwidth and the average number of zero-crossings) can be used to distinguish between recordings of female and male interviewees. The two interviews provided were ten-minute interviews from the 'Archive of Resistance': one with a French woman and one with an English man. The participants used the workbook to split the digital audio files from these interviews into one-second chunks and then extracted a range of illustrative audio features. Finally, unsupervised clustering (k-means) was applied and the results of different pairs of features plotted to see which most successfully separated the two files. We found that even without clustering, the files could be separated with just two audio features.

Figure 2: Scatter plot showing spectral bandwidth versus zero-crossing for all 1200 one-second chunks of two ten-minute interviews.

Figure 2 is a scatter plot that shows spectral bandwidth versus zero-crossings for all 1200 one-second chunks of the two ten-minute interviews. Segments from recordings of the male speaker are colored blue; the female speaker segments are red.
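A minimal sketch of this chunk-and-cluster workflow is given below, assuming librosa and scikit-learn as in the workbooks. The file names are hypothetical stand-ins for the two archive interviews, and the details (exact feature settings, chunking) are simplified relative to the workbook.

```python
import numpy as np
import librosa
from sklearn.cluster import KMeans

def chunk_features(path, chunk_s=1.0):
    """Mean zero-crossing rate and spectral bandwidth per one-second chunk."""
    y, sr = librosa.load(path, sr=None)
    n = int(sr * chunk_s)
    feats = []
    for start in range(0, len(y) - n + 1, n):
        seg = y[start:start + n]
        zcr = librosa.feature.zero_crossing_rate(seg).mean()
        bw = librosa.feature.spectral_bandwidth(y=seg, sr=sr).mean()
        feats.append([zcr, bw])
    return np.array(feats)

# Hypothetical file names standing in for the two interviews
X = np.vstack([chunk_features("interview_female.wav"),
               chunk_features("interview_male.wav")])

# Unsupervised clustering into two groups, as in the workbook example
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
```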
The two clusters are quite distinct, making it simple to automatically separate the segments of the two files. This demonstrates how low-level features can be used to identify recordings according to distinct characteristics of the speaker's voice. Note that both recordings also contain a male interviewer. In this example, only two files were used, but the approach is scalable to large data sets, demonstrating how audio feature analysis might be used to sort and explore unlabeled archives. Large-scale tests would be necessary to prove the generalizability of these results. Nevertheless, this example illustrates that simple feature analysis holds promise for meeting several of the use cases listed in Section 3. Because different recording devices create digital audio files with differing acoustic profiles, this approach has potential – for example – to reveal information about the content (use case 1.3). Identifying interviewee-specific characteristics suggests a route to automated content analysis: the identification of gender provides useful metadata (use case 2.1) that could underpin further gender-specific analyses (use case 2.2) and potentially be extended to other personal characteristics (use case 2.4).

The final workbook explored how changes in textural information within a sound could be analyzed to identify the points at which the speaker changed from interviewer to interviewee. In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. The mel scale spaces frequency bands to approximate the human auditory system. The coefficients of the MFC (MFCCs) do not intuitively correlate with perceptual properties of sound; however, they do consistently reflect timbral characteristics. If we calculate the MFCCs for short segments, or frames, of audio throughout the files, changes in the values reveal points of timbral, or textural, change. By plotting a two-dimensional self-similarity matrix – in which each frame is compared to all other frames – we can visualize periods of similarity and change. In musical applications, this technique is used to identify structures such as changes between and repetitions of verse and chorus; in spoken word interviews, it allows us to observe changes in texture which reflect transitions between interviewer and interviewee. This example demonstrates how changes within a single file might be used to reveal changes in speaker characteristics: in this case, who the speaker is (use case 2.4). In combination with successful gender identification, this could be applied to use case 2.2, enabling large-scale analysis of interviewer-interviewee dynamics. A similar method could potentially be developed to identify changes in rhythm or timbre (use case 2.3).
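A sketch of this self-similarity analysis is shown below, assuming librosa's MFCC implementation; the file name is again a hypothetical stand-in, and cosine similarity is used here as one reasonable choice of frame-comparison measure.

```python
import numpy as np
import matplotlib.pyplot as plt
import librosa

y, sr = librosa.load("interview_excerpt.wav", sr=None)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)    # shape: (13, n_frames)

# Self-similarity matrix: cosine similarity of every frame with every other.
norm = mfcc / (np.linalg.norm(mfcc, axis=0, keepdims=True) + 1e-9)
ssm = norm.T @ norm                                    # (n_frames, n_frames)

# Blocks of high similarity along the diagonal mark stretches of stable
# timbre; block boundaries suggest textural changes such as a switch
# between interviewer and interviewee.
plt.imshow(ssm, origin="lower", aspect="auto")
plt.xlabel("frame"); plt.ylabel("frame")
plt.show()
```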
5 Findings and lessons learned

The examples in the workbooks illustrated that relatively simple audio analysis could be used to provide useful extra-semantic insights into oral history interviews. However, the degree to which participants grasped the possibilities was directly related to their familiarity with, firstly, digital audio and, secondly, digital methods in general. Those participants who were familiar with basic digital audio concepts and programming techniques (as was the case with many participants at the Digital Humanities 2016 workshop) recognized the potential of this approach, particularly those who worked with large audio archives. Other participants – those who had not previously engaged with computational methods, or done any coding of any kind – found it more difficult to imagine wider usage. This was particularly true for those who worked with very small sets of recordings, for which this type of analysis is largely irrelevant. Whilst the process of developing and facilitating both workshops indicated that MIR methods and technologies can be usefully applied to digital audio files that contain spoken word, adoption of these techniques is likely to occur mainly amongst existing computationally literate communities. For some, understanding how to interpret the visual display of audio files (e.g. the spectrogram) was challenging: 'I found it hard to translate spectrograms and plots to observations about the interviews'.18

Our workbooks allowed participants to carry out sophisticated, complex analysis, yet many participants found it difficult to envisage or imagine questions beyond those that we posed or included in the workbooks. This difficulty is linked to a number of factors. First, the workbooks in effect hide the complexities of a number of different tasks, so that while participants could execute a piece of code and get results, this reduced their understanding of the methods, the capabilities of the software libraries used, and the code developed. Second, while the workbooks scaffolded learning, some participants – especially those new to programming – experienced a steep learning curve. This was especially evident in the second cohort/workshop, which mostly consisted of PhD students and researchers using oral history as a method in their work. Indeed, a portion of this cohort had little to no experience of the term "digital humanities", or indeed of the methods used in this domain. From our perspective it was interesting to discover that many oral history practitioners in this session still use the audio-to-text method as standard procedure. Therefore, working with the audio or digital files in a computational manner was completely unfamiliar territory. However, even though some participants found the workbooks challenging, they indicated to us that by working through the concepts, ideas and indeed the use cases, they felt inspired to re-think their current forms of analysis and to investigate how they might incorporate new forms of computational analysis.

18 MIR for Oral History Collections (feedback)
One participant, working on an historical collection of oral history interviews, reported that they could see how using audio analysis might help to reverse-engineer the methodology or order of the original interviews. Other participants noted that, from an archival perspective, techniques used to cluster "related" content might help with cataloguing collections by creating new forms of metadata. Many participants remarked that although they did not envisage learning the skills necessary to carry out this kind of analysis themselves, having seen the potential, they could see value in collaborating with data scientists to explore new approaches, something they had not previously considered. These remarks made us reflect on how best such methods could be introduced into research communities with little or no prior engagement with computational methods. A common solution is to create a package with a graphical user interface and presets, which users can employ without conceptual or technical knowledge, yet the real potential of such methods can only be realized through hands-on, bespoke experimentation with specific real-world research questions. Our decision to present participants with pre-written executable code was intended as a compromise between these two positions.

In terms of the technological set-up, with further time spent on preparation it would be preferable to develop a service to support the workshop exercises without tunneling the connection, which would result in a more reliable delivery of the service. HTTP(S) communication is resilient to the fluctuating quality that is common on public Wi-Fi networks as it does not require an uninterrupted connection; instead, connections are created and destroyed with every interaction. Provision of server-based development environments is a good fit for cloud-based computing infrastructures. The cost of running the cloud servers used for the workshops was less than £1 (or approx. $1.30) for each event. In hindsight this means that server allocation should have been increased to improve service reliability. During both workshops broken connections were experienced and servers crashed; however, user experience was preserved by monitoring the cloud servers, supporting the participants and reconnecting broken connections as quickly as possible. The lesson learned with regard to connection and server stability was the extent of the variation in computing resource requirements across activities and participants. The lesson learned with regard to the provision of virtual machines was that container technology (Docker) should be used instead: Docker overcomes variations in user software configuration without the need to distribute a full operating system to every user.

Conclusion

Our overall aim for this experimental project was to help the digital humanities and oral history community explore alternatives to the use of textual surrogates in oral history.
Using off-the-shelf tools, we created and disseminated online interactive workbooks which demonstrate how generic audio analysis methods can be used to extract extra-semantic information from digital audio files. This approach might be used to complement traditional semantic analyses, providing automation of existing methods (metadata) or potentially new levels of analysis, such as interviewee-interviewer dynamics. By running participatory workshops, we tested the response of a wide range of humanists interested in oral history collections. The workshops demonstrated that this approach might be of great interest to DH researchers working with large audio databases, but it is unlikely to be rapidly taken up by those working with small data sets, or with a preference for manual methods. Our work suggests great potential for audio analysis in oral history. Refinement of methods to meet the use cases outlined in Section 3 will require systematic research on a wide range of large oral history archives in order to establish how well this work can be generalized and extended. In terms of future adoption in digital humanities communities, as with all computational analyses, a balance must then be sought between providing ready-to-use tools with a low barrier to entry and nurturing a wider technical and conceptual understanding, such that members of the community may build and develop their own methods. As computational literacy grows amongst research communities, we see potential for novel applications of these methods in the future.

Bibliography

Bertin-Mahieux, T., Ellis, D.P., Whitman, B. and Lamere, P. "The Million Song Dataset." ISMIR 2, no. 9 (2011).
Boyd, Doug. "OHMS: Enhancing Access to Oral History for Free." Oral History Review 40, no. 1 (2013): 95–106. doi:10.1093/ohr/oht031
Casey, Michael A., Veltkamp, Remco, Goto, Masataka, Leman, Marc, Rhodes, Christophe and Slaney, Malcolm. "Content-based music information retrieval: Current directions and future challenges." Proceedings of the IEEE 96, no. 4 (2008): 668-696.
Clement, Tanya, Kraus, Kari, Sayers, J. and Trettien, Whitney. ": The Intersections of Sound and Method." Proceedings of Digital Humanities 2014, Lausanne, Switzerland. https://terpconnect.umd.edu/~oard/pdf/dh14.pdf (accessed August 14, 2017)
Clement, Tanya E., Tcheng, David, Auvil, Loretta and Borries, Tony. "High Performance Sound Technologies for Access and Scholarship (HiPSTAS) in the Digital Humanities." Proceedings of the Association for Information Science and Technology 51 (2014): 1–10. doi:10.1002/meet.2014.14505101042
Downie, J. Stephen. "Music information retrieval." Annual Review of Information Science and Technology 37, no. 1 (2003): 295-340.
Grele, Ronald J. "Oral History as Evidence." In History of Oral History: Foundations and Methodology, Thomas L. Carlton, Lois E. Myers and Rebecca Sharpless (eds) (UK, 2007), 69.
Kluyver, Thomas, Ragan-Kelley, Benjamin, Pérez, Fernando, Granger, Brian, Bussonnier, Matthias, Frederic, Jonathan, Kelley, Kyle, Hamrick, Jessica, Grout, Jason, Corlay, Sylvain, Ivanov, Paul, Avila, Damián, Abdalla, Safia, and Willing, Carol (Jupyter Development Team). "Jupyter Notebooks – a publishing format for reproducible computational workflows." Positioning and Power in Academic Publishing: Players, Agents and Agendas (2016): 87-90. doi:10.3233/978-1-61499-649-1-87
Tzanetakis, George and Cook, Perry. "Musical genre classification of audio signals." IEEE Transactions on Speech and Audio Processing 10, no. 5 (2002): 293-302.

Resources

All workbooks, data and slides from the workshops are deposited in both GitHub and Zenodo:
● 'Machine Listening for Oral History', GitHub: https://github.com/algolistening/MachineListeningforOralHistory
● Eldridge, A., Kiefer, C., Webb, S., Jackson, B., & Baker, J. (2016, July). Music Information Retrieval Algorithms for Oral History Collections. Zenodo. http://doi.org/10.5281/zenodo.58336

Python libraries used:
https://www.scipy.org/
http://scikit-learn.org/
https://matplotlib.org/
https://ipython.org/ipython-doc/3/api/generated/IPython.display.html
https://librosa.github.io/librosa/

Other:
https://www.docker.com/

work_6fu6aiwvubhrlhtso5q7z4euzi ----

How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms

City, University of London Institutional Repository
Citation: Puschmann, C. & Bastos, M. T. (2015). How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms. PLoS One, 10(2), e0115035. doi: 10.1371/journal.pone.0115035
This is the published version of the paper. Permanent repository link: http://openaccess.city.ac.uk/15364/
Link to published version: http://dx.doi.org/10.1371/journal.pone.0115035
Copyright and reuse: City Research Online aims to make research outputs of City, University of London available to a wider audience. Copyright and Moral Rights remain with the author(s) and/or copyright holders.

RESEARCH ARTICLE

How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms

Cornelius Puschmann1*‡, Marco Bastos2‡

1 Faculty of Social Sciences, Zeppelin University, Am Seemooser Horn 20D, Friedrichshafen D-88045, Germany
2 Franklin Humanities Institute, Duke University, 114 S. Buchanan Blvd, Bay 5 Box 90403, Durham, North Carolina 27701, United States of America

‡ These authors contributed equally to this work.
* cornelius.puschmann@hiig.de

Abstract

In this paper we compare two academic networking platforms, HASTAC and Hypotheses, to show the distinct ways in which they serve specific communities in the Digital Humanities (DH) in different national and disciplinary contexts. After providing background information on both platforms, we apply co-word analysis and topic modeling to show thematic similarities and differences between the two sites, focusing particularly on how they frame DH as a new paradigm in humanities research. We encounter a much higher ratio of posts using humanities-related terms compared to their digital counterparts, suggesting a one-way dependency of digital humanities-related terms on the corresponding unprefixed labels. The results also show that the terms digital archive, digital literacy, and digital pedagogy are relatively independent from the respective unprefixed terms, and that digital publishing, digital libraries, and digital media show considerable cross-pollination between the specialization and the general noun. The topic modeling reproduces these findings and reveals further differences between the two platforms. Our findings also indicate local differences in how the emerging field of DH is conceptualized and show dynamic topical shifts inside these respective contexts.

Introduction

The advent of the Internet has profoundly affected scholarly communication [1–4]. Few scholars, whether in the sciences, social sciences, or humanities can imagine conducting research or organizing teaching without relying on email, digital library services, or e-learning environments. Formal academic publishing has undergone a series of changes with the increased availability of electronic publications, whether under an open access or toll access regime [5]. Structural changes in the dissemination of knowledge have largely been gradual and evolutionary: while the volume of scholarly publications has greatly increased in the past decades and the formal and distribution models have diversified, the form and function of research articles and scholarly monographs have remained relatively stable [6].

Citation: Puschmann C, Bastos M (2015) How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms. PLoS ONE 10(2): e0115035. doi:10.1371/journal.pone.0115035
Academic Editor: Vincent Larivière, Université de Montréal, CANADA
Received: June 24, 2014; Accepted: November 18, 2014; Published: February 12, 2015
Copyright: © 2015 Puschmann, Bastos. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: All relevant data are within the paper and its Supporting Information files.
Funding: This work was supported by the National Science Foundation under grant number 1243622 and the German Research Foundation under grant number PU 439/2-1. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
Meanwhile, the range of avenues available for the dissemination of informal scholarly communication has increased exponentially. In addition to formal publication venues, scholars can now communicate their findings in (micro)blogs, wikis, social networking sites (SNS) and countless other social web platforms [7–10]. Such services carry both opportunities and risks for early-career researchers, and they are used for a wide variety of purposes and with a range of motives [11–13]. While researchers are able to disseminate their findings more quickly and reach out to broader audiences than was previously possible, they also risk that their work will not be acknowledged in more traditional and hierarchical professional structures. Informal genres of scholarly communication frequently lack peer review and rely on new measures of impact, rather than the established currency of acceptance within a field [14]. As a result, researchers have overall been very careful in their acceptance of digital formats that compete with established forms of expert knowledge dissemination, largely choosing instead to focus on established formats [15]. This is especially true in the humanities, where conservatism towards new formats is particularly strong.

Digital Humanities (DH) can be broadly characterized as the adoption of an array of computational methodologies for humanities research [16, 17]. During the early nineties, DH scholarship developed under the umbrella of several academic organizations dedicated to what was then commonly referred to as humanities computing [18]. These organizations brought together scholars from different fields interested in exploring computational methods for traditionally-defined humanities scholarship [19]. The prefix "digital" is increasingly used to delineate the new computational areas of humanities research (i.e. digital literature, digital archaeology, digital history, etc.). The introduction of computational methods aims among other things to supplement established humanities research routines and explore new methodological avenues, such as text analysis and encoding; archive creation and curation; mapping and GIS; and modeling of archaeological and historical data [20, 21].

Since the early 2000s the term Digital Humanities has also been used to refer to humanities research defined by a data-driven approach, in which summarization and visualization are important methodological cornerstones. Media and cultural studies, library and archival studies, digital pedagogy, and the recent emergence of MOOCs have also been referred to as Digital Humanities in a more general sense [22]. As a result, DH has evolved to incorporate a range of different definitions and is subject to considerable interpretative flexibility [23]. The central hypothesis of this study is that the variety of terms and topics associated with DH is locally configured, and that their makeup reflects different (and to a degree contradictory) conceptualizations of what constitutes DH.

DH and Social Media

Because of its interdisciplinary and international character, its affinity for digital media, and its recent emergence as a scholarly movement, DH has been comparatively strongly impacted by informal communication tools such as blogs and Twitter, with junior scholars invested in DH research using such tools widely to organize, network, and collaborate. Kirschenbaum notes the important role of social media for establishing and galvanizing DH as a movement: "Twitter, along with blogs and other online outlets, has inscribed the digital humanities as a network topology, that is to say, lines drawn by aggregates of affinities, formally and functionally manifest in who follows whom, who friends whom, who tweets whom, and who links to what." [17] Usage of Twitter and blogs has contributed to establishing DH as a brand, and it has helped to increase its visibility on a global scale [24]. While actively using social media does not make one a digital humanist, social media applications seem to be perceived as valuable instruments for intra-community communication in the DH community, rather than being used just out of
Kirschenbaum notes the important role of social media for establishing and galvanizing DH as a movement: “Twit- ter, along with blogs and other online outlets, has inscribed the digital humanities as a network topology, that is to say, lines drawn by aggregates of affinities, formally and functionally mani- fest in who follows whom, who friends whom, who tweets whom, and who links to what.” [17] Usage of Twitter and blogs has contributed to establishing DH as a brand, and it has helped to increase its visibility on a global scale [24]. While actively using social media does not make one a digital humanist, social media applications seem to be perceived as valuable instruments for intra-community communication in the DH community, rather than being used just out of How Digital Are the Digital Humanities? PLOS ONE | DOI:10.1371/journal.pone.0115035 February 12, 2015 2 / 15 curiosity or for self-promotion [11]. Crucially, there are scholars who take up blogging and Twitter because they are important channels of communication in the DH community. Such tools therefore increasingly constitute scholarly infrastructure to their users in the same sense that library services and communal mailing lists constitute infrastructure. While traditional scholarly organizations are struggling to integrate social media, DH scholars, espe- cially junior researchers, have considerable uptake of such tools, reflected for example in the strong use of Twitter at the annual Digital Humanities conference [24, 25]. DH can therefore be characterized as an emerging digital scholarly network—a group of scholars that has integrat- ed digital genres of scholarly communication into its communicative infrastructure from the onset. Inside such a network in which heterogeneous links connect different actors it should be possible to study the flow of ideas, trends, and discourses much more effectively through social media than purely by assessing formal publications in scholarly journals and monographs [26]. HASTAC The Humanities, Arts, Science, and Technology Alliance and Collaboratory (HASTAC) is an online community and social network that connects researchers, young scholars, and the gen- eral public interested in a wide range of subjects associated with DH and peer-to-peer learning. Founded in 2002 by Davidson and Goldberg [27], HASTAC emerged as a consortium of edu- cators, scientists, and technology designers funded by the National Science Foundation, the Digital Promise Initiative, and the MacArthur Foundation, with infrastructure provided by Duke University and the University of California Humanities Research Institute. HASTAC dif- fers from similar initiatives in that it is largely decentralized with content generated by a net- work of over ten thousand members including university faculty, students, and general public. The network platform is built on the Drupal content management system and requires an inclusive free-of-charge membership. Member participation varies widely, with many register- ing but passively interacting with the website by reading the content and a robust minority ex- pressing their thoughts and communicating their interests by writing or commenting on blog posts, joining discussion forums, or contributing information about current events. 
According to the initiative's website, "HASTAC members are motivated by the conviction that the digital era provides rich opportunities for informal and formal learning and for collaborative, networked research that extends across traditional disciplines, across the boundaries of the academy and the community, across the two cultures of humanism and technology, across the divide of thinking versus making, and across social strata and national borders." [28]. While the platform is interdisciplinary in nature, it is strongly focused on learning and DH-related topics.

Hypotheses

Hypotheses is a publication platform for academic blogs. Launched in 2004, it is funded and operated by the Centre for Open Electronic Publishing (Cléo), a unit that brings together two major French research institutions and two universities: the Centre national de la recherche scientifique (CNRS), the École des Hautes Études en Sciences Sociales (EHESS), the Aix-Marseille Université, and the Université d'Avignon. In addition to Hypotheses, Cléo provides other tools via the OpenEdition portal: Revues.org, a platform for journals in the humanities and social sciences, and Calenda, a calendaring tool.

According to the Hypotheses website, "[a]cademic blogs can take numerous forms: accounts of archaeological excavations, current collective research or fieldwork; thematic research; books or periodicals reviews; newsletter etc. Hypotheses offers academic blogs the enhanced visibility of its humanities and social sciences platform. The Hypotheses team provides support
Both are related to DH, both seek to integrate blogging into scholarly communication, and both are publicly funded. Furthermore, both plat- forms have been operational for a similar timespan and attract broadly comparable user communities. Research Design Our aim is to characterize differences in the discourse that takes place on HASTAC and Hy- potheses reflecting different cultural implementations of DH and different understandings of what constitutes DH. To this end, we formulated two research questions: How frequent are particular keywords associated with (digital) humanities on the two platforms (H1) and what are thematic differences in the distribution of topics in the two sites (H2)? We approached the first question by counting the co-occurrence of humanities-related terms and their digital equivalents (e.g. history—digital history) on blog posts. In a second step we applied topic modeling to the post content to identify substantial thematic differences between the commu- nities in both platforms and their respective approaches to blogging. Based on the self-charac- terizations of both platforms, we expected there to be both overlap and variation with regards to the adoption of DH-related labels and overall disciplinary focus. Data The data from the two platforms were collected from database dumps containing the SQL table structure and the blog post content. HASTAC data included content posted between August 14, 2006 and August 14, 2013, together with the profile data of 11,284 users. Most users shared brief biographical information and identified a set of topical interests, institutional affiliation, and links to personal websites. In addition to the posts themselves, the Hypotheses data includ- ed metadata such as author information, timestamp, text, internal and external links in each post, which was collected between the 1st of July 2006 and the 30rd of June 2012. How Digital Are the Digital Humanities? PLOS ONE | DOI:10.1371/journal.pone.0115035 February 12, 2015 4 / 15 The language of posts was detected automatically using the language identification system langid.py for Python, which supports a large number of languages and achieves a high level of accuracy without requiring prior in-domain classifier training [30]. The material initially in- cluded a large number of posts published in languages other than English (45,528 posts) pub- lished over different periods of time. For the purpose of this investigation, we only considered blog posts in English published between the 1st of July 2006 and the 30th of June 2012, thus ex- tracting 7,269 posts from HASTAC and 6,777 posts from Hypotheses. We performed a co- word analysis over these 14,046 posts [31] and subsequently extracted a random sample of 5,000 posts from each platform to perform topic modeling. Fig. 1 shows a frequency histogram of blog posts in the abovementioned period on a logarithmic scale, with HASTAC posts being comparatively more frequent from 2006 to 2010, and posts on Hypotheses being comparatively more frequent in the period thereafter. Activity on both platforms drops during the summer vacation months (July for HASTAC and August for Hypotheses) reflecting seasonal work patterns. Methods We approached our first question (H1) by means of a co-word analysis of keywords associated with humanities and Digital Humanities research [31]. 
We used one vector of twenty humani- ties areas (anthropology, archaeology, archive, art, culture, ethnography, history, humanities, learning, libraries, literacy, literature, media, pedagogy, preservation, publishing, rhetoric, scholarship, storytelling, knowledge) and another identical vector plus the suffix “digital” (digi- tal anthropology, digital archaeology, digital archive, digital art, digital culture, digital ethnog- raphy, digital history, digital humanities, digital learning, digital libraries, digital literacy, digital literature, digital media, digital pedagogy, digital preservation, digital publishing, digital rhetoric, digital scholarship, digital storytelling, digital knowledge). These keywords include terms that describe fields or general domains associated with the humanities on the basis of raw token frequencies identified in the two datasets. This approach comes with considerable limitations. Firstly, the semantics of the terms differ considerably, as some describe fields of scholarship (history—digital history), while others are more general and tend to be polysemous (knowledge, media). The same applies to their prefixed counterparts, with digital history likely Fig 1. English-language blog posts published on both sites between 2006 and 2012. doi:10.1371/journal.pone.0115035.g001 How Digital Are the Digital Humanities? PLOS ONE | DOI:10.1371/journal.pone.0115035 February 12, 2015 5 / 15 identifying a field, while digital media most likely describes certain kinds of technical media. Furthermore, issues of precision and recall arise, due to which not all discussion of the relevant phenomena is reliably captured and some of what is captured relates to other concepts. In spite of these limitations, we found co-word analysis to be useful, because it shows the entrenchment of the terms as convenient and fashionable labels on both platforms. We accept that such labels do not narrowly identify concepts, but believe that they are suitable to characterize the success of particular terms around which the DH community can rally. Using these terms we generated a series of term-document matrices for each of the net- works. We visualized the association between humanities and DH by performing a multinomi- al logistic regression on the terms. We relied on the textir package for R [32] to convert the term-to-term co-occurrence matrix to a matrix of the log-odds ratios of co-occurrence. The re- sulting matrices (HASTAC and Hypotheses) scales the word similarity as a function of word frequency, with terms of similar semantic content numerically represented as being similar to one another [33]. After converting the log-odds ratios to distance matrices using cosine simi- larity [34, 35], we relied on multidimensional scaling [36] to visualize humanities and DH terms in a latent semantic space [37] with a two-dimensional density surface [38]. The second question (H2) was addressed using Latent Dirichlet Allocation [39] implementa- tion for R [40]. R package topicmodels allows the probabilistic modeling of term frequency oc- currences in documents and estimation of similarities between documents and words using an additional layer of latent variables referred to as topics. The package provides the basic func- tions for fitting topic models based on data structures from the text mining package tm [41]. 
Topics were modeled using a mixed-membership approach in which documents are not as- sumed to belong to single topics, but to simultaneously belong to several topics, with varying distributions across documents. To equally represent both platforms, we drew a random sam- ple of 5,000 posts from each platform from the data previously described. Prior to mapping the documents to the term frequency vector, we tokenized the posts and processed the tokens by removing punctuation, numbers, stemming, and stop words, in order to sparsen the matrices. We also omitted very short documents (<200 characters) for the same purpose. Ethics Statement. The authors confirm that the study is in compliance with the Terms and Conditions of HASTAC and Hypotheses. Results Co-word analysis With respect to our first research question (H1) we found that unprefixed keywords occurred in a much higher ratio relative to their prefixed counterparts. Table 1 shows the number of oc- currences of humanities and DH terms on both platforms, with a high concentration of posts focusing on art, media, history, culture, and humanities, followed by learning, publishing, and libraries. The areas of research with fewer occurrences are archaeology, storytelling, ethnogra- phy, and preservation. HASTAC presented a much higher number of references to humanities (21,262) and DH (2,771) in comparison to Hypotheses (9,644 and 187, respectively). The ratio of posts with humanities to DH related terms is also higher on HASTAC at seven posts on hu- manities to each post on DH while on Hypotheses the ratio is of fifty-one posts on humanities to each post on DH. In fact, we found no mention to nine areas of DH in the Hypotheses sample. Although the distribution of humanities and DH terms is skewed towards HASTAC, the distribution per area of research on humanities is fairly similar. Fig. 2 shows a cluster dendo- gram of term co-occurrences based on Euclidean distance, with humanities areas appearing at the top of the hierarchical structure and DH terms appearing near the bottom. Art, culture, How Digital Are the Digital Humanities? PLOS ONE | DOI:10.1371/journal.pone.0115035 February 12, 2015 6 / 15 and media are likely to also refer to general terms rather than only humanities disciplines, therefore presenting a higher value of intergroup dissimilarity and appearing higher up in the hierarchy. More narrowly defined areas such as learning and digital media are followed on HASTAC, while the hierarchical clustering of topics on Hypotheses is topped by history and publishing. Fig. 2 shows internal differences and dissimilarities between the two platforms in their usage of the labels listed in Table 1. DH subfields are much more distinct from other terms in HASTAC that they are on Hy- potheses, where many of the DH labels are either uncommon or not used at all. Unsurprisingly, we found that most blog posts that made reference to DH terms also included references to the unprefixed terms, but not the other way around. From the 5,711 posts on HASTAC that in- cluded references to humanities-related terms (21,262 occurrences), 89% of them also included references to the corresponding label in DH. However, from the 1,996 posts on HASTAC that included references to Digital Humanities terms (2,771 occurrences), only 11% of them also in- cluded references to the corresponding term in the humanities. This asymmetry is actually more pronounced in the Hypotheses network. 
From the 4,001 posts on Hypotheses that in- cluded references to humanities-related terms (9,644 occurrences), 98% also included refer- ences to the corresponding term in DH. However, from the 140 posts on Hypotheses that included references to DH-related terms (187 occurrences), only 2% also included references to the corresponding humanities area. The dependence of Digital Humanities on established humanities labels is consistent, but it varies considerably within each of the areas investigated. The average percentage of posts per area that include reference to both humanities and DH is still quite skewed, as 80% of posts on HASTAC (mean = .79, median = .84) and Hypotheses (mean = .78, median = .81) dedicated to Digital Humanities areas also including references to the main humanities area. The reverse Table 1. Number of occurrences of humanities and DH terms. HASTAC HU HASTAC DH Hypo HU Hypo DH anthropology 213 4 310 NA archaeology 80 2 147 NA archive 784 99 365 11 art 3363 128 1626 8 culture 1540 146 1168 12 ethnography 144 8 117 NA history 1772 38 1945 15 humanities 1533 674 309 69 knowledge 1515 15 613 NA learning 2026 113 187 NA libraries 924 116 293 34 literacy 505 92 45 NA literature 727 10 492 NA media 3118 1021 761 17 pedagogy 504 27 48 NA preservation 139 31 90 7 publishing 1150 35 843 11 rhetoric 309 11 76 NA scholarship 735 132 190 1 storytelling 181 69 19 2 doi:10.1371/journal.pone.0115035.t001 How Digital Are the Digital Humanities? PLOS ONE | DOI:10.1371/journal.pone.0115035 February 12, 2015 7 / 15 dependency is also observed in the aggregated data per area, as less than 10% of posts on HAS- TAC (mean = .09, median = .05) and Hypotheses (mean = .05, median = .02) dedicated to hu- manities also included references to the related DH area. However, the dependency is noticeably lower in some fields of humanities. Preservation and archival studies presented a much lower ratio of posts dedicated to Digital Humanities that also referred to the associated humanities area (48% and 74% on HASTAC, and 57% and 91% on Hypotheses). Storytelling, literacy, and pedagogy are also particularly independent in the HASTAC network, with 52%, 63%, and 67% of posts making reference to digital terminology without mentioning the related Fig 2. Hierarchical cluster dendrogram of term co-occurrences in both platforms. doi:10.1371/journal.pone.0115035.g002 How Digital Are the Digital Humanities? PLOS ONE | DOI:10.1371/journal.pone.0115035 February 12, 2015 8 / 15 humanities field. On Hypotheses, art is the term most detached from the main humanities area, with 63% of posts dedicated to digital art not making reference to the unprefixed field. Some areas show a strong intersection of humanities and DH terms. A considerable propor- tion of articles that refer to humanities, storytelling, and libraries also made reference to digital humanities, digital storytelling, and digital libraries (37%, 20%, and 11% on HASTAC, and 21%, 11%, and 9% on Hypotheses). Media, scholarship, literacy, and preservation also pre- sented higher-than-average levels of cross-pollination on HASTAC, with 30%, 14%, 11%, and 11% of the articles focusing on these terms also making reference to their niche Digital Human- ities label. Most of these terms also presented a considerable level of intersection of DH with general terms. We further explored the interplay between humanities and DH by performing a multinomi- al logistic regression on the terms. 
Fig. 3 shows a contour-sociogram of the terms with substantial cross-pollination across different topics of humanities and Digital Humanities research. HASTAC posts with humanities and DH terms are clearly clustered around four main groups. The first includes terms associated with the humanities at large, culture, and the arts; the second is dedicated to education and learning; the third to archives and libraries; and the last clusters terms associated with anthropology and history. Hypotheses posts with humanities and DH terms, on the other hand, are mostly concentrated in a single cluster, as many topics lack sufficient entry points. Nonetheless, humanities content published on Hypotheses presents clusters around humanities and media; around archives, history, and the arts; and one cluster grouping library-related materials.

The vast majority of articles focusing on digital media, digital libraries, digital art, digital humanities, digital culture, and digital publishing also included references to the main humanities area. This is particularly the case on HASTAC (93%, 91%, 89%, 85%, 84%, and 80%, respectively), but also on Hypotheses (71%, 79%, 63%, 96%, 83%, and 91%, respectively). In short, the results predictably show a considerable one-way dependency of DH on the unprefixed keywords, and a relative independence of the unprefixed keywords from their DH counterparts. However, there are a few DH areas that presented substantial independence from the related humanities area, namely preservation, archive, storytelling, literacy, and pedagogy. We interpret this emancipation as an indicator of the establishment of these terms as convenient labels which, while not necessarily identifying clear-cut concepts, provide attractive brands for the DH community to rally around.

Fig 3. Density curves of log-odds co-occurrence ratios between humanities-related terms. Larger labels represent thematic areas manually identified. doi:10.1371/journal.pone.0115035.g003

Topic modeling

We proceeded by exploring the topical differences between the two platforms to test our second research question (H2). We modeled twenty topics for the combined corpus of both platforms (5,000 posts each). Table 2 provides an overview of twelve selected topics and their ten most distinct terms by rank; some of the topics relate to particular domains (Health, History, Law, Art, Games), while others relate to more general themes (Chatter, Learning). Topics were labeled through a qualitative interpretation of the most salient topic keywords and by reading a sample of the associated blog posts, meaning that they retain a certain subjective bias.
Table 2. Common topics on HASTAC and Hypotheses. (Topic numbers follow the full twenty-topic model referenced in the text.)

    Topic 1: Health    Topic 2: Cold War    Topic 4: Law    Topic 5: DH
    health             war                  law             digital
    medicine           university           legal           humanities
    medical            korean               series          university
    history            history              turkish         hastac
    food               korea                history         new
    university         cold                 also            media
    social             culture              said            will
    urban              women                one             scholars
    care               art                  book            technology
    research           visual               new             research

    Topic 6: Social Media    Topic 7: Data    Topic 8: Art    Topic 9: Urban Studies
    social                   can              university      urban
    can                      data             art             social
    new                      use              museum          political
    media                    will             history         new
    one                      digital          heritage        international
    cultural                 information      museums         studies
    culture                  project          cultural        european
    time                     also             music           global
    digital                  site             new             economic
    space                    work             sound           management

    Topic 10: Gaming    Topic 12: Chatter    Topic 15: Learning    Topic 16: Energy
    game                one                  students              energy
    games               people               learning              climate
    video               like                 will                  change
    play                can                  can                   policy
    virtual             just                 class                 countries
    world               time                 new                   will
    one                 even                 education             global
    can                 think                digital               gas
    gaming              now                  one                   carbon
    worlds              many                 work                  paper

doi:10.1371/journal.pone.0115035.t002

Most domain areas identified are strongly associated with content published on Hypotheses through individual blogs with a clear and consistent topical focus (e.g. Health, History, Law, Energy), while HASTAC has a stronger association with metatopics such as Learning, Data, and Gaming. Some topics of general interest (e.g. Social Media and Data) are shared between the platforms. Conference Calls and Job Advertisements form two distinct yet evenly distributed topics, owing to their stylistic uniformity.

In addition to pointing out thematic differences, topics also reflect differences in style between the two sites. Topic #12 (Chatter) is lexically distinct from other topics in that it uses much more general nouns (time, people) and verbs (think, know). It reflects a set of essayistic posts, particularly on HASTAC, which discuss controversial issues and tend to be relatively short. Spam is also a distinct topic, but one that is shared between both sites.

We also found that while some topics overlap somewhat, many are highly characteristic of one of the two platforms. Topics #1 (Health), #2 (Cold War), #4 (Law), #8 (Art), #9 (Urban Studies), and #16 (Energy) are relatively clearly associated with Hypotheses, while topics #5 (Digital Humanities), #10 (Gaming), #12 (Chatter), and #15 (Learning) are prevalent on HASTAC. Topics #6 (Social Media) and #7 (Data) show a more even distribution between the two sites. Similar to our findings in the co-word analysis, #5 (Digital Humanities) is more prevalent on HASTAC than on Hypotheses. The distribution of topic scores suggests that a number of linguistically distinct thematic areas exist on Hypotheses, and that these areas follow disciplinary patterns. By contrast, HASTAC posts are less clearly associated with a single field of inquiry and are most closely associated with metatopics such as learning and general conversation. HASTAC posts are also linked to the discussion of Digital Humanities and the usage of labels related to DH. The differences between the two platforms may point to diverging goals associated with scholarly blogging: addressing broad interdisciplinary issues before a wider public vs. conducting focused scholarly discussion within fields.
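For readers who want to reproduce the general shape of this analysis, the following sketch fits a mixed-membership topic model and prints the highest-ranked terms per topic, following the preprocessing described in the methods. It is a hedged Python illustration only: the toy corpus is invented, the preprocessing is simplified (the study additionally stemmed tokens), and the authors' own implementation is not reproduced here.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    def fit_topics(posts, n_topics=20, min_chars=200, top_n=10):
        # Drop very short documents, as in the study (<200 characters).
        docs = [p for p in posts if len(p) >= min_chars]
        # Lowercase, drop punctuation/numbers and English stop words.
        vec = CountVectorizer(stop_words="english",
                              token_pattern=r"(?u)\b[a-zA-Z]{3,}\b")
        dtm = vec.fit_transform(docs)
        # Mixed membership: every post receives a distribution over topics.
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        lda.fit(dtm)
        terms = vec.get_feature_names_out()
        return [[terms[i] for i in comp.argsort()[::-1][:top_n]]
                for comp in lda.components_]

    # Toy corpus; the real input would be 5,000 posts per platform.
    toy = [
        "digital humanities projects bring computing methods to history research",
        "students discuss learning games and virtual worlds in the classroom",
        "archives libraries and museums publish cultural heritage collections",
    ]
    for k, top in enumerate(fit_topics(toy, n_topics=3, min_chars=0), start=1):
        print(f"Topic {k}:", " ".join(top))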
The difference in the number of unique authors between the two platforms (923 authors on HASTAC vs. 403 authors on Hypotheses) may influence the result of the topic modeling, with a few very specific topics present on Hypotheses not represented on HASTAC (e.g. Cold War). Nonetheless, the results confirm the observations drawn from the co-word analysis, with topics on Hypotheses tending to be more disciplinarily aligned and connected exclusively to a single area of research, while posts on HASTAC are more likely to pick up interdisciplinary and general themes. Fig. 4 shows the topic scores in the 12 selected topics, with each dot representing a post and its color indicating the platform.

Discussion

The results reported in this study can be summarized in two parts. Firstly, we found a substantial one-way dependency of DH terms on their unprefixed counterparts, as most blog posts dedicated to DH also included references to the corresponding humanities term (89% on HASTAC and 98% on Hypotheses). DH-related labels are considerably more frequent on HASTAC, pointing to an unequal adoption of Digital Humanities-related terms in different local contexts. Secondly, we found a tendency on Hypotheses towards focused thematic areas representing disciplinary interests, contrasted with a tendency to discuss more general, cross-disciplinary themes on HASTAC.

In terms of institutional branches of humanities research, history is the area with the largest number of posts across the networks for the sample of topics considered in this study. Areas that are not traditionally associated with humanities research (or institutions that support the field), i.e. library and media, also account for a considerable portion of the posts. We also found considerable topical differences between the two platforms. While traditional areas of the humanities and social sciences (History, Art, Law) are clearly represented on Hypotheses, HASTAC is topically more cross-disciplinary and less focused on single disciplines. Some of these topics show considerable overlap between the networks (i.e. Social Media and Data), highlighting the fact that there are areas in which users of HASTAC and Hypotheses have similar interests, while others are considerably more predominant in one of the networks. Although both networks are at the forefront of the Digital Humanities research agenda, they present considerable differences in how explicitly they use new disciplinary labels (HASTAC) and address well-established disciplinary themes without explicitly associating them with DH (Hypotheses).

The differences we observed highlight that two platforms that attract broadly similar user communities may still differ considerably with regard to topics. We interpret the differences in the adoption of Digital Humanities terminologies and topics across the networks as mirroring different developments in DH. Whereas digital learning, digital literacy, and especially digital scholarship are prominent labels on HASTAC, Hypotheses is mostly focused on digital libraries, digital history, and digital archives. These differences are of a qualitative and quantitative nature, reflecting not just the personal preferences of bloggers and users but possibly also broader conceptual differences.
While blog posts on HASTAC tend to raise issues suitable for (controversial) discussion, contributions on Hypotheses more closely mirror traditional expository humanities genres (e.g. book chapters or essays). Moreover, while HASTAC is a social network in which users can create profiles and interact with other users by posting and commenting on content, Hypotheses is a publishing platform with less emphasis on community building than HASTAC and a closer alignment with traditional genres of publishing.

Fig 4. Distribution of posts per topic, with posts in red from HASTAC and posts in blue from Hypotheses. doi:10.1371/journal.pone.0115035.g004

The content of each network also presents considerable variation in terms of formats and style. The prominence of Topic #12 (Chatter) on HASTAC indicates that HASTAC's blog entries are conceptually more like casual conversation than academic writing. As blogs serve different purposes for different users, the data necessarily includes posts of different genres, comprising short essays, conference reviews, book reports, group discussions, and general academic advertising. While HASTAC and Hypotheses are interdisciplinary in character, they have a strong slant towards the humanities, particularly towards learning and digital media on HASTAC and specifically towards history on Hypotheses. Common to both networks is the small proportion of users producing the large majority of the content, which leads to a typical long-tail distribution of content within the platforms.

Ultimately, the results reported in this study show that the variety of terms and topics associated with DH is locally configured and reflects different conceptualizations of what constitutes DH. We expect this study to be informative for future research grappling with the rapid establishment of DH in humanities departments. At any rate, it will be interesting to follow the ongoing maturation of both platforms and their respective approaches to scholarly blogging, as well as the different conceptualizations of Digital Humanities scholarship in North American and European contexts.

Supporting Information

S1 Materials. HASTAC dataset with 7,269 entries including timestamp and blog posts. (ZIP)

S2 Materials. Hypotheses dataset with 6,777 entries including timestamp and blog posts. (ZIP)

Acknowledgments

We are thankful to David Sparks for helping with HASTAC data analysis and visualization, the HASTAC team for providing access to the HASTAC data, Ruby Sinreich for providing important feedback to this research, and Marin Dacos for providing access to the Hypotheses data.

Author Contributions

Conceived and designed the experiments: CP MB. Performed the experiments: CP MB. Analyzed the data: CP MB. Contributed reagents/materials/analysis tools: CP MB. Wrote the paper: CP MB.
work_6fuex7f7o5fknbvjrtuvimvsvy ----
ABI Technik 2017; 37(1): 2–11 Fachbeitrag
Klaus Thoden, Juliane Stiller, Natasa Bulatovic, Hanna-Lena Meiners, Nadia Boukhelifa
User-Centered Design Practices in Digital Humanities – Experiences from DARIAH and CENDARI
DOI 10.1515/abitech-2017-0002

Abstract: User experience and usability (UX) form a key part of research and best practice for product and software development. In this paper, the topic is addressed from the perspective of the Digital Humanities (DH), and approaches undertaken in two DH infrastructure projects, DARIAH and CENDARI, are presented. Both projects addressed aspects of UX, focusing on the usage of a single software tool as well as on an integrated research workflow using several tools and devices. The article lists the main factors, gleaned from research undertaken in the projects, that influence usability practices in the DH, and provides possible recommendations on how to approach them.

Keywords: digital humanities, usability, participatory design

User-Oriented Development Methods in the Digital Humanities – Experiences from the Infrastructure Projects DARIAH and CENDARI

Summary: User experience and usability are important components of research and practice in product and software development. In this article, the authors approach the topic from the perspective of the Digital Humanities. The experiences from two infrastructure projects, DARIAH and CENDARI, are described in detail and recommendations for action are derived.
In the projects, usability studies were conducted and user-centered methods were applied, addressing on the one hand the use of stand-alone tools and on the other an integrated research workflow involving several tools. Based on the experience gathered and the research conducted in the projects, factors are listed that influence development practices with regard to usability in the Digital Humanities.

Keywords: digital humanities, usability, user studies

1 Introduction

In the Digital Humanities (DH), one focus of research centers on the development and advancement of methods and respective tools that can support them. Burdick et al.1 understand data curation and data analysis, as well as editing and modeling, as the tasks most relevant to the DH. Using digital tools and services for these tasks is at the core of DH practice. There is an interest in convincing traditional scholars to adopt digital methods and tools and in demonstrating the potential of computing for the humanities. Creating a positive user experience can increase the adoption and usage of tools2 and, therefore, over the past decade, requests have increased for development in line with results from user-centered research methods.

In light of the numerous sources that claim a lack of usability studies for the DH3 or that developed tools are unintuitive or difficult to use and thus fail user expectations4, building digital tools with better usability seems difficult to put into practice. Schreibman and Hanlon5 state that only 31 percent of tool developers actually conduct usability tests. But what are the reasons for this, given the abundant literature on usability and on how to develop for a good user experience? The theoretical path to a good, even great, user experience is well laid out. What are the factors that contribute to such a gap, or is the current status of usability in DH tools better than we think?

1 Burdick, Anne; Johanna Drucker, Peter Lunenfeld, et al.: Digital_Humanities. Cambridge, Mass. 2012. 17 f.
2 Gibbs, Fred; Trevor Owens: "Building Better Digital Humanities Tools: Toward Broader Audiences and User-Centered Designs." In: Digital Humanities Quarterly 006,2 (2012). http://www.digitalhumanities.org/dhq/vol/6/2/000136/000136.html.
3 Jänicke, Stefan, Greta Franzini, M. Cheema, et al.: "On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges." In: Proc. of EuroVis – STARs. (2015): 83–103.
4 Gibbs and Owens 2012.
5 Schreibman, Susan, Ann M. Hanlon: "Determining Value for Digital Humanities Tools: Report on a Survey of Tool Developers." In: Digital Humanities Quarterly 004,2 (2010). http://www.digitalhumanities.org/dhq/vol/4/2/000083/000083.html.

This paper intends to explore the practice of usability in the DH and the perceived lack thereof. The goal is to find consensus on what good user experience and usability mean in the Digital Humanities, what practices this might entail and what factors influence these practices. The authors report on experiences from two infrastructure projects in the DH, CENDARI and DARIAH, both of which have invested considerable resources in research and best practice for usability.
Based on these two cases, they will derive the requirements and needs for a fundamental usability practice in the domain of DH.

In the next section, definitions will be given for user experience and usability from various domains and distilled into a working definition for this paper. Section 3 introduces approaches to user experience in the DH. In section 4, there will be a reflection on the experiences, opportunities and problems from DARIAH and CENDARI. The paper concludes with answers to the questions about the presence of a kind of "reality gap," which manifests itself in the discrepancy between well-established usability theory and the often poor user experience of tools in practice.

2 Usability and User Experience – Methods to Increase User Satisfaction

User experience, usability and interface design play a tremendous role for DH tools, services and infrastructures because, as Kirschenbaum6 puts it, "the interface becomes the first and in most respects the exclusive experience of the project for its end users"7. Emphasis on usability and user experience in the development process of tools and infrastructure components should therefore be self-explanatory.

The terms user experience and usability are closely related and often used synonymously. In the ISO standard 9241-11, usability is defined as the "[e]xtent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use." Usability is often used as a generic term under which several methods, requirements and definitions are subsumed. The term "user experience" (UX) is considered to be even broader, with the goal to optimize human performance and user satisfaction8. In most software development processes, usability considerations are an integral part of a successful product delivery. The methods implemented target the improvement of interface design but also successful human-computer interactions. In his book Usable Usability9, Reiss distinguishes two components of usability. The first component is to see whether the intended functionality is working as it should; he calls this "ease of use." The second component, "elegance and clarity", deals with the expectations users have with regard to certain functionalities. Both components incorporate elements of interface design and human-computer interaction, focusing on delivering a product or service that has user-centered design at its core. This is an approach the authors would like to adopt for the remainder of this paper: a usable tool, workflow or service that supports the scholar in obtaining results, in line with the method used, while being transparent about the provenance of these results.

To deliver usable software products and tools, several processes and methods are defined and ideally integrated into the development process. For projects where the requirements of several stakeholders need to be reconciled, participatory design studies have proved successful, an approach that is particularly interesting for the Digital Humanities. Participatory design has been adopted by many disciplines where stakeholders cooperate to ensure that the final product meets everyone's requirements. For interactive software design, the aim is to benefit from different kinds of expertise: designers know the technology, and users know their data, their workflow and its context. In a similar vein, Muller10 describes participatory design as belonging to the in-between domain of end-users and technology developers, which is characterized by reciprocal learning and the creation of new ideas through negotiation, co-creation and polyvocal dialogues across and through differences. Participatory design was also the chosen method for user-centered development within CENDARI. To conclude, we can identify two different strands of user-centered design practice: first, tools that are developed along usability guidelines, and second, the provision of factors and criteria to evaluate the results of usability practices. In the domain of DH, there seems to be no coherent understanding of these two strands.

6 Kirschenbaum, Matthew G.: "So the Colors Cover the Wires: Interface, Aesthetics, and Usability." In: A Companion to Digital Humanities. Edited by Susan Schreibman, Ray Siemens, John Unsworth. 523–542. Blackwell Publishing Ltd., 2004. http://onlinelibrary.wiley.com/doi/10.1002/9780470999875.ch34/summary.
7 Kirschenbaum 2004, 2.
8 Bevan, Nigel: "What Is the Difference between the Purpose of Usability and User Experience Evaluation Methods?" UXEM'09 Workshop, INTERACT 2009, Uppsala, Sweden 2009. http://nigelbevan.com/cart.htm.
9 Reiss, Eric L.: Usable Usability: Simple Steps for Making Stuff Better. John Wiley & Sons, 2012. http://www.books24x7.com/marc.asp?bookid=49784.
10 Muller, Michael J.: "Participatory Design: The Third Space in HCI." In: The Human-Computer Interaction Handbook. Edited by Julie J. Jacko, Andrew Sears. Hillsdale, NJ, USA: Lawrence Erlbaum Associates, 2003. 1051–1068. http://dl.acm.org/citation.cfm?id=772072.772138.

3 Approaching User Experience in Digital Humanities

Usability and positive user experiences are important in increasing the acceptance of digital solutions for the humanities. This view is supported by several surveys that were conducted in the project DARIAH-DE11 and was also a major outcome of usability studies in TextGrid12. Due to the diversity of the DH and of the tools developed to serve a wide range of users with varying degrees of technical knowledge and experience, achieving usability and a good user experience is particularly challenging. Serving tech-savvy users and at the same time convincing other scholars to adopt digital tools will only be possible if the initial hurdles for using these tools are low. In addition to the diversity of the user groups in the DH, the research areas and objects are also very heterogeneous, which, with regard to development, complicates the definition of standards or the following of standard procedures. The diverse research areas and objects, which may be very new and often unexplored, constitute a further barrier to a reliance on experience and accepted methods and practices.

11 Gnadt, Timo, Juliane Stiller, Klaus Thoden, et al.: Finale Version Erfolgskriterien. DARIAH-DE, R1.3.3, Göttingen 2016. https://wiki.de.dariah.eu/download/attachments/14651583/R133_Erfolgskriterien_Konsortium.pdf; Stiller, Juliane, Klaus Thoden, Oona Leganovic, et al.: Nutzungsverhalten in den Digital Humanities. DARIAH-DE R1.2.1/M 7.6. Göttingen 2016. https://wiki.de.dariah.eu/download/attachments/14651583/Report1.2.1-final3.pdf; Bulatovic, Natasa, Timo Gnadt, Matteo Romanello, et al.: "Usability in Digital Humanities – Evaluating User Interfaces, Infrastructural Components and the Use of Mobile Devices During Research Process." In: Research and Advanced Technology for Digital Libraries. Edited by Norbert Fuhr, László Kovács, Thomas Risse, et al. 335–346. Cham 2016a. http://link.springer.com/10.1007/978-3-319-43997-6_26.
12 Kaden, Ben, Simone Rieger: "Usability in Forschungsinfrastrukturen für die Geisteswissenschaften: Erfahrungen und Einsichten aus TextGrid III." In: TextGrid: Von der Community – für die Community: eine virtuelle Forschungsumgebung für die Geisteswissenschaften. Edited by Heike Neuroth, Andrea Rapp, Sibylle Söring. 63–75. Glückstadt, 2015.
Edited by Norbert Fuhr, László Kovács, Thomas Risse, et al. 335–346. Cham 2016a. http://link. springer.com/10.1007/978-3-319-43997-6_26. 12 Kaden, Ben, Simone Rieger: “Usability in Forschungsinfrastruk- turen für die Geisteswissenschaften: Erfahrungen und Einsichten aus TextGrid III.” In: TextGrid: Von der Community – für die Communi- ty: eine virtuelle Forschungsumgebung für die Geisteswissenschaften. Edited by Heike Neuroth, Andrea Rapp, Sibylle Söring. 63–75.Glück- stadt, 2015. or the following of standard procedures. The diverse re- search areas and objects, which may be very new and are often unexplored, constitute a further barrier to a reliance on experience and accepted methods and practices. This diversity could lead to missing feedback for de- veloped prototypes and a lack of user requirement anal- ysis, which could be the root cause for unused tools. Prototypes and final products can be assessed, however, using the heuristics of Nielsen13 or Shneiderman et al.14. Usability tests have been undertaken for the DH where these heuristics were consulted for walkthroughs and for evaluation by experts15. One finding was that usabil- ity problems do not always stem from the particular task that a tool should solve but are often very generic prob- lems. This is in line with the findings of Burghardt16 who distinguishes generic and very domain-specific usability problems for his research objects: linguistic annotation tools. Within DARIAH-DE, several tools and services were reviewed and tested, revealing similar problems related to usability and user experience. Although these shortcom- ings may be quite general, they can have a huge impact on the satisfaction of users and what they experience when interacting with a tool. The following problems occurred across products and services: a) ambiguous and incon- sistent vocabulary, b) disregard of graphical conventions, c) intransparency of the system status, d) missing docu- mentation, e) missing strategies to avoid mistakes, f) dis- regard of convention for workflows, e.g. search17. Having strategies in place to avoid these common mistakes would already make DH-tools and services much more usable. An approach often taken for usability is the user study for specific tools, services or infrastructure components. Here, user experience and usability aspects may play a role in investigating user satisfaction with developed features and components. One example is MONK, a web- based text-mining software. In an extensive study, web analytics data and user interviews were analyzed to gain knowledge about the usage of the application18. There was an overall satisfaction with the tool offered but it was ob- 13 Nielsen, Jakob: 10 Heuristics for User Interface Design. 1995. http://www.nngroup.com/articles/ten-usability-heuristics/ 14 Shneiderman, Ben, Catherine Plaisant, Maxine Cohen, et al: De- signing the User Interface: Strategies for Effective Human-Computer Interaction. Harlow 2014. 15 Stiller et al. 2016. 16 Burghardt, Manuel: “Annotationsergonomie: Design-Empfehlun- gen für Linguistische Annotationswerkzeuge.” In: Information – Wis- senschaft & Praxis 63 (5) (2012). doi:10.1515/iwp-2012-0067. 17 Bulatovic et al. 2016(a). 18 Green, Harriett E.: “Under the Workbench: An Analysis of the Use and Preservation of MONK Text Mining Research Software.” In: Liter- ary and Linguistic Computing 29 (1) (2014): 23–40. Bereitgestellt von | Humboldt-Universität zu Berlin Angemeldet Heruntergeladen am | 23.04.19 12:25 K. 
Thoden et al., User-Centered Design Practices in Digital Humanities Fachbeitrag 5 served that the functionalities were geared more towards easy access and approachability than towards offering a flexible tool for the expert user19. Here, one can see that researchers are more willing to go to considerable lengths to learn a tool when it provides expert functionalities and enables flexible adaptation to their specific needs. Tools such as MONK have been praised for their ability to teach and their provision of gateway entry into learning more about text mining and its capabilities. Other studies have been surveyed and scholars interviewed to gain insights into other barriers and hurdles that hinder the use of such tools. Gibbs and Owens20 identified the lack of integration of users into the design process as well as missing doc- umentation as factors for the low acceptance of tools. In particular, technically challenging tools for visualization and text mining should have usable interfaces and con- cise documentation with examples of use cases21. Techni- cal documentation as well as procedural documentation is essential to ensure the reuse of data and the results of projects. This would help interested parties to understand the scope and goal of online projects22. Research has now started to move away from case studies targeted at the requirements of specific DH-tools and instead search for a more generalized approach to usability and user experiences in the DH. For example, Burghardt23 has developed usability patterns for linguistic annotation tools, arguing that the specificity of the tools requires specific solutions for design and interaction pat- terns. To the authors’ knowledge, Burghardt’s approach to usability engineering in the DH is unique. Participatory design, as one approach to unify the perspectives of several stakeholders, is applied in several infrastructure and tool development projects. Warwick24 provides explanations for the neglect of participatory de- sign in humanities projects: 19 Green, 2014. 20 Gibbs and Owens, 2012. 21 Gibbs and Owens, 2012. 22 Warwick, Claire, Melissa Terras, Isabel Galina, et al.: Evaluating Digital Humanities Resources: The LAIRAH Project Checklist and the Internet Shakespeare Editions Project. London, 2007. http://elpub. scix.net/data/works/att/144_elpub2007.content.pdf. 23 Burghardt, Manuel: Engineering Annotation Usability  – Toward Usability Patterns for Linguistic Annotation Tools. Universität Regens- burg 2014. http://epub.uni-regensburg.de/30768/. 24 Warwick, Claire: “Studying Users in Digital Humanities.” In: Digital Humanities in Practice. Edited by Claire Warwick, Melissa Ter- ras, Juliane Nyhan. 1–21. London, 2012. http://www.facetpublishing. co.uk/title.php?id=7661. It was often assumed that the resources created in digital human- ities would be used by humanities scholars, who were not techni- cally gifted or, perhaps, even luddites. Thus, there was little point asking them what they needed, because they would not know, or their opinion about how a resource functioned, because they would not care. It was also assumed that technical experts were the people who knew what digital resources should look like, what they should do and how they should work. If developers decided that a tool or resource was a good one, then their opinion was the one that counted, since they understood the details of program- ming, databases, XML and website building. 
The plan, then, was to provide good resources for users, tell them what to do and wait for them to adopt digital humanities methods. (p. 1)

Many of these assumptions have been challenged, and a number of recent projects have shown that involving DH users in the design process is beneficial in learning about users and their requirements25. For example, Heuwing and Womser-Hacker26 describe user-centered methods applied in the project "Children and their world"27 to aggregate the requirements for a catalog that can guide the system design. The authors underline the necessity of communication and understanding in DH projects, which often consist of teams from different community practices. Here, one problem is that tools are developed by computer linguists who may lack knowledge of the domain of the respective scholar using the tool. Bridging the gap between the scholar, who can often anticipate the functionalities of a tool but might not know how to build it, and the scientist, who develops the tool but may lack insights into the methods applied or the workflow to be mapped in the digital environment, might be the key to resolving this conflict. In a more recent article, the authors again underline the benefits of user-centered methods in getting different stakeholders closer together in the development process and in guiding successful communication practice among different domain experts28.

25 Mattern, Eleanor, Wei Jeng, Daqing He, et al.: "Using Participatory Design and Visual Narrative Inquiry to Investigate Researchers' Data Challenges and Recommendations for Library Research Data Services." In: Program: Electronic Library and Information Systems 49 (4) (2015): 408–423; Wessels, Bridgette, Keira Borrill, Louise Sorensen, et al.: Understanding Design for the Digital Humanities. Studies in the Digital Humanities. Sheffield, 2015. HRI Online Publications. http://www.hrionline.ac.uk/openbook/chapter/understanding-design-for-the-digital-humanities; Visconti, Amanda: Infinite Ulysses. 2016. http://www.infiniteulysses.com/; Heuwing, Ben, Christa Womser-Hacker: "Zwischen Beobachtung und Partizipation – Nutzerzentrierte Methoden für eine Bedarfsanalyse in der Digitalen Geschichtswissenschaft." In: Information – Wissenschaft & Praxis 66 (5–6) (2015): 335–344. doi:10.1515/iwp-2015-0058.
26 Heuwing and Womser-Hacker 2015.
27 http://welt-der-kinder.gei.de/.

4 Experiences from Two Infrastructure Projects: DARIAH and CENDARI

This section reflects on the experiences made and problems faced in the two infrastructure projects CENDARI and DARIAH, and on the opportunities they both offered. The work in DARIAH focused mainly on the evaluation of existing tools and services, as well as on their integration into a digital workflow and the iterative monitoring of development processes with regard to usability. CENDARI established and conducted user-centered research with the goal of reflecting user requirements at an early stage in the design process.

4.1 DARIAH

DARIAH29 is one of the landmark projects within the ESFRI Framework30 of the European Union and one of the research infrastructures for the arts and humanities. According to the EU, the term "research infrastructure" refers to "facilities, resources and related services used by the scientific community to conduct top-level research in their respective fields, ranging from social sciences to astronomy, genomics to nanotechnologies."31 The German partner DARIAH-DE32 is financed by the Federal Ministry of Education and Research (BMBF) and is now in its third funding period. This period will end in early 2019 with the goal to provide a stable and fully developed infrastructure. One of the work packages in DARIAH-DE deals with the usability of digital tools and infrastructure components.

28 Heuwing, Ben, Thomas Mandl, Christa Womser-Hacker: Methods for User-Centered Design and Evaluation of Text Analysis Tools in a Digital History Project. In: Proceedings of ASIS&T. 2016. https://www.asist.org/files/meetings/am16/proceedings/submissions/papers/53paper.pdf.
29 Digital Research Infrastructure for the Arts and Humanities, http://www.dariah.eu/.
30 European Strategy Forum on Research Infrastructures, https://ec.europa.eu/research/infrastructures/index_en.cfm?pg=esfri.
31 https://ec.europa.eu/research/infrastructures/index_en.cfm?pg=what.
32 https://de.dariah.eu/.
http://www.dhd2016.de/abstracts/vorträge-058.html 34 Nielsen 1995. 35 Lewis, Clayton, John Rieman: Task-Centered User Interface De- sign. A Practical Introduction. 1994. http://grouplab.cpsc.ucalgary. ca/saul/hci_topics/tcsd-book/contents.html. 36 Nielsen 1995. 37 Romanello, Matteo, Juliane Stiller, Klaus Thoden: Usability Cri- teria for External Requests of Collaboration. DARIAH-DE R1.2.2/R 7.5. Göttingen, 2016. https://wiki.de.dariah.eu/download/attach ments/14651583/R1.2.2-7.5_final.pdf. 38 https://imeji.org/. 39 See “Orte jüdischer Geschichte” (Places of Jewish history), http:// app-juedische-orte.de.dariah.eu/. Bereitgestellt von | Humboldt-Universität zu Berlin Angemeldet Heruntergeladen am | 23.04.19 12:25 K. Thoden et al., User-Centered Design Practices in Digital Humanities Fachbeitrag 7 a subsequent excursion in the field, using another special- ized app40 installed on a digital camera, a smart phone and a tablet, researchers took images of those tombstones. These images were automatically uploaded to the RDM system by the app. The uploaded images were further en- riched with domain-specific metadata by using the RDM system web application on a desktop computer. As a final step, the image collections were visualized both on desk- top computers and on a so-called Hyperwall display (an array of four 4K screens). Depending on the task, 2 to 10 scholars were involved in this study. It showed that beyond the usability of one specific tool, the ease of transition between tools and de- vices is an important factor contributing to the overall user experience because researchers in the humanities often use several tools, potentially on different devices, during their research activities. This could necessitate data con- version and switching between devices in order to perform a particular research task. Since researchers commonly work with different devices, also in an everyday context, both multi-device and multi-tool interactions were consid- ered to be acceptable. One observation was that original user expectations change depending on the research questions pursued. We further observed the expectations of the users from the mo- bile and web application tools and how these reflect in the complexity of mobile apps intended for smaller displays, and the applications intended for use on bigger displays (e.g. web applications used on a desktop). In most cases, mobile apps have fewer features and thus a smaller set of interactions are expected to be learned and performed. Conversely, many desktop or web applications are much more feature-rich and seem to be designed under the as- sumption that the user will eventually spend some time 40 See LabCam app, http://labcam.mpdl.mpg.de/. learning the tool. This is congruent with the expectations that researchers had regarding the devices: for capturing data in the field, it was deemed sufficient to let the device acquire data automatically and to assign only basic key- words. The proper documentation of the field work would later be performed on a desktop computer. On the imple- mentation side, this may mean having to develop different user interfaces for one single backend (the database): a finger-friendly mobile version with reduced functionality plus a desktop application providing the complete func- tionality. For the visualization of data, it was found that the high-resolution large displays are not always optimal since not all applications tested support such high reso- lution41. 
To conclude, diverse factors influence the user experience in a complex digital workflow spanning sever- al tools and devices, depending on the task they perform and the intent and context of use. This requires multiple approaches in addressing actual user needs. Tool devel- opment and efforts need not be underestimated, especial- ly when such tools are part of a larger infrastructure and ecosystem. 4.2 CENDARI CENDARI (Collaborative European Digital Archive Infra- structure)42 was a 4-year European Commission-fund- ed project with the aim to integrate digital archives and resources for the pilot areas of medieval culture and the First World War. The project brought together computer scientists and developers on one side, and historians and existing historical research infrastructures (archives, li- 41 Bulatovic 2016(a) and 2016(b). 42 http://www.cendari.eu/. Fig. 1: Exemplary digital research workflow Bereitgestellt von | Humboldt-Universität zu Berlin Angemeldet Heruntergeladen am | 23.04.19 12:25 8 Fachbeitrag K. Thoden et al., User-Centered Design Practices in Digital Humanities braries and other digital projects) on the other.43 CENDARI intended to improve conditions for historical sciences in Europe through active reflection and using the impact of the digital age to respond to scientific and archival prac- tice. The development of a virtual research environment was therefore planned with the aim to support researchers in their work with different tools and features that can fa- cilitate their research. In order to discern the actual needs and expectations of possible end user or researchers, so-called “participa- tory design workshops” were planned within CENDARI. The main goal was to determine the major requirements for a future environment, while avoiding the development of features and components that would find little interest among the end users. So the idea itself was a very simple one: why not ask researchers and stakeholders at different institutions what they think a good and usable virtual re- search environment needs in order to achieve or support defined research goals, or open up research questions in specific research areas. As simple as this may seem, the method of participatory design is not as widely used as one might expect. To understand the difficulties this meth- od involves, in the following we will take a closer look at the method and how it was used in the CENDARI context. In CENDARI, three participatory design workshops44 were organized with three different user groups: histori- ans, medievalists, as well as archivists and librarians. Par- ticipants held brainstorming sessions about the function- alities of their ideal virtual research environment. With the help of the workshop facilitators, they then produced paper and video prototypes illustrating the desired func- tionalities. Based on these prototypes and discussions with the participants, the main results of the participa- tory design sessions were threefold: a delineation of the historians’ research workflow, a detailed list of functional requirements and some high-level recommendations to CENDARI. First, CENDARI described a broad framework of how early stage research is conducted. In this workflow, they describe 11 non-linear steps. The iterative nature of the re- search workflow was also noted in the literature45. 
These steps were: research preparation, source selection, plan- ning of visits to archives and libraries, archive and library visits, note-taking, transcription, research refinement and 43 One should not consider a contradictory construction despite talk of two ‘sides’. 44 Boukhelifa, Nadia, Emmanouil Giannisakis, Evanthia Dimara, et al.: “Supporting Historical Research Through User-Centered Visual Analytics.” In: EuroVis Workshop on Visual Analytics (EuroVA). 2015. doi:10.2312/eurova.20151095. 45 Mattern et al. 2015. annotation, knowledge organization and restructuring, refinement and writing, continuation and expansion, and collaboration support. Second, it was clear from the video prototypes produced during the participatory ses- sions that there were shared functionalities between the different user groups. In particular, networking, search, note-taking and visualization were the most popular fea- tures participants demanded for an ideal virtual research environment. Third, there were three high-level recom- mendations to the project: to take into account existing workflows, e.g. paper and digital, and accepted practices such as sharing notes and research material, to envisage methods that encourage participants to share or release research data, and to work closely with researchers by de- veloping early prototypes and test beds. The CENDARI functional requirements described above were “translated” into functional descriptions, which were evaluated by technical experts and then formed the backbone for software development. An in- teresting aspect of the development was the creation of use cases and user stories from selected system functions. These were intended to bring researchers and technical experts together by working on real research questions, and to help demonstrate the developed system function- alities in a coherent way. In addition to these benefits, there were also some problems with this method of developing a new environ- ment. One major issue was the diversity of requirements extracted from the use cases and user stories. Of course, there was accordance regarding some required basic func- tions like searching and browsing, but there were also great differences in the details. In relation to their respec- tive research questions and areas, researchers came up with highly specific demands that would have required very much time for individual development. This problem could have originated in the selected case studies “First World War” and “Medieval Studies”. Both research areas deal with numerous and varied research questions and involve different disciplines. At the same time, demands that were placed on the tools and components were in part delusive and could not be fulfilled. This and other prob- lems led to many lessons being learned within the proj- ect, which will have great value for future projects dealing with similar challenges. We will take closer look at the les- sons learned in the following section. Besides the infrastructure and the virtual research environment contributed by CENDARI, the project also highlighted successful strategies for developing DH tools and areas where additional efforts are needed. In this re- gard, there were three key lessons learned concerning tool design, implementation and adoption that may be gener- Bereitgestellt von | Humboldt-Universität zu Berlin Angemeldet Heruntergeladen am | 23.04.19 12:25 K. Thoden et al., User-Centered Design Practices in Digital Humanities Fachbeitrag 9 alized to the domain of DH. 
Concerning the design of DH tools, participatory design applied to DH problems was found to be a successful methodology in gathering user requirements and in bringing researchers and developers together. However, due to the many user groups involved in CENDARI (historians, archivists and librarians) and their diverse user requirements, decisions had to be made with regard to implementation. A “one system does it all” approach was not feasible. Therefore the strategy was to give priority to common needs amongst the different user groups (e.g. note-taking). This allowed a variety of user scenarios and stories to be implemented but may not have addressed the specific needs of a specialized user group. Finally, besides user-centered design, CENDARI also high- lighted the factors that may impact the adoption of a tool, such as data privacy. CENDARI’s recommendation was to keep historians’ notes private by default and tagged enti- ties46 public by default. This was seen as being helpful to “spread […] historical knowledge with little risk of disclos- ing historians’ work”47. Another factor was user percep- tion of the cost/benefit of structuring and enriching their research data. CENDARI’s strategy was to demonstrate to users how their annotations can be effectively exploited, for example through visualization and faceted search. 5 Reflections and Recommenda- tions for Usability Practices in the DH Contrary to common perceptions expressed in the litera- ture, as demonstrated with DARIAH and CENDARI, there are many projects in the DH that do address usability and that integrate user-centered design methods. Neverthe- less, the resulting tools are often not easy to use or are not self-explanatory. Although usability guidelines and heuristics exist, many DH-tools fail to even comply with the simplest rules. In the following, reflections gleaned from our experiences within the infrastructure projects CENDARI and DARIAH48 are presented. Three aspects were identified that influence usability practices in the DH: (a) heterogeneous research methods and data, (b) lack of in- 46 For example, persons, places, events or organizations identified by users during their research and annotated in their notes. 47 Boukhelifa et al. 2015. 48 Adopting good engineering practices such as continuous testing, integration and builds, is a prerequisite for any software develop- ment. This aspect is well known and will not be addressed further here. tegration of stakeholders in development processes, (c) project-driven development. Under each of these aspects, recommendations are given for raising the awareness of usability, both in its theoretical understanding and in its implementation during the development of DH tools. 5.1 Heterogeneous Research Methods and Data One of the biggest challenges in the DH are the diverse research methods executed and the countless research objects in different formats. It is important to note that in the DH, scholars often experiment with new methods or employ old methods on new quantitative data. General- izing usability guidelines for this domain is therefore very challenging. 5.1.1 Adhere to Standards Research data can come from various sources and in vari- ous formats. Tool development should therefore adhere to standards and openness. For example, preference should be given to a tool that exposes a well-described REST inter- face over a tool where a direct database is the only means of access. 
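To make the contrast concrete, the following is a minimal sketch of what such a well-described REST interface might look like for a hypothetical collection of annotated resources. The endpoint names, fields and the Flask framework are illustrative assumptions, not the API of any actual DARIAH or CENDARI service.

```python
# Minimal sketch of a self-describing REST interface (Flask).
# All endpoint names and fields are illustrative assumptions,
# not the API of any actual DARIAH or CENDARI service.
from flask import Flask, jsonify, request

app = Flask(__name__)

# Toy in-memory store standing in for the project database.
ITEMS = {1: {"id": 1, "title": "Tombstone, Worms", "keywords": ["hebrew"]}}

@app.route("/api/items", methods=["GET"])
def list_items():
    # Clients read over HTTP instead of querying the database directly.
    return jsonify(list(ITEMS.values()))

@app.route("/api/items/<int:item_id>", methods=["GET"])
def get_item(item_id):
    item = ITEMS.get(item_id)
    return (jsonify(item), 200) if item else (jsonify({"error": "not found"}), 404)

@app.route("/api/items", methods=["POST"])
def create_item():
    payload = request.get_json(force=True)
    new_id = max(ITEMS) + 1
    ITEMS[new_id] = {"id": new_id, **payload}
    return jsonify(ITEMS[new_id]), 201

if __name__ == "__main__":
    app.run()
```

Because every client (a mobile app, a desktop interface, a large-display visualization) goes through the same documented endpoints, the backend can evolve without breaking each frontend separately.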
Developing test data sets and providing users with sample data to test and play with the tool should be common practice. When aggregating heterogeneous data, it is difficult to find a common relevant denominator to answer upcoming and as yet ill-defined research questions. The proper representation of such data is a challenging task. It is better to start with a minimal set of attributes and then to iterate as more is learned about each data type. It is preferable that the REST interfaces are designed more generically and the user interfaces more specifically.

5.1.2 Choose the Right Methods and Techniques

In essence, every research project tends to deliver novel features and methods. Software tools that are used should support such novelty and implement the necessary mechanisms. We are often already aware of some features that should certainly be implemented by the tool, such as the creation and curation of resources, searching, browsing, and so forth. Instead of implementing everything from scratch, one ought to try and find an open source tool that can be applied to the research domain and that provides the required functionality. The focus should be on an implementation of any missing features, either in the tool, as an add-on or through the integration of an existing service that supports them. If necessary, several tools should be used. The development of all features and methods from scratch should, as far as possible, be avoided.

5.2 Lack of Integration of Stakeholders in the Development Process

In the DH, different stakeholders often have conflicting ideas about the success of a developed tool and the significance of usability in achieving success. On the one hand, there are service providers and funders of digital tools who want to increase user acceptance and usage of tools. A high number of users could mean more and better networking scenarios within the specific research area and better statistical and heuristic analyses of the tool and its components. There is also the need to justify funding and to explain the additional benefits of developed tools and digital methods. On the other hand, there are scholars who are often considered to be mere users, having little influence on the design and development process. A deeper involvement of the scholars can lead to them becoming trailblazers of new methods and tools in the humanities and thus further advancing the field.

5.2.1 Assemble a Cross-Functional Team that Works Closely Together

It is vital for the whole development team to understand the scholars' needs, their vocabulary and research practices. Conversely, scholars should also have the chance to understand the reasons for limitations on the implementation side. A potential solution could be to build a team that comprises all parties involved, works closely together and shares their respective experiences as early as possible. The methods described above are examples of exactly such practice. The involvement of researchers in tool development is necessary from the very beginning. If possible, the team should be situated in the same location.

5.2.2 Understand the Users' Needs and the Project Goals

Innovative projects – especially large international projects – are often based in different locations.
There is therefore a high risk of misunderstanding the goals and the requirements of the project due to a lack of communication, especially when it comes to the diversity of scenarios that need to be supported. Developing a common "language of understanding" is not an easy task. Communicate often and communicate openly. Start with the high-risk features first. It is necessary to practice agile and innovative methods to help understand different aspects of future solutions and new developments and priorities49.

5.3 Project-driven Development

Tool development in the DH is often driven by projects with strictly limited resources. These research projects often aim at developing tools that support new methods, justifying the funding for further development of the field. A sustainable development of tools with a long-term perspective is often not the primary goal of such projects, and usability considerations are often seen as the finishing touch – also in heavily funded projects. Even when funding is available to study the user experience, time or resources are lacking for an implementation of the results.

5.3.1 Document Everything – People Might Move On

Irrespective of the duration of the research project, in many cases there are difficulties in hiring people. There are a few variations of this phenomenon: positions cannot be filled in time, people find other positions to pursue their research during a research project, newly hired people master completely different technology than the one already used in the project, and so on. Not only do these slow down the whole development process, they also directly affect the user experience aspects of the project. Due to insufficient documentation of the work already done, additional time is required to understand the needs of the researchers who already expect a working solution, to adapt to changing goals, to introduce new members to the working environment, and so forth. In order to reduce the negative effects of such changes, one ought to use common components and apply common standards, keep the code clean, maintain a sufficient level of documentation and preserve project artifacts (e.g. design workshops, brainstorming outcomes, notes and meeting memos). 49 Hohmann, Luke: Innovation Games: Creating Breakthrough Products Through Collaborative Play. Addison-Wesley Professional, 2006. http://proquest.tech.safaribooksonline.de/0321437292; Luchs, Michael G., Scott Swan, Abbie Griffin: Design Thinking. John Wiley & Sons, 2015. http://proquest.tech.safaribooksonline.de/9781118971802.

5.3.2 Take Small Steps and Iterate

It is important to decide carefully about the prioritization of user-experience-related development. For example, writing a one-time script to upload data may have little impact on the user experience in comparison to a web application for data entry or data annotation. In order to test what is acceptable before any implementation, one should practice agile and innovative methods at an early stage to address user experience by using several low-fidelity prototypes.50 One should not try to model everything upfront. Instead, one can make many smaller-sized implementation iterations, thus reducing the risk of a larger part of the work being left unfinished. 50 Check some tools and resources for prototypes and mockups available at https://balsamiq.com/products/mockups/, http://www.axure.com/, https://www.build.me.
6 Concluding Remarks Juxtaposing the different aspects that influence practices and methods of usability in the DH has shown that the reasons for disregarding user experience can be manifold. Although there is awareness in projects of the importance of usability, results from studies are rarely taken into ac- count during development. To increase user experience, however, one can start with very simple things when de- veloping tools: even little usability is better than none. And it can easily be achieved by providing sample data or good documentation, which helps users in becoming familiar with the tool. With this presentation of user-cen- tered design practices and the recommendations above in this article, it is hoped to narrow the gap between usability in theory and usability in practice. 50 Check some tools and resources for prototypes and mockups available at https://balsamiq.com/products/mockups/, http://www. axure.com/, https://www.build.me. Autoreninformationen Klaus Thoden Max-Planck-Institut für Wissenschaftsgeschichte Boltzmannstraße 22 14195 Berlin kthoden@mpiwg-berlin.mpg.de orcid.org/0000-0003-0434-3951 Juliane Stiller Berlin School of Library and Information Science Humboldt-Universität zu Berlin Dorotheenstraße 26 10117 Berlin juliane.stiller@ibi.hu-berlin.de orcid.org/0000-0001-8184-6187 Natasa Bulatovic Max Planck Digital Library (MPDL) Amalienstraße 33 80799 München bulatovic@o2mail.de Hanna-Lena Meiners University of Göttingen Göttingen State and University Library Papendiek 14 37073 Göttingen meiners@sub.uni-goettingen.de orcid.org/0000-0001-7499-9345 Nadia Boukhelifa UMR GMPA AgroParisTech INRA Université Paris-Saclay nadia.boukhelifa@inra.fr orcid.org/0000-0002-0541-8022 Bereitgestellt von | Humboldt-Universität zu Berlin Angemeldet Heruntergeladen am | 23.04.19 12:25 work_6g5fvv6mtnfuvogiz3otmgdzum ---- Recognizing Groups Among Dialects Jelena Prokić John Nerbonne Abstract In this paper we apply various clustering algorithms to the dialect pronunciation data. At the same time we propose several evaluation techniques that should be used in order to deal with the instability of the clustering techniques. The results have shown that three hierarchical clustering algorithms are not suitable for the data we are working with. The rest of the tested algorithms have successfully detected two-way split of the data into the Eastern and Western dialects. At the aggregate level that we used in this research, no further division of sites can be asserted with high confidence. 1 Introduction Dialectometry is a multidisciplinary field that uses various quantitative methods in the analysis of dialect data. Very often those techniques include classification algorithms such as hierarchical clustering algorithms used to detect groups within certain dialect area. Although known for their instability (Jain and Dubes, 1988), clustering algo- rithms are often applied without evaluation (Goebl, 2007; Nerbonne and Siedle, 2005) or with only partial evaluation (Moisl and Jones, 2005). Very small differences in the input data can produce substantially different grouping of dialects (Nerbonne et al., 2008). Without proper evaluation, it is very hard to determine if the results of the ap- plied clustering technique are an artifact of the algorithm or the detection of real groups in the data. The aim of this paper is to evaluate algorithms used to detect groups among lan- guage dialect varieties measured at the aggregate level. 
The data used in this research is dialect pronunciation data that consists of various pronunciations of 156 words collected all over Bulgaria. The distances between words are calculated using the Levenshtein algorithm, which also resulted in the calculation of the distances between each two sites in the data set. We apply seven hierarchical clustering algorithms, as well as the k-means and neighbor-joining algorithms, to the calculated distances and examine these using various evaluation methods. We evaluate using several external and internal methods, since there is no direct way to evaluate the performance of the clustering algorithms.

The structure of this paper is as follows. Different classification algorithms are presented in the next section. In Section 3 we discuss our data set and how the data was processed. Various evaluation techniques are described in Section 4. The results are given in Section 5. In Section 6 we present discussion and conclusions.

2 Classification algorithms

In this section we briefly introduce seven hierarchical clustering algorithms, k-means and the neighbor-joining algorithm, originally used for reconstructing phylogenetic trees.

2.1 Hierarchical clustering

Cluster analysis is the process of partitioning a set of objects into groups or clusters (Manning and Schütze, 1999). The goal of clustering is to find structure in the data by finding objects that are similar enough to be put in the same group and by identifying distinctions between the groups. Hierarchical clustering algorithms produce a set of nested partitions of the data by finding successive clusters using previously established clusters. This kind of hierarchy is represented with a dendrogram—a tree in which more similar elements are grouped together. In this study seven hierarchical clustering algorithms will be investigated with regard to their performance on dialect pronunciation data. All these agglomerative clustering algorithms proceed from a distance matrix, repeatedly choosing the two closest elements and fusing them. They differ in the way in which distances are recalculated from the newly fused elements to the others. We now review the various calculations.

Single link method, also known as nearest neighbor, is one of the oldest methods in cluster analysis. The similarity between two clusters is computed as the distance between the two most similar objects in the two clusters.

d_{k[ij]} = \min(d_{ki}, d_{kj})

In this formula, as well as in the other formulae in this subsection, i and j are the two closest points that have just been fused into one cluster [i, j], and k represents all the remaining points (clusters). As noted in Jain and Dubes (1988), single link clusters easily chain together, producing the so-called chaining effect, and produce elongated clusters. The presence of only one intermediate object between two compact clusters is enough to turn them into a single cluster.

Complete link, also called furthest neighbor, uses the most distant pair of objects while fusing two clusters. It repeatedly merges clusters whose most distant elements are closest.

d_{k[ij]} = \max(d_{ki}, d_{kj})

Unweighted Pair Group Method using Arithmetic Averages (UPGMA) belongs to a group of average clustering methods, together with three methods that will be described below. In UPGMA, the distance between any two clusters is the average of distances between the members of the two clusters being compared. The average is weighted naturally, according to size.

d_{k[ij]} = \frac{n_i}{n_i + n_j} d_{ki} + \frac{n_j}{n_i + n_j} d_{kj}

As a consequence, smaller clusters will be weighted less and larger ones more.

Weighted Pair Group Method using Arithmetic Averages (WPGMA), just as UPGMA, calculates the distance between the two clusters as the average of distances between all members of two clusters. But in WPGMA, the clusters that fuse receive equal weight regardless of the number of members in each cluster.

d_{k[ij]} = \frac{1}{2} d_{ki} + \frac{1}{2} d_{kj}

Because all clusters receive equal weights, objects in smaller clusters are more heavily weighted than those in the big clusters.

Unweighted Pair Group Method using Centroids (UPGMC). In this method, the members of a cluster are represented by their middle point, the so-called centroid. This centroid represents the cluster while calculating the distance between the clusters to be fused.

d_{k[ij]} = \frac{n_i}{n_i + n_j} d_{ki} + \frac{n_j}{n_i + n_j} d_{kj} - \frac{n_i n_j}{(n_i + n_j)^2} d_{ij}

In the unweighted version of centroid clustering the clusters are weighted based on the number of elements that belong to each cluster. This means that bigger clusters receive more weight, so that centroids can be biased towards bigger clusters. Centroid clustering methods can also occasionally produce reversals—partitions where the distance between two clusters being joined is smaller than the distance between some of their subclusters (Legendre and Legendre, 1998).

Weighted Pair Group Method using Centroids (WPGMC). Just as in WPGMA, in WPGMC all clusters are assigned the same weight regardless of the number of objects in each cluster. In that way the centroids are not biased towards larger clusters.

d_{k[ij]} = \frac{1}{2} d_{ki} + \frac{1}{2} d_{kj} - \frac{1}{4} d_{ij}

Ward's method. This method is also known as the minimal variance method. At each stage in the analysis, the clusters that merge are those that result in the smallest increase in the sum of the squared distances of each individual from the mean of its cluster.

d_{k[ij]} = \frac{n_k + n_i}{n_k + n_i + n_j} d_{ki} + \frac{n_k + n_j}{n_k + n_i + n_j} d_{kj} - \frac{n_k}{n_k + n_i + n_j} d_{ij}

This method uses an analysis of variance approach to calculate the distances between clusters. It tends to create clusters of the same size (Legendre and Legendre, 1998).
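All seven update rules above are instances of the Lance–Williams recurrence and are available in standard software. The following minimal sketch (with an invented toy distance matrix, not the Bulgarian data) shows how a precomputed site-by-site distance matrix can be clustered with each rule and cut into a fixed number of groups using SciPy. Note that SciPy's "centroid", "median" and "ward" options are only strictly correct for Euclidean distances, a caveat that applies to any precomputed matrix.

```python
# A minimal sketch (illustrative data, not the Bulgarian measurements):
# cluster a precomputed site-by-site distance matrix with the seven
# agglomerative rules discussed above, then cut the tree into k groups.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Toy symmetric distance matrix for four hypothetical sites.
D = np.array([[0.0, 0.1, 0.6, 0.7],
              [0.1, 0.0, 0.5, 0.6],
              [0.6, 0.5, 0.0, 0.2],
              [0.7, 0.6, 0.2, 0.0]])
condensed = squareform(D)  # SciPy expects the upper triangle as a vector

# SciPy's names for the seven rules in the text.
methods = ["single", "complete", "average",   # average = UPGMA
           "weighted",                        # WPGMA
           "centroid", "median",              # UPGMC, WPGMC
           "ward"]

for m in methods:
    Z = linkage(condensed, method=m)                 # dendrogram as a merge table
    labels = fcluster(Z, t=2, criterion="maxclust")  # cut into 2 clusters
    print(m, labels)
```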
2.2 K-means

The k-means algorithm belongs to the non-hierarchical algorithms, which are often referred to as partitional clustering methods (Jain and Dubes, 1988). Unlike hierarchical clustering algorithms, partitional clustering methods generate a single partition of the data. A partition implies a division of the data in such a way that each instance can belong only to one cluster. The number of groups into which the data should be partitioned is usually determined by the user.

K-means is the most commonly used partitional algorithm, which, despite its simplicity, works sufficiently well in many applications (Manning and Schütze, 1999). The main idea of k-means clustering is to find the partition of n objects into K clusters such that the total error sum of squares is minimized. In its simplest version, the algorithm consists of the following steps:

1. pick initial cluster centers at random
2. assign objects to the cluster whose mean is closest
3. recompute the means of the clusters
4. reassign every object to the cluster whose mean is closest
5. repeat steps 3 and 4 until there are no changes in the cluster membership of any object

Two main drawbacks of the k-means algorithm are the following:

• the user has to define the number of clusters in advance
• the final partitioning depends on the initial position of the centroids

Possible solutions to these problems, as well as detailed descriptions of the k-means algorithm, can be found in some of the classical references to k-means: Hartigan (1975), Everitt (1980) and Jain and Dubes (1988).
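Strictly speaking, k-means operates on coordinates rather than on a distance matrix. One common way to apply it to distance data, sketched below, is to embed the distances in a low-dimensional space via metric MDS first and then cluster the coordinates; this embedding step is an assumption made here for illustration, since the paper does not spell out how k-means consumed the distance matrix.

```python
# Sketch: applying k-means to distance data by first embedding the
# site-by-site distances in a low-dimensional space via metric MDS.
# The embedding step is our assumption for illustration only.
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

D = np.array([[0.0, 0.1, 0.6, 0.7],
              [0.1, 0.0, 0.5, 0.6],
              [0.6, 0.5, 0.0, 0.2],
              [0.7, 0.6, 0.2, 0.0]])  # toy distances again

coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(D)

# n_init restarts mitigate the dependence on initial centroids
# noted above; the best of the 25 runs (lowest inertia) is kept.
km = KMeans(n_clusters=2, n_init=25, random_state=0).fit(coords)
print(km.labels_)
```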
2.3 Neighbor-joining

Apart from the seven hierarchical clustering algorithms and k-means, we also investigate the performance of the neighbor-joining algorithm. We introduce this technique at more length as it is less familiar to linguists. Neighbor-joining is a method for reconstructing phylogenetic trees that was first introduced by Saitou and Nei (1987). The main principle of this method is to find pairs of taxonomic units that minimize the total branch length at each stage of clustering. The distances between each pair of instances (in our case data collection sites) are calculated and put into an n × n matrix, where n represents the number of instances. The matrices are symmetrical since distances are symmetrical, i.e. distance (a, b) is always the same as distance (b, a). Based on the input distances, the algorithm finds a tree that fits the observed distances as closely as possible. While choosing the two nodes to fuse, the algorithm always takes into account the distance from every node to all other nodes in order to find the smallest tree that would explain the data. Once found, the two optimal nodes are fused and replaced by a new node. The distance between the new node and all other nodes is recalculated, and the whole process is repeated until there are no more nodes left to be paired. The algorithm was modified by Studier and Kepler (1988), and the complexity was reduced to O(n³). The steps of the algorithm are as follows (taken from Felsenstein (2004)):

• For each node i compute u_i, the sum of the distances from that node to all other nodes:

u_i = \sum_{j : j \neq i} \frac{D_{ij}}{n - 2}

• Choose the i and j for which D_{ij} - u_i - u_j is smallest.

• Join i and j. Compute the length from i and j to the newly formed node v using the equations below. Note that the distances from the new node to its children (leaves) need not be identical. This possibility does not exist in hierarchical clustering.

v_i = \frac{1}{2} D_{ij} + \frac{1}{2}(u_i - u_j)

v_j = \frac{1}{2} D_{ij} + \frac{1}{2}(u_j - u_i)

• Compute the distance between the new node and all of the remaining nodes:

D_{(ij),k} = \frac{D_{ik} + D_{jk} - D_{ij}}{2}

• Delete nodes i and j and replace them by the new node.

This algorithm produces a unique unrooted tree under the principle of minimal evolution (Saitou and Nei, 1987). In biology, the neighbor-joining algorithm has become a very popular and widely used method for reconstructing trees from distance data. It is fast and can easily be applied to a large amount of data. Unlike most hierarchical clustering algorithms, it will recover the true tree even if there is not a constant rate of change among the taxa (Felsenstein, 2004).
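The update steps above translate almost line for line into code. The following is a minimal from-scratch sketch on a toy matrix (site names and distances are invented); it prints the successive joins with their branch lengths rather than drawing a tree, and the handling of the final edge is a simplifying convention.

```python
# Sketch: the Studier-Kepler neighbor-joining steps above, implemented
# directly on a toy distance matrix. Output is a list of merge events
# with branch lengths, not a drawn tree; names are illustrative.
import numpy as np

def neighbor_joining(D, names):
    D = D.astype(float).copy()
    nodes = list(names)
    merges = []
    while len(nodes) > 2:
        n = len(nodes)
        u = D.sum(axis=1) / (n - 2)                # u_i = sum_j D_ij / (n - 2)
        # Find the pair minimizing D_ij - u_i - u_j (i != j).
        Q = D - u[:, None] - u[None, :]
        np.fill_diagonal(Q, np.inf)
        i, j = np.unravel_index(np.argmin(Q), Q.shape)
        vi = 0.5 * D[i, j] + 0.5 * (u[i] - u[j])   # branch lengths to new node v
        vj = 0.5 * D[i, j] + 0.5 * (u[j] - u[i])
        merges.append((nodes[i], vi, nodes[j], vj))
        # Distance from the new node v to every remaining node k.
        dv = 0.5 * (D[i, :] + D[j, :] - D[i, j])
        keep = [k for k in range(n) if k not in (i, j)]
        newD = np.zeros((len(keep) + 1, len(keep) + 1))
        newD[:-1, :-1] = D[np.ix_(keep, keep)]
        newD[-1, :-1] = newD[:-1, -1] = dv[keep]
        D = newD
        nodes = [nodes[k] for k in keep] + ["(%s,%s)" % (merges[-1][0], merges[-1][2])]
    # Final edge between the last two nodes; by convention its full
    # length is placed on one side here.
    merges.append((nodes[0], D[0, 1], nodes[1], 0.0))
    return merges

D = np.array([[0.0, 0.3, 0.6, 0.7],
              [0.3, 0.0, 0.5, 0.6],
              [0.6, 0.5, 0.0, 0.2],
              [0.7, 0.6, 0.2, 0.0]])
for m in neighbor_joining(D, ["A", "B", "C", "D"]):
    print(m)
```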
3 Data preprocessing

The data set used in this research consists of transcriptions of the pronunciations of 156 words collected from 197 sites equally distributed all over Bulgaria. All measurements were done based on the phonetic distances between the various pronunciations of these 156 words. No morphological, lexical or syntactic variation between the dialects was taken into account.

Word transcriptions were preprocessed in the following way:

• First, all diacritics and suprasegmentals were removed from the word transcriptions. In order to process diacritics and suprasegmentals, they would have to be assigned weights appropriate for the specific language being analyzed. Since no study of this kind was available for Bulgarian, diacritics and suprasegmentals were removed, which resulted in a simplified data representation. For example, [u], [u:], ["u], and ["u:] counted as the same phone. Also, all words were represented as series of phones which are not further defined. The result of comparing two phones can be 1 or 0; they either match or they do not. For example, the pair [e, E] counts as different to the same degree as the pair [e, i]. Although it is linguistically counterintuitive to use less sensitive measures, Heeringa (2004: p. 186) has shown that in the aggregate analysis of dialect differences a more detailed feature representation of segments does not improve the results obtained by using a simple phone representation.

• All transcriptions were aligned based on the following principles: a) a vowel can match only with a vowel; b) a consonant can match only with a consonant, the semivowels [j] and [w], or a sonorant. The alignments were carried out using the Levenshtein algorithm, which also results in the calculation of a distance between each pair of words. A detailed explanation of the Levenshtein algorithm can be found in Heeringa (2004). The distance is the smallest number of insertions, deletions, and substitutions needed to transform one string into the other. In this work all three operations were assigned the same value: 1. An example of an aligned pair of transcriptions can be seen here:

-  e  d  e  m
j  A  d  A  -

The distance between two sites is the mean of all word distances calculated for those two sites. The final result is a distance matrix which contains the distances between each two sites in the data set. This distance matrix was further analyzed using the seven hierarchical algorithms, k-means and the neighbor-joining algorithm described in the previous section.
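The unit-cost distance just described is straightforward to reproduce. The sketch below computes it with dynamic programming and averages word distances into a site-to-site distance, as in the pipeline above. The two "sites" and their phone strings are invented for illustration, and the vowel/consonant alignment constraints are omitted for brevity.

```python
# Sketch: unit-cost Levenshtein distance between phone strings, and the
# aggregate site distance as the mean over a word list.
# Toy data; the vowel/consonant alignment constraints are omitted here.
import numpy as np

def levenshtein(a, b):
    m, n = len(a), len(b)
    dp = np.zeros((m + 1, n + 1), dtype=int)
    dp[:, 0] = np.arange(m + 1)   # deletions
    dp[0, :] = np.arange(n + 1)   # insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1   # phones match or not
            dp[i, j] = min(dp[i - 1, j] + 1,          # deletion
                           dp[i, j - 1] + 1,          # insertion
                           dp[i - 1, j - 1] + cost)   # substitution
    return dp[m, n]

# Pronunciations of the same two words at two hypothetical sites.
site1 = [["e", "d", "e", "m"], ["m", "l", "e", "k", "o"]]
site2 = [["j", "A", "d", "A"], ["m", "l", "a", "k", "u"]]

site_distance = np.mean([levenshtein(w1, w2) for w1, w2 in zip(site1, site2)])
print(site_distance)  # mean word distance between the two sites
```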
In general then, MDS attempts to ar- range "objects" in a space with a certain small number of dimensions, which, however, accord with the observed distances. As a result, we can “explain“ the distances in terms of underlying dimensions. It has been frequently used in linguistics and dialectology since Black (1973). 4.1 External validation The modified Rand index (Hubert and Arabie, 1985) is used for comparing two differ- ent partitions of a finite set of objects. It is a modified form of the Rand index (Rand, 1971), one of the most popular measures for comparing partitions. Given a set of n elements S = o1, ...on and two partitions of S, U = u1, ...uR and V = v1, ...vC we define a the number of pairs of elements in S that are in the same set in U and in the same set in V b the number of pairs of elements in S that are in different sets in U and in different sets in V c the number of pairs of elements in S that are in the same set in U and in different sets in V d the number of pairs of elements in S that are in different sets in U and in the same set in V The Rand index R is R = a + b a + b + c + d In this formula a and b are the number of pairs of elements in which two classifications 9 agree, while c and d are the number of pairs of elements in which they disagree. The value of the Rand index is between 0 and 1, with 0 indicating that the two data clusters do not agree on any pair of points and 1 indicating that the data clusters are exactly the same. In dialectometry, this index was used by Heeringa et al. (2002) to validate dialect comparison methods. A problem with the Rand index is that it does not return a constant value (zero) if two partitions are picked at random. Hubert and Arabie (1985) suggested a modification of Rand index that corrects this property. It can be expressed in the general form as: RandIndex − ExpectedIndex M aximumIndex − ExpectedIndex The value of the modified Rand index is between -1 and 1. Entropy and purity are two measures used to evaluate the quality of clustering by looking at the reference class labels of the elements assigned to each cluster (Zhao and Karypis, 2001). Entropy measures how different classes of elements are distributed within each cluster. The entropy of a single cluster is calculated using the following formula: E(Sr) = − 1 log q q∑ i=1 nir nr log nir nr where Sr is a particular cluster of size nr, q is the number of classes in the reference data set, and nir is the number of the elements of the ith class that were assigned to the rth cluster. The overall entropy is the sum of all cluster entropies weighted by the size of the cluster: E = k∑ r=1 nr n E(Sr) The purity measure is used to determine to which extent a cluster contains objects from primarily one class. The purity of a cluster is calculated as: P (Sr) = 1 nr max(nir) 10 while the overall purity is the weighted sum of the individual cluster purities: P = k∑ r=1 nr n P (Sr) 4.2 Internal validation The cophenetic correlation coefficient (Sokal and Rohlf, 1962) is Pearson’s correla- tion coefficient computed between the cophenetic distances produced by clustering and those in the original distance matrix. The cophenetic distance between two objects is the similarity level at which those two objects become members of the same cluster during the course of clustering (Jain and Dubes, 1988) and is represented as branch length in dendrogram. It measures to which extent the clustering results correspond to the original distances. 
When the clustering functions perfectly, the value of the cophe- netic correlation coefficient is 1. In order to check the significance of this statistics we performed the simple Mantel test as implemented in zt software (Bonet and de Peer, 2002). A simple Mantel test is used to compare two matrices by testing the corre- lation between them using the standard Pearson correlation coefficient and testing its statistical significance (Mantel, 1967). Noisy clustering, also called composite clustering, is a procedure in which small amounts of random noise are added to matrices during repeated clustering. The main purpose of this procedure is to reduce the influence of outliers on the regular clusters and to identify stable clusters. As shown in Nerbonne et al. (2008) it gives results that nearly perfectly correlate with the results obtained by bootstrapping—a statistical method for measuring the support of a given edge in a tree (Felsenstein, 2004). The ad- vantage of the noisy clustering, compared to bootstrapping, is that it can be applied on a single distance matrix—the same one used as input for the classification algorithms. A consensus dendrogram, or consensus tree, is a tree that summarizes the agree- ment between a set of trees (Felsenstein, 2004). A consensus tree that contains a large number of internal nodes shows high agreement between the input trees. On the other hand, if a consensus tree contains few internal nodes, it is a sign that input trees clas- sify the data in conflicting ways. The majority rule consensus tree, used in this study, 11 is a tree that consists of the groups, i.e clusters, which are present in the majority of the trees under study. In this research a consensus dendrogram was created from four dendrograms produced by four different hierarchical clustering methods. Clusters that appear in the consensus tree are those supported by the majority of algorithms and can be taken with greater confidence to be true clusters. 5 Results Before describing the results of applying various algorithms to our data set, we give a short description of the traditional division of the Bulgarian dialect area that we used for external validation in our research. 5.1 Traditional scholarship Traditional scholarship (Stojkov, 2002) divides the Bulgarian language into two main groups: Western and Eastern. The border between these two areas is so-called ’yat’ border that reflects different pronunciations of the old Slavic vowel ’yat’. It goes from Nikopol in the North, near Pleven and Teteven down to Petrich in the South (bold dashed line in Figure 1). Figure 1: Traditional map of Bulgarian dialects 12 Figure 2: The two-way and six-way classification of sites done by expert Figure 3: MDS plot Figure 4: MDS map 13 Stojkov divides each of these two areas further into three smaller dialect zones, which can also be seen on the map in Figure 1. This 6-fold division is based on the variation of different phonetic features. No morphological or syntactic differences were taken into account. In order to evaluate the performance of different clustering algo- rithms, all sites present in our data set were manually put by an expert in one of the two, and later into six, main dialect areas according to the Stojkov’s classification. This was done by Professor Vladimir Zhobov, phonetician and dialectologist from the Faculty of Slavic Philologies ’St. Kliment Ohridski’, University of Sofia. 
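Given such an expert partition as the gold standard, the external measures of Section 4.1 can be computed in a few lines. The sketch below uses scikit-learn's adjusted_rand_score, an implementation of the Hubert–Arabie modified Rand index, and derives entropy and purity from the contingency table; the two label vectors are invented for illustration, not the actual 197-site classifications.

```python
# Sketch: external validation of a clustering against an expert partition.
# Labels are illustrative, not the actual 197-site classifications.
import numpy as np
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix

expert  = np.array([0, 0, 0, 1, 1, 1, 1, 0])   # gold-standard classes
cluster = np.array([0, 0, 1, 1, 1, 1, 0, 0])   # algorithm's partition

print("modified Rand:", adjusted_rand_score(expert, cluster))

C = contingency_matrix(expert, cluster)  # classes (rows) x clusters (cols)
n = C.sum()
q = C.shape[0]                           # number of reference classes

# Per-cluster entropy and purity, then their size-weighted averages.
p = C / C.sum(axis=0, keepdims=True)     # n_ir / n_r within each cluster
with np.errstate(divide="ignore", invalid="ignore"):
    ent = -np.nansum(p * np.log(p), axis=0) / np.log(q)
purity = C.max(axis=0) / C.sum(axis=0)
weights = C.sum(axis=0) / n
print("entropy:", float(ent @ weights), "purity:", float(purity @ weights))
```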
Due to various historical events, mostly migrations, some villages are dialectological islands surrounded by language varieties from groups different from the one they belong to. This lack of geographical coherence can be seen, for example, in the north-central part of the map in Figure 2.

5.2 MDS

Multidimensional scaling was performed in order to check if there are any separate clusters in the data. The results can be seen in Figure 3, where the first two extracted dimensions are plotted against the x and y axes. In addition, all three extracted dimensions are represented by different shades of red, blue and green, so that the third MDS dimension is visible as well.

The first three dimensions represented in Figure 3 explain 98 per cent of the variation in the data set—the first dimension extracted explains 80 per cent of the variation, and the second dimension 16 per cent. In Figure 3 we can see two distinct clusters along the x-axis, which, if put on the map, correspond to the Eastern and Western groups of dialects (Figure 4). Variation along the y-axis corresponds to the separation of the dialects in the South from the rest of the country. Using MDS to screen the data, we observe that there are two distinct clusters in the data set—even though MDS is fully capable of representing continuous data. This finding fully agrees with the expert opinion (Stojkov, 2002) according to which the Bulgarian dialect area can be divided into Eastern and Western dialect areas along the 'yat' border. A third area that can be seen in Figure 4 is the area in the South of the country—the area of the Rodopi mountains. In the classification of dialects done by Stojkov (2002), this area is identified as one of the six main dialect areas based on phonetic features.

5.3 External validation

The results of the multidimensional scaling and the dialect divisions done by the expert can be used as a first step in the evaluation of the clustering algorithms. Visual inspection shows that three algorithms fail to identify any structure in the data, including the East-West division of the dialects: single link and the two centroid algorithms, UPGMC and WPGMC. Dendrograms drawn using UPGMC and WPGMC reveal a large number of reversals, while closer inspection of the single link dendrogram clearly shows the presence of the chaining effect. The remaining algorithms reveal the East-West division of the country clearly (Figure 5). For that reason, in the rest of the paper the main focus will be on those four clustering algorithms, as well as on k-means and neighbor-joining.

Figure 5: Top left map: 2-way division produced by UPGMA, WPGMA and Ward's method. Top right map: 6-way division produced by UPGMA. Bottom maps: 6-way divisions produced by WPGMA and Ward's method respectively.

In order to compare the divisions done by the clustering algorithms with the division of sites done by the expert, we calculated the modified Rand index, entropy and purity for the 2-fold, 3-fold, and 6-fold divisions done by the algorithms on the one hand, and those divisions according to the expert on the other. The results can be seen in Table 1.

Table 1: Results of external validation: the modified Rand index (MRI), entropy (E) and purity (P). Results for the 2, 3 and 6-fold divisions are reported.
Algorithm        MRI(2)   MRI(3)   MRI(6)   E(2)    E(3)    E(6)    P(2)    P(3)    P(6)
single link      -0.004    0.007   -0.001   0.958   0.967   0.881   0.614   0.396   0.360
complete link     0.495    0.520    0.350   0.510   0.542   0.467   0.848   0.766   0.645
UPGMA             0.700    0.627    0.273   0.368   0.445   0.583   0.914   0.853   0.568
WPGMA             0.700    0.626    0.381   0.368   0.445   0.448   0.914   0.853   0.665
UPGMC            -0.004    0.007   -0.006   0.959   0.967   0.926   0.614   0.396   0.310
WPGMC            -0.004    0.007   -0.005   0.958   0.967   0.925   0.614   0.396   0.305
Ward's method     0.700    0.627    0.398   0.368   0.445   0.441   0.914   0.853   0.675
k-means           0.700    0.625    0.471   0.354   0.451   0.355   0.919   0.756   0.772
NJ                0.567    0.461    -       0.442   0.550   -       0.873   0.777   -

The neighbor-joining algorithm produced an unrooted tree (Figure 6), where only 2-fold and 3-fold divisions of the sites can be identified. Hence, all the indices were calculated only for the 2-fold and 3-fold divisions in neighbor-joining.

Figure 6: NJ tree

In Table 1 we can see that the values of the modified Rand index for single link and the two centroid methods are very close to 0, which is the value we would get if the partitions were picked at random. UPGMA, WPGMA, Ward's method and k-means, which gave nearly the same 2-fold division of the sites, show the highest correspondences with the divisions done by the expert. For the 3-fold and 6-fold divisions the values of the modified Rand index went down for all algorithms, which was expected since the number of groups increased. The two algorithms with the highest values of the index are Ward's method and UPGMA for the 3-fold division, and k-means for the 6-fold division. Just as in the case of the 2-fold division, the single link, UPGMC, and WPGMC algorithms have values of the modified Rand index close to 0. Neighbor-joining produced a relatively low correspondence with expert opinion for the 3-fold division—0.461. Similar results for all algorithms and all divisions were obtained using the entropy and purity measures. External validation of the clustering algorithms has revealed that the single link, UPGMC and WPGMC algorithms are not suitable for the analysis of the data we are working with, since they fail to recognize any structure in the data.

5.4 Internal validation

In the next step internal validation methods were used to check the performance of the algorithms: the cophenetic correlation coefficient, noisy clustering and the consensus tree. Since k-means does not produce a dendrogram, it was not possible to calculate the cophenetic correlation coefficient for it. The values of the cophenetic correlation coefficient for the remaining eight algorithms can be seen in Table 2. We can see that the clustering results of UPGMA have the highest correspondence to the original distances of all algorithms—90.26 per cent. They are followed by the results obtained using the complete link and neighbor-joining algorithms. All correlations are highly significant with p < 0.0001. Given the poor performance of the centroid and single-link methods in detecting the dialect divisions scholars agree on, we note that cophenetic correlation coefficients are not successful in distinguishing the better techniques from the weaker ones. We conjecture that the reason for this lies in the fact that the cophenetic correlation coefficient is so dependent on the lengths of the branches in the dendrogram, while our primary purpose is the classification.
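The cophenetic check itself is a one-liner in standard software. A sketch on the same toy matrix follows (the correlations for the real site matrix are those reported in Table 2); the Mantel significance test used in the paper is not part of SciPy and is omitted here.

```python
# Sketch: cophenetic correlation between a dendrogram and the
# original distances (toy data), the quantity reported in Table 2.
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

D = np.array([[0.0, 0.1, 0.6, 0.7],
              [0.1, 0.0, 0.5, 0.6],
              [0.6, 0.5, 0.0, 0.2],
              [0.7, 0.6, 0.2, 0.0]])
condensed = squareform(D)

for method in ["single", "complete", "average", "weighted",
               "centroid", "median", "ward"]:
    Z = linkage(condensed, method=method)
    c, _ = cophenet(Z, condensed)   # c = cophenetic correlation coefficient
    print(method, round(float(c), 4))
```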
Noisy clustering, which was applied with the seven hierarchical algorithms, has confirmed that there are two relatively stable groups in the data: Eastern and Western.

Table 2: Cophenetic correlation coefficient

Algorithm         CCC     p
single link       0.7804  0.0001
complete link     0.8661  0.0001
UPGMA             0.9026  0.0001
WPGMA             0.8563  0.0001
UPGMC             0.8034  0.0001
WPGMC             0.6306  0.0001
Ward's method     0.7811  0.0001
Neighbor-joining  0.8587  0.0001

Dendrograms obtained by applying noisy clustering to the whole data set show low confidence for the two-way split of the data, between 52 and 60 per cent. After removing the Southern villages from the data set, we obtained dendrograms that confirm the two-way split of the data along the 'yat' border with much higher confidence, ranging around 70 per cent. These values are not very high. In order to check the reason for the influence of the Southern varieties on the noisy clustering, we examine an MDS plot in two dimensions with the cluster groups marked by colours. In Figure 7 we can see the MDS plot of the 6 groups produced by the WPGMA algorithm. The MDS plot reveals two homogeneous groups and a third, more diffuse, group that lies at a remove from them. The third group of sites represents the Southern group of varieties, colored light blue and yellow, and is much more heterogeneous than the rest of the data. Closer inspection of the MDS plot in Figure 3 also shows that this group of dialects has a particularly unclear border with the Eastern dialects, which could explain the results of the noisy clustering applied to the whole data set.

Since different algorithms gave different divisions of sites, we used a consensus dendrogram in order to detect the clusters on which most algorithms agree. Since single link, UPGMC and WPGMC had turned out to be inappropriate for the analysis of our data, they were not included in the consensus dendrogram. The consensus dendrogram drawn using complete link, UPGMA, WPGMA and Ward's method can be seen in Figure 8. The names of the sites are colored based on the expert's opinion, i.e. the same as in Figure 2. The dendrogram shows strong support for the East-West division of sites, but no agreement on the division of sites within the Eastern and Western areas.

Figure 7: MDS plot of 6 clusters produced by WPGMA. Note that the good separation of the clusters is often spoiled by unclear margins.

At this level of hierarchy, i.e. the 2-way division, there are several sites classified differently by the algorithms and by the expert. These sites lie along the 'yat' border and represent marginal cases. The only two exceptions are villages in the South-East, namely Voden and Zheljazkovo. However, according to many traditional dialectologists these villages should be classified as Western dialects due to the many features that they share with the dialects in the West (personal communication with Prof. Vladimir Zhobov). The four algorithms show agreement only at the very low level where several sites are grouped together and at the highest level. It is not possible to extract any hierarchical structure that would be present in the majority of the four analyses.

6 Discussion and conclusions

Different clustering validation methods have shown that three algorithms are not suitable at all for the data we are working with, namely single link, UPGMC and WPGMC. The remaining four hierarchical clustering algorithms gave different results depending on the level of hierarchy, but all four algorithms had high agreement on the detection of two main dialect areas within the dialect space.
At the lower level of hierarchy, i.e. where there are more clusters, the performance of the algorithms is poorer, both with respect to the expert opinion and with respect to their mutual agreement. As shown by noisy clustering, the 2-fold division of the Bulgarian language area is the only partition of sites that can be asserted with high confidence.

The results of the neighbor-joining algorithm were a bit less satisfactory. The reason for this could lie in the fact that our data is not tree-like, but rather contains a lot of borrowings due to contact between different dialects. A recent study (Hamed and Wang, 2006) of Chinese dialects has shown that their development is not tree-like and that in such cases the usage of tree-reconstruction methods can be misleading.

The division of sites done by the k-means algorithm corresponded well with the expert divisions. The two- and three-way divisions also correspond well with the divisions of the four hierarchical clustering algorithms. What we find more important is the fact that in the divisions obtained by the k-means algorithm into 2, 3, 4, 5 and 6 groups, the two-way division into the Eastern and Western groups is the only stable division that appears in all partitions.

This research shows that clustering algorithms should be applied with caution as classifiers of language dialect varieties. Where possible, several internal and external validation methods should be used together with the clustering algorithms in order to validate their results and make sure that the classifications obtained are not mere artifacts of the algorithms but natural groups present in the data set. Since the performance of clustering algorithms depends on the sort of data used, evaluation of the algorithms is a necessary step in order to obtain results that can be asserted with high confidence.

The fact that there are only two distinct groups in our data set that can be asserted with high confidence, as opposed to six found in the traditional atlases, could possibly be due to the simplified representation of the data (see Section 3). It is also possible that some of the features responsible for the traditional 6-way division are not present in our data set. At the moment, we are investigating these two issues. Regardless of the quality of the input data set, we have shown that clustering algorithms will partition data into the desired number of groups even if there is no natural separation of the data. For this reason it is essential to use different evaluation techniques along with the clustering algorithms.

Classification algorithms are nowadays applied in different subfields of the humanities (Woods et al., 1986; Boonstra et al., 1990). Clustering is a general technique that can be applied to any sort of data that needs to be put into different groups in order to discover various patterns. Document and text classification, authorship detection and language typology are just some of the areas where classification algorithms are nowadays successfully applied. The problem of choosing the right classification algorithm and obtaining stable results goes beyond dialectometry and is present wherever clustering is applied. For this reason the present paper is valuable not only for research done in dialectometry, but also for other branches of the humanities that use clustering techniques. It shows how unstable the results of clustering algorithms can be, but also how to approach this problem and overcome it.

References

P.
Black (1973), ‘Multidimensional scaling applied to linguistic relationships’, in Cahiers de l’Institut de Linguistique Louvain, Volume 3 (Montreal). Expanded ver- sion of a paper presented at the Conference on Lexicostatistics. University of Mon- treal. E. Bonet and Y. V. de Peer (2002), ‘zt: a software tool for simple and partial mantel tests’, Journal of Statistical software, 7(10), 1–12. O. Boonstra, P. Doorn, and F. Hendrickx (1990), Voortgezette Statistiek voor Historici (Muiderberg). B. S. Everitt (1980), Cluster Analysis (New York). J. Felsenstein (2004), Inferring Phylogenies (Massachusetts). H. Goebl (2007), ‘On the geolinguistic change in Northern France between 1300 and 1900: a dialectometrical inquiry’, in J. Nerbonne, T. M. Ellison, and G. Kondrak, eds, Computing and Historical Phonology. Proceedings of the Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology (Prague), 75–83. 21 M. B. Hamed and F. Wang (2006), ‘Stuck in the forest: Trees, networks and Chinese dialects’, Diachronica, 23(1), 29–60. J. A. Hartigan (1975), Cluster algorithms (New York). W. Heeringa (2004), Measuring Dialect Pronunciation Differences using Levensthein Distance (PhD Thesis, University of Groningen). W. Heeringa, J. Nerbonne, and P. Kleiweg (2002), ‘Validating dialect comparison methods’, in W. Gaul and G. Ritter, eds, Classification, Automation, and New Media. Proceedings of the 24th Annual Conference of the Gesellschaft für Klassifikation, University of Passau, March 15-17, 2000 (Heidelberg), 445–452. L. Hubert and P. Arabie (1985), ‘Comparing partitions’, Journal of Classification, 2, 193–218. A. K. Jain and R. C. Dubes (1988), Algorithms for Clustering Data (New Yersey). P. Legendre and L. Legendre (1998), Numerical Ecology, second ed. (Amsterdam). C. Manning and H. Schütze (1999), Foundations of Statistical Natural Language Pro- cessing (Cambridge, MA). N. Mantel (1967), ‘The detection of disease clustering and a generalized regression approach’, Cancer Research, 27, 209–220. H. Moisl and V. Jones (2005), ‘Cluster analysis of the Newcastle Electronic Corpus of Tyneside English: a comparison of methods’, Literary and Linguistic Computing, 20, 125–146. J. Nerbonne, P. Kleiweg, W. Heeringa, and F. Manni (2008), ‘ Projecting Dialect Dif- ferences to Geography: Bootstrap Clustering vs. Noisy Clustering’, in H. B. Chris- tine Preisach, Lars Schmidt-Thieme and R. Decker, eds, Data Analysis, Machine Learning, and Applications. Proc. of the 31st Annual Meeting of the German Clas- sification Society (Berlin), 647–654. 22 J. Nerbonne and C. Siedle (2005), ‘Dialektklassifikation auf der Grundlage Ag- gregierter Ausspracheunterschiede’, Zeitschrift für Dialektologie und Linguistik, 72(2), 129–147. W. M. Rand (1971), ‘Objective criteria for the evaluation of clustering methods’, Jour- nal of American Statistical Association, 66(336), 846–850. N. Saitou and M. Nei (1987), ‘The neighbor-joining method: A new method for recon- structing phylogenetic trees’, Molecular Biology and Evolution, 4, 406–425. R. R. Sokal and F. J. Rohlf (1962), ‘The comparison of dendrograms by objective methods’, Taxon, 11, 33–40. S. Stojkov (2002), Bulgarska dialektologiya (Sofia). J. A. Studier and K. J. Kepler (1988), ‘A note on the neighbor-joining algorithm of Saitou and Nei’, Molecular Biology and Evolution, 5, 729–731. A. Woods, P. Fletcher, and A. Hughes (1986), Statistics in Language Studies (Cam- bridge). Y. Zhao and G. 
Figure 8: Consensus dendrogram for the four algorithms. The four algorithms show agreement only on the 2-way division; it is not possible to extract any hierarchical structure that would be present in the majority of the four analyses. (For the explanation of colors see Figure 3.)

work_6le2npbvkveupkv3od5vq5edau ---- Deference to Paper: Textuality, Materiality, and Literary Digital Humanities in Africa

How to Cite: Yékú, James. 2020. "Deference to Paper: Textuality, Materiality, and Literary Digital Humanities in Africa." Digital Studies/Le champ numérique 10(1): 15, pp. 1–27. DOI: https://doi.org/10.16995/dscn.357. Published: 30 December 2020.

RESEARCH

Deference to Paper: Textuality, Materiality, and Literary Digital Humanities in Africa

James Yékú, University of Kansas, Lawrence, KS, US (jyeku@ku.edu)

This article explores the relationship between the forms of representation and the modes of production of African online writing, rendering visible an appreciation of the digital contexts that have occasioned new experimentations with regard to genre, style, and the formation of digital publics. But it is the bibliographic form and materiality of African digital texts that interest me the most. The medium of African literature, profoundly transformed since the arrival of the Internet, conditions the transmission and reception of meaning, the constitution of reading publics, and the identity of audiences. More so, the new textual environments of African writing and creative expression, such as digital networks and literary blogs, are terrains of discursive contestation that activate these digital fields of cultural production in Africa as materially connected to prior literary forms. Deploying several examples, including the recent 'transition' of Saraba Magazine to print and Mike Maphoto's Diary of a Zulu Girl, I argue that, despite the widespread uses of digital media forms in new African narratives, there is a lingering print imaginary in the digital articulations of African literary texts. While this tension between print and digital forms shapes textual meanings, it signals new directions in African literary studies more broadly.
Keywords: Materiality; African literature; digital media; textuality; print

Introduction

Few studies on postcolonial African literary productions online have focused on the digital environments and materialities of many born-digital literary texts by African writers. For scholarly works that present critical analyses of these important domains, the emphasis is more on content than on the form and physical features of the medium of expression. Ideas from bibliographic and textual criticism can be beneficial to studies of the African digital literary spaces that produce multiple textual forms, and which remain strongly linked to print culture. I use "textual criticism" not as a synonym for "literary criticism," but in the sense of Thomas Tanselle (1990), who in "Textual Criticism and Deconstruction" writes of the ambiguity that emerges when both are used interchangeably. For him, "textual criticism" has traditionally meant the scholarly activity of studying the textual histories of verbal works in an effort to propose reliable texts of those works according to one or another definition of correctness. As I will show throughout this paper, my interest is not to focus on a single digital text or the remediated form of a print text for my analysis; the point is to use several examples which signal the material implications of an enduring print consciousness permeating African digital spaces. It is thus imperative to begin a conversation on new works that exist purely through the medium of the web, texts whose ontology is shaped by digitality and that later appear as printed texts. As the Internet is increasingly central for the circulation of African creative expressions in Web 2.0 spaces like literary blogs and social media, we need to explore connections between the material form of these spaces and the variations of the texts produced.
15, page 3 of 27 expressions in Web 2.0 spaces like literary blogs and social media, we need to explore connections between the material form of these spaces and the variations of the texts produced. In her work on the emergence of digital literary studies, Amy Earhart writes that a unique approach of the digital humanities is the decentering of print scholarship which is “beginning to wield less power in shaping the area as blog posts, tweets, listserv discussions, and digital projects gain attention” (2015, 6). While a similar situation may be found in digital literary studies in Africa, print culture lingers as a solidified aspiration of digital literary writers who mostly encounter the digital as an experimental space for their creativity. If as Karin Barber suggests, “literature is a social product and bears the imprint of the conditions of its production” (2007, 432), we can seek to understand the ways in which digital technologies shape the form, genres and texts of African literary texts circulating online. The platformization of African literature, evident in several literary blogs devoted to publishing new writers, has meant that spaces such as Saraba and Jalada have become central in the recent invigoration of literary sensibilities on the continent. The importance of the digital moment in African literature is asserted by established writers like Chimamanda Adichie (2013) who not only experiments with literary blogging in her novel, Americanah but also makes digital affordances central narrative techniques. Structured around social media as a meta-fictional space from which her protagonist, Ifemelu expresses grim perspectives about race, the novel demonstrates a range of textualities that highlight how print and digital poetics actually converge, rather than displace each other in contemporary works from the continent. New platforms of African writing not only assure the visibility of new names, but also accommodate novel forms of authorial agency that enriches African cultural productions more generally. Theoretically, the field of African digital literary studies itself is responding to this digital explosion of literary agency, with an increased engagement with the Internet as the medium of new literary voices. Shola Adenekan’s 2012 inaugurating dissertation in the area frames this process as “the Internetting of African literature” (11) and is the basis for his forthcoming monograph on African literature in a digital Yeku: Deference to PaperArt. 15, page 4 of 27 age. The new volume addresses class and sexual politics in online writing from Kenya and Nigeria and provides an analysis of digital literary networks and their importance to the understanding of literary history in both countries (Adenekan 2021). In the 2012 study, Adenekan argues that the platformization of literary craft in the Kenyan and Nigerian literary contexts arguably started from the mid to late 1990s when writers seeking to draw attention to their printed work started posting poems and short stories on e-mailing lists, such as Krazitivity and Ederi, and other similar listservs hosted by the likes of Yahoo and the now defunct Geocities. In more recent years, newer online spaces, such as Brittlepaper and Africaisacountry have appeared as platforms from which new writers produce literary and cultural discourses on Africa. 
In the last ten years, Brittlepaper, in particular, as an Afrocentric space of literary circulation, has consolidated its status as the most formidable literary portal into creative writing on the continent. In line with Adenekan's work, I analyzed an online fan fiction of Chinua Achebe's 1958 novel Things Fall Apart that was published by Brittlepaper in 2017, demonstrating that these digital platforms of African literature have created new roles for readers of African fiction who "recast the monologic frame and individual authority of print," resulting in "a decentered medium, structured on the logics of interactivity and participatory culture" (2017, 262). The reading experience is now such that it allows people to easily respond to literary texts and even engage directly with writers themselves, even as new reading habits are formed. My goal in that article was to show that the structure and design of Web 2.0 platforms and applications, such as blogs and social media, democratize and decenter authorial space, opening up more user-focused engagements and reformulations of traditional modes of cultural representation. Around the same time, Zahrah Nesbitt-Ahmed (2017) likewise argued that an explosion of new technologies has meant a completely new way of reaching and interacting with an African audience that not only changes the traditional gatekeepers of literature on the continent but enables creators and consumers of African literature to reclaim, and then reframe, their own narratives (387). A more recent paper stresses the role of social media in amplifying articulations of reader agency in African literary texts. Exploring the intersection of digital culture and African writing on the Internet, another study uses social media commentary and reader responses to Chinua Achebe's war memoir, There Was a Country, to track the nature and functions of the digital publics of African literature (Yékú 2019, 2).

I have outlined these studies to acknowledge the existing theoretical engagements with a rapidly expanding field of African literature in a digital age and to counter any critical tendencies that obfuscate the presence of African digital subjects both as producers and critics of new digital genres. Indeed, Stephanie Santana's brilliant exploration of serialized fiction from southern Africa published on Facebook and blogs shows that the digital is "helping to foster multiple cyberplaces in which new literary forms and indigenous languages are thriving" (2018, 187). In a work that is closest to the topic I examine in this article, digital literary forms are asserted as influencing from local, national, and regional zones, with diverse audiences of these online fictional forms drawing from broader African, diasporic, and global audiences. Also, this theoretical excursion gestures towards the relevance of my present interest: that the close connection of print culture to the digital informs different forms of textuality which mandate an appreciation of bibliographic and textual criticism, areas that are undertheorized in African literature more generally. The making of literary texts, in terms of, for example, the archival materials and manuscript drafts of several pioneering African writers, offers a glimpse of the changes that occur in the process of textual production and circulation.
One recent iteration of this materialist approach in African literature is Nathan Suhr-Sytsma's essay "Christopher Okigbo's Materials," a work that investigates the poetics and compositional practice of the Nigerian modernist writer Christopher Okigbo through unpublished drafts (2020). For my own reflection, I am interested in the conceptual implications of the bibliographic variations that emerge when African literary texts first exist online and later in print. After examining the 2017 "transition" of Saraba Magazine (2020) to print, I will proceed to use Mike Maphoto's Diary of a Zulu Girl (2013) to reflect on how different platforms of African online writing both shape the reading of literature and provoke new textual forms. I am hoping to use this article to explore the relationship between the forms of representation and modes of production of online African writing, showing how print and digital platforms can complement each other in a context in which print remains an unconscious or perhaps tactical objective. As several African creative writers connect and nurture relationships with readers by effectively leveraging social media and real-time collaboration tools, we are invited to appreciate why the Internet "serves as a test-bed for work that may later go into print" (Adenekan and Cousins 2014, 139). This possibility that works produced online may end up in print is what I set out to explore in this article, particularly in the framework of how ideas in textual and bibliographical criticism intersect with African writing online.

Some of the ideas expressed here are of course not new, but I do wish to restate what I believe is a lingering print imaginary in the digital articulations of African literary expression. This is necessary in the framework of African digital studies, in which conversations around the text, in either its print or digital iteration, are limited in terms of its materiality and bibliographic form. A print imaginary refers to the ways in which print practices continue to undergird and inform digital expressions of literary agency. This persistence of print is informed by the popular notion that the digital realm is a domain of impermanence and instability. As Katherine Hayles observes, a print-centric perspective persists in the ways electronic writing and textuality are discussed (Hayles 2005, 37). This is evident among African writers online. I am interested in how these oft-repeated arguments in the digital humanities manifest in the particular contexts of online writing from several African countries. A print consciousness continues to shape the digital expression of works produced via electronic platforms, despite the tendency to romanticize the digital that we see in the ideas of the Nigerian online literary critic Ikhide Ikheloa. I am suggesting that the mobility from the digital to print is itself something we need to investigate, since print in African writing remains pervasive despite our celebratory posture toward the digital. Because of the vast amount of literary data from blogs and platforms on African content, it is reasonable to assume that, currently, the digital is the chief medium of literature. However, considering present realities that show simultaneous investments in both print and online writing, this may be something of an essentialization of digital media, one that is impervious to the class politics of the digital divide on the continent.
According to 2019 data from the ITU, a United Nations specialized agency for information and communication technologies, 3.6 billion people around the world still lacked online access, with Africa as the region with the lowest rate of access (28.2 per cent). The ITU data revealed a growing Internet uptake, with 4.1 billion people now online, but also a widening digital gender divide. While we need to acknowledge the important contributions of literary blogs and platforms that give prominence to previously unrecognized creative voices, we need to avoid fetishizing views of the medium as the new, mainly uncontested space of creative writing. That writers use digital media as creative and experimental terrains for rendering visible their creative genius has received attention from others. New African voices online "use this space to overtly attack the 'single story' of African representations outside Africa through the modality of their Facebook status updates and in much of the online fiction they post on social media networks" (Adenekan and Cousins 2014, 14). While the focus on Saraba Magazine enables me to reflect on this persistence of print, Maphoto's experimentation with his literary blog allows me to examine more fully the ways in which the interactions of both print and digital platforms can structure meanings differently.

Saraba magazine's "deference to paper"

Saraba as a literary magazine published its first online issue in 2009 and has since then aimed to "create unending voices" by publishing the finest emerging writers, with a focus on writers from Nigeria and other parts of Africa. What I undertake in this section is to use an important digital literary space in Africa to make a point about print, rather than use the magazine's foray into print to undermine the platform. A major platform's turn to print proves the point that the digital often serves only as a testbed for literary works that ultimately aspire to print. From the editorial page of its first printed issue, the magazine expresses a commitment to "publish at least one print issue each year," writing that "the idea, as we have done here, is to anthologize new writing, squarely on the premise of promise; as though this is a document with which readers might return to understand how the featured contributors have prospered in their calling as writers" (2017).

I am using this Saraba example to make legible the many other literary works that circulate on blogs and social media which, though born-digital, desire the conditions of print, since the digital, as a testbed for the imagination, produces writing as an aesthetic of contingency. Contingency manifests as writing emerges through parameters of transience that subject online texts to constant emendation. This is particularly the case as texts evolve because of the multiplicity of those participating in their emergence. Although this particular Saraba example may seem to be only one instance, there are other similar gestures in Africa's literary communities on the Internet. For instance, we may think about another platform, African Writing Online, which published some of today's most important writers in contemporary African literature long before Web 2.0 became a converging point for cultural productions on the continent.
The Nigerian writer Chuma Nwokolo, one of the pioneering founders of the platform, told me via social media on April 15, 2015 that their platform "was always meant to be hybrid" in the sense that digital works could later be published in print as a strategy of making African Writing Online "more sustainable without depending on grants and the like" from patrons (Nwokolo 2015). Therefore, although it is a foundational platform of African online writing, the magazine has had about five print issues, and literary texts from many of its writers have appeared in print. The material differences between online poems by Niyi Osundare or Jack Mapanje and their print iterations deserve more scholarly attention than they currently receive.

Saraba indicates that print is still very much entangled with digital environments, thus enabling a richer understanding of the production of texts in a material and bibliographical sense. As Figure 1 indicates, what the editors call a "deference to paper," which I explore more below, provides a sufficient metaphor for the kind of conversations and analyses on African bibliographical criticism which we need to be having as texts move back and forth between print and digital platforms.

Figure 1: Saraba Magazine, "The Transition Issue – A Note from the Editors" (see Saraba Magazine 2017).

We can begin unpacking the implications of this post by Saraba magazine by examining how the editors' image of a reader who dives "into something of beauty" rests on a gross misunderstanding of the act of reading in a digital environment. The language the editors employ invokes the physical affordances of print as a domain the reader dives into. On the other hand, digital reading is an interaction with a screen environment that transforms perception and the acts of comprehension and interpretation. Readers express the cognitive demands of print reading differently in a digital space, in which non-linearity, speed reading, and browsing are reading strategies typically employed by digital natives. In light of their desire to treat "the magazine as an aesthetic object," the editors at Saraba reinforce the idea that digital reading does not offer the kind of stability and permanence associated with reading print. The assumption here is that electronic texts are not as stable as printed works, even if the supposed fixity of print is an idea many scholars, including Matthew Kirschenbaum (2002), have challenged. In what they acknowledge as their "deference to paper," the editors at Saraba suggest that literary meaning in digital texts may be diminished and, in fact, not organized around any aesthetic impulse. What could become evident is an unintended and ironic devaluation of the literary merits of the works on platforms like Saraba itself. Again, this language offers the sense that in a digital environment a reader or user cannot possess "a lingering touch" or "a felt presence" of the kind enabled by the reading experience of print technology. It is mere subjective speculation to propose, for instance, that a Kindle edition of Ayobami Adebayo's debut novel Stay with Me (2017) presents a less "lingering touch" than its various print editions. Whether it is print or digital, the materiality of both media offers different textual opportunities that constitute the reading experience, although materiality in a digital space is apparently of a haptic yet intangible kind.
Also, the idea of "a lingering touch" invokes narratives of materiality regarding the digital text. Matthew Kirschenbaum (2002) calls this supposition, that a digital text or artefact cannot be material because one cannot reach out and touch it, an instance of the tactile fallacy (43). Kirschenbaum, alongside other members of the textual community, such as Johanna Drucker (2013), invites critics to understand the differences between print and electronic texts, arguing for a thorough understanding of their materiality. Scholars who treat the role-playing game narratives BioShock or World of Warcraft as important expressions of digital literature can hardly be convinced that there is not a felt presence in these interactive narratives. Beyond an uncritical suggestion that consolidates a bias for print, Saraba's turn to a printed edition is in fact "a deference to paper" that emanates from the assumption that electronic literature cannot constitute a sufficient textual aesthetic for readers. Their view restates the problematic idea that the digital environment is inhospitable to literature. Against the assumption that the Internet reduces literary texts to the surface spectacles of digital interfaces and typefaces, or to any other material structure of the electronic medium, the digital does not depend solely on language for the conveyance of its aesthetics and narratives. It also looks to its own materiality and navigational apparatus as part of semantic transmission, to the visual strategies of presenting data and metadata, and to the media itself. In other words, we need to pay closer attention to how the materiality of the text is central to the aesthetics and the reading experience of the reader.

In a 2020 email interview, I asked the Nigerian poet Rasak Malik Gbolahan, who uses social media to circulate his works and has published his poetry in Saraba, what he thought about Saraba's print edition and about his own reading practices and preferred medium. He elaborates on the idea of print culture as a more enduring space:

There is something magical about holding a book. The magic emanates from the smell of each page, the letters gracing each page, and the book in its entirety. Also, I prefer to carry books around with me. They are like passports to a new city, to a world inhabited by people intoxicated by the opium of stunning sentences. When I hold a book, I feel a certain connection to it. This extends to every genre. For instance, in reading a poetry book, I find it easier to mark a page, or underline a line. I am quite focused in reading a print book than reading online. This is not to undermine the power of digital literature, but hard copies are always tugging my heart, offering me the liberty to embrace them. (Gbolahan 2020)

Although it appears a mystifying logic of fetishization is being extended to print technology, Gbolahan's argument may suggest that the transitioning of literary texts from their original digital domain to print is symptomatic of a larger practice in African digital literary culture: the persistence of print in the imaginaries of cultural producers, one that is probably undergirded by an unstated capitalist impulse to circulate print copies and recoup investments. Though also evident in other cultural and writing traditions elsewhere, it undercuts the sometimes unacknowledged over-celebration of digital platforms in African literary discourses.
What is at stake here is perhaps an unwillingness to appreciate the ways in which linguistic texts, images, and other formal transmitters of meaning can have narrative power based on a graceful form that does not hinder the fullest expression of aesthetic content. It is ironic, though, that the presentation of content on the Saraba website is carefully designed to render the graphic an aesthetic appeal that contributes to the semiotics of the creative works posted on the online magazine. Put simply, despite the numerous sites, blogs, platforms, and data on African online literature, a deference to paper by one of Africa's leading online literary platforms aptly symbolizes the persistence of print in an otherwise saturated digital environment. This is not to suggest that Saraba Magazine is denouncing the digital or that it does not trust the aesthetic abilities of digital technologies. While the editors may not necessarily romanticize the printed version, their transition does suggest the fixation with print that I argue is still largely to be found among some digital actors. Being able to access the printed version in digital format on online publishing platforms like OkadaBooks can also mean that the material complexities of both the printed and the digital versions warrant an exploration of the texts produced from a perspective that recognizes the physical peculiarities of both platforms.

Diary of a Zulu Girl

My second example is Nkululeko Maphoto's literary blog Diary of a Zulu Girl, which started in 2013 as a fictionalized presentation of sexual politics and crime in urban South Africa. This online diary presents the opportunity to explore the re/imagination of African literary practices on Web 2.0 platforms, enabling a focused attention on the many ways social media and literary blogs signify as new media spaces in which new African writers express literary talent. Maphoto's Diary of a Zulu Girl is an important work of popular culture that stages the materiality of the text of online African writing in fascinating ways, aside from its alertness to the poetics of the medium and the dialogic possibilities of social networks. The Diary of a Zulu Girl is a fictional blog about 19-year-old Thandeka Mkhize, who leaves her small town in Mooi River to study law at Wits University in South Africa. The diary details her experiences in the big city, where she is introduced to older men, from Nigeria mostly, who buy her drinks and expensive clothes in order to exploit her sexually. The Diary chronicles the story of the average cosmopolitan girl who must navigate the social pressures of living in urban South Africa. Maphoto's narrative is in the form of an online diary, serialized as a blog, and therefore gets his readers to interact both with the story and with one another. Like many other recent expressions of African online writing that use the authorial affordances of social media to produce and sustain "networked publics" (Ito 2008), the Diary of a Zulu Girl brings readers and the writer into a shared media space in which the boundaries between author and reader are almost nonexistent. Whether it is on Maphoto's blog or on his social media accounts, the medium structures literary meaning and conditions reader behavior and responses.
In the seventh chapter of the e-book, which almost appears to reproduce the entry in the original Facebook post, Thandeka the protagonist is made aware of the dialogic space in which she has been constructed when she appears to step outside of her original fictive environment and negotiates some dignity for herself:

By the time I went to university I had slept with two guys and yes that sounds sluttish to some but count how many people you have slept with before you judge. If you have more than 5 in five years then I guess we in the same boat (Maphoto 2013).

In this passage, Thandeka Mkhize asks her readers and the potential moral judges of her sexual choices to look at themselves in the mirror before judging her erotic proclivities. Maphoto presents a character that is aware of the readerly gaze of the online reader glued to the screen of a mobile phone or a desktop. Thandeka is rendered conscious of a need for a dialogue with the readers of the story in which she exists and signifies an awareness of the medium in which her subjectivities have been constructed. While both the initial Facebook entries and the e-book version have an easy-flowing style, the obvious spelling and grammar errors reveal that Maphoto's story has not been well edited, indicating the fluidity and mutability of a work whose language is in a perpetual state of flux. Also, as evident in this narration by the protagonist in the seventh chapter of part 1, the original blog entries were riddled with grammatical glitches and stylistic incongruities that are more effectively presented in the print edition: "I was not born poor but I cant say I was born rich either. My parents are both teachers in Mooi River halfway between Johannesburg and Durban…Yes I do not stay in the rural parts of it but its still deeply cultured with rules and traditions that go deep."

Aside from the fact that the print edition polishes the linguistic glitches of the digital versions, another major difference lies in material and physical properties. As Stephanie Bosch Santana observes, the success of Maphoto's blog is due, in general, to its "exploitation of strategies associated with both the digital and print realms." Employing the anonymity of the online space to gain initial traction for the diary, Maphoto later revealed his identity and cast himself in the more traditional role of "author" rather than "blogger" (Santana 2018, 192). What is significant from the perspective of bibliographic and textual criticism is how his fictional work usefully illustrates that the book object, or the text, is usually open to emendation. Seeing the edited print version of the diary as merely polishing the unofficial language of the web version obscures the changes in both. Accounting for these textual alterations is important. Traditionally, emendation is designed to arrest the influx of error and corruption in textual transmission. It is the editorial intervention a text undergoes when it is disturbed by a fault. Emendation eliminates errors and repairs the text where its record of authority is deemed to be interrupted and broken (Gabler 1993, 201). The print edition of the Diary of a Zulu Girl (2015) introduces variations and corrections, since there are significant changes, in both accidentals and substantives, relative to the blog versions and digital editions of the original text.
The textual critic of this text would, therefore, be interested in how the transmission of writing from digital formats to print affects the different versions and editions of these texts. A digital critical edition of Maphoto's Diary of a Zulu Girl, for instance, will attempt to examine how the book differs considerably in its material form from the blog. Such a materialist reading of the book as an object with physical properties that contribute meaning is not common in African literary circles, and it is one that I hope the imbrications of print and digital media can facilitate. As Maphoto's online fiction morphs into a printed book, as well as a planned television drama that is still being expected, it forces us to appreciate how a text can change its form and, hence, its meaning and audience when its medium of expression changes. In Figure 2, we get a sense of the text on both Facebook and WordPress.

Figure 2: Screen grabs of the Facebook and blog versions of Maphoto's Diary of a Zulu Girl (see Maphoto 2013).

Maphoto's blog has been described as "something of a digital literature phenomenon" (Santana 2018). Literary forms such as the Diary proliferate on social media and blogs as a genre of popular culture that asks us to rethink the ways African writing is being reconfigured by Web 2.0 platforms. Areas impacted include author-reader interactivity as well as the digitally enabled fluidity of these identities. It invites us to appreciate anew how traditional authors use platforms such as Facebook pages and profiles as alternative spaces for the articulation of creative expression. In a conversation with Jeanette Chabalala of City Press, Maphoto explains that "readers have not stopped lapping it up and keep begging for more of the sex, drugs, lies and betrayal that fuel the story lines... What is fascinating is that parents contacted me and requested the blog to be turned into a book because they could not read it on their phones" (Chabalala 2013). That parents "requested the blog to be turned into a book" may be read, along with the many responses to the fictional diary, as demonstrating the spontaneous nature of interactions on social media, but it also raises a question of age and class, since the reading space of the net could be tenuous for older people with limited access to the Internet. In another context, the request for a book also recalls Ikhide Ikheloa's question about books in African writing: "how is Africa viewed everywhere, if not primarily through hard-copy books?" (2013). While it is productive to imagine Ikheloa's question as an exploration of the persistent focus on print culture as the major medium of the canons of African literature, it is also essential to discern the logic of materiality and the changing modes of the production of texts inherent in his question. In his response, Ikheloa locates his concerns not only within a trajectory of book history in Africa that is unsettled by the intractable problems of indigenous publishing houses, but also in a hasty dismissal of print culture's capacity to coherently represent the African condition. Ikheloa argues on the USA-Africa Dialogue listserv that "the book [understood as print] is an inappropriate gauge of Africa's stories, history and circumstances. You would have to look to the great book in the sky, the Internet, to have a well-rounded view of our world, not just Africa" (Ikheloa 2013).
While Ikheloa, in the tradition of several other techno-optimists, does indeed fetishize the Net and essentialize "the great book in the sky" in his reflection on the medium of Africa's stories and narratives, his observations are largely reasonable in signaling a need to recognize the close associations between print and digital texts.

The materiality of text

As the work of Karin Barber (2007) shows, African theories of textuality often invoke oral poetics, something that is relevant in the framework of the regeneration of meanings in literary texts in both print and digital realms. The major moments of textual representation in African knowledge production include the oral text, best demonstrated by its context of performance; the printed text that emerged in early twentieth-century Africa through the works of early publications; and, finally, an electronic textuality of African literature that is gaining traction among scholars of what Walter Ong (1991) refers to as secondary orality. An expanded perspective of the text as any site of discourse, therefore, finds precedent in African oral frameworks that identify the text as a mutable entity produced by social and technological developments that affect its aesthetic and literary production and appreciation (Olorunyomi 2006, 137). In terms of digital texts, Shola Adenekan (2012) reckons that the movement of texts and writers across different mediums signals "an important way through which some of the emerging African voices negotiate the relationship between their works, themselves, Africa and the outside world" (17). Adenekan's argument is a useful commentary on the nature of textuality in contemporary African literature. That said, there are more significant bibliographical issues the textual critic might be interested in, including what this "movement of texts and writers across different mediums" means for literary interpretation. That seems to me to be pertinent for an appreciation of the material history of the African literary text today. His work highlights the emergence of online literary magazines which focused principally on the publication of short stories, essays, and poems that appeal to a reading public that is equally online. Blogging and social media sites also give young African writers more platforms to publish works that did not previously exist outside of the computational space. Critical evaluations of such work must be mindful of the structure and materiality of their new medium. This seems to me a crucial point to emphasize, as it points to what Matthew Kirschenbaum identifies elsewhere as first-generation objects. A "first generation electronic object," writes Kirschenbaum, "is one that enjoys no material existence outside of the electronic environment of a computational file system—though this is not…to say that such objects enjoy no material existence at all" (2002, 20). In relation to African literature, it is important to engage more conceptually with not just works that exist purely through the medium of the web but also the forms of textuality propelled by digital technologies, particularly in terms of their ontology and materiality. Texts, whether written, oral, or digitally transmitted, can help us understand the political contexts of their producers as well as the publics that emerge from and cohere around them.
However, beyond questions of class and the political touchstones of texts online, the various formal changes that accrue from a digital-to-print move need to be foregrounded too. Significant emendations involving a wide range of accidentals and substantives are therefore usually present in the transformation from electronic formats to print texts. Adenekan explains that this process of textuality "involves reshaping the text for different formats, and in the process the creative piece is unfixed" (2012, 13). This idea of an unfixed, sliding, and impermanent textuality, obvious in the Saraba example, is reiterated in the view that these digital texts may later appear in print.

The traditional obsession with canonical texts "has blocked our view of the real historical processes at work in the emergence and spread of literary forms" (Barber 2007, 40). There is a sense here in which Barber's assertion calls for a study of genre and textuality that is wide and varied in its orientation. While this view can inform an appreciation of the ways a book's production history and its various editions help us understand the sociocultural contexts of writers and their works, the form and physical shape of the text itself is the focus of my analysis. Barber's argument connects well to D. F. McKenzie's description of text. In the framework of textual and bibliographical criticism, McKenzie (1999) broadly conceptualizes the term in the famous 1985 Panizzi lectures, defining 'texts' to include verbal, visual, oral, and numeric data, in the form of maps, prints, and music, of archives of recorded sound, of films, videos, and any computer-stored information (13). McKenzie's perspective on textuality may be read alongside other theories of textuality in the core digital humanities tradition, including perennial names such as Katherine Hayles (2005) and Jerome McGann (2004). For some of these early DH scholars, a textual object, whether in a print or an electronic context, demands attention not only to materiality, design, and physical features but also to the social contexts that inform the text. In other words, our engagement with the physical hardware of the text is essential to understanding its semiotic transmissions and diverse economies of meaning. Items such as page size, fonts, binding, and other "bibliographic codes" are as important as the linguistic codes and social context of the work. These ideas were originally put forward by Jerome McGann (2004), who in his discussion of the production of the scholarly edition of The Rossetti Archive asserts:

the apparitions of text—its paratexts, bibliographical codes, and all visual features—are as important in the text's signifying programs as the linguistic elements; second, that the social intercourse of texts—the context of their relations—must be conceived an essential part of the 'text itself' if one means to gain an adequate critical grasp of the textual situation. (McGann 2004, 11)

This view of textuality suggests that the text has a constructed character and that texts, together with documents, "are fields open to decisive and rule-governed manipulations" (McGann 2004, 2).
A description of the textual situation of Chinua Achebe's canonical text Things Fall Apart, for instance, would include numerous editions that present a wide array of bibliographical codes and visual features that form part of the algorithms and rules which control the text as a literary work that transmits meaning. The first edition of Achebe's novel was published in 1958 by Heinemann in London. Because of its significance as a strategic postcolonial response to the history of colonial discourse in Africa, many other editions of Things Fall Apart have been published and circulated since then. In 2009, Anchor Canada published a 209-page paperback edition of the novel, adding to several other paperback editions by Penguin, Oxford, and Norton. Although these various editions obviously differ significantly from the first edition by Heinemann in terms of material and physical features, there is not much critical work by Achebe scholars that foregrounds how these bibliographical variations, in terms of design, page number, and paper textures, can be one of the determinants of the text's semiotic impulses. The production of meanings in the text can no longer remain only at the level of close reading of linguistic content; it needs to be extended to the various physical elements that constitute the text. A scholarly critical edition of, say, a Kindle edition of Things Fall Apart would be interested in how it differs from earlier printed texts from the perspective of its electronic features, interface, and other aspects of its physical design. To be interested in a physical description of this Kindle edition is to take seriously how the presence of substantive elements, such as coding structure and paratextual features, makes this digital edition different from earlier editions. The code of this Kindle edition has a material existence whose structure is central to the ways online writing articulates meaning. These questions of form and physical elements are significant considerations in terms of the materiality and form of the text of African writing in the age of the Internet. In my 2017 article on Kiru Taye's 'Thighs Fell Apart,' an online fan fiction of Things Fall Apart, I demonstrate that an established African text is reconfigured by a work of digital literature that "articulates a paradigm of erotic fantasy not too familiar in canonical African literature" (2). The gendered politics of Achebe's original text is revisited by an online subject whose reading reproduces and extends the text's narrative boundaries. For instance, the spectacle of Okonkwo's wrestling with Amalinze the Cat in Things Fall Apart "is recast in 'Thighs Fell Apart' as sexual contest in which bodies clash in an ideological force field, with phallocentric might triumphing over female desire," forcing a rereading of one of Africa's famous fictional heroes. What is of more interest to me here is that by using blogging to produce a new text that remediates Okonkwo's identity, Kiru Taye also uses digital technology to materially reshape the original text. A critical edition of Achebe's novel may be linked with Taye's short story to complicate and trouble textual meanings in the original work. Another helpful example of the material transformations of textuality in the making of African literature is probably evident in the digital copies of literary pamphlets in the tradition of the famous Onitsha Market Literature.
These pamphlets, produced by local publishers in a popular Nigerian trading center of the 1960s, are composed of moral narratives, social discourses, plays, advice, and other popular stories, and have recently been digitized in different library holdings in the US. Emmanuel Obiechina traces the development of the Onitsha literature to the concentration of large numbers of locally owned and operated printing presses in the town, writing that "the influx in the 1940s of Indian and Victorian drugstore pulp magazine fiction" also shaped the format of pamphlet literature (2008, 119). Several digitized versions of the Onitsha Market pamphlets appear in the digital collection of the University of Kansas libraries and are held at the Spencer Research Library. The Onitsha Market pamphlets are a legacy of colonialism in Nigeria, representing, as Charlotte Nunes's article imagines, "the culturally textured crossroads of British colonial influence and the print record of a traditionally oral regional narrative tradition" (2015, 126). The digital archives at the Spencer library hold digital copies of the pamphlets (see Figure 3), which are important in the context of this essay because they demonstrate how the production of a new text, or even a remediated one, propels new material conditions that can enable a better appreciation of cultural meanings.

Figure 3: Digitized copies of the Onitsha Market Literary Pamphlets from the University of Kansas Library's Special Collections (see Onitsha Market Literary Pamphlets 2015).

Katherine Hayles notes that "the navigational apparatus of a work changes the work" and constitutes a "part of the work's signifying structure" (2005, 90). This suggests that an appreciation of these digital manuscripts and their meanings is also now dependent on the functional designs and algorithmic structures that shape them. Digital environments, as Hayles intimates, do not merely provide us with ways of encountering the texts; they are central to our critical reading of the texts themselves. This idea that meaning is altered when the medium is translated may not be new in the study of textuality in an African context; yet, considered from the framework of digitality, there is an enrichment of the discourse of African literary studies itself. The materiality of the new digitized pamphlet offers the most obvious example of difference, since the reading experience is now dependent on other textual codes and physical properties not evident in the original printed versions. Also, since the original print copies of the pamphlets differ in terms of meaning and interpretation from the digital copies, our reading experience of the text becomes varied. In How We Think: Digital Media and Contemporary Technogenesis, Katherine Hayles describes reading types in terms of a close reading that "correlates with deep attention" and a hyper reading that "includes skimming, scanning, fragmenting, and juxtaposing texts" and is "a strategic response to an information-intensive environment" we find ourselves in (2012, 12). Hyper reading produces a cognitive mode that eschews boredom in its preference for multiple information streams and a high level of stimulation. The Onitsha market pamphlets as print texts have animated research on African popular culture and print culture, but not much work exists on their iterations as digital texts.
As Figure 4 shows, I am currently creating an archive based on digital critical editions of several of the pamphlets, using Jekyll and Ed to repurpose digitized copies of the pamphlets from the University of Kansas special collections. Jekyll is a preservation-friendly website generator that requires no database, since it is static: all of the information displayed on a webpage is contained in an HTML file for that page. Ed, a Jekyll theme, is based on minimal computing principles and was designed for textual editions. My goal with this project is a scholarly digital edition that makes available a large corpus of text which yields more insights on Nigerian market literature, while offering ideas on what the digital transmission of texts may look like in the context of African literatures. The textual situations of the new literary pamphlets produced will have to incorporate the entire digital contexts and materiality of the new works.

Figure 4: A Jekyll-based Project on Market Literature from Nigeria (see Onitsha Market Literature 2.0 2020).
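To give a concrete sense of this workflow, the sketch below shows how a digitized pamphlet transcription might be converted into a page for such a static edition. It is a hypothetical illustration rather than the project's actual build script: the file paths and metadata fields are mine, and the assumption that the theme reads Markdown files with YAML front matter from a _texts folder (Ed's documented convention) should be checked against the Ed theme's current documentation.

```python
# A hypothetical sketch (not the project's actual code): wrap a plain-text
# transcription of a digitized pamphlet in the YAML front matter that the
# Ed Jekyll theme expects, so that `jekyll build` can render it as a static
# edition page. Paths and metadata fields are illustrative assumptions.
from pathlib import Path

def make_edition_page(txt_path, title, author, out_dir="_texts"):
    """Turn a transcription file into an Ed-style Markdown document."""
    body = Path(txt_path).read_text(encoding="utf-8").strip()
    front_matter = "\n".join([
        "---",
        "layout: narrative",  # one of Ed's edition layouts
        f"title: \"{title}\"",
        f"author: \"{author}\"",
        "source: \"Onitsha Market Literature, KU Spencer Research Library\"",
        "---",
        "",
    ])
    out_path = Path(out_dir) / (Path(txt_path).stem + ".md")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(front_matter + body + "\n", encoding="utf-8")
    return out_path

# Example use with a hypothetical transcription file:
# make_edition_page("transcriptions/sample-pamphlet.txt",
#                   "A Sample Pamphlet Title", "Anonymous")
```

Because the output is an ordinary Markdown file rendered to static HTML, the edition remains a plain, inspectable object, which is precisely the preservation argument made above for static site generators.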
Conclusion

The need to examine the impact of the physical properties of the text on the transmission of meaning has been the central idea of this work. To recast the title of Amy Earhart's book, the old is not just a mere trace in the new digital contexts of African online writing; it appears to be a desired condition. Despite a massive growth of digital publications in Africa, print culture not only remains solid as a major goal for writers who deploy digital technologies to circulate their works but also impels us to see how the reverse movement from digital to print creates multiple texts with different physical features. The medium of the text of African literature in the present moment has to be understood beyond the print technology that continues to permeate digital discourses. When Shola Adenekan and Helen Cousins argue in their discussion of class and online African writing that "cybertexts are not permanent," and that "like orature, the meaning of cybertexts is unfixed and subject to multiple interpretations" (2014, 11), they restate an assumption about the materiality of both print and electronic texts that needs to be highlighted: indeed, they praise impermanence, but only to the extent that it is the condition that makes the transition to print possible. Whether the medium is oral, print, or electronic, texts transform and are transformed by alterations in their technological conditions. How the materiality of African texts harbors part of their meaning signals an alertness to the importance of the medium.

The new digital spaces of African literature alter several aspects of literary conversations and interactions in Africa, even as they recall several poetics from oral tradition. There are several ideas one can uncover when reflecting on the critical implications of the digital reconfiguration of literary expressions in contemporary Africa. It is apparent that writers based on and outside of the continent are using digital media to alter the form of their writing and that the platforms of creative expression can inform the relationship between form and content, while transforming the relationship between writers and their publics. On different Web 2.0 environments, writers connect with other writers, and with new audiences who in turn share and transmit their works in their own digital networks. Indeed, the platforms themselves, as well as the workings of algorithmic protocols and databases, play an active part in the networked ecologies of social media. There is a democratization, a decentering, of authorial space, as African literature and its audiences are reconstituted in a dialogic space that assures a polyphonic assemblage of new authorial perspectives and reading publics in rhizomatic networks of literary relations and socialities. The interactive and connective logics of social media implicate these social arenas of literary networks and affinities as cultural terrains that consolidate interactive performances of writerly agency.

Editorial contributions

DSCN GO::DH 2020 issue Special Editors: Barbara Bordalejo (University of Saskatchewan, Canada) and Juan Steyn (North-West University, South Africa). Section Editor/Copy Editor: Darcy Tamayose, The Journal Incubator, University of Lethbridge, Canada. Bibliography Manager: Shahina Parvin, The Journal Incubator, University of Lethbridge, Canada.

Competing interests

The author has no competing interests to declare.

References

Achebe, Chinua. 1994. Things Fall Apart. New York: Anchor Books.
Adebayo, Ayobami. 2017. Stay with Me. Edinburgh: Canongate.
Adenekan, Shola. 2012. African Literature in the Digital Age: Class and Sexual Politics in New Writing from Nigeria and Kenya. PhD dissertation, University of Birmingham.
———. 2021. African Literature in the Digital Age: Class and Sexual Politics in New Writing from Nigeria and Kenya. Suffolk: Boydell & Brewer.
Adenekan, Shola, and Helen Cousins. 2014. "Class Online: Representations of African Middle-Class." Postcolonial Text, 9(3): 2–15.
Adichie, Chimamanda Ngozi. 2013. Americanah. First edition. New York: Alfred A. Knopf.
Barber, Karin. 2007. The Anthropology of Texts, Persons and Publics: Oral and Written Culture in Africa and Beyond. Cambridge, UK: Cambridge University Press.
Chabalala, Jeanette. 2013. "The Interview: Diary of a Zulu Girl's Mike Maphoto." News24, November 25. Accessed 16 March 2016. http://www.news24.com/Archives/City-Press/The-Interview-Diary-of-a-Zulu-Girls-Mike-Maphoto-20150429.
Drucker, Johanna. 2013. "Performative Materiality and Theoretical Approaches to Interface." Digital Humanities Quarterly, 7(1). Accessed September 18, 2020. http://www.digitalhumanities.org/dhq/vol/7/1/000143/000143.html.
Earhart, Amy E. 2015. Traces of the Old, Uses of the New: The Emergence of Digital Literary Studies. Ann Arbor: University of Michigan Press. DOI: https://doi.org/10.3998/etlc.13455322.0001.001.
Gabler, Hans W. 1993. "What 'Ulysses' Requires." The Papers of the Bibliographical Society of America, 87(2): 187–248. DOI: https://doi.org/10.1086/pbsa.87.2.24304765.
Gbolahan, Rasak Malik. 2020. Email interview. 10 April 2020.
Hayles, Katherine. 2005. My Mother Was a Computer: Digital Subjects and Literary Texts. Chicago: University of Chicago Press. DOI: https://doi.org/10.7208/chicago/9780226321493.001.0001.
———. 2012. How We Think: Digital Media and Contemporary Technogenesis. Chicago: The University of Chicago Press. DOI: https://doi.org/10.7208/chicago/9780226321370.001.0001.
Ikheloa, Ikhide. 2013. "Are You a Nigerian Writer? Why Join the Association of Nigerian Authors?—Brittle Paper Q&A with Richard Ali." Conversation on USA-Africa Dialogue Listserv, November 8. Accessed November 11, 2020. https://brittlepaper.com/2013/11/nigerian-writer-join-association-nigerian-authors-brittle-paper-qa-richard-ali/.
https://brittlepaper.com/2013/11/nigerian-writer-join-association-nigerian-authors-brittle-paper-qa-richard-ali/.
International Telecommunication Union. 2019. "New ITU Data Reveal Growing Internet Uptake but a Widening Digital Gender Divide." Accessed September 18, 2020. https://www.itu.int/en/mediacentre/Pages/2019-PR19.aspx.
Ito, Mizuko. 2008. "Introduction." In Networked Publics, edited by Kazys Varnelis. Cambridge: MIT Press. 1–14. DOI: https://doi.org/10.7551/mitpress/9780262220859.003.0001
Kirschenbaum, Matthew G. 2002. "Editing the Interface: Textual Studies and First Generation Electronic Objects." Text, 14: 15–51.
Maphoto, Mike. 2013. "Diary of a Zulu Girl." WordPress. http://diaryofazulugirl.co.za. Accessed 15 June 2014.
———. 2015. Diary of a Zulu Girl: From Mud Huts, Umqomboti and Straight Back to Penthouses, Expensive Weaves and Moët. Centurion: Wakahina Media.
McGann, Jerome J. 2004. Radiant Textuality: Literature After the World Wide Web. New York: Palgrave Macmillan.
McKenzie, Donald Francis. 1999. Bibliography and the Sociology of Texts. Cambridge: Cambridge University Press.
Nesbitt-Ahmed, Zahrah. 2017. "Reclaiming African Literature in the Digital Age: An Exploration of Online Literary Platforms." Critical African Studies, 9(3): 377–390. DOI: https://doi.org/10.1080/21681392.2017.1371618
Nunes, Charlotte. 2015. "Digital Archives in the Wired World Literature Classroom in the US." Ariel: A Review of International English Literature, 46: 115–141. DOI: https://doi.org/10.1353/ari.2015.0004
Obiechina, Emmanuel. 2008. "Market Literature in Nigeria." Kunapipi, 30(2): 108–125.
Olorunyomi, Sola. 2006. "The Mutant Called 'Text.'" Ibadan Journal of English Studies, 3: 135–145.
Ong, Walter J. 1991. Orality and Literacy: The Technologizing of the Word. New York: Routledge.
Onitsha Market Literary Pamphlets. 2015. The University of Kansas Library's Special Collections Onitsha Market Literature, edited by Elizabeth MacGonagle and Ken Lohrentz. Accessed November 11, 2018. https://exhibits.lib.ku.edu/exhibits/show/onitsha/ku-onitsha-collection.
Onitsha Market Literature 2.0. 2020. Github, edited by James Yékú. Accessed June 10. https://idrhku.github.io/onitsha-market/.
Santana, Stephanie Bosch. 2018. "From Nation to Network: Blog and Facebook Fiction from Southern Africa." Research in African Literatures, 49(1): 187–208. DOI: https://doi.org/10.2979/reseafrilite.49.1.11
Saraba Magazine. 2017. "The Transition Issue – A Note from the Editors – Saraba Magazine." Facebook. October 3, 1:49 a.m. Accessed September 14, 2020. https://www.facebook.com/sarabamag/.
Suhr-Sytsma, Nathan. 2020. "Christopher Okigbo's Materials." The Cambridge Quarterly, 49(3): 212–231. DOI: https://doi.org/10.1093/camqtly/bfaa018
Tanselle, G. Thomas. 1990. "Textual Criticism and Deconstruction." Studies in Bibliography, 43: 1–33.
Yékú, James. 2017. "'Thighs Fell Apart': Online Fan Fiction, and African Writing in a Digital Age." Journal of African Cultural Studies, 29(3): 261–275. DOI: https://doi.org/10.1080/13696815.2016.1201652
———. 2019. "Chinua Achebe's There Was a Country and the Digital Publics of African Literature." Digital Scholarship in the Humanities. DOI: https://doi.org/10.1093/llc/fqz084

How to cite this article: Yékú, James. 2020. "Deference to Paper: Textuality, Materiality, and Literary Digital Humanities in Africa." Digital Studies/Le champ numérique 10(1): 15, pp. 1–27. DOI: https://doi.org/10.16995/dscn.357

Submitted: 08 July 2019 Accepted: 24 August 2020 Published: 30 December 2020

Copyright: © 2020 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/. Digital Studies/Le champ numérique is a peer-reviewed open access journal published by Open Library of Humanities.

work_6miiym4nhbgt3pqu6fdqsbicxq ---- Absorbing DiRT: Tool Directories in the Digital Age

How to Cite: Grant, Kaitlyn, Quinn Dombrowski, Kamal Ranaweera, Omar Rodriguez-Arenas, Stéfan Sinclair, and Geoffrey Rockwell. 2020. "Absorbing DiRT: Tool Directories in the Digital Age." Digital Studies/Le champ numérique 10(1): 4, pp. 1–18. DOI: https://doi.org/10.16995/dscn.325

Published: 03 June 2020 Peer Review: This is a peer-reviewed article in Digital Studies/Le champ numérique, a journal published by the Open Library of Humanities. Copyright: © 2020 The Author(s).
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/. Open Access: Digital Studies/Le champ numérique is a peer-reviewed open access journal. Digital Preservation: The Open Library of Humanities and all its journals are digitally preserved in the CLOCKSS scholarly archive service.

RESEARCH

Absorbing DiRT: Tool Directories in the Digital Age

Kaitlyn Grant (University of Alberta, CA), Quinn Dombrowski (Stanford University, US), Kamal Ranaweera (University of Alberta, CA), Omar Rodriguez-Arenas (University of Alberta, CA), Stéfan Sinclair (McGill University, CA) and Geoffrey Rockwell (University of Alberta, CA). Corresponding author: Kaitlyn Grant (kgrant1@ualberta.ca)

In the summer of 2017, Quinn Dombrowski, an IT staff member in UC Berkeley's Research IT group, approached Geoffrey Rockwell about the possibility of merging the DiRT Directory with TAPoR, both popular tool discovery portals. Dombrowski could no longer offer the time commitment required to maintain the organizational structure of the volunteer-run tool directory (2018). This decommissioning of DiRT illustrates a set of problems in the digital humanities around tool directories, and the tools within them, as academic contributions. Tool development, in general, is not considered sufficiently scholarly and often suffers from a lack of ongoing support (Ramsay & Rockwell, 2012). When tool discovery portals are no longer maintained due to a lack of ongoing funding, this leads to a loss of digital humanities knowledge and history. While volunteer-based directories require less outright funding, managing and motivating those volunteers to ensure that they remain actively involved in directory upkeep requires a vast amount of work to ensure long-term sustainability (Dombrowski, 2018). This paper will explore the difficult history of tool discovery catalogues and portals and the steps being taken to save the DiRT Directory by integrating it into TAPoR. In particular, we will:

– Provide a brief history of the attempts to catalogue tools for digital humanists, starting with the first software catalogues, such as those circulated through societies, and ending with digital discovery portals, including DiRT Directory and TAPoR.
– Discuss the challenges around the maintenance of discovery portals.
– Consider the design and metadata decisions made in the merging of DiRT Directory with TAPoR.

Keywords: tool directories; tools; TAPoR; DiRT Directory; digital infrastructure

Introduction

In an essay entitled "Humanities Computing," Willard McCarty (2003) talks about a failure in the traditional model of bibliographic scholarship to capture the work of computing humanists. Bibliographies that focus only on publications miss the intellectual work that goes into things like tools, infrastructure, web sites, games and other digital resources. This has made it difficult for the Digital Humanities (DH) to create its own historiography, to know itself through its history of intellectual contributions. This paper is about one type of resource, the tool directory, developed to try to keep track of the tools Digital Humanists have made. We will discuss the problems of tool development, discovery, and preservation; provide a brief history of tool directories; and finally, we will provide an in-depth look at the infrastructure and subsequent merging of two tool directories, TAPoR 3.0 (Text Analysis Portal for Research) and the DiRT Directory (Digital Research Tools).

Tool knowledge

"Part of the reason instruments have largely escaped the notice of scholars and others interested in our modern techno-scientific culture is language, or rather its lack. Instruments are developed and used in a context where mathematical, scientific, and ordinary language is neither the exclusive vehicle of communication nor, in many cases, the primary vehicle of communication. Instruments are crafted artifacts, and visual and tactile thinking and communication are central to their development and use." —Baird 2004, xv

As Baird points out in Thing Knowledge (2004), tools have been ignored in the Humanities disciplines that deal in discourse.
Humanists not only tend to study discourse as a privileged form of expression, but we also think of (print) discourse as the medium for our academic exchanges. This has been a perennial problem in the Digital Humanities because it means that the tools or new media works that we both study and express ourselves in are difficult to value in the academy (Rockwell 2011). A set of tools like Voyant (voyant-tools.org) might have hundreds of thousands of users a year, but it is difficult to formally justify it as a scholarly contribution to a tenure and promotion committee that counts publications. The problem is not limited to the scholarly value of tool building. Most would agree that tools and their associated documentation can bear meaning, but DH is still struggling with ways to formally evaluate them without the apparatus of journals and peer-review. The problem is the infrastructure of valuation, starting with the ways we remember what has been done and why. This is a problem the Digital Humanities shares with overlapping fields like Instructional Technology and Game Studies, both of which also value software things as objects of study and objects of creation (for example, see Newman 2012 on the preservation of games). To properly value software things we need a stack of infrastructure, starting with records of what was done, as software has a way of disappearing so quickly as to be almost ephemeral. For example, FANGORN and SNAP are both historical tools that were designed specifically for Humanists to assist with text analysis, but they are no longer maintained for active use (TAPoR 2019). There are some organizations doing this preservation work. For example, the Internet Archive's Software Library preserves decades of computer software that can be accessed and used through their JSMESS emulator (Internet Archive 2014). TAPoR 3.0 and the DiRT (2019) Directory are tool discovery portals for the digital age that try to meet the need for knowledge about tools by recording sufficient information about tools and other resources that can be discovered and surveyed, but, unlike the Internet Archive, they are not preserving the software itself. In this case, they provide access to the metadata and important information about a host of digital tools and software so that researchers can determine the best tool for their project, and also understand where the tools came from by examining the history. Nonetheless, this is only one model for how knowledge about tools can be gathered and organized.

The role of tool directories in the digital age should concern Digital Humanists, as tool directories have a long history of supporting and providing recognition for DH software work. While tool directories began as published lists and collections of tools, such as Stephen Reimer's "TCRUNCHERS: A Collection of Public Domain Software and ShareWare for Writers" (Lancashire 2017), today, tool directories have taken an online format that requires continuous upkeep to maintain accessibility and relevance in a rapidly changing digital context (Dombrowski forthcoming). The DiRT Directory and TAPoR 3.0 are two well-known digital tool directories in the English-speaking DH community. The DiRT Directory evolved from Project Bamboo, which developed "Bamboo DiRT" from Lisa Spiro's "DiRT Wiki" (Dombrowski forthcoming).
Meanwhile, the TAPoR project was initially developed as a full portal that could coordinate text analysis web services (Rockwell 2006). When this proved hard to maintain, the creators of TAPoR developed text analysis tools, such as Voyant, separately from TAPoR, which was later redeveloped to support the discovery of text analysis tools and code to help Humanists in their research (Rockwell and Sinclair 2016).

Of the challenges faced by tool discovery portals, sustainability has proven to be the most intractable. In the summer of 2017, Quinn Dombrowski, at that time an IT staff member in UC Berkeley's Research IT group, approached Geoffrey Rockwell about the possibility of merging the DiRT Directory with TAPoR 3.0. The elimination of funding for Dombrowski's Digital Humanities-focused position, and her transition into a research computing role, meant that she could no longer dedicate time to maintaining the organizational structure of the volunteer-run tool directory (Dombrowski forthcoming). This decommissioning of DiRT illustrates a set of problems in the field of Digital Humanities around treating tool directories and tool development as academic contributions. Tool development, in general, has not been considered a "scholarly" activity and often suffers from a lack of ongoing support (Ramsay and Rockwell 2012). This is even more the case for the development and maintenance of tool directories, which require a great deal of time and some degree of curation, but do not align well to existing frameworks for incentivizing and rewarding work in a scholarly context. However, when tool discovery portals are no longer maintained, it can lead to a loss of Digital Humanities knowledge and history if the data is not, at the very least, archived. Archiving data in a widely used text-based format (such as CSV or JSON; a minimal sketch of such a dump follows this paragraph) may preserve this knowledge for certain kinds of Digital Humanities audiences, but for many Humanities scholars, a mediating web-based interface is a de facto requirement for data to be meaningfully usable. Scholars and developers who are more comfortable working with data (such as a content dump from a defunct tool directory) may be able to restore access to this content for their less-technical colleagues by ingesting it into a new interface, but the amount of work required to do so depends on how well aligned the source metadata is with the data model underpinning the new interface. While volunteer-based directories require less outright funding, managing and motivating those volunteers to ensure that they remain actively involved in directory upkeep requires a vast amount of ongoing work (Dombrowski forthcoming). This raises the question: how can Digital Humanists better support digital tool directories, and, more broadly, tool development?
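To make the archiving point concrete, here is a minimal sketch, in Python, of what such a preservation dump might look like. The record fields (name, url, description, categories) and the file name are illustrative assumptions for this sketch, not the actual DiRT or TAPoR metadata schemas.

```python
import json

# One hypothetical tool record; real directory schemas are richer than this.
tools = [
    {
        "name": "Voyant",
        "url": "https://voyant-tools.org/",
        "description": "Web-based reading and analysis environment for digital texts.",
        "categories": ["text analysis", "visualization"],
    },
]

# Writing the records to a plain-text, widely readable format means the
# knowledge can outlive the web interface that originally presented it.
with open("tool-directory-dump.json", "w", encoding="utf-8") as f:
    json.dump(tools, f, ensure_ascii=False, indent=2)
```

A dump like this is cheap to produce, but, as noted above, it only preserves the knowledge for readers who are comfortable working with raw data.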
While these portals are generally viewed as useful, the difficulty of maintaining them makes it important to be clear about the nature and extent of their value, who their audience truly is, and under what conditions they can be sustained. Furthermore, Digital Humanists could further engage the scholarly labour of tool directories by recognizing them in their scholarly work and providing links from blogs and library subject guides. While some of this recognition is already happening, it is not a formal practice in the field. Tool directories: A history Directories for finding tools are not new to the Digital Humanities. In the “Prospect” of the first issue of Computers and the Humanities the editor talks about reducing the “wasteful duplication of key-punching and programming that exists” by publishing lists of “programs designed to solve humanistic problems” (Prospect 1966, 2). To that end, tool reviews were published in Computers and the Humanities starting with the first issue which had a review of PRORA (Lieberman 1966), a concording tool developed at the University of Toronto. Another approach to documenting tools that lasted for only a few years was the yearbook. Ian Lancashire (1991) and Willard McCarty published two issues of the Humanities Computing Yearbook, one in 1988 and one for 1989–90. The Yearbook organized descriptions of tools and other resources under disciplines, but also had sections for general tools like bibliographic management tools that crossed disciplines. The Yearbook was probably the most ambitious attempt in print to document the resources, both digital and in print, of interest to computing humanists. Alas, only two issues were published before the task of keeping up across all the disciplines in the Humanities became too difficult, not dissimilar to the struggles of online tool Grant et al: Absorbing DiRT Art. 4, page 7 of 18 directories. That said, yearbooks published from reputable imprints do have the advantage that they look like publications and can be preserved in libraries. Lists, reviews and yearbooks are three approaches to documenting tools, another is software exhibits and associated catalogues. One of the most notable was the exhibit and accompanying catalogue at the first joint ACH-ALLC conference held in Toronto in 1989. Along with a conference guide entitled “The Dynamic Text,” Willard McCarty edited a software tool guide called “Tools for Humanists, 1989” which described seventy-four systems that were displayed at the 1989 hardware and software fair by the same name (1989). With the accessibility of the Internet it became both harder and easier to keep track of tools and other resources. On the one hand, there has been an explosion of DH websites, so the task has expanded, on the other hand, companies like Google have provided useful industrial tools for searching the web. The TAPoR project was originally funded in 2002 to provide a vertical portal that would bring together services with information about available tools, especially those that could be used through the portal as web services. Web directories like TAPoR 3.0 and DiRT are, in principle, easier to maintain and can even be maintained by a community, such as the learning resource directory, MERLOT. MERLOT (2019) enlists editors from the community to curate disciplinary sub-portals for learning tools, resources and documentation. 
In another example, TERESAH (2019) is a tool registry managed by DARIAH that aims to provide a listing of active tools for Social Sciences and Humanities researchers in Europe. Moreover, the DH Toy Chest offers users an ongoing development of free "guides, tools, and other resources for practical work in the Digital Humanities by researchers, teachers, and students" (Liu 2013). These examples offer similar services as TAPoR 3.0 and DiRT but use different methods to provide access to tools.

Maintaining infrastructure

Directories of tools are essentially a form of infrastructure. They do not present original research, though research may go into their design. If they are to work well they should a) support other activities like research, and b) be maintained over time so they are accessible to scholars. As many have noted, it is therefore challenging to build infrastructure, especially fast-changing digital infrastructure, using one-time grants (Bement 2007, Green 2007, Rockwell 2010). The Canada Foundation for Innovation (CFI) program that initially funded TAPoR recognizes this to some extent by asking for a maintenance plan and by providing ongoing funds for up to eight years. For Humanities infrastructure projects, and, for that matter, any successful infrastructure, eight years (including the years of initial development) are too little. This means that infrastructure leads have to continually seek new sources of funding, which in turn means adapting to new contexts and partnering with projects that could use the infrastructure. The TAPoR project has now lasted over 15 years from when the CFI grant was first awarded in 2002. The portal has been completely redeveloped (i.e. reprogrammed from scratch) twice, leading to its current designation as TAPoR 3.0. TAPoR 3.0 and DiRT Directory both represent the latest iterations of past tool directories that have been updated or adapted for new purposes. Most importantly, the continuation of these tool directories in this ad hoc fashion highlights a core issue with their development—the need for uninterrupted support. What this support could look like is not yet clear. Nonetheless, both projects have key lessons and tactics for maintaining directory projects in the long term.

First, keep infrastructure small and simple enough that it can survive during dry funding spells. While the initial idea of the TAPoR "portal" was to integrate various resources from social media, text repositories, tools as web services, and ways of chaining tools in one place, this proved very hard to maintain. There were, and are, better resources available that the TAPoR project was trying to replicate in order to have a full-service portal. In version 2.0, TAPoR narrowed its focus to the discovery of tools. The tools themselves were spun off into projects like TAPoRware, TATToo, and most importantly, Voyant. TAPoRware was a set of tools designed specifically for TAPoR 2.0. They were simple tools that could be deployed as demonstration web services. TATToo was an embeddable toolbar that could be put into other websites where it would operate on the content of whatever page it was on (see Rockwell et al. 2010).

Next, scale infrastructure down to what can be led and maintained by a faculty member with university support. Faculty already have access to a certain number of resources, depending on local computing support. Faculty at most research-intensive
universities can get small local grants, involve research assistants, involve students, apply for grants and so on. Infrastructure that is scaled to the support that a faculty member can obtain on their own can survive the dry years; however, this necessitates a faculty lead for the project, rather than a librarian, IT staff (such as Dombrowski), or other alt-ac roles.

Keep infrastructure modular so that it can connect with other projects easily, rather than trying to create a vertical portal that includes everything and would be complicated to maintain. In version 2.0, TAPoR focused on doing one thing well that others weren't doing and doing it in a way that could fit with other projects. This can take multiple forms. While DiRT focused on technological integration by developing an API, the TAPoR project took the approach of "political" integration by making it easy to be written into other projects' grant proposals. The latter approach does not lead to further proliferation of infrastructure that must be maintained, making it, by definition, more sustainable.

Do one thing well and then build out features as new opportunities, partners and projects need them. Version 3 of TAPoR began adding features that made sense for projects like Text Mining the Novel (https://novel-tm.ca/), which was contributing funding. As well, projects could take new features that need to be implemented and implement them in a more broadly reusable and integrated fashion within an existing framework. This may take some rework in the existing code and appears to be more cumbersome, but it avoids bloated code in the long run. TAPoR 3.0 has, wherever possible, been implemented by the University of Alberta's Arts Resource Centre in a way that any custom code can be reused (and maintained) for other projects.

Finally, beware the siren call of crowdsourcing. Since the success of the Suda On Line (Mahoney 2009) there has been the hope that projects could get human labour from the crowd. The DiRT Directory has shown that often the work of motivating and organizing volunteers can be as time-consuming as the work those volunteers do. As promising as crowdsourcing is, its value lies more in how it can engage a broader community than in how much work it saves (Rockwell 2012). As shown above, there are other ways of securing ongoing support that allow for more ambitious projects; this is how the TAPoR project has survived, and hopefully will continue to survive. With
In practice, however, the volunteer model is crucially dependent on the active involvement of a project director, and the arrangement with centerNet had no provisions for financially supporting the director position. This could be done through a buy-out of time to ensure the director could continue to work on DiRT even in the absence of a DH-specific position funded by her employer, or for replacing the director if she became unavailable, perhaps through a more financially sustainable dedicated graduate student position. Fundamentally, a tool directory’s survival is dependent on funding—which may be modest but must be fairly consistent—to pay for a position of some sort that can ensure the currency of the listings, either through their own labor or through engaging a community of volunteers. Absorbing DiRT The decision to merge the DiRT Directory with TAPoR 3.0 was a difficult one. It not only highlights the lack of ongoing support for tool discovery portals, but it also represents the end of a project (Ruecker et al. 2012). As a part of this project, we specifically aimed to merge the two directories together into a larger tool discovery portal that combined the best parts of both original projects. This process used the following steps: Grant et al: Absorbing DiRT Art. 4, page 11 of 18 1. First, we examined the metadata structure of the DiRT Directory to see what information the site was holding on each tool. At the same time, we explored possible ways to integrate DiRT’s data with TAPoR 3.0’s. 2. Next, we mapped a crosswalk of the metadata on DiRT and the metadata on TAPoR 3.0. This allowed us to see which fields are shared between the sites and determine which fields would need to be added to TAPoR 3.0, and which fields on TAPoR 3.0 would need to be populated for the DiRT Directory’s tools. This process required meeting with programmers at the University of Alberta’s Arts Resource Centre. Kamal Ranaweera and Omar Rodriguez-Arenas worked with us, providing technical support and advice throughout the project, and, most importantly, completing the actual mi- gration of the data to TAPoR 3.0. 3. After finalizing the data and fields mapping, we moved on to data cleaning. Quinn Dombrowski provided a spreadsheet in comma-separat- ed-value format of all 988 tools. This file was uploaded to OpenRefine (http://openrefine.org/), which was used to make overarching changes to the data. For example, using our fields maps, we relabelled DiRT Direc- tory’s platform data to match TAPoR 3.0’s web-usable data. The final step of this process was to delete any duplicates or empty tools. This process brought the total tool count to 950. 4. The final step of the project was to hand over the data to the Arts Resource Centre and allow the ingestion process to begin. Overall, the process went very smoothly. We began the project in September 2017 and successfully integrated the tools from DiRT to TAPoR 3.0 in May 2018. While the integration process is complete, there is an ongoing data cleaning project as we not only integrated almost a thousand new tools into TAPoR 3.0, but we also added some new fields to the descriptive metadata of the tools, and we expanded the scope of TAPoR 3.0 beyond text analysis. Most obvious, is a lack of consistency in the descriptions across the tool directory. Moving forward, we continue to try and http://openrefine.org/ Grant et al: Absorbing DiRTArt. 4, page 12 of 18 find the most effective way to share important information about tools through trial and error. 
Discussion Records of software take many forms, and in this essay, we have outlined a history of one form, directories of Digital Humanities tools, but there are other types of records like grant proposals, design and development documents, manuals, brochures, web documentation, reviews, conference papers and code. Developing memory infrastructure like directories is not as simple as preserving documentation. It is also a matter of structuring the records of tools so that they can be managed and found (Bowker and Starr 2000). As Bowker (2008) points out, the development of memory infrastructure is a structuring process of developing practices that are supported by infrastructure and in turn reinforce the need for infrastructure. This is where we are in the Digital Humanities; we have experimental infrastructure, but it hasn’t yet been woven into the practices of the field partly because the practices are still emerging. DH has not become disciplinary in the sense of a self-perpetuating field that has stable practices and infrastructure. This means that there isn’t yet the recognition and support for infrastructure like directories and portals. It is possible to get grants to build them as experimental infrastructure, but we haven’t found a way to weave them into a changing discipline so that they are maintained. By contrast, we have developed journals in the field that do have long term support. This paper documents attempts to develop disciplinary infrastructure at, and as a moment of, disciplinary formation. The attempts, failures, and successes say much about our formation. Inevitably one wonders how directories of tools could be better supported. How might knowledge about tools be preserved and made available? Are directories the best way to do so, or should we give up and depend on Google to manage our history? Some directions suggest themselves: It may be time to go back to including reviews or notices about tools in journals. Journals in the Digital Humanities like DHQ (http://www.digitalhumanities.org/dhq/) have proven maintainable and could integrate tool reviews and support for directories into their online practices. Notices about notable new tools could be included in journal issues and then archived in a tool directory like TAPoR 3.0. http://www.digitalhumanities.org/dhq/ Grant et al: Absorbing DiRT Art. 4, page 13 of 18 As mentioned above, we could learn from the MERLOT model, where there are editors who get credit for maintaining a sub-portal on best learning resources for a discipline (https://www.merlot.org). Tool directories could develop Associate Editor positions that would offer scholarly credit for curating and managing a list of specified tools. This would help maintain the directory while also providing an opportunity for moving tool directory maintenance into the traditional scholarly outputs that are more easily recognized by tenure and hiring committees. Finally, it may be prudent to recognize that tool directories have a life span tied to funding, thus making long term support unnecessary. The important thing is to find a way to preserve the data so that it can be passed on and reused as new projects arise with new models (Rockwell et al. 2014). Conclusion In conclusion, absorbing the DiRT Directory into TAPoR 3.0 forced our team to wrestle with the some of the major problems facing the Digital Humanities as a field. 
First and foremost, the debate on the importance of tools and tool development, as well as, the role of tool directories in encouraging the maintenance of tools and software. Furthermore, the process of integrating the two directories encouraged us to consider the issues of long-term access and support. This leaves us with an important question to consider: what will happen if TAPoR 3.0 is no longer able to be maintained or supported? If the experience and data is archived in an accessible form, does it really matter if any particular tool like TAPoR 3.0 disappears? We would like to thank Dr. Andrew Piper (McGill University) and NovelTM: Text Mining the Novel, a project funded by a SSHRC Partnership Grant, for their support with this project. Competing interest The authors have no competing interests to declare. Author roles Kaitlyn Grant, University of Alberta, kgrant1@ualberta.ca – kg Quinn Dombrowski, Stanford University, qad@stanford.edu – qd Dr. Kamal Ranaweera, University of Alberta, kamal.ranaweera@ualberta.ca – kr Omar Rodriguez-Arenas, University of Alberta, omar.rodriguez@ualberta.ca – ora https://www.merlot.org/ mailto:kgrant1@ualberta.ca mailto:qad@stanford.edu mailto:kamal.ranaweera@ualberta.ca mailto:omar.rodriguez@ualberta.ca Grant et al: Absorbing DiRTArt. 4, page 14 of 18 Dr. Stéfan Sinclair, McGill University, stefan.sinclair@mcgill.ca – ss Dr. Geoffrey Rockwell, University of Alberta, grockwel@ualberta.ca – gr The order of the authors reflects the level of contribution to the overall research project, including the research, programming, writing, and funding of the project. Conceptualization – kg, gr, qd, ss Data curation – kg Formal analysis – kg, gr Funding acquisition – ss Investigation – kg, gr, qd Methodology – kg, gr Project administration – kg Resources – kg, gr, qd, ss Software – kr, ora Supervision – gr, ss Writing – original draft – kg, gr, qd Writing – review & editing – kg, gr, qd, kr, ora, ss Editorial contributors Congress 2018 Special Editor: Dr. Constance Crompton, University of Ottawa Section Editor/Copy Editor: Darcy Tamayose, University of Lethbridge Journal Incubator Layout Editor: Mahsa Miri, University of Lethbridge Journal Incubator References Baird, Davis. 2004. Thing Knowledge: A Philosophy of Scientific Instruments. Berkeley: University of California Press. Bement, Arden L. Jr. 2007. “Shaping the Cyberinfrastructure Revolution: Designing Cyberinfrastructure for Collaboration and Innovation.” First Monday 12: 6. Accessed October 17, 2019. https://firstmonday.org/ojs/index.php/fm/article/ view/1915/1797. mailto:stefan.sinclair@mcgill.ca mailto:Alberta, grockwel@ualberta.ca https://firstmonday.org/ojs/index.php/fm/article/view/1915/1797 https://firstmonday.org/ojs/index.php/fm/article/view/1915/1797 Grant et al: Absorbing DiRT Art. 4, page 15 of 18 Bowker, Geoffrey. 2008. Memory Practices in the Sciences. Cambridge, MA: MIT Press. Bowker, Geoffrey, and Susan Star. 2000. Sorting Things Out: Classification and Its Consequences. Cambridge, MA: MIT Press. DiRT (Digital Research Tools). 2019. Accessed October 18. https://web.archive. org/web/20180716234416/http://dirtdirectory.org/. Dombrowski, Quinn. 2014. “What Ever Happened to Project Bamboo?” Literary and Linguistic Computing 29(3). DOI: https://doi.org/10.1093/llc/fqu026 Dombrowski, Quinn. Forthcoming. “The Directory Paradox.” Debates in the Digital Humanities. Accessed October 17, 2019. http://dhdebates.gc.cuny.edu/. Green, David. 2007. 
“Cyberinfrastructure for Us All: An Introduction to Cyberinfrastructure and the Liberal Arts.” Academic Commons. Accessed September 16, 2019. http://www.academiccommons.org/commons/essay/ cyberinfrastructure-introduction. Internet Archive. 2014. “Software Library.” Internet Archive. Accessed September 10, 2019. https://archive.org/details/softwarelibrary. Kirschenbaum, Matthew. 2012. “What is Digital Humanities and What is it Doing in English Departments?” Debates in the Digital Humanities. Accessed September 16, 2019. https://dhdebates.gc.cuny.edu/read/untitled-88c11800-9446-469b- a3be-3fdb36bfbd1e/section/f5640d43-b8eb-4d49-bc4b-eb31a16f3d06. Lancashire, Ian. 1991. The Humanities Computing Yearbook 1989–90. Oxford, UK: Clarendon Press. Lancashire, Ian. 2017. “CSDH/SCHN Beginnings.” Accessed September 16, 2019. http://csdh-schn.org/2017/05/31/csdhschn-beginnings-reflections-by-ian- lancashaire/. Lancashire, Ian, and Willard McCarty. 1988. The Humanities Computing Yearbook 1988. Oxford: Clarendon Press. Lieberman, D. 1966. “Review of Manual for the Printing of Literary Texts by Computer by Robert Jay Glickman and Gerrit Joseph Staalman.” Computers and the Humanities 1(1): 12. DOI: https://doi.org/10.1007/BF00188014 https://web.archive.org/web/20180716234416/http://dirtdirectory.org/ https://web.archive.org/web/20180716234416/http://dirtdirectory.org/ https://doi.org/10.1093/llc/fqu026 http://dhdebates.gc.cuny.edu/ http://www.academiccommons.org/commons/essay/cyberinfrastructure-introduction http://www.academiccommons.org/commons/essay/cyberinfrastructure-introduction https://archive.org/details/softwarelibrary https://dhdebates.gc.cuny.edu/read/untitled-88c11800-9446-469b-a3be-3fdb36bfbd1e/section/f5640d43-b8eb-4d49-bc4b-eb31a16f3d06 https://dhdebates.gc.cuny.edu/read/untitled-88c11800-9446-469b-a3be-3fdb36bfbd1e/section/f5640d43-b8eb-4d49-bc4b-eb31a16f3d06 http://csdh-schn.org/2017/05/31/csdhschn-beginnings-reflections-by-ian-lancashaire/ http://csdh-schn.org/2017/05/31/csdhschn-beginnings-reflections-by-ian-lancashaire/ https://doi.org/10.1007/BF00188014 Grant et al: Absorbing DiRTArt. 4, page 16 of 18 Liu, Alan. 2013. “DH Toychest.” Digital Humanities Resources for Project Building. Updated 2017. http://dhresourcesforprojectbuilding.pbworks. com/w/page/69244243/FrontPage. Mahoney, Anne. 2009. “Tachypaedia Byzantina: The Suda On Line as Collaborative Encyclopedia.” Digital Humanities Quarterly 3(1). Accessed September 16, 2019. http://www.digitalhumanities.org/dhq/vol/3/1/000025/000025.html. McCarty, Willard. 1989. Software Fair Guide: A Guide to the Software Fair Held in Conjunction with the Conference Computers and the Humanities: Today’s Research, Tomorrow’s Teaching. Toronto. ON: Centre for Computing in the Humanities, University of Toronto. McCarty, Willard. 2003. “Humanities Computing.” In Encyclopedia of Library and Information Science, 1224–35. New York: Marcel Dekker. MERLOT (Multimedia Educational Resources for Learning and Online Teaching). 2019. Accessed July 17. http://info.merlot.org/merlothelp/topic. htm#t=Who_We_Are.htm. Newman, James. 2012. Best Before: Videogames, Supersession and Obsolescence. New York: Routledge. DOI: https://doi.org/10.4324/9780203144268 “Prospect.” 1966. Computers and the Humanities 1(1): 1–2. DOI: https://doi. org/10.1007/BF00188009 Ramsay, Stephen, and Geoffrey Rockwell. 2012. “Developing Things: Notes Toward an Epistemology of Building in the Digital Humanities.” Debates in the Digital Humanities. Accessed September 20, 2019. https://dhdebates. 
Rockwell, Geoffrey. 2006. "TAPoR: Building a Portal for Text Analysis." In Mind Technologies: Humanities Computing and the Canadian Academic Community, edited by R. Siemens, and D. Moorman, 285–99. Calgary: University of Calgary Press.
Rockwell, Geoffrey. 2010. "As Transparent as Infrastructure: On the Research of Cyberinfrastructure in the Humanities." In Online Humanities Scholarship: The Shape of Things to Come. Proceedings of the Mellon Foundation Online Humanities Conference at the University of Virginia, March 26–28, 2010, edited by J. McGann, 461–87. Houston: Rice University Press.
Rockwell, Geoffrey. 2011. "On the Evaluation of Digital Media as Scholarship." MLA Profession, 152–68. DOI: https://doi.org/10.1632/prof.2011.2011.1.152
Rockwell, Geoffrey. 2012. "Crowdsourcing the Humanities: Social Research and Collaboration." In Collaborative Research in the Digital Humanities, edited by M. Deegan, and W. McCarty, 135–54. Farnham, Surrey: Ashgate.
Rockwell, Geoffrey, Shawn Day, Joyce Yu, and Maureen Engel. 2014. "Burying Dead Projects: Depositing the Globalization Compendium." Digital Humanities Quarterly 8(2). Accessed September 16, 2019. https://era.library.ualberta.ca/items/34ec3dc1-f8b5-481a-826d-2428f5283ce8.
Rockwell, Geoffrey, and Stéfan Sinclair. 2016. Hermeneutica: Computer-assisted Interpretation in the Humanities. Cambridge: MIT Press. DOI: https://doi.org/10.7551/mitpress/9522.001.0001
Rockwell, Geoffrey, Stefan Sinclair, Stan Ruecker, and Peter Organisciak. 2010. "Ubiquitous Text Analysis." Poetess Archive Journal 2(1).
Ruecker, Stan, Geoffrey Rockwell, Daniel Sondheim, Mihaela Ilovan, Jennifer Windsor, Mark Bieber, Luciano Frizzera, Omar Rodriguez-Arenas, Kamla Ranaweera, Carlos Fiorentino, Stéfan Sinclair, Milena Radzikowska, Teresa Dobson, Ann Blandford, Sarah Faisal, Alejandro Giacometti, Susan Brown, Brent Nelson, and Piotr Michura. 2012. "The Beginning, the Middle, and the End: New Tools for the Scholarly Edition." Scholarly and Research Communication 3(4). Accessed September 16, 2019. http://www.src-online.ca/index.php/src/article/view/57.
TAPoR (Text Analysis Portal for Research). 2019. Text Analysis Portal for Research. Accessed September 16. http://tapor.ca/pages/about_tapor.
TERESAH (Tools E-Registry for E-Social science, Arts and Humanities). 2019. "About TERESAH." Accessed July 15. http://teresah.dariah.eu/about.
Submitted: 15 October 2018 Accepted: 05 February 2019 Published: 03 June 2020

work_6nac5nltkbgqhllijnfzdmtg4i ---- Durham Research Online

Deposited in DRO: 22 September 2014. Version of attached file: Accepted Version. Peer-review status of attached file: Peer-reviewed.

Citation for published item: Warwick, C. (2009) 'Ray Siemens and Susan Schreibman (eds.). The Blackwell companion to digital literary studies.', Review of English Studies, 60 (244). pp. 335-338. Further information on publisher's website: http://dx.doi.org/10.1093/res/hgn145

Publisher's copyright statement: This is a pre-copyedited, author-produced PDF of an article accepted for publication in Review of English Studies following peer review. The version of record Warwick, Claire (2009) 'Ray Siemens and Susan Schreibman (eds.). The Blackwell companion to digital literary studies.', Review of English Studies, 60 (244): 335-338 is available online at: http://dx.doi.org/10.1093/res/hgn145.

Additional information: Ray Siemens, and Susan Schreibman. (eds.). The Blackwell Companion to Digital Literary Studies. Pp. xviii + 620 (Blackwell Companions to Literature and Culture 50): Oxford and Malden, MA: Blackwell, 2007. Cloth, £95.
Review of The Blackwell Companion to Digital Literary Studies. Edited by RAY SIEMENS and SUSAN SCHREIBMAN. Pp. xviii + 620. Blackwell Companions to Literature and Culture 50. Malden, MA, Oxford, Carlton, Victoria, Australia: Blackwell. 2007. £95. Claire Warwick, UCL.

Once again Ray Siemens and Susan Schreibman have produced a remarkable collection of writing about scholarship and resource creation in the area of digital humanities. This volume on digital literary studies is, as it were, a companion to their earlier Companion to Digital Humanities. As such it promises to be equally significant to the field and should be equally well used and highly regarded in universities both in Europe and North America. The companion provides a very thorough survey of research and resource development in numerous areas of digital humanities, written by an impressive collection of leading scholars. It is intended as a general introduction to the multiple aspects of the field, but many of the chapters go beyond this to provide fascinating discussion of the problems and scholarly possibilities of different aspects of this highly diverse area. As such it is impossible in this space to do justice to the entire range of subjects covered in the Companion's thirty-one very detailed chapters, and so what follows will survey its general structure and comment on themes emerging from the book as a whole.

The book is divided into three sections, on traditions, textualities and methodologies. In the first section chapters are concerned with periodic areas of study, such as the classical, medieval, early modern, and so on. They review the digital resources available and the type of scholarly questions that are being researched in their area, and include some speculation about future scholarship made possible by digital resources. Perhaps the most outstanding of these is by Crane, Bamman and Jones of the Perseus project (Ch. 2), who not only survey the state of resources in classical scholarship but raise vital questions about the requirements for future digital libraries for classicists and discuss how the availability of a greater range of digital texts may affect the way that researchers handle evidence for editorial decisions. It is an example of true digital literary study in that it combines a high level of discussion of technical, computer scientific matters, with a subtle discussion of the business of scholarly editing.

The second section, textualities, is highly comprehensive and includes chapters on a huge range of digital literature, from interactive fiction and digital poetry, from digital text as art installation, to blogging. Some areas, such as hypertext theory, may be familiar to most readers. Yet, however expert a reader might believe herself to be in digital literary studies, so great is the range of subjects covered that it seems likely that almost everyone will find at least one chapter about material that they had not encountered before.

The final section discusses computational methodologies that have been used to study literature. This section is probably the most uneven of the three in terms of the level of the material.
Several of the chapters share the virtue of being able to introduce new methods to non-experts as well as engaging with more complex intellectual issues that will interest the more knowledgeable scholar, for example those by Hoover on quantitative analysis (Ch. 28) and Price on scholarly editing (Ch. 24). These engage with questions familiar to the literary scholar, such as personification in Woolf or the problems of establishing the ideal copy text, then demonstrate how such issues may be addressed with digital methods. However, other chapters in this section are pitched at a level of detail that is probably only comprehensible to the digital humanities expert, for example those on the Text Encoding Initiative (Ch. 25), cybertextuality (Ch. 23), character encoding (Ch. 31) and format issues (Ch. 30). Although such issues may be fascinating to the reader already undertaking technical research, such a lack of engagement with the non-expert is somewhat unfortunate in a volume of this type. Although this final section is also very wide-ranging there were a few issues that might have been given a chapter of their own, for example, the important questions of preservation and sustainability. These are touched upon, for example by Choudhury and Seaman (Ch. 29). However, such issues, including questions of how to document digital resources, are vital to the long term use and availability of digital materials, as Price (Ch. 24) makes clear. It is also surprising that the question of the use of digital resources was almost completely ignored in this volume. This is significant since, as Damian-Grint argues (Ch. 5), "... little thought appears to have been given to the way in which the texts might be used" (p.116) and that as a result resources, although plentiful, may not be of a sufficient quality to be helpful for serious research. A chapter on how to design digital resources with user needs in mind would therefore be of great help to any reader of this companion wanting to find information to help them create a high quality digital resource.

As various contributors, such as Van Hulle (Ch. 7) and Wardrip-Fruin (Ch. 8), point out, non-linear reading pre-dates the advent of digital hypertexts. As proof of this assertion, it is probably quite unusual to read a volume such as this companion from beginning to end. It is almost certainly designed for readers to pick the chapters that are of interest to them, as needed. However the experience of reading the entire text was suggestive of some of the recurring themes in the volume, and thus of most current concern in the area of digital literary studies. It is not surprising that numerous authors were concerned with questions of how digital reproduction of text varies the way that we interact with it as readers. So for example there was repeated concern with the question of hypertext and non-linear reading, and new ways in which interactive media allow us to become part of texts and narratives. Textuality itself is conceived of very broadly in this volume to include not only the printed word, whether digitised or not, but also more immersive environments such as gaming, virtual communities, and textual performance in art installations, all of which raise fascinating questions of where the boundaries of literary studies may lie in a digital world, if indeed there need be any. It is perhaps more surprising that there was a pervasive appeal to history, previous forms of textuality and earlier reading practices.
Continuity is stressed as much as change by several writers. Vandendorpe (Ch. 10), for example, provides a fascinating account of how reading practices have changed, and remarks on the irony that in a post-codex reading space we have returned to the metaphor of the scroll to navigate electronic documents. Drucker (Ch. 11) also points out that the best way to design new electronic books is not to fetishise the functions of the codex with unnecessary ersatz page turning, but to study how printed books developed functions to aid reading and textual organisation, such as page numbers, running heads, indices, etc. Her contention is that if we understand how visual design features help books work as successful functional objects, we may be better able to understand what best translates to the electronic medium. This kind of appeal to the past seems entirely understandable and appropriate. What is perhaps less predictable is that so many of the authors in this volume appeal to past forms and practices almost as a way of giving authority and legitimacy to scholarship concerning the electronic medium. When asked to survey their disciplinary area, numerous authors adopt the method of producing a chronology of important work that has been done over the years. It is almost as if proving that a digital literary form has had a longer period of existence than the reader might have expected proves its worthiness as an object of study. This seems slightly worrying. It appears that we in digital humanities may betray a certain insecurity about the importance of what we do, and need to appeal to history as a kind of legitimating force. Rather in the way that the designers of many of the early university degrees in English Literature felt that this young discipline must prove its worth by studying the development and history of the language and its origins, so it seems that digital literary scholars feel, for example, that the study of hypertext is lent more gravitas if reference can be made to Tristram Shandy or James Joyce. This is surely somewhat regrettable. If we believe in our discipline, and can claim that it addresses questions that are worthy of study in their own right, we should be proud of our scholarship, whether or not it has ancient roots. The great strength of this volume is that it surely does establish, beyond the need for such self-justification, that digital literary studies is an important, fascinating and diverse discipline, with every right to assert the importance of work being done in its name. It may not have a venerable past, but this collection is strongly suggestive of a fascinating future.
Claire Warwick, School of Library, Archive and Information Studies, University College London
work_6ogs3pzc4vbmvors2yk7ya2cie ----
PRACTICAL LINGUISTIC ANNOTATION: THE HEBREW BIBLE1
DIRK ROORDA
International Journal of Humanities and Arts Computing 11.2 (2017): 276–287. DOI: 10.3366/ijhac.2017.0196. © Edinburgh University Press 2017.
introduction
1. Annotation
An annotation is a piece of information attached to another piece of information.2 Annotations generally do not have the same authorship, publishing workflow, and audience as the information sources they are attached to. Annotations serve to provide comments to sources, and these comments may involve analysis, explanation, correction, linking, evaluation, tagging, counting, and much more. In this article we focus on the logistics of information, rather than on the meaning. While it is useful to distinguish annotations by their type of content, our interest lies in the patterns of information distribution.
How are annotations created, how are they published, and how do they behave in the research data cycle?
2. The Hebrew Bible
The Hebrew Bible is a family of ancient texts with a complex origin. It is recognized by several world religions, and it has pervaded large swaths of human culture. Academic research into the Bible occurs in several disciplines: linguistics, history, and theology, with their specialties such as linguistic variation, historical linguistics, textual criticism, literary analysis, exegesis, and hermeneutics. Religious communities have added their own sets of interpretations and observations. The practice of Bible translation into a great many languages of the world3 has tuned people's antennas for interpretation. There are editions of the text of the Hebrew Bible in which the pages contain a small square of source text, surrounded by layers and layers of annotation.4
shebanq: a system for hebrew text
The ETCBC is the department of the Faculty of Theology at the Vrije Universiteit Amsterdam that has created a linguistic text database of the Hebrew Bible.5 In 2013–2014 the SHEBANQ project reshaped that database into a standard form, LAF,6 and built a demonstrator to show new ways of utilizing that database in the age of internet connectedness. Indeed, the ETCBC database has been modeled as a huge set of annotations. This demonstrator is now a website in production, also called SHEBANQ. We show how the Hebrew Bible has been captured in a system of annotations and point to a number of non-trivial, innovative uses of the concept of annotation which were not possible or practical before the digital handling of information.
1. Exhaustive linguistic annotation
Each of the more than 400,000 words carries annotations specifying its part of speech, its morphological characteristics, its various representations and more. The same holds for larger units, such as phrases and clauses. All in all, this gives tens of millions of annotated features. Before the arrival of digital information processing, this was not a feasible thing to do. But here we have it: a text with millions of annotations, online, in a working system: SHEBANQ (see Fig. 1).
Figure 1: Text and annotations in SHEBANQ. Clicking on a verse number hides and shows the annotations.
2. Multiple textual representations as annotation
There is something else to note: the text itself exists as the content of annotations. This has to do with the peculiar fact that the older variants of biblical material were written down in a consonantal script, while the vowels were added as diacritical marks ('pointing') several centuries later, near the final consolidation of the text around 900 AD. So every word still has a consonantal representation, but also a fully 'pointed' representation. It is a clear case where the text does not have a single representation. Annotation provides a neat way to expose those representations together. Further down that road, we also provide a phonetic representation of the text (see Fig. 2).
Figure 2: Text in phonetic representation, with all markings and annotations in place.
That will help people not familiar with Hebrew to get access to the linguistic annotations and use them for their own purposes.7 Nevertheless, the authoritative text of the Biblia Hebraica Stuttgartensia is the default representation.8 In SHEBANQ, the annotations are not tied to the representation of the text. So if the user switches representation, all the highlights and other annotations remain in place.
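Since the representations are themselves the content of annotations, a program can ask for each of them per word. The following sketch, in Python, prints one verse in its pointed, consonantal and phonetic forms using Text-Fabric, the successor tool described in the Addendum near the end of this article. The data location, module names and feature names (g_word_utf8, g_cons_utf8, phono) follow ETCBC conventions, but they are assumptions that vary between versions of the data.

from tf.fabric import Fabric

# locations and module names are assumptions; see the text-fabric-data
# repository mentioned in the Addendum for the actual layout
TF = Fabric(locations='~/github/etcbc/text-fabric-data',
            modules=['hebrew/etcbc4c', 'hebrew/phono'])
api = TF.load('g_word_utf8 g_cons_utf8 phono')
F, T, L = api.F, api.T, api.L

# one verse, three representations of the same sequence of words
verse = T.nodeFromSection(('Genesis', 1, 1))
words = L.d(verse, otype='word')
print(' '.join(F.g_word_utf8.v(w) for w in words))  # fully pointed
print(' '.join(F.g_cons_utf8.v(w) for w in words))  # consonantal only
print(' '.join(F.phono.v(w) for w in words))        # phonetic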
3. Queries as annotations
Now that text and linguistic annotations reside in a database, it becomes possible to query both kinds of data. An important objective of the creators of the ETCBC database has always been the ability to search for peculiar syntactic patterns. When reading the Bible, every now and then a passage is particularly problematic and requires explanation. But what kind of explanation? Has there been a text transmission error? Is there a hidden borrowing from another text? Is there a syntactic construction that belongs to another dialect or language? Is there deliberate use of language to achieve a literary effect? Or is there a truly special meaning lurking behind the text? Research into these problems is greatly helped by catalogues of occurrences of the same or partly the same phenomenon. By using a text database, we are able to systematically query those patterns. It is not easy to write such queries. The data is full of unexpected patterns and it is easy to miss cases, so many checks and cross-checks are needed. A successful query is a piece of scholarly crafts(wo)manship, and should be shared and published as such. Seen in an abstract way, a query is an annotation to all its results. One annotation targeting multiple passages is already a little bit innovative, although one might say that cross-references and indexes are examples of multi-target annotations. But here there is a bit more going on. By presenting a query as an annotation to its results, an unexpected flow of information is made possible: from result to query. When a scholar reads a difficult passage, (s)he might be interested in the exegetical queries that have results in that passage (see Fig. 3).
Figure 3: Queries as notes in the margin. The reader of the passage is drawn to exegetical problems of others, and their solutions.
This is exactly what SHEBANQ makes possible. Next to every chapter in the Bible a list of relevant queries is presented, and the results of those queries are highlighted in the chapter at hand.9
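To give a flavour of such a pattern query, here is a sketch in Python using the search templates of Text-Fabric (the later tool described in the Addendum). SHEBANQ itself runs queries in the MQL language of the Emdros engine, so this is an analogous formulation rather than SHEBANQ syntax; the feature names and values (function=Pred, sp=verb, vt=impf) follow ETCBC conventions but may differ per data version.

from tf.fabric import Fabric

TF = Fabric(locations='~/github/etcbc/text-fabric-data',
            modules='hebrew/etcbc4c')  # path is an assumption
api = TF.load('sp vt function')
S, T = api.S, api.T

# a clause whose predicate phrase contains an imperfect verb
template = '''
clause
  phrase function=Pred
    word sp=verb vt=impf
'''

# each result is a tuple of nodes matching the template, top-down
for clause, phrase, word in S.search(template):
    book, chapter, verse = T.sectionFromNode(clause)
    print(f'{book} {chapter}:{verse}')  # a passage reference per result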
4. Semi-automatic analysis as annotation
Linguistic research into the Hebrew Bible has not ended. The meaning of Hebrew verb forms in poetry is a long-standing problem (and many occurrences in prose are far from clear, for that matter), and data-driven research has the potential to produce new solutions.10 Verb meanings are also dependent on the number and nature of constituents in the sentence (verbal valence), and it is worthwhile to devise a flow chart system to generate verb senses on the basis of signals near verb occurrences.11 This involves a lot of trial and error. Sometimes it leads to a review of the linguistic encoding, to new syntactic and semantic distinctions. One way to organize this is to generate the results of a flow chart as a set of annotations to be presented next to the text. The researcher can then see the decisions in full context and comment on those outcomes by manual annotations. These annotations can be harvested in turn and provide a basis for an improved algorithm. This workflow is supported on SHEBANQ, although not many people are fully utilizing it yet. Experience, however, shows that it is cumbersome to execute this work exclusively on a website. A website such as SHEBANQ supports only so many use cases, while every research activity requires its own data preprocessing. An efficient workflow for this kind of research is to collect data, store it in spreadsheets, have the researcher work on them, and then feed the filled-in sheets back into the system. We support this workflow by means of LAF-Fabric, which is an off-line companion to SHEBANQ, based on exactly the same data. With the help of LAF-Fabric, the programming scholar can grab all data that is needed for a particular task, lay it out neatly in columns, and convert edited sheets into new sets of annotations.12 The work on verbal valency is available on the SHEBANQ tools page (see Fig. 4).
Figure 4: Verbal valence notes have been bulk-imported into SHEBANQ and are visible in notes view. Users can mute note sets and focus on the topics of their interest.
These new annotations have been bulk-imported into SHEBANQ and published, but they can also serve as a basis for new algorithms in LAF-Fabric.13
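The shape of this round-trip — export proposed values to a sheet, let the researcher correct them, harvest the corrections as new annotations — can be pictured with a small Python sketch. It deliberately does not reproduce the actual LAF-Fabric API; the column layout and file name are illustrative assumptions.

import csv

def export_sheet(rows, path='valence_task.csv'):
    # lay the data out neatly in columns for the researcher
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['node', 'lexeme', 'proposed_sense', 'correction'])
        for node, lexeme, sense in rows:
            writer.writerow([node, lexeme, sense, ''])  # last column is filled in by hand

def harvest_sheet(path='valence_task.csv'):
    # turn the researcher's corrections into (target, feature, value) annotations
    with open(path, newline='', encoding='utf-8') as f:
        for row in csv.DictReader(f):
            if row['correction']:
                yield int(row['node']), 'sense', row['correction']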
5. Everything else
Although versatile, SHEBANQ cannot do everything. For example, teaching Hebrew to academic students could profit from SHEBANQ, but SHEBANQ is not optimized for it. There is a system called Bible Online Learner,14 based on the same ETCBC database, that has facilities to generate drills and exercises for students and score their answers. Rather than try to pack all functionality into one system, it is better to have several systems around, each geared to its own task, but knowing of each other's existence. Every chapter page in SHEBANQ links to the corresponding chapter page in BibleOL and vice versa. Moreover, in order to compose exercises, BibleOL uses queries that are published in SHEBANQ (see Fig. 5).
Figure 5: Interlinking with Bible Online Learner. Clicking on the SHEBANQ logo takes you to SHEBANQ, where there is a Bible OL logo to link you back.
6. Summing up
In the digital age, annotation has become a practical paradigm to carry out scholarly work: we can use annotations in quantities unheard of, to achieve old goals in new ways, and to pursue new goals with new workflows. The reader is invited not to look only at the screenshots, because they tend to show screens packed with information. One of the strong points of digitally displaying information is that most of the material can be hidden most of the time. SHEBANQ as an annotation tool helps the researcher to collect all data relevant to the task at hand in one or two screens, for a great variety of tasks. And where SHEBANQ falls short, the companion tool LAF-Fabric takes over, but the price is that the user must program it. This is where the digital paradigm affects (or should we say infects) the daily work of the scholar: programming skills are becoming increasingly relevant. An important characteristic mentioned in most of the cases above is the facility to share and publish annotations. The Hebrew Text database is the result of a lot of scholarly work, and that work should be published, not only for the academic record, but also for the purposes of teaching and training.15 Moreover, published annotations enable useful cooperation of different systems based on the same data.
requirements for scholarly annotation
In the previous section we described annotations in action. When the action is research, it is important to comply with a few essential requirements.
Archiving
We saw how annotations capture scholarly work, sometimes at a high level of abstraction and expertise. So scholars must be able to save annotations and then share and publish them. Researchers that work years from now must be able to retrieve annotations when they see the sources, and to retrieve the sources when they see the annotations. While the digital paradigm is very beneficial for transforming information flexibly and distributing it globally, it is much more challenging to fix existing information rigidly and distribute it over decades to come. The digital age calls for digital archives that recognize these challenges and do something about it. In the SHEBANQ case, the data has been archived at DANS,16 all the code sits on Github (see an overview of the sources) and repository snapshots have been archived at Zenodo at CERN. The live website is run by DANS on a server of the Royal Netherlands Academy of Arts and Sciences.
Coupling
The particular thing about annotations is that they need the coupling to another resource in order to be 'to-the-point'. In the age of analogue resources, this coupling tended to be tight: in the margins, or as footnotes, usually within the same material container. Where the coupling was less tight, such as in endnotes, indexes, and registers as separate books or volumes, it quickly became unwieldy to handle all relevant annotations. In the digital age these problems of information logistics can be solved much more elegantly and effectively, provided certain agreements are made by the designers of information. It is a bit like geotagging photos by means of a recorded GPS track: if the track points are coded with the same time codings as the photos, the photos can be located on the track and then on the map. For annotations we need anchors: points in sources to link to. These points should be standardized so that different scholars, as producers of annotations, use the same anchors. That will help to make their annotations interoperable. For linguistic annotations, the LAF standard helps a lot to refer to primary data in an objective way, although these anchors are still project dependent. There are efforts to bring about a more global persistent linking system to canonical resources (see Canonical Text Services and the CITE architecture), and it is a matter of time before it will be applied to the Hebrew Bible as well. The holy grail of this all is the Linked Open Data (http://linkeddata.org) endeavour, which is an attempt to map all entities in human discourse onto unique, persistent identifiers, and code all properties that can be expressed into triples consisting of a subject, predicate and object, according to well-defined vocabularies and ontologies; a minimal sketch of an annotation expressed this way follows at the end of this section. This is a huge modelling effort, and it is not always clear how computing-intensive workflows may take advantage of it. But for importing and exporting data across boundaries of project and discipline, this is definitely the way to go. An advantage of well-coupled annotations is that they can be sorted and organized on the basis of where they point to. But we need other organizing principles as well, such as the provenance of an annotation (researcher, project, organization), time (creation, update), motivation (correction, evaluation), and nature (linguistic, hermeneutical). Of these, motivation and nature can be entered in free text description fields, which in practice, sadly, quite often reveal the text 'None'.
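As promised above, here is a minimal sketch, in Python with the rdflib library, of what a query-as-annotation with two passage targets could look like as Linked Open Data, borrowing terms from the W3C Web Annotation vocabulary. All identifiers are invented for illustration; the actual SHEBANQ data is not published in this form.

from rdflib import Graph, Namespace, Literal

OA = Namespace('http://www.w3.org/ns/oa#')      # W3C Web Annotation vocabulary
EX = Namespace('http://example.org/shebanq/')   # invented identifiers

g = Graph()
query = EX['query/42']                          # the query acts as the annotation
g.add((query, OA.hasBody, Literal('clauses with an imperfect predicate')))
# one annotation, multiple anchored targets: the passages where it has results
g.add((query, OA.hasTarget, EX['passage/Genesis/1/1']))
g.add((query, OA.hasTarget, EX['passage/Exodus/3/14']))

print(g.serialize(format='turtle'))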
Innovation
A lot of digital development starts with mimicking analogue concepts. After a certain period, those digital counterparts may exhibit new dynamics. This only happens if the new concepts manage to exploit typical advantages of the digital paradigm over the old ways. One of the key digital advantages is the network effect: for certain tasks it has become possible to mobilize many people with mostly limited contributions. Such loosely organized networks can deliver impressive results, such as Wikipedia.17 If scholars grab the opportunity to 'socialize' parts of their workflows, they may gain results not previously possible. SHEBANQ has socialized the art of making exegetical queries. It is being used in the classroom, and scholars can quote queries to each other and cite them in papers. Everybody may enter new queries. And everybody can comment on specific query results by means of simple manual annotations. However, we are not seeing (yet) that kind of spontaneous manual annotation.
Reflection and action
Before building SHEBANQ, we tried to design its layout and the details of how queries should be displayed to the user. Query results are structured objects, and queries may have many structured results; it was not at all clear how we could provide the users with a good visual representation of query results, and how to show them in context. Most of this became clear after we started construction. Only fully engaging in building this web app made us discover one unanticipated problem after another, and solve them all. For example, we decided to provide on-the-fly heat maps of query results, which give users an instant overview of how the results of a particular query are distributed in the Bible (see Fig. 6; a sketch of the underlying binning idea follows this section). But we refrained from presenting query results in their full complexity as structured objects. We also modified our goals. Rather than make SHEBANQ into the ultimate research tool, we developed LAF-Fabric as an off-line side tool, with more flexibility to tackle the nitty-gritty of daily research. SHEBANQ got redefined from a laboratory to a showroom of research results, where very diverse research output comes together in one context. Now SHEBANQ and LAF-Fabric together provide the facilities of a scholarly lab. In our opinion, it makes no sense to reflect on the nature of annotations without being involved in digital construction work. The ontology of a (digital) medium is the reflection of its usage patterns. When migrating annotations from analog to digital, we are potentially upsetting those very usage patterns, and hence the ontology of annotations.
Figure 6: Heat map of query results. Every square represents a block of 500 words of Bible text. The color indicates how many result words the query has in that block. Every square is clickable and takes you to the corresponding passage.
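The heat map of Fig. 6 boils down to a simple binning computation. A minimal Python version, assuming word positions are numbered consecutively through the whole Bible, could look like this; the real implementation in SHEBANQ of course also handles colours and clickable squares.

from collections import Counter

BLOCK = 500  # words per square, as in Fig. 6

def heat_map(result_words, total_words):
    # count how many result words fall into each block of 500 words
    counts = Counter(w // BLOCK for w in result_words)
    return [counts.get(b, 0) for b in range(total_words // BLOCK + 1)]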
Programming skills
Just as analogue information systems presuppose the skills of reading and writing, the potential of the digital media cannot be unleashed without new skills. For researchers, this definitely means: programming. Especially where experimentation is involved, it is impractical to outsource development of new tools to 'mere' programmers. Instead, scholarly teams should insource programming skills in their own skulls. They do not need to master professional levels. Data-oriented programming has become much easier by the evolution of scripting languages such as Python and additional tools such as the Jupyter notebook.18 And not every team member needs to learn to program, if only the team as a whole is able to produce experimental or pilot solutions. Only after many experiments by scholars will it be the right time to bring the professional coders in to turn the successful pilots into products and infrastructure.
Addendum
From the start of 2017 onwards, I have deprecated LAF-Fabric in favour of a new format and tool: Text-Fabric.19 Thanks to the move from an XML-based format to a plain-text-based format, all data fits in a Github repository.20
end notes
1 This work rests on the shoulders of the giants at the ETCBC, such as Eep Talstra and Constantijn Sikkel, who conceived the database and made it work through the decades behind us. See E. Talstra and C. J. Sikkel, 'Genese und Kategorienentwicklung der WIVU-Datenbank', in C. Hardmeier et al., ed., Ad Fontes! Quellen erfassen—lesen—deuten. Was ist Computerphilologie? Ansatzpunkte und Methodologie—Instrument und Praxis (Amsterdam, 2000), 33–68; E. Talstra, 'Computer-assisted linguistic analysis. The Hebrew Database used in Quest.2', in J. A. Cook, ed., Bible and Computer. The Stellenbosch AIBI-6 Conference. 2000–07–17/21, Stellenbosch: Proceedings of the Association Internationale Bible et Informatique (Leiden, 2000), 3–22, https://shebanq.ancient-data.org/shebanq/static/docs/methods/2000_Talstra_QuestDataTypes.pdf. The query engine of SHEBANQ is the one made by Ulrik Petersen. See U. Petersen, 'Emdros—a text database engine for analyzed or annotated text', Proceedings of COLING 2004, 1190–3, http://emdros.org/petersen-emdros-COLING-2004.pdf; U. Petersen, 'Principles, Implementation Strategies, and Evaluation of a Corpus Query System', Lecture Notes in Computer Science, 4002 (2006), 215–26, http://link.springer.com/chapter/10.1007%2F11780885_21; U. Petersen, EMDROS. Text database engine for analyzed or annotated text, 2002–2014, http://emdros.org. Petersen has relied on the ideas of Christ-Jan Doedens: C.-J. Doedens, Text Databases. One Database Model and Several Retrieval Languages (Amsterdam, 1994). Researchers, senior and junior, have put data and tools to many tests: Janet Dyk, Reinoud Oosting, Oliver Glanz, Gino Kalkman, Martijn Naaijer, Christiaan Erwich, Cody Kingham, plus 89 users of SHEBANQ that shared 686 queries with us.
2 See M. Bauer and A. Zirker, 'Whipping Boys Explained: Literary Annotation and Digital Humanities', in Ray Siemens and Kenneth M. Price, eds., Literary Studies in the Digital Age: An Evolving Anthology (New York, 2015), https://dlsanthology.commons.mla.org/whipping-boys-explained-literary-annotation-and-digital-humanities/; and M. Bauer and A.
Zirker, 'Explanatory Annotation of Literary Texts and the Reader: Seven Types of Problems', this volume.
3 See M. Cysouw, 'Parallel Bible Corpus. 1169 unique Bible translations', n.d., http://www.paralleltext.info/data/, and C. A. Christodoulopoulos, 'A multilingual parallel corpus created from translations of the Bible', https://github.com/christos-c/bible-corpus, 22 June 2017.
4 See also R. Siemens et al., this volume.
5 D. Roorda, 'The Hebrew Bible as Data: Laboratory—Sharing—Experiences', in J. Odijk and A. van Hessen, eds., CLARIN in the Low Countries, 2015, https://arxiv.org/abs/1501.01866; D. Roorda, J. Krans, B.-J. Lietaert-Peerbolte, W. T. van Peursen, U. Sandborg-Petersen and E. Talstra, 'Scientific report of the workshop Biblical Scholarship and Humanities Computing: Data Types, Text, Language and Interpretation, held at the Lorentz Centre Leiden from 6 Feb 2012 through 10 Feb 2012', Lorentz Center, Leiden, 2012, http://www.lorentzcenter.nl/lc/web/2012/480/report.php3?wsid=480&venue=Oort, 22 June 2017.
6 N. Ide and L. Romary, Linguistic Annotation Framework, 2012, http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=37326, 22 June 2017.
7 F. de Vree, 'Using social co-occurrence networks to analyze biblical narrative', 2016, https://github.com/Fred-Erik/social-biblical-networks, 22 June 2017.
8 See K. Elliger and W. R. Rudolph, eds., Biblia Hebraica Stuttgartensia, 5th corrected edition (Stuttgart, 1997), www.bibelwissenschaft.de/online-bibeln/biblia-hebraica-stuttgartensia-bhs/lesen-im-bibeltext/, 22 June 2017.
9 See Roorda and van den Heuvel for an early formulation of the idea of queries-as-annotations; D. Roorda and C. M. J. M. van den Heuvel, 'Annotation as a New Paradigm in Research Archiving', Proceedings of ASIS&T 2012 Annual Meeting. Final Papers, Panels and Posters, 2012, http://arxiv.org/abs/1412.6069, 22 June 2017.
10 G. J. Kalkman, Verbal Forms in Biblical Hebrew Poetry: Poetical Freedom or Linguistic System?, PhD thesis, VU University (Amsterdam, 2015), https://shebanq.ancient-data.org/tools?goto=verbsystem.
11 J. W. Dyk, O. Glanz and R. Oosting, 'Analysing Valence Patterns in Biblical Hebrew: Theoretical Questions and Analytic Frameworks', Journal of Northwest Semitic Languages, 40 (2014), 43–62, https://shebanq.ancient-data.org/shebanq/static/docs/methods/2014_Dyk_jnsl.pdf.
12 See Roorda, Naaijer, Kalkman, & van Cranenburgh for initial examples; D. Roorda, M. Naaijer, G. J. Kalkman and A. van Cranenburgh, 'LAF-Fabric: a data analysis tool for Linguistic Annotation Framework with an application to the Hebrew Bible', Computational Linguistics in the Netherlands Journal, 4.4 (2015), preprint http://arxiv.org/abs/1410.0286.
13 Indeed, using LAF-Fabric requires programming skills. It is a Python package that gives streamlined access to the Hebrew Text Database. A beginner's course in Python is enough to get started. Another, even more computationally intensive, example is the quest for parallel passages in the Bible. This is part of the Syntactic Variation project, carried out by a team of (PhD) researchers at the ETCBC. To see what is at stake here, see R. Rezetko and M. Naaijer, 'An Alternative Approach to the Lexicon of Late Biblical Hebrew', Journal of Hebrew Scriptures, 16.1 (2016), www.jhsonline.org/Articles/article_213.pdf.
14 N. Winther-Nielsen and C. Tøndering, Bible Online Learner, n.d., http://www.bibleol.3bmoodle.dk/, 22 June 2017.
15 SHEBANQ is meant as a service to publish queries for the academic record. DANS, as a national research data archive, is capable of archiving the database as a whole. It is also possible to store the data on Github, and preserve a snapshot of the repository to Zenodo, a service of CERN to preserve repositories for the academic record.
16 E. Talstra, C. J. Sikkel, O. Glanz, R. Oosting and J. W. Dyk, Text Database of the Hebrew Bible, 2012, http://www.persistent-identifier.nl/?identifier=urn:nbn:nl:ui:13-ukhm-eb; W. T. van Peursen and D. Roorda, Hebrew Text Database in Linguistic Annotation Framework, 2014, PID: urn:nbn:nl:ui:13-048i-71, http://www.persistent-identifier.nl/?identifier=urn:nbn:nl:ui:13-048i-71; W. T. van Peursen and D. Roorda, Hebrew text database ETCBC4b. Dataset available online at Data Archiving and Networked Services, Den Haag, 2015, dx.doi.org/10.17026/dans-z6y-skyh.
17 C. Shirky, Here Comes Everybody: The Power of Organizing Without Organizations (London, 2012).
18 F. Pérez and B. E. Granger, 'IPython: a System for Interactive Scientific Computing', Computing in Science and Engineering, 9.3 (2007), 21–29, http://ipython.org, ISSN: 1521-9615, DOI: 10.1109/MCSE.2007.53.
19 Text-Fabric: Data model, file format and processing tool for annotated texts. https://github.com/ETCBC/text-fabric/wiki.
20 Text-Fabric-Data: Text and Annotations of the Hebrew Bible and the Greek New Testament. Includes documentation of the annotation features. https://etcbc.github.io/text-fabric-data/.
work_6oploxdahfdunbtpt5dn4xsorm ----
BOOK REVIEW
Web 25. Histories from the first 25 years of the world wide web. Niels Brügger (editor). Peter Lang: New York, Bern, Berlin, Brussels, Frankfurt am Main, Oxford, Vienna, 2017. XXVI, 258 pp., num. ill. ISBN: 978-1-4331-4065-5. https://doi.org/10.3726/b114925. Series: Digital Formations. Recommended retail price: €44.20.
Peter Mechant. Published online: 25 February 2019. © Springer Nature Switzerland AG 2019.
Web 25. Histories from the first 25 years of the world wide web, edited by Niels Brügger, is part of the Digital Formations series. Brügger's collection takes seriously the challenge of placing the (hi)story of the World Wide Web in a critical perspective. The volume seeks to encourage those working in web archiving, internet studies or web historiography to undertake innovative, cross-disciplinary research. The editor has brought together authors who collectively have contributed to a book that is a valuable addition to the emerging scholarship surrounding the study of the web and the web's history. The book, divided into four sections, comprises 'a number of probes into the vast and multifaceted past of the web' (xi). However, the volume is neither designed to be exhaustive, nor comprehensive. It is broad in scope in relation to a number of aspects: (i) its variety of topics, (ii) its combination of case studies and methodological reflections and (iii) the compilation of chapters focusing on national as well as international WWW phenomena. The first section of the book, aptly entitled 'The early web', includes four chapters that focus on the history leading up to the emergence of the World Wide Web, including how the web was narrated and understood in the early years. Brügger's own contribution provides a brief history of the hyperlink.
It argues that the hyperlink is part of the latest phase in the history of how segments of text are deliberately and explicitly connected to each other by the use of specific textual and media features. The second chapter by Natale & Gory focuses on 'the particular imaginary hidden behind the story of the emergence and development of the World Wide Web' (30). Drawing on sources such as Tim Berners-Lee's autobiography and other web histories, the authors show how the story of the web follows the pattern of Campbell's monomyth, with the hero in different stages (departure, initiation, return and reintegration of the hero). Natale & Gory show how narratives about the 'birth' of the world wide web played an important role in shaping the public's imagination towards elements such as plurality, openness and creativity, while demonstrating that these 'biographies' of the web function as fields through which understandings of the web are constructed, reproduced and communicated. Next, Deken describes the first years of the Stanford Linear Accelerator Center website, the first www site outside of Europe. The section ends with a descriptive discourse analysis by Barry deconstructing the language around the early web and examining how the web entered general public discourse. The second section of the book contains three chapters each of which tells the story of a cultural phenomenon on the web in three different national settings; China, Italy and Australia. Hockx-Yu shows that the Western characterization of the web in China as nothing but censorship and repression does not do justice to the rich social and cultural significance that the web has there today. Next, Locatelli discusses the technological, economic, institutional and cultural dimension of early Italian blogs and identifies three phases in their history: 1999/2000–2003 (early bloggers with the first blogs); 2003–2006 (the success of Splinder, the first Italian blogging platform); 2006–2008 (when Google redesigned Blogger, and the blogosphere went mainstream). The third and last national setting is Australia; here, Nolan explores the creation of the Age Online, the first major newspaper website in Australia launched in February 1995. Drawing from his own experience working as a journalist on the newspaper (not the website) and from interviews with five key actors in the creation of the Age's website, he notes that in 1995, forward-looking newspaper executives could already see the threat that internet advertising posed to the press and that, at first, online advertisements were described on the website as a free service to readers. While the first two sections of Web 25 focus on web history and bring detailed accounts of specific historical examples, the discussion of various methods of web historiography in the third section titled 'Methodological reflections' may be more valuable to digital scholars. First, Weber sets out to consider key research problems that researchers had in the past. His chapter highlights three specific research challenges for working with web data today: (i) size and time dimensions of research, (ii) reliability and validity of web data, and (iii) ethical research questions.
In the next chapter, Helmond takes a historical perspective on the changing composition of a website, considering the website as an ecosystem, through which we can analyse the larger techno-commercial configurations that it is embedded in. In her fascinating contribution, she develops a novel methodological approach by repurposing the browser add-on Ghostery to detect trackers in archived websites and to reconstruct the historical tracking ecologies the New York Times (NYT) website has been embedded in. Finally, Chakraborty & Nanni use websites as primary sources to trace and examine activities of scientific institutions through the years. Somewhat surprisingly they conclude that these institutions' websites, traditionally viewed as authoritative and top down, have become key in interactive, multidirectional communication channels between museums and their visitors. The book's fourth and final section discusses 'Web archives as [a] historical source' and discusses the impact of web preservation on web historiography. Webster takes a closer look at the cultural history of the web archiving movement, investigating why, by whom and on whose behalf web archiving is done. This is important. It '[…] serves to orient users as to some of the questions they should be asking of [about] their sources, and of the institutions that provide them' (176) as web archiving constitutes an interplay between the interests of 3 key stakeholders: libraries, owners of content (in particular established media companies) and end users. The section continues with Koerbin presenting short case studies of web artefacts from the National Library of Australia's PANDORA archive, reflecting upon research issues related to early web content. He uses the framework of taphonomy, the branch of palaeontology that studies decaying organisms and their processes of fossilization, and argues that 'web archives present artefactual evidence for the digital archaeologist that also comes with biases resulting from the processes that led to the objects being removed from the "living" web to be held in the digital archaeological locus of the web archive' (205). In 'Looking back, looking forward. 10 years of development to collect, preserve, and access the Danish web', Laursen & Moldrup-Dalum do just this from three perspectives: legal, technical and curatorial. In line with Koerbin, they contend that a web archive's history is pertinent to all users of the archive, in particular, it is relevant in order to evaluate it as a source. They also stress the importance of data mining skills and supporting systems when looking at, or working with web archives. They extensively describe their multiple-method approach. As such, they demonstrate that the so-called 'computational turn' in humanities and social science – the increased incorporation of advanced computational research methods and large datasets into disciplines which have traditionally dealt with considerably more limited collections of evidence – indeed requires new skills and new software. The final chapter of Web 25, written by Paloque-Berges, deals nicely with records of computer-mediated communications (CMC) and, in particular, with the Usenet archives, which have not, as yet, become the focus of institutions' appraisal process of web archiving, despite the fact that this aspect of the web can function as a critical environment for building and studying the heritage value of CMC.
Web 25 has several important merits. Firstly, it emphasizes how a set of fundamental web features such as http, html and the hyperlink have transcended time and still function as the 'nuts and bolts' of the web. The book also shows that the discourse on the history of the web follows the recurring pattern of heroic narratives. Secondly, it demonstrates that web culture is not necessarily by definition a uniform, globalizing phenomenon, but that it can have surprisingly local characteristics. It shows that there is no one single and fixed history of the web, but rather, there are multiple local, regional and national webs and a variety of ways that the world wide web has been imagined, used, shaped and regulated. Thirdly, by providing methodological reflections on web archiving, the book emphasizes that all web archives, to a greater or lesser degree, can only attempt comprehensiveness, and that the processes involved in harvesting and preserving content from the live web involve biases resulting from technical, resource and curatorial constraints. In this way, it offers an important point of departure for further critical examination of the web and its history. Finally, the book offers valuable and realistic starting points for further methodological development. Not only does it point to some novel and out-of-the-box methods, such as repurposing the browser add-on Ghostery to detect trackers in archived websites, or utilizing the framework of taphonomy to consider how certain web archives came about, but it also illustrates how various methodological approaches can be applied. If Web 25 has any shortcomings, it is that, in a few instances, the book leaves the reader in the dark about the overall context. I would have welcomed a timeline or an infographic, showing an overview of important events or websites in the history of the web, in order to contextualize the issues discussed in the various chapters better. Secondly, sometimes the book might have needed a more developed methodology: for example, archiving social media content is hardly discussed. This is quite problematic, as the methods for preserving digital artefacts are currently not up to the challenge of preserving what happens on social networks. Hence, archivists and memory organizations will need to develop new methodologies in order to probe and document social networks, such as Twitter or Facebook, in order to accurately capture what it is like to live online today and to understand these algorithmic systems. Thirdly, to conclude, the book could have benefited from some more editing and proofreading work, especially in terms of internal cross-referencing. Although this is a shortcoming seen in many edited books, it is a shame that more effort was not made to textually link the individual chapters; as a result, the book reads as a looser mix of the various probes into the vast and multifaceted past of the web that it ultimately presents. Web 25 provides a critical and thoroughly documented guide to understanding the first 25 years of the web and is a noteworthy contribution to the field of web historiography. It is a well-written and accessible contribution to an expanding field. Throughout the book, I found a clear analysis of the history of specific websites and methodological reflection, founded on well-selected sources.
This makes it a must-read both for web historians, academics and cultural heritage professionals involved in web archiving, and for a wider audience with an interest in Web history.
work_6q6s76vinzcprhoteoqw6vgn7a ----
The Importance of Pedagogy: Towards a Companion to Teaching Digital Humanities
Hirsch, Brett D. brett.hirsch@gmail.com University of Western Australia
Timney, Meagan mbtimney.etcl@gmail.com University of Victoria
The growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn about the sorts of "transferable skills" and "applied computing" that digital humanities offers (Jessop 2005), and the desire of practitioners to consolidate and validate their research and methods. We propose a volume, Teaching Digital Humanities: Principles, Practices, and Politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. We plan to structure the volume according to the four critical questions educators should consider as emphasized recently by Mary Breunig, namely:
- What knowledge is of most worth?
- By what means shall we determine what we teach?
- In what ways shall we teach it?
- Toward what purpose?
In addition to these questions, we are mindful of Henry A. Giroux's argument that "to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak" (45). Consequently, we will encourage submissions to the volume that address these wider concerns.
References
Breunig, Mary (2006).
'Radical Pedagogy as Praxis'. Radical Pedagogy. http://radicalpeda gogy.icaap.org/content/issue8_1/breunig.ht ml. Giroux, Henry A. (1994). 'Rethinking the Boundaries of Educational Discourse: Modernism, Postmodernism, and Feminism'. Margins in the Classroom: Teaching Literature. Myrsiades, Kostas, Myrsiades, Linda S. (eds.). Minneapolis: University of Minnesota Press, pp. 1-51. http://radicalpedagogy.icaap.org/content/issue8_1/breunig.html http://radicalpedagogy.icaap.org/content/issue8_1/breunig.html http://radicalpedagogy.icaap.org/content/issue8_1/breunig.html Digital Humanities 2010 2 Schreibman, Susan, Siemens, Ray, Unsworth, John (eds.) (2004). A Companion to Digital Humanities. Malden: Blackwell. Jessop, Martyn (2005). 'Teaching, Learning and Research in Final Year Humanities Computing Student Projects'. Literary and Linguistic Computing. 20.3 (2005): 295-311. McCarty, Willard, Kirschenbaum , Matthew (2003). 'Institutional Models for Humanities Computing'. Literary and Linguistic Computing. 18.4 (2003): 465-89. Unsworth et al. (2006). Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences. New York: American Council of Learned Societies. work_6rnmxllsrjeuxlm4ytewedpzoi ---- [PDF] An assessment of the realism of digital human manikins used for simulation in ergonomics | Semantic Scholar Skip to search formSkip to main content> Semantic Scholar's Logo Search Sign InCreate Free Account You are currently offline. Some features of the site may not work correctly. DOI:10.1080/00140139.2015.1038306 Corpus ID: 21973707An assessment of the realism of digital human manikins used for simulation in ergonomics @article{Nrot2015AnAO, title={An assessment of the realism of digital human manikins used for simulation in ergonomics}, author={A. N{\'e}rot and W. Skalli and X. Wang}, journal={Ergonomics}, year={2015}, volume={58}, pages={1897 - 1909} } A. Nérot, W. Skalli, X. Wang Published 2015 Medicine, Engineering Ergonomics In this study, the accuracy of the joint centres of the manikins generated by RAMSIS and Human Builder (HB), two digital human modelling (DHM) systems widely used in industry for virtual ergonomics simulation, was investigated. Eighteen variously sized females and males were generated from external anthropometric dimensions and six joint centres (knee, hip and four spine joints) were compared with their anatomic locations obtained from the three-dimensional reconstructed bones from a low-dose X… Expand View on Taylor & Francis sam.ensam.eu Save to Library Create Alert Cite Launch Research Feed Share This Paper 5 Citations View All Topics from this paper Manikins science of ergonomics 5 Citations Citation Type Citation Type All Types Cites Results Cites Methods Cites Background Has PDF Publication Type Author More Filters More Filters Filters Sort by Relevance Sort by Most Influenced Papers Sort by Citation Count Sort by Recency A Markerless Method for Personalizing a Digital Human Model from a 3D Body Surface Scan G. Beurier, Xiaolin Yao, Y. Lafon, X. Wang Computer Science 2015 1 PDF Save Alert Research Feed A principal component analysis of the relationship between the external body shape and internal skeleton for the upper body. A. Nérot, W. Skalli, X. Wang Engineering, Medicine Journal of biomechanics 2016 7 PDF Save Alert Research Feed Interactive tools for safety 4.0: virtual ergonomics and serious games in real working contexts A. Lanzotti, A. Vanacore, +8 authors S. 
Papa Computer Science, Medicine Ergonomics 2019 3 Save Alert Research Feed Interactive Tools for Safety 4.0: Virtual Ergonomics and Serious Games in Tower Automotive A. Lanzotti, A. Tarallo, +6 authors S. Papa Computer Science 2018 3 Save Alert Research Feed In-Vehicle Driving Posture Reconstruction from 3D Scanning Data Using a 3D Digital Human Modeling Tool Junyi Chen, B. Jiang, S. Song, Hongyan Wang, Xuguang Wang Computer Science 2016 1 Save Alert Research Feed References SHOWING 1-10 OF 31 REFERENCES SORT BYRelevance Most Influenced Papers Recency External and internal geometry of European adults S. Bertrand, W. Skalli, Laurent Delacherie, D. Bonneau, G. Kalifa, D. Mitton Computer Science, Medicine Ergonomics 2006 12 View 3 excerpts, references methods and background Save Alert Research Feed Standardisation of digital human models G. Paul, S. Wischniewski Engineering, Medicine Ergonomics 2012 18 View 1 excerpt, references background Save Alert Research Feed Validation of a Model-based Motion Reconstruction Method Developed in the REALMAN Project X. Wang, N. Chevalot, G. Monnier, Sergio Ausejo, Angel Suescun, J. T. Celigueta Computer Science 2005 27 View 1 excerpt, references methods Save Alert Research Feed A reference method for the evaluation of femoral head joint center location technique based on external markers. H. Pillet, M. Sangeux, J. Hausselle, R. El Rachkidi, W. Skalli Computer Science, Medicine Gait & posture 2014 29 PDF View 2 excerpts, references background Save Alert Research Feed Fast 3D reconstruction of the lower limb using a parametric model and statistical inferences and clinical measurements calculation from biplanar X-rays Y. Chaibi, T. Cresson, +5 authors W. Skalli Mathematics, Medicine Computer methods in biomechanics and biomedical engineering 2012 172 View 1 excerpt, references methods Save Alert Research Feed Effects of hip center location on the moment-generating capacity of the muscles. S. Delp, W. Maloney Mathematics, Medicine Journal of biomechanics 1993 158 View 1 excerpt, references background Save Alert Research Feed ESTIMATION OF EXTERNAL AND INTERNAL HUMAN BODY DIMENSIONS FROM FEW EXTERNAL MEASUREMENTS S. Bertrand, I. Kojadinovic, W. Skalli, D. Mitton Mathematics 2009 2 View 5 excerpts, references methods and background Save Alert Research Feed Development of Computerized Human Static Strength Simulation Model for Job Design D. Chaffin Computer Science 1997 104 PDF View 1 excerpt, references background Save Alert Research Feed Musculoskeletal computational analysis of the influence of car-seat design/adjustments on long-distance driving fatigue M. Grujicic, B. Pandurangan, X. Xie, A. Gramopadhye, David W. Wagner, M. Ozen Engineering 2010 69 PDF View 1 excerpt, references methods Save Alert Research Feed Analysis of musculoskeletal systems in the AnyBody Modeling System M. Damsgaard, J. Rasmussen, S. Christensen, E. Surma, M. D. Zee Engineering, Computer Science Simul. Model. Pract. Theory 2006 622 PDF View 1 excerpt, references methods Save Alert Research Feed ... 1 2 3 4 ... Related Papers Abstract Topics 5 Citations 31 References Related Papers Stay Connected With Semantic Scholar Sign Up About Semantic Scholar Semantic Scholar is a free, AI-powered research tool for scientific literature, based at the Allen Institute for AI. 
work_6seporrc6rafjcqugk5zpbfg3y ----
Digital human modeling (DHM) for improving work environment for specially-abled and elderly
SN Applied Sciences (2019) 1:1326 | https://doi.org/10.1007/s42452-019-1399-y. Review Paper.
Charu M. Maurya · Sougata Karmakar · Amarendra Kumar Das. © Springer Nature Switzerland AG 2019.
Received: 29 May 2019 / Accepted: 1 October 2019 / Published online: 4 October 2019. * Charu M. Maurya, mauryacm@iitg.ac.in | Department of Design, Indian Institute of Technology, Guwahati, Assam, India.
Abstract
Digital human modeling (DHM), a technique of simulating human interaction with the product or workplace in a virtual environment, is gaining popularity. This virtual evaluation process is useful in developing user-centered products by incorporating human factor principles at an early design phase, which reduces the design time and improves quality. Application of DHM has gained attention in the design process of the manufacturing industry, agriculture, healthcare sectors, transportation and aviation sectors, etc. However, the use of DHM for designing ergonomic products and work environments for the specially-abled and elderly is quite limited. This, otherwise, is more important as their real-life participation in experiments pertaining to the ergonomic evaluation of any product, workplace or public facility may cause discomfort to them. Moreover, improved products or workplaces reduce their dependence on others and enable active involvement in work, communication, and social life. Therefore, an attempt has been made in this paper to explore the state-of-the-art literature on the applications of DHM-based virtual ergonomic approaches to improve product and workplace designs for the specially-abled/elderly. The paper also proposes a way forward to continue research and developmental activities towards the betterment of the quality of life of the elderly and specially-abled persons through proactive and inclusive design strategies.
Keywords Digital human modeling (DHM) · Ergonomic design · Specially-abled · Elderly
1 Introduction
The ergonomic design of the work environment reduces postural stress, improves organizational productivity, enhances job satisfaction, and results in a better quality of work-life [1, 2]. With the advent of virtual ergonomics, user-centered workplaces and/or products are being developed and tested in a virtual environment at an early design phase. Human modeling software enables designers to simulate human–workplace interaction by inserting a digital human model in the CAD-generated work environment. The CAD-based virtual ergonomics evaluation process shortens design time, lowers development cost, improves quality and enhances productivity [3].
1.1 Digital human modeling (DHM) software
Digital human modeling software is a computer-aided design tool for the construction of 2D and 3D human models from anthropometric data of targeted users/populations for ergonomic analysis of virtual human fit to virtual workstation components [4]. Any design or work environment can be evaluated from an ergonomics perspective using virtual simulation before making the real physical prototype [5]. The study of digital prototypes in a virtual environment reduces the developmental cost and design time [6]. The correlation between the results of DHM simulation and real-life assessment is fairly high [7]. A few popular DHM software packages that are commercially available include JACK, SAMMIE, RAMSIS, DELMIA, SANTOS, etc.
1.2 Application of DHM software for improving the work environment
Digital human modeling software has gained attention for proactive design and ergonomic evaluation of products and workplaces in diverse fields that include the manufacturing industry, healthcare sectors, transportation, agriculture, defense research and development, aerospace-aviation sectors and so on. In industrial workplaces, DHM has been applied for improving the designs of work cells in car manufacturing plants [8], designing of small fishing vessels to reduce work-related musculoskeletal disorders of fishermen [9], redesigning of work accessories for minimizing awkward postures in an Indian shop floor workstation [10], and workplace evaluation of the coir industry [11]. The ergonomic analysis of refrigerated cabinets [12], shoe rack [13], adjustable walking cane [14], modified cycle rickshaw [15], improved load carrier for coolies [16], improved design of a wearable load-assisting device [17], etc. are a few examples where DHM has been applied effectively. In the healthcare sector, DHM has been successfully utilized for improving laparoscopic surgery [18], evaluation of bathing system design for patients [19], patient lifting devices for healthcare personnel [20], etc. DHM also finds its application in the fields of transportation, aviation, and aerospace, viz. vision analysis of pilots in jet aircraft [21], evaluation of cockpit design [22], vehicle interior design [23], and evaluation of seat belts, driver posture and comfort in vehicles [24, 25]. Inclusion of DHM in the design process of agricultural tools and machinery [26–28] is also gaining popularity.
1.3 Need of DHM application for improving the work environment of the specially-abled and elderly
Following the literature review, it is evident that a large number of research and developmental activities have been carried out by applying DHM in product and workplace design for military personnel, automobile drivers, healthcare professionals and for general civilian populations. Research on DHM applications in the design and development of products for specific population subgroups like elderly and specially-abled persons has received less attention. Thus, there is scope for need-based design of products and support systems for such specific sub-populations by taking advantage of DHM technologies. A user-centric work environment or ergonomic products are an utmost necessity for the specially-abled or elderly as they encounter various barriers like inadequate policies and standards, negative attitudes, lack of provision of services and problems with service delivery, etc. [29].
Specially-abled employees who work in uncomfortable workplaces have also complained of mobility trouble, problems associated with the heart and blood circulation, depression, etc. [30]. They also face difficulties while traveling and using public transport [31], which prevents their participation in social and work life [29]. Apart from that, specially-abled persons also encounter other disparities like lower average pay, job insecurity, lack of training facilities and exclusion from decision-making [32]. Hence, convenient housing and adequate support services should be provided to specially-abled or elderly people [33]. Moreover, planners, designers, and architects should adopt universal and inclusive design approaches to remove obstacles in accommodation, transportation and communication, to empower the specially-abled to participate independently and comfortably in education, employment and social life [29].

In the scenario described above, virtual prototyping with DHM could enable the designer to evaluate the product and modify it by simulating the interaction of a digital manikin with the CAD model of the product [34]. Digital-manikin-based virtual testing of the product–user interface also reduces discomfort for the elderly and specially-abled persons by eliminating their actual participation in real experiments of physical compatibility evaluation.

2 Aim

The aim of the current paper is to explore the state-of-the-art literature on applications of DHM-based virtual ergonomic approaches in improving product and workplace designs for the specially-abled/elderly. The paper also proposes a way forward to continue research and developmental activities towards the betterment of the quality of life of the elderly and specially-abled persons through proactive and inclusive design strategies.

3 Results

Papers published in peer-reviewed English journals and conference proceedings using DHM for product or workplace analysis and having the specially-abled or elderly as participants were considered for the present review. Studies pertaining to anthropometric databases of the elderly and specially-abled for making their digital manikins were identified from the available literature. Areas where DHM has been used for solving design-related issues of the specially-abled and elderly, such as industry, public places, sports, product design and healthcare, were also explored.

3.1 Anthropometric database of the specially-abled and elderly for DHM

Anthropometry is one of the important aspects of DHM-based product/workplace design and evaluation. Different anthropometric databases are incorporated in DHM software to get the digital manikin for the targeted population in the simulation process [35]. Fourteen (14) studies dealing with the creation of anthropometric databases of the elderly and specially-abled have been included in the present review. D'souza et al. [36] created a functional reach database of 320 users of wheeled mobility devices using DHM software and applied it for calculating forward and sideways vertical reach ranges. A model to predict the reach envelope of a digital human model, based on data collected from elderly subjects, was also proposed [37] (a minimal illustrative sketch of these two recurring ingredients, percentile sizing and a reach envelope swept through a joint range of motion, is given below).
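To make the computational role of such databases concrete, the following minimal sketch (an illustration added for this review, not code from any of the cited tools; all means, standard deviations, segment lengths and the restricted range of motion are invented) shows the two recurring ingredients: a percentile dimension computed from population statistics under a normality assumption, and a planar reach envelope obtained by sweeping a two-link arm through a measured joint range of motion.

```python
import math

# Standard-normal scores for common design percentiles.
Z = {5: -1.645, 50: 0.0, 95: 1.645}

def percentile_value(mean, sd, p):
    """Dimension value at percentile p, assuming a normal distribution."""
    return mean + Z[p] * sd

# Hypothetical elderly-population stature statistics (cm).
stature_p5 = percentile_value(mean=158.0, sd=6.5, p=5)

def reach_envelope(upper_arm, forearm_hand, shoulder_rom_deg, step=5):
    """Outer boundary of a crude planar reach envelope for a seated manikin:
    sweep a fully extended two-link arm through the shoulder range of motion
    (e.g., a restricted range recorded for an elderly user group)."""
    reach = upper_arm + forearm_hand
    points = []
    lo, hi = shoulder_rom_deg
    for deg in range(lo, hi + 1, step):
        a = math.radians(deg)
        points.append((round(reach * math.cos(a), 1),
                       round(reach * math.sin(a), 1)))
    return points

# Restricted shoulder flexion of -30..+90 degrees instead of a nominal range.
print(f"5th-percentile stature: {stature_p5:.1f} cm")
print(reach_envelope(upper_arm=28.0, forearm_hand=40.0,
                     shoulder_rom_deg=(-30, 90))[:3])
```

Real DHM packages replace both steps with measured multivariate data and full-body kinematic chains, but the principle (population statistics in, posable geometry out) is the same.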
Another human modeling software package, HADRIAN (human anthropometric data requirement investigation and analysis), was developed [38–43], for which anthropometric data, range-of-motion data, reach ranges and data regarding the ability to perform kitchen-based activities of daily life were recorded from 102 respondents. Elderly people having age-related impairments and specially-abled persons (covering a wide range of disability) were selected as respondents. The HADRIAN software was validated by comparing the results of HADRIAN manikins with actual users performing tasks such as retrieving a ticket from a machine, using an ATM to obtain cash or using the lift at a railway station [44]. Hogberg et al. [45] developed digital human models of the elderly by modifying anthropometric and joint range-of-motion data. Anthropometric data of the elderly and their caregivers were also collected [46] to develop a 3D digital interactive work environment through which the movements of caregivers of the elderly can be studied and improved through training. Chaffin [47] recorded 37,000 motions from people in the age group of 18–78 years for developing a database of human motion prediction models. The motions of people while reaching for and moving light to moderate objects, in either seated or standing postures, were recorded. The models can help to predict the various motor behavioral strategies adopted by different groups of people within a virtual workplace. Motion capture technology to create digital human models of specially-abled people was also applied [48, 49].

3.2 DHM application for improving industrial workplace

Earlier researchers have reported a few studies where DHM has been applied for improving the workplace of the specially-abled and elderly. Aubry et al. proposed an approach for the ergonomic analysis of a specially-abled person's workplace. In this approach, a gesture-based description of the workplace, a virtual environment and the modeling of disability as motion constraints were utilized. Thus, motions of the specially-abled person affected by his/her disability were generated and an ergonomic analysis of the workplace (3D CAD model) was performed [50]. A discomfort model of the climbing task for a specially-abled worker with a prosthetic limb was also developed. The model was validated by simulating a ladder-climbing task with the model of a digital under-knee prosthesis wearer, and the results were compared with statistical data from the observed experiment [51]. Kaklanis et al. introduced a new virtual modeling technique comprising virtual user models, task models, and simulation models as core components of a simulation module, expressed in UsiXML format. Here, the virtual user model described a virtual user with disabilities, the simulation model described the product/services to be tested, and the task model described the complex tasks of the user. The effectiveness of the proposed framework was evaluated virtually by examining the accessibility of a workplace to five virtual user models with different disabilities [52]. In one study, a digital manikin of a wheelchair user was developed and interfaced with the simulated workplace environment. The work environment was evaluated and all modifications were incorporated virtually [53]. In another study, immersive virtual reality technology was used to assess the modified workplace, with a person with a disability accessing the virtual reality environment.
The specific behavior of the physically handicapped was studied in a virtual environment by integrating task and physical disability constraints. This model considered three levels of constraints, namely appearance (broken arm or amputation), kinematics (inaccurate pointing or reduced degrees of freedom of joints) and physical (strength limits), that affect the motion and posture of the physically handicapped in task performance [54]. An office environment for the elderly and disabled was evaluated using DHM software in a simulated environment and in a real scenario with actual users. Both evaluations gave similar results regarding accessibility features in the office design [55]. A DHM-based inclusive design strategy was adopted in a study evaluating a furniture manufacturing assembly for an elderly worker having joint mobility constraints. A human model based on joint mobility data of an elderly worker from the HADRIAN database was developed. The posture adopted by workers in the furniture assembly environment was replicated virtually on the human model of the elderly worker to assess the acceptability of the posture [56].

3.3 DHM application for public utilities

A few reported studies have illustrated the use of DHM for making public utilities comfortably accessible to the specially-abled. Li et al. examined the interaction behavior of wheelchair users with an ATM machine in a virtual environment using DHM and an immersive virtual reality technique. Comfort analysis of the digital manikin was performed in DHM software. EMG measures of a person using the ATM with the immersive virtual reality technique were also recorded. Both sets of results were compared with the subjective responses of real respondents [57]. In another study, the design of the ATM machine was also virtually evaluated and modified by studying its interaction with the digital human model of a wheelchair user from the HADRIAN database [43].

A pedestrian simulation model to make a barrier-free environment for the elderly and specially-abled at public buildings is being developed [58]. Such a simulation model of public places will be able to evaluate the visibility of the guidance system and reveal the areas not visible to the specially-abled [59].

3.4 DHM application in healthcare sectors

The healthcare sector is also taking advantage of DHM technology for designing prostheses, exoskeletons and assistive aids. Morotti et al., Colombo et al. and Colombo et al. created two digital human models, for a transtibial amputee and a transfemoral amputee. Their gait was simulated for analyzing causes of gait deviations related to prosthesis set-up and socket modeling [60–62] (Fig. 1). A system for designing sockets for lower limb prostheses was also developed. This system designs sockets based on the patient's weight, lifestyle, tonicity level and geometry of the residuum [63]. A virtual prototype of the intelligent bionic leg (IBL), an advanced trans-femoral prosthesis, was developed and virtually evaluated by Xie et al. [64].

For designing exoskeletons, musculoskeletal analysis of an upper limb exoskeleton in a simulated environment was performed [65]. In this way, exoskeleton designs can be biomechanically evaluated before making an actual prototype. To analyze the effect of the strap that connects the exoskeleton with the human body, a combined human–exoskeleton model was developed and evaluated in a simulated environment [66].
Virtual prototyping of a rehabilitation exoskeleton by merging computational musculoskeletal analysis with simulation was also proposed [67]. In the proposed framework, an exoskeleton–limb musculoskeletal model is developed first and then its performance is assessed using biomechanical, morphological and controller parameters. These parameters are optimized for developing the virtual design. A virtual experiment is then carried out to generate a modification in the design if required. The application of such a framework was illustrated by developing an index finger exoskeleton prototype.

In the area of assistive aids, a modified DHM tool was applied for the ergonomic evaluation of a bathing system design from the caretakers' (elderly) and caregivers' points of view. The most suitable bathing posture was also defined. Anthropometrics, joint range of motion, description, and appearance were customized for developing manikins of the elderly. RULA and joint comfort values were used to evaluate the bathing system design [19] (a simplified, purely illustrative sketch of such joint-angle scoring is given at the end of Sect. 3.5 below). A walker with sit-stand assistance for the elderly was developed and virtually evaluated on the human model before experimenting on real users [68]. A sit-to-stand and mobility assistance device for the elderly was also developed and virtually evaluated by Khan et al. [69]. Out of numerous case examples, two prominent usages of DHM, in the design of a prosthesis for a person with an amputated leg and of assistive aids for the elderly and specially-abled, have been depicted in Fig. 1 for the easy understanding of the readers.

Fig. 1 DHM application in a prosthesis design, from Colombo et al. [62], and b assistive aids for the elderly and specially-abled, from Khan et al. [69]

Application of virtual humans for assisted healthcare is also evolving. A virtual human with characteristics like speech recognition, natural vision, and language allows a human user to interact with their computer in a more natural way. The virtual human monitors the type of user, such as an elderly or disabled person, records data from sensors and communicates the data to healthcare professionals. Currently, virtual patient technology has been applied for mental health diagnosis and clinical training [70]. Kakizaki et al. applied digital human modeling to physiotherapy education for 3D visualization and analysis of the gait motion of normal and hemiplegic patients [71].

3.5 DHM application in sports

DHM was applied by Holmberg et al. for understanding the effect of an impairment on sports performance [72]. In their study, a musculoskeletal simulation of skiing was carried out on a digital manikin with a lower leg prosthesis. The influence of technique, fitness, and training was taken as negligible. Two full-body simulation models with identical anthropometric data were created, having similar kinematics and external kinetics. In order to assess the impact of the prosthesis on muscular work, one model was composed of a full muscle setup and the other lacked the muscle setup in the right lower leg and foot. A biomechanical simulation of cross-country skiing was performed on both manikins. The output was used for computing metabolic muscle work and skiing efficiency. The results indicated that without muscles in the leg and foot, skiing demanded more muscular effort in total.
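Computationally, the posture-acceptability checks used in several of these studies (e.g., RULA in [19]) come down to reading joint angles off the posed manikin and binning them into discrete risk scores. The sketch below illustrates only that general idea; the bands and the scoring combination are invented for the example and are deliberately not the published RULA tables, which should be consulted for any real assessment.

```python
# Simplified, illustrative posture scoring in the spirit of RULA-type
# assessments. Thresholds are hypothetical, NOT the published RULA tables.

def band_score(angle, bands):
    """Return the score of the first band whose upper limit covers angle."""
    for upper_limit, score in bands:
        if angle <= upper_limit:
            return score
    return bands[-1][1]

# (upper limit in degrees, score) -- invented bands for illustration.
UPPER_ARM_BANDS = [(20, 1), (45, 2), (90, 3), (180, 4)]
TRUNK_BANDS = [(5, 1), (20, 2), (60, 3), (180, 4)]

def posture_score(shoulder_flexion, trunk_flexion):
    """Combine per-joint scores; real methods use lookup tables instead."""
    return (band_score(abs(shoulder_flexion), UPPER_ARM_BANDS)
            + band_score(abs(trunk_flexion), TRUNK_BANDS))

# Angles read off a posed elderly manikin in a bathing-assistance task
# (made-up values): 70 deg shoulder flexion, 25 deg trunk flexion.
print(posture_score(shoulder_flexion=70, trunk_flexion=25))  # -> 3 + 3 = 6
```

The virtue of running such scoring on a digital manikin rather than a human subject is precisely the point made throughout this review: risky or uncomfortable postures can be screened out before any elderly or specially-abled participant is asked to adopt them.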
3.6 DHM for vehicle design and assessment

DHM has also gained the attention of researchers for designing and analyzing vehicles that can be comfortable for the specially-abled and elderly. A humanoid model was used for analyzing the techniques adopted by the elderly in getting into a car [73]. Vehicle egress strategies and constraints faced by the elderly were also evaluated using the RAMSIS bodybuilder [74]. A virtual modeling technique was used by Kaklanis et al. for evaluating car designs for hand brake and storage compartment use with a virtual model of an elderly person having a spinal cord injury [75]. Erdelyi et al. virtually evaluated the vibrational comfort level of a motorcycle ride with human models of able-bodied riders and people with different disabilities, using the root mean square (RMS) of acceleration and the vibration dose value (VDV); the RMS is the square root of the time-averaged squared acceleration, while the VDV is the fourth root of the time integral of the fourth power of acceleration, which makes it more sensitive to occasional shocks [76]. The suitability of a bus design for elderly and disabled people was also analyzed by Marshall et al. using digital manikins of the specially-abled and elderly from the HADRIAN database [77]. To provide a glimpse of the application of DHM software in vehicle design and evaluation targeting elderly and specially-abled people as the users, the current researchers have compiled images from various pieces of literature, presented in Fig. 2.

3.7 DHM application in other areas

A model for simulating motion according to age, incorporating age-related changes in gait pattern, kinematics, and kinetic values, has also been developed [78]. The model can be used for analyzing crowd behavior in a virtual environment. Wyk [79] worked on a methodology which can help to set up an open framework where a virtual human can be animated for visualizing sign language. In this method, any verbal language can be translated into a sign language in a machine translation system.

4 Future work

The participation of the specially-abled or elderly in the real-life design development and evaluation process might cause them discomfort. However, ergonomic design that suits their needs is very much required for minimizing the effects of the constraints caused by their disability. The review presents a holistic knowledge base regarding the applications of DHM technology by various researchers in improving the quality of life of elderly and specially-abled people. It is hoped that it will encourage researchers to develop and modify designs of products of everyday use for the aforesaid targeted users. DHM-based virtual ergonomics evaluations in the field of transport and of public utilities like hospital environments, markets, parks, temples, etc. are needed for modifying existing designs to be more inclusive, considering the specific requirements of the elderly and specially-abled persons, as these public places are frequently accessed by them. In the living environment, designs of bathrooms, toilets, furniture, etc. can also be evaluated, and design modifications can accordingly be performed to address any ergonomic issues. DHM can also be applied to design and evaluate assistive aids and prostheses to make them more user-friendly.

5 Discussion and conclusion

The present paper explores the areas where DHM has been applied for ergonomic evaluations and improving the work environment of elderly and specially-abled persons.
The paper highlights the efforts made by researchers in recording anthropometric and biomechanical data of specially-abled and elderly people and in developing software such as HADRIAN. DHM has been applied for evaluating and modifying workplaces, viz. the industrial workplace, public utilities, vehicle design, prosthesis design, etc., to make them compatible with the elderly and specially-abled. Specially-abled employees often encounter problems, viz. trouble with mobility, blood circulation, and depression, in the industrial workplace. DHM can be of immense help in solving the workplace-related problems of the specially-abled, as was done in the cases of office design [55] and furniture assembly [56]. Physical disability often hampers the movement of the specially-abled and elderly, and the absence of appropriate physical infrastructure discourages their interaction in the social environment. DHM can be effectively used for the virtual evaluation and improvement of the infrastructural facilities of public places like ATMs, railway stations, bus stops and public transportation. The current paper also highlights the research and development activities carried out by researchers in designing assistive aids, prostheses, and exoskeletons with the help of DHM and their successful evaluation in a simulated environment. In the automotive industry, suitable modifications in the design of the hand brake and storage compartment have been undertaken using DHM in order to make them comfortable for the specially-abled and elderly. Though to a limited extent, DHM has now been applied by researchers in the design process for making the work environment suitable for specially-abled and elderly people. The major bottleneck for the application of digital human modeling to product and facility design for the elderly and specially-abled is the unavailability of anthropometric and biomechanical (total range of motion and comfort range of motion of body joints) databases of the aforesaid populations. This is true for most countries all over the world. Hence, the need of the hour is to develop such databases for effective use of DHM software to provide a better quality of life to these targeted users. It is envisaged that the body of literature presented in the current review will encourage designers and engineers to use DHM-based ergonomic evaluation as an inclusive design approach for ensuring a barrier-free environment for the specially-abled and the elderly. Future research pertaining to applications of DHM software in developing manikins (with anthropometric and biomechanical data of the elderly and specially-abled population) for the ergonomic design and development of various products and facilities, as described in the current review, would improve the quality of life of the elderly and specially-abled persons.

Fig. 2 DHM application in vehicle design and evaluation for the elderly and specially-abled. Images adopted from a Kaklanis et al. [75], b Erdelyi et al. [76] and c Marshall et al. [77]

Compliance with ethical standards

Conflict of interest On behalf of all authors, the corresponding author states that there is no conflict of interest.

References

1. Afroz S, Haque MI (2017) Ergonomics in the workplace for a better quality of work life. In: Muzammil M, Khan AA, Farooq M, Hassan F (eds)
Proceedings of the 15th international conference on ergonomics for improved productivity (HWWE 2017). Excel Publishers, New Delhi, p 70
2. Easwaran N (2017) Ergonomic analysis and workplace evaluation of an assembly section to improve productivity. In: Muzammil M, Khan AA, Farooq M, Hassan F (eds) Proceedings of the 15th international conference on ergonomics for improved productivity (HWWE 2017). Excel Publishers, New Delhi, p 19
3. Naumann A, Roetting M (2007) Digital human modeling for design and evaluation of human machine systems. MMI Interaktiv 1:27–35
4. Chaffin DB (2005) Improving digital human modeling for proactive ergonomics in design. Ergonomics 48:478–491
5. Cappelli TM, Duffy VG (2006) Motion capture for job risk classifications incorporating dynamic aspects of work. In: Digital human modeling for design and engineering conference, Lyon, 4–6 July 2006. SAE International, Warrendale
6. Zhang X, Chaffin DB (2005) Digital human modeling for computer-aided ergonomics. In: Handbook of occupational ergonomics. CRC Press, London, pp 1–13
7. Fritzsche L (2010) Ergonomics risk assessment with digital human models in car assembly: simulation versus real life. Hum Fact Ergon Manuf 20:287–299. https://doi.org/10.1002/hfm.20221
8. Spada S, Germana D, Ghibaudo L, Sessa F (2013) Application and benefits of digital human models to improve the design of work cells in car's manufacturing plants according to international standards. In: Proceedings of the 11th international conference on manufacturing research (ICMR 2013). http://dspace.lib.cranfield.ac.uk/handle/1826/9490. Accessed 12 May 2017
9. Álvarez-Casado E, Zhang B, Sandoval ST, Pedro M (2016) Using ergonomic digital human modeling in evaluation of workplace design and prevention of work-related musculoskeletal disorders aboard small fishing vessels. Hum Fact Ergon Manuf 26:463–472. https://doi.org/10.1002/hfm.20321
10. Sanjog J, Baruah RL, Patel T, Karmakar S (2016) Redesign of work-accessories towards minimizing awkward posture and reduction of work cycle elements in an Indian shop-floor workstation. In: Rebelo F, Soares M (eds) Advances in ergonomics in design. Advances in intelligent systems and computing, vol 485. Springer, Cham
11. Satheeshkumar M, Krishnakumar K (2016) Ergonomic analysis of workstations in coir mat industries in Kerala using digital human modeling method. In: Bhardwaj A, Singh LP, Singh S (eds) Proceedings of the 14th international conference on humanizing work and work environment (HWWE 2016). GIAP Journals, pp 87–89
12. Colombo G, Ponti DG, Rizzi C (2009) Digital human modeling for ergonomic analysis of refrigerated cabinets. https://www.designsociety.org/publication/32297/digital_human_modeling_for_ergonomic_analysis_of_refrigerated_cabinets. Accessed 12 Mar 2018
13. Sanjog J, Karmakar S, Agarwal H, Patil CD (2012) Designing and ergonomic evaluation of a shoe-rack in CAD environment. Int J Comput Appl 49:38–41
14. Sahoo BB, Yein N, Pal S (2015) Design concept of adjustable cane-chair for elderly in virtual environment. In: Proceedings of the 13th international conference on humanizing work and work environment 2015 & community nutrition and health: a social responsibility. Patel Digital Printers, Mumbai, p 46
15. Mohapatra S (2015) Application of digital human modeling and work posture analysis of human powered cycle rickshaw pullers in Odisha, India—a case study.
In: Proceedings of the 13th international conference on humanizing work and work environment 2015 & community nutrition and health: a social responsibility. Patel Digital Printers, Mumbai, p 228
16. Shroff A, Mohapatra S (2015) Computer-aided ergonomics: design, development and ergonomic analysis of load carrier for an Indian coolie. In: Proceedings of the 13th international conference on humanizing work and work environment 2015 & community nutrition and health: a social responsibility. Patel Digital Printers, Mumbai, p 258
17. Sharma HK, Sharma AK, Singh A, Singhal P (2017) Digital human modeling using CATIA V5 for the analysis and ergonomic improvements in the design of wearable load assisting device for porters. In: Muzammil M, Khan AA, Farooq M, Hassan F (eds) Proceedings of the 15th international conference on ergonomics for improved productivity (HWWE 2017). Excel Publishers, New Delhi, p 31
18. Marcos P, Seitz T, Bubb H, Wichert A, Feussner H (2006) Computer simulation for ergonomic improvements in laparoscopic surgery. Appl Ergon 37:251–258. https://doi.org/10.1016/j.apergo.2005.09.003
19. Hanson L, Hogberg D, Lundstrom D, Warell M (2009) Application of human modeling in healthcare industries. In: Duffy VG (ed) Digital human modeling. ICDHM 2009, LNCS 5620. Springer, New York. https://doi.org/10.1007/978-3-642-02809-0_55
20. Cao W, Jiang M, Han Y, Khasawneh MT (2013) Ergonomic assessment of patient barrow lifting technique using digital human modelling. In: Duffy VG (ed) Digital human modeling and applications in health, safety, ergonomics, and risk management, human body modeling and ergonomics. DHM 2013, LNCS 8026. Springer, Berlin. https://doi.org/10.1007/978-3-642-39182-8_3
21. Karmakar S, Pal MS, Majumdar DA, Majumdar DB (2012) Application of digital human modeling and simulation for vision analysis of pilots in a jet aircraft: a case study. Work 41:3412–3418. https://doi.org/10.3233/WOR-2012-0617-3412
22. Xue H, Zhang X, Chen Y, Zhou L (2014) Comfort evaluation of cockpit based on dynamic pilot posture. In: Duffy VG (ed) Digital human modeling. Applications in health, safety, ergonomics and risk management. DHM 2014, LNCS 8529. Springer, New York, pp 215–223. https://doi.org/10.1007/978-3-319-07725-3_15
23. Yang J, Kim JH, Abdel-Malek K, Marler T, Beck S, Kopp GR (2007) A new digital human environment and assessment of vehicle interior design. Comput Aided Des 39:548–558. https://doi.org/10.1016/j.cad.2006.11.007
24. Meulen PVD, Seidl A (2007) Ramsis—the leading CAD tool for ergonomic analysis of vehicles. In: Duffy VG (ed) Digital human modeling. ICDHM 2007, LNCS 4561. Springer, New York, pp 215–223. https://doi.org/10.1007/978-3-540-73321-8_113
25. Kyung G, Nussabaum MA (2009) Specifying comfortable driving postures for ergonomic design and evaluation of the driver workspace using digital human models. Ergonomics 52:939–953. https://doi.org/10.1080/00140130902763552
26. Fathallah FA, Chang J, Pickett W, Marlenga B (2009) Ability of youth operators to reach farm tractor controls. Ergonomics 52:685–694. https://doi.org/10.1080/00140130802524641
27. Wu GJ, Lin JJ, Chiu YC (2012) Computer aided human factor engineering analysis of a versatile agricultural power.
In: Proceedings of the 6th international symposium on machinery and mechatronics for agriculture and biosystems engineering (ISMAB). https://www.scientific.net/AEF.10.16.pdf. Accessed 21 June 2017
28. Dooley WK (2012) Ergonomics and the development of agricultural vehicles. American Society of Agricultural and Biological Engineers (ASABE) distinguished lecture series no. 36. https://elibrary.asabe.org/data/pdf/6/edav2012/Lec_Series_2012.pdf. Accessed 18 Oct 2017
29. World Health Organization (2011) World report on disability. http://apps.who.int/iris/bitstream/10665/70670/1/WHO_NMH_VIP_11.01_eng.pdf. Accessed 14 May 2017
30. Goldstone C (2002) Barriers to employment for disabled people. http://217.35.77.12/Cb/england/papers/pdfs/2002/IH95.pdf. Accessed 25 Nov 2017
31. Soltani SHK, Sham M, Awang M, Yaman R (2012) Accessibility for disabled in public transport terminal. Proc Soc Behav Sci 35:89–96. https://doi.org/10.1016/j.sbspro.2012.02.066
32. Schur L, Kruse D, Blasi J, Blanck P (2009) Is disability disabling in all workplaces? Workplace disparities and corporate culture. Ind Relatsh 48:381–410. https://doi.org/10.1111/j.1468-232X.2009.00565.x
33. Mgonela VA (2010) Obstacles and challenges faced by disabled women in employment opportunities in the public civil service in Tanzania: a case study of Dar es Salaam. Master's thesis, University of Zimbabwe
34. Krovi V, Kumar V, Ananthasuresh GK, Vezien JM (1999) Design and virtual prototyping of rehabilitation aids. J Mech Des 121. http://www.mecheng.iisc.ernet.in/~suresh/journal/J6Krovi11999JMD.pdf. Accessed 25 Apr 2017
35. Satheeshkumar M, Krishnakumar K (2014) Digital human modeling approach in ergonomic design and evaluation—a review. Int J Sci Eng Res 5:617–623
36. D'Souza C, Steinfeld E, Paquet V (2009) Functional reach for wheeled mobility device users: a comparison with ADA-ABA guidelines for accessibility. https://www.resna.org/sites/default/files/legacy/conference/proceedings/2009/JEA/Student%20Papers/DSouza.html. Accessed 12 Sept 2017
37. Wang X, Chateauroux E, Chevalot N (2007) A data based modeling approach of reach capacity and discomfort for digital human models. In: Duffy VG (ed) Digital human modeling. ICDHM 2007, LNCS 4561. Springer, New York, pp 215–223.
https://doi.org/10.1007/978-3-540-73321-8_26
38. Gyi DE, Sims RE, Porter JM, Marshall R, Case K (2004) Representing older and disabled people in virtual user trials: data collection methods. Appl Ergon 35:443–451. https://doi.org/10.1016/j.apergo.2004.04.002
39. Porter JM, Case K, Marshall R, Gyi D, Nee Oliver RS (2004) Beyond Jack and Jill: designing for individuals using HADRIAN. Int J Ind Ergonom 33:249–264. https://doi.org/10.1016/j.ergon.2003.08.002
40. Marshall R, Summerskill S, Porter M, Case K, Sims R, Gyi D, Davis P (2008) Multivariate design inclusion using HADRIAN. In: Digital human modeling for design and engineering conference and exhibition. http://www.sae.org/technical/papers/2008-01-1899. Accessed 10 Sept 2017
41. Case K, Marshall R, Hogberg D, Summerskill S, Gyi D, Sims R (2009) HADRIAN: fitting trials by digital human modelling. https://link.springer.com/chapter/10.1007/978-3-642-02809-0_71. Accessed 25 July 2017
42. Marshall R, Gyi D, Case K, Porter JM, Sims RE, Summerskill SJ, Davis P (2009) A design ergonomic approach to accessibility and user needs in transport. In: Bust PD (ed) Contemporary ergonomics 2009: proceedings of the international conference on contemporary ergonomics 2009. https://pdfs.semanticscholar.org/89fe/c972ffdf903d10f8ad40ace8581e6cf231e0.pdf. Accessed 2 July 2017
43. Marshall R, Case K, Porter M, Summerskill S, Gyi D, Davis P, Sims R (2010) HADRIAN: a virtual approach to design for all. J Eng Des 21:253–273. https://doi.org/10.1080/09544820903317019
44. Summerskill SJ, Marshall R, Gyi DE, Porter JM, Case K, Sims RE, Davis P (2010) Validation of the HADRIAN system with a train station design case study. Int J Hum Fact Model Simul 1:420–432. https://doi.org/10.1504/IJHFMS.2010.040275
45. Hogberg D, Hanson L, Lundstrom D, Jonsson M, Lamskull D (2008) Representing the elderly in digital human modelling. http://www.arbetsliv.eu/nes2008/papers/1791.doc
46. Guimaraes C, Balbio V, Cid G, Zamberlane MC, Pastura F, Paixao L (2015) 3D virtual environment system applied to aging study—biomechanical and anthropometric approach. Proc Manuf 3:5551–5556. https://doi.org/10.1016/j.promfg.2015.07.728
47. Chaffin DB (2002) On simulating human reach motions for ergonomic analyses. Hum Fact Ergon Manuf 12:235–247. https://doi.org/10.1002/hfm.10018
48. Jenkins GR (2005) The visualization of human function for use in environmental and product design. In: Proceedings of the RESNA 28th annual conference. https://www.resna.org/sites/default/files/legacy/conference/proceedings/2005/Index.html. Accessed 14 May 2017
49. Jenkins G, Mahdjoubi L (2005) The development of computer visualizations of older people and disabled people to inform and instruct the design of the products and living spaces. In: 5th international postgraduate conference in the built and human environment. http://www.irbnet.de/daten/iconda/CIB16704.pdf. Accessed 10 Nov 2017
50. Aubry M, Julliard F, Gibet S (2009) Interactive ergonomic analysis of a physically disabled person's workplace. https://hal.archives-ouvertes.fr/hal-00503246/document. Accessed 25 Nov 2017
51. Fu Y, Li S, Yin M, Bian Y (2009) Simulation-based discomfort prediction of the lower limb handicapped with prosthesis in the climbing tasks. In: Duffy VG (ed) Digital human modeling, HCII 2009, LNCS 5620, pp 521–527. https://link.springer.com/chapter/10.1007/978-3-642-02809-0_54. Accessed 12 Sept 2017
52.
Kaklanis N, Moschonas P, Moustakas K, Tzovaras D (2010) Enforcing accessible design of products and services through simulated accessibility evaluation. In: González K, Kalla V (eds) Tangible information technology for a better ageing society, confidence, pp 59–71. http://www.iti.gr/~moustak/Confidence10_SimulatedAccessbility.pdf. Accessed 10 Sept 2017
53. Budziszewski P, Grabowski A, Milanowicz M, Jankowski J, Dzwiarek M (2011) Designing a workplace for workers with motion disability with computer simulation and virtual reality techniques. Int J Disabil Hum Dev. https://doi.org/10.1515/IJDHD.2011.054
54. Yan FU, Shiqi L, Gwen-guo C (2013) Motion/posture modeling and simulation verification of physically handicapped in manufacturing system design. Chin J Mech Eng. https://doi.org/10.3901/CJME.2013
55. Moschonas P, Paliokas I, Tzovaras D (2014) A novel accessibility assessment framework for the elderly: evaluation in a case study on office design. In: Proceedings of the 8th international conference on pervasive computing technologies for healthcare. https://doi.org/10.4108/icst.pervasivehealth.2014.255349
56. Case K, Hussain A, Marshall R, Summerskill S, Gyi D (2015) Digital human modeling and the computer-aided workforce. Proc Manuf 3:3694–3701. https://doi.org/10.1016/j.promfg.2015.07.794
57. Li K, Duffy VG, Zheng L (2006) Universal accessibility assessments through virtual interactive design. Int J Hum Fact Model Simul 1:52–68. https://doi.org/10.1504/IJHFMS.2006.01168
58. Brunnhuber M, Schrom-Feiertag H, Hesina G, Bauer D, Purgathofer W (2010) Simulation and visualization of the behaviour of handicapped people in virtually reconstructed public buildings.
https://www.researchgate.net/publication/228677311_Simulation_and_Visualization_of_the_Behavior_of_Handicapped_People_in_Virtually_Reconstructed_Public_Buildings
59. Schrom-Feiertag H, Matyus T, Brunnhuber M (2011) Simulation of handicapped people finding their way through transport infrastructures. In: Weidmann U, Kirsch U, Schreckenberg M (eds) Pedestrian and evacuation dynamics. https://link.springer.com/chapter/10.1007/978-3-319-02447-9_78. Accessed 10 Sept 2017
60. Morotti R, Rizzi C, Regazzoni D, Colombo G (2014) Digital human modeling to analyse virtual amputee's interaction with prosthesis. In: Proceedings of the ASME 2014 international design engineering technical conferences and computers and information in engineering conference IDETC/CIE. https://doi.org/10.1115/DETC2014-34381
61. Colombo G, Facoetti G, Rizzi C (2013) A digital patient for computer aided prosthesis design. Interface Focus 8:8–9. https://doi.org/10.1098/rsfs.2012.0082
62. Colombo G, Facoetti G, Regazzoni D, Rizzi C (2015) Virtual patient to assess prosthetic devices. https://www.semanticscholar.org/paper/G-Colombo-Virtual-Patient-Virtual-Patient-to-Asses-Colombo-Facoetti/1efb8ad86338a6b6f5aaded369c301207bf7ff63. Accessed 13 Nov 2017
63. Colombo G, Facoetti G, Rizzi C (2016) Automatic below-knee prosthesis socket design: a preliminary approach. https://link.springer.com/chapter/10.1007/978-3-319-40247-5_8#citeas. Accessed 12 Sept 2017
64. Xie H, Kang G, Li F (2013) The design and control simulation of trans-femoral prosthesis based on virtual prototype. Int J Hybrid Inf Technol 6:91–100. https://doi.org/10.14257/ijhit.2013.6.6.08
65. Agarwal P, Narayan MS, Lee LF, Mendal F, Krovi V (2010) Simulation-based design of exoskeletons using musculoskeletal analysis. In: Proceedings of the ASME 2010 international design engineering technical conference. https://doi.org/10.1115/detc2010-28572
66. Cho K, Kim Y, Yi D, Jung M, Lee K (2012) Analysis and evaluation of a combined human-exoskeleton model under two different constraints condition. In: Proceedings of the international summit on human simulation.
https://www.researchgate.net/publication/263505202_Analysis_and_evaluation_of_a_combined_human_-_exoskeleton_model_under_two_different_constraints_condition. Accessed 16 Nov 2017
67. Agarwal P, Kuo PH, Neptune RR, Deshpande AD (2013) A novel framework for virtual prototyping of rehabilitation exoskeletons. https://www.ncbi.nlm.nih.gov/pubmed/24187201. Accessed 13 Nov 2017
68. Chugo D, Takase K (2009) A rehabilitation walker with a standing assistance device. https://www.intechopen.com/books/rehabilitation-engineering/a-rehabilitation-walker-with-a-standing-assistance-device. Accessed 25 Nov 2018
69. Khan MR, Patnaik B, Patel S (2017) Design and development of a novel sit-to-stand and mobility assistive device for ambulation and elderly. In: Chakrabarti D, Chakrabarti A (eds) Research into design for communities, vol 1, ICoRD'17. Smart innovation, systems and technologies. Springer, New York, pp 801–811. https://doi.org/10.1007/978-981-10-3518-0_69
70. Kenny P, Parsons T, Gratch J, Rizzo A (2008) Virtual humans for assisted healthcare. In: PETRA'08 proceedings of the 1st international conference on pervasive technologies related to assistive environments. https://pdfs.semanticscholar.org/eb4b/0953da6c2d2beee250d950bea1aa80dc45e7.pdf. Accessed 8 May 2017
71. Kakizaki T, Urii J, Endo M (2016) Application of digital human models for physiotherapy training. In: Proceedings of the ASME 2016 international design engineering technical conferences and computers and information in engineering conference IDETC/CIE 2016. https://doi.org/10.1115/detc2016-59455. Accessed 15 Mar 2018
72. Holmberg LJ, Ohlsson ML, Danvind J (2012) Musculoskeletal simulations: a complimentary tool for classification of athletes with physical impairments. Prosthet Orthot Int 36:396–397
73. Ait El Menceur MO, Pudlo P, Gorce P, Thevenon A, Lepoutre FX (2008) Alternative movement identification in the automobile ingress and egress for young and elderly population with or without prosthesis. Int J Ind Ergonom 38:1078–1087. https://doi.org/10.1016/j.ergon.2008.02.019
74. Chateauroux E, Wang X (2010) Car egress analysis of young and old drivers for motion simulation. Appl Ergon 42:169–177
75. Kaklanis N, Moschonas P, Moustakas K, Tzovaras D (2011) A framework for automatic simulated accessibility assessment in virtual environments. In: Duffy VG (ed) Digital human modeling, HCII 2011, LNCS 6777, pp 302–311. https://link.springer.com/chapter/10.1007/978-3-642-21799-9_34. Accessed 14 May 2017
76. Erderlyi H, Kirchner M, Manzato S, Donders S (2012) Multibody simulation with a virtual dummy for motorcycle vibration comfort assessment. https://www.researchgate.net/publication/235718888_Multibody_simulation_with_a_virtual_dummy_for_motorcycle_vibration_comfort_assessment. Accessed 25 Nov 2017
77. Marshall R, Summerskill S, Case K, Hussain A, Gyi D, Sims R, Morris A, Barnes J (2016) Supporting a design driven approach to social inclusion and accessibility in transport. Soc Incl 4:7–23. https://doi.org/10.17645/si.v4i3.521
78. Duy Le D, Boulic R, Thalmann D (2003) Integrating age attributes to virtual human locomotion. In: Proceedings of the international archives of the photogrammetry, remote sensing and spatial information sciences. http://www.isprs.org/proceedings/XXXIV/5-W10/papers/le.pdf. Accessed 14 May 2017
79. Wyk DE (2008) Virtual human modeling and animation for real-time sign language visualization.
Master's thesis, University of the Western Cape

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
work_6tlfkx4zqrff7cr3mziuw6y4ka ---- The Dutch-German Border: Relating Linguistic, Geographic and Social Distances | Semantic Scholar

DOI: 10.3366/E1753854809000342 | Corpus ID: 1022968

F. D. Vriend, Charlotte Giesbers, R. Hout and Louis ten Bosch (2008) "The Dutch-German Border: Relating Linguistic, Geographic and Social Distances", Int. J. Humanit. Arts Comput. 2:119.

Abstract: In this paper we relate linguistic, geographic and social distances to each other in order to get a better understanding of the impact the Dutch-German state border has had on the linguistic characteristics of a sub-area of the Kleverlandish dialect area. This area used to be a perfect dialect continuum. We test three models for explaining today's pattern of linguistic variation in the area. In each model another variable is used as the determinant of linguistic variation: geographic distance…
work_6toa3a2mefhnvm2csn7b7435ca ----
Annotating Narrative Levels: Review of Guideline No. 7

Gunther Martens
01.15.20
Article DOI: 10.22148/001c.11775
Journal ISSN: 2371-4549
Cite: Gunther Martens, "Annotating Narrative Levels: Review of Guideline No. 7," Journal of Cultural Analytics. January 15, 2020. doi: 10.22148/001c.11775

The guideline under review builds on the acquired knowledge of the field of narrative theory. Its main references are to classical structuralist narratology, both in terms of definitions (Todorov, Genette, Dolezel) and by way of its guiding principles, which strive for simplicity, hierarchy, minimal interpretation and a strict focus on the annotation of text-intrinsic, linguistic aspects of narrative. Most recent attempts to do "computational narratology" have been similarly "structuralist" in outlook, albeit with a stronger focus on aspects of story grammar: the basic constituents of the story are to some extent hard-coded into the language of any story, and are thus more easily formalized. The present guideline goes well beyond this restriction to story grammar. In fact, the guideline promises to tackle aspects of narrative transmission from the highest level (author) to the lowest (character), but also the demarcation of scenes at the level of plot, as well as focalisation. Thus, the guideline can be said to be very wide in scope.

The shared task to which this guideline responds focuses on identifying and reaching consensus on the demarcation of narrative levels. In standard narratological parlance, shifts in level correlate with shifts in the information distribution from one narrative agent to another. In keeping with film terminology, these acts, including the act of taking charge of the narration itself, are taken to be acts of framing constitutive of distinctive levels: "We will define this difference in level by saying that any event a narrative recounts is at a diegetic level immediately higher than the level at which the narrating act producing this narrative is placed."[1] In Genette's view, narrative levels lead to an intricate nesting or embedding effect of speakers and viewers.

While the more comprehensive approach of the guideline will be more palatable to scholars trained in literary theory, it is to some extent undecided as to what it takes "levels" to mean. Though the guideline addresses a broad set of narrative features, it is ultimately geared towards annotating the most conspicuous shifts in narrative levels: the turn-taking in dialogues between characters and switches in voice from narrator to character and vice versa. This is certainly the part of the guideline most easily operationalized. It should be pointed out that the guideline chose to restrict its interaction with the shared task corpus to a minimum: only three of the texts are briefly cited, and the bulk of the examples stems from Sally Rooney's novel Conversations with Friends. It is stated that "the main components of such narratives are dialogues", which may help to explain why the annotation schema is more focused on reported speech than on reported thought. While the current guideline takes its cue mainly from the tried-and-trusted toolkit of (textual) narrative theory, it is also informed by Digital Humanities.
This can be seen when aspects of the paratext (Genette's short-hand notation for any extra-textual element that frames texts and guides their reception) are taken into account, for instance when the typographic make-up of chapters, paragraphs and quotation is considered as a machine-readable index of narrative levels. Likewise, aspects of the guideline go beyond structuralism when it undertakes to consider narratees and addressees. This extension of the narratological toolbox is in keeping with recent redefinitions of style in the area of Digital Humanities, as epitomized by the following definition:

In Digital Humanities, ›style‹ is seen as anything that can be measured in the linguistic form of a text, such as vocabulary, punctuation marks, sentence length, word length, the use of character strings.2

The adoption of this line of reasoning becomes evident when the guideline draws on the layout of the texts: "alternations between discourse levels are usually signalled by paragraph breaks." It is certainly necessary and helpful to consider such material underpinnings of narrative structure. Yet, there is a wide variety in national and historical print cultures to be considered in this regard, so these apparently stable markers of narrative level should be handled with care and flexibility.

1 Gérard Genette, Narrative Discourse: An Essay in Method (Ithaca: Cornell University Press, 1983), 228.
2 J. Berenike Herrmann, Karina van Dalen-Oskam, and Christof Schöch, "Revisiting Style, a Key Concept in Literary Studies," Journal of Literary Theory 9, no. 1 (2015): 25-52.

The guideline claims that it seeks to make the annotation amenable to machine learning so as to "predict narrative structure". While this is certainly a laudable ambition, it remains to be seen whether the guideline's heuristic focus actually allows for this. The current guideline is rather hybrid in nature. On the one hand, it caters to the hermeneutic strengths of human annotators. Especially the attempt to annotate the addressee(s) of specific utterances presupposes a lot of interpretation, as it hinges on implication and logical deduction rather than on actual mentions. Likewise, the guidelines for annotating focalisation strike me as undecided. The main reference here is Todorov, which is somewhat dated in view of the lengthy debates on various conceptualisations of focalisation and the question of its transferability to specific media. Focalisation is restricted to "perspective of the narrator". It would seem that even more semantics would be required to demarcate other types of focalisation. The ambition to cover these areas may run counter to the manual's declared adherence to structuralist tenets, as both rely on interpretation and semantics. Co-reference resolution of unstructured textual data (like fictional narratives) is notoriously difficult.3 Currently, automatic event detection on the basis of machine learning has proven most successful with regard to text genres that involve a lot of referential anchoring (e.g. news articles).4 The current state-of-the-art allows machine learning to predict structure "in the wild" only over a limited span of semi-structured text.5 Annotating the intricacies of implied audiences presupposes an even more extensive degree of co-reference resolution.
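By way of illustration, a minimal sketch of what "measuring the linguistic form of a text" in the sense of the definition quoted above can amount to in practice; the particular feature set is an assumed, arbitrary selection for this illustration, not that of the guideline or of any specific study:

```python
# Illustrative only: a handful of measurable surface features of a text
# (vocabulary, punctuation, sentence length, word length).
import re
from collections import Counter

def surface_style_features(text: str) -> dict:
    """Count simple properties of the linguistic form of a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    punctuation = re.findall(r"[.,;:!?\"'-]", text)
    vocabulary = Counter(w.lower() for w in words)
    return {
        "n_sentences": len(sentences),
        "n_words": len(words),
        "type_token_ratio": len(vocabulary) / len(words) if words else 0.0,
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "mean_word_length": sum(map(len, words)) / len(words) if words else 0.0,
        "punctuation_per_word": len(punctuation) / len(words) if words else 0.0,
    }

sample = "I said nothing. Bobbi laughed, and the room, somehow, felt smaller."
print(surface_style_features(sample))
```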
I would like to take issue with another specific decision: The guideline argues in favour of handling tags as cleanly as possible, in order to provide a visual analogy to levels that it demarcates. For instance, it encloses the markers that attribute discourse to specific characters within the tags that demarcate that very content. These attributive markers typically involve verba dicendi in the so-called inquit-formulae. The main rationale for "includ[ing] the speech-verb construction in the line tag" is "to avoid cluttering the annotation". I am not convinced that this is a workable decision. This might seem to be an issue of lesser importance with regard to texts that keep this attributive marking to an absolute minimum, as is the case in the samples from the contemporary novel. Yet, if the focus of the shared task is indeed on identifying levels in a wide range of narrative texts, this decision is counterproductive. It undermines the attempt to identify levels and, especially, to extricate from sentences chunks that allow machines to identify patterns indicative of shifts in level. While the concatenation of discourse with discourse markers is in line with a fairly recent trend in postclassical narratology, as I discussed elsewhere,6 it would seem that these tags are kept to a minimum for the sake of human readability. Chunking at higher-order levels such as scenes is not necessarily the way to go when aiming for machine readability. In order to annotate narrative levels, it is mandatory to provide tagging at the micro-level of words rather than of sentences, paragraphs or even scenes. This will inevitably lead to a cluttered view to the human eye, but such a nesting of annotations is much more likely to lead to transfer learning, as the sketch below may illustrate. Much more meta-information is needed with regard to the framing verbs. These tags could then be linked with existing tag-sets that deliberately aim to target and/or attenuate contextual ambiguity, such as PropBank and FrameNet. Similar efforts are under way. A brief look at www.redewiedergabe.net might suffice to illustrate what such micro-coding may afford in terms of the detection of narrative levels.7

It is certainly laudable that the guideline undertakes to emulate the structuralist annotation of complex aspects of narrative levels. It remains to be seen whether the textualist and bottom-up focus of this guideline warrants a basis representative enough to provide a gold standard to extrapolate from. Granted, this is a dilemma that currently most attempts at doing computational narratology with roots in literary narrative theory are facing.

3 "R/ProgrammerHumor - When Do We Want What?," reddit, accessed June 3, 2019, https://www.reddit.com/r/ProgrammerHumor/comments/7l0sry/when_do_we_want_what/.
4 Tommaso Caselli and Oana Inel, "Crowdsourcing StoryLines: Harnessing the Crowd for Causal Relation Annotation," in Proceedings of the Workshop Events and Stories in the News 2018, 2018, 44-54.
5 Markus Krug et al., "Rule-Based Coreference Resolution in German Historic Novels," in CLfL@NAACL-HLT, 2015, 98-104; S. Malec et al., "Landing Propp in Interaction Space: First Steps Toward Scalable Open Domain Narrative Analysis With Predication-Based Semantic Indexing," in DIVA, 2015, http://urn.kb.se/resolve?urn=urn:nbn:se:hb:diva-8531.
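To make the point about tag granularity concrete, here is a minimal sketch contrasting the two strategies. The tag and attribute names (line, frame, speech) are invented for this illustration and do not reproduce the guideline's actual schema:

```python
# A sketch of the two annotation strategies at issue, with invented tag names.
import xml.etree.ElementTree as ET

# Strategy criticised above: the speech verb is swallowed by the line tag.
coarse = '<line speaker="Bobbi">she said, "Come on, we are late."</line>'

# Word-level alternative: the framing construction (inquit formula) is tagged
# separately from the reported speech, so a machine can learn its patterns.
fine = ('<line speaker="Bobbi">'
        '<frame verb="say" tense="past">she said,</frame> '
        '<speech>"Come on, we are late."</speech>'
        '</line>')

root = ET.fromstring(fine)
frame = root.find("frame")
speech = root.find("speech")
# Only the fine-grained markup lets us extract the framing verb and the
# speech content as separate training signals for level detection.
print("framing verb:", frame.get("verb"))
print("frame text:  ", frame.text)
print("speech:      ", speech.text)
```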
While the adherence of the guideline to structuralist tenets can be lauded for its principled nature, there is much to be learned from the extension of the narratological toolkit in the direction of multimodality and paralinguistics. While references to time and co-reference can be resolved with a high degree of confidence in formulaic genres like news articles or scientific articles, especially co-reference resolution in elliptic fictional texts like Virginia Woolf's can probably only be solved by looking at interactions of readers and other users with the text (e.g. through eye tracking8 or the study of adaptation in other media9). Notwithstanding the many conceptual challenges of doing transmedia comparisons, one may profit from comparing with retellings10 and film adaptations11 to gauge more safely which words are imagined as spoken by what character (and to what music).12 The powers of machine learning can be harnessed more productively through learning from transfer and actual reception.

6 Gunther Martens, "Narrative and Stylistic Agency: The Case of Overt Narration," in Point of View, Perspective, and Focalization. Modeling Mediation in Narrative, ed. Peter Hühn, Wolf Schmid, and Jörg Schönert, Narratologia (Berlin: De Gruyter, 2009), 99-118.
7 Annelen Brunner et al., "Das Redewiedergabe-Korpus. Eine neue Ressource," in Digital Humanities: multimedial & multimodal. 6. Tagung des Verbands Digital Humanities im deutschsprachigen Raum e.V. (DHd 2019), ed. Patrick Sahle (Frankfurt am Main, 2019), 103-106, https://doi.org/10.5281/zenodo.2600812.
8 Geert Brône and Bert Oben, Eye-Tracking in Interaction: Studies on the Role of Eye Gaze in Dialogue (Amsterdam: John Benjamins, 2018).
9 Alexander Dunst, Jochen Laubrock, and Janina Wildfeuer, Empirical Comics Research: Digital, Multimodal, and Cognitive Methods (London: Routledge, 2018).

Hence, I am under the impression that a purely text-based, bottom-up approach will not suffice to reach the declared goal of prediction. Narratology has already taken advantage of ongoing research in the fields of multimodality and paralinguistics. Annotation schemata, too, should go beyond purely text-intrinsic formalism and accommodate drawing on the ways in which users process and interact with complex narratives.13 This may involve annotating for semantic properties in tandem with strictly formal properties. This is a dilemma faced by all of those seeking to reconcile narratology with cultural analytics. High-profile advances in the study of large amounts of narrative text, however, have been achieved without any reference to narratology or to (at least a customary understanding of) narrative aspects of the texts at hand (e.g. authorship attribution in the cases of J.K. Rowling and Elena Ferrante). These experiments do away with the nitty-gritty of conventional narratological analysis at the advantage of ruthless, yet highly principled reductions of complexity in order to make hidden patterns visible. At the same time, it should be clear that narratology's toolkit has a lot in store to bring to the table of cultural analytics. Annotating for narrative structures of reported speech and variations in ontological modalities may help to reveal that apparently unstructured text is far more structured and/or narrative than has often been taken for granted.
Narratologists should also be aware that a mere transposition of these tried-and-trusted methods onto large amounts of unlabelled data necessitates compromise and conceptual tweaking. Hence, this annotation guideline is a productive invitation to a much-needed continuation of the dialogue between narratology and Digital Humanities.

10 Fritz Breithaupt et al., "Fact vs. Affect in the Telephone Game: All Levels of Surprise Are Retold With High Accuracy, Even Independently of Facts," Frontiers in Psychology 9 (November 20, 2018), https://doi.org/10.3389/fpsyg.2018.02210.
11 Katalin Bálint and András Bálint Kovács, "Focalization, Attachment, and Film Viewers' Responses to Film Characters: Experimental Design with Qualitative Data Collection," in Making Sense of Cinema: Empirical Studies into Film Spectators and Spectatorship, ed. CarrieLynn D. Reinhard and Christopher J. Olson (Bloomsbury Publishing USA, 2016), 187-210.
12 Joakim Tillman, "Solo Instruments and Internal Focalization in Dario Marianelli's Pride & Prejudice and Atonement," in Contemporary Film Music: Investigating Cinema Narratives and Composition, ed. Lindsay Coleman and Joakim Tillman (London: Palgrave Macmillan UK, 2017), 155-86, https://doi.org/10.1057/978-1-137-57375-9_11.
13 Susanna Salem, Thomas Weskott, and Anke Holler, "On the Processing of Free Indirect Discourse," Linguistic Foundations of Narration in Spoken and Sign Languages 247 (2018): 143.

work_6vcrgppij5eg5mm2vy2yjuhdee ----

Multi-dimensional digital human models for ergonomic analysis based on natural data representations
Niels C.C.M. Moes
Department of Industrial Design Engineering, Delft University of Technology, Delft, Netherlands
E-mail: ccm.moes@xs4all.nl
Int. J. Digital Human, Vol. 1, No. 1, 2015. Copyright © 2015 Inderscience Enterprises Ltd.

Abstract: Digital human models are often used for ergonomic analysis of product designs, before physical prototypes are available. However, existing digital human models cannot be used to simultaneously: 1) consider the tissue loads and the physiological effects of the tissue loads; 2) optimise the product properties. This paper develops multi-dimensional digital human models for ergonomic analysis based on natural data representations, which include anatomy, morphology, behaviour, physiology, tissue, and posture data representations. The results show that the multi-dimensional digital human models can be used to: 1) accelerate the design process; 2) assess mechanical and physiological loads inside the body and in the contact area between the body and the product; 3) optimise the quality of the product; 4) reduce the number of user trials needed to create the product.

Keywords: human modelling; ergonomics; product design; design process.

Reference to this paper should be made as follows: Moes, N.C.C.M. (2015) 'Multi-dimensional digital human models for ergonomic analysis based on natural data representations', Int. J. Digital Human, Vol. 1, No. 1, pp.72-80.

Biographical notes: Niels C.C.M. Moes is an Associate Professor at the Faculty of Industrial Design Engineering, Delft University of Technology, The Netherlands.
He received his MSc at the Eindhoven University of Technology in 1974. He earned his PhD from the Delft University of Technology in 2004. His primary research interests include the human factors aspects in human modelling and in applying ubiquitous technologies in product design and design education. Since Spring 2014, he has retired.

This paper is a revised and expanded version of a paper entitled 'Digital human body modelling to support designing products for physical interaction' presented at International Design Conference - Design 2006, Dubrovnik, Croatia, 15-18 May 2006.

1 Introduction

Digital human models are often used for ergonomic analysis of product designs, before physical prototypes are available. The digital human models should allow designers to:
1 accelerate the design process
2 assess mechanical and physiological loads inside the body and in the contact area between the body and the product
3 optimise the quality of the product
4 reduce the number of user trials needed to create the product.

However, existing digital human models cannot be used to simultaneously:
1 consider the tissue loads and the physiological effects of the tissue loads
2 optimise the product properties.

Existing digital human models for medical analysis of patient conditions can be used to consider the tissue loads and the physiological effects of the tissue loads. Existing digital human models for ergonomic analysis of product designs can be used to optimise the product properties. Therefore, this paper develops multi-dimensional digital human models for ergonomic analysis, which combine elements of existing digital human models for medical analysis of patients with elements of existing digital human models for ergonomic analysis of product designs. The multi-dimensional digital human models are based on natural data representations, which include anatomy, morphology, behaviour, physiology, tissue, and posture data representations. Therefore, the multi-dimensional digital human models are more knowledge intensive than existing digital human models. As a result, the multi-dimensional digital human models can be used to simultaneously:
1 consider the tissue loads and the physiological effects of the tissue loads
2 optimise the product properties.

The goal of this paper is to develop the multi-dimensional digital human models in a step-wise manner, and to use the multi-dimensional digital human models to:
1 analyse the internal stresses and deformations, tissue relocations, and muscle activations and the resulting effects on the physiological tissue functions under external loads
2 use the analysis results to optimise the product.

The knowledge that is needed in specific multi-dimensional digital human models depends on the applications at hand. Therefore, the goal of this paper is to determine:
1 what knowledge is needed to build adaptive quasi-organic models of the human body
2 how to manage this knowledge.

As a result, this paper presents:
1 the requirements for the multi-dimensional digital human models
2 the knowledge that is needed to build the multi-dimensional digital human models
3 the procedures that are needed to build the multi-dimensional digital human models
4 the initial implementation results.

2 The requirements for the multi-dimensional digital human models

The multi-dimensional digital human models must be capable of representing different individual humans, or different groups of individual humans.
The multi-dimensional digital human models are based on algorithms that process and relate the knowledge. The knowledge is typically incomplete for individual humans, and the knowledge typically varies for individual humans. Therefore, the multi-dimensional human models must be adaptive quasi-organic models of the human body, which consider variable properties such as the shape and size of the body, the shape and size of the internal tissues, the material properties of the internal tissues, and the physiological functioning of the internal tissues. Consequently, the multi-dimensional digital human models must consist of frameworks and sub-models, which can be added to or removed from the frameworks, and which can be adapted for different individual humans, or different groups of individual humans. The multi-dimensional digital human models can only consider the knowledge which is available. Therefore, the multi-dimensional digital human models must also be extendable. The frameworks and sub-models must be capable of adding new knowledge, when new knowledge is available.

3 The knowledge that is needed to build the multi-dimensional digital human models

Figure 1 shows the structure of the multi-dimensional digital human models. The multi-dimensional digital human models consist of frameworks and sub-models, which can be added to or removed from the frameworks, and which can be adapted for different individual humans, or different groups of individual humans. The frameworks do not contain the knowledge. The sub-models deliver the knowledge. Therefore, the frameworks use algorithms to:
1 process the knowledge that is delivered by the sub-models
2 facilitate communication between the sub-models
3 make specific decisions.

The sub-models deliver different types of knowledge: anatomy, morphology, behaviour, physiology, tissue, and posture knowledge. Anatomy knowledge consists of the internal structures, the active and passive elements, the physical locations, the physical functions, and the functional relationships. Morphology knowledge consists of the shapes, the connections, and the contact properties (the geometric relationships). Physiology knowledge consists of the functions of the fluids (the blood, lymph, and interstitial fluids), the soft tissues (the muscle, and adipose tissues), the hard tissues (the bone tissues), the metabolic processes, and the nerve systems. The behaviour knowledge consists of the material properties (the elastic, nonlinear, and viscous properties), and the muscular structures. The posture knowledge consists of the joint positions, the changes in the joint positions, the changes in the shapes of the body, the changes in the shapes of the tissues, the relocations of the tissues, and the changes in the forces in the body.

Figure 1 The structure of the multi-dimensional digital human models
[Diagram: a central Framework connected to six sub-models - Anatomy (Location, Function, Structure), Morphology (Shape, Connectivity, Contact), Behaviour (Linear, Nonlinear, Viscous), Physiology (Fluids, Tissues, Nerves), Posture (Static, Joints, Changes), Tissue (Skin, Muscle, Adipose).]

New sub-models, which are not shown, can be added, when new sub-models are available. New knowledge, which is not shown, can also be added, when new knowledge is available.
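A minimal sketch may illustrate the framework idea of Figure 1, with invented class and method names rather than the article's actual software: the framework holds no knowledge of its own, while exchangeable sub-models can be attached, detached, and queried.

```python
# Illustrative sketch: a framework that only relates knowledge delivered
# by pluggable sub-models, which can be attached and detached at will.

class SubModel:
    """Base class: a sub-model delivers one type of knowledge."""
    kind = "abstract"
    def deliver(self, query):
        raise NotImplementedError

class MorphologySubModel(SubModel):
    kind = "morphology"
    def deliver(self, query):
        # A real sub-model would return shapes, connections, contacts.
        return {"shape": f"surface points for {query}"}

class PostureSubModel(SubModel):
    kind = "posture"
    def deliver(self, query):
        return {"joints": f"joint positions for {query}"}

class Framework:
    """Processes and relates knowledge delivered by attachable sub-models."""
    def __init__(self):
        self._sub_models = {}
    def attach(self, sub_model: SubModel):   # models are extendable ...
        self._sub_models[sub_model.kind] = sub_model
    def detach(self, kind: str):             # ... and reducible
        self._sub_models.pop(kind, None)
    def query(self, question):
        # The framework only combines what the sub-models deliver.
        return {k: m.deliver(question) for k, m in self._sub_models.items()}

framework = Framework()
framework.attach(MorphologySubModel())
framework.attach(PostureSubModel())
print(framework.query("individual sitting on a chair"))
```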
3.1 The procedures that are needed to build the multi-dimensional digital human models

Figure 2 shows the procedures that are needed to build the multi-dimensional digital human models. The procedures consist of measurement, reduction, formalisation, instantiation, and utilisation procedures. The procedures can also be grouped into measurement procedures (which are used to capture the knowledge about the humans), conceptualisation procedures (which are used to transform the knowledge into the algorithms) and implementation procedures (which are used to transform the algorithms into the digital human models). The procedures are used to sequentially transform the knowledge into a more useable format. The end result is multi-dimensional digital human models that can be used to analyse the interactions between the multi-dimensional digital human models and accordingly modelled products.

Figure 2 The procedures that are needed to build the multi-dimensional digital human models
[Diagram labels: Measurement procedures, Conceptualisation procedures, Reduction procedures, Formalisation procedures, Implementation procedures, Instantiation procedures, Utilisation procedures.]

3.1.1 The measurement procedures

The measurement procedures consist of physical or conceptual procedures which are used to capture measured knowledge (which is not very useable). For example, the measurement procedures consist of physical (laser scanning) or conceptual (database access) procedures which are used to capture measured knowledge (scanned point clouds) that describes the shapes of individual humans, or groups of individual humans.

3.1.2 The reduction procedures

The reduction procedures consist of statistical or conceptual procedures which are used to transform the measured knowledge (which is not very usable) into the structured knowledge (which is more usable). For example, the reduction procedures consist of statistical or conceptual (vague discrete interval modelling) procedures (Moes et al., 2001; Rusák, 2003) which are used to transform the measured shape knowledge (scanned point clouds) for individual humans, or groups of individual humans, into structured shape knowledge (limited sets of characteristic surface points), for individual humans, or groups of individual humans (Moes, 2004).

3.1.3 The formalisation procedures

The formalisation procedures consist of statistical or mathematical procedures which are used to transform the structured knowledge (which is not very understandable) into relationship knowledge (which is more understandable). For example, the formalisation procedures consist of statistical or mathematical (anatomical, physiological, biomechanical) procedures which are used to transform the structured shape knowledge (limited sets of characteristic surface points) into relationship shape knowledge (relationships between the external environmental conditions, the external forces, and the locations of the characteristic surface points). The formalisation procedures also consist of conceptual procedures which are used to transform the relationship knowledge (which is not very executable) into algorithmic knowledge (which is more executable).
For example, the formalisation procedures also consist of conceptual (algorithm development) procedures which are used to transform the relationship shape knowledge (relationships between the external environmental conditions, the external forces, and the locations of the characteristic surface points) into algorithmic shape knowledge (algorithms which can be converted into software and executed on digital computers, within the morphology sub-models of the multi-dimensional digital human models). The resulting algorithmic shape knowledge can be used to rotate, translate, and align the scanned point clouds, by matrix operations, with limited sets of characteristic points, analyse the resulting rotated, translated, and aligned point clouds to create inner and outer hulls, and convert the resulting inner and outer hulls into shape models of distribution trajectories and statistically defined location indices (Moes, 2004). Therefore, the resulting algorithmic shape knowledge can be used to:
1 describe the shapes of individual humans, or groups of individual humans
2 generate new shapes, based on the external environmental conditions, the external forces, and the locations of the characteristic surface points.

3.1.4 The implementation procedures

The implementation procedures consist of conceptual procedures which are used to transform the algorithmic knowledge (which is not very executable) into implemented knowledge (which is more executable). For example, the implementation procedures consist of conceptual (software development) procedures which are used to transform the algorithmic shape knowledge (algorithms which can be converted into software and executed on digital computers, within the morphology sub-models of the multi-dimensional digital human models) into implemented knowledge (implemented morphology sub-models of the multi-dimensional digital human models). In order to support the computation the mathematical expressions are converted to algorithms, and suitable software is used for the actual implementation. The implementation procedures also consist of test procedures which are used to find and fix errors in measured, structured, relationship, algorithmic, and implemented knowledge. For example, the implementation procedures also consist of test procedures which are used to find and fix errors in measured, structured, relationship, algorithmic, and implemented knowledge for the morphology sub-models of the multi-dimensional digital human models.

3.1.5 The utilisation procedures

The utilisation procedures consist of conceptual procedures which use the resulting multi-dimensional digital human models to optimise product properties based on ergonomics criteria. For example, the utilisation procedures consist of conceptual (software simulation) procedures which use the multi-dimensional digital human models to optimise the shapes of chairs, by changing product properties (design parameters) to reduce physical stresses, based on an objective optimisation function (OOF), such as the ergonomics goodness index (EGI) (Moes and Horváth, 2002a). As a result, the utilisation procedures consist of conceptual (software simulation) procedures which use the multi-dimensional digital human models to optimise product properties, and to improve specific user-product interactions.
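A toy sketch of this utilisation step, under stated assumptions: the stand-in objective below is a simple quadratic with an assumed optimum, whereas the actual EGI is computed from simulated tissue loads (Moes and Horváth, 2002a).

```python
# Illustrative only: minimise a stand-in for the Ergonomics Goodness Index
# over two invented chair design parameters (seat angle and pan depth).

def ergonomics_goodness_index(seat_angle_deg, pan_depth_mm):
    """Stand-in objective: lower is better (less physical stress).
    An optimum at 5 degrees and 440 mm is assumed for this illustration."""
    return (seat_angle_deg - 5.0) ** 2 + ((pan_depth_mm - 440.0) / 10.0) ** 2

def optimise_chair():
    # Exhaustive sweep over a small design space; a real implementation
    # would drive an FEM simulation inside this loop.
    candidates = ((a, d) for a in range(0, 16) for d in range(380, 501, 5))
    return min(candidates, key=lambda p: ergonomics_goodness_index(*p))

best_angle, best_depth = optimise_chair()
print(f"best design: seat angle {best_angle} deg, pan depth {best_depth} mm")
```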
4 The initial implementation results

The procedures described in this paper were used to create a multi-dimensional digital human model for the lower torso and upper leg regions of the human body. The measurement procedures used physical (laser scanning) and conceptual [Visible Human Project database access (VHP, 1997)] procedures to capture measured knowledge (scanned point clouds) which described the shapes of the skin and bones of individual humans, or groups of individual humans, when sitting on chairs. The reduction procedures used vague discrete interval modelling (VDIM) procedures to transform the measured shape knowledge (scanned point clouds) for individual humans, or groups of individual humans, into structured shape knowledge (limited sets of characteristic surface points), for individual humans, or groups of individual humans, when sitting on chairs. The formalisation and implementation procedures were used to transform the structured shape knowledge (limited sets of characteristic surface points) for individual humans, or groups of individual humans, when sitting on chairs, into generic morphology and behaviour sub-models for the multi-dimensional digital human model. New geometric alignment (Moes, 2004) software, new VDIM (Rusák, 2003) software, and commercially available statistical analysis software were used to create the sub-models for the multi-dimensional digital human model.

The multi-dimensional digital human model was used to predict the shapes of body surfaces and bones for individual humans, or groups of individual humans, when sitting in chairs, in terms of distributed spatial points (Moes, 2004). The morphology model was used to predict the shapes of the tissues and the connectivities between the tissues, based on the contact conditions. The behaviour model was used to predict the effects of the external environmental conditions, and the external forces, on the predicted shapes. The predicted shapes were used to create solid finite element models (FEMs). The solid FEMs were used to validate the constitutive equations by comparing computed pressure distribution knowledge (Moes and Horváth, 2002b) with measured pressure distribution knowledge (Moes, 2006) for individual humans, or groups of individuals, when sitting in chairs. Therefore, the multi-dimensional digital human model was used to analyse the relationships between the stresses and strains inside the bodies of individual humans, or groups of individual humans, and the shapes of chairs, based on actual measured knowledge, and the results were used to create virtual models of the optimised chairs (Moes, 2004).

Commercially available statistical analysis software and commercially available finite element analysis (FEM) software (MARC, 2001) were used to test the multi-dimensional digital human model. The constitutive models for the mechanical behaviour of human tissues are quite complex. Therefore, the commercially available statistical analysis software and the commercially available finite element analysis (FEM) software were used to test many different constitutive equations (Moes, 2004).

Figure 3 shows one FEM and three chair models. The three chair models differ only in shape, and one chair is a flat surface. The three chairs were modelled as rigid bodies. The three chair models were used to create loads for the finite element (FEM) model, which was used to predict internal stresses, strains, and tissue relocations for an individual human, when sitting in the three chairs.
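The kind of rigid point-cloud alignment mentioned above can be sketched as follows; the least-squares (Kabsch) solution used here is a standard textbook choice and is not claimed to be the one implemented in the author's alignment software.

```python
# Illustrative sketch: rigidly align a scanned point cloud to a limited set
# of characteristic points using matrix operations (Kabsch algorithm).
import numpy as np

def rigid_align(cloud, landmarks):
    """Rotate and translate `cloud` so that its first k points best match
    the k characteristic `landmarks` in a least-squares sense."""
    k = len(landmarks)
    src_centroid = cloud[:k].mean(axis=0)
    dst_centroid = landmarks.mean(axis=0)
    p = cloud[:k] - src_centroid               # centred source points
    q = landmarks - dst_centroid               # centred target landmarks
    u, _, vt = np.linalg.svd(p.T @ q)          # cross-covariance decomposition
    d = np.sign(np.linalg.det(vt.T @ u.T))     # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T    # optimal rotation matrix
    return (cloud - src_centroid) @ r.T + dst_centroid

# Demonstration on synthetic data: rotate and shift a cloud, then recover
# the whole cloud from six characteristic points only.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(100, 3))
angle = np.radians(30.0)
rotation = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                     [np.sin(angle),  np.cos(angle), 0.0],
                     [0.0,            0.0,           1.0]])
target = cloud @ rotation.T + np.array([10.0, 0.0, 5.0])
aligned = rigid_align(cloud, target[:6])
print(np.max(np.abs(aligned - target)))        # effectively zero
```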
Figure 3 One FEM and three chair models
[Figure: image not reproduced.]

5 The results and conclusions

The results (Moes, 2004) show that the procedures described in this paper can be used to create multi-dimensional digital human models. The results show that the multi-dimensional digital human models described in this paper can be used to predict shape knowledge for individual humans, or groups of individual humans, when sitting in chairs. Therefore, the results show that the multi-dimensional digital human models described in this paper can be used to optimise the shapes of chairs, by changing product properties (design parameters) to reduce physical stresses, based on an OOF. As a result, the results show that the procedures and the multi-dimensional digital human models described in this paper are feasible, and the results describe significant technical contributions for one specific application (sitting in chairs). However, more work is needed to create and test complete multi-dimensional digital human models for other specific applications. More work is needed to create statistical and mathematical relationships for complete multi-dimensional digital human models for other specific applications. More work is needed to create algorithms and software for complete multi-dimensional digital human models for other applications. Further research is needed to:
1 improve the framework
2 improve the sub-models and create new sub-models
3 test and optimise the multi-dimensional digital human models
4 use the multi-dimensional digital human models for actual design tasks.

References

MARC (2001) MARC Volume A: Theory and Users Guide, MARC Analysis Research Corporation, Palo Alto, CA.
Moes, C.C.M. (2004) Advanced Human Body Modelling to Support Designing Products for Physical Interaction, Delft University of Technology, ISBN: 90-018829-0, online available at http://repository.tudelft.nl/assets/uuid:75a23948-7bbd-4ce4-93de-defe41d10af7/dep_moes_20041213.pdf.
Moes, C.C.M. (2006) 'Modelling the sitting pressure distribution and the location of the points of maximum pressure for body characteristics and rotation of the pelvis', Ergonomics, submitted.
Moes, C.C.M. and Horváth, I. (2002a) 'Optimizing product shape with the Ergonomics Goodness Index, Part I: Conceptual solution', in McCabe Paul, T. (Ed.): Contemporary Ergonomics, pp.314-318, The Ergonomics Society, Taylor & Francis.
Moes, C.C.M. and Horváth, I. (2002b) 'Estimation of the non-linear material properties for a finite elements model of the human body parts involved in sitting', in Lee, D.E. (Ed.).
Moes, C.C.M., Rusák, Z. and Horváth, I. (2001) 'Application of vague geometric representation for shape instance generation of the human body', in Mook, D.T. and Balachandran, B. (Eds.): Proceedings of DETC'01, Computers and Information in Engineering Conference, (CD-ROM: DETC2001/CIE-21298), ASME, Pittsburgh, Pennsylvania.
Rusák, Z. (2003) Vague Discrete Interval Modelling for Product Conceptualization in Collaborative Virtual Design Environments, Delft University of Technology, Fac. Industrial Design Engineering, online available at http://repository.tudelft.nl/assets/uuid:7b6d2cc1-16cf-4337-a7af-fff25c880577/803627.pdf.
VHP (1997) The Visible Human Project, online available at http://www.nlm.nih.gov/research/visible/visible_human.html (accessed January 2006).

work_6wd6eklalvevlfkfrjx63ouh3a ----

■ REPORT

Conference on the 10th Anniversary of the Malach Centre for Visual History: Prague Visual History and Digital Humanities Conference 2020, 27-28 January 2020
In 2020, ten years passed since the founding of the Malach Centre for Visual History (CVH Malach) at the Institute of Formal and Applied Linguistics of the Faculty of Mathematics and Physics of Charles University. During the first decade of its existence, CVH Malach established itself as an intersection of the humanities and social sciences with digital technologies. It has become a recognized institution devoted above all to oral history and the history of genocides. The cornerstone of CVH Malach's activity is making extensive collections of oral-history interviews accessible. CVH Malach originally came into being as an access point to the Visual History Archive of the USC Shoah Foundation; this constantly growing collection of interviews with witnesses and survivors of genocides, especially the Holocaust, currently contains almost 55,000 audiovisual recordings in more than 40 languages. Since 2018, the Fortunoff Video Archive for Holocaust Testimonies of Yale University, with more than 4,400 interviews, has also been available at CVH Malach. Visitors can moreover work with smaller collections: the Refugee Voices archive (150 English interviews) and a small part of the interviews from the Jewish Holocaust Centre in Melbourne (15 interviews with survivors born on the territory of Czechoslovakia). In addition to providing access to these databases, CVH Malach takes part in research and educational activities and cooperates with other faculties of Charles University and with domestic and foreign institutions. Over the years, the Centre has hosted educational seminars for teachers from the Czech Republic and abroad, summer schools for international students, and many group excursions of university students of various disciplines. In 2019, an internship programme was launched that allows students to become acquainted with the technological and content-related aspects of CVH Malach's work. On the occasion of the 10th anniversary of CVH Malach's existence, and also in view of the Centre's expanding range of activities, the programme of the regular January conference was substantially extended, and the conference was named the Prague Visual History and Digital Humanities Conference (PraViDCo). The first day of the conference (27 January) was devoted to invited lectures and discussions by representatives of partner institutions, while on the second day domestic and foreign researchers presented in a classic conference format on the basis of an open call for papers.

In the morning part of the first day's programme, the invited speakers were Martin Šmok of the USC Shoah Foundation (with the lecture "Education through genocide testimony: Visual History Archive of USC Shoah Foundation, IWitness and IWalks in the Czech schools") and subsequently Stephen Naron with Jake Kara, representing the Fortunoff Archive (lecture "Striking a Balance Between Ethics and Access: The Fortunoff Archive's Approach to the Digital Humanities"). Most of the other guest speakers then formed two expert panels. The first, entitled Institutions and Oral History in Europe, "Micro" and "Macro" Perspectives and Possibilities, Research and Technology, included contributions by Adam Hradilek of the Institute for the Study of Totalitarian Regimes, Natalie Otriščenko of the Center for Urban History in Lviv, Michael Loebenstein of the Austrian Film Museum, and Martin Bulín of the University of West Bohemia in Pilsen.
The discussion ranged from the use of extensive archives of recorded oral-history interviews in research and educational practice, through more general questions connected with the interpretation of audiovisual material from the standpoint of film theory, to the latest advances in the field of computational linguistics and automatic speech processing. The second expert panel, entitled Interdisciplinary Research and Visual History Archival Collections, presented concrete research projects by Hana Kubátová, representing the Centre for the Transdisciplinary Research of Trauma, Violence and Justice at Charles University, and Ildikó Barna of the Eötvös Loránd University in Budapest. Hana Kubátová spoke on the theme of belonging in the context of personal biography; Ildikó Barna then spoke on the use of complementary archival sources of various kinds, especially traditional archival material and oral-history sources. The first conference day closed with the presentation of prizes to the winners of a comics-art competition and an exclusive screening of the documentary film Terezínští hrobaři by director Olga Strusková, produced by Czech Television.

The second conference day (28 January 2020) was reserved for contributions mainly by younger researchers from the Czech Republic and abroad, gathered on the basis of an open call. The first section of this part of the conference was devoted to the application of digital humanities in history. It began with the paper "How to detect coup d'état 800 years later" by Jan Škvrňák, Jeremi Ochab and Michael Škvrňák of Masaryk University in Brno, Czech Republic. The authors applied social network analysis to the rather unusual case of examining political alliances in the Bohemian kingdom of the early Middle Ages, with the aim of untangling power relations and their influence on the aristocratic world of the time. Mauricio Nicolas Vergara subsequently described in his contribution how natural disasters were an underestimated element in the military campaigns of the First World War in the Alpine region, and how they have remained an underestimated factor in current accounts of that history. His work presents a GIS (Geographic Information Systems) approach that allows a better understanding of this phenomenon. Magdalena Sedlická and Wolfgang Schellenbacher of the Masaryk Institute of the Czech Academy of Sciences then described the process of creating a visual history database (EHRI Online Edition), based on an interdisciplinary understanding of the collected materials but also on consideration of the end user's perspective, leading to a useful user interface for researchers, teachers, students and the wider public.

The second section focused on various approaches to new historical sources in the digital context. Vanessa Hannesschläger of the Austrian Academy of Sciences presented an ongoing project aimed at a new edition of the legal documents of the Austrian writer and journalist Karl Kraus, building on the version published in book form in the 1990s. Eva Grisová of Jan Evangelista Purkyně University in Ústí nad Labem then used a concrete example to show how audiovisual recordings of oral-history testimonies can also be used for research into earlier historical periods. She showed that even interviews devoted primarily to the Holocaust can, in combination with other sources, be a valuable resource for studying the long 19th century. Nataša Simeunović Bajić of the Serbian University of Niš then introduced an interesting phenomenon known as Yugonostalgia, in its internet form.
Through an analysis of online archives of print and film media, YouTube videos, internet media and social networks, the author explored how the internet sphere is becoming a kind of virtual museum and an open space for sharing and shaping individual and collective memory.

The third panel focused in more detail on the qualitative analysis of Holocaust testimony in audiovisual interviews. Jakub Bronec of the University of Luxembourg shed light on the little-known case of Czechoslovak Jews who decided to seek refuge in Luxembourg, which, however, was not the safe haven they had longed for. He described the systematic persecution of these people, which is being examined in a larger research project that also makes use of oral-history materials. Subsequently, Karolína Bukovská of Freie Universität Berlin and Jakub Mlynář of Charles University, in a joint contribution, focused on the ways in which the display of the tattoos of Auschwitz concentration camp survivors is incorporated into interviews conducted using the oral-history method. The often overlooked influence of interviewers and of the research context on the final form of an interview also came to the fore here.

The last section, entitled Identities, Beliefs and Humanism in the Modern Era, provided an opportunity for a broader conceptual discussion of selected key topics connected with the use of audiovisual digital sources in relation to their creators and mediators. Deepika Kashyap of the University of Tartu in Estonia dealt with the question of the identity of the Indian Nyishi minority as it is represented in the online sphere through the documentation of cultural elements that capture their identity. Lauri Niemistö, from a Finnish university, then examined the forms of primarily visual representation of the women's rights movement in the popular British satirical magazine Punch between 1905 and 1914. He showed that a detailed knowledge of the historical context is very important for a correct understanding of the tropes and allegories that mediate socially constructed meanings. In a further contribution, Komlan Agbedahin of the University of the Free State in South Africa focused on a recent scandal from an African football championship as an example of the suppression of human dignity and the value of human life in an effort to submit to commercial and political interests. He also pointed out that internet sources make it possible to reconstruct the event, including the personal testimonies of footballers whose teammates were killed in the military incident. The whole conference was closed by the contribution of Karin Hofmeisterová of Charles University, who dealt with representations of martyrdom in the visual and textual production of the Serbian Orthodox Church and the question of the continuity of this phenomenon with the period of Yugoslav socialism.

The two-day conference provided insight into the wide palette of topics, methods and approaches that CVH Malach has come into contact with during the first ten years of its existence. The contributions from the second day are collected in the conference proceedings, which are already available in electronic form (https://ufal.mff.cuni.cz/malach/en/publications). The Centre will undoubtedly continue to serve as a meeting place for researchers, teachers, students and all those interested in learning about the experience of people who were willing to share their life stories despite the tragedies and suffering they lived through.
Jiří Kocián - Jakub Mlynář
DOI: 10.14712/23363525.2020.15

work_6wrtlxrn4vegnltwentr5irmkq ----

"Ei, dem alten Herrn zoll' ich Achtung gern'"

Malte Rehbein
It's our department: On Ethical Issues of Digital Humanities1

1 Anecdotal Introduction

I am not an ethicist and had not thought through the moral aspects of my profession as a Digital Humanities scholar until quite recently. A historian by training with a focus on Medieval Studies, I mostly considered the objects I studied to be beyond the scope of moral and legal issues: historical figures are long dead and historical events took place in the past, and my research would hardly ever influence the course of history. After this training as a (traditional) historian, I went on to work digitally with historical data, like Digital Humanists do, trying to identify entities, to find correlations, or to visualize patterns within that very data.

At the big Digital Humanities gathering in Hamburg, 2012, however, rumours circulated that a secret service agency was recruiting personnel at the conference. This agency, they said, was interested in competences and skills in analysing and interpreting ambiguous, heterogeneous, and uncertain data. I had not been approached in person and until today do not know whether the story is true or not. Nevertheless, just the idea that a secret service might be interested in expertise from the Digital Humanities was a strong enough signal to start thinking about the moral implications of the work we are doing in this field, and it inspired this essay.

In this light, examples of recent research in Digital Humanities such as psychological profiling appear at the same time exciting and frightening. We can observe a typical dual-use problem: something can have good as well as bad consequences according to its use. Is it not a fascinating asset for research to determine a psychological profile of a historical figure just through automated analysis of historical textual sources? On the other hand, what would the consequences be if a psychological profile of anyone, living or dead, were to be revealed or circulated without her knowledge or her assignee's consent?

1 This contribution is based on the Institute Lecture "On Ethical Aspects of Digital Humanities (Some Thoughts)" presented by the author at the Digital Humanities Summer Institute, Victoria BC, Canada, 5 June 2015. While the text has been revised and enriched with annotations, its essayistic style has been maintained.

Ethical considerations are more than just a philosophical exercise. They help us to shape the future of our society (and environment) in ways that we want it to be, and they help us minimize risks of changes that we do not want to happen.

2 Setting the Stage: Use and Abuse of Big Data Now and Then

2.1 Big Data

This essay uses Big Data as a vehicle for considerations about ethical issues of the Digital Humanities, pars pro toto for the field as a whole with all its facets. Big Data has been a hyped term worldwide for some time now,2 and its methods have reached the Humanities.3 Big Data is a collective term for huge collections of data and their analysis, often characterized by four attributes: volume, velocity, variety, and veracity:

• Volume describes vast amounts of data generated or collected. The notion of "vast" is ambiguous, however. Some define it as an amount of data that is too big to be handled by a single computer. Digital Humanities and the use-cases described in this essay will hardly ever reach this amount.
However, as Manfred Thaller has pointed out in his keynote presentation at the second annual conference of the Digital Humanities im deutschsprachigen Raum,4 Big Data is characterized especially by the multiplication of all four factors described here. Since data in the Humanities is often far more complex than, for instance, engineering data due to its ambiguous nature, data in the Humanities can be "big" in this way.

• Velocity describes the speed at which data goes viral. As people often think of new data generated within seconds or even faster, this is not characteristic for the Humanities, which deals with historical or literary data. But it can become relevant for Social Science, for instance in the analysis of so-called social media data, which can be considered as part of Digital Humanities.

• Variety refers to the various types of data processed, multimodal data, data in different structures or completely unstructured data. This variety is characteristic of Humanities' sources, and Digital Humanities offer new methods to interlink various types of data and to process it synchronously.

• Veracity questions the trustworthiness of the data to be processed, its quality and accuracy. The Humanities, especially within the historical disciplines, know best of all what critically questioning origin, context, and content of data means – making veracity of data a very relevant aspect for the Digital Humanities.

2 Cornelius Puschmann and Jean Burgess, Big Data, Big Questions. Metaphors of Big Data, in: International Journal of Communication 8 (2014), p. 1690–1709.
3 Christof Schöch, Big? Smart? Clean? Messy? Data in the Humanities, in: The Dragonfly's Gaze. Computational analysis of literary texts, (August 1, 2013), online available at http://dragonfly.hypotheses.org/443 [last accessed: 30 Nov. 2015].
4 Manfred Thaller, Wenn die Quellen überfließen. Spitzweg und Big Data, Closing Keynote, Graz, 27 February 2015.

Overall, Big Data collections are too big and/or too complex to be handled by traditional techniques of data management (databases) and by algorithmic analysis on a single computer. With this definition of big data in mind, one might think of systems like the Large Hadron Collider in Geneva, which is considered the largest research infrastructure and scientific enterprise of all time, or of telecommunication data produced by mankind – tons of terabytes every second. Compared to this, it might not be appropriate to speak of Big Data in the context of scholarship in the Humanities at all. Nevertheless, Big Data can act as a metaphor for one of the current major trends in Digital Humanities: data-driven, quantitative research based on an amount of data that a single scholar or even a group of scholars can barely oversee let alone calculate. Such data, due to its amount, complexity, incompleteness, uncertainty, and ambiguity, requires the support of computers and their algorithmic power. For centuries, Humanities scholars have recognised this aspect of their data, but now they have this data at hand in a much larger quantity than ever before.

In general, typical applications for Big Data are well known and described. With regard to ethics, these applications span a broad range of how they are used and what implications this might cause. This shall be illustrated by the following examples, divided into three groups.
The first group comprises those of the Sciences and Humanities (the German term Wissenschaft fits better here and will be used henceforth). Incredible amounts of data are investigated, for example for research on global climate (e. g., NASA Center for Climate Simulation), to increase the precision of weather forecasts, to decode the human genome, or in the search for elementary particles at the Large Hadron Collider in Geneva. In a positive view on Wissenschaft, these investigations shall serve society as a whole.

The second class encompasses applications of Big Data from which particular groups would benefit but which might interfere with the interests of others. Depending on one's perspective, such applications can easily be found in the business world. For example, on February 16th, 2012, the New York Times published an article "How Companies Learn Your Secrets". Taking the example of the US retailer Target, the article describes how companies analyse data and then try to predict consumer behaviour in order to tailor and refine their marketing machine. In this context, Andrew Pole proposed a "pregnancy-prediction model" to answer a company's question: "If we wanted to figure out if a customer is pregnant, even if she didn't want us to know, can you do that?" with the help of algorithms based on Big Data.5

In the wake of revelations by Edward Snowden, Glen Greenwald and others in 2013,6 a third class of applications has come more and more into public consciousness. Current mass surveillance might be the strongest example of Big Data analysis in which the interests of a very small group are in stark contrast with the values of society at large.

2.2 Big Data Ethics

These three categories form only one, preliminary classification of Big Data applications from a moral perspective, a classification, which is, of course, simplistic and disputable. Nevertheless, it shall lead us towards the basic question of ethics: the distinction between right and wrong or good and evil when it comes to deciding between different courses of action. It should also be clear by now that there is no one answer to this question, but that different moral perspectives exist: perspectives of those who conduct Big Data analysis, perspectives of those who do basic research so that others can apply these methods, and perspectives of the ambient society.

Putting aside the particular scenarios in which Big Data is studied/examined, one might ask what is methodologically typical for it? There are three main methodological areas involved: pattern recognition, linkage of data and correlation detection. Then people (or machines) begin the process of inference or prediction.

In his 1956 short story The Minority Report, Philip K. Dick depicts a future society in which a police department called Precrime, employing three clairvoyant mutants, called Precogs, is capable of predicting and consequently preventing violent crimes.

5 Charles Duhigg, How Companies Learn Your Secrets, in: New York Times, 16 February 2012, online available at http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html [last accessed: 30 Nov. 2015].
6 The Snowden files, online available at http://www.theguardian.com/world/series/the-snowden-files [last accessed: 30 Nov. 2015].
In Steven Spielberg's movie of the same name from 2002,7 these Precogs are characterized in more detail: thanks to their extrasensory capabilities they are worshiped by mankind. However, it is said in the film, the only thing they do is search for patterns of crime in their visions: "people start calling them divine – Precogs are pattern recognition filters, nothing more." Assuredly, Precogs are a fiction. However, modern real-life crime prevention indeed attempts to find patterns in Big Data collections to predict likely locations of criminal behaviour, which has been reported with various, some say doubtable success rates from the USA and Germany, and probably other countries. The way our governments and secret services justify (disguise) their actions of mass surveillance, namely to predict and prevent terror attacks, is real, too. Big data and its technology (pattern recognition, data linkage, and inference) serve such predictions: weather prediction – pregnancy prediction – crime prediction.

This is not new. For instance, the East German secret service Stasi under the direction of Erich Mielke conceptualized a comprehensive database of human activities and intentions.8 Its goal: "to put together digital data and reports of all of the 16.5 million citizens of the German Democratic Republic."9 In the historically realistic scenario of the Academy Award-winning film The Lives of Others (orig. "Das Leben der Anderen"),10 the Stasi possesses a type specimen collection of all typewriters that are in circulation, and they know the machine favoured by each of the human writers they are observing. Whenever they come across an anonymous document, they attribute this document to a particular writer by comparing the typesetting of this document with the specimen. This is not (yet) Big Data in the modern definition, but is a form of pattern recognition. In The Lives of Others, the Stasi did not manage to disclose Georg Dreyman as the author of an article published in the West German magazine Der Spiegel, an article in which Dreyman revealed the high statistical rate of suicides in East Germany after the suicide of his friend. The East German secret service did not manage to do so, because Dreyman did not write the draft with his own but with somebody else's typewriter. He behaved untypically.

Scenarios and applications like this are not new. What is new is their dimension. And what brings these briefly introduced examples together, be they fictitious or real, is that they are all so-called probabilistic methods; they do not give us the truth, but the probability that a particular event or behaviour will take or has taken place.

7 Minority Report, Dir. Steven Spielberg, 2002.
8 Cf. Stefan Wolle, Die heile Welt der Diktatur, Berlin 2013, p. 186.
9 Victor Sebestyen, Revolution 1989: The Fall of the Soviet Empire, New York 2009, p. 121. In comparison to today's Big Data companies, Andrew Keen concludes: "Mielke was a data thief of the twentieth century who turned the GDR into a data kleptocracy. Yet compared with the data barons of the twenty-first century, his information empire was conceived too regionally and on too small a scale. It did not occur to him that billions of people all over the world might hand over their personal data voluntarily." (orig. German; Andrew Keen, Das digitale Debakel, München 2015, p. 201).
10 Das Leben der Anderen, Dir. Florian Henckel von Donnersmarck, 2006.
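A toy sketch may illustrate this kind of probabilistic pattern matching. The typewriter names and "wear" features below are invented for illustration, and the output is a likelihood ranking over the known machines rather than the truth:

```python
# Illustrative only: attribute an anonymous document to a typewriter by
# comparing hypothetical measurable quirks (dropped letters, slant, worn
# glyphs) against a specimen collection. Requires Python 3.8+ (math.dist).
import math

SPECIMENS = {
    "Optima (Dreyman)": {"e_drop": 0.8, "t_slant": 0.1, "worn_o": 0.7},
    "Erika (borrowed)": {"e_drop": 0.1, "t_slant": 0.9, "worn_o": 0.2},
    "Groma (office)":   {"e_drop": 0.4, "t_slant": 0.5, "worn_o": 0.4},
}

def similarity(doc_features, specimen_features):
    """Inverse-distance similarity between observed and reference features."""
    dist = math.dist(list(doc_features.values()),
                     list(specimen_features.values()))
    return 1.0 / (1.0 + dist)

def attribute(doc_features):
    scores = {name: similarity(doc_features, f) for name, f in SPECIMENS.items()}
    total = sum(scores.values())
    # Normalise to pseudo-probabilities over the known machines.
    return {name: score / total for name, score in scores.items()}

anonymous_document = {"e_drop": 0.15, "t_slant": 0.85, "worn_o": 0.25}
for machine, p in sorted(attribute(anonymous_document).items(),
                         key=lambda item: -item[1]):
    print(f"{machine}: {p:.2f}")
```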
However, even a likelihood of 99 % prediction accuracy means that in one out of a hundred cases the wrong person will have to suffer the consequences. For various reasons, Big Data raises a number of normative questions and issues. On May 30th, 2014, Kate Crawford published a piece in "The New Inquiry" under the title "The Anxieties of Big Data. What does the lived reality of big data feel like?" She concludes:

If the big-data fundamentalists argue that more data is inherently better, closer to the truth, then there is no point in their theology at which enough is enough. This is the radical project of big data. It is epistemology taken to its limit. The affective residue from this experiment is the Janus-faced anxiety that is heavy in the air, and it leaves us with an open question: How might we find a radical potential in the surveillant anxieties of the big-data era?11

Ethical questions in Big Data have barely been addressed in the research.12 In 2014, Rajendra Akerkar edited a volume on Big Data Computing.13 In its 540 pages, however, neither legal nor ethical questions are discussed. In the chapter on Challenges and Opportunities by Roberto V. Zicari, for example, opportunities are business opportunities and challenges are mostly technical challenges.14 The volume does not address individual or organisational, let alone societal, risks and consequences of Big Data Computing. This seems to be symptomatic of hyped technologies such as Big Data, and of the technological advancement of our time generally: first we do it, and then we handle the consequences.

Much the same holds for the 2013 report on Frontiers in Massive Data Analysis issued by the National Academy of Sciences of the USA. The limitations of data analysis discussed there are merely of a technical nature.

11 Kate Crawford, The Anxieties of Big Data, in: The New Inquiry, 30 May 2014, online available at http://thenewinquiry.com/essays/the-anxieties-of-big-data/ [last accessed: 30 Nov. 2015].
12 More recently, a conference at Herrenhausen "Big Data in a Transdisciplinary Perspective" discussed legal aspects of Big Data. Its proceedings have not yet been published. A report is available: Christoph Kolodziejski and Vera Szöllösi-Brenig, Big Data in a Transdisciplinary Perspective. Herrenhäuser Konferenz, 22 July 2015, online available at http://www.hsozkult.de/conferencereport/id/tagungsberichte-6084 [last accessed: 30 Nov. 2015].
13 Rajendra Akerkar, Big data computing, Boca Raton 2014.
14 Roberto V. Zicari, Big Data: Challenges and Opportunities, in: Big data computing, ed. by Rajendra Akerkar, Boca Raton, p. 103–128. Ethical challenges are mentioned ("Ensuring that data are used correctly (abiding by its intended uses and relevant laws)") but not further discussed (p. 111).
The report states: "The current report focuses on the technical issues – computational and inferential – that surround massive data, consciously setting aside major issues in areas such as public policy, law, and ethics that are beyond the current scope."15 Bollier makes such issues more explicit: "The rise of large pools of databases that interact with each other clearly elevates the potential for privacy violations, identity theft, civil security and consumer manipulation."16

Even in areas where potential ethical issues are more obvious than in the Humanities, Wissenschaft and the general public are slowly beginning to realize the implications of Big Data and to demand action. In June 2014, for example, the University of Oxford announced a postdoctoral position in Philosophy on the "ethics of big data": "this pilot project will formulate a blueprint of the ethical aspects, requirements and desiderata underpinning a European framework for the ethical use of Big Data in biomedical research."17 Earlier, on October 24th, 2012, Stephan Noller had called for a general ethics of algorithms (orig.: Algorithmen-Ethik) in the German newspaper FAZ to promote control and transparency: "Algorithmen müssen transparent gemacht werden, sowohl in ihrem Einsatz als auch in ihrer Wirkweise."18 ("Algorithms must be made transparent, both in their deployment and in the way they work.") It is clear that a widespread understanding of algorithms is also an urgent necessity.

2.3 Technology is not value-free

In a brief survey of current research, one should not overlook a small publication by Kord Davis from 2012, titled "Ethics of Big Data". Davis' analysis runs as follows:

While big-data technology offers the ability to connect information and innovative new products and services for both profit and the greater social good, it is, like all technology, ethically neutral. That means it does not come with a built-in perspective on what is right or wrong or what is good or bad in using it. Big-data technology has no value framework. Individuals and corporations, however, do have value systems, and it is only by asking and seeking answers to ethical questions that we can ensure big data is used in a way that aligns with those values.19

15 National Research Council, Frontiers in Massive Data Analysis, Washington DC 2013, online available at http://www.nap.edu/read/18374/ [last accessed: 30 Nov. 2015], p. 5.
16 David Bollier, The Promise and Peril of Big Data, Washington DC 2010, p. 33.
17 Job offer for a Postdoctoral Research Fellowship in Ethics of Big Data at the University of Oxford, online available at https://data.ox.ac.uk/doc/vacancy/113435 [last accessed: 30 Nov. 2015].
18 Stephan Noller, Relevanz ist alles. Plädoyer für eine Algorithmen-Ethik, in: Frankfurter Allgemeine Zeitung, 24 October 2012.
19 Kord Davis, Ethics of Big Data, Sebastopol (CA) 2012, p. 8.

While Davis is right in demanding that the discussion of Big Data ethics be embedded in the surrounding value systems, he is wrong about the neutrality of technology. His argument recalls Francis Bacon, who dreamed this dream of a value-free Wissenschaft in the 17th century.
In the wake of the bombing of Hiroshima and Nagasaki in August 1945, many, such as Max Born, woke up from this dream, recognized the dual-use dilemma of technology and acknowledged the responsibility of the scientists: "Wir stehen auf einem Scheidewege, wie ihn die Menschheit auf ihrer Wanderung noch niemals angetroffen hat."20 ("We stand at a crossroads such as mankind has never before encountered on its journey.") Closer to our field, Vannevar Bush, who provided an important milestone for the development of the Digital Humanities with his seminal publication As We May Think from 1945, asked how science could return to the track that leads to the growth of knowledge:

It is the physicists who have been thrown most violently off stride, who have left academic pursuits for the making of strange destructive gadgets, who have had to devise new methods for their unanticipated assignments. […] Now, as peace approaches, one asks where they will find objectives worthy of their best.21

Technology is not value-free. Scientists and scholars develop it. Together with those who apply technology in specific use cases, a huge share of the responsibility lies with them. Computer pioneer Konrad Zuse recognised this. Looking back in his memoir, he describes the qualms (orig.: "Scheu") he had at the end of 1944 about developing his machine (Z4) further. Implementing conditional jumps would allow free control flow:

Solange dieser Draht nicht gelegt ist, sind die Computer in ihren Möglichkeiten und Auswirkungen gut zu übersehen und zu beherrschen. Ist aber der freie Programmablauf erst einmal möglich, ist es schwer, die Grenze zu erkennen, an der man sagen könnte: bis hierher und nicht weiter.22

("As long as this wire is not laid, computers remain easy to survey and to master in their possibilities and effects. But once free program flow is possible, it is hard to recognise the boundary at which one could say: this far and no further.")

According to Zuse's memoir, his reputation suffered from this "Verantwortungsbewußtsein des Erfinders"23 – the inventor's sense of responsibility.

There is a second critical aspect of Davis' ethics. His readers are decision makers in business enterprises. The value system he discusses refers to corporations and to individuals within the corporate structure. He does not address individuals outside the corporation, let alone the ambient society and the world at large: internal but not external responsibility. For Wissenschaft, however, it is essential that we address both. The freedom to study and to investigate always comes with the responsibility to use this freedom carefully. In Wissenschaft, freedom and responsibility are two sides of the same coin.

20 Max Born, Von der Verantwortung des Naturwissenschaftlers. Gesammelte Vorträge, München 1965, p. 9.
21 Vannevar Bush, As We May Think, in: Atlantic Monthly 176 (July 1945), online available at http://www.theatlantic.com/magazine/archive/1969/12/as-we-may-think/3881/ [last accessed: 30 Nov. 2015].
22 Konrad Zuse, Der Computer – Mein Lebenswerk. Mit Geleitworten von F. L. Bauer und H. Zemanek, Berlin 1984, p. 77.
23 Ibid., p. 77.

3 Case Studies

3.1 Some Thoughts on Digital Humanities

One can easily imagine that Big Data in biomedical research (as seen in the Oxford job posting) opens the door for ethical considerations. But what about the Digital Humanities? Why should we bother? In the context of this question, it is helpful to characterize Digital Humanities as an attempt to offer new practices for the Humanities.
This is mainly facilitated by a) the existence or creation of, and access to, digital data relevant to research in the Humanities, b) the possibility of computer-assisted operations upon this data, and c) modern communication technology, in particular the internet. Overall, this characterizes the Digital Humanities as a hybrid field, suggesting two different perspectives within the scholarly landscape. For both perspectives, ethical discussions play a role.

The first perspective is that of a distinct discipline, with its own research questions, methodology, study programmes, publication venues, and so on – and, of course, its own values. As a discipline in its own right, Digital Humanities needs its Wissenschaftsphilosophie (philosophy of science), including theory24 and ethics. The second perspective, however, sees Digital Humanities as a Hilfswissenschaft (auxiliary science) that provides services for others – a role one might compare with that of mathematics for physics and engineering, or of palaeography for history and medieval studies. This perspective on Digital Humanities is relevant for our ethical discussion, because a Digital Humanist might be tempted to argue that he is only developing methodologies and hence is not responsible for the uses that others make of them.

24 For Digital Humanities as an emerging academic discipline in its own right, more theoretical foundation seems to be timely. This is particularly true in the context of Big Data analysis, where proponents are announcing an "end of theory" (provocative: Chris Anderson, The End of Theory: The Data Deluge Makes the Scientific Method Obsolete, in: Wired Magazine 16 (2008), online available at http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory [last accessed: 30 Nov. 2015]). A critical discussion is offered by Rob Kitchin, Big Data, New Epistemologies and Paradigm Shifts, in: Big Data & Society 1 (June 2014), DOI: 10.1177/2053951714528481.

3.2 Early Victims of Digital Humanities: William Shakespeare and Agatha Christie

A first case study concerns the work by Ryan Boyd and James Pennebaker on William Shakespeare. In the context of modern text and language analysis, Pennebaker, a social psychologist, is known for his method of Linguistic Inquiry and Word Count (LIWC). Applying text-analytical methods to the large corpus of William Shakespeare's work, Boyd and Pennebaker claim to be able to create a psychological signature of authors ("methods allowed for the inference of Shakespeare's […] unique psychological signatures"25) and to confirm the broadly accepted characterization of the playwright as "classically trained" and "socially focused and interested in climbing higher on the social ladder."26

Shakespeare has long been dead, of course, and most likely neither he nor any of his kin will have to face the consequences of this research. But the methods employed here are of a general nature and can easily be applied to anyone, living or dead, whether he wants it or not. Another prominent "victim" of this kind was Agatha Christie, perhaps the most widely read English female writer of all time.
In 2009, Ian Lancashire and Graeme Hirst published a study "Vocabulary Changes in Agatha Christie's Mysteries as an Indication of Dementia: A Case Study".27 Lancashire and Hirst analyse the corpus of Christie's work as follows:

Fourteen Christie novels written between ages 34 and 82 were digitized, and digitized copies of her first two mysteries […] were taken from Project Gutenberg. After all punctuation, apostrophes, and hyphens were deleted, each text was divided into 10,000-word segments. The segments were then analysed with the software tools Concordance and the Text Analysis Computing Tools (TACT). We performed three analyses of the first 50,000 words of each novel.28

The result of this fairly straightforward textual analysis indicated that Christie's vocabulary declined significantly over the course of her life and that the amount of repetition, such as the usage of indefinite words, increased. For Lancashire and Hirst, this is an indication that Agatha Christie developed dementia. (A minimal sketch of this kind of segment-based vocabulary measurement follows at the end of this section.)

25 Ryan L. Boyd and James W. Pennebaker, Did Shakespeare Write Double Falsehood? Identifying Individuals by Creating Psychological Signatures With Text Analysis, in: Psychological Science 26 (2015), p. 570–582, here p. 579.
26 Boyd/Pennebaker, Did Shakespeare Write Double Falsehood? (see note 25), p. 579–580.
27 Ian Lancashire and Graeme Hirst, Vocabulary Changes in Agatha Christie's Mysteries as an Indication of Dementia: A Case Study, in: Forgetful Muses: Reading the Author in the Text, Toronto 2010, p. 207–219, online available at http://ftp.cs.toronto.edu/pub/gh/Lancashire+Hirst-extabs-2009.pdf [last accessed: 30 Nov. 2015].
28 Ibid., p. 208.

These techniques operate on text as a sequence of characters. They are agnostic about who has written these texts and for what purpose. In other words, not only texts by well-known and deceased writers can be examined in such a manner. Any text can. Lancashire and Hirst are well aware of this fact and of the potential consequences. Like many technologists, however, their ethics and outlook is strictly positive: "While few present-day patients", they conclude,

have a large online diachronic corpus available for analysis, this will begin to change as more individuals begin to keep, if only by inertia, a life-time archive of e-mail, blogs, professional documents, and the like. [… We can] foresee the possibility of automated textual analysis as a part of the early diagnosis of Alzheimer's disease and similar dementias.29

Early diagnosis of diseases or their prediction might be a wonderful "tool" in the future. Research in this direction aims at something "good", Lancashire and Hirst would argue. Their ethics is utilitarian in the tradition of Jeremy Bentham and John Stuart Mill. But what happens if this data is used against someone, for instance to deny an insurance policy? And as textual data becomes more and more easily available – whether we deliver it consciously, for instance in blogs or Facebook microblogs, or unconsciously, because our e-mails are intercepted – it becomes almost impossible for the individual to avoid this situation.

29 Lancashire/Hirst, Vocabulary Changes (see note 27), p. 210.
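To make the mechanics of such an analysis concrete, here is a minimal sketch in Python. It is a simplified stand-in for the Concordance/TACT workflow quoted above, not the authors' actual code, and the list of indefinite words is illustrative only. It splits a text into 10,000-word segments and reports two of the measures Lancashire and Hirst discuss: the vocabulary size of each segment and the rate of indefinite words.

    import re

    INDEFINITE = {"thing", "anything", "something", "somebody", "anybody"}  # illustrative list

    def segment_measures(text, segment_size=10_000):
        # Crude tokenization: lowercase letter sequences only, mirroring the
        # study's removal of punctuation, apostrophes, and hyphens.
        # Assumes the text is long enough to contain at least one full segment.
        words = re.findall(r"[a-z]+", text.lower())
        results = []
        for start in range(0, len(words) - segment_size + 1, segment_size):
            segment = words[start:start + segment_size]
            vocabulary_size = len(set(segment))  # distinct word types in the segment
            indefinite_rate = sum(w in INDEFINITE for w in segment) / segment_size
            results.append((vocabulary_size, indefinite_rate))
        return results

    # Usage: run this over novels from different decades of a writer's career.
    # A falling vocabulary size and a rising indefinite-word rate over time is
    # the pattern the study interprets as an indicator of dementia.

Nothing in such a computation knows or cares whose text it is fed – which is precisely the point the preceding paragraphs make about data being used against someone's interest.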
3.3 Revealing Your Health Preconditions

Another, related example shall illustrate that not only the texts and data we currently produce, but also data from the past, might lead to individual or societal consequences. An open question in medical research is whether or not there is a genetic predisposition to Alzheimer's disease. Neurologist Hans Klünemann and archivist Herbert Wurster propose that this hypothesis can be tested with historical data.30 Their research uses historical records – parochial death registers from 1750 to 1900 – which were digitized, transcribed and encoded in a database at the archive of the diocese of Passau. They analyse the data for family relations in order to create family trees, and they analyse mortality data to find indicators of Alzheimer's disease.31 Through this, they hope to identify genetic conditions for the development of Alzheimer's disease and, in the future, to be able to predict whether or not someone belongs to such a risk group. This is a highly interdisciplinary approach with Digital Humanities at its very heart: digitization, digital transcription and encoding, as well as computer-based analysis of historical data make this work. If the approach turns out to work, one can foresee great potential in it.

What could be problematic about such research? This data (the digitized church registers) has been made publicly available, searchable, and analysable. Many other archives have done or will do the same. Consequently, however, information about an individual's family and their causes of death becomes public information, and this information can be used, for instance, to evaluate the individual risk of a living descendant for a certain disease, even if this individual has not disclosed any personal information about him or herself. Hence, information about living persons can be inferred from open historical data.

In addition to the question of whether individual rights are affected, these case studies demonstrate typical dual-use problems. On the one hand, family doctors can use the data and its analysis for the early diagnosis of severe diseases. On the other hand, potential employers can also use it, for instance, to pick only those individuals who do not belong to any risk group. There is no easy solution for this problem. Ethical questions appear as dilemmas, in Digital Humanities too.

3.4 Another Prominent Victim of DH: J. K. Rowling

In 2013, a quite prominent case of authorship attribution made the rounds. A certain Robert Galbraith published a novel called The Cuckoo's Calling. Despite positive reviews, the book was at first only an average success on the book market. However, three months later, rumours began circulating that the real author of The Cuckoo's Calling was J. K. Rowling, who had had such sweeping success with her Harry Potter series. Patrick Juola and Peter Millican analysed the text of The Cuckoo's Calling with methods of forensic stylometry and came to the conclusion that it was quite probable that Rowling was indeed its author, which she afterwards admitted.

30 Hans Klünemann, Herbert Wurster and Helmfried Klein, Alzheimer, Ahnen und Archive. Genetisch-Genealogische Alzheimerforschung, in: Blick in die Wissenschaft. Forschungsmagazin der Universität Regensburg 15 (2013), p. 44–51.
31 As dementia and Alzheimer's disease were not known then, other terms serve as indicators for these diseases. "Gehirnerweichung" (softening of the brain) and "Gehirnwassersucht" (dropsy of the brain) are typical expressions from the sources that Klünemann and Wurster use for their research.
Especially when it is a "closed game", as in this case – one computes the likelihood with which a text can be attributed to a given candidate author (as opposed to the "open game", where one computes the most likely author of a text) – forensic stylometry is a simple method: "language is a set of choices, and speakers and writers tend to fall into habitual, or at least common, choices. Some choices come from dialect […], some from social pressure […], and some just seem to come."32 This leaves stylistic patterns that a computer can measure and compare to corpora of texts already attributed, such as the Harry Potter series. (A toy version of such a comparison is sketched at the end of this section.) The method has been described and practised since the 19th century (although computers are a late entrant to the game).

For the Digital Humanities, methods like these are – at first sight – fantastic. They offer vast opportunities for fundamental research, for example in studying the history of literature or general history; they allow the testing of existing hypotheses; and they suggest new ones. The moral question, however, is again: at what cost? J. K. Rowling admitted that she would have preferred to remain unrevealed: "Being Robert Galbraith has been such a liberating experience […] It has been wonderful to publish without hype and expectation and pure pleasure to get feedback under a different name."33 Does research in Digital Humanities threaten the effectiveness of a pseudonym and hence an individual's right to privacy and freedom to publish?

This kind of research does not only affect individuals. There are consequences for society as a whole, for the world we live in, and for our social interaction. If one thinks the idea of authorship attribution through to its very end, then we arrive at a future in which it is impossible to remain anonymous – even when we try. Proponents of mass surveillance and leaders of totalitarian regimes will certainly favour such a scenario, but free-speech advocates certainly will not. We have to evaluate carefully the risk that our research carries.

There is yet another interesting aspect to this story: we usually speak of technology and Wissenschaft in the same breath as representing progress. Wissenschaft enhances, it extends, it augments. In the case discussed here, however, we appear to lose a capability through this scientific progress: we will no longer be capable of hiding.

32 Patrick Juola, Rowling and "Galbraith": an authorial analysis, in: Language Log Blog, 16 July 2013, online available at http://languagelog.ldc.upenn.edu/nll/?p=5315 [last accessed: 30 Nov. 2015].
33 Quoted after J. K. Rowling's pseudonym: A bestselling writer's fantasy, in: The Boston Globe, 22 July 2013, online available at https://www.bostonglobe.com/opinion/editorials/2013/07/21/with-pseudonym-richard-galbraith-rowling-lives-out-every-writer-fantasy/H9tkYJFB5dAHppCOe963yJ/story.html [last accessed: 30 Nov. 2015].
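How little machinery such a "closed game" requires can be shown with a toy sketch. This is not Juola's actual procedure – published studies use larger, carefully validated feature sets and several distance measures – but a minimal Burrows-style Delta over a handful of function words; the marker list is illustrative:

    import re
    from statistics import mean, stdev

    MARKERS = ["the", "and", "of", "to", "a", "in", "that", "it", "was", "but"]  # illustrative

    def profile(text):
        words = re.findall(r"[a-z]+", text.lower())
        return [words.count(m) / len(words) for m in MARKERS]

    def rank_candidates(candidate_texts, disputed_text):
        # candidate_texts: dict mapping author name -> reference text
        # (two or more authors are required for the standard deviations below).
        # Z-score each marker's frequency across the candidates, then rank
        # candidates by mean absolute z-score difference to the disputed text.
        profiles = {name: profile(t) for name, t in candidate_texts.items()}
        mu = [mean(p[i] for p in profiles.values()) for i in range(len(MARKERS))]
        sd = [stdev(p[i] for p in profiles.values()) or 1e-9 for i in range(len(MARKERS))]
        z = lambda p: [(p[i] - mu[i]) / sd[i] for i in range(len(MARKERS))]
        dz = z(profile(disputed_text))
        scores = {name: mean(abs(a - b) for a, b in zip(z(p), dz))
                  for name, p in profiles.items()}
        return sorted(scores.items(), key=lambda kv: kv[1])  # smallest Delta first

Fed with reference texts by Rowling and a few other candidates on one side and The Cuckoo's Calling on the other, the smallest Delta points – probabilistically, never conclusively – to the stylistically closest candidate.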
3.5 Psychological Profiling Through Textual Analysis

In 2013, inspired by Pennebaker's work on the psychological signature of Shakespeare, John Noecker, Michael Ryan, and Patrick Juola published a study of "Psychological profiling through textual analysis".34 This research presumes that the personality of an individual can be classified with the help of psychological profiles or patterns. Based on a typology suggested by Carl Gustav Jung in 1921,35 Katharine Briggs and Isabel Myers developed a classification of their own (the Myers-Briggs Type Indicator, MBTI),36 in which they classify individuals' preferences along four dichotomies: extraversion versus introversion, sensation versus intuition, thinking versus feeling, and perception versus judging. An individual can be, for instance, an ISTJ type: an introversive, sensing thinker who makes decisions quite quickly. Although the validity of this classification as well as its reliance on questionnaires is disputable, the Myers-Briggs indicator appears to be quite popular, especially in the USA, where it is used in counselling, team building, social skill development, and other forms of coaching.

Noecker, Ryan, and Juola formulate a simple hypothesis: the writing style of an individual can serve as a measure of this individual's MBTI, and hence stylometric methods can be used to determine the type indicator. In other words, they propose that automated textual analysis can create a psychological classification of the author of a given text. (A toy version of such a classifier is sketched below.) For their experiments, Noecker, Ryan, and Juola used a corpus of texts by Dutch authors whose MBTI is known (Luyckx's and Daelemans' Personae: A Corpus for Author and Personality Prediction from Text).37 Noecker, Ryan, and Juola state an average success rate of 75 %. They claim to detect the 'J'-type (judging) and the 'F'-type (feeling) quite well (91 %, 86 %). For the 'P'-types, the perceivers, however, the method does not respond equally well (56 %).38 According to Myers and Briggs, the perceivers are those individuals who are willing to rethink their decisions and plans in favour of new information, those who act more spontaneously than others.

34 John Noecker, Michael Ryan and Patrick Juola, Psychological profiling through textual analysis, in: Literary & Linguistic Computing 28 (2013), p. 382–387, DOI: 10.1093/llc/fqs070.
35 Carl Gustav Jung, Psychologische Typen, Zürich 1921.
36 Cf. A Guide to the Isabel Briggs Myers Papers, online available at http://web.uflib.ufl.edu/spec/manuscript/guides/Myers.htm [last accessed: 30 Nov. 2015].
37 K. Luyckx and W. Daelemans, Personae: A Corpus for Author and Personality Prediction from Text, in: Proceedings of the 6th Language Resources and Evaluation Conference, Marrakech 2008.
38 Noecker/Ryan/Juola, Psychological profiling through textual analysis (see note 34), p. 385.
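A toy version of such a per-dichotomy classifier might look as follows. The three style features and the nearest-centroid decision rule are illustrative assumptions made for this sketch; they are not the feature set or the learner of the published study:

    import re

    def features(text):
        # Crude style features: mean word length, personal-pronoun rate, negation rate.
        # Assumes a non-empty text.
        words = re.findall(r"[a-z]+", text.lower())
        pronouns = {"i", "we", "you", "he", "she", "they"}
        return (
            sum(map(len, words)) / len(words),
            sum(w in pronouns for w in words) / len(words),
            sum(w == "not" for w in words) / len(words),
        )

    def train_centroids(labelled_texts):
        # labelled_texts: list of (text, label) pairs, label e.g. "J" or "P".
        sums, counts = {}, {}
        for text, label in labelled_texts:
            f = features(text)
            sums[label] = [s + v for s, v in zip(sums.get(label, [0.0] * len(f)), f)]
            counts[label] = counts.get(label, 0) + 1
        return {lbl: [v / counts[lbl] for v in vec] for lbl, vec in sums.items()}

    def classify(text, centroids):
        f = features(text)
        distance = lambda c: sum((a - b) ** 2 for a, b in zip(f, c))
        return min(centroids, key=lambda lbl: distance(centroids[lbl]))

Trained on texts whose authors' type is known, such a classifier will emit a letter for any further text it is given – including text whose author never consented to, or is even aware of, the analysis.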
Again, the texts on which these methods are grounded might be provided consciously and willingly, or unconsciously and unwillingly. Hence, the same moral issue of the use and reuse of scholarly methods arises here and needs to be discussed within the context of these usages. But what about the researcher who develops, but does not necessarily apply, this technology? In this case, Digital Humanities would play the role of an auxiliary science, providing services for others. As such an auxiliary science, it is tempting to argue that research is value-free, that its sole goal is the development of methods, and that only those who apply these methods have to consider moral consequences – whether that be literary scholars working on Agatha Christie or historians interested in the psychological profiling of historical leaders. However, as argued above, Wissenschaft and technology are never value-free. Everyone who develops something is responsible for considering the potential risks of its usage. Especially when Digital Humanities is understood as a discipline in its own right, these issues have to be addressed and discussed.

4 Elements of an Ethical Framework – Towards a Wissenschaftsethik for Digital Humanities

4.1 Fears of Media Change

With the rough definition of Digital Humanities elaborated above in mind, we next sketch out some of the changes underway during this computational turn. Media change has always been accompanied by anxiety and outspoken criticism. Well-known examples include Plato's critique of writing, which in his view led to a degeneration of the human capability of memorizing and, more importantly, of comprehension (Phaedrus dialogue); the invention of the printing press, which allowed limitless publication and was said to lead to moral decay; Nietzsche's trouble with the typewriter and how this technology changed his way of thinking;39 the "indoctrination or seize-over of the listener through very close spraying" of sounds by stereophonic headphones;40 and many others. More recently, the internet as a new medium has been criticized as leading towards superficiality and the decline of cognitive capabilities, as Nicholas Carr's rhetorical question "Is Google Making Us Stupid?" suggests.41

Let us briefly look at some positive aspects of these changes: writing down knowledge allowed it to increase beyond the memory capability of a single person; the invention of the printing press led to a liberalisation of this knowledge; internet technology and open access might lead to a further democratization, de-imperialization and de-canonization of knowledge.

39 Robert Kunzmann, Friedrich Nietzsche und die Schreibmaschine, in: Archiv für Kurzschrift, Maschinenschreiben, Bürotechnik, 3 (1982), p. 12–13.
40 "Psychoterror durch den Kunstkopf" ("psycho-terror through the dummy head"), quoted after Ralf Bülow, Vor 40 Jahren: Ein Kunstkopf für binaurale Stereophonie, in: heise online (31 August 2013), URL: http://heise.de/-1946286 [last accessed: 30 Nov. 2015].
41 Nicholas Carr, Is Google Making Us Stupid?, in: The Atlantic, 1 July 2008, online available at http://www.theatlantic.com/magazine/archive/2008/07/is-google-making-us-stupid/6868 [last accessed: 30 Nov. 2015].

In the context of the latter, David Berry emphasizes the
ubiquitous access to human knowledge,42 which reminds one of Vannevar Bush's memory extension system Memex:

Technology enables access to the databanks of human knowledge from anywhere, disregarding and bypassing the traditional gatekeepers of knowledge in the state, the universities, and market. […] This introduces not only a moment of societal disorientation with individuals and institutions flooded with information, but also offer a computational solution to them in the form of computational rationalities, what Turing (1950) described as super-critical modes of thought.43

One may regard it as positive or negative,44 but changes in media have always been followed by a dismissal of the old "gatekeepers of knowledge": first the authorities of the classical age, the Christian church and the monasteries, then the publishing houses and governmental control in modern history. Progress dismissed them, but new gatekeepers succeeded them. In a way, Vannevar Bush's vision of memory extension by what we would now call a networked database of knowledge seems to have become reality. Not only do the various types of media converge; man and machine also merge. Data and algorithms become more and more important for everyday life and work, and those who control these algorithms and "gatekeep" the data wield power. "Code is law", postulates Lawrence Lessig,45 and in the German newspaper DIE ZEIT, Gero von Randow follows this up and proclaims: "Who controls this process, rules the future".46 Apparently, this leaves the door open for manipulation and for mistakes.

4.2 The Sorcerer's Apprentice

The 1980s Czechoslovakian (children's) science-fiction TV series Návštěvníci (The Visitors)47 depicts a peaceful world in the year 2484. In this world, everything is in harmony until the Central Brain of Mankind, a computer, predicts the collision of an asteroid with the Earth, leading to the planet's destruction. The people rely completely and blindly on this Central Brain and start a mission to rescue mankind. The mission fails and people are about to evacuate the planet. Then an accidental traveller in time, from the year 1984, comes into this world. What he finds out is very simple: the people have built the machine (the Central Brain) onto a crooked surface, which hence caused crooked predictions.

42 David M. Berry, Introduction, in: Understanding Digital Humanities, ed. by David M. Berry, Basingstoke 2012, p. 1–20.
43 Ibid., p. 8–9.
44 Andrew Keen is one to emphasize the negative impact of the vanishing of gatekeepers, because it led to a loss of trust and opens the door for manipulation and propaganda (Keen, Das digitale Debakel (see note 9), p. 184–185).
45 Lawrence Lessig, Code and Other Laws of Cyberspace, New York 1999.
46 "Wer diesen Prozess steuert, beherrscht die Zukunft". Gero von Randow, Zukunftstechnologie: Wer denkt in meinem Hirn?, in: DIE ZEIT, No. 11 (7 March 2014), online available at http://www.zeit.de/2014/11/verschmelzung-mensch-maschine-internet [last accessed: 30 Nov. 2015].
47 Návštěvníci (1981–1983), dir. Jindřich Polák.
The traveller put the Central Brain back into an upright position, from which it could correct its prediction (Earth was not threatened), and the machine apologized for causing so much trouble. The visitor from a past time did one thing that the people of 2484 did not: he critically (one might say naïvely) approached the computer and challenged its functionality – a capability that the people of 2484 had lost or forgotten. They never thought of questioning the computer's prediction. The moral of this story is that, in the end, it has to be humans who justify the consequences of actions. This is very much what Joseph Weizenbaum has told us: a computer can make decisions, he would argue, but it has no free choice.

In the future world of The Visitors, one single, central computer steers the fate of mankind. In our present age, it is the ubiquity of computing technology – computers are everywhere – that affects our daily lives. Philosopher Klaus Wiegerling discusses ubiquitous computing48 from an ethical perspective in ways that are highly relevant to the (Digital) Humanities.49 If systems, Wiegerling argues, acquire, exchange, process, and evaluate data on their own, then the materialization of information can no longer be comprehended by people. Personal identity, however, is formed through such comprehension, and making experiences (an important part of which is doubt or resistance) is essential for it. Hence, ubiquitous algorithms might lead to a loss of identity and of personal capabilities and competences. Like the people of The Visitors, we start behaving like little children, incapable of determining reality correctly, losing our identity as acting subjects and limiting our options for action. The "unfriendly takeover" by computers that technology critic Douglas Rushkoff fears for our present society50 has taken place in The Visitors, and it is only someone from the past who saves the present life of the future.

We need to engage more critically with the origin of our data and with the algorithms we are using. One needs only to look into a university classroom to observe the increasing role that search engines and smartphone apps play in decision-making. A typical argument that one often hears is that some information comes "from the internet". That this information is not challenged (by questioning who has provided this "information" and when, with what intention, on the basis of which sources, etc.) illustrates a lack of information literacy. Additionally, the conclusion that because something "comes from the internet" it has to be true (or at least valid) illustrates the danger of this attitude and of information illiteracy being abused. Consequently, new gatekeepers of knowledge might emerge all too easily. This incapacity for critical thinking can be observed more and more, from the classroom situation in Digital Humanities to scholarship in general, and to society at large.

48 The term appeared around 1988. Cf. Mark Weiser, R. Gold and J. S. Brown, The Origins of Ubiquitous Computing Research at PARC in the Late 1980s, in: IBM Systems Journal 38/4 (1999), p. 693–696.
49 Klaus Wiegerling, Ubiquitous Computing, in: Handbuch Technikethik, ed. by Armin Grunwald, Stuttgart 2013, p. 374–378.
50 Douglas Rushkoff, Present shock: when everything happens now, New York 2013.
Crawford's observation about Big Data that "If the big-data fundamentalists argue that more data is inherently better, closer to the truth, then there is no point in their theology at which enough is enough"51 leads us to a position that ethicists would call the problem of the Sorcerer's Apprentice, named after Goethe's poem Der Zauberlehrling.52 The poem begins as an old sorcerer departs his workshop, leaving his apprentice with household chores to be done. Tired of fetching water with a pail, the apprentice enchants a broom to do the work for him – using magic in which he is not yet fully trained. The floor is soon awash with water, and the apprentice realizes that he does not know how to stop the broom:

Immer neue Güsse bringt er schnell herein, Ach, und hundert Flüsse stürzen auf mich ein!

("Ever new torrents he swiftly brings in – ah, and a hundred rivers rush down upon me!")

The apprentice splits the broom in two with an axe, but each piece becomes a new broom of its own, takes up a pail and continues fetching water, now at twice the speed.

Die ich rief, die Geister, werd' ich nun nicht los

("The spirits that I summoned I now cannot get rid of.")

When all seems lost, the old sorcerer returns and quickly breaks the spell. The poem finishes with the old sorcerer's statement that powerful spirits should only be called by the master himself.

51 Crawford, The Anxieties of Big Data (see note 11).
52 I am grateful to my colleague Christian Thies, Professor of Philosophy at the University of Passau, for sharing his thoughts on this with me.

The analogy to the risks of Big Data is obvious: what initially was useful for handling a large amount of data might get out of control and start ruling us, taking away from us the options that we once had – the normative power of the de facto. At some point, we might have no choice anymore but to use data analysis or other computer-based methods for any kind of research in the Humanities. And what would then happen if we no longer understood the data and the algorithms and stopped challenging the machines, like the people of 2484? Wiegerling concludes that it is becoming more and more important today to pinpoint the options available for action, to make transparent the possibilities of intervening in an autonomously operating system, and to enlighten people about the functionality of these systems.53 This should be a core rationale of any training in Digital Humanities, and it is essential to shape our tools before these tools shape us.54

4.3 Some General Thoughts on a Wissenschaftsethik for the Digital Humanities

New technologies have their good sides and their bad sides, depending on one's perspective. Every change brings forward winners and losers. The big ethical question is how to evaluate, how to choose, and how to justify what we are doing. Philosopher Julian Nida-Rümelin has pointed out that for various areas of human conduct different normative criteria might be appropriate, and that ethics cannot be reduced to one single system of moral rules and principles.55 As we are currently forming Digital Humanities as a discipline in its own right, a definition of its own Wissenschaftsethik, as a complementary counterpart to its theory of science, seems timely. Theory and ethics together make up a philosophy of science. Their role is to clarify what exactly this Wissenschaft is (its ontological determination) and how Wissenschaft is capable of producing reliable knowledge.56 Ethics is part of philosophy and is regarded as the discipline that studies
morals (as a noun), i.e. normative systems, as well as moral (as an adjective) judgements and principles. This is not the place to discuss any particular moral criteria. However, on a more general level, a framework from which such criteria, or a code of conduct, for Digital Humanities might be derived shall be outlined along three areas, following Hoyningen-Huene's systematization:57

1. Moral issues in specific fields of research and in close relation to the objects of study
2. Moral aspects of Wissenschaft as a profession
3. The responsibility of the individual scholar as well as of the scholarly community at large.

All these areas are relevant for Digital Humanities. The first area comes into play, for instance, when one deals with and analyses personal data. Many of the examples discussed above touch on this question. Consider authorship attribution and the case of Rowling. The researchers analyse text, but this text mediates an individual, who then becomes the object of study. Do we violate Rowling's right to privacy or anonymity? Should one ask this individual whether she objects to the investigation – or not? If we are capable of inferring an individual's genetic disposition to certain diseases by just analysing historical records, should permission be required from this individual when the historical data of his ancestors is made public through digitization?

Scientific and technological progress seem to go more and more hand in hand with an increasing readiness to take risks, as Ulrich Beck criticizes.58 He observes that there are hardly any taboos anymore, or that once-existing taboos are broken. Societal scruples seem to disappear, with the consequence that society increasingly accepts once questionable conduct without opposition. Beck's observation applies not only to the use of technology but also to research as such. Moreover, this research, Beck argues, is taking place less and less inside the protected environment of a laboratory. Instead, the world as a whole is becoming a laboratory for research. For the objects that Beck discusses, for instance genetically modified plants, it is rather obvious how this 'world as laboratory' threatens the world as a whole. For the Humanities, it is less apparent. However, the tendency might indeed be the same. For instance, Big Data offers the possibility of studying the communication patterns and behaviours of people at large by analysing so-called social media such as Twitter. Unlike an experiment in a laboratory, where people are invited to participate as test subjects, the internet, the virtual world, becomes the new laboratory, where participation is often unwitting and involuntary.

53 Wiegerling, Ubiquitous Computing (see note 49), p. 376.
54 Cf. Keen, Das digitale Debakel (see note 9), p. 20, in analogy to the famous Winston Churchill quote: "We shape our buildings; thereafter they shape us."
55 Julian Nida-Rümelin, Theoretische und angewandte Ethik: Paradigmen, Begründungen, Bereiche, in: Angewandte Ethik: Die Bereichsethiken und ihre theoretische Fundierung, ed. by Julian Nida-Rümelin, Stuttgart 1996, p. 3–85, here p. 63.
56 Thomas Reydon, Wissenschaftsethik, Stuttgart 2013, p. 16.
57 Cf. Reydon, Wissenschaftsethik (see note 56), p. 12–15. The fourth area, a Sozialphilosophie der Wissenschaft, is left out here. It addresses the interplay of Wissenschaft with society.
58 Ulrich Beck, Weltrisikogesellschaft, Frankfurt 2008.
In a physical laboratory, we used to ask people for their permission to participate in an experiment (and usually paid them some compensation). Should we not do the same and obtain informed consent when we regard the internet as a laboratory and use its data? Can we accept the fact that Twitter users are test subjects in an experiment without even knowing it?59

The second area of ethics discusses the moral aspects of Wissenschaft as a profession. We can divide the ethics of science into two dimensions: first, the internal dimension, which deals with issues affecting individuals within a given scholarly community and this community itself; and second, the external dimension, which deals with consequences for individuals outside this community, for the ambient society, culture and nature. Moral aspects of Wissenschaft as a profession belong to the first dimension. What is understood here is usually a code of good practice: work lege artis, do not fabricate, do not falsify, do not plagiarize, honour the work of others, give credit to all who supported you, name your co-authors, do not publish the same thing twice, and various other guidelines that many scholarly communities have given themselves.60

But it is more than that. Robert Merton formulated in 1942 four epistemological dimensions of what distinguishes good from bad science.61 He claimed that scholars shall be guided only by the ethics of their profession and not by personal or social values. Between the 1920s and 1940s, he had observed that science was no longer developing autonomously and on its own, but that societal and political forces and their interests significantly drove it. This led to a loss of trust in the objectivity of scientific results.

59 In fact, research of this kind has already been undertaken in a very problematic manner. Kramer, Guillory, and Hancock manipulated the "News Feeds" of 700,000 Facebook users to study the impact on their mood (Adam D. I. Kramer, Jamie E. Guillory and Jeffrey T. Hancock, Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks, in: Proceedings of the National Academy of Sciences 111, No. 24 (17 June 2014), p. 8788–8790). Facebook was a partner in this experiment, provided access to personal data and facilitated data manipulation. Informed consent had not been asked of the users. Many raised ethical concerns about this study, for instance in online comments to the publication (http://www.pnas.org/content/111/24/8788.full?sid=750ad790-21a1-4ebc-ba71-9dc0ac5af3d0 [last accessed: 30 Nov. 2015]) and in other media. In an opinion piece within the same journal, Kahn, Vayena, and Mastroianni ask, in a utilitarian view, whether the concept of informed consent "makes sense in social-computing research" and conclude that "best practices have yet to be identified" (Jeffrey P. Kahn, Effy Vayena and Anna C. Mastroianni, Opinion: Learning as We Go: Lessons from the Publication of Facebook's Social-Computing Research, in: Proceedings of the National Academy of Sciences 111, No. 38 (23 September 2014), p. 13677–13679). A more critical opinion is expressed by Tufekci: Zeynep Tufekci, Engineering the Public: Big Data, Surveillance and Computational Politics, in: First Monday, 19 (7 July 2014), DOI: 10.5210/fm.v19i7.4901.
60 For a general framework cf. for instance the memorandum of the Deutsche Forschungsgemeinschaft (1998/2013), Sicherung guter wissenschaftlicher Praxis. Empfehlungen der Kommission "Selbstkontrolle in der Wissenschaft" / Safeguarding Good Scientific Practice. Recommendations of the Commission on Professional Self Regulation in Science.
61 Robert Merton, The normative structure of science, in: The sociology of science: theoretical and empirical investigations, ed. by Robert Merton, Chicago 1973, p. 267–278.
Although Merton's view on the exclusion of personal and social values no longer holds, today's Wissenschaftssystem shows a number of characteristics similar to those Merton observed 70 years ago. These apparently change the way we work, but they also compel our research into particular directions, and steer and restrict our choices of research topics and methods. The characteristics of today's Wissenschaftssystem include (among others): permanent pressure to acquire third-party funding, the "publish-or-perish" principle, a growing necessity to legitimate research, especially in the Humanities, international competition, and a demand to be "visible" as a researcher. It has to be discussed how these conditions affect the objectivity of our research, especially when, at the same time, a huge amount of data is conveniently at hand to produce analytical results quickly – faster than by traditional methods, but perhaps also less well grounded.

Merton's principles from 1942 might still serve as guidance. In order to restore the legitimation of, and trust in, research, he demands four principles:

1. Universalism: all research has to be measured against impersonal criteria, regardless of its origin. Only then can the best results be produced (this is a teleological criterion).
2. Communalism: all research is the result of a communal effort (which recalls Newton's "standing on the shoulders of giants"), cannot remain individual, and has to be published widely (the modern Open Access, Open Source and Open Data movements build on this).
3. Selflessness (disinterestedness): the behaviour of a researcher has to be guided only by the interest of the scientific community; it is his duty to produce reliable knowledge (this is a deontological criterion).
4. Organized skepticism: it is the duty of researchers to question steadily their own work and the work of others in order to produce the best possible results.

The latter is particularly important within an emerging field such as the Digital Humanities.

The third area of ethics is more abstract: it deals with the consequences of our research for the world in which we live. In the 17th century, Francis Bacon formulated his ideal of a Wissenschaft that should serve society in order to improve the living conditions of humankind. Science shall be – teleologically – subordinated to this higher good. In Bacon's time, this aimed especially at understanding nature. Knowledge would then empower mankind to master nature.62

Scepticism about this view has been raised by many others, among them the philosopher Hans Jonas.63 Technology's control over nature has become excessive, with the consequence that technology no longer leads towards improving living conditions but towards their destruction.

62 Cf. Reydon, Wissenschaftsethik (see note 56), p. 82–83.
Jonas formulates an imperative of future ethics: "Handle so, daß die Wirkungen deiner Handlung verträglich sind mit der Permanenz echten menschlichen Lebens auf Erden"64 ("act only according to the maxim that the consequences of your action are in harmony with a permanent existence of true human life on Earth"; translation MR). Jonas does both: he criticizes and he extends (modernizes?) Immanuel Kant's categorical imperative: "Act only according to that maxim whereby you can, at the same time, will that it should become a universal law without contradiction". Jonas demands from each scholar the duty to take responsibility for future generations and to preserve what makes up "echtes menschliches Leben", true human life. "True" indicates that the question of the permanent existence of life goes beyond mere biological existence and procreation; the Zeitgeist and the current value systems of a society probably define what "truly human" actually means. Liberty and privacy could be components of such a system in today's Western world. Any research that threatens the continuity of these values would violate Jonas' imperative.

For research undertaken in the Digital Humanities, questions like these may arise: how does our social behaviour change when we know that we cannot express ourselves without being monitored? What consequences would follow from this for society? What does a society look like in which the history of diseases and the dispositions of individuals can easily be detected on the basis of openly accessible historical data? Is there a risk that we might create future generations in which values like the right to remain anonymous no longer exist – or is there not? And if there is, shall we take this risk, or better not? What measures shall we take to minimize it? Jonas gives us advice when it comes to finding answers to these questions, and hence to deciding among different options for action. He asks us to think of the worst-case scenario first. His heuristic is determined by fear ("Heuristik der Furcht"),65 and the principle of Jonas' ethics is responsibility, especially for the future. I personally agree with this view and would like to establish the following: as long as the consequences of our research in Digital Humanities are not sufficiently clear, one should be sensitive to the problems that might arise, one should be careful in one's actions, and we as a community should at least have these discussions openly.

63 Hans Jonas, Das Prinzip Verantwortung, Frankfurt 1984.
64 Ibid., p. 36.
65 Ibid., p. 7–8.

5 Conclusion

Wissenschaftsethik refers to all moral and societal aspects of the practice of our Wissenschaft. Nevertheless, it can do nothing more than problematize and make the stakeholders of Digital Humanities sensitive to moral questions. It can suggest different perspectives and set a framework within which arguments take place, but it cannot solve dilemmas. The decisions to be made are always up to the individual scholar or – in terms of a code of conduct – up to the scholarly community: it's our department.

work_6xk5ftfpgfdoviacjhoxalslhq ----

FUTURE RESEARCH CHALLENGES FOR A COMPUTER-BASED INTERPRETATIVE 3D RECONSTRUCTION OF CULTURAL HERITAGE – A GERMAN COMMUNITY'S VIEW

S. Münster1*; P. Kuroczyński2, M. Pfarr-Harfst3, M. Grellert4, D.
Lengyel5

[1] Media Center, Dresden University of Technology, Dresden, Germany – sander.muenster@tu-dresden.de
[2] Herder Institute for Historical Research on East Central Europe, Marburg, Germany – piotr.kuroczynski@herder-institut.de
[3] Unit Digital Design, Technische Universität Darmstadt – pfarr@dg.tu-darmstadt.de
[4] Unit Digital Design, Technische Universität Darmstadt – grellert@dg.tu-darmstadt.de
[5] Chair for Visualisation, Brandenburg University of Technology – lengyel@tu-cottbus.de

KEY WORDS: Virtual 3D reconstruction, Perspectives, Survey, Research agenda

ABSTRACT: The workgroup for Digital Reconstruction of the Digital Humanities in the German-speaking area association (Digital Humanities im deutschsprachigen Raum e.V.) was founded in 2014 as a cross-disciplinary scientific society dealing with all aspects of the digital reconstruction of cultural heritage and currently involves more than 40 German researchers. Moreover, the workgroup is dedicated to synchronising and fostering methodological research on these topics. As one preliminary result, a memorandum was created to name urgent research challenges and prospects in a condensed way and to assemble a research agenda proposing demands for further research and development activities within the next years. The version presented within this paper was originally created as a contribution to the so-called agenda development process initiated by the German Federal Ministry of Education and Research (BMBF) in 2014 and has been amended during a joint meeting of the digital reconstruction workgroup in November 2014.

1. INTRODUCTION

For more than three decades, digital 3D reconstructions of cultural heritage objects have been carried out in many projects. As an overall consequence, challenges have changed significantly during this time, and many new research demands for further methodological, technical and practical development have emerged. Our main interest is to identify urgent research challenges and prospects and to assemble a research agenda that proposes demands for further research and development activities within the next years. The first version of this research agenda was originally created as a contribution to the so-called agenda development process initiated by the German Federal Ministry of Education and Research (BMBF) in 2014. It was aimed at identifying upcoming research topics and funding needs, especially from the point of view of a German community dealing with digital reconstruction (Arbeitsgruppe Digitale Rekonstruktion des Digital Humanities im deutschsprachigen Raum e.V., 2014). It contained contributions submitted by 13 researchers from different disciplinary backgrounds and with different perspectives on digital reconstruction. This process was initiated by a paper circulated in summer 2014. In addition, outcomes from a joint meeting of the digital reconstruction workgroup in November 2014 (Grellert et al., 2015), which focused on a state-of-the-art analysis, were included in an amended version; they are presented in this paper. Even if the research agenda was created by a German scholarly community focusing on German perspectives, many of the topics addressed may also be relevant to an international community.

1.1 Classification of digital reconstruction

Computer-based, i.e. digital, 3D reconstructions have become increasingly important for sustaining the conservation, research and broad accessibility of cultural heritage, serving as knowledge carriers, research tools and means of representation.
Concerning digital reconstruction, the focus is put on the creation of a spatial, temporal and semantic virtual model. Main differences refer to the kind of object of assessment in terms of material and immaterial objects (e.g. usages or digital data). Furthermore, in regard to the question of how to proceed, the difference between the reconstruction of objects which are no longer existent or which have never been realised (e.g. the current status of plans which have never been realised) and the digitalisation of objects which are still existent is essential (De Francesco and D’Andrea, 2008). While a digitalisation describes the technological transfer of an object into a digital data set (e.g. by means of semi-automatic modelling with the help of laser scans or photogrammetric technology), a digital reconstruction process includes the necessity for human interpretation of data. 1.2 State-of-the-art In practice, digital reconstructions have become commonly used both in the academic and the commercial field. Currently, digital reconstructions are mainly carried out in one single context, in relation to specific usages, by interdisciplinary workgroups and by using expert technologies. Against this background, it has proven problematic that the many standards and guidelines as well as rules for dealing with historical contents (Beacham et al., 2006; Bendicho, 2011; Kiouss et al., 2011; Pfarr, 2009; Sürül et al., 2003) have only been of limited practical relevance (Kuroczyński et al., 2014; Münster and Köhler, 2012). In contrast, the concept of metadata used as an approach to classify and describe historical information has been established to a large extent. Even if in the meantime one of these schemas has gained a certain popularity with CIDOC-CRM (Doerr, 2003) as reference ontology (in terms of a generic concept of knowledge structure) in archaeology and museology, existing standards of metadata and their implementation are considered as being highly heterogeneous (Felicetti and Lorenzini, 2011; Ronzino et al., 2011; Ronzino et al., 2013). Current approaches on sustainable documentation of the creative process of digital reconstructions have not yet been sufficiently established in practice (Bentkowska-Kafel et al., 2012) despite diverse and innovative concepts (Niccolucci, 2012; Pfarr-Harfst, 2011). An international science community has been shaped by actors from Southern Europe, Great Britain and the US. It mainly comprises perspectives on archaeology and cultural heritage conservation (European Commission, 2011; Foni et al., 2010; Münster et al., in print). A multiplicity of actors from science, economy and education deal with the topic of digital reconstruction in the German-speaking area. Established panels have not yet been set up, and the national as well as international networks required for a scientific discourse across disciplines and usages have not yet been established (Pfarr-Harfst, in print).
1.3 Actors and funding environment in Germany The German research environment on digital humanities, to which the digital reconstruction of cultural heritage also belongs, is traditionally strongly shaped by dealing with texts and images. However, national priorities on dealing with cultural heritage focus on the development and museal presentation of collections. In contrast, topics of digital 3D reconstruction of cultural heritage have been much less institutionally anchored. Even if many professors from several disciplines put their work and research focus on the field of digital reconstructions, in Germany no professorship or academic institute is yet specifically arranged to address these topics in particular. The circle of actors is characterised by small workgroups or individual actors. However, they come - as exemplified in Figure 1 by the members of the digital reconstruction workgroup of the Digital Humanities in the German-speaking area association (Digital Humanities im deutschsprachigen Raum e.V.) - from a multiplicity of different institutions and all academic career stages. Figure 1 – Institutions of the members of the digital reconstruction workgroup of the Digital Humanities in the German-speaking area association (Digital Humanities im deutschsprachigen Raum e.V.). Up to now, digital reconstruction projects carried out in Germany have been funded by a heterogeneous field of funding institutions and funding objectives. This includes regional and local funding schemes and research funding on a national level. With the German national funding environment in mind, the German Federal Ministry of Education and Research has lately addressed the assessment of humanities-related questions by means of digital tools ("eHumanities") and the scientific preparation of collections ("The language of objects"). Funded by the German Federal Ministry of Education and Research, a current project is being carried out to assess the space-related placing of inscriptions.1 Furthermore, the structure of a virtual research environment used for the web-based documentation and demonstration of semantic 3D datasets of destroyed architecture in Eastern Prussia (Kuroczyński et al., submitted paper) has been assessed thanks to funding from the Leibniz Association. The documentation and visualisation of archaeological contents have been examined with the help of the German Research Foundation (DFG).2 On a European level, the Reflective 6 & 7 calls issued in the scope of the Horizon 2020 Programme address questions asking for comprehensive standards and formats for cultural-historical information.3 Similar to guidelines issued for previous ICT programmes, these calls mainly aim at the development of technology. In contrast, EU funds for a Creative Europe focus on specific cases of usage.4 The current funding environment has taken only limited account of the fact that digital reconstructions are complex socio-technical applications which in the meantime have been widely used in the academic environment and in museums, media studies and tourism. For this reason, a number of funding needs exceeding pure technological development or single usages have come up. 1 Inschriften im Bezugssystem des Raumes. http://www.spatialhumanities.de/ibr/startseite.html (12.1.2015). 2 OpenInfRA - Ein webbasiertes Informationssystem zur Dokumentation und Publikation archäologischer Forschungsprojekte. http://www.tu-cottbus.de/projekte/de/openinfra/ (12.1.2015). 3 Reflective societies: Cultural Heritage and European Identities. http://ec.europa.eu/research/participants/portal/desktop/en/opportunities/h2020/calls/h2020-reflective-7-2014.html (12.1.2015). 4 Creative Europe Program. http://ec.europa.eu/programmes/creative-europe/index_en.htm (12.1.2015).
2. PROPOSITIONS AND IDEAS ON RELEVANT TOPICS AND QUESTIONS A number of current tasks of digital humanities in the German-speaking area were described in a discussion paper issued by the management board of the Digital Humanities in the German-speaking area association and published at the annual conference 2014 (Vorstand des Verbandes Digital Humanities im deutschsprachigen Raum, 2014). In addition, a number of specific challenges have emerged in the context of digital reconstruction. 2.1 Assessment of the scope of digital reconstruction Digital reconstructions do not just use technologies available in the field of information technology for the development of humanities-related questions; they additionally incorporate a multiplicity of different disciplinary perspectives and contexts of usage. Besides archaeology and different tasks of cultural heritage conservation as main focuses of European funding, specific scenarios of art and architectural history, cultural studies, monument preservation, historical building research and museology are relevant to the German research environment (Burwitz et al., 2012; Riedel et al., 2011). Connected to this is the need to record and systematise research and usage approaches of digital reconstruction and their related properties, potentials and fields of usage (Pfarr-Harfst, 2013). In addition to the documentation of spatial-related knowledge (as spatial humanities domain), they include the description of historical objects, the research of historical preparation processes (e.g. historical approaches and craftsmen's approaches to planning), the contextualisation and assessment of the consistency of sources, the classification of objects and the subsequent establishment of thesauri, and the identification of archetypal characteristics (e.g. craftsmen's specifications). Moreover, different usages exist beyond a reference made to concrete historical objects, such as the exploration of a scope with the help of architectural systems and approaches of procedural modelling of hypothetical buildings which are to be erected (Havemann and Wagener, in print; Ling et al., 2007). The recording of good practice examples as well as of research and development projects refers to tasks which have to be taken up in a research agenda, as was developed for cultural heritage research (Arnold and Geser, 2008) and archaeology (Gibbons, 2012). 2.2 Digital reconstruction between research and practical usage Like hardly any other field of digital humanities, digital reconstruction is a cross-sectional area between research and practical use.
Respectively, in addition to questions of research and science, there are diverse usages beyond the academic one - e.g. in the context of teaching, museal presentation, virtual tourism, cultural heritage management or popular media (Grellert, 2007; Kuroczyński, 2012; Münster, 2011). Therefore, transfer and exchange between research and practical use is essential, e.g. concerning the technologies used, standards and schemas, strategies and quality standards. Furthermore, an assessment of practice-oriented aspects beyond questions of humanities is needed, such as creativity conducive to learning, usability or sustainable business models. 2.3 Establishing virtual models and visual results as topics of scientific discourse In digital reconstruction, unlike in text-related disciplines, knowledge is mainly gained through the creation of a virtual model and its digital, in most cases visual, presentation. Moreover, contributions of different authors and a multiplicity of intuitive, know-how-based decisions are included in such media (Münster and Prechtel, 2014). So far, neither an academic culture nor concrete mechanisms have been established to make digital models and generated images scientifically linkable and open to discussion. This includes questions on the access to and evaluation of models and images, on making authorship transparent, and on references between a reconstruction and its (explainable) underlying knowledge such as sources. This also comprises the capacity to quote parts or areas of models and images and the modification of such media by others. In addition to a number of technical requirements described in the following paragraph, this leads to the development of approaches for the documentation of processes and their results and for making a model's logic transparent (Günther, 2001; Hoppe, 2001) - e.g. in the sense of comprehensive reference ontologies and custom-designed domain ontologies (Hauck and Kuroczyński, 2014; Homann, 2011; Ronzino, 2015). 2.4 Securing sustainability It can be seen that in most cases new technologies and trends have quickly been picked up in single projects on digital reconstruction (Münster et al., in print). However, they have mainly been made transparent via publications issued for a (professional) public in academic contexts. In addition to the aspects of interoperability and long-term availability of datasets, competencies and procedure models to improve the accessibility and sustainability of the assessment and mapping of digital reconstruction projects of all provenances, as well as the inclusion of established actors such as libraries, commercial platforms or research infrastructures, are essential in making information in this regard available. 2.5 Establishing digital infrastructures for digital reconstructions Beyond buildings, originals of important archaeological objects or objects of art history such as finds or sculptures are often detached from their original context (e.g. in collections, museums etc.). Thus, they can only be assessed, analysed and evaluated spatially in an isolated way. In contrast, virtual objects can not only be re-contextualised by taking into consideration the different probabilities of reconstruction hypotheses but also with references between single objects in mind (Laufer et al., 2011; Lengyel and Toulouse, 2011b, c).1 They can be linked in a differentiated way to (source) materials and information on projects (Raspe and Schelbert, 2009). 1 Berliner Skulpturennetzwerk. http://de.wikipedia.org/wiki/Berliner_Skulpturennetzwerk (12.1.2015).
For a long time, the focus of a multiplicity of European projects (e.g. EPOCH, 3D COFORM, CARARE, 3D ICONS) has been put on the recording and storage of historical sources of different kinds, digital research artefacts and results as well as allocated metadata, paradata and contextual data (D'Andrea and Fernie, 2013). However, especially in the German-speaking area, the requirements put on digital reconstruction have only been insufficiently reflected in research infrastructures beyond archaeology and architectural history (Drewello et al., 2010).2 Despite their names, the DARIAH Geobrowser and the Europeana 4D interface are mainly aimed at a two-dimensional mapping of objects. Specific requirements of digital reconstructions are mainly the space- and time-related classification and identification of created digital models and related (source) materials (e.g. by means of world-wide valid uniform resource identifiers) and their relationships. Moreover, digital reconstructions have been developed by using a multiplicity of different technologies from domains such as GIS, VR, CAD and BIM or CityEngines, which are only partially compatible with each other (Münster and Prechtel, 2014). They are not convertible without loss. Related tasks likewise include the assessment, development and dissemination of technologies and strategies for the interoperability of data - e.g. for lossless conversion or for data exchange in proprietary formats. Furthermore, with linkage in mind, easy-to-operate data viewers have been used for the illustration of 3D datasets; here, there are special requirements in regard to interactivity and the simulation quality of materiality and weathering. Furthermore, tools and mechanisms for the semantic annotation and modification of existing reconstructions, for the inclusion of alternative hypotheses or for versioning are required. Given these complex requirements, Semantic Web and WebGL technologies seem to be highly promising. Research on and implementation of documentation and visualisation standards within the community of digital hypothetical 3D reconstruction is a prerequisite. Using the above-mentioned open source technologies for the web-based description and publishing of 3D content - in particular developing a domain-related ontology (OWL DL), storing the whole process chain and its results in a human- and machine-readable schema (XML format), linking with existing controlled vocabularies and authority files (e.g. the Getty AAT), and establishing a graph database (RDF triple store) with a SPARQL endpoint - provides a new quality of comprehensibility and sustainability within a Linked (Open) Data infrastructure (Kuroczyński et al., 2015). 2 IANUS - Forschungsdatenzentrum Archäologie & Altertumswissenschaften. http://www.dainst.org/de/project/ianus-forschungsdatenzentrum-arch%C3%A4ologie-altertumswissenschaften?ft=all (12.1.2015).
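To make the stack sketched in the preceding paragraph more tangible, the following minimal sketch shows how a single reconstruction record could be expressed as Linked Data in Python with the rdflib library. It is an illustrative example only, not the method of the cited projects: the example.org namespace, the object and source identifiers and the chosen AAT identifier are hypothetical assumptions, and the CIDOC CRM classes are used merely in the spirit of the reference ontology discussed above rather than as a normative modelling decision.

```python
# Minimal Linked Data sketch for one digital reconstruction record.
# Assumptions: rdflib is installed; all example.org URIs and the concrete
# vocabulary choices below are hypothetical illustrations.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")  # CIDOC CRM reference ontology
AAT = Namespace("http://vocab.getty.edu/aat/")          # Getty Art & Architecture Thesaurus
EX = Namespace("http://example.org/reconstruction/")    # hypothetical project namespace

g = Graph()
g.bind("crm", CRM)

model = EX["model/palace-v1"]         # one version of a reconstruction hypothesis
source = EX["source/floorplan-1785"]  # a digitised archival source

# Classify the model, label it, and link it to the source that documents it.
g.add((model, RDF.type, CRM["E22_Man-Made_Object"]))
g.add((model, RDFS.label, Literal("Hypothetical palace reconstruction, version 1")))
g.add((source, RDF.type, CRM["E31_Document"]))
g.add((source, CRM["P70_documents"], model))
g.add((model, EX["objectType"], AAT["300004792"]))  # hypothetical AAT concept ID

# The same graph can be serialised for exchange or loaded into a triple store
# and queried via SPARQL, e.g.: which sources document which model versions?
for row in g.query("""
    PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
    SELECT ?source ?model WHERE { ?source crm:P70_documents ?model . }
"""):
    print(row.source, "documents", row.model)
```

Stored in an RDF triple store behind a SPARQL endpoint, records of this kind remain both human- and machine-readable and can be versioned, annotated and linked across projects - exactly the kind of interoperability the paragraph above calls for.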
2.6 Developing competencies in dealing with images and digital reconstruction Especially in the humanities, affinity and competence regarding digital research methods have so far been only little developed (Albrecht, 2013). Similar to digital humanities altogether (Vorstand des Verbandes Digital Humanities im deutschsprachigen Raum, 2014), the method-related development of knowledge and competencies of researchers and of users in practice (e.g. curators) concerning the production, evaluation and usage of digital reconstructions poses a main challenge (Kröber and Münster, 2014). Scientific findings in archaeology and building research are in most cases incomplete. The level of accurateness of knowledge extends from authentic finds to scientific hypotheses, which can also be contradictory. Besides a gradual difference between secure and insecure reconstruction, there is also a coexistence of different alternatives. It is a special strength of virtual models to take up this lack of definition and to make it available in the form of special visualisations for scientific discussion and mediation (Grellert and Haas, in print; Lengyel and Toulouse, 2011a, c, 2013). Connected with this is the challenge for users to develop methodological and usage competence in dealing with synthetically produced images and models in both scientific and popular contexts. This includes an awareness of the tentativeness and hypothetical nature of the incorporated knowledge as well as an evaluation competence in regard to fields of usage and production processes. 2.7 Assessment of digital reconstructions as socio-technical systems So far, topics related to digital reconstruction have mainly emerged in the German research and funding environment with technological development and a specific reference to objects in mind. In contrast, an examination of socio-technical aspects has been widely excluded. In addition to the needs already described, the research and development of suitable workflows and strategies for the creation of digital reconstructions is a main task. In addition to ideas on the organisation of working processes and on interdisciplinary communication and co-operation (Münster, 2013) drawn from innovation and project management, innovative approaches such as agile development methods from information technology (Baldwin and Flaten, 2012) promise added value in practice and in science. 2.8 Establishing digital reconstruction in the German digital humanities area Currently, the landscape of digital reconstruction in Germany includes a multiplicity of actors from different backgrounds. So far, they have been insufficiently linked and organised. Hence, the need for joint platforms for exchange and for the establishment of digital reconstruction in the canon of digital humanities, as well as the necessity of supporting networking activities, has been derived. While single references to topics of digital reconstruction in fields such as museology and archaeology have been taken up by panels and workgroups anchored in these fields, structures and institutions for a scientific and practical development have been missing in the German-speaking area. In this regard, a first step is the workgroup for Digital Reconstruction of the Digital Humanities in the German-speaking area association founded in 2014.1 1 Arbeitsgruppe Digitale Rekonstruktion des Digital Humanities im deutschsprachigen Raum e.V. http://www.digitale-rekonstruktion.info/ (12.1.2015).
3. CONCLUSION While the usage of digital reconstruction techniques in the context of cultural heritage has been widely explored in prototypical projects and from methodological perspectives, current challenges aim at the research and development of sustainable and practicable approaches to reach wider scientific communities and audiences (and to establish and ensure scholarly standards in this domain) as well as to enhance interoperability. This includes aspects such as widely interoperable documentation and classification strategies and schemes, an overarching systematisation and cataloguing of projects and created objects, as well as strategies and technologies for an exchange between different technological domains and approaches of usage. Moreover, digital reconstructions are socio-technical systems embedded in complex usage scenarios. For these reasons, it is crucial to determine research and usage scenarios as well as the added values of digital reconstruction and to identify best practice cases. Thus, an identification of both user and non-user needs and motivations, as well as the education and competency development of researchers, producers and recipients, are essential. In addition, research on and the usage of digital reconstruction technologies have to be established and positioned as an important field within the digital humanities scientific community, digital infrastructures as well as the funding community. ACKNOWLEDGEMENTS This paper is originally based on a joint contribution to the agenda development process of the German Federal Ministry of Education and Research (BMBF) in 2014. The authors would like to thank Henning Burwitz, Frank Henze, Stephan Hoppe, Cindy Kröber, Nikolas Prechtel, Georg Schelbert, Catherine Toulouse and Markus Wacker for their valuable ideas, comments and feedback. Moreover, the authors would like to thank all participants of the joint meeting of the digital reconstruction workgroup which met in November 2014 to develop additional ideas and perspectives. REFERENCES Albrecht, S., 2013. Scholars' Adoption of E-Science Practices: (Preliminary) Results from a Qualitative Study of Network and Other Influencing Factors, XXXIII. Sunbelt Social Networks Conference of the International Network for Social Network Analysis (INSNA), 21-26 May 2013, Hamburg. Arbeitsgruppe Digitale Rekonstruktion des Digital Humanities im deutschsprachigen Raum e.V., 2014. Aktuelle Herausforderungen im Kontext digitaler Rekonstruktion (Beitrag zum BMBF-Agendaprozess). Arnold, D., Geser, G., 2008. EPOCH Research Agenda – Final Report, Brighton. Baldwin, T.D., Flaten, A.R., 2012. Adapting the Agile Process to Digital Reconstructions of the Temple of Apollo at Delphi, in: Zhou, M., Romanowska, I., Wu, Z., Xu, P., Verhagen, P. (Eds.), Revive the Past. Computer Applications and Quantitative Methods in Archaeology (CAA). Proceedings of the 39th International Conference. Pallas Publications, Amsterdam, pp. 30-37. Beacham, R., Denard, H., Niccolucci, F., 2006.
An Introduction to the London Charter, in: Ioannides, M., Arnold, D., Niccolucci, F., Mania, K. (Eds.), Papers from the Joint Event CIPA/VAST/EG/EuroMed Event, pp. 263-269. Bendicho, V.M.L.-M., 2011. The principles of the Seville Charter, XXIII CIPA Symposium - Proceedings. Bentkowska-Kafel, A., Denard, H., Baker, D., 2012. Paradata and Transparency in Virtual Heritage. Ashgate, Burlington. Burwitz, H., Henze, F., Riedel, A., 2012. Alles 3D? - Über die Nutzung aktueller Aufnahmetechnik in der archäologischen Bauforschung, in: Faulstich, E.I. (Ed.), Dokumentation und Innovation bei der Erfassung von Kulturgütern II, Schriften des Bundesverbands freiberuflicher Kulturwissenschaftler, Band 5, Online-Publikation der BfK-Fachtagung 2012, Würzburg. D'Andrea, A., Fernie, K., 2013. CARARE 2.0: a metadata schema for 3D cultural objects, The proceedings of the 1st International Conference Digital Heritage 2013, Marseille. http://3dicons-project.eu/eng/Media/Files/CARARE-2.0-a-metadata-schema-for-3D-Cultural-Objects (09.07.2015). De Francesco, G., D'Andrea, A., 2008. Standards and Guidelines for Quality Digital Cultural Three-Dimensional Content Creation, in: Ioannides, M., Addison, A., Georgopoulos, A., Kalisperis, L. (Eds.), Digital Heritage: Proceedings of the 14th International Conference on Virtual Systems and Multimedia. Project Papers. Archaeolingua, Budapest, pp. 229-233. Doerr, M., 2003. The CIDOC CRM – An Ontological Approach to Semantic Interoperability of Metadata. AI Magazine 24. Drewello, R., Freitag, B., Schlieder, C., 2010. Neues Werkzeug für alte Gemäuer. DFG Forschung Magazin 3, 10-14. European Commission, 2011. Survey and outcomes of cultural heritage research projects supported in the context of EU environmental research programmes. From 5th to 7th Framework Programme. European Commission, Brussels. Felicetti, A., Lorenzini, M., 2011. Metadata and tools for integration and preservation of cultural heritage 3D information, XXIII CIPA Symposium - Proceedings. Foni, A.E., Papagiannakis, G., Magnenat-Thalmann, N., 2010. A taxonomy of visualization strategies for cultural heritage applications. Journal on Computing and Cultural Heritage 3, 1-21. Gibbons, G., 2012. Visualisation in Archaeology Project. Final Report. English Heritage, n.p. Grellert, M., 2007. Immaterielle Zeugnisse – Synagogen in Deutschland: Potentiale digitaler Technologien für das Erinnern zerstörter Architektur (Dissertation). transcript Verlag, Bielefeld. Grellert, M., Haas, F., in print. Between Science and Illusion. Virtual reconstructions in Darmstadt University – The Dresden Castle, in: Hoppe, S., Breitling, S., Fitzner, S. (Eds.), Virtual Palaces II: Lost Palaces and Their Afterlife. Virtual Reconstruction Between Science and Media. Proceedings of the European Science Foundation Research Networking Programme PALATIUM meeting at Munich, 13.-15.4.2012. Grellert, M., Pfarr-Harfst, M., Kuroczyński, P., Münster, S., 2015. Virtual Reconstruction and Scientific/Academic 3D Models – Fundamental Considerations (lecture). 43rd Computer Applications and Quantitative Methods in Archaeology Conference 2015, Siena. Günther, H., 2001. Kritische Computer-Visualisierung in der kunsthistorischen Lehre, in: Frings, M. (Ed.), Der Modelle Tugend. CAD und die neuen Räume der Kunstgeschichte, Weimar, pp. 111-122. Hauck, O., Kuroczyński, P., 2014.
Cultural Heritage Markup Language – How to Record and Preserve 3D Assets of Digital Reconstruction, CHNT 19, Vienna. http://chml.foundation/wp-content/uploads/2015/06/CHNT19_Hauck_Kuroczynski.pdf (09.07.2015). Havemann, S., Wagener, O., in print. Castles and their Landscape – A case study towards parametric historic reconstruction, in: Hoppe, S., Breitling, S., Fitzner, S. (Eds.), Virtual Palaces II: Lost Palaces and Their Afterlife. Virtual Reconstruction Between Science and Media. Proceedings of the European Science Foundation Research Networking Programme PALATIUM meeting at Munich, 13.-15.4.2012. Homann, G., 2011. Die Anwendung von Ontologien zur Wissensrepräsentation und -kommunikation im Bereich des kulturellen Erbes, in: Schomburg, S., Leggewie, C., Lobin, H., Puschmann, C. (Eds.), Digitale Wissenschaft - Stand und Entwicklung digital vernetzter Forschung in Deutschland. HBZ, Köln, pp. 33-40. Hoppe, S., 2001. Die Fußnoten des Modells, in: Frings, M. (Ed.), Der Modelle Tugend. CAD und die neuen Räume der Kunstgeschichte, Weimar, pp. 87-102. Kiouss, A., Karoglou, M., Labropoulos, K., Moropoulou, A., Zarnic, R., 2011. Recommendations and Strategies for the Establishment of a Guideline for Monument Documentation Harmonized with the existing European Standards and Codes, XXIII CIPA Symposium - Proceedings. Kröber, C., Münster, S., 2014. An App for the Cathedral in Freiberg - An Interdisciplinary Project Seminar, in: Sampson, D.G., Spector, J.M., Ifenthaler, D., Isaias, P. (Eds.), Proceedings of the 11th International Conference on Cognition and Exploratory Learning in Digital Age (CELDA 2014), Porto, Portugal, Oct. 25-27th 2014, pp. 270-274. Kuroczyński, P., 2012. 3D-Computer-Rekonstruktion der Baugeschichte Breslaus. Ein Erfahrungsbericht, in: Jahrbuch des Wissenschaftlichen Zentrums der Polnischen Akademie der Wissenschaften in Wien, Band 3, Wien, pp. 201-213. Kuroczyński, P., Hauck, O., Dworak, D., submitted paper. Digital Reconstruction of Cultural Heritage – Questions of documentation and visualisation standards for 3D content, EUROMED 2014. Kuroczyński, P., Hauck, O., Dworak, D., Lutteroth, J., 2015. Virtual Museum of destroyed Cultural Heritage – 3D Documentation, reconstruction and visualisation in the Semantic Web, Proceedings of the 2nd International Conference "Virtual Archeology – Methods and benefits". The State Hermitage Publishers, St. Petersburg, pp. 54-61. Kuroczyński, P., Pfarr-Harfst, M., Wacker, M., Münster, S., Henze, F., 2014. Pecha Kucha "Virtuelle Rekonstruktion – Allgemeine Standards, Methodik und Dokumentation" (Panel), 1. Jahrestagung der Digital Humanities im deutschsprachigen Raum (DHd 2014), Passau. Laufer, E., Lengyel, D., Pirson, F., Stappmanns, V., Toulouse, C., 2011. Die Wiederentstehung Pergamons als virtuelles Stadtmodell, in: Scholl, A., Kästner, V., Grüssinger, R. (Eds.), Pergamon. Panorama der antiken Metropole. Verlag Imhof, Petersberg, pp. 82–86. Lengyel, D., Toulouse, C., 2011a. Darstellung von unscharfem Wissen in der Rekonstruktion historischer Bauten, in: Heine, K., Rheidt, K., Henze, F., Riedel, A. (Eds.), Von Handaufmaß bis High Tech III.
3D in der historischen Bauforschung. Verlag Philipp von Zabern, Darmstadt, pp. 182–186. Lengyel, D., Toulouse, C., 2011b. Die Gestaltung der Vision Naga - Designing Naga's Vision, in: Kröper, K., Schoske, S., Wildung, D. (Eds.), Königsstadt Naga - Naga, Royal City. Grabungen in der Wüste des Sudan - Excavations in the Desert of the Sudan. Naga-Projekt Berlin - Staatliches Museum Ägyptischer Kunst München, München, pp. 163-175. Lengyel, D., Toulouse, C., 2011c. Ein Stadtmodell von Pergamon - Unschärfe als Methode für Darstellung und Rekonstruktion antiker Architektur, in: Petersen, L., Hoff, R.v.d. (Eds.), Skulpturen in Pergamon – Gymnasion, Heiligtum, Palast. Archäologische Sammlung der Albert-Ludwigs-Universität Freiburg, Freiburg, pp. 22-26. Lengyel, D., Toulouse, C., 2013. Die Bauphasen des Kölner Domes und seiner Vorgängerbauten: Gestaltung zwischen Architektur und Diagrammatik, in: Boschung, D., Jachman, J. (Eds.), Diagrammatik der Architektur, Tagungsband Internationales Kolleg Morphomata der Universität zu Köln. Verlag Wilhelm Fink, Paderborn, pp. 182–186. Ling, Z., Ruoming, S., Keqin, Z., 2007. Rule-based 3D modeling for Chinese traditional architecture, in: Remondino, F., El-Hakim, S. (Eds.), 3D-ARCH 2007, Zürich. Münster, S., 2011. Militärgeschichte aus der digitalen Retorte - Computergenerierte 3D-Visualisierung als Filmtechnik, in: Kästner, A., Mazerath, J. (Eds.), Mehr als Krieg und Leidenschaft. Die filmische Darstellung von Militär und Gesellschaft der Frühen Neuzeit (Militär und Gesellschaft in der frühen Neuzeit, 2011/2), Potsdam, pp. 457-486. Münster, S., 2013. Workflows and the role of images for a virtual 3D reconstruction of no longer extant historic objects. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences XL-5/W2 (XXIV International CIPA Symposium), 197-202. Münster, S., Köhler, T., 2012. 3D modeling as tool for the reconstruction and visualization of "lost" buildings in humanities. A literature-based survey of recent projects (lecture), in: Hoppe, S., Breitling, S., Fitzner, S. (Eds.), Virtual Palaces II: Lost Palaces and Their Afterlife. Virtual Reconstruction Between Science and Media. European Science Foundation Research Networking Programme PALATIUM meeting at Munich, 13.-15.4.2012. Münster, S., Köhler, T., Hoppe, S., in print. 3D modeling technologies as tools for the reconstruction and visualization of historic items in humanities. A literature-based survey, in: Traviglia, A. (Ed.), Across Space and Time. Selected Papers from the 41st Computer Applications and Quantitative Methods in Archaeology Conference (Perth, 25.-28.3.2013). Münster, S., Prechtel, N., 2014. Beyond Software. Design Implications for Virtual Libraries and Platforms for Cultural Heritage from Practical Findings, in: Ioannides, M., Magnenat-Thalmann, N., Fink, E., Žarnić, R., Yen, A.-Y., Quak, E. (Eds.), Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. Springer International Publishing Switzerland, Cham, pp. 131-145. Niccolucci, F., 2012. Setting Standards for 3D Visualization of Cultural Heritage in Europe and Beyond, in: Bentkowska-Kafel, A., Denard, H., Baker, D. (Eds.), Paradata and Transparency in Virtual Heritage. Ashgate, Burlington, pp. 23-36. Pfarr-Harfst, M., 2011. Documentation system for digital reconstructions. Reference to the Mausoleum of the Tang-Dynastie at Zhaoling, in Shaanxi Province, China, 16th International Conference on "Cultural Heritage and New Technologies" Vienna, 2011, Wien, pp.
648-658. Pfarr-Harfst, M., 2013. Virtual Scientific Models, in: Ng, K., Bowen, J.P., McDaid, S. (Eds.), Electronic Visualisation and the Arts, London, pp. 157-163. Pfarr-Harfst, M., in print. 25 Years of Experience in Virtual Reconstructions - Research Projects, Status Quo of Current Research and Visions for the Future, in: Verhagen, P. (Ed.), Across Space and Time. Proceedings of the 42nd International Conference on Computer Applications and Quantitative Methods in Archaeology (CAA). Archaeolingua, Budapest. Pfarr, M., 2009. Dokumentationssystem für Digitale Rekonstruktionen am Beispiel der Grabanlage Zhaoling, Provinz Shaanxi, China (Dissertation), Darmstadt. Raspe, M., Schelbert, G., 2009. ZUCCARO - Ein Informationssystem für die historischen Wissenschaften. IT - Information Technology 4, 207-215. Riedel, A., Henze, F., Marbs, A., 2011. Paradigmenwechsel in der historischen Bauforschung? Ansätze für eine effektive Nutzung von 3D-Informationen, in: Heine, K., Rheidt, K., Henze, F., Riedel, A. (Eds.), Von Handaufmaß bis High Tech III - 3D in der historischen Bauforschung. Philipp von Zabern, Darmstadt, pp. 131-141. Ronzino, P., 2015. CIDOC CRMBA - A CRM Extension for Buildings Archaeology Information Modeling (PhD-Thesis). The Cyprus Institute, Nicosia. Ronzino, P., Amico, N., Niccolucci, F., 2011. Assessment and Comparison of Metadata Schemas for Architectural Heritage, XXIII CIPA Symposium - Proceedings. Ronzino, P., Niccolucci, F., D'Andrea, A., 2013. Built Heritage metadata schemas and the integration of architectural datasets using CIDOC-CRM, in: Boriani, M., Gabaglio, R., Gulotta, D. (Eds.), Online Proceedings of the Conference BUILT HERITAGE 2013 Monitoring Conservation and Management, Milano, pp. 883-889. Sürül, A., Özen, H., Tutkun, M., 2003. ICOMOS digital database of the Cultural Heritage of Trabzon. XIX CIPA Symposium - Proceedings. Vorstand des Verbandes Digital Humanities im deutschsprachigen Raum, 2014. Digital Humanities 2020, Passau. work_6yqixqlsprc6rlhte7z2c3duy4 ---- University of Groningen Digital Humanities and Media History Wijfjes, Huub Published in: Tijdschrift voor Mediageschiedenis DOI: 10.18146/2213-7653.2017.277 Publication date: 2017 Citation for published version (APA): Wijfjes, H. (2017). Digital Humanities and Media History: A Challenge for Historical Newspaper Research. Tijdschrift voor Mediageschiedenis, 20(1), 4.
https://doi.org/10.18146/2213-7653.2017.277 https://research.rug.nl/en/publications/digital-humanities-and-media-history(42c57666-3d7b-4a30-b270-07eaf78ed67a).html Huub Wijfjes Digital Humanities and Media History A Challenge for Historical Newspaper Research 1 Abstract Digital humanities is an important challenge for more traditional humanities disciplines to take on, but advanced digital methods for analysis are not often used to answer concrete research questions in these disciplines. This article makes use of extensive digital collections of historical newspapers to discuss the promising, yet challenging relationship between digital humanities and historical research. The search for long-term patterns in digital historical research appropriately positions itself within previous approaches to historical research, but the digitization of sources presents many practical and theoretical questions and obstacles. For this reason, any digital source used in historical research should be critically reviewed beforehand. Digital newspaper research raises new issues and presents new possibilities to better answer traditional questions. KEYWORDS: Media History, Political History, Mediatisation of Politics, Digital Humanities, Historic Newspapers Using digital newspaper collections in historical research is quite new, but some of the problems and possibilities connected to this kind of research can actually be quite old. This article aims to explore this theme in the broader context of the rise of digital humanities, especially digital history. The big question here is whether we are facing a revolution in humanities or a clash of innovations and traditions that can be fruitfully reconciled. This also raises questions about the need for digital literacy in historical science. Zooming in on the more specific digital potentials for newspaper history, some theoretical and practical problems will be discussed. A closer look is dedicated to a specific example of digital newspaper research in historical context. This 'Pidemehs'-project tried to uncover the interaction of politics and newspapers in a long period of Dutch history between 1918 and 1967. The findings stress the need to see digital history as a complementary approach, rather than one that can replace the traditional historical approaches. Digital newspaper research raises new types of questions and offers new ways to answer traditional questions.
Clashes in Digital Humanities and Digital History Although the first handbook on digital humanities was published in 2004, it builds on traditions in using computers in historical research going back to the rise of computer-aided research in the late 1940s.2 Digital humanities nowadays is still an experimental but fast-growing field of academic research and education, connecting traditional humanities methodologies (for example historical hermeneutics) to tools that researchers can use to curate or access online collections and to analyse big data sets. Research of this kind has triggered mixed responses, especially in historical sciences. In a special issue of BMGN - Low Countries Historical Review in 2013 several historians debated the possibilities, problems and pitfalls of 'digital history' without coming to some sort of agreement about its value. That seems logical because relatively little historical research using digital sources has been performed, tested and properly evaluated. Although some historians have practised computer-aided research since the nineteen sixties, digital history is still at the beginning of its development. Fundamental questions about the availability and controllability of sources and about the new methods required for digital research still need answers. Furthermore, a functional and openly accessible infrastructure for digital humanities research and research presentation is not operational in most countries. Still, despite all technical and methodological problems and obstacles, digital humanities offers great opportunities for new research that is by nature 'global, trans-historical and trans-media' and has led to impressive claims about its potential impact. Roughly speaking, these claims divide the world of humanities into enthusiastic fans and hesitant critics. In relation to the historical profession it has been said that 'the digital' has divided the profession between 'stalwart believers and underwhelmed agnostics.'3 The agnostics tend to say that until now the digital revolution didn't create a real paradigmatic revolution, but is a 'practical revolution' at heart, making relatively simple keyword searches in singular online sources far easier.4 'Stalwart believers', like Rens Bod in his 2012 inaugural lecture at the University of Amsterdam, claim that they are going to revolutionise humanities to an all-encompassing version 3.0. He stated that after the establishment of hermeneutical and critical traditions of humanities 1.0 in the nineteenth and twentieth century, we are now involved in finding historical patterns in digital big data in humanities 2.0.
That is roughly similar to what media historian Bob Nicholson calls 'the digital turn in cultural history 2.0.' Advocates of this idea say that modern media historians should be looking for patterns and developments rather than performing traditional, interpretative research of separate and specific media-historical cases.5 For the future, Bod sees the big challenge in finding a combination of 1.0 and 2.0 in humanities 3.0: a stage where critical hermeneutical traditions are combined with digital approaches that are able to map encompassing patterns and developments.6 This idea of phases in the development of humanities or historical sciences that are determined by the nature and availability of sources (analogue or digital) and the goal of historical research (interpreting unique events in narrative forms or reconstructing and analysing 'patterns') reignites an old fundamental split in historical science. On the one hand there are the historians producing narratives on the basis of detailed study of a small sample of exemplifying sources. On the other hand there are historians aiming to analyse long-term developments based upon a varied set of (almost) complete or representative sources, providing conclusions that cover a big time span. The latter find new arguments in 'the digital society' with its seemingly endless possibilities of shaping and connecting information and knowledge, any place and any time. In the discussions accompanying this rise of 'digital society' a sharp division can be seen between people who envision a totally new society where the political, economic, technological and social relations will be shaped on a totally different basis, and people who stress the power of traditional culture to adjust to these challenges. It's a split between technological and cultural determinists.7 This clash between technological determinism (sometimes also called 'solutionism' or 'belief in the technological sublime') and cultural criticism is somewhat artificial, because a lot of researchers are open to dialogue. But the 'hyperbolic discourse surrounding digital media' isn't very fruitful in inviting culturally orientated academics that want to be convinced of the practical value of digital research methods.8 More specifically, the clash can be seen in historiography. In their provocative History Manifesto, Armitage and Guldi show, for example, the typical technological determinist combination of worrisome language about out-of-date analogue traditions and the unlimited promises of 'big data' that can be 'mined' to reconstruct 'patterns' and create something of a scholarly paradise. They claim nothing less than 'the power of big data to illuminate the shadow of history.'9 Most cultural historians see such ambitious claims for redefining historical research around 'the digital paradigm' or 'the digital turn' as a threatening takeover by quantitative scientists with an unlimited belief in technological rationality. In their eyes, the 'mechanisation' of the heuristic process threatens to repress a critical attitude and devaluate cultural, contextualised analysis.10 Actually, the call of Armitage and Guldi to 'save' historical science by shifting the research focus from unique details towards generalised patterns is not totally new. In some respects it can be seen as a digital revival of the Annales-movement. This French-born, but decidedly international movement has inspired generations of historians since the nineteen thirties.
The central idea was to approach history as a longue durée, a long-term development that can be found in social and economic life, but also in culture and mentality. Annales-historians were seeking overarching metanarratives, using a combination of quantitative historical trend data and qualitative micro-histories that illustrated the trends on a different level. In the vision of Armitage and Guldi, a revival of this idea is a way to keep pace with the growing influence of economists and social scientists in current and future public debates. It also offers the possibility of keeping historical sciences in tune with the ways new and future generations of scholars formulate research questions, perform searches and interactively connect the presentation of results to the online world. The debate about 'the digital turn' in historical science shows the old ideological question of whether history should hermeneutically focus on understanding and contextualising unique events or analyse structure and patterns based on quantifiable units and data. In the nineteen seventies, this recurring debate could be seen in historical discussions about the need to integrate sociological and economic theory and methodology in historical research. It was considered a shift in research that could prove at last that history was 'a real science' with falsifiable hypotheses and verifiable methods and models.11 The questions in this theoretical debate relate directly to the more practical problem of whether historians should use 'documents' or 'data', or, in other words, should interpret and tell stories or provide quantitative evidence for hypotheses.12 According to Rieder and Röhle digital methods actually raise the question: do statistics and algorithms reach a higher level of objectivity than human interpretation? A second question is about the domination of visual output in digital humanities research. A lot of this research seems to flourish thanks to the spectacular 'infographics' and 'shock and awe'
Critics say that this reliance on code, computer languages and | 7Huub Wijfjes https://vimeo.com/204951759 algorithmic reasoning is problematic for, or even incompatible with, the critical interpretative approach that still is at the basis of most humanities research.15 In this heated debate, there is a danger for unconstructive mutual condemnation. Rather than stressing the unbridgeable technological and cultural determinism, it is much more fruitful to conceive the divergent approaches as a set of methodological and practical issues that need to be addressed and solved in concrete research and should be subject to constant methodological evaluation. The critical scepticism about digital history creates an artificial antagonism between quantitative and qualitative methods or – to say it more harshly – between ‘scientific, digital’ and ‘interpretative, analogue’ historical research.16 However, in the research practices usually both perspectives and methods are used side by side in a complementary way.17 Fears of cultural historians that their ownership of the historical field will be stolen or washed away by a digital flood, doesn’t demonstrate a lot of self- confidence. If the historical debate about the Annales-methodology for example shows anything, it is that the structuralist and quantitative approaches didn’t replace, but in the long run strengthened cultural, political, biographical and other qualitative or interpretative historical approaches. In historical research, the nineteen nineties even gave rise to a ‘cultural turn’ as a response to the rise of quantitative methods coming from social and economic history. This could for example be seen in media history. From focusing on big processes in institutional media production and societal and political developments, attention shifted to the media content and its meaning in the specific historical context of media reception by publics, each with a different cultural background.18 This all indicates that ‘the digital turn’ does not necessarily mean squandering the strengths of cultural approaches. Progress can be made if we understand what digital cultural data are, what digital tools exactly do and how the results can be fitted and contextualised in broader ensembles of historical sources. As Berry asserts in an edited volume with reflections on digital humanities: ‘Computationally supported thinking doesn’t have to be dehumanising (…) but can give us greater powers of thinking and larger reach for our imaginations…’.19 Of course one must acknowledge that there is a difference between the traditional close reading of a limited amount of texts and the ‘distant reading’ of large amounts of data. Historians however should not become what they aren’t: computer scientists. They should use new methods to expand their horizon and possibilities to answer questions of historical value. On the other hand, digital historians should be more aware that there is a big and understandable difference between statistical or algorithmic significance that computers and software engineers subscribe to, and the cultural or historical significance that historians are attached to as a way of contextualising history. Generally speaking ‘the way in which computers work is not automatically compatible with the way historians work.’20 Not automatically indeed, but compatibility can be achieved by acknowledging the strengths of both sides. 
Historical research cannot exclusively be the algorithmic processing of big data sets, no matter how sophisticated the methods are or will be.21 It also needs research based on the critical interpretation of hybrid information from multiple and varied sources. 8 | TIJDSCHRIFT VOOR MEDIAGESCHIEDENIS - 20 [1] 2017 Literacy and source criticism in Digital History Of course, digital history creates research dilemmas, especially about the balance between digital methods and historical interpretation. Digital historical research often concentrates on technological possibilities and the shrewdness of digital tools as such.22 This implicitly creates a new dominant paradigm about history to be understood not as a set of unique social and cultural phenomena largely determined by distinction, deviance and coincidence but as a cohesive culture that can be understood just by using shrewd algorithms and present the results in spectacular ‘shock and awe visualisations’.23 Data analysts also acknowledge that ‘there is a risk that we look more carefully at the technical components of the datasets than the historical context of the information that they represent.’24 But digital history is more than that. Since the increasing importance of digital communication and digitised historical sources from the nineteen nineties onwards, interest in what this means for historical sciences is obviously growing.25 Looking at the practical results of digital history one should say that expectations about ‘a revolution’ should not be too high. Most historians still see the digital world just as a convenient place for fast and efficient browsing in the rich information sources available and not as a vital environment for historical analysis. Digital history is sometimes seen as an effort to give history meaning in a new environment and create interactive historical debates on the Internet. Characteristically, one of the first books dedicated to digital history, dating from 2006, focused on ‘the Gathering, Preserving and Presenting the Past on the Web’.26 Still scarce are historians who seriously explore the possibilities of analysing digital historical data and integrate results in a broader historical debate. The reason for this may be the pressing need to understand the nature of big data and the many techniques and tools for data storage and analysis, like text mining, topic and concept modelling, network analysis and visualisations. In order to look at historical big data through a ‘macroscope’ it is required for a historian to get a grip on these data, techniques, methods and tools.27 Big question here is to what extent historians need to understand software and digital techniques. Are they digitally literate enough for this task? Of course, every specific research effort requires deep understanding of the methods used for delivering answers, but fully understanding digital methods is challenging for humanities scholars because it requires specialised knowledge of statistical modelling, programming languages, and the way algorithms are used for ‘data mining’. 
This knowledge is generally restricted to insiders; for most historians the necessary computational knowledge and software are a step too far, and the technical side of data collection remains a black box process that is hard to assess.28 Because of their insufficient insight into the algorithmic logic driving these black box processes, historians run the risk of making themselves dependent on a computational logic they do not fully understand, having to rely on professionals in different and often distant fields, such as computational linguistics, information and computer science, who, in turn, lack the domain-specific expertise that historians bring to the table.29 Another question that historians are faced with is whether we can understand history just by looking at and analysing digital sources. For an understanding of our dominantly digital contemporary culture, one cannot deny the indispensable relevance of born-digital sources. But what about history that was created in analogue forms, like handwriting, manuscripts, print and analogue audiovisual material? One can of course say that the problem will be solved once these forms are digitised, but that moment is still far away. As we shall see in the review of digital newspaper research, the lack of digital historical sources can be a real problem that should be tackled on the basis of classic source criticism: the need to evaluate the reach and restrictions that relevant sources (or the lack of them) offer for answering specific historical research questions. In this respect it is of the utmost importance to acknowledge that most archival sources have not yet been digitised, and will not be digitised and made publicly accessible in the coming decades because of the enormous costs and copyright problems. Relying solely on digital analysis is therefore too limited in scope, and even dangerous, because it feeds the idea that only information that is instantly available online is relevant. That creates ‘digital laziness’, which is a direct threat to the historical need to critically evaluate all relevant surviving sources, and not only those that are digitally available. In this kind of evaluation, constant acknowledgement is necessary that every source only gives a very specific picture of historical reality.30 The importance and relevance of this is demonstrated by research showing the sensitivity of media historical researchers to the availability of data and tools. Research questions and strategies can change fundamentally in this ‘data-driven research’.31 If data are not digitally available, researchers simply turn to data that are, and fit their questions to this environment. This also directs us to the problem of a distinct and properly facilitated digital infrastructure for performing digital historical research. Enormous sets of digital historical data have already been gathered in data archives, sometimes together with digital tools to analyse the data. On this foundation, research projects have been set up, generally bringing together historians and computer scientists. This research effort does not seem to be rooted in an urgent need for different views on history, but in the awareness that digital data and software are increasingly guiding our contemporary world and can therefore also be decisive for historical knowledge and understanding.
Or, as Lev Manovich wrote about ‘softwarised culture’: ‘software plays a central role in shaping both the material elements and many of the immaterial structures which together make up culture.’32 If it is true that the digital is determining our contemporary culture, it is also determining how we should perform historical research. Close cooperation between specialists in both fields is the obvious solution, but generally speaking digital techniques dominate many of the current collaborations. Maybe that is logical because of the many technical problems that must be solved, but historians have important problems to solve as well. Although truly interdisciplinary research efforts are still at the very start of their development, the combined use of digital and more traditionally stored historical sources has become a more or less normal part of the professional historical field. The big challenges therefore lie not only in the analysis of digital sources, but also in developing a professional attitude as a historian in the digital world.33

A digital turn in newspaper history

How did media historical research, especially newspaper research, develop in this emerging digital infrastructure? For an answer we must return to ‘the cultural turn’ in media history since the nineteen eighties. As stated before, the focus in research shifted from the institutional and political background of media institutions to the cultural meaning of media content for publics.34 In this respect, the availability of content sources like newspapers, films and broadcasting programmes became increasingly vital, as did the methods to analyse this content. Traditionally, a lot of experience had already been built up in historical media content analysis. In historical newspaper analysis, for example, tailor-made approaches were developed in the context of each specific research project. Media historian Frank van Vree, for example, analysed the content of four major Dutch newspapers in relation to their attitude towards Nazi Germany between 1933 and 1939. The sections on the historical context of the press in this period are just as long as the actual content research, which can be characterised as a historical discourse analysis focusing strongly on opinion articles and background stories in the four newspapers. Because of the labour-intensive nature of this sort of analysis, not the entire content of the newspapers could be included. Nor could vital sections of the Dutch press in this period be included, like the national neutral or regional press. Questions can therefore be raised about the representativeness of this research for the interpretation of ‘public opinion’.35 In a later study of the cultural transformation of the leading national newspaper De volkskrant in the nineteen sixties and seventies, Van Vree’s focus was also restricted to certain carefully selected sections of the newspaper. In comparable studies of similar developments in newspapers, the same restrictions were characteristic of the research.36 More recently, methods in historical newspaper research have been developed to look more systematically at the long-term development of journalistic practices or genres. In the Netherlands, media historian Marcel Broersma kicked off this research with a longitudinal analysis of the content of one newspaper over 250 years.
Style and genre analysis were integrated into thoroughly contextualised research on the institutional and political development of this newspaper.37 Following the same lines, but with more emphasis on a single genre within several (international) newspapers, was the research of Frank Harbers, who analysed the development of the reportage in newspapers in Great Britain, the Netherlands and France between 1880 and 2005. Rutger de Graaf also employed a quantitative content analysis, to reconstruct the intertextual connections between the content of pamphlets and newspapers in nineteenth-century Dutch society.38 The principal aim of these studies was not to analyse digital data, but to shed light on long-term trends in newspaper content in relation to societal and political developments. The data themselves were mainly gathered by manually conducting a large-scale quantitative content analysis, using specific coding schemes and testing for intercoder agreement to ensure the reliability of the research. The advantage of these methods is that the coding is tailored to answering very specific historical questions. The disadvantage was, of course, the still limited amount of research material that could be examined and the risk of subjectivity in the coding decisions. Generally speaking, only samples were taken every ten or twenty years, for instance two constructed weeks to represent a particular sample year. As long as there is no sound method of automating the search for specific and complex historical entities like ‘reportage’ or ‘comment article’, manually conducted research relying on smaller samples of the research material will remain necessary. The cultural, interpretative tradition in newspaper history shows the value of textual research, but also the critical importance of contextualising this type of research. Focusing strictly on the text itself can be very useful, in linguistic studies for example, but in media history the context is indispensable for a meaningful interpretation of the past. In the digital environment this is crucial too. An example of the necessity of contextualising digital research questions is provided by an exploratory study of the theoretical concept of ‘pillarisation’ in Dutch history. A research project called ‘Verrijkt Koninkrijk’ aimed to analyse the digital texts of historian Loe de Jong in relation to ‘pillarisation’, a long-term process of societal and political segmentation characteristic of Dutch culture roughly between 1900 and the 1960s. It showed that De Jong, in his fourteen-volume book about the Netherlands during the Second World War, did not write about concepts like ‘zuilen’ (pillars) and ‘verzuiling’ (pillarisation), but referred to related concepts like ‘volksdelen’ (sections of the national community). Researchers also found that these words were not used with the same, uniform connotations. So alternative queries had to be developed, taking into account that the pillar is a broad concept with different meanings on different levels. To get a grip on that, contextualised research is necessary. A researcher should also look at the sentiment with which the more detailed concepts were used.
All this requires sufficient historical expertise to frame the problem in historically correct proportions, and digital expertise to produce sophisticated search methods and tools.39 For newspaper research, digital approaches seem to offer more possibilities than ‘old, analogue’ methods, like selectively browsing through newspapers, reading some selected and relevant content and interpreting that in relation to other sources of historical knowledge. Browsing through and closely reading historical newspapers in this manner gives opportunities to see the historical context of newspaper content more clearly. So any suggestion that digital history research can best be performed in a closed digital environment with big data as the only source would be a misunderstanding of the value of ‘analogue’ research forms like browsing and in-depth analysis of singular sources.40 Undoubtedly, new text and data mining methods bear a promise, as they can overcome some of the limitations of manual browsing. In principle all texts are available for fast computer-aided analysis, no longer dependent on indexing or coding and with possibilities for unlimited combinations of keyword searches.41 Expectations are sometimes so high that historians like Joris van Eijnatten argue that ‘manual browsing and sampling in various forms (…) are no longer necessary.’42 Yet the same author also casts doubt on these expectations by concluding that ‘text mining techniques will displace but not replace traditional hermeneutic methods.’43 That may be comforting for the traditionalists, but above all it accentuates that digital history is here to stay. Almost all historians working with historical media sources agree that the greatest potential in working with digital sources lies in reconstructing long-term connections between contents that until now could not be connected. New software techniques for historical data mining facilitate historians who are looking for patterns in large amounts of texts like newspapers. One example is a content analysis of millions of articles published in British periodicals since 1800, aiming to detect specific events, like wars, epidemics, coronations, or conclaves.44 With the use of refined artificial intelligence techniques, the researchers were able to move beyond counting words by detecting references to named entities. These techniques showed both a systematic underrepresentation and a steady increase of women in the news during the 20th century, and the change of geographic focus for various concepts. They could also detect the dates when electricity overtook steam and trains overtook horses as a means of transportation, both around the year 1900, along with observing other cultural transitions. Another example is the research project ‘Translantis’ of Utrecht University, which maps debates about the supposed Americanisation of European culture in the twentieth century. The theoretical concept used in this research is ‘reference culture’, defined as ‘spatially and temporally identifiable cultures that offer a model to other cultures and have exerted a profound influence in history.’ This concept is researched in a set of digital historical sources like newspapers, creating a network of references to the United States in the Netherlands between 1890 and 1990.45 Tracing ‘patterns’ like this is indeed a goal of digital humanities research in general.
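To give a schematic impression of the named-entity approach described above, the sketch below counts mentions of persons and places per year in a small corpus. It is a minimal illustration in Python, assuming articles are already available as plain text keyed by year; spaCy's general-purpose English model stands in for the tailored pipelines that the cited studies actually built, and the toy corpus is invented for the example.

```python
# Minimal sketch: counting person and place mentions per year in a small,
# invented corpus. spaCy's general-purpose English model (which must be
# installed separately) stands in for the tailored pipelines of the studies
# cited above.
from collections import Counter

import spacy

nlp = spacy.load("en_core_web_sm")

def entity_counts_per_year(corpus: dict[int, list[str]]) -> dict[int, Counter]:
    """For every year, count mentions of PERSON and GPE (place) entities."""
    counts: dict[int, Counter] = {}
    for year, articles in corpus.items():
        year_counter: Counter = Counter()
        for text in articles:
            for ent in nlp(text).ents:
                if ent.label_ in ("PERSON", "GPE"):
                    year_counter[(ent.label_, ent.text)] += 1
        counts[year] = year_counter
    return counts

# Toy example: two invented 'articles' for a single year.
corpus = {1900: ["Queen Victoria opened the exhibition in London.",
                 "Mr Gladstone spoke in Manchester about the railways."]}
print(entity_counts_per_year(corpus)[1900].most_common(3))
```

Aggregating such counts over decades is what allows the long-term trends mentioned above, such as the changing share of women in the news, to become visible at all.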
But most historical researchers stress that these patterns only get real meaning if they are combined with contextualised research, for example a qualitative interpretation of specific texts, words or visuals. With digital newspaper research we can trace the development and intensity of influential events and persons, but for the interpretation of how these constructions were made in different periods we need to take a closer look at the content in its media and cultural context. To make the problem more concrete on an international level: with digital newspaper sources we may be able to trace the complete newspaper coverage of the Dreyfus affair in French society in the twentieth century (supposing all newspapers are digitised, which is not the case). Yet, in order to say something about how this event was constantly redefined in different contexts, we need to look at single newspapers in connection with the broad cultural and political context of their time. For this we need digital research too, because it can allow us to zoom in on content that in a traditional way could only be found by time-consuming browsing of newspapers or viewing many hours of broadcasting material.

Putting theory to practice: opportunities, challenges and problems

Historical newspaper research offers a relevant insight into the practical and methodological problems of digital history. The growing digital collections of newspapers everywhere in the world promise a lot, but experience in analysing newspaper content in historical research also confronts us with practical problems that cannot be solved easily and immediately. First of all it must be stressed that an entirely centralised storage of all digital newspapers on a national level does not exist, even in countries with a powerful national library infrastructure, like most Western European countries. In these countries the collections are held by national institutions, such as the British Newspaper Archive (subscription), Library of Congress (free), ProQuest Historical Newspapers and Newspaper Archive Library Edition (subscription), the Delpher collection of the National Library of the Netherlands (free), Zefys of the Staatsbibliothek in Berlin (free), Gallica of the Bibliothèque Nationale de France (free) and the Trove collection of the National Library of Australia (free).

Instruction video for Delpher online database (in Dutch): https://www.youtube.com/watch?v=T8M5IQVMpok

Next to these big digital newspaper archives, all kinds of specialised – regional, local, thematic – collections pop up in the online world. Each of these collections can make use of specific interfaces, standards and/or tariffs for accessibility and use. Most of them are publicly funded; some are private initiatives that can reach a high quality of service. The American-based ‘Media History Digital Library’, for example, digitises and hosts full and free access to complete collections of classic media periodicals, mainly magazines on broadcasting, film, and communication technique and policy. This online library is supported by owners who loan their magazines for scanning. Voluntary donors contribute the funds to cover the cost of scanning.46 Because there is no standardised rule for adding metadata in these digitisation processes, connections between the metadata sets of all these separate collections are hard to establish. That complicates genuinely new digital search methods like text mining and network analysis.
In addition, some important collections like the commercial Lexis-Nexis Academic Newspaper database are based on text only and therefore totally ignore the visual dimension of news, a fundamental problem for certain research questions.47 That problem is comparable to other problems surrounding the statistical analysis of the digital data behind the newspaper itself. These metadata, containing all the words, tags, dates, titles and other relevant bits of information, are also used to segment the newspapers, for example into articles, visual elements, advertorials, etcetera. Metadata and segmentation can be the basis for statistical analysis. But for that purpose the data should be uniform, quantifiable and preferably also complete. This uniformity and calculability cannot be guaranteed in public search engines such as Delpher, Zefys, Gallica and Trove. These search engines are designed for relatively simple search queries and for making connections between the content of newspapers, magazines, journals and – in some cases – even books. They seem ready-made for researching long-term and complex interrelated ‘patterns’.48 But they are not very well suited to making statistical calculations. For statistical analysis the metadata behind the search engines can be useful, but in most cases these metadata are not publicly accessible. For research reasons they can sometimes be consulted on request. More convenient, however, would be an infrastructure that is specifically designed for research. Preferably, all heritage institutions that hold media historical collections would cooperate in this infrastructure. A good, but still experimental, example is ‘Europeana Newspapers’, a project of eighteen European libraries creating full-text versions of about ten million newspaper pages.49 It also detects and tags millions of single articles with metadata and named entities (information identifying people, locations, etcetera). This kind of project offers advantages in developing useful tools and expertise on the collections themselves, but in the long run it can also provide opportunities to connect databases of different origins. In order to shed some light on the historical development of the public sphere, for example, one can imagine that we need to connect the content of journalistic magazines, newspapers, and radio and television with other sources on reality, like proceedings of parliament, general magazines, scientific and special interest journals, films, books and new media content. Next to this general infrastructural problem (which really must be solved to improve the value of digital media historical research), practical problems call for solutions. First of all, and most prominent, is the problem of incompleteness. The digitisation of sources and the preservation of original (analogue) sources come with considerable costs. Making complete digital versions of analogue sources therefore takes a lot of time. Since the beginning of the twenty-first century, big projects have been started to digitise collections of newspapers. The National Library of the Netherlands, for example, has invested in a project with the aim of digitising every newspaper in its huge collection, which spans the period from 1618 to 2000. In 2015 more than nine million pages, originating from 1,700 newspaper titles and containing approximately eighty million articles, had been digitised (Figure 1).
These figures are impressive, but still only fifteen percent of the total collection of newspapers is covered. With eighty-five percent still to go, digitising all newspapers is indeed a long-term project.50

Figure 1. Number of digitised newspapers per year, available in the Delpher collection of the National Library of the Netherlands, 1600–2000. Reference date: January 2017. Source: the National Library of the Netherlands, The Hague: http://www.delpher.nl/nl/kranten#krantenoverzicht. The figures in the graph are continuously updated.

Obviously, big gaps can be seen in the digital newspaper collection now available. While circulation figures of the Dutch press show considerable growth between 1945 and 2000, the digital collection, in contrast, shows a considerable decrease. The reason is that newspaper titles younger than seventy years can only be digitised and made publicly accessible with the permission of copyright holders. The consequences are demonstrated in Figure 1: for the period after 1945 most newspapers are not publicly available for digital research. It can be said that we are facing an enormous black hole in the digital collection of historical newspapers. From a historical point of view, avoiding this problem by focusing on the available newspapers can be an irresponsible and unjustifiable solution – emphasising the need for researchers working with these collections to always demonstrate their accountability and the awareness that they are basically working with a ‘convenience sample’. The depth of this problem of incompleteness was shown concretely in the historical research project ‘Pillarization and Depillarization Tested in Digitised Media Historical Sources’ (Pidemehs).51 The universities of Groningen and Amsterdam performed this project between 2014 and 2016, in close cooperation with the Netherlands eScience Centre, the National Library of the Netherlands and NIAS. It aimed at reconstructing long-term patterns in the historical relationship between Dutch political and newspaper cultures on the basis of available digital newspaper collections and digital political sources, like party political programmes and proceedings of parliament. Presentation of the results is forthcoming in another publication, so here only some findings about the research practice are presented.52 Pidemehs first of all showed the necessity of thorough preparation (including critical source evaluation) and of controlling digital search queries on the basis of contextualised historical research. Before starting such historical research in digital newspapers, some consideration had to be given to the nature of the digital data sources. In what way and to what depth are these data constructed, assembled or stored, and how representative are they of the total of newspaper sources produced in certain periods? An important question related to this is what metadata are connected to the data and how these data relate to the automated segmentation of newspaper content into articles, visuals, advertorials, etcetera. The project showed the huge limitations created by the relative scarcity of digital sources, gaps in collections and technical failures connected to the digitisation process. These problems limited the research to the period for which a representative and relevant set of digital newspapers could be guaranteed: 1918–1967. The original setup, which stretched to the period up to 2000, was impossible to realise due to copyright problems.
The availability or lack of digital newspaper titles proved to be vital for tackling certain research questions within the Pidemehs project. For an analysis of the long-term relationship between newspaper content and political identity, for example, digital copies were needed both of newspapers known for their political or religious identity and of those that called themselves ‘neutral’ or ‘non-partisan’. It appeared that both could be lacking. In the newspaper collection of the National Library of the Netherlands, for example, no complete digital set is kept of the most important Protestant newspaper between 1870 and 1940 – De standaard – probably because of a lack of money to digitise the complete set. Furthermore, at the time of this research project a complete set of liberal newspapers like NRC and Algemeen handelsblad was lacking; only certain parts of the interwar years had been digitised and made accessible.53 Similarly, at that time a digital copy of the most important Catholic newspaper, De volkskrant, from 1919 until now was not available because of copyright problems.54 All in all, the available data limited the research to an analysis of socialist, Catholic and neutral groups and newspapers. The incompleteness of the available data is the biggest practical problem, but not the only one. Lack of uniformity in the data is another. Effective historical data mining builds upon uniform data. For example, if you are looking for the intensity of newspaper attention for a political party named RKSP, how can you be sure you will retrieve all relevant data? One problem is that newspapers do not make a habit of standardising names and concepts, so a search query needs to include all name varieties. Building on expert knowledge of political history and existing documentation of political parties, a list can be made of all the varieties the party RKSP (and its predecessor) used in the period between 1918 and 1940. That list looks like this: ‘ABRKKV; BRKKV; Algemeene Bond van Rooms-Katholieke Kiesvereenigingen; Bond van Katholieke Kiesvereenigingen; Katholieke Kiezersbond; R.K.S.P.; RKSP; Roomsch-Katholieke Staats-Partij; Rooms-Katholieke Staatspartij; Katholieke Staatspartij; kath. Staatspartij; R.K. Staatspartij, onze Staatspartij, onze partij’. The same procedure was followed for other party names. Searching for names of persons (leading politicians in this case) can create the challenging problem of how to isolate exactly one relevant person and exclude persons bearing the same name. Working with searches that combine the name with the proximity of relevant names, titles or concepts (party leader, prime minister, politician, etc.) can help, but this requires some carefully performed trial-and-error operations. It all stresses the importance of the specialised contextual knowledge needed when performing this kind of digital historical newspaper research. While reconstructing the historical relationship of prominent political persons (ministers, party leaders, etcetera) to newspaper content in the Pidemehs project, it became clear that restricting the analysis to the quantity of mentions of these persons in newspapers raises questions. In the Dutch context one finds that politicians dominating a distinct period, like the interwar years (Colijn, De Geer) or the nineteen fifties (Drees, Romme), are mentioned more than average, and not only in the press loyal to their policies.
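As a schematic illustration of such variant-aware searching, the sketch below turns the RKSP name list quoted above into a single search pattern and counts matches in locally stored article text. It is a minimal Python sketch, not the interface of Delpher or any other archive; the example sentence is invented, and the context-dependent variants ‘onze Staatspartij’ and ‘onze partij’ are left out because, as noted above, they would need proximity checks rather than literal matching.

```python
# Minimal sketch of variant-aware searching with the RKSP name list quoted
# above. Plain regular expressions over local text; longer variants are tried
# first so that a shorter alternative cannot pre-empt a longer match.
import re

RKSP_VARIANTS = [
    "ABRKKV", "BRKKV",
    "Algemeene Bond van Rooms-Katholieke Kiesvereenigingen",
    "Bond van Katholieke Kiesvereenigingen", "Katholieke Kiezersbond",
    "R.K.S.P.", "RKSP", "Roomsch-Katholieke Staats-Partij",
    "Rooms-Katholieke Staatspartij", "Katholieke Staatspartij",
    "kath. Staatspartij", "R.K. Staatspartij",
]

# One alternation pattern; re.escape keeps the dots in 'R.K.S.P.' literal.
PATTERN = re.compile(
    "|".join(re.escape(v) for v in sorted(RKSP_VARIANTS, key=len, reverse=True))
)

def count_party_mentions(articles: list[str]) -> int:
    """Count every occurrence of any RKSP name variant in the given articles."""
    return sum(len(PATTERN.findall(text)) for text in articles)

# Invented example sentence, purely for illustration.
print(count_party_mentions(["De R.K.S.P. vergaderde; de RKSP-leider sprak."]))  # -> 2
```

Person names require the same kind of care, plus the proximity checks described above, before their mention counts can be trusted.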
Such mention counts give a clear indication that pillarisation is not only a question of loyalty restricted to one’s own ideological group; it is also about the need for a competitor or enemy. This calls for more qualitative research into the way politicians are depicted in certain newspaper content. This too can be researched digitally, using sentiment mining techniques. The above demonstrates that in order to excavate big data efficiently you need tools that only highly skilled data engineers can use or develop. Close cooperation with language specialists and/or historians is vital here.55 Heritage institutions can play a role in developing such tools to analyse their digital collections, in cooperation with universities and research institutes. Some experience has, for example, been built up with open source mining technology in research on historical newspapers. In the historical ‘sentiment mining’ programs WAHSP and BILAND, word clouds are created based on relative frequencies in the retrieved selection of documents in the corpus. A word cloud can highlight negative or positive connotations, but this still needs further historical contextualisation because connotation constantly changes over time.56 A tool like Texcavator – developed by Utrecht University and the Netherlands eScience Centre in order to trace patterns in public discourse – also copes with this problem.57 Developing complex and tailor-made digital search methods that can tackle specific problems forms one of the big challenges of digital media history. This is especially true of the problem of how to retrieve and analyse visual or iconic elements within newspapers, like photographs, cartoons, maps and graphics. The search for the proliferation of iconic photographs in public debates, for example, has only just begun.58 ‘Pidemehs’ and other digital humanities projects show how copyright problems can create severe limitations of use, especially for late twentieth-century newspapers. Retrieval and consultation in a shielded research environment (using a proxy server, for example) may offer a solution, but then the publication of results in an open access environment can become problematic. If scholars can only read about results without the possibility of checking and verifying them in the original research data, the scientific historical routine is threatened. This does not mean that completeness and full accessibility have been reached for newspapers dating from the period before around 1940. In the digitisation processes of newspapers, priority selections have been made, generally on the basis of advice given by researchers. Unavoidably, that creates gaps in the digital collection. Specialised research has shown that even for the seventeenth century, where copyright problems are not an issue and the total amount of newspapers is relatively small, fifty-two percent of all surviving hard-copy newspapers between 1618 and 1650 are ‘lost in digitisation’. Of the 750 surviving copies of the oldest Dutch newspaper – the Courante uyt Italien, Duytschlandt &c published by Jan van Hilten – until now only 199 copies have been digitised and made publicly accessible in Delpher.59 It takes historical expert knowledge to understand the depth of this problem and possibly create solutions. But maintaining expertise about the context of the original sources and the handling of digital bearers not only costs a lot of money, but also requires an understanding of the relationship between the original analogue newspaper and its digital form.
‘When we digitise a newspaper, it is fundamentally changed (…) sources are remediated and not just reproduced,’ historian Bob Nicholson rightly remarked.60 Tagging articles with metadata categories like ‘advertorials’, ‘family advertisements’, ‘news lead’ or ‘news reports’, for example, facilitates research considerably, but these tags can be anachronistic because the connotation of these kinds of concepts changes over time. This historical source awareness is growing steadily. So maybe the problem of cost is more pressing. Who will pay for the digitisation of all newspapers? In general one can only say that creating facilities for scientific research in Western Europe is in principle publicly funded. But the public interest clearly clashes with private interests on the issue of copyright. And the copyright problem really is decisive for the lack of completeness in twentieth-century media historical sources like newspapers, magazines, films and broadcasting material. Next to the incompleteness in quantity, problems are also created by OCR mistakes. It is still unclear how stable and precise the technology of digital bearers is, but experience in digital projects clearly shows unreliability in the relation between the original analogue and the new digital bearer. The accuracy and quality of Optical Character Recognition (OCR) in scanned documents can seriously influence the segmentation and the number of mistakes in the digital search possibilities, especially in documents that require specialised knowledge to read or interpret.61 OCR mistakes are, for example, a special problem in almost all texts produced before 1850, because of the inconsistency in typographic form and layout in the older periods.62 One can see the consequences in the digitised collection of historical newspapers of the National Library of the Netherlands, where the accuracy of the OCR demonstrably improves over time: the older the original bearer, the more mistakes its digital version contains. It is estimated that this can run up to more than eighty percent for some seventeenth- and eighteenth-century newspapers that have peculiar layout features or use unique fonts. For seventeenth-century newspapers with a regular layout, gothic lettering and vertical text layout, the failure rate is estimated at between fifteen and twenty percent.63 This is not to say that the failure rate in newspapers with modern, standardised lettering and layout is negligible or even non-existent. A search for the use of a relatively new Dutch word like ‘verzuiling’ (pillarisation) in historic newspapers demonstrates this. Historical context research has shown that ‘verzuiling’ was developed as a concept to interpret Dutch political culture in the nineteen fifties of the twentieth century. Yet this neologism shows up twice in eighteenth-century Dutch newspapers available through the search engine Delpher of the National Library of the Netherlands. For the nineteenth century, thirty-three results show up as ‘verzuiling’ while the original newspapers actually mention: verzameling, vervulling, verzetting, verzoeking, verzoening, verzorging, vergoding and verzanding. In the twentieth-century period before the first proper use of ‘verzuiling’ in 1952, more than thirty-five OCR mistakes pop up.
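The mechanics behind such false hits can be made visible with a few lines of code: most of the words Delpher wrongly returned for ‘verzuiling’ lie within a small edit distance of the query, so a handful of misrecognised characters is enough to turn one word into the other. The sketch below is a minimal Python illustration using the classic Levenshtein distance; it is not the matching logic of Delpher itself.

```python
# Minimal sketch: Levenshtein edit distance between the query 'verzuiling'
# and some of the words the original newspapers actually contained. A small
# distance means a few OCR character errors suffice to produce a false hit.
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

QUERY = "verzuiling"
for word in ["verzameling", "vervulling", "verzoening", "verzorging", "verzanding"]:
    print(word, edit_distance(QUERY, word))
```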
Carolyn Strange and other American press historians also point to OCR errors and other technical obstacles in their historical research, like the lack of expert metadata at document level in historical American newspapers. Their conclusion, on the basis of a clearly outlined selection of nineteenth-century newspaper research, is that correction of OCR failures (in their data set: around twenty percent) is ‘desirable but not essential’ in this kind of topical research, supposing there is enough time to check what exactly the failures do in specific search queries.64 That is of course different with failure rates running up to more than eighty percent in older newspapers with peculiar typographical features. And it is different if statistical analysis is one of the research tools, because statistical programs or algorithms generally do not automatically discount OCR mistakes. There are several methods of OCR failure correction – which cannot be discussed in detail within the scope of this article – but none has yet developed into a definitive solution. The ideal is to reduce failures, preferably through double manual correction or even crowdsourcing. Crowdsourcing is promising, but despite the success of crowdsourced knowledge databases like Wikipedia and the positive experiences with some crowdsourcing projects at cultural heritage institutions, there is still some doubt about its value and reliability for scientific purposes.65 Technicians predict that self-learning software can solve the problem in the long run, but this requires human input to ‘instruct’ the software about what is correct and what is not. And although there are scholars claiming that crowds of annotators can produce better, more reliable results in adding or correcting metadata than annotators with expert knowledge, curators of heritage institutions remain cautious.66 These institutions still have a vital intermediary function, and some experiment with increasing the reliability of metadata and segmentation. The British Newspaper Archive and the National Library of Australia allow users to correct OCR errors and add tags they think are relevant for the article in question.67 Together with the Meertens Institute, the National Library of the Netherlands works with a large group of volunteers to re-type the articles in the digital collection of seventeenth-century newspapers on the basis of the OCR.

Conclusion

The digitisation of historical newspapers has undoubtedly stimulated research, but eagerness to use the sources sometimes takes away from the awareness of the new problems accompanying these approaches, especially since the storage and retrieval of, and the access to, the data are still highly problematic.68 Storage and free access are of course classical problems. From the perspective of historical research, free availability of complete and uniform sources has always been vital. The historical infrastructure that was built in the nineteenth and twentieth centuries is the result of this endeavour: publicly accessible archives, concise and extensively annotated source publications, heritage institutions guarding complete and contextualised collections, and long-term research projects. These cultural endeavours acquire a new dimension in the digital world. Finding proper solutions for a fruitful infrastructural combination of analogue and digital sources is in full development. For researchers, reflection on the value and use of digital sources is necessary.
Analysing historical newspapers acquires a different dimension when we see it as analysing big data. Manually browsing through newspapers (on paper or using microfilms) used automatically to give some historical context to the content of articles: the position in relation to other content, the cultural forms and the media genres to be found in these sources. When analysing digital newspaper data, however, a researcher should be aware that he is doing decontextualised research. One should also get used to the idea that scarcity of sources has been replaced by relative abundance.69 But this abundance is relative, because it is clear that not all analogue sources are digitally available. It has been shown in this article that in a digital environment completeness and uniformity cannot be guaranteed. Although millions of euros have been invested in digitisation projects, still only a fraction of historical newspapers is accessible for research purposes. OCR and other technical problems also afflict the quest for optimal source accessibility and applicability. Lack of money, but also the scattering of collections and especially the copyright problems, are still decisive for the success of research efforts.70 So a researcher who wants to work with complete newspaper data needs to be able to organise, improvise and negotiate. There is also a need for funding for the digitisation of the necessary sources, which can be too substantial for a single research project. Last but not least, a researcher needs to realise that good preparation is more than half of the work; it is almost all of the work. Historical research in digital newspapers needs well-equipped heritage institutions that create and maintain an effective infrastructure. It is not only a question of storing and organising digital data, making them accessible and developing digital tools for analysis. It is also about guarding the originals and maintaining expert knowledge of all newspaper sources, digital and analogue alike. And it is about making a serious effort to solve the copyright problem by putting the interest of public consultation high on the agenda. So media heritage institutions should continue with the digitisation of sources, with the ultimate goal of reaching completeness. In doing so they should be constantly aware that historians and digital scientists both need complete and uniform data, but also that they raise different questions and use different methods. For researchers it raises the question of what value they attach to certain components of digital history research: software and data handling techniques, contextualisations, methodological operationalisation, analysis and interpretation. All these components should be in balance and be critically evaluated in the light of the specific historical research question. Just as the assumptions of historians formulating research questions are not neutral, neither are the assumptions of digital toolmakers and analysts. ‘Theory is already at work on the most basic level when it comes to defining units of analysis, algorithms, and visualisation procedures.’71 In overview we must conclude that existing digital humanities research cannot live up to the claims of some digital humanities and information science scholars that we are experiencing a revolution. We are facing important methodological and practical problems that need to be solved in order to make compelling breakthroughs in historical research.
These would be breakthroughs not so much in a theoretical sense as in performing concrete historical newspaper research, for example. In close cooperation with digital scholars, media historians should be able to connect long-term developments in digital sources to exemplary historical events. Performing source criticism and formulating questions on the basis of historical agendas are crucial. Formulating new research agendas on the basis of digital sources can only be useful if it is acknowledged that analogue sources and contextualised knowledge are vital. The traditional historical guidelines – to look carefully and critically at the unique materiality and historical context of sources, and not to rely on just one source or method – are still relevant, probably more relevant than ever.

Notes

1. This text is part of the research project “Pillarization and Depillarization Tested in Digitized Media Historical Sources” (Pidemehs), performed by the University of Amsterdam and the University of Groningen. The project is made possible thanks to the generous support of the National Library of the Netherlands, the Netherlands Institute of Advanced Studies NIAS and the Netherlands eScience Centre.
2. Frédéric Clavert and Serge Noiret, “Digital Humanities and History. A New Field for Historians in the Digital Age,” in Contemporary History in the Digital Age, ed. idem (Brussels: Peter Lang, 2009), 15–26; David M. Berry, ed., Understanding Digital Humanities (Houndmills: Palgrave Macmillan, 2012); Susan Schreibman, A Companion to Digital Humanities (Malden: Blackwell, 2004).
3. Joris van Eijnatten, Toine Pieters and Jaap Verheul, “Big Data for Global History. The Transformative Promise of Digital Humanities,” BMGN - Low Countries Historical Review 128, no. 4 (2013): 55–77, there 57. The general discussion on ‘the Digital inflecting humanities fields and disciplines’ in: Patrik Svensson and David Theo Goldberg, eds., Between Humanities and the Digital (Cambridge, MA: The MIT Press, 2015), for the historical field especially pp. 17–33.
4. Bob Nicholson, “The Digital Turn. Exploring the Methodological Possibilities of Digital Newspaper Archives,” Media History 19, no. 1 (2013): 59–73.
5. Nicholson, “The Digital Turn”, 63.
6. Rens Bod, Het einde van de geesteswetenschappen 1.0. Inaugural lecture, University of Amsterdam, 14 December 2012; “Forum: the End of Humanities 1.0,” BMGN - Low Countries Historical Review 128, no. 4 (2013): 145–180.
7. Jim Macnamara, The 21st Century Media (R)evolution: Emergent Communication Practices (New York: Lang, 2014); Jo Bardoel and Huub Wijfjes, “Journalistieke cultuur in Nederland: een professie tussen traditie en toekomst,” in Journalistieke Cultuur in Nederland, ed. idem (Amsterdam: Amsterdam University Press, 2015), 11–29.
8. An extensive analysis of this ‘hyperbolic debate’ is given in: Paul Gooding, ‘Search all about it!’ Historic Newspapers in the Digital Age (Abingdon: Routledge, 2017), 22–48. Compare: Alan Liu, “Where is Cultural Criticism in the Digital Humanities?,” in Debates in the Digital Humanities, ed. Matthew K. Gold (Minneapolis: University of Minnesota Press, 2012), 490–509; Evgeny Morozov, To Save Everything, Click Here: the Folly of Technological Solutionism (New York: Public Affair Books, 2013); Andreas Fickers, “Towards a New Digital Historicism? Doing History in the Age of Abundance,” View. Journal of European Television History and Culture 1, no. 1 (2012), http://www.viewjournal.eu/index.php/view/article/view/jethc004/4;
Andreas Fickers, “Veins Filled with the Diluted Sap of Rationality. A Critical Reply to Rens Bod,” BMGN - Low Countries Historical Review 128, no. 4 (2013): 155–163.
9. David Armitage and Jo Guldi, The History Manifesto (Cambridge (UK): Cambridge University Press, 2014), 117.
10. Fickers, “Veins Filled”; Liu, “Where is Cultural Criticism”.
11. Kees Bertels, Geschiedenis tussen structuur en evenement (Amsterdam: Wetenschappelijke Uitgeverij, 1973); R.W. Fogel, “‘Scientific History’ and ‘Traditional History,’” in Which Road to the Past? Two Visions of History, ed. R.W. Fogel and G.R. Elton (New Haven, NJ: Yale University Press), 7–70.
12. Hinke Piersma and Kees Ribbens, “Digital Historical Research. Context, Concepts and the Need for Reflection,” BMGN - Low Countries Historical Review 128, no. 4 (2013): 78–102, there 82–85.
13. Bernhard Rieder and Theo Röhle, “Digital Methods: Five Challenges,” in Understanding Digital Humanities, ed. David M. Berry (Houndmills: Palgrave Macmillan, 2012), 67–84.
14. Robert Darnton, The Kiss of Lamourette. Reflections in Cultural History (New York: W.W. Norton, 1990), 60. Compare: Fickers, “Veins Filled”.
15. Stanley Fish, “The Digital Humanities and the Transcending of Mortality,” The New York Times, January 9, 2012, http://opinionator.blogs.nytimes.com/2012/01/09/the-digital-humanities-and-the-transcending-of-mortality/; José van Dijck, “Big data, grand challenges. Over digitalisering en het geesteswetenschappelijk onderzoek,” Ketelaar-lezing 12 (2014), http://www.clariah.nl/files/publicaties/Ketelaarlezing_2014.pdf; Ted Striphas, “Algorithmic Culture,” European Journal of Cultural Studies 18, no. 4–5 (2015): 395–412.
16. Fogel, “‘Scientific History’”; Shawn Graham, Ian Milligan and Scott Weingart, Exploring Big Historical Data. The Historian’s Macroscope (London: Imperial College Press, 2016), 1–35. The concept of ‘analogue humanities’ in: Jonathan Stern, “The Example: Some Historical Considerations,” in Between Humanities and the Digital, ed. Patrik Svensson and David Theo Goldberg (Cambridge, MA: The MIT Press, 2015), 17–33.
17. Van Dijck, “Big Data”.
18. Huub Wijfjes, “Perspectief in persgeschiedenis,” BMGN - Low Countries Historical Review 114, no. 2 (1999): 223–235, doi: http://doi.org/10.18352/bmgn-lchr.4949; Donald G. Godfrey, ed., Methods of Historical Analysis in Electronic Media (Mahwah, NJ: Lawrence Erlbaum, 2006); Michele Hilmes, Only Connect. A Cultural History of Broadcasting in the United States (Belmont: Thompson Wadsworth, 2010); John Hartley, Digital Future for Cultural and Media Studies (Chichester: Wiley-Blackwell, 2012), 27–58.
19. Berry, Understanding Digital Humanities, 12.
20. Ibid.
21. Piersma and Ribbens, “Digital Historical Research”, 57.
22. Toni Weller, “Introduction,” in History in the Digital Age, ed. idem (London: Routledge, 2013).
23. Armitage and Guldi, History Manifesto, 122.
24. Prescott, “The Deceptions of Data”, as cited in: Gerben Zaagsma, “On Digital History,” BMGN - Low Countries Historical Review 128, no. 4 (2013): 3–29, there 24. Also: Andrew Prescott, “An Electric Current of the Imagination: What the Digital Humanities Are and What They Might Become,” Journal of Digital Humanities 1, no. 2 (2012), http://journalofdigitalhumanities.org/1-2/an-electric-current-of-the-imagination-by-andrew-prescott/.
25. About the history of history and computing see: Zaagsma, “On Digital History”.
26. Daniel J. Cohen and Roy Rosenzweig, Digital History. A Guide to Gathering, Preserving and Presenting the Past on the Web (Philadelphia: University of Pennsylvania Press, 2006).
27. An extensive exploration of these aspects in: Graham, Milligan and Weingart, Exploring Big Historical Data. For the humanities in general an instructive manual is: Richard Rogers, Digital Methods (Cambridge, MA: The MIT Press, 2013).
28. Rieder and Röhle, “Digital Methods”.
29. Frank Pasquale, The Black Box Society. The Secret Algorithms that Control Money and Information (Cambridge, MA: Harvard University Press, 2015).
30. Michiel van Groesen, “Digital Gatekeeper of the Past: Delpher and the Emergence of the Press in the Dutch Golden Age,” Tijdschrift voor Tijdschriftstudies 38 (2015): 9–19, there 17, doi: http://doi.org/10.18352/ts.340.
31. M. Bron, J. van Gorp and M. de Rijke, “Media Studies Research in the Data-Driven Age: How Research Questions Evolve,” Journal of the Association for Information Science and Technology 67, no. 7 (2016): 1535–1554, doi: 10.1002/asi.23458.
32. Lev Manovich, Software Takes Command (2008), 15, http://softwarestudies.com/softbook/manovich_softbook_11_20_2008.pdf.
33. Zaagsma, “On Digital History”, 17–18.
34. Wijfjes, “Perspectief in persgeschiedenis”; Godfrey, Methods of Historical Analysis; Hilmes, Only Connect; Hartley, Digital Future.
35. Frank van Vree, De Nederlandse pers en Duitsland. Een studie over de vorming van de publieke opinie (Groningen: Historische Uitgeverij, 1989).
36. Frank van Vree, De metamorfose van een dagblad. Een journalistieke geschiedenis van de Volkskrant (Amsterdam: Meulenhoff, 1996); Gerard Mulder and Paul Koedijk, Léés die krant! Geschiedenis van het naoorlogse Parool 1945–1970 (Amsterdam: Meulenhoff, 1996); Mariëtte Wolf, Het geheim van de Telegraaf (Amsterdam: Boom, 2009).
37. Marcel Broersma, Beschaafde vooruitgang. De wereld van de Leeuwarder courant 1752–2002 (Leeuwarden: Friese Pers, 2002).
38. Rutger de Graaf, Journalistiek in beweging. Veranderende berichtgeving in kranten en pamfletten (Groningen en ‘s-Hertogenbosch 1813–1899) (Amsterdam: Bert Bakker, 2010); Frank Harbers, Between Personal Experience and Detached Information. The Development of Reporting and the Reportage in Great Britain, the Netherlands and France, 1880–2005 (dissertation, University of Groningen, 2014).
39. Piersma and Ribbens, “Digital Historical Research”, 91–95.
40. Marcel Broersma, “Nooit meer bladeren? Digitale krantenarchieven als bron,” Tijdschrift voor Mediageschiedenis 14, no. 2 (2011): 29–55.
41. Van Eijnatten, Pieters and Verheul, “Big Data”, 73; Adrian Bingham, “Reading Newspapers: Cultural Histories of the Popular Press in Modern Britain,” History Compass 10, no. 2 (2012): 140–150; Roderick P. Hart and Elvin T. Lim, “Tracking the Language of Space and Time, 1948–2008,” Journal of Contemporary History 46, no. 3 (2015): 591–609.
42. Van Eijnatten, Pieters and Verheul, “Big Data”, 73.
43. Ibid., 75.
44. Thomas Lansdall-Welfare, Saatviga Sudhahar, James Thompson, Justin Lewis, FindMyPast Newspaper Team and Nello Cristianini, “Content Analysis of 150 Years of British Periodicals,” PNAS 114, no. 4 (2017): 457–465; published online January 9, 2017, doi: 10.1073/pnas.1606380114.
45. Van Eijnatten, Pieters and Verheul, “Big Data”, 69; http://translantis.wp.hum.uu.nl.
46. http://mediahistoryproject.org.
47. David Deacon, “Yesterday’s Papers and Today’s Technology. Digital Newspaper Archives and Push Button Content Analysis,” European Journal of Communication 22, no. 1 (2007): 5–25; Broersma, “Nooit meer bladeren”; N. Maurantonio, “Archiving the Visual. The Promises and Pitfalls of Digital Newspapers,” Media History 20, no. 1 (2014): 88–102.
48. Maarten van den Bos and H. Giffard, “The Grapevine: Measuring the Influence of Dutch Newspapers on Delpher,” Tijdschrift voor Tijdschriftstudies 38 (2015): 29–41, doi: http://doi.org/10.18352/ts.342.
49. http://www.europeana-newspapers.eu.
50. An overview of available titles in this digital KB collection is offered at: https://www.kb.nl/sites/default/files/docs/Beschikbare_kranten_alfabetisch.pdf.
51. https://www.esciencecenter.nl/project/pidimehs.
52. The technical setup of Pidemehs is shown in: P. Bos, H. Wijfjes, M. Piscaer and G. Voerman, “Quantifying Pillarization: Extracting Political History from Large Databases of Digitised Media Collections,” in Proceedings of the 3rd HistoInformatics Workshop, Krakow, Poland, ed. M. Düring, A. Jatowt, J. Preiser-Kapeller and A. van den Bosch (Aachen: CEUR Workshop Proceedings, 2016), 52–57, http://ceur-ws.org/Vol-1632/. Forthcoming is: H. Wijfjes, G. Voerman and P. Bos, Meten van verzuilde media. Een digitale benadering van politiek in dagbladen 1918–1967.
53. Currently, both newspapers have been added to the digital collections in Delpher.
Charles Jeurgens, “The Scent of the Digital Archive,” BMGN - Low Countries Historical Review 128, no. 4 (2013): 30–54, there 34. 62. Thomas Smits, “Problems and Possibilities of Digital Newspaper and Periodical Archives,” Tijdschrift voor Tijdschriftstudies 36 (2014): 139–146, there 141. 63. Van Groesen, “Digital Gatekeeper”, 17. 64. Strange, Wodak and Wood, “Mining for the Meanings”. 65. Johan Oomen and Lora Aroyo, “Crowdsourcing in the Cultural Heritage Domain: Opportunities and Challenges,” Proceedings of the Fifth International Conference on Communication and Technologies (New York: ACM, 2011), 138–149; Daren C. Brabham, Crowdsourcing (Cambridge, MA: The MIT Press, 2013); Gregory D. Saxton, Onook Oh and Rajiv Kishore, “Rules of Crowdsourcing: Models, Issues, and Systems of Control,” Information Systems Management 30, no.1 (2013): 2–20, http://dx.doi.org/10.1080/10580530.2013.739883. 66. Lora Aroyo and Chris Welty, “Truth is a Lie: Crowd Truth and the Seven Myths of Human Annotation,” Artificial Intelligence Magazine 36, no. 1 (2015): 15–24; Mia Ridge ed. Crowdsourcing our Cultural Heritage (Abingdon/ New York: Routledge, 2016). 67. Nicholson, “The Digital Turn”, 64. 68. Deacon, “Yesterday’s Papers”; Broersma, “Nooit meer bladeren”. 69. Broersma (2011), 35–37. 70. Vgl. Karel Berkhout, “Het Digitale Drama,” NRC handelsblad, 10 september 2011. 71. Rieder and Röhle, “Digital Methods”, 70. Biography Huub Wijfjes (1956) is associate professor in Journalism Studies and Media History at University of Groningen and professor in History of Radio and Television at University of Amsterdam (department of Media Studies). He is the author of numerous books and articles on media history, political history and journalism. He wrote comprehensive books on the history of Dutch Public Service Broadcasting: VARA, biografie van een omroep (‘VARA, biography of a public broadcasting association’; Amsterdam 2009, including a website) and the history of Dutch Journalism: Journalistiek in Nederland 1850–2000. Beroep, organisatie en cultuur (‘Journalism in the Netherlands 1850–2000. Profession, organisation and culture’; Amsterdam 2004). In 2009 he edited (with G. Voerman) the volume Mediatization of Politics in History (Peeters Leuven). In 2015 and 2016 he was research fellow at the Netherlands Institute of Advanced Studies (NIAS) for a research into the dynamic historical relationship of politics and newspapers in modern Dutch history on basis of digital sources. 24 | TIJDSCHRIFT VOOR MEDIAGESCHIEDENIS - 20 [1] 2017 http://www.digitalhumanities.org/dhq/vol/8/1/000168/000168.html http://www.digitalhumanities.org/dhq/vol/8/1/000168/000168.html http://biland.science.uva.nl/wahsp/ https://www.esciencecenter.nl/project/texcavator http://dx.doi.org/10.1080/10580530.2013.739883 Abstract Clashes in Digital Humanities and Digital History Literacy and source criticism in Digital History A digital turn in newspaper history Putting theory to practice: opportunities, challenges and problems Conclusion Notes work_72g2q37pzveh7coknmrz2ycs7i ---- New Developments in Quantitative Metrics XML Papers Making Visible the Invisible: Metrical Patterns, Contrafacture and Compilation in a Medieval Castilian Songbook Gimena del Rio Riande, SECRIT-CONICET; Clara Martínez Cantón, UNED; and Elena González Blanco-García, UNED Panel Organizer David J. 
Panel Organizer: David J. Birnbaum, University of Pittsburgh, djbpitt@gmail.com

Panel Synopsis

This panel presents new primary research results in the formal study of poetry and poetics that have been made possible by the development and use of innovative digital technologies. The research questions underlying the presentations are varied, as are the linguistic and cultural traditions (early modern and modern Russian, medieval Spanish, and Urdu [in comparison with Hindi/Sanskrit and Persian/Arabic]). What unites the three presentations is (1) a focus on using digital technologies to create humanities knowledge that would not otherwise be possible; (2) the development of innovative methodologies that are able to address those research questions; and (3) the building of new digital tools that make it possible to address new research needs. Our panel responds to the following specific areas of emphasis noted in the original call for papers:

• Humanities research enabled through digital media, data mining, software studies, or information design and modeling. The focus of our panel is on new types of research results that are made possible by the development of original digital tools and methods. In this sense, the research results are foremost, but the projects that produce that research are methodologically innovative, and the research results would not have been attainable without that innovation.

• Creation and curation of humanities digital resources. The creation of plain-text poetry archives is relatively straightforward: the text can be generated through OCR or by repurposing digital files originally created to produce print editions. The creation of structured poetry archives is also relatively straightforward: the general hierarchical structure of poetry is represented by pseudo-markup layout in plain-text editions, and is amenable to autotagging with the aid of regular-expression parsing. The creation of poetry archives with metrical and rhyme annotation, however, is difficult because it requires linguistic knowledge, and the presentations on this panel describe new technologies that were developed in order to (1) facilitate the machine-assisted creation of these types of metrically annotated poetic corpora, and (2) undertake original research about formal metrical practice on the basis of large digital corpora.

• Social, institutional, global, multilingual, and multicultural aspects of digital humanities . . . For the 2015 conference, we particularly welcome contributions that address 'global' aspects of digital humanities, including submissions on interdisciplinary work and new developments in the field. Our panel includes three research reports from three diverse poetic traditions: early modern and modern Russian, medieval Spanish, and Urdu (in comparison with Hindi/Sanskrit and Persian/Arabic). The projects that produced these research results have operated independently, but within their highly varied cultural contexts they address similar types of research questions.

• Digital humanities in pedagogy and academic curricula. The projects that contribute to our panel, which were designed to create new research knowledge in the humanities, were developed in many cases with attention to pedagogical and curricular concerns. For example, some portions of the development for these projects were carried out in the context of digital humanities courses, with undergraduate and graduate students making contributions to authentic humanities research as a way of learning to be digital humanists.
2. Making Visible the Invisible: Metrical Patterns, Contrafacture, and Compilation in a Medieval Castilian Songbook

Del Rio Riande, G., Martínez Cantón, C. and González Blanco-García, E.

ReMetCa, the Digital Repertoire on the Metrics of Medieval Castilian Poetry (Repertorio Métrico Digital de la Poesía Medieval Castellana), is an online, free-access digital tool designed to undertake simultaneous complex searches on the metrical and rhyming patterns of Medieval Castilian poetry (from the late-12th-century epics to the poetry of the 16th-century Castilian Cancioneros). ReMetCa is part of the corpus of online digital resources on Medieval Romance poetry, alongside those devoted to Galician-Portuguese (MedDB, The Oxford Cantigas de Santa Maria Database), French (BedTrouveres, Nouveau Naetebus), and Occitan and Catalonian poetry (BedT, Corpus des Trobadours). The research project that sustains ReMetCa aims to integrate traditional philological studies (especially those pertaining to metrics) with digital humanities, revising and classifying the Castilian corpus in a hybrid digital framework that embeds TEI-Verse module tags in a relational database working together with a controlled vocabulary on Medieval Castilian poetry.

One interesting case study that illustrates the topic of the panel is our digital approach to the Cancionero de Baena, a large songbook containing almost 600 poems transcribed and compiled in the first half of the 15th century by Juan Alfonso de Baena, scribe of the court of King Juan II of Castile. The data retrieved from the tagging and classification of the metrical and rhyming patterns of a large section of Baena's songbook (the one regarding the antiquiores, or the eldest poets, those who composed their texts mainly in the second half of the 14th century) yielded interesting results in the area of Hispanic studies devoted to metrics. On the one hand, we discovered that the antiquiores composed a large part of their poems using a pattern almost unknown to their predecessors, the Galician-Portuguese troubadours: the octosyllabic octava (an eight-line stanza with lines of eight syllables, which may be isometric or heterometric). This pattern was shaped in a body of four stanzas (glosa), with rhymes structured in a singular pattern (rimas singulares) and words stressed on the penultimate syllable (rima grave or femenina). Furthermore, we were able to identify some groups or cycles of poems composed on the basis of metrical and rhyming imitation, or contrafacture (→ 4x8@8) (Spanke, 1928; Marshall, 1980; Rossell, 2000). It was the theoretical organization of the XML markup as a discursive system (Jockers and Flanders, 2013), which we used to shape an ontology framework (available at http://www.purl.org/net/remetca and http://datahub.io/dataset/remetca-ontology), that in practice helped us to move from the descriptive to the connotative dimension, thus making visible the invisible.
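The kind of markup involved can be pictured with a brief sketch. The following Python fragment is my own illustration, not ReMetCa's actual code or schema: it builds a single TEI-Verse <lg> stanza element carrying the standard @met and @rhyme attributes together with the project-specific attributes described in the next paragraph; all attribute values and the sample verse line are hypothetical placeholders.

```python
# A minimal sketch of a ReMetCa-style stanza entry; attribute values
# and the sample verse are hypothetical placeholders.
import xml.etree.ElementTree as ET

lg = ET.Element("lg", {
    "type": "octava",
    "met": "8,8,8,8,8,8,8,8",     # eight octosyllabic lines
    "rhyme": "abbaacca",           # rhyming scheme of the stanza
    "asonancia": "consonante",     # every sound after the stress is repeated
    "unisonancia": "singular",     # the scheme is not repeated across stanzas
    "isometrismo": "isometrico",   # all lines share the same syllable count
})
ET.SubElement(lg, "l").text = "Señores, para el que quisiere"  # placeholder line
print(ET.tostring(lg, encoding="unicode"))
```

Encoding the metrical facts as attribute values in this way is what makes them queryable in combination, which is the point of the discussion that follows.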
Apart from using the expected TEI-Verse attributes, such as @met for our schemes based on the number of syllables of each line and on the number of lines (e.g., 8,8,8,8), and @rhyme for the rhyming structure of the stanzas (e.g., abba), there were some new attributes that did not exist in the TEI-Verse module and that we decided to add to our XML schema: @asonancia, an attribute that indicates the two possible values of the rhyming typology, 'asonante' (meaning that only vowel sounds are repeated) and 'consonante' (meaning that every sound after the stressed syllable is repeated); @unisonancia, which takes the values 'unisonante' or 'singular' and shows whether the same rhyming scheme (e.g., abba) is repeated in different stanzas or not; and @isometrismo, which states whether all the stanzas have the same number of syllables ('isométrico') or not ('heterométrico'). It was the joint work of all these attributes that retrieved the metrical and rhyming patterns and led us to those new results.

In addition to this, the analysis of the whole corpus revealed an unexpected fact: Juan Alfonso de Baena may have organized and compiled the texts of the different poets in his songbook guided not only by chronological order but also following metrical patterns and types of stanza. With the help of our digital tool we will illustrate possible contrafactures, common metrical and rhyming patterns, and cycles of poems in the antiquiores' corpus, and give an account of more complex definitions. The examples will also serve as an opportunity to cast our eyes again on the macro-microanalysis (Jockers, 2013; Jockers and Flanders, 2013; Liu, 2014) and data-versus-text (Marche, 2012) debates in the field of literary studies, and on the concepts of close and distant reading (Moretti, 2013; Latour, 2014), and to relate them to a subject of study that is interested in formal patterns (and not as much in content) and acquires new meaning when compared across large corpora: metrics.

References

Asperti, S., Zinelli, F., et al. (n.d.). BedT, Bibliografia Elettronica dei Trovatori. www.bedt.it.
Brea, M., et al. (n.d.). MedDB: Base de datos da Lírica profana Galego-Portuguesa. http://www.cirp.es/bdo/med/meddb.html.
González-Blanco, E., et al. (n.d.). ReMetCa: Repertorio Métrico Digital de la Poesía Medieval Castellana. www.remetca.uned.es.
Jockers, M. L. (2013). Macroanalysis: Digital Methods and Literary History. Urbana: University of Illinois Press.
Jockers, M. L. and Flanders, J. (2013). A Matter of Scale. http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1106&context=englishfacpubs.
Latour, B. (2014). Opening plenary, Digital Humanities 2014 (DH2014). http://dh2014.org/videos/opening-night-bruno-latour/.
Liu, A. (2014). The Laws of Cool: Knowledge Work and the Culture of Information. Chicago: University of Chicago Press.
Marche, S. (2012). Literature Is Not Data: Against Digital Humanities. Los Angeles Review of Books. http://www.lareviewofbooks.org/article.php?id=1040&fulltext=1.
Marshall, J. H. (1980). Pour l'étude des contrafacta dans la poésie des troubadours. Romania, CI, pp. 289–335.
Moretti, F. (2013). Distant Reading. London: Verso.
Parkinson, S., et al. (n.d.). The Oxford Cantigas de Santa Maria Database. http://csm.mml.ox.ac.uk/.
Rossell, A. (2000). Intertextualidad e intermelodicidad en la lírica medieval. In Bagola, B. (ed.), La Lingüística española en la época de los descubrimientos: Actas del Coloquio en honor del profesor Hans-Josef Niederehe, Tréveris 16 a 17 de Junio de 1997. Hamburg: Helmut Buske, pp. 149–56.
Seláf, L. (n.d.). Le Nouveau Naetebus: Répertoire des poèmes strophiques non-lyriques en langue française d'avant 1400. www.nouveaunaetebus.elte.hu.
Spanke, H. (1928). Das öftere Auftreten von Strophenformen und Melodien in der altfranzösischen Lyrik. Zeitschrift für französische Sprache und Literatur, 51, pp. 73–117.

3. Using Bioinformatic Algorithms to Analyze the Politics of Form in Modernist Urdu Poetry

Pue, A. S., Teal, T. K. and Brown, C. T.

This paper has two aims. First, it shows how the authors (a humanist and two computational biologists) adapted graph-based algorithms used in genome assembly and multiple sequence analysis to scan the meter of Urdu poetry. Second, applying these techniques to modernist free-verse poetry of the early 1940s, the paper argues that data-rich analysis of poetic meter offers humanistic insights into the politics of literary form.

work_73gs7go455hizgbutvr325fr4e ----

Creating the First Digital Handbook of Latin Phonetics: Between Linguistics, Digital Humanities and Language Teaching
Tommaso Spinelli
Digital Classics Online (DCO) 7 (2021)

Abstract: This article discusses the creation of an innovative e-learning resource that provides a unique breadth of frequency, grammatical, and phonetic information on both Classical and Ecclesiastical Latin. Designed to bridge teaching and research, this new digital toolkit, which is available as both an online program and an Android mobile app, provides a frequency list of the most common Latin lemmas, as well as phonetic and grammatical information, including their syllabication, accentuation, and Classical and Ecclesiastical phonetic transcription according to the standards of the International Phonetic Alphabet. After providing a concise overview of the different ways in which Latin was and still is pronounced, this article will discuss the methodological and practical issues faced during the creation of the toolkit, from the choice of an effective lemmatizing technique for identifying and categorizing inflected word-forms to the creation of algorithms that accentuate Latin lemmas and transcribe Latin sounds (potentially involving multiple characters of the Latin alphabet) into IPA characters. In so doing, it will offer insights into the technologies used to maximize the impact of this new e-learning resource on teaching and research.

Introduction

This article discusses the recent creation of the first online Handbook of Latin Phonetics, an innovative opensource digital toolkit that provides a unique breadth of frequency, grammatical, and phonetic information on both Classical and Ecclesiastical Latin.
Originally conceived within the award-winning project Latine Loquamur (undertaken to support the reform of classical language teaching at the University of St Andrews), the digital toolkit described in this article was developed at the Pontificium Institutum Altioris Latinitatis (Pontifical Salesian University of Rome) to meet the needs of the ever-increasing number of scholars and students who study Latin in Latin, or who focus on late-antique texts.1 Accordingly, the Handbook of Latin Phonetics toolkit currently provides, for the first time ever, a frequency list of the 6500 most common Latin lemmas, together with unique phonetic and grammatical information, including their syllabication, accentuation, and Classical and Ecclesiastical phonetic transcription according to the standards of the International Phonetic Alphabet (IPA).

1 The Latine Loquamur Project was designed by Tommaso Spinelli in collaboration with Alice König, Giuseppe Pezzini and Giacomo Fenzi, and was awarded funding by the Teaching Development Office of the University of St Andrews in 2018. This project involved the creation of an online Dictionary of Latin Synonyms, which was published by the University of St Andrews in December 2018 (https://doi.org/10.17630/3cf644e6-86b8-44d0-a50a-b33c7ca86072; last access 17.10.2020), and of other e-learning resources (e.g. Moodle presentations, exercises, and interactive games) for the study of Latin that will be discussed in another article for reasons of space. The Handbook of Latin Phonetics presented in this article is available as both an app and a program: the app was developed by Tommaso Spinelli during his postdoc at the Pontifical Salesian University of Rome in collaboration with Cleto Pavanetto, Giacomo Fenzi, Kamil Kolosowski, and Jan Rybojad, and was published, thanks to the collaboration of Miran Sajovic, by the Pontifical Salesian University of Rome in 2020 (https://play.google.com/store/apps/details?id=com.kolosowski.latinhandbook; https://doi.org/10.17630/19ce37ba-2d35-4920-bd7f-6287977de369; last access 17.10.2020). The online version of the Handbook of Latin Phonetics, which was developed by Tommaso Spinelli with the informatic assistance of Giacomo Fenzi at the University of St Andrews, is currently hosted in the GitHub repository of the Latine Loquamur Project (https://github.com/latineloquamur?tab=repositories; last access 17.10.2020) and can be found in the folder titled Latineloquamur-toolkit-IPA-transcriber-and-App. In the same repository users can also find the Dictionary of Latin Synonyms, which is not discussed in this article, and the link to download its app (https://github.com/latineloquamur/dictionary-of-latin-near-synonyms; last access 17.10.2020).
This toolkit, which is available as both a Rust program (referred to as Latineloquamur-toolkit-IPA-transcriber-and-App on GitHub) and an Android mobile app (titled Handbook of Latin Phonetics),2 faced significant methodological and practical issues during its creation and development, such as the choice of an effective lemmatizing technique for identifying and categorizing inflected word-forms, the creation of algorithms to accentuate Latin lemmas, and the development of an innovative program to transcribe Latin sounds (potentially involving multiple characters of the Latin alphabet) into IPA characters corresponding to different pronunciations of Latin. After providing a concise overview of the different ways in which Latin was and still is pronounced, this article will discuss the complex interaction between linguistics, phonology, and digital humanities. It will explore the methodologies and principal technologies used within this digital project to offer rigorous frequency and phonological information on Latin lemmas, and to maximize its impact on teaching and research.

Pronouncing Latin: between teaching and research

One of the aims of the Latin Phonetics digital toolkit is to further the creation of a shared, rigorous methodology for the pronunciation of Latin lemmas, and for the identification of the words most used by the Latin authors that a given student or researcher might want to prioritize in their studies. Both 'frequency' and 'pronunciation' have played a key role in language teaching and rhetorical studies since antiquity. Latin authors such as Cicero, Varro, and Quintilian often referred to the usus (use) of a word or to its frequency in their literary, grammatical, and stylistic discussions.3 Similarly, the pseudo-Ciceronian Rhetorica ad Herennium devotes an entire section to the role of pronunciation in 'delivering' a speech (3.19.1–2), a theme also discussed by Quintilian in his Institutio Oratoria (1.4; 1.7), while, in the third century CE, the grammarian Probus encourages his students to pronounce the words speculum (mirror) and columna (column) correctly, avoiding the incorrect forms speclum and colomna.4 And yet, despite the importance of such themes, not enough attention has been paid to them by modern digital scholarship. While the last couple of decades have seen the publication of many frequency dictionaries for modern languages, no comprehensive frequency dictionary yet exists for Latin: the few modern attempts to provide rigorous lemmatization and counts of Latin words have treated very limited textual corpora and have adopted remarkably different methodologies, as the following analysis will show.5 Even more problematic is the situation concerning the pronunciation of Latin. Ancient literary and documentary sources indicate that Latin was spoken differently both at different stages of Roman history and, synchronically, in different regions of the empire and by different social classes.

2 The two different names of the program and the app are due to the different stages of the development of the toolkit and to the different institutions that published those tools, the University of St Andrews and the Pontifical Salesian University of Rome respectively. However, to avoid confusion, in this article I will refer to these tools as the Latin Phonetics app/program.
3 Joseph Denooz (2010), 1–2 has shown that the word usus ('use') is used to explain linguistic facts 45 times by Varro in his De lingua Latina, 163 times by Cicero in the De Oratore and the Orator, and 163 times in Quintilian's Institutio Oratoria.
Moreover, Quintilian uses the adjective frequens ('frequent') and the adverb frequenter ('frequently') some 223 times in his linguistic and stylistic considerations. Cf. also Cic. De Inv. 1.9.4; 1.9.10; De Or. 3.140.4.
4 The so-called list of the 'appendix Probi' has been variously dated to the third or the fifth century CE. See Barnett (2006), 257–278.
5 See, for instance, Diederich (1939), Delatte/Evrard/Govaerts/Denooz (1981), Denooz (2010).

For instance, Lucilius jokes about the rustic pronunciation of a certain Caecilius, who was praetor urbanus (urban praetor), by saying, in a phonetic spelling, 'Cecilius pretor ne rusticus fiat' (Let Cecilius not be a rustic praetor; Lucil. 1130 M.), remarking on the fact that, as we know from Varro (L. 5.97), the diphthong ae was already pronounced /e/ in the countryside in Classical times.6 Epigraphs show the existence of different pronunciations of Latin throughout the history of Rome, and the Historia Augusta (Hadr. 3.1) recounts that the emperor Hadrian (117–38 CE) was mocked for his Hispanic accent.7 This ancient diversity has been only partially reduced in modern times; it has therefore been an urgent and challenging task to create a tool able to provide a standardized pronunciation of Latin.8 Although the first Congrès International Pour le Latin Vivant (the first international conference for living Latin), held in Avignon in 1956, tried to foster a shared Classical pronunciation of Latin in modern times, at least three different ways of reading Latin are still commonly, and often unthinkingly, used by different institutions.9 The first way is the so-called 'national' pronunciation, because of its proximity to the phonetic systems of the modern languages of the countries in which Latin is read.10 According to this pronunciation, for example, the lemma Caesar, which was pronounced /'kaɛ̯.sar/ in Classical Latin and /'tʃɛ.sar/ in Ecclesiastical, is read as /'tʃɛ.sar/ in Italy, /ʃɛ.'sar/ in France, /'sɪ.sar/ in Britain, and /'tsɛ.sar/ in Germany. The second way is the so-called 'Ecclesiastical' pronunciation, because it is officially used by the Catholic Church. Although it looks similar to the Italian pronunciation of Latin, this pronunciation is supranational and reflects the diction of Latin used in Rome during the fourth and fifth centuries CE.11 The third way is the so-called 'Classical' pronunciation or 'restituta'. Starting from the Renaissance period, this system of pronunciation has used the phonetic clues provided by ancient grammatical texts and epigraphs to reconstruct the language arguably spoken by cultured Romans in the first century BCE and the first century CE.12

6 See Ramage (1963), 390–414.
7 See, for example, the commonly attested form coss. for consules, or the names Crescentsianus and Vincentza respectively attested in CIL XIV, 246; VII, 216. On dialectal pronunciations of Latin see Oniga (2003), 39–62.
8 On the Church's use of Latin see the epistle Romani Sermonis by Paulus VI (1976). On the use of Latin in modern academia see Short/George (2013). On the bidirectional influence of national languages on Latin, see Serianni (1998), 27–45.
9 See Allen (1966), 102; Pavanetto (2009), 9–10; Traina/Bernardi-Perini (1998), 22–29.
10 See Collins (2012).
11 An overview of the most important features of Ecclesiastical Latin is provided by Collins (1988).
12 See, for instance, Erasmus' De recta Latini Graecique sermonis pronuntiatione (1528). On this theme see also Allen (1966); Oniga (2014).
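The divergence among these three reading traditions can be summarized compactly. The following minimal Python sketch simply tabulates the renderings of Caesar quoted above; the dictionary is my own illustrative device, not data from the app.

```python
# The renderings of "Caesar" quoted in the text above, gathered into a
# toy lookup table; an illustration only, not part of the toolkit.
CAESAR = {
    "Classical":      "'kaɛ̯.sar",
    "Ecclesiastical": "'tʃɛ.sar",
    "Italy":          "'tʃɛ.sar",
    "France":         "ʃɛ.'sar",
    "Britain":        "'sɪ.sar",
    "Germany":        "'tsɛ.sar",
}

for tradition, ipa in CAESAR.items():
    print(f"Caesar ({tradition}): /{ipa}/")
```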
A further complicating factor is that, while an ever-increasing number of institutions worldwide have started to teach Latin in Latin, using Ørberg's and Cambridge's textbooks, which encourage a more active use of the language in its 'Classical' pronunciation, other world-leading institutions (such as the Salesian University of Rome and the Pontificium Institutum Altioris Latinitatis) have continued to use the Ecclesiastical pronunciation, which is also used to read the late-antique and early-medieval texts to which Classicists have increasingly shifted their attention in the last two decades.13 At this critical juncture, my new toolkit builds upon recent developments in the fields of digital humanities and Latin linguistics to provide students worldwide with a rigorous guide to the pronunciation of both Classical and Ecclesiastical Latin. In particular, while the Latin dictionaries currently available in many countries tend to provide only the quantity (or length) of the penultimate syllable of lemmas, the Latin Phonetics program and app provide more complete information on the accentuation, prosody, syllabication, and IPA phonetic transcription of Latin lemmas. The following analysis will explore the methodology and innovative technologies used to create this tool, discussing the potential and the limits of the digital technologies that can currently be deployed to process the Latin language.

13 See the overview provided by Chiesa (2012) and Spinazzé (2014), but also the seminal work on stylometry of the 'Quantitative Criticism Lab' (https://www.qcrit.org/researchdetail/kHXib8DissfMp53Yx; last access 17.10.2020). See also Harrington/Pucci (1997), Avitus (2018), and Norberg (1999).

Existing technologies

The Latin Phonetics toolkit builds on and bridges together several different technologies developed in recent years to meet the new need for more complete phonetic information, which can be used not only to speak and write in Latin but also to study rhythmic prose texts and the style of authors.
Leaving aside the work-in-progress Latin dictionary on Wikipedia, which unsystematically offers some information on the pronunciation of Latin words, the best-equipped tool currently available is that offered by the Classical Language Toolkit Project (CLTK).14 This international opensource project offers both a 'macronizer' and a 'phonetic transcriber'. Based on an original algorithm developed by Johan Winge in 2015, the macronizer can mark Latin vowels according to their length, using a POS tagger which matches words with the lexical entries of Morpheus.15 Although this tool does not provide accentuation of lemmas and has an accuracy of around 86.3% (depending on which of the three available POS taggers is used), it allows a more complete prosodic mark-up than that usually offered by traditional dictionaries.16 Moreover, the 'phonetic transcriber' represents the first attempt to provide a rigorous phonetic transcription of the Latin language according to the IPA standards.17 This tool transliterates Latin lemmas into their phonetic forms using a list of replacements based on Allen's reconstruction of the phonetics of Classical Latin (1966). However, while the source code of both these tools is available on GitHub, they do not offer a user-friendly interface, so that only expert users with a good knowledge of programming can actually use them to process Latin words. Moreover, the CLTK phonetic transcriber only provides information on the Classical pronunciation of Latin. Similarly, the project LatinWordnet2.0, which is being developed at the University of Exeter by William Short, provides the Classical phonetic transcription of Latin lemmas, but this data is currently accessible only to expert users.18 Although they do not provide a phonetic transcription of Latin, it is worth mentioning other programs that have tried to address similar issues. The first is Google Translate, which now offers the Ecclesiastical pronunciation (but not the accentuation and IPA transcription) of Latin words.19 The second is Collatinus, which was developed within the project Biblissima for the study of medieval and modern texts, and can divide Latin lemmas or small texts into syllables and accentuate them.20 The third is the Quantitative Criticism Lab which, while not providing the pronunciation of Latin lemmas, offers detailed information on the prosody of single words and entire texts, using quantitative metrics to support both linguistic and stylistic analysis.21 Similar are the projects Cursus in Clausula, developed at the University of Udine, which detects the quantitative and tonic rhythm of prose clausulae, and Pedecerto which, developed within the project FIRB Traditio Patrum by the University Ca' Foscari of Venice, can perform automatic scansion of Latin verses.22

14 On the dictionary offered by Wikipedia see https://en.wiktionary.org/wiki/Wiktionary:Main_Page (last access 26.10.2020). The CLTK project is a Python library containing tools for the natural language processing (NLP) of ancient Eurasian languages: http://cltk.org/ (last access 02.09.2020).
15 The algorithm and its explanation are available at https://cl.lingfil.uu.se/exarb/arch/winge2015.pdf (last access 02.09.2020); Morpheus is a morphological parsing and lemmatizing tool integrated into the Perseus Project: http://www.perseus.tufts.edu/hopper/ (last access 02.09.2020).
16 The Python macronizer is available at https://github.com/cltk/cltk/blob/master/cltk/prosody/latin/macronizer.py (last access 02.09.2020).
17 The CLTK transcriber can be accessed at https://github.com/cltk/cltk/blob/master/cltk/phonology/latin/transcription.py (last access 02.09.2020).
18 Cf. https://github.com/wmshort/latinwordnet-archive; https://latinwordnet.exeter.ac.uk/ (last access 02.09.2020).
19 See https://translate.google.com/?sl=la#view=home&op=translate&sl=la&tl=en&text (last access 02.09.2020).
20 The codes are available at https://github.com/biblissima/collatinus (last access 02.09.2020). See also https://projet.biblissima.fr/ (last access 02.09.2020).
21 See https://www.qcrit.org/ (last access 02.09.2020).
22 See respectively http://cursusinclausula.uniud.it/public/ (last access 02.09.2020) and http://www.pedecerto.eu/public/ (last access 02.09.2020).
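The 'list of replacements' technique mentioned above can be illustrated with a short, self-contained sketch. The following Python fragment is my own simplified illustration (neither the actual CLTK code nor the toolkit's own transcriber): it applies a small ordered table of grapheme-to-IPA substitutions, trying multi-character sequences before single letters, which is the central difficulty when Latin sounds involve more than one character. The rule selection loosely follows the Classical values summarized later in Tab. 1.

```python
# A deliberately tiny, ordered replacement table for Classical Latin:
# digraphs and diphthongs must be tried first, otherwise "ch" would be
# consumed as "c" + "h". Values are a small subset chosen for illustration.
CLASSICAL_RULES = [
    ("qu", "kʷ"), ("ch", "kʰ"), ("ph", "pʰ"), ("th", "tʰ"),
    ("gn", "ŋn"), ("ae", "ae̯"), ("oe", "oe̯"),
    ("c", "k"), ("x", "ks"), ("v", "w"),
]

def transcribe_classical(word: str) -> str:
    """Greedy left-to-right transliteration using the ordered rule list."""
    out, i = [], 0
    while i < len(word):
        for seq, ipa in CLASSICAL_RULES:
            if word.startswith(seq, i):
                out.append(ipa)
                i += len(seq)
                break
        else:  # no rule matched: copy the letter unchanged
            out.append(word[i])
            i += 1
    return "".join(out)

print(transcribe_classical("quamquam"))  # kʷamkʷam
print(transcribe_classical("pulcher"))   # pulkʰer
```

The ordering of the rules does the real work here: a second, Ecclesiastical table (with context-sensitive rules for c and g before front vowels) would follow the same pattern.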
The first app of Latin phonetics: outline and features

Distinct from the previous contributions described above, the Handbook of Latin Phonetics has been designed to bring together teaching and research. Accordingly, it aims to advance the automated processing of the Latin language through the creation of original algorithms for a rigorous phonetic transcription of both Classical and Ecclesiastical Latin. It also aims to provide students, teachers, and researchers across the world with a compact, freely accessible, and easy-to-use toolkit to study Latin in Latin. For this reason, the Handbook of Latin Phonetics has been made available both as an online opensource program for expert users (discussed in detail in the following section) and as a user-friendly Android app (discussed in this section), which, developed in collaboration with Kamil Kolosowski, displays the most important features of the toolkit in an accessible format.

Fig. 1 Latin Phonetics App.

The Android app, freely available on Google Play, is organized as follows: 1) a learning section, containing an introduction to Latin phonetics and a list of the most frequently attested Latin lemmas; 2) a search tool offering a wide range of prosodic and phonetic information on Latin lemmas; and 3) an 'info' section providing details on the app and its related programs.23 The first page of the app is an introductory section that, divided into two parts, contains material for the independent e-learning of Latin. The first part, titled De Ratione Efferendi Verba Latina ('On the Method of Pronouncing Latin Words'), offers a brief history of the Latin language, basic notions of linguistics and phonetics, and an up-to-date explanation (written in Latin) of the main differences between Classical and Ecclesiastical pronunciations.24

Fig. 2 The app's introductory section.

23 The app can be downloaded at https://play.google.com/store/apps/details?id=com.kolosowski.latinhandbook (last access 15.10.2020). The online program is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App (last access 02.09.2020). The constitutive elements of the app can be inspected through this link: https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/Mobile%20APP (last access 02.09.2020).
24 The introduction is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/Mobile%20APP/Intro_App.txt (last access 02.09.2020).
This explanation deals especially with the differences in sound between the diphthongs ae and oe (which are pronounced as monophthongs in Ecclesiastical Latin), with the pronunciation of the velar and voiced plosives (c, g), which are never soft in Classical Latin, and with the group '-ti + vowel', generally pronounced /tɪ/ in Classical Latin and /tsi/ in Ecclesiastical. Unlike many modern grammars and digital programs based on Allen's Vox Latina (1966) for Classical Latin and on Nunn's Introduction to Ecclesiastical Latin (1927), this explanation builds on more recent studies, such as those on Classical Latin by Traina/Bernardi-Perini (1998) and Oniga (2014), and those on Ecclesiastical Latin by Collins (1998) and especially Pavanetto (2009), who was the head of the Pontifical Institute Latinitas governing the Catholic Church's official use of Latin. This approach governs the phonetic transcription performed by the program, which is summarized in the following table (Tab. 1), especially concerning the sounding of diphthongs in Classical Latin. In this respect, the development of so-called historical and generative grammar over the past century has revealed that the diphthongs ae and oe evolved from the older forms ai and oi, which left a mark on the spelling used in some epigraphs composed before the end of the second century CE. For instance, in the inscription adorning the tomb of Scipio Barbatus, dated to around 250 BCE, the term aedilis ('aedile': the aedilis was an elected officer responsible for the maintenance of public buildings) is spelled aidilis (CIL 06, 01287).
In another epigraph composed some thirty years later, the word praefectura ('prefecture') is written praifectura (CIL 10, 06231).25 Matching the phonetic spellings of these inscriptions with the general linguistic tendency of diphthongs to take a semi-consonantal sound 'i' or 'u' as their second element, a past generation of scholars suggested reading classical texts by pronouncing the diphthongs 'ae' and 'oe' as /aj/ and /oj/, like the English sounds of 'high' and 'boy'.26 However, scholars have more recently pointed out that these pronunciations might simply reflect an archaic transition between the diphthongs ai and ae, since the canonical form aedem is already attested in the famous text of the so-called senatus consultum de bacchanalibus, written in 186 BCE.27 Thus, while maintaining that the second element of a diphthong is always an asyllabic vowel that cannot be stressed, modern scholarship has suggested that, in Classical Latin, 'the pronunciation of the diphthongs ae and oe is [ae] and [oe] respectively.'28

25 CIL is an acronym standing for Corpus Inscriptionum Latinarum: this work contains a comprehensive collection of ancient Latin inscriptions.
26 See Allen (1966), 131–32.
27 See Cupaiuolo (1991), 77–87.
28 Quotation from Oniga (2014), 22. See also Cupaiuolo (1991), 86–87 and Traina/Bernardi-Perini (1998), 66.

Fig. 3 The sarcophagus of Scipio Barbatus, currently displayed in the Vatican Museums.

The introductory section of the app presents the results of these studies in the form of simple Latin rules, often comparing the sounds of Latin with those of modern languages.

Tab. 1 Latin phonetic transcription.

Latin alphabet       | Classical pronunciation | Ecclesiastical pronunciation
ā                    | aː                      | a
ă                    | a                       | a
ae                   | ae̯                      | ɛ
b                    | b                       | b
c                    | k                       | k
c + e, i, y, ae, oe  | k                       | tʃ
ch                   | kʰ                      | k
d                    | d                       | d
ē                    | eː                      | e
ĕ                    | ɛ                       | e
f                    | f                       | f
g                    | g                       | g
g + e, i, y, ae, oe  | g                       | dʒ
gn                   | ŋn                      | ɲ
h                    | h                       | -
ĭ                    | ɪ                       | i
ī                    | iː                      | i
i (semiconsonant)    | j                       | j
k                    | k                       | k
l                    | l                       | l
m                    | m                       | m
n                    | n                       | n
ŏ                    | ɔ                       | o
ō                    | oː                      | o
oe                   | oe̯                      | e
p                    | p                       | p
ph                   | pʰ                      | f
q                    | kʷ                      | kw
r                    | r                       | r
s                    | s                       | s
t                    | t                       | t
th                   | tʰ                      | t
ŭ                    | ʊ                       | u
ū                    | uː                      | u
u (semiconsonant)    | w                       | v
x                    | ks                      | ks
z                    | z                       | dz

The second part of the app's introductory section contains a list of the most common Latin lemmas, which are crucial for students' vocabulary-learning. Ideally, it would have been nice to have a dedicated section for this frequency list. However, the toolbars of mobile apps tend to be very limited in terms of space. Therefore, we decided to place the list of Latin lemmas in the introductory section, after the explanation of Latin phonetics.
While the online program (on GitHub) can virtually scan every Latin lemma, the app offers a selection of the most common 6500 Latin words as attested across a wide corpus of Classical and Christian texts dating from the fourth century BCE to the sixth century CE.29 Following a growing scholarly consensus that frequency information plays a key role not only in computational linguistics but also in literary and intertextual research, and in language teaching, the last two decades have seen the publication of many frequency dictionaries for modern languages.30 Yet, while Latin authors themselves often referred to the frequency or usus of Latin words in their commentaries, no comprehensive Latin frequency dictionary exists.31 The few modern attempts to provide a rigorous lemmatization and count of Latin words have always adopted limited textual corpora based on 'highly representative' authors from the so-called 'golden literature'.32 These dictionaries consequently struggle to meet the needs of contemporary students and researchers, who are increasingly shifting their attention to the 'less famous' literature of the early Republican, Christian, and Late Antique periods. By contrast, the frequency list provided by the app is based on a wide corpus of 307 Classical and Christian authors, which has been analyzed using original algorithms and the capabilities of the new lemmatizer Lemlat to provide a realistic picture of the most common terms used in Latin texts of different periods.33 Since average cultured speakers of a language know around twenty thousand lemmas, and use only a few thousand of them in their daily life, our 6500-word list provides students not only with basic lemmas but also with the most important technical and specific words most commonly attested in Latin literature.34 At the same time, the reasonably small size of the corpus makes it possible for the information provided to be manually checked, and for the app to work even offline.

The section 'search' contains the most important contribution offered by the app: the phonetic transcription of Latin lemmas in both Classical and Ecclesiastical Latin.35 Using this function, users can type a Latin lemma without diacritics, and access information on the quantities of its syllables, and on its accentuation, pronunciation(s), and basic grammatical features, including the presence of homographs that have different meaning and prosody.

29 The online Python transcriber can be accessed at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/CLASSICAL%26ECCLESIASTICAL%20LATIN%20IPA%20TRANSCRIBER (last access 02.09.2020). The app frequency list is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/Mobile%20APP/INFO_LIST.txt (last access 02.09.2020).
30 On the importance of frequency lists in the pedagogy of Latin, see Muccigrosso (2004). On the use of frequency data for language teaching in general, see Sinclair (1991), 30 and Davies (2005), vii.
31 See Folco Martinazzoli (1953) on the use of the concept of hapax legomenon by ancient commentators and Denooz (2010), 1–2.
32 Latin frequency dictionaries have been published by Diederich (1939); Delatte/Evrard/Govaerts/Denooz (1981); and Denooz (2010). The largest corpus used so far is that of Denooz (2010), which includes nineteen authors but does not include important texts such as Ovid's Metamorphoses.
33 The original source code is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/src (last access 02.09.2020). The lemmatizer Lemlat can be accessed at http://www.lemlat3.eu/ (last access 12.01.2021).
34 On modern languages, see, for instance, Coxhead/Nation/Sim (2015), 121–35. Modern Latin frequency lists tend to provide students with only a few thousand terms. For instance, Williams (2012) offers a 1425-word list.
35 The database is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/Mobile%20APP/Data_App_accentuation_Ipa.txt (last access 02.09.2020).
For example, when one searches the word praedico, the program shows that two lemmas have the same spelling: one of them has a short penultimate syllable and is a verb of the first conjugation (prāedĭco, prāedĭcas, praedĭcāre, prāedicavi, prāedicatum; to announce), while the other has a long penultimate syllable and belongs to the third conjugation (prāedīco, prāedīcis, prāedīcĕre, prāedīxi, prāedīctum; to foretell). Although the verb prāedīco is less common than the verb prāedĭco, in such cases the database displays both entries, to help users notice potentially ambiguous forms and to show any differences in their pronunciation.

To make the program more accessible to beginner students, the app provides not only the IPA transcription but also the syllabication and accentuation of each lemma using the Latin alphabet. This information, displayed between square brackets, can be used for both Classical and Ecclesiastical Latin.36

36 See Pavanetto (2009).
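The shape of such a lookup result can be sketched as follows. The field names and IPA strings below are my own illustration of the praedico example just discussed, derived from Tab. 1 and the standard accent rules, not an extract from the app's database.

```python
# Hypothetical data shape for the homographs of "praedico"; field names
# and IPA values are illustrative, not dumped from the app.
PRAEDICO_ENTRIES = [
    {
        "lemma": "praedĭco (1st conj.: praedĭcas, praedĭcāre; 'to announce')",
        "accentuation": "práe-di-co",        # short penult: stress the antepenult
        "classical_ipa": "ˈprae̯.dɪ.koː",
        "ecclesiastical_ipa": "ˈprɛ.di.ko",
    },
    {
        "lemma": "praedīco (3rd conj.: praedīcis, praedīcĕre; 'to foretell')",
        "accentuation": "prae-dí-co",        # long penult: stress the penult
        "classical_ipa": "prae̯.ˈdiː.koː",
        "ecclesiastical_ipa": "prɛ.ˈdi.ko",
    },
]

for entry in PRAEDICO_ENTRIES:
    print(entry["lemma"], "->", entry["classical_ipa"])
```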
However, when reading late-antique and medieval texts in the Ecclesiastical pronunciation, users should be aware that, after the quantity of vowels was no longer perceived by Latin speakers, the accentuation of some words changed. For instance, while Classical Latin could not preserve the original accentuation of Greek words such as φιλοσοφία (philosophy), which was pronounced philosóphĭa according to Latin prosody, the Greek words introduced into the Latin vocabulary after the disappearance of vocalic quantities around the third century CE maintained their original Greek accentuation (e.g., ἔρημος, éremus, 'hermitage'). Similarly, words in which a short penultimate syllable was followed by a muta cum liquida, which were stressed on the third-last syllable in Classical Latin (e.g., íntĕgrum; intact), tended to be accentuated on the penultimate syllable in late-antique and medieval Latin (e.g., intégrum).37

Section Three, which users can access through the button 'Info', explains the genesis of the app and acknowledges the work of the Classicists (T. Spinelli, C. Pavanetto) and Computer Scientists (Giacomo Fenzi, Kamil Kolosowski, Jan Rybojad) who developed it. Moreover, it contains links to the online repositories in which the codes and programs underpinning the app are stored. Overall, in its unique and unprecedented features, the Handbook of Latin Phonetics app contributes importantly to language teaching and to stylistic and prosodic studies by allowing even beginner students and non-expert users to learn the most common Latin words and their correct pronunciations, as recommended by the most recent studies on Latin linguistics.

The online toolkit: outline and features

Available in opensource, the Latin Phonetics online toolkit contains the source codes through which the data displayed in the app is generated. While the app offers only premade information that can be easily accessed (even offline) by every user who is able to use a smartphone, the online program allows users with good informatic skills to generate customized results. As I have anticipated in the introduction, the online program is accessible through the GitHub page of the Latine Loquamur Project.38 The project's home page currently features two repositories containing, respectively, the Online Dictionary of Latin Near Synonyms (which I plan to discuss in another article) and the program on Latin phonetics, which can be accessed by clicking on the folder 'Latineloquamur-toolkit-IPA-transcriber-and-App'.

Fig. 4 The homepage of the Latine Loquamur repository in GitHub.

The repository that hosts the program on Latin phonetics is organized in different folders corresponding to the different functions performed by the program. This means that, while one can see the accentuation, syllabication, phonetic transcription, and potential homographic forms of selected lemmas simultaneously in the app, the online program generates these results separately through different packages. The folder 'Accentuation' contains the code to automatically accentuate macronized Latin words organized in a CSV file.39 The output of this program is a new CSV file displaying both the original word and its accentuated form.40

37 On the evolution of Latin through Late Antiquity, see Norberg (1999), 33–35.
38 https://github.com/latineloquamur?tab=repositories (last access 02.09.2020).
39 Both functions are available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/ACCENTUATION/Latin_accentuation_code.py (last access 02.09.2020).
40 A sample is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/ACCENTUATION/sample_accentuation.txt (last access 02.09.2020).
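A drastically simplified version of this accentuation logic can be sketched in a few lines of Python. This is not the repository's Latin_accentuation_code.py: it assumes pre-syllabified, macronized input and ignores refinements such as the muta cum liquida sequences and the late-antique accent shifts discussed above.

```python
# A toy accentuation routine: Latin stress falls on a heavy penult
# (long vowel, diphthong, or closed syllable); otherwise on the antepenult.
LONG_MARKS = "āēīōū"
VOWELS = "aeiouyāēīōūăĕĭŏŭ"

def is_heavy(syl: str) -> bool:
    return (any(c in LONG_MARKS for c in syl)
            or "ae" in syl or "oe" in syl or "au" in syl   # diphthongs
            or syl[-1] not in VOWELS)                      # closed syllable

def accentuate(syllables):
    if len(syllables) < 3 or is_heavy(syllables[-2]):
        stress = max(len(syllables) - 2, 0)   # penult (or the only syllable)
    else:
        stress = len(syllables) - 3           # antepenult
    return "-".join("´" + s if i == stress else s
                    for i, s in enumerate(syllables))

print(accentuate(["in", "tĕ", "grum"]))   # ´in-tĕ-grum (the Classical accent)
print(accentuate(["prae", "dī", "co"]))   # prae-´dī-co
```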
The folder 'Classical and Ecclesiastical Latin IPA Transcriber' hosts the code that performs the transcription of given Latin lemmas into phonetic characters according to the standards of the International Phonetic Alphabet.41 Here the file 'Readme.md' provides users with a detailed guide on how to run this complex package. In short, by running the commands 'cargo build' and 'cargo run -- {path to the file}' in sequence, users can perform the phonetic transcription of Latin words (organized one per line in a txt file) and generate two files containing, respectively, the Classical and Ecclesiastical pronunciation of those words. A sample of the results that can be achieved using this package is provided by the section 'sample IPA'.42 The folders 'Implementation' and 'Mobile App' can be disregarded by users, as they contain, respectively, work-in-progress material that will be used in the implementation of the toolkit (as described in the final section of this article) and the code that has been used to build the app. Using the folders 'Cargo' and 'Src', expert users can generate frequency lists of Latin lemmas attested in a customizable set of Latin texts.43 Specifically, the folder 'cargo' governs the functioning of the packages hosted in 'Src' and contains a 'Dockerfile' with instructions. To use this program, users can upload the texts that they wish to process into the sub-folder 'data_dir' (within 'Src').44 Files in this repository must be organized in folders, one for each author, and named with the authors' names. Inside each author-folder, there must be a list of folders corresponding to the author's works, which must be stored in 'txt' format. An example of how to organize personalized textual corpora efficiently is offered by the file 'sampletxt.txt'.45

41 https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/CLASSICAL%26ECCLESIASTICAL%20LATIN%20IPA%20TRANSCRIBER (last access 02.09.2020).
42 https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/CLASSICAL%26ECCLESIASTICAL%20LATIN%20IPA%20TRANSCRIBER/sample%20IPA (last access 02.09.2020). Note that slight differences may be caused by the manual checking operated on the data used in the app.
43 https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/cargo; https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/src (last access 02.09.2020).
44 The repository is accessible at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/src/Data_dir (last access 02.09.2020).
45 https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/commit/b475af49a9dcbabb3a9cb70582da84b5df18ecd9 (last access 13.12.2020).
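The expected layout of such a personalized corpus can also be sketched programmatically. The snippet below is my own illustration: the author labels and work titles are placeholders (author names follow the PHI-style labels that the project adopts), and it simply creates the nested folders and plain-text files that the frequency-list program expects.

```python
# Build a miniature corpus skeleton in the layout described above:
# data_dir / AUTHOR / WORK / WORK.txt. All names are placeholders.
from pathlib import Path

data_dir = Path("src/Data_dir")
corpus = {
    "OV": ["Metamorphoses", "Fasti"],   # PHI-style author labels
    "VERG": ["Aeneis"],
}
for author, works in corpus.items():
    for work in works:
        work_dir = data_dir / author / work
        work_dir.mkdir(parents=True, exist_ok=True)
        (work_dir / f"{work}.txt").touch()   # plain-text file of the work
```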
Fig. 5 The folders in which the Latin Phonetics online program is organized.

While the Android app simply provides a list of the 6500 most common Latin lemmas, expert users can perform much more complex searches with the online program to generate data on the use of a term, or only on some of its forms, by specific authors within a selected time frame. The tremendous potential of this tool can be appreciated by looking at the sample that, stored in the GitHub repository, shows a part of the data generated by running the program on our large textual corpus.46

Technologies

The original technologies used to develop the program underpinning the app are highly advanced and closely tailored to its aims and function. This complex program is organized in different packages that govern, respectively, the frequency statistics of the words most commonly attested in Latin literature, the syllabication and accentuation of Latin lemmas, their transcription into the characters of the International Phonetic Alphabet, and their visualization through a user-friendly mobile app.47 The following section will discuss the most important technologies and methodological issues concerning each component of the backend.

Lemmatizing and counting Latin

The first stage of the development of the Latin Phonetics toolkit was the creation of a unique frequency list of the 6500 most common Latin lemmas, as attested across a large corpus of both Classical and Christian texts. Making this list involved parsing, lemmatizing, and categorizing the data directly from the sources, which means scanning a pre-assembled and standardized textual corpus in order to identify the different inflected forms of each word and to calculate which lemmas are the ones most frequently used by Latin authors.

46 Samples are available on GitHub (https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/src/sample_frequency_list_data.txt; last access 02.09.2020) and on a dedicated, work-in-progress webpage (https://latin.netlify.com/; last access 02.09.2020).
47 These packages are freely accessible through the program's repository: https://doi.org/10.17630/19ce37ba-2d35-4920-bd7f-6287977de369 (last access 02.09.2020); https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App (last access 02.09.2020).
While most previous Latin dictionaries have relied on manual processing of texts, this toolkit uses original algorithms and the capabilities of the opensource lemmatization service offered by Lemlat.48 Lemmatization is the process through which the variants of a term, that is, its inflected or graphically different forms (e.g., amat, amant, amas, amavi, amatum; to love), are attributed to their lemma: the standard form of the word (e.g., amo) as it appears in a dictionary. Many programs can perform this task on Latin texts quite successfully (e.g., the Schinke algorithm, the Perseus lemmatizer, PROIEL, Parsley, Morpheus, Whitaker's Words, LatMor), but none of these technologies provides entirely correct data.49 Among them, we have chosen to use a freely adapted version of Lemlat, which was developed between 2002 and 2004 by the National Research Centre (CNR) of Pisa in collaboration with the University of Turin, because it has proved to be the most consistent and reliable technology of this kind.50

48 Cf. http://www.ilc.cnr.it/lemlat/lemlat/index.html (last access 12.01.2021).
49 See, for instance, LatMor (http://cistern.cis.lmu.de; last access 02.09.2020), Words (http://archives.nd.edu/words.html; last access 02.09.2020), Parsley (https://github.com/goldibex; last access 02.09.2020), PROIEL (https://github.com/proiel/proiel-treebank; last access 02.09.2020), and Morpheus (https://github.com/tmallon/morpheus; last access 02.09.2020). On these technologies see also Springmann/Schmid/Dietmar (2016).
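The basic mechanics of this step can be pictured with a deliberately tiny sketch. The following Python fragment is not the toolkit's Rust code: it illustrates how a lemmatization base maps inflected forms onto lemmas whose occurrences are then counted; homographic forms, as noted below, are the hard case that such a naive lookup cannot disambiguate.

```python
# An illustrative (non-Lemlat) sketch of the lemmatize-and-count step:
# inflected forms are mapped onto their dictionary lemma and a counter
# aggregates the frequencies. The toy table stands in for the much
# larger CSV exported from Lemlat that the real program consults.
from collections import Counter

FORM_TO_LEMMA = {   # hypothetical miniature lemmatization base
    "amat": "amo", "amant": "amo", "amas": "amo",
    "amavi": "amo", "amatum": "amo", "puellae": "puella",
}

def count_lemmas(tokens):
    """Count lemma frequencies, leaving unknown forms unresolved."""
    counts = Counter()
    for token in tokens:
        counts[FORM_TO_LEMMA.get(token.lower(), token.lower())] += 1
    return counts

print(count_lemmas(["amat", "amant", "puellae", "amas"]))
# Counter({'amo': 3, 'puella': 1})
```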
Lemlat is based on a database of 40,014 lexical entries and 43,432 lemmas, including many late antique and medieval terms, and adopts the standards of the Oxford Latin Dictionary (Glare [1982]). Able to recognize over 97% of Latin terms, including many anthroponyms and toponyms, it successfully lemmatizes 319,725 lexemes into 30,413 lexical entries (around 3,500 more than the modern Liège dictionary by Denooz). Moreover, its automatic analysis is very accurate and takes into account many spelling variations and even rare or archaic forms of a lemma which earlier frequency dictionaries neglect.[51] However, this technology, which is still undergoing further development, cannot disambiguate homographic forms, which are therefore counted under all the lemmas to which they can belong.

To create the frequency list used in the app, we fed into the program (written in Rust) a large textual corpus yielding some 9,484,029 words and covering the works of 307 authors, assembled from different open-source textual databases available online, such as Perseus, the PHI database, and the Bibliotheca Augustana.[52] This textual corpus, which has not been made publicly available in accordance with its distribution licence, was stored in the repository folder data_dir.[53] As anticipated above, this folder has been left empty in the program's repository, so that users can input a personalized corpus on which to run our program, using, for example, the large textual databank provided by Perseus, or the Packard Humanities Institute corpus, available both online and on CD.

The most important element of this package is the 'lemmatizer': a CSV file exported directly (with adaptations) from Lemlat, which specifies the lemmatization bases used to operate on the literature.[54] This section also contains an original 'runner' program, in charge of the full GraphQL endpoint, which is used to query the text through the generic command 'cargo run --bin {program_name} -- -a AUTHORS_FILE -d DATA_DIR -l LEMM_FILE', where the options authors_file, data_dir, and lemm_file specify the data files on which to operate.
Specifically, authors_file is used for an advanced function which is still being perfected: it contains a database with the chronology of Latin authors, which can be used to perform diachronic linguistic searches on specific centuries.[55]

[50] The program is available at http://www.lemlat3.eu/ (last access 02.09.2020). On its features and assessment see Passarotti/Budassi/Litta/Ruffolo (2017), 24–31, and Springmann/Schmid/Dietmar (2016). The adapted version of the lemmatizer is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/src/latin_lemmatizer (last access 02.09.2020).
[51] See Passarotti/Budassi/Litta/Ruffolo (2017), 26. One way to check how forms are lemmatized in our dictionary through Lemlat is http://www.ilc.cnr.it/lemlat/lemlat/index.html (last access 12.01.2021). Lemlat successfully lemmatized more than 97% of the word-forms attested in our corpus, leaving only 2.88% of the forms unrecognized. Many of these are names, Greek forms used in Latin texts, book numbers given in Roman numerals (e.g., LXV), or Latin endings (e.g., -ar, -or) that are mentioned by ancient grammarians in their discussions of Latin morphology but do not correspond to any lemma.
[52] The Rust source code is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/src (last access 02.09.2020). On Perseus see http://www.perseus.tufts.edu/hopper/ (last access 02.09.2020); the Bibliotheca Augustana can be accessed at http://www.hs-augsburg.de/~harsch/a_impressum.html (last access 02.09.2020). The list of authors included in our textual corpus is stored in this repository: https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/src/authors_chrono/AUTHORS-LIST (last access 02.09.2020). Our corpus uses the PHI standards to name Latin authors, in order to facilitate the use of the toolkit by other users, who will likely use the PHI textual database. The Packard Humanities Institute (PHI) corpus is one of the widest open-source Latin corpora currently available online (https://latin.packhum.org/; last access 02.09.2020) and, although it does not match our corpus perfectly, it can be effectively used to look up the large majority of the Latin passages in which our lemmas or their inflected forms appear. However, while our program lemmatizes every inflected form, the PHI search tool performs only simple pattern-matching queries. Thus, a search for 'ultor' also returns results like 'multorum', unless the search is made for a specific form like #ultor#; in that case the program displays only the occurrences of that specific graphic form, not of the lemma ultor and its inflected forms.
[53] The repository is accessible at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/src/Data_dir (last access 02.09.2020).
[54] To run the lemmatizer use https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/src/latin_lemmatizer/src/parsers (last access 02.09.2020).
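The kind of diachronic query that the authors_file chronology enables can be sketched as follows; the author-to-century mapping and the data format are illustrative placeholders, not the structure of the real AUTHORS-LIST file:

from collections import Counter

AUTHOR_CENTURY = {"Cicero": -1, "Seneca": 1, "Augustinus": 4}  # BCE negative

def counts_by_century(occurrences):
    # occurrences: iterable of (lemma, author) pairs taken from the corpus.
    counts = Counter()
    for lemma, author in occurrences:
        century = AUTHOR_CENTURY.get(author)
        if century is not None:
            counts[(lemma, century)] += 1
    return counts

occ = [("amo", "Cicero"), ("amo", "Seneca"), ("ultor", "Seneca")]
print(counts_by_century(occ)[("amo", 1)])  # occurrences of amo in the 1st c. CE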
The data generated by the runner can be exported using the CSV function. Conceptually, this system works as a 'forest' of sorts, with one tree for each lemma, one leaf for each form, and, for each form, a collection of occurrences. The data obtained from this system is semi-structured (fully tagged, but without a rigidly defined structure, allowing great flexibility in exporting it), and the entire system operates without interacting with the disk. Leveraging this core system, two applications can query the relational databases (texts, lemmatizer, and chronological map of authors) and generate the results requested by users in different formats (e.g., JSON, txt, Excel).[56]

Prosodic processing

After building the list of the most common Latin lemmas, other algorithms were used to divide words into their syllables and to mark these as long or short, which is indispensable for a correct phonetic transcription. In Latin, the quantity of a syllable does not always coincide with the quantity of the vowel that it contains: syllables are always long, except where a short vowel stands in an open syllable (a syllable that does not end with a consonant). For instance, the u in the second syllable of the word sepultus ('buried') is short by nature (se-pŭl-tus); however, because that syllable is closed (it ends in a consonant), it is long, and takes the accent (sepúltus). For this reason, the program displays the quantities of syllables. Several open-source tools can divide Latin words into syllables and mark the long ones as such.[57] Among them, we used the syllabifier shared by CLTK and Collatinus, because it has recently been updated to process correctly 'exceptional' forms that do not follow the standard rule of syllabification, using a list made by Rev. Frère Romain Marie de l'Abbaye Saint-Joseph de Flavigny-sur-Ozerain in 2016.[58] Thus, this algorithm can correctly syllabify compound words in which consonants are counted in the same syllable (e.g., de-scri-bo; 'to describe').
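The quantity rule stated above can be written down in a few lines. This sketch deliberately simplifies (it ignores diphthongs and takes the syllable division and vowel quantities as given), and its input conventions are introduced here, not taken from the toolkit:

VOWELS = "aeiouy"

def syllable_is_long(syllable, vowel_is_long):
    # A closed syllable (one ending in a consonant) is always long;
    # an open syllable has the quantity of its vowel.
    closed = syllable[-1] not in VOWELS
    return closed or vowel_is_long

# se-pul-tus: 'pul' contains a short u but ends in a consonant, so it is long.
print(syllable_is_long("pul", vowel_is_long=False))  # True
print(syllable_is_long("se", vowel_is_long=False))   # False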
The CLTK/Collatinus tool also offers the most efficient macronizer currently available, based on eight Latin dictionaries.[59] After marking the quantities of each syllable, original Python scripts were used to accentuate Latin lemmas.[60] This program, which I co-developed with Jan Rybojad, takes as input a CSV file containing Latin lemmas and parses each word into an array of Unicode characters. For each lemma, this array is further converted into two new arrays, one for sounds and one for vowels (including diphthongs). The actual accentuation is performed through the functions findStress and isLongVowel, which replace long vowels and diphthongs with the appropriate stressed vowels and diphthongs according to the rules of Latin accentuation.[61] In particular, if the second-last vowel of a lemma is marked as long or is a diphthong, the program accentuates it; if the second-last vowel is marked as short and is not a diphthong, it instead replaces the third-last vowel or diphthong with the corresponding stressed vowel.

[55] The runner program is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/src/runner (last access 02.09.2020).
[56] Samples of potential results are available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/src/sample_frequency_list_data.txt (last access 02.09.2020) and at https://latin.netlify.com/ (last access 02.09.2020).
[57] For instance, this service is provided by http://marello.org/tools/syllabifier/ (last access 02.09.2020); https://github.com/cltk/cltk/blob/master/cltk/stem/latin/syllabifier.py (last access 02.09.2020); https://github.com/biblissima/collatinus/blob/master/doc-usr/scander.md (last access 02.09.2020).
[58] Cf. https://github.com/biblissima/collatinus/blob/master/bin/data/hyphen.la (last access 02.09.2020).
[59] The dictionaries are De Valbuena (1819); Noël (1824); Quicherat (1836); De Miguel (1867); Franklin (1875); Lewis/Short (1879); Du Cange (1883); Georges (1888); Calonghi (1898); Gaffiot (1934); Gaffiot (2016).
[60] See https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/ACCENTUATION (last access 02.09.2020).
[61] Both functions are available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/ACCENTUATION/Latin_accentuation_code.py (last access 02.09.2020).
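In outline, the stress rule implemented by these functions looks like the following simplified re-statement (this is not the published script itself; here syllables arrive pre-marked for quantity as (text, is_long) pairs):

def find_stress(syllables):
    # Return the index of the stressed syllable: the penult if it is long,
    # or if the word has only two syllables; otherwise the antepenult.
    if len(syllables) == 1:
        return 0
    penult = len(syllables) - 2
    if len(syllables) == 2 or syllables[penult][1]:
        return penult
    return penult - 1

word = [("se", False), ("pul", True), ("tus", True)]
print(find_stress(word))  # 1 -> the accent falls on 'pul': sepúltus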
The script's output is a new CSV file whose lines pair each input lemma with its accentuated form.[62]

IPA transcriber for Classical and Ecclesiastical Latin

A unique feature of the Latin Phonetics toolkit is the phonetic transcription of both Classical and Ecclesiastical Latin into the International Phonetic Alphabet, performed by original algorithms. Because the program takes as input Latin words with the quantities of their syllables fully marked, this operation is conceptually simple: given an input word, the algorithm iteratively applies a number of replacement rules, co-developed in collaboration with Giacomo Fenzi, that convert a combination of Latin characters into the corresponding IPA symbols (e.g., x → /ks/). However, several factors make this process more complex. First, the pronunciations of Classical and Ecclesiastical Latin have to be treated separately because they follow different rules. For instance, while the long e (ē) is pronounced as a long /eː/ in Classical Latin, it is pronounced as an ordinary closed /e/ in Ecclesiastical Latin, where vowel quantity is no longer perceived as a phonetically significant element. Similarly, the nexus gn, which sounds /ŋn/ in Classical Latin, is softened in Ecclesiastical Latin (/ɲ/). In this important respect, the phonetic transcription of Classical Latin performed by the Latin Phonetics toolkit differs from that of CLTK, in that it is based on the new phonetic transcriptions recommended by recent studies in Latin linguistics. Secondly, combinations of letters are sometimes pronounced as a single sound. While a nexus can have different lengths, the same letters that appear in a two-character group can be pronounced differently when they occur in a three-character nexus. For instance, in the term ămīcĭtĭa ('friendship'), the nexus 'tia' is pronounced /tsja/ in Ecclesiastical Latin; but in the plural form ămīcĭtĭāe the same group 'tia' appears within the longer group 'tiae', formed by the nexus 'ti + vowel' and the diphthong 'ae'.
In this case, the replacement /tsja/ + /e/ would be wrong, because the group ae is monophthongized to /ɛ/ or /e/ in Ecclesiastical Latin, and the word is consequently pronounced /a.miˈtʃi.tsje/. To handle these problems, the program, which is written in Rust, processes Classical and Ecclesiastical Latin separately.[63] In particular, taking as input a path to a file containing a list of words (one per line), the commands 'cargo build' (the executable being 'target/debug/ipa_latin(.exe)') and 'cargo run -- {path to the file}' perform parallel replacements for Classical and Ecclesiastical Latin, using rules such as 'if Classical { subs.push(("aei", "ae̯i")); } else { subs.push(("aei", "ɛi")); }', where the else branch gives the Ecclesiastical pronunciation. As a result, the program generates two different files, 'Classical.txt' and 'Eccl.txt'. In order to treat nexus efficiently, these replacements are based on conversion rules that specify each possible combination and are applied in descending order of length, so as to match longer structures first.[64] In this way, for instance, the group oe is correctly transcribed as /e/ in Ecclesiastical Latin, rather than as /o/ + /e/. The results of this process can be seen in the files stored in the GitHub folder 'sample IPA'.[65]

[62] A sample is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/ACCENTUATION/sample_accentuation.txt (last access 02.09.2020).
[63] The Rust transcriber is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/CLASSICAL%26ECCLESIASTICAL%20LATIN%20IPA%20TRANSCRIBER (last access 02.09.2020).
[64] https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/CLASSICAL%26ECCLESIASTICAL%20LATIN%20IPA%20TRANSCRIBER/src/ipa.rs (last access 02.09.2020).
[65] https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/tree/master/CLASSICAL%26ECCLESIASTICAL%20LATIN%20IPA%20TRANSCRIBER/sample%20IPA (last access 02.09.2020). Note that slight differences may be caused by the manual checking performed on the data used in the app.

[Fig. 6: Sample of the database deployed by the app.]
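A minimal Python sketch of this longest-match-first substitution scheme follows; the actual transcriber is the Rust program cited above, and only a handful of the rules mentioned in this article are included here, whereas the real rule set is far larger:

CLASSICAL = [("gn", "ŋn"), ("x", "ks")]
ECCLESIASTICAL = [("gn", "ɲ"), ("oe", "e"), ("ae", "e"), ("x", "ks")]

def to_ipa(word, rules):
    # Sort patterns by descending length so that longer nexus are matched
    # first, e.g. 'oe' is consumed as one unit before 'o' or 'e' alone.
    rules = sorted(rules, key=lambda r: len(r[0]), reverse=True)
    out, i = [], 0
    while i < len(word):
        for pattern, ipa in rules:
            if word.startswith(pattern, i):
                out.append(ipa)
                i += len(pattern)
                break
        else:
            out.append(word[i])  # characters without a rule pass through
            i += 1
    return "".join(out)

print(to_ipa("agnus", CLASSICAL))       # aŋnus
print(to_ipa("agnus", ECCLESIASTICAL))  # aɲus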
App design

The project has also pioneered the creation of an intuitive mobile app, titled Handbook of Latin Phonetics, which makes the phonetic program easily accessible to students and non-expert users. Because the amount of data is relatively small, rather than connecting the app to the program through a REST API, the application has been designed to work offline, using as input a SQLite database containing the information created by the program. In this file, a list of Latin lemmas without diacritics is matched with the lemmas complete with grammatical and prosodic information and their phonetic transcriptions. Therefore, when one searches for a word without diacritics, all the possible corresponding Latin lemmas are returned, which may include different syllabic quantities, as in the aforementioned case of praedico. The app was built using the Software Development Kit Flutter, which allows one to build high-performance apps for iOS and Android from a single codebase, using the Dart programming language.[66] In the future, Flutter will become the native framework of Google's Fuchsia OS, so that a project developed in Flutter will work on three platforms: iOS, Android, and Fuchsia. The architecture of the app is simple and uses a Business Logic Components pattern, meaning that everything in the app is represented as a stream of events, in which some widgets submit events and other widgets respond.

Future developments

Two new implementations of the Latin Phonetics toolkit are being developed to further support both academic research and language teaching.[67] The first is a diachronic function which, based on an original diachronic mapping of Latin authors, will allow users to see by which Latin authors, and in which century, each lemma was used.[68] This feature will support not only stylistic choices in exercises of Latin composition, but also commentary writers (by providing a concise 'story' of each word and of its occurrences) and philological conjecture (by showing which words were more likely to be used by an author).
The second new feature being designed will leverage my app of Latin synonyms to describe the meaning of each lemma (for which phonetic information is provided) directly in Latin, through a list of its most important Latin synonyms.[69] This function will also support new digital technologies being developed to detect similarities of meaning and ideas between Latin texts independently of precise lexical repetitions. Finally, a version of the app for Apple devices will be released soon.

Overall, in its innovative cross-fertilization of recent developments in the fields of Latin linguistics, pedagogy, and digital humanities, the Latin Phonetics toolkit bridges teaching and research, using original algorithms to provide scholars and students with the first IPA phonetic transcription of the Classical and Ecclesiastical pronunciations of the most common Latin lemmas, as attested across the entire corpus of Latin literature. Besides facilitating the teaching of Latin in Latin and contributing to the creation of a shared methodology for the study of Latin phonology, this tool also supports a more interactive, independent learning of Latin, displaying the benefits of truly interdisciplinary approaches to the study of classical languages.[70]

[66] An overview of this technology is provided by Kuzmin/Ignatiev/Grafov (2020). Cf. https://flutter.dev/ (last access 02.09.2020).
[67] Originally, we had planned to assess the impact of our toolkit (published at the beginning of 2020) by using students' and professors' feedback. However, due to the Covid-19 pandemic this has so far been impossible. We now aim to collect and examine feedback after the development of the two new implementations.
[68] A draft is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/src/authors_chrono/AUTHORS-LIST (last access 02.09.2020).
[69] A sample is available at https://github.com/latineloquamur/Latineloquamur-toolkit-IPA-transcriber-and-App/blob/master/IMPLEMENTATION/IMPLEMENTATION_SYNONYMS.txt (last access 02.09.2020).
[70] I would like to express my sincere gratitude to the University of St Andrews and the Pontificium Institutum Altioris Latinitatis (Salesian University of Rome) for the generous support that they provided to my project; to William Short, Cleto Pavanetto and Miran Sajovic, who helped me perfect several aspects of my research; to Giacomo Fenzi, Kamil Kolosowski and Jan Rybojad, who helped me develop the digital program and the app; and to Gregory Tirenin and Maxwell Stocker, who proofread this article.

References

Allen (1966): W. S. Allen, Vox Latina: a guide to the pronunciation of Classical Latin, Cambridge 1966.
Avitus (2018): A. G. Avitus, Spoken Latin: Learning, Teaching, Lecturing and Research, Journal of Classics Teaching 19.37 (2018), 46–52.
Barnett (2006): F. J. Barnett, The second appendix to Probus, The Classical Quarterly 56.1 (2006), 257–278.
Calonghi (1898): F. Calonghi, Dizionario Latino-Italiano, Milan 1898.
Chiesa (2012): P. Chiesa, L'impiego del 'cursus' in sede di critica testuale: una prospettiva diagnostica, in: F. Bognini (ed.), Meminisse iuuat. Studi in memoria di Violetta De Angelis, Florence 2012, 279–304.
CIL = Corpus Inscriptionum Latinarum, ed. Th. Mommsen et alii, Berlin 1862ff.
Collins (2012): A. Collins, The English pronunciation of Latin: its rise and fall, Cambridge Classical Journal 58 (2012), 23–57.
Collins (1988): J. F. Collins, A primer of Ecclesiastical Latin, Washington 1988.
Coxhead/Nation/Sim (2015): A. Coxhead, P. Nation, D. Sim, Measuring the Vocabulary Size of Native Speakers of English in New Zealand Secondary Schools, New Zealand Journal of Educational Studies 50 (2015), 121–135.
Cupaiuolo (1991): F. Cupaiuolo, Problemi di lingua latina. Appunti di grammatica storica, Naples 1991.
Davies (2005): M. Davies, A Frequency Dictionary of Spanish: Core Vocabulary for Learners, Abingdon 2005.
De Miguel (1867): R. De Miguel, Nuevo Diccionario Latino-Español Etimológico, Leipzig 1867.
De Valbuena (1819): M. De Valbuena, Diccionario Universal Latino-Español, Hermanos 1819.
Delatte/Evrard/Govaerts/Denooz (1981): L. Delatte, É. Evrard, S. Govaerts, J. Denooz, Dictionnaire fréquentiel et index inverse de la langue latine, Liège 1981.
Denooz (2010): J. Denooz, Nouveau lexique fréquentiel de latin. Alpha-Omega, Liège 2010.
Diederich (1939): P. B. Diederich, The Frequency of Latin Words and their Endings, Chicago 1939.
Du Cange (1883): C. Du Cange, Glossarium Mediae et Infimae Latinitatis, ed. L. Favre, Niort 1883–1887.
Franklin (1875): A. Franklin, Dictionnaire des noms, surnoms et pseudonymes latins de l'histoire littéraire du Moyen Age, Paris 1875.
Gaffiot (1934): F. Gaffiot, Dictionnaire Latin-Français, Paris 1934.
Gaffiot (2016): F. Gaffiot, Dictionnaire Latin-Français, London 2016.
Georges (1888): K. E. Georges, Kleines deutsch-lateinisches Handwörterbuch, Hannover/Leipzig 1888.
Harrington/Pucci (1997): K. P. Harrington, J. Pucci (eds.), Medieval Latin, Chicago 1997.
Kuzmin/Ignatiev/Grafov (2020): N. Kuzmin, K. Ignatiev, D. Grafov, Experience of Developing a Mobile Application Using Flutter, Lecture Notes in Electrical Engineering 621 (2020), 571–575.
Lewis/Short (1879): C. T. Lewis, C. Short, A Latin Dictionary, founded on Andrews' Edition of Freund's Latin Dictionary: Revised, Enlarged, and in Great Part Rewritten, Oxford 1879.
Martinazzoli (1953): F. Martinazzoli, Hapax legomenon: Parte prima, Rome 1953.
Muccigrosso (2004): J. D. Muccigrosso, Frequent vocabulary in Latin instruction, The Classical World 97.4 (2004), 409–433.
Noël (1822): F. Noël, Dictionarium Latino-Gallicum: dictionnaire latin-français, composé sur le plan de l'ouvrage intitulé Magnum totius latinitatis lexicon de Facciolati, septième édition, Paris 1822.
Norberg/Oldoni (1999): D. Norberg, M. Oldoni (eds.), Manuale di Latino Medievale, Naples 1999.
Oniga (2003): R. Oniga, La sopravvivenza di lingue diverse dal latino nell'Italia di età imperiale: alcune testimonianze letterarie, Lexis: poetica, retorica e comunicazione nella tradizione classica 21 (2003), 39–62.
Oniga (2014): R. Oniga, Latin: a linguistic introduction, Oxford 2014.
Passarotti/Budassi/Litta/Ruffolo (2017): M. Passarotti, M. Budassi, E. Litta, P. Ruffolo, The 'Lemlat 3.0' Package for Morphological Analysis of Latin, Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language 133 (2017), 24–31.
Paulus VI (1976): Paulus P. P. VI, Romani Sermonis, Rome 1976.
Pavanetto (2009): C. Pavanetto, Elementa Linguae et Grammaticae Latinae, sexta editio aucta et emendata, Rome 2009.
Préaux (1957): J. Préaux, Premier congrès international pour le latin vivant: Avignon 3–6 septembre 1956, Latomus 16.3 (1957), 509–511.
Quicherat (1836): L. Quicherat, Dictionnaire Français-Latin, Paris 1836.
Ramage (1963): E. S. Ramage, Urbanitas: Cicero and Quintilian, a contrast in attitudes, The American Journal of Philology 84.4 (1963), 390–414.
Serianni (1998): L. Serianni, Lezioni di grammatica storica italiana, Rome 1998.
Short/George (2013): E. Short, A. George, A Primer of Botanical Latin with Vocabulary, Cambridge 2013.
Sinclair (1991): J. Sinclair, Corpus, Concordance, Collocation, Oxford 1991.
Spinazzè (2014): L. Spinazzè, Cursus in clausula. An Online Analysis Tool of Latin Prose, Association for Computing Machinery 10 (2014), 1–6.
Springmann/Schmid/Dietmar (2016): U. Springmann, H. Schmid, N. Dietmar, LatMor: A Latin finite-state morphology encoding vowel quantity, Open Linguistics – Topical Issue on Treebanking and Ancient Languages: Current and Prospective Research 2.1 (2016), 386–392.
Traina/Bernardi-Perini (1998): A. Traina, G. Bernardi-Perini, Propedeutica al latino universitario, Rome 1998.
Williams (2012): M. A. Williams, Essential Latin Vocabulary: The 1,425 Most Common Words Occurring in the Actual Writings of Over 200 Latin Authors, Milan 2012.
Winge (2015): J. Winge, Automatic Annotation of Latin Vowel Length, Bachelor's Thesis in Language Technology, Uppsala University, https://cl.lingfil.uu.se/exarb/arch/winge2015.pdf (last access 02.09.2020).

Figure references

Fig. 1: Latin Phonetics App, by Tommaso Spinelli.
Fig. 2: The app's introductory section, by Tommaso Spinelli.
Fig. 3: Sarcophagus of Scipio Barbatus (Vatican Museums), image by the Center for Epigraphical Studies of the Ohio State University (http://db.edcs.eu/epigr/bilder.php?s_language=de&bild=$OH_CIL_06_01284_1.jpg;$OH_CIL_06_01284_2.jpg;$OH_CIL_06_01284_3.jpg;$CIL_01_00006.jpg;PH0010886;PH0010887;pp; last access 12.01.2021).
Fig. 4: The homepage of the Latine Loquamur repository on GitHub, by Tommaso Spinelli.
Fig. 5: The folders in which the Latin Phonetics online program is organized, by Tommaso Spinelli.
Fig. 6: Database of Latin phonetics, by Tommaso Spinelli.
Tab. 1: Phonetic Transcription Table, by Tommaso Spinelli.

Author contact information

Dr Tommaso Spinelli
University of Manchester, School of Arts, Languages & Cultures
E-mail: tommaso.spinelli@manchester.ac.uk

[71] The rights pertaining to content, text, graphics, and images, unless otherwise noted, are reserved by the author. This contribution is licensed under CC BY 4.0.
work_757zcxixengdtjcvhbr7rxqoai ----

ISPF LAB
Laboratorio dell'Ispf. Rivista elettronica di testi, saggi e strumenti (electronic journal of texts, essays and tools)
Anno XIII – 2016. www.ispf-lab.cnr.it
ISSN 1824-9817

Editors: David Armando, Leonardo Pica Ciamarra, Manuela Sanna.
Scientific committee: Josep Martinez Bisbal, Giuseppe Cacciatore, Silvia Caianiello, Maria Conforti, Pierre Girard, Matthias Kaufmann, Girolamo Imbruglia, Pierre-François Moreau, Barbara Ann Naddeo, Valeria Pinto, Enrico I. Rambaldi, Salvatore Tedesco, Maurizio Torrini, Amadeu Viana.
Editorial staff: Roberto Evangelista, Armando Mascolo, Assunta Sansone, Alessia Scognamiglio (secretary).
Technical manager: Ruggero Cerino.
Istituto per la storia del pensiero filosofico e scientifico moderno, Consiglio Nazionale delle Ricerche, via Porta di Massa 1, 80133 Napoli; email: lab@ispf.cnr.it
© ISPF-CNR, 2016

This work is published under a Creative Commons "Attribution – NonCommercial 3.0 Italy" licence. Copying and distribution for purposes of study are free, on condition that authorship and the licence are fully indicated. Specific limitations may apply to the publication of materials whose rights are not exclusively held. The journal's articles, accepted after a peer-review process, conform to the standards of the Budapest Open Access Initiative. Authors retain all rights of use over their work, with the sole obligation to mention its first publication. Articles approved before the closing of each issue are published through an online-first procedure.

CONTENTS

• Questo numero / This Issue

Texts
• LE "CARTE VILLAROSA". Sei fascicoli di carte vichiane non rilegate (Ms. XIX, 42): Nota editoriale e indici; Fascicolo I; Fascicolo II; Fascicolo III; Fascicolo IV; Fascicolo V; Fascicolo VI

Special section: Proceedings of the conference "Giambattista Vicos De universi juris uno principio, et fine uno im Kontext der europäischen Naturrechtstradition und Vicos Bedeutung für die heutige Debatte", Halle, 6–7 May 2015
• Matthias Kaufmann, GIAMBATTISTA VICOS UMGANG MIT DEM BEGRIFF DES DOMINIUM
• Antonino Falduto, VICO, DAS NATURRECHT, UND DER BEGRIFF OBLIGATIO
• Giuseppe D'Anna, "GENESE" UND "KOMPOSITION" IN VICOS DE UNO
• Claudia Megale, IL "DONO DEL LIBERO ARBITRIO" NELLA TERZA ORAZIONE INAUGURALE: ANALOGIE E DIFFERENZE CON JEAN BODIN
• Julia V. Ivanova – Pavel V. Sokolov, PHYSICA INGENIOSA AND ABYSSINIAN PHILOSOPHY: THE AMBIVALENCE OF THE CARTESIAN PHYSICS IN DE RATIONE, V
• Sertório de Amorim e Silva Neto, LA "NATURA CORROTTA" TRA ANTICHI E MODERNI
• Giulio Gisondi, VICO E IL PROBLEMA DEL METODO TRA SPERIMENTALISMO E RETORICA
• Romana Bassi, IL DE UNO ALLA LUCE DELL'EXEMPLUM TRACTATUS DE IUSTITIA UNIVERSALI, SIVE DE FONTIBUS IURIS DI FRANCIS BACON
• Fabrizio Lomonaco, APPUNTI SUL "DIRITTO UNIVERSALE DELLE GENTI" NEL DE UNO
• Dominik Recknagel, VICOS VINDICIAE. EINE AUSEINANDERSETZUNG UM DEN NATURRECHTSBEGRIFF
• Stefania Sini, "LETIZIA CATTOLICA, ANTICO-EUROPEA"?
QUALCHE OSSERVAZIONE SULLA "LETTERATURA" SECONDO VICO TRA REPUBBLICA DELLE LETTERE E MONDO DELLE NAZIONI
• Giuseppe Cacciatore, IL CONCETTO DI CITTADINANZA IN VICO COME MANIFESTAZIONE DEL NESSO TRA UNIVERSALITÀ DELLA LEGGE E STORICITÀ EMPIRICA DELLA CIVITAS

Essays
• Agata Anna Chrzanowska, GHIRLANDAIO, FICINO AND HERMES TRISMEGISTUS: THE PRISCA THEOLOGIA IN THE TORNABUONI FRESCOES
• David Armando, SUPPLIQUES DES VASSAUX, POUVOIRS DU BARON (ÉTATS PONTIFICAUX, XVIIIème SIÈCLE)
• Patrizia Delpiano, ACADÉMIES ET CRÉATION DU SAVOIR SCIENTIFIQUE: CIRCULATION DES IDÉES ET MÉCANISMES DE LA CENSURE
• Geri Cerchiai, CINQUE SCRITTI METODOLOGICI DI EUGENIO COLORNI NELLE CARTE DI VITTORIO SOMENZI
• Fabio D. Palumbo, IL "GIAPPONISMO" DI DELEUZE

Tools
• Alessandro Stile, TRA CINEMA E FILOSOFIA. ESPERIENZE DIDATTICHE IN UN ISTITUTO DI RICERCA DEL CNR
• Roberto Mazzola – Ruggero Cerino, BIBLIOTECA NAPOLETANA DIGITALE (SEC. XVIII)

work_7ce4zfpya5fz3hcextxq2pwwke ----

2018-03-06
Linköpings universitet, Institutionen för kultur och kommunikation
Foundational course in the doctoral programme of the Graduate School Language and Culture in Europe (Forskarskolan Språk och kultur i Europa)

PHILOSOPHY OF SCIENCE: LINGUISTICS AND LITERATURE, 7.5 credits

The course covers fundamental philosophy of science with an orientation towards literary studies and linguistics. Through a discussion of relevant questions and concept formations within the philosophy of science, the course aims to deepen the understanding of the position of linguistics and literary studies within the humanities, and to examine critically the division, inherited from the history of science, between literary and other language. The course highlights how linguistic processes and actions form the focus of both disciplines, and how a number of similar questions are treated on both sides, often with identical vocabularies.

The course addresses and fulfils the following degree outcomes:

Knowledge and understanding
1. demonstrate broad knowledge of and systematic understanding of the research field, together with deep and up-to-date specialist knowledge of a delimited part of that field, and
2. demonstrate familiarity with scholarly method in general and with the methods of the specific research field in particular.

Competence and skills
3. demonstrate the capacity for scholarly analysis and synthesis, and for independent critical examination and assessment of new and complex phenomena, questions and situations;
4. demonstrate the ability to identify and formulate questions critically, independently, creatively and with scholarly rigour, to plan and, using adequate methods, conduct research and other qualified tasks within given time frames, and to review and evaluate such work;
7. demonstrate the ability to identify the need for further knowledge.

Judgement and approach
9. demonstrate intellectual independence and scholarly integrity, together with the ability to make assessments concerning research ethics, and
10. demonstrate deepened insight into the possibilities and limitations of science, its role in society, and people's responsibility for how it is used.

Structure and examination: five sessions of four hours each, plus six hours on the final occasion. The course is built around introductory lectures and discussions of the questions and concepts treated in the course literature.

Examination takes the form of a written assignment of about eight pages, in which at least one fundamental question in the philosophy of science is discussed and problematized in relation to the student's own dissertation project.
An oral presentation of this assignment is given at the final session, and the comments from the opponent and the seminar are to be incorporated into the version submitted to Carin Franzén by 7 January at the latest.

Teachers: Carin Franzén, Leelo Keevallik, Angelika Linke, Jesper Olsson
Registration: carin.franzen@liu.se
Venue: ESA conference room (4331)

4 October, 10–15
1. Introduction
Course presentation. What is science and scholarship; what are the humanities and the social sciences? Where do linguistics and literary studies stand in relation to the other sciences? What data, materials and sources do we use?
Literature:
Benveniste, Émile, "Subjectivity in Language", in Critical Theory since 1965, ed. Hazard Adams & Leroy Searle (Gainesville 1985), 728–732; http://faculty.georgetown.edu/irvinem/theory/Benveniste-Linguistic-Sign.pdf; also: "Om subjektiviteten i språket", in Människan i språket. Texter i urval av John Swedenmark (Stockholm/Stehag 1995), 37–48.
Hayles, N. Katherine, "The Digital Humanities: Engaging the Issues", in How We Think. Digital Media and Contemporary Technogenesis (Chicago 2012), 23–54.
Givón, Talmy [Thomas], "Syntax. An Introduction", Volume I (Amsterdam/Philadelphia 2001), 1–26.
Liu, Alan, "Where is Cultural Criticism in the Digital Humanities?", in Debates in the Digital Humanities (New York 2012), http://dhdebates.gc.cuny.edu/debates/text/20
Nussbaum, Martha C., Not for Profit: Why Democracy Needs the Humanities (Princeton 2010).
Sapir, Edward, "Communication", in Selected Writings of Edward Sapir in Language, Culture, and Personality, ed. D. G. Mandelbaum (Berkeley 1949 [1931]), 104–109, https://archive.org/download/selectedwritings00sapi/selectedwritings00sapi.pdf

18 October, 10–15
2. For and against interpretation
Discussion seminar on the question: what are meaning, explanation and understanding?
Literature:
Barthes, Roland, "Semantics of the Object", in The Semiotic Challenge (Oxford 1988), http://people.su.se/~snce/texter/Barthes_Object.pdf
Grice, Herbert Paul, "Logic and Conversation", in Syntax and Semantics, vol. 3, ed. P. Cole and J. Morgan (1975), 45–47, http://www.sfu.ca/~jeffpell/Cogs300/GriceLogicConvers75.pdf
Gumbrecht, Hans-Ulrich, "A Farewell to Interpretation", in Materialities of Communication (Stanford 1994), 389–402.
Humboldt, Wilhelm von, "On the task of the historian", in The Hermeneutics Reader, ed. Kurt Mueller-Vollmer (New York 1985), 105–119, https://www2.southeastern.edu/Academics/Faculty/jbell/humboldt.pdf
Mondada, Lorenza, "Understanding as an embodied, situated and sequential achievement in interaction", Journal of Pragmatics 43 (2011), 542–552, DOI: 10.1016/j.pragma.2010.08.019
Ricoeur, Paul, "What is a Text? Explanation and Understanding", in Hermeneutics and the Human Sciences (Cambridge 1981), 145–164.
Sontag, Susan, "Against Interpretation", in Against Interpretation and Other Essays (New York 2001 [1964]), http://www.coldbacon.com/writing/sontag-againstinterpretation.html
Wittgenstein, Ludwig, Philosophical Investigations, trans. G.E.M. Anscombe (Oxford 1958), in selection.

8 November, 10–15
3. Ontological questions
Text seminar on the question of what it means for something to be "real" or "constructed", and on the problem of relativism and objectivity with regard to meaning and knowledge.
Literature:
Berger, P. L. and T. Luckmann, "The foundations of knowledge in everyday life", in The Social Construction of Reality: A Treatise in the Sociology of Knowledge (New York 1966), 31–62, http://perflensburg.se/Berger%20social-construction-of-reality.pdf
Butler, Judith, "Performative Acts and Gender Constitution: An Essay in Phenomenology and Feminist Theory", Theatre Journal 40.4 (1988), 519–531, http://people.su.se/%7Esnce/texter/butlerPerformance.pdf
Foucault, Michel, The Order of Things: An Archaeology of the Human Sciences (Routledge 2002).
Heidegger, Martin, "The Question Concerning Technology", in Basic Writings (Routledge 2011).
Latour, Bruno, "Fourth Source of Uncertainty: Matters of Fact vs. Matters of Concern", in Reassembling the Social (Oxford 2005), 87–120 (e-book).
Linell, Per, Rethinking Language, Mind, and World Dialogically (Charlotte, NC 2009), 11–33.
Sacks, Harvey, "On doing being ordinary", in Structures of Social Action, ed. John M. Atkinson and John Heritage (Cambridge & New York 1984), 413–429.

6 December, 10–15
4. Linguistics and literary studies in practice
Questions in the philosophy of science arising from the researchers' own practice. Examples from literary studies and linguistics.
Literature: to be distributed at the seminar.

7 December, 10–17
5. Presentation and discussion of the examination assignments (deadline 30 November, to the opponent)
Conclusion with dinner.

work_7dzwisbdcngbfiouw3fschciae ----

InPhO for All: Why APIs Matter
Jaimie Murdock, Indiana University
Colin Allen, Indiana University

Abstract

The unique convergence of humanities scholars, computer scientists, librarians, and information scientists in digital humanities projects highlights the collaborative opportunities such research entails. Unfortunately, the relatively limited human resources committed to many digital humanities projects have led to unwieldy initial implementations and underutilization of semantic web technology, creating a sea of isolated projects with data that cannot be integrated. Furthermore, the use of standards for one particular purpose may not suit other kinds of scholarly activities, impeding collaboration in the digital humanities. By designing and utilizing an Application Programming Interface (API), projects can reduce these barriers, while simultaneously reducing internal support costs and easing the transition to new development teams. Our experience developing an API for the Indiana Philosophy Ontology (InPhO) Project highlights these benefits.

Introduction

The unique convergence of humanities scholars, computer scientists, librarians, and information scientists in digital humanities projects highlights the collaborative opportunities such research entails. The digital humanities aspire to create, maintain, and deploy high-integrity metadata, derived from the activities and feedback of domain experts in the humanities, to support scholarly activities that meet the high standards of academic peer review.
Unfortunately, the relatively limited human resources committed to many digital projects for the humanities have led to unwieldy initial implementations and underutilization of semantic web technology, with the result that most projects in this burgeoning field are standalone projects whose data cannot easily be integrated with others. In addition to the barriers arising from idiosyncratic implementations, the difficulties of integrating data from multiple sources are compounded by the use of standards that serve one particular purpose well but do not facilitate other kinds of scholarly activities, often making the combination of resources from different projects laborious and expensive. Thus, much of the potential for collaboration in the digital humanities still remains to be unlocked.

Even humanities scholars who are not programmers should care about the ad hoc nature of application integration, because so much of their time involves laboriously transferring what they learn in one digital context to what they do in another. For example, the Stanford Encyclopedia of Philosophy (SEP)[1] and PhilPapers[2] are the two most widely used online resources for philosophers. But if a PhilPapers user wishes to know which SEP entries cite an item listed in the PhilPapers bibliography (or elsewhere online), the citation's information must be manually copied and pasted from PhilPapers into the SEP search engine in order to perform the search. In the other direction, PhilPapers now provides a service to the SEP whereby a link in each SEP entry leads to a page at PhilPapers showing the items from the entry's bibliography as represented in PhilPapers. However, the idiosyncratic formats of both the SEP and PhilPapers mean that there is no corresponding service in the other direction, that there is only a partial correspondence between items in the SEP bibliography and PhilPapers, and that this special-purpose software cannot easily be redeployed by other developers of online resources for philosophers. Without easy access to preferred representations, the social and semantic web cannot be quickly adapted to the needs of researchers in the humanities.[3, 4] And while the needs and goals of librarians have been important drivers of standards in the digital humanities, this represents just one aspect of the potential of the digital humanities to facilitate scholarly research. Humanities scholars need access to the data in many different representational formats: from HTML for the ordinary end user, to fully integrated XML specifications and raw data dumps for the information scientist, to lightweight JSON stores for the web programmer.

The Indiana Philosophy Ontology Project

The Indiana Philosophy Ontology Project (InPhO)[5] aims to overcome barriers to broader collaboration by providing a simple, lightweight API (application programming interface) capable of serving a wide variety of data formats.

[1] http://plato.stanford.edu
[2] http://philpapers.org
APIs allow programmers to focus on the 'what' of computing rather than the 'how'. So, for instance, it is an API that allows programmers to tell the computer's operating system to respond to a mouse click by opening a 'window' on the screen, without the programmer having to worry about the graphics needed to produce a rectangle of a certain size, border, color, and so on. Similarly, programmers can exploit databases on another server through an API without having to know what the underlying database model is on the remote server. APIs give power to programmers by allowing them to stand on the shoulders of others. At the InPhO project, we have a vision of seamless integration among digital philosophy applications, and our API is a deliberate first step towards realizing that vision.

The InPhO is a dynamic computational ontology which models philosophy using statistical methods applied to the entire SEP corpus,[6] as well as machine reasoning methods applied to feedback from experts in the field, particularly the editors and authors of SEP entries. Our approach[7] begins with a small amount of manual ontology construction and the development of an initial philosophical lexicon through collaboration with domain experts. We then build on this foundation through an iterative three-step process to create a taxonomic representation of philosophy. First, statistical inference over the SEP is used to generate hypotheses about the relations among various topics, including the relative generality of pairs of terms.[8] These hypotheses are then evaluated by domain experts through simple questions that do not require any knowledge of ontology design on the part of the experts.

[3] Leonard Richardson and Sam Ruby, RESTful Web Services (O'Reilly Media, Inc., 2007).
[4] Sinuhé Arroyo, Rubén Lara, Juan Miguel Gómez, David Berka, Ying Ding, and Dieter Fensel, "Semantic aspects of web services," in Practical Handbook of Internet Computing, ed. Munindar P. Singh (Baton Rouge: Chapman Hall and CRC Press, 2004), pages 31-1 to 31-17.
[5] http://inpho.cogs.indiana.edu
[6] The SEP contains over 1,200 entries comprising more than 14 million words, maintained by over 120 volunteer subject editors, and accessed through more than 700,000 entry downloads per week.
[7] Detailed in Cameron Buckner, Mathias Niepert, and Colin Allen, "From encyclopedia to ontology: Toward a dynamic representation of the discipline of philosophy," Synthese (2010), special issue on Representing Philosophy, in press, http://dx.doi.org/10.1007/s11229-009-9659-9.
Finally, the expert responses are combined with the statistical measures as a knowledge base for a machine reasoning program, which uses answer set programming to output a taxonomic view of the discipline that synthesizes the sometimes inconsistent data obtained by querying experts.[9] This resulting representation can then be used to generate tools that assist the authors, editors, and browsers of the SEP, such as a cross-reference suggestion engine, access to bibliographic content, context-aware semantic search, and interfaces for exploring the relations among concepts, among philosophical thinkers, and between concepts and thinkers. InPhO does not assume that a single, correct view of the discipline is possible, but rather takes the pragmatic approach that some representation is better than no representation at all.[10] Even if other projects do not agree with our final taxonomic projections, our statistical data and expert evaluations may still be useful. By exposing our data through the API at all three steps of the process outlined above, we encourage other projects to discover alternative ways to construct meaningful and useful representations of the discipline. Furthermore, by exposing our data in this way, others may try alternative methods for generating representations of the discipline.

Design Considerations

The use of APIs by other projects requires our accountability and necessitates permanent availability. The high cost of redesign under these conditions implies that we have one chance to get the access layer right.[11, 12] To do this, we used one of the most venerable and pervasive technologies, the hypertext transfer protocol (HTTP)[13] that is the foundation of the World Wide Web, to enable ease of use by scholars, programmers, and scientists through nearly any interface. Each entity in the InPhO knowledge base is exposed as a resource with a unique Uniform Resource Identifier (URI), which is accessed using the HTTP methods, providing a consistent interface for data retrieval and manipulation. This is known as the REpresentational State Transfer (REST) paradigm of web services, pioneered by HTTP inventor Roy Fielding.[14] The InPhO data can be explored via human-friendly HTML or in machine-friendly JSON, selected by simply adding either .html or .json to the URI of each resource.

[8] Mathias Niepert, Cameron Buckner, and Colin Allen, "A dynamic ontology for a dynamic reference work," in Proceedings of the 7th ACM/IEEE Joint Conference on Digital Libraries (2007), 288–297.
[9] Mathias Niepert, Cameron Buckner, and Colin Allen, "Answer set programming on expert feedback to populate and extend dynamic ontologies," in Proceedings of the 21st International FLAIRS Conference (Coconut Grove, Florida: AAAI Press, 2008), 500–505.
[10] Buckner et al., "From encyclopedia to ontology."
[11] Joshua Bloch, "How to design a good API and why it matters," in OOPSLA '06: Companion to the 21st ACM SIGPLAN Symposium on Object-Oriented Programming Systems, Languages, and Applications (New York: ACM, 2006), 506–507.
[12] Toby Segaran, Colin Evans, and Jamie Taylor, Programming the Semantic Web (O'Reilly Media, Inc., 2009).
[13] Roy Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and Tim Berners-Lee, "Hypertext transfer protocol - HTTP/1.1," RFC 2616, Network Working Group, June 1999, accessed February 27, 2010, http://www.w3.org/Protocols/rfc2616/rfc2616.html.
[14] Roy Fielding, Architectural Styles and the Design of Network-based Software Architectures, PhD thesis, University of California, Irvine (2000).
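What this buys a client can be sketched in a few lines of Python; the resource path below is purely illustrative and is not a documented InPhO endpoint:

import json
from urllib.request import urlopen

def fetch_json(resource_uri):
    # Request the machine-friendly representation of a resource simply by
    # appending the .json suffix; the bare URI would serve HTML instead.
    with urlopen(resource_uri + ".json") as response:
        return json.load(response)

# Hypothetical usage:
# data = fetch_json("https://inpho.cogs.indiana.edu/idea/1234")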
This approach has many advantages over previous attempts at integration. For example, data dumps provided using industry-standard Web Ontology Language files (idiosyncratically referred to as OWL files) reveal only certain types of relations and do not allow for read-write access to the underlying knowledge base. While OWL remains an important format for the exchange of data, to limit oneself to that format would place significant limits on collaborative efforts, such as InPhO's partnership with Noesis[15] to power their domain-specific search engine. The Noesis project currently has no need to receive InPhO's entire ontology file when seeking to query specific pieces of information about a journal, a philosophical concept, or a thinker from the InPhO database. Instead, Noesis programmers may use InPhO's RESTful API to select only those entities and partitions of the InPhO which are relevant to the current query. Other projects are likely to have similar requirements: a project tracing the history of a specific philosopher could initially pull selected data only from the thinker database, but could easily branch out to other portions of the database as connections between that thinker and specific concepts become relevant to an end user's online searching and browsing behavior. At the same time, this approach to data sharing protects data providers from the overexposure that may result from making large data dumps available to all comers, while easing the processing load for data consumers who might otherwise have to parse masses of unwanted data.

The design of the API also satisfies Crane's rubric for digital humanities infrastructure:[16] 1) By providing a unique URI, we have created canonical named entities for each concept within the ontology. These entities are aliased within our knowledge base with alternative spellings or abbreviations, increasing the likelihood of identifying objects correctly. This technology is being used by the Noesis project's journal search. 2) Our machine learning and data mining techniques create a co-occurrence graph which is exposed through the API as a dynamic cataloging service for philosophical concepts. 3) Structured user contributions are invited through secure write access to improve the quality of the knowledge base. Evaluations will be solicited throughout the SEP editorial process. 4) These are then used to provide custom, personalized data and tools for researchers, such as the SEP cross-reference engine. The design also satisfies the computer science community, by providing a concrete example of a semantic web portal as envisioned by Stollberg et al.[17]

Benefits

Our experience shows that the development of an API is not just an exercise in enhancing collaboration with other projects, but can also alleviate internal management concerns about sustainability and efficiency. Due to the nature of academia, turnover happens regularly on a three-to-five-year cycle as the students working as programmers and researchers on the project progress from matriculation to graduation. New project members must be quickly integrated with our development process and fluent in our existing code base.

[15] http://noesis.evansville.edu
[16] Crane et al., "The humanities in a global e-infrastructure."
[17] Michael Stollberg, Holger Lausen, Rubén Lara, Ying Ding, Sung-Kook Han, and Dieter Fensel, "Towards semantic web portals," in Proceedings of the WWW2004 Workshop on Application Design, Development and Implementation Issues in the Semantic Web, eds. Christoph Bussler, Stefan Decker, Daniel Schwabe, and Oscar Pastor (New York: 2004).
The design also satisfies the computer science community, by providing a concrete example of a semantic web portal, as envisioned by Stollberg et al.17

Benefits

Our experience shows the development of an API is not just an exercise in enhancing collaboration with other projects, but can alleviate internal management concerns about sustainability and efficiency. Due to the nature of academia, turnover happens regularly on a three to five year cycle as students working as programmers and researchers on the project progress from matriculation to graduation. New project members must be quickly integrated with our development process and fluent in our existing code base. Our initial architecture consisted of a decentralized, uncoupled multitude of quick scripts and interfaces, driven by the necessity of having a proof of concept. This led to difficulty in turnover, and highlighted a need for maintainable, documented code. Additionally, this loosely coupled architecture was resistant to scalability. Many of our scripts required a sequence of coupled events and were often executed by hand. As evaluations continued to trickle in, parts of our database became inconsistent, leading to integrity issues and requiring manual cleanup during the ontology extension process. With all data access occurring at a single point, IT demands were reduced, as maintenance of SQL data connections and secure data access tunnels was replaced with the maintenance of the website. By porting our internal tools to use the same API calls, we can use our internal code as public examples.

15 http://noesis.evansville.edu
16 Crane et al., "The humanities in a global e-infrastructure."
17 Michael Stollberg, Holger Lausen, Rubén Lara, Ying Ding, Sung-Kook Han, and Dieter Fensel, "Towards semantic web portals," in Proceedings of the WWW2004 workshop on Application Design, Development and Implementation Issues in the Semantic Web, eds. Christoph Bussler, Stefan Decker, Daniel Schwabe, and Oscar Pastor (New York: 2004).

Conclusions

While there exist other APIs for humanities computing, these have usually been developed by groups seeking to provide easy access to large cultural collections such as those held by libraries and museums. To our knowledge, we are the first project to have developed an API for access to information about the dynamically changing concepts, people, and institutions defining an academic discipline, and to create a mechanism for partner projects to contribute to our database, bridging the gap between the social and semantic web. We are certainly the first to do this for the field of philosophy.
The lessons learned in carrying out this project will, we hope, encourage other scholarly communities to pursue similar projects to make the conceptual structure and human capital of their field readily accessible for applications that have not yet been dreamt of, and will enable such projects to avoid some of the early problems with design that arose from an application-centric view of the web, as opposed to the service-oriented semantic web.

Bibliography

Arroyo, Sinuhé, Rubén Lara, Juan Miguel Gómez, David Berka, Ying Ding, and Dieter Fensel. "Semantic aspects of web services." In Practical Handbook of Internet Computing, edited by Munindar P. Singh, 31-1–31-17. Baton Rouge: Chapman Hall and CRC Press, 2004.
Bloch, Joshua. "How to design a good API and why it matters." In OOPSLA '06: Companion to the 21st ACM SIGPLAN symposium on Object-oriented programming systems, languages, and applications, 506–507. New York: ACM, 2006.
Buckner, Cameron, Mathias Niepert, and Colin Allen. "From encyclopedia to ontology: Toward a dynamic representation of the discipline of philosophy." Synthese (2010). Special issue on Representing Philosophy, in press. http://dx.doi.org/10.1007/s11229-009-9659-9.
Crane, Gregory, Brian Fuchs, and Dolores Iorizzo. "The humanities in a global e-infrastructure: a web-services shopping-list." UK e-Science All Hands Meeting (2007).
Fielding, Roy. Architectural Styles and the Design of Network-based Software Architectures. PhD thesis, University of California, Irvine (2000).
Fielding, Roy, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and Tim Berners-Lee. "Hypertext transfer protocol - HTTP/1.1, RFC 2616," Network Working Group, June 1999. Accessed February 27, 2010. http://www.w3.org/Protocols/rfc2616/rfc2616.html.
Niepert, Mathias, Cameron Buckner, and Colin Allen. "A dynamic ontology for a dynamic reference work." In Proceedings of the 7th ACM/IEEE Joint Conference on Digital Libraries (2007): 288–297.
Niepert, Mathias, Cameron Buckner, and Colin Allen. "Answer set programming on expert feedback to populate and extend dynamic ontologies." In Proceedings of the 21st International FLAIRS Conference, 500–505. Coconut Grove, Florida: AAAI Press, 2008.
Richardson, Leonard and Sam Ruby. RESTful Web Services. O'Reilly Media, Inc.: 2007.
Segaran, Toby, Colin Evans, and Jamie Taylor. Programming the Semantic Web. O'Reilly Media, Inc.: 2009.
Stollberg, Michael, Holger Lausen, Rubén Lara, Ying Ding, Sung-Kook Han, and Dieter Fensel. "Towards semantic web portals." In Proceedings of the WWW2004 workshop on Application Design, Development and Implementation Issues in the Semantic Web, edited by Christoph Bussler, Stefan Decker, Daniel Schwabe, and Oscar Pastor. New York: 2004.
work_7ebterjnvnahxgdvqdngzjp3de ---- [PDF] Analyzing and visualizing ancient Maya hieroglyphics using shape: From computer vision to Digital Humanities | Semantic Scholar

DOI:10.1093/llc/fqx028 Corpus ID: 3141626

Analyzing and visualizing ancient Maya hieroglyphics using shape: From computer vision to Digital Humanities
@article{Hu2017AnalyzingAV, title={Analyzing and visualizing ancient Maya hieroglyphics using shape: From computer vision to Digital Humanities}, author={R. Hu and C. Pallan and J. Odobez and D. Gatica-Perez}, journal={Digit. Scholarsh. Humanit.}, year={2017}, volume={32}, pages={ii179-ii194} }
R. Hu, C. Pallan, J. Odobez, D. Gatica-Perez. Published 2017. Art, Computer Science. Digit. Scholarsh. Humanit.

Maya hieroglyphic analysis requires epigraphers to spend a significant amount of time browsing existing catalogs to identify individual glyphs. Automatic Maya glyph analysis provides an efficient way to assist scholars' daily work. We introduce the Histogram of Orientation Shape Context (HOOSC) shape descriptor to the Digital Humanities community. We discuss key issues for practitioners and study the effect that certain parameters have on the performance of the descriptor. Different HOOSC… (View via Publisher: academic.oup.com)

Topics: Glyph; Autodesk Maya; Digital humanities; Computer vision; Directed graph; Information visualization; Force-directed graph drawing; Prototype

2 Citations: Improved Hieroglyph Representation for Image Retrieval (L. A. Pinilla-Buitrago, J. A. Carrasco-Ochoa, J. Martínez-Trinidad, E. Román-Rangel, JOCCH 2019); A Probe into Patentometrics in Digital Humanities (G. Hao, F. Ye, Libr. Trends 2020).

References (showing 1–10 of 21): Analyzing Ancient Maya Glyph Collections with Contextual Shape Descriptors (E. Román-Rangel, C. Pallan, J. Odobez, D. Gatica-Perez, International Journal of Computer Vision, 2010); Assessing a Shape Descriptor for Analysis of Mesoamerican Hieroglyphics: A View Towards Practice in Digital Humanities (R. Hu, J. Odobez, D. Gatica-Perez, DH 2016);
Multimedia Analysis and Access of Ancient Maya Epigraphy: Tools to support scholars on Maya hieroglyphics (R. Hu, Gulcan Can, et al., IEEE Signal Processing Magazine, 2015); Reading Maya Art: A Hieroglyphic Guide to Ancient Maya Painting and Sculpture (Andrea J. Stone, M. Zender, 2011); Statistical Shape Descriptors for Ancient Maya Hieroglyphs Analysis (E. Rangel, 2013); Automatic Egyptian hieroglyph recognition by retrieving images as texts (Morris Franken, J. V. Gemert, MM '13, 2013); A catalog of the Maya hieroglyphs (J. E. Thompson, 1962); Sketch-based shape retrieval (M. Eitz, Ronald Richter, T. Boubekeur, K. Hildebrand, M. Alexa, ACM Trans. Graph., 2012); Learning hatching for pen-and-ink illustration of surfaces (E. Kalogerakis, Derek Nowrouzezahrai, Simon Breslav, Aaron Hertzmann, 2012); Shape Matching and Object Recognition (A. Berg, Jitendra Malik, 2006).

work_7hx5xm4t6fh5vdizdql4poiudy ---- Open Data Empowerment of Digital Humanities by Wikipedia/DBpedia Gamification and Crowd Curation – WiQiZi's Challenges with APIs and SPARQL

This is an extended version of the article found at https://dh2018.adho.org/revitalizing-wikipedia-dbpedia-open-data-by-gamification-sparql-and-api-experiment-for-edutainment-in-digital-humanities/

Go Sugimoto, ACDH-ÖAW, Vienna, Austria, go.sugimoto@oeaw.ac.at

Abstract

Digital Humanities (DH) enjoys a wealth of Open Data published by cultural heritage institutions and academic researchers. In particular, Linked Open Data (LOD) offers an excellent opportunity to publish, share, and connect a broad array of structured data in the distributed web ecosystem. However, a real breakthrough in humanities research, as well as its societal impact, has not yet been visible, due to several challenging obstacles including lack of awareness, expertise, technology, and data quality. In order to remove such barriers, this article outlines an experimental case study of Application Programming Interfaces (APIs) and SPARQL. The WiQiZi project employs a gamification technique to develop a simple quiz application in which players guess the age of a person in a randomly selected image from Wikipedia/DBpedia.
The project demonstrates the potential of gamification of Open Data, not only as edutainment for the public, but also as an inspirational source for DH research. In addition, a face detection API based on Artificial Intelligence is included as a hint function, which would increase both public and academic interest in new technology for DH. Moreover, the project provides a possibility for crowd data curation, in which users are encouraged to check and improve the data quality when the application fails to calculate the answer. This method seems to create a win-win scenario for the Wikipedia/DBpedia community, the public, and academia.

Keywords: Digital Humanities, Application Programming Interfaces, Linked Open Data, SPARQL, gamification, crowd sourcing, data curation

1 Introduction

The time is ripe for Open Data. As new technology becomes available and expertise spreads across communities, governments and research entities are particularly keen to promote Open Data to meet the demands of democracy in the 21st century and ensure the transparency of research activities. In particular, Berners-Lee's (2009) five star Open Data proposition has started to take off. In his vision, Open Data is closely associated with Linked Data, which connects related data on the web with hyperlinks. He combines the two concepts and names it Linked Open Data (LOD). While the best practice of Linked Data is summarized by Heath (2018), he defines LOD as Linked Data that 'is released under an open licence, which does not impede its reuse for free'.

Following those initiatives, the global community has started to flourish. Over the last years, LOD has been gaining momentum in digital humanities (DH) and cultural heritage research. Many repeatedly explain the essence of LOD, including the tripod of supporting technology: HTTP, URIs, and RDF1 (for example, Simou et al., 2017; Marden et al., 2013; Boer et al., 2016).

In fact, RDF has been adopted more frequently as a data format in data repository systems such as Fedora2. Important datasets including Europeana3, VIAF4, GeoNames5, and the Getty vocabularies6 are being published with HTTP URIs in machine-readable formats such as RDF, Turtle7, and JSON-LD8. SPARQL9 endpoints have been progressively created in many cultural heritage organizations and DH projects (Edelstein et al., 2013). SPARQL allows users to query a large volume of RDF graph datasets, so that semantically rich data fragments can be transformed, by selecting, merging, splitting, and filtering, into new information and knowledge. 'The Promised Land' in the digital research era seems to be just around the corner.

However, research outcomes of LOD that would have a significant impact on new discoveries and/or innovation in society are still outstanding. Although LOD is meant to offer a powerful paradigm for global data integration, most probably reinforcing interdisciplinary research, many cases in DH are reported for the creation and publication of LOD and/or the internal use of LOD (Marden et al., 2013). Although there are several DH projects concerning the use of external LOD (e.g. Boer et al., 2016), they often focus on data enrichment.
In addition, SPARQL query exploitation is rather limited within small technology-savvy communities (Lincoln, 2017; Alexiev, 2017). There could be several reasons for the underuse of LOD: a) lack of awareness of its existence, b) lack of knowledge and skills to use RDF and SPARQL, c) opened data being too narrow in scope, d) lack of computing performance to be usable, and e) interdisciplinary research not being widely exercised.

In a more general framework of Open Data, the situation is much better for XML10 and JSON11, because they are, in general, less complex than RDF and SPARQL. As such, they are more broadly accepted as standard formats of Application Programming Interfaces12 (APIs). However, Sugimoto (2017a and 2017c) is still concerned about the technical hurdles for the majority of data consumers, as well as the need for API standardization and ease of data reuse for ordinary users. In another context, the underuse of data, tools, and infrastructures seems to be a common phenomenon in DH. For example, the use of one of the most prominent services of a European language infrastructure, the Virtual Language Observatory13 of CLARIN14, is rather low and below expectation (Sugimoto, 2017b).

Those realities seem to indicate that research is not yet taking full advantage of Open Data, especially LOD, although a large amount of data has become available. It is a pity that the benefit of Open Data is only partially spread. To this end, this article attempts to stimulate the use of LOD within DH. The author has experimented with Wikipedia15/DBpedia16 to explore the potential use of, and/or the revitalization of, (Linked) Open Data in and outside the research community.

2 Gamification for Wikipedia/DBpedia (Linked) Open Data

2.1 Simple Quiz Application

The choice of Wikipedia, and its structured database version, DBpedia, is rationalized by taking into account the above-mentioned issues of Open Data reuse for APIs and SPARQL endpoints. Contrary to most DH and cultural heritage targeted projects, Wikipedia/DBpedia provides a much broader scope for data-driven research, meaning there would be more familiarity and reusability of the data among the users. This also solves the problem of datasets in DH being too specific to be used by third-party researchers, or the researchers not knowing how to use the data and/or what to do with them (Edmond and Garnett, 2014; Orgel et al., 2015). In addition, interdisciplinary research could be more easily adopted, using a more comprehensive yet relatively detailed level of knowledge, compared to DH-branded research topics.

This paper would also serve as an example of the simple application of APIs and SPARQL for less technical researchers within the DH community, due to the background of this project. The project is conducted solely by the author, who has developed all the code with the assistance of a colleague, albeit being a programming beginner. This setting displays an encouragement not only for researchers with less technical experience to try LOD-based research, but also for the LOD community to gain more like-minded supporters.

The project is not limited to pure research use of data. In connection with the evolution from Open Data to Open Science (FOSTER consortium, n.d.), public interest and (ideally) engagement are just as important as the innovation potential of research itself. In this respect, the keyword of the project is gamification.
In order to draw public attention and to showcase a social benefit of Open Data and DH, gamification would be a catalyst to connect the scholars conducting complicated DH research and the increasingly greedy knowledge consumers among ordinary citizens. Kelly and Bowan (2014) state that limited attention had been paid to digital games until recently, although this is changing rapidly. Exceptions include the recent projects of art history games reviewed by Hacker (2015). However, the intensive use of Open Data via APIs and SPARQL endpoints is still not prominent. Although there already are a few sophisticated projects, such as the EU-funded project Cross Cult, which uses elaborate semantic technologies (Daif et al., 2017), this article is able to contribute to this discourse from a web innovation perspective in a more simplified DIY project environment.

The primary outcome of the project is WiQiZi17, a simple quiz application based purely on external Open Data APIs and SPARQL. In a nutshell, it requires users to guess the age of a randomly selected person from Wikipedia by looking at a portrait of the person. The game starts with the selection of a year in order to specify the time of the target person (Fig. 1). Ten random years between 1700 and 2002 are generated and presented to the users. It is recommended to pick one of them, because this year range is more likely to find a person from the pool of available people. The users can also type a specific year of their choice. The year is used as the birth year of the person. When an image of a person is loaded, the users can start guessing the age of the depicted person, also using the description of the person as a clue (Fig. 2 and Fig. 3). As such, WiQiZi represents an interplay of Wikipedia, QuiZ and Information, delivering elements of entertainment, education, and research for a potentially wide range of audiences.

Apparently, the age of a person in a particular image is provided neither by Wikipedia nor by DBpedia. It is, in fact, calculated programmatically by comparing the birthdate and the creation date of the image (a minimal sketch of this calculation follows below). Although it is a simple algorithm, the quiz is generated automatically. It goes without saying that this approach does not guarantee the correct answer. For example, an image may be created after the death of a person. If the image is a photograph, it is likely to be more accurate. Thus, this game merely provides the best guess based on available facts. Nevertheless, it is good enough for edutainment, because the main purpose of the application is to stimulate the users' interest. In addition, it only takes years into account, but not months or days. On the positive side, the application can therefore let users play the game even if either (or both) of the months and days are missing (see also Section 3 for data quality issues).

The random selection of data is sometimes costly in terms of data processing, but it was applied for both year and image in the application. Randomization is, in fact, the key to developing a game application, as gamers easily get bored if the game always shows the same information and situation. The application is intended for fun; thus it includes both female and male persons, and all types of contemporary persons such as politicians, sport athletes, musicians, actors, and businesspersons. Living persons are useful to increase the engagement level of the users.
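Because only years are compared, the whole heuristic reduces to a single subtraction once each date has been boiled down to a four-digit year. The following minimal PHP sketch illustrates that logic under the assumptions just described; the function and variable names are illustrative and not the project's actual code.

<?php
// Illustrative sketch of the age heuristic described above: the "age" is
// simply the difference between the image's creation year and the
// person's birth year, ignoring months and days.
function guessAge(string $birthDate, string $imageDate): ?int
{
    // Extract the first four-digit year from each field, so that partial
    // or loosely formatted dates ("1885", "1885-03", "c. 1885") still work.
    if (!preg_match('/\d{4}/', $birthDate, $b) ||
        !preg_match('/\d{4}/', $imageDate, $i)) {
        return null; // non-numeric metadata such as "16th century"
    }
    $age = (int)$i[0] - (int)$b[0];
    // A negative or implausible value signals a data quality problem
    // (e.g. a mistyped year), which the game surfaces instead of hiding.
    return ($age >= 0 && $age <= 120) ? $age : null;
}

var_dump(guessAge('1885-05-11', '1923'));   // int(38)
var_dump(guessAge('16th century', '1923')); // NULL

Returning null rather than a fabricated value mirrors the design choice discussed in Section 3: a failed calculation is surfaced as a possible data quality problem rather than silently suppressed.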
At the same time, the inclusion of historical figures is very important in DH, in that the user learns the history of a person from the past. As a result, figures range from Oliver Cromwell (political leader) and Luis Peglion (bicycle racer) to Irina Shayk (model) and Ariana Grande (singer). In this regard, the project successfully represents the richness and diversity of information which LOD can offer for history, art and culture, media studies, and the like.

The images are typically paintings, drawings, prints, and photos, but occasionally objects such as statues and coins which depict a person. It is also possible that no person is depicted in the image. For instance, the image can show a grave or an item that symbolises the person (Fig. 4). The earlier the year, the more likely it is that the image does not contain a portrait of a person. In such cases, users are required to reshuffle the image (see the yellow box in Fig. 2).

Fig. 1 Select a randomly generated year, or type a year in the text box
Fig. 2 Quiz to guess the age of a person found in a Wikipedia article
Fig. 3 The screen when submitting a wrong answer
Fig. 4 Symbol of a person in the image

When the user cannot guess the age, there is a help function. A hint section is equipped with a face detection API, suggesting the estimated age and gender of the person in the image by machine learning (Fig. 5). The confidence score of the estimation is also given as a percentage. Although the function is extremely simple, the current boom of Artificial Intelligence in our society would inspire DH and related fields in the context of APIs in combination with Open Data from Wikipedia/DBpedia. When the right answer is delivered, the application displays the links to the corresponding Wikipedia article and DBpedia dataset. This gives the users the opportunity to learn about the person in detail.

Fig. 5 Hint function for age and gender, using the face detection and machine learning API

2.2 Simple Technology, but Technical Challenges

The application is built with simple PHP18, even without using a framework such as Laravel19. Bootstrap20 is used for creating a quick web design. This is why the project would be better labelled as a DIY project. The application is entirely based on external data via APIs and a SPARQL endpoint, exploring the potential of distributed data research. It uses three different APIs of Wikipedia21 and the SPARQL endpoint of DBpedia22. The former consists of 1) the Wikipedia API to access Wikipedia articles, 2) the Wikimedia API to access images in Wikipedia articles, and 3) another Wikimedia API to access the metadata of the images (a sketch of this chain is given below). The third one is crucial in that it contains copyright and licence information. All the images of the quiz come with as much IPR information as possible, so that the application ensures the protection of copyright while promoting data reuse by clarifying the licenses.

It should also be noted that WiQiZi does not store any images on the server. It displays images directly from Wikimedia, being a lightweight software application. It is worth reiterating that the automatic generation of a quiz is neither very common nor easy, because a quiz has to provide intellectual challenges and the right level of difficulty; thus, many quizzes are hand-written. Thanks to the semantics of DBpedia, question-answer applications such as WiQiZi can be developed.
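The article does not reproduce the project's actual calls, so the following minimal PHP sketch walks the three-step chain with one plausible combination of standard MediaWiki API parameters; treat the endpoints and parameters as assumptions to verify against the MediaWiki documentation, and note that not every article image lives on Commons.

<?php
// Sketch of the three-step Wikipedia/Wikimedia API chain described above.
function apiGet(string $url): array
{
    return json_decode(file_get_contents($url), true);
}

$title = 'Ariana_Grande'; // example article

// 1) Wikipedia API: fetch the article's lead image (page image) name.
$page = apiGet('https://en.wikipedia.org/w/api.php?action=query'
    . '&prop=pageimages&piprop=name&format=json&titles=' . urlencode($title));
$imageName = array_values($page['query']['pages'])[0]['pageimage'] ?? null;

// 2) + 3) Wikimedia (Commons) API: resolve the file to a hotlinkable URL
// and fetch its metadata, which carries licence and creation-date fields.
$info = apiGet('https://commons.wikimedia.org/w/api.php?action=query'
    . '&prop=imageinfo&iiprop=url%7Cextmetadata&format=json'
    . '&titles=' . urlencode('File:' . $imageName));
$ii = array_values($info['query']['pages'])[0]['imageinfo'][0];

echo $ii['url'], "\n";                                           // image URL
echo $ii['extmetadata']['LicenseShortName']['value'] ?? '?', "\n"; // licence
echo $ii['extmetadata']['DateTimeOriginal']['value'] ?? '?', "\n"; // creation

Keeping only URLs and metadata in the application, as here, is what allows WiQiZi to display images directly from Wikimedia without storing them on its own server.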
The mix of APIs makes the application development a little tricky, because the API calls have to be made one after another, although they all serve data that originates, one way or another, from Wikipedia. Better organization of data access to those resources would increase usability for developers and users in the future. In contrast, there are also advantages to the use of APIs in a decentralized system. It allows developers to save resources (the cost of servers, storage, and maintenance) and to focus on system and data integration.

For the sake of data parsing, a SPARQL query is embedded into API query parameters and the query results are returned as JSON23. DBpedia automatically executes this transformation. For example, the following SPARQL query (whose results can be seen with the DBpedia endpoint interface on a web browser: Fig. 6):

SELECT * WHERE {
  ?person rdfs:label ?person_name ;
          rdf:type ?type ;
          dbo:birthDate ?birthdate ;
          dbo:abstract ?abstract .
  bind(rand(1 + strlen(str(?person))*0) as ?rid)
  FILTER regex(?type, "Person")
  FILTER regex(?birthdate, "1977")
} order by ?rid LIMIT 200

can be transformed into the API query parameters below (a short PHP sketch of making this call programmatically follows at the end of this passage):

http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=select+*%0D%0Awhere+%7B%3Fperson+rdfs%3Alabel+%3Fperson_name+%3B+rdf%3Atype+%3Ftype+%3B+dbo%3AbirthDate+%3Fbirthdate+%3B+dbo%3Aabstract+%3Fabstract+.%0D%0A++++bind%28rand%281+%2B+strlen%28str%28%3Fperson%29%29*0%29+as+%3Frid%29%0D%0AFILTER+regex%28%3Ftype%2C+%22Person%22%29%0D%0AFILTER+regex%28%3Fbirthdate%2C+%221977%22%29%0D%0A%7D+order+by+%3Frid%0D%0ALIMIT+200&format=json&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on&run=+Run+Query+

Fig. 6 SPARQL query results (part)

The implementation of face detection is also simple. The application posts the image of the quiz as a URL to the image analysis API of IBM Watson24. The API returns JSON data with the estimation of the age and gender of the person depicted, as well as the numeric location of the facial area. If the image is larger than the standard layout of the game interface, it should be adjusted accordingly. In that case, extra PHP coding is needed to calibrate the area of the face by using the resize ratio; a sketch of this calibration appears below.

There are a couple of technical challenges. First of all, it turns out that the DBpedia dataset is not as rich as one may expect. More precisely, if a SPARQL query is fired to access a generic dataset, for example to select data classified as 'person', there are often only a few shared RDF properties in the query results (Table 1): namely, the name of the person, the description, and the link to the Wikipedia article. Even birthdates and birthplaces (and death dates and places) may not exist, depending on the data quality.

In addition, the occupation of the person determines the availability of his/her properties. For instance, whereas football players may have properties related to club teams and national caps, politicians hold properties related to political parties and experience as ministers, etc. This generalization-specialization makes it hard to anticipate which properties are available for different persons. This is not a problem for DBpedia; however, it is a challenge for a quiz application, which has to start with a generic query in order not to preselect the DBpedia categories of persons.
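To make the query-to-API transformation concrete, here is a minimal PHP sketch that URL-encodes the article's own query and reads the JSON bindings back. The endpoint and parameters are the ones shown in the URL above; the PHP itself is illustrative rather than the project's actual code.

<?php
// Sketch of the transformation described above: the SPARQL text is
// URL-encoded into the "query" parameter of the DBpedia endpoint, which
// returns the variable bindings as JSON when format=json is requested.
$sparql = <<<'SPARQL'
SELECT * WHERE {
  ?person rdfs:label ?person_name ;
          rdf:type ?type ;
          dbo:birthDate ?birthdate ;
          dbo:abstract ?abstract .
  bind(rand(1 + strlen(str(?person))*0) as ?rid)
  FILTER regex(?type, "Person")
  FILTER regex(?birthdate, "1977")
} order by ?rid LIMIT 200
SPARQL;

$url = 'http://dbpedia.org/sparql?' . http_build_query([
    'default-graph-uri' => 'http://dbpedia.org',
    'query'             => $sparql,
    'format'            => 'json',
    'timeout'           => 30000,
]);

$results = json_decode(file_get_contents($url), true);

// Each row in the standard SPARQL JSON result set is one set of bindings.
foreach ($results['results']['bindings'] as $row) {
    echo $row['person_name']['value'], ' : ', $row['birthdate']['value'], "\n";
}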
In fact, the very first SPARQL query of WiQiZi is to randomly retrieve data from entries of the type "person" that have the user-selected year as the value of the birthdate variable (see above and Fig. 6). A further condition is set in PHP to restrict the data to entries with thumbnails available. Unless an alternative interface is developed (e.g. selecting an occupation first) and the quiz compromises on the number of available persons, the quiz questions need to be very generic. This is the very reason why age was chosen for the application in the first place.

Table 1 Summary of available RDF properties

  Available level                    Likely available common properties
  Almost always                      Name (rdf:label), type (rdf:type), description (dbo:abstract), Wikipedia link25
  Frequent                           Birthdate (dbo:birthDate), birthplace, death date, death place, nationality, etc.
  Sometimes                          Spouses, occupation, associated people, etc.
  Depending on the type of person    Art works, publications, political parties, teams, etc.

Secondly, although rather trivial, the application currently does not support face detection for multiple persons in an image. Therefore, it may not return the estimation for the right person. In rare cases, there are multiple persons in an image and one of them is the very person of the Wikipedia article. At the moment, there is no excellent logic to identify the face of the person in question and filter out the others. IBM Watson normally detects several faces without prioritising them.
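On the calibration point raised earlier: the face coordinates the detection API returns refer to the original image, so when the interface scales the image down to fit its layout, the hint box has to be scaled by the same ratio before it is drawn. A minimal PHP sketch, with illustrative names and example numbers:

<?php
// Sketch of the face-box calibration mentioned above. Coordinates from the
// face detection API (relative to the original image) are multiplied by
// the display-to-original ratio before the hint overlay is rendered.
function scaleFaceBox(array $box, int $originalWidth, int $displayWidth): array
{
    $ratio = $displayWidth / $originalWidth; // uniform scaling keeps the aspect
    return [
        'left'   => (int)round($box['left']   * $ratio),
        'top'    => (int)round($box['top']    * $ratio),
        'width'  => (int)round($box['width']  * $ratio),
        'height' => (int)round($box['height'] * $ratio),
    ];
}

// Example: a face found at (400, 120), 200x240 px in a 1600 px wide image
// that the interface displays at 480 px wide.
print_r(scaleFaceBox(
    ['left' => 400, 'top' => 120, 'width' => 200, 'height' => 240],
    1600,
    480
));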
Thirdly, as hinted earlier, the performance is slow when loading the first image. In the worst case, it might take up to a couple of minutes to load the quiz, because the application depends on a chain of APIs and selects data randomly. Although the users are informed on the start page, a progress bar is not yet implemented. In the long run, the code needs to be refactored and optimized in order to satisfy the users. As Sugimoto (2017a) reported, the chain of API calls opens an avenue for a new data mash-up possibility, but the current web technology may not be sufficient for pragmatic use cases of such distributed systems.

Lastly, the application has no multilingual support. The description of a person is always in English, while person names (used as the header of the quiz page) may be presented in languages other than English (see Fig. 3). The biggest obstacle to the multilingual extension of WiQiZi is the SPARQL query. One might assume that swapping the language code (e.g. xml:lang="en" to xml:lang="ja") is enough to convert the English game into a Japanese one. However, it turns out that it is not possible to reuse the SPARQL query written for English DBpedia for another language version of DBpedia, because each language version of DBpedia uses a different ontology. Only a fraction of the ontology (such as rdfs:label and rdf:type) is the same across different languages. For example, the RDF property http://dbpedia.org/ontology/birthDate is replaced by http://es.dbpedia.org/property/nacimiento or http://es.dbpedia.org/property/fechaDeNacimiento ("fecha de nacimiento" is the Spanish translation of birthdate) for Spanish DBpedia. While Dutch DBpedia uses dbpedia-owl:birthdate instead, Italian DBpedia has another property called http://it.dbpedia.org/property/annonascita as well as dbpedia-owl:birthdate.

Many of the variations of property names are unpredictable. This makes it complicated to replicate the game in the same manner. There is also inconsistency between different language versions of DBpedia, causing confusion for data integrity. The data organization problem between Wikipedia and DBpedia only adds complications to the data quality discussion. Therefore, we must acknowledge that although DBpedia provides extremely useful structured data, it has not yet become a fully reliable source of information for serious research.
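One pragmatic workaround for the property divergence just described is a per-language lookup table that substitutes the right property IRI into an otherwise shared query template. The sketch below uses the property IRIs named in the text; the table is illustrative and deliberately incomplete, since, as noted, many variations are unpredictable.

<?php
// Sketch: the same "birth date" relation has a different property IRI in
// each DBpedia language chapter, so a lookup table is consulted before a
// query can be reused across chapters (illustrative, not exhaustive).
$birthDateProperty = [
    'en' => 'http://dbpedia.org/ontology/birthDate',
    'es' => 'http://es.dbpedia.org/property/fechaDeNacimiento',
    'it' => 'http://it.dbpedia.org/property/annonascita',
];

function buildQuery(string $lang, array $props, string $year): string
{
    $p = $props[$lang];
    return "SELECT ?person ?name WHERE {\n"
         . "  ?person rdfs:label ?name ;\n"
         . "          <$p> ?birthdate .\n"
         . "  FILTER(lang(?name) = \"$lang\")\n"
         . "  FILTER regex(?birthdate, \"$year\")\n"
         . "} LIMIT 100";
}

echo buildQuery('es', $birthDateProperty, '1977');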
3 Ongoing Development and Future Work

3.1 Potential of Gamification and Citizen Science

Another use of this application for the empowerment of the DH and Wikipedia communities is the crowdsourcing of the curation of Wikipedia articles and DBpedia datasets. Data curation is one of the burgeoning issues of DH and cultural heritage. Countless publications are produced every year to discuss data quality in the fields of library science, archives, and DH in general. For instance, as a reflection of criticism of Linked Data quality, Daif et al. (2017) reckon that human supervision is needed to manage the data in their project.

In our case, the application is sometimes not able to calculate the age of a person, due to several metadata quality problems. For instance, data may be non-numeric (e.g. "16th century"; see Fig. 7 for a Wikipedia/Wikimedia case), malformed (e.g. not ISO compliant: "05/11/88"), confusing (e.g. the creation date of the digital image is used instead of that of the analogue image), inaccurate (e.g. 1880s instead of 1885 (the true value) due to uncertainty), wrong (e.g. 2599 instead of 1599 due to a mistype), or missing, resulting in an error message. This is normally regarded as an optimization problem of the code. While developers might usually try to suppress erroneous results, in this case we are not interested in concealing errors. When an error occurs, it could be a sign of a data quality problem, and we can trace back to underlying inconsistencies in the data structure. In this application, users are persuaded to follow the provided links to Wikipedia and DBpedia and are able to double-check the original data (Fig. 8). If the users are able to correct and/or improve data, for instance by executing a little online research, the impact on data curation could be considerable. This scenario creates a dual possibility. In other words, the application can be used as:

• A curation tool of Wikipedia and DBpedia for existing active editors of Wikipedia.
• A tool to transform normal users into new curators of Wikipedia.

Not only could crowd curation benefit Wikipedia by correcting and/or adding data, but DBpedia would also be improved, leading to a higher quality of datasets of this LOD magnet
Rather, it brings a humanities and/or cultural heritage per- spective of and inspiration for a new use of Wik- ipedia/DBpedia for scholars. In addition to that, the author prefers to focus on the gamification of ‘reasonably intellectual’ materials (i.e. Wikipedia), and the crowd sourc- ing possibility is regarded as a spin-off service. The advantage of this positioning is that it em- phasizes on the ‘voluntary’ public engagement of Wikipedia data curation, rather than the ‘mis- sion’ of institutional crowdsourcing that often has a certain goal, expectation, or ambition to complete relatively specific tasks. A disad- vantage is the lower level of public participation. The more voluntary and supplementary the crowdsourcing becomes, the less the users would engage and help. In this respect, WiQiZi takes a slightly unusu- al approach to crowdsourcing. Therefore, it may not be covered by the crowdsourcing typology (See for example Carletti et al. (2015)). Moreo- ver, integrating gamification and crowdsourcing would be an answer for our project to enhance the motivation of the participants. WiQiZi im- plements a kind of crowdsourcing possibility in such a way that the participants join it without noticing or are less conscious about it. As Ridge (2013) observed, crowdsourcing participants can be categorised into two: those who are intentionally participating and those whose contributions are a side effect of their par- ticipation in other core activities. Cases where the core activity of the latter is a game may be called crowdsourcing games, which seems to better fit the classification of WiQiZi. Subse- quently, WiQiZi project is comfortably in sync with the DH advocators who are careful about criticism on a potential risk of labour exploita- tion (Terras, 2016). Currently there is no good mechanism imple- mented to systematically collect information about the data curation. The users are encouraged to contact the developer via email to report their contributions to data curation. However, adding such a complication affects the incentive to play the game and constitutes extra effort. It could be possible to track user engagement in Wikipedia, using user logs. Unfortunately, this would re- quire a good deal of elaboration to the applica- tion, therefore it is not planned in the close fu- ture. Concerning the combat for data quality on the web, there are initiatives such as a W3C working group, which works on creating a Data Quality Vocabulary (DQV)(W3C, 2016). It will not spec- ify what quality means, simply because some datasets are useful for some, but low quality for other purposes. Instead, it aims to make it easier to publish, exchange, and consume quality metadata for every step of a dataset's lifecycle. In this way, different stakeholders can evaluate the datasets and the data consumers are free to use it as an aid to assess data quality by themselves. The author understands that the need of such a vocabulary has arisen from the situation where it has become difficult to find valuable datasets in the enormous sea of data on the web. In the fu- ture, metadata such as DQV could help the users to identify the data quality including DBpedia. 3.2 Future Work Improvement to the application could be made on several levels. For example, it would be very interesting to have a point awarding system. Incentivization is arguably one of the most chal- lenging parts of crowd sourcing, as Ridge (2013) explored the user motivation and engagement. 
One simple addition would be to display the amount of guessing attempts to reach the right answer. Based on the count, points can be given to the users, adding more fun element to the game. By introducing a login registration, points can be saved to the user account. There is no doubt that the point system is effective, especial- ly when the game stimulates user competition. In addition, an even more ambitious system can be developed. Ideally, more points should be awarded for contributions to the crowd data cura- tion. Although it is not an easy task to consider a fair way of validation and score provision, for example, by assigning moderators in the user community, it would surely increase user en- gagement. For this type of deep engagement, Ridge (2013) suggests the use of scaffolding techniques of museums for online crowdsourc- ing. Scaffolding design provides clear user roles and information about participation. It also care- fully manages the complexity level of participa- tion with a shallow learning curve and guidance through early levels of participatory activities. Obviously, it is not trivial to devise a sophis- ticated platform that deploys such scaffolding, due to the DIY nature of WiQiZi. However, if more interactive features such as score compari- son, personalisation, and visibility of contribu- tion are implemented, it would be a win-win sit- uation for users to enjoy the game and the Wik- ipedia/DBpedia community to gain more volun- tary support in terms of data curation. In this project, the official LOD version of Wikipedia, Wikidata 32 , has not been explored. The tight connection between Wikipedia and Wikidata would provide an outstanding chance for WiQiZi. Depending on the data quality of Wikidata, WiQiZi can be extended to more de- tailed quiz questions, which will increase the ap- petite of the users. On the other hand, the use of Wikipe- dia/DBpedia was just a beginning of effective LOD research in DH. Its full potential can be examined only by stretching the data integration to other data sources. To this end, Europeana is in the scope of the next development, which sup- plies over 50 million cultural heritage objects of Europe. Its metadata is offered with CC0 license. Good linking points between Europeana and Wikipedia/DBpedia, as well as other such im- portant resources as GeoNames, VIAF, and Get- ty Vocabularies need to be investigated, so that the applications like WiQiZi would truly incen- tivize DH research based on the fully-fledged LOD cloud. Multilingual support is also needed for the promotion of data diversity. It is the interest of not only the DH community in which language plays a vital part of research, but also the Wik- ipedia/DBpedia community, as well as the web community at large, which facilitates diversity. As seen in Section 2, like many other web pro- jects, Wikipedia and DBpedia are rather unnatu- rally English-oriented. Wikipedia’s multilingual achievement is extraordinary in the sense of local community development. In contrast, DBpedia seems to be lagging be- hind. For many LOD experts, the English version takes the central position, partly because of the richness of data. The development of language chapters is rather slow. While there are 292 ac- tive Wikipedia language versions (Wikipedia, 2018), only about 20 DBpedia versions exist (DBpedia, 2018)(Fig. 9). Many DBpedia websites are operated on a voluntary basis by local communities that lack organisational, technical, and financial support. 
According to a survey among nine DBpedia lan- guage chapters, 66% of the chapters have only one to four people involved in the core work, while only one chapter has about ten people (DBpedia Association, 2018a). In addition, 44.4% update their services once a year, while over 22.2% have not updated in more than two years (DBpedia Association, 2018b). Further- more, the inconsistencies of their websites and the lack of cooperate design reflect the backlog of the multilingual versions of the project. If WiQiZi were able to cope with different lan- guage versions, it would help to promote DBpe- dia chapters and could be presented as a use case for multilingual LOD. Fig. 9 Catalan DBpedia chapter 4 Conclusion In conclusion, this article demonstrates an ex- perimental case study of mixing gamification techniques (entertainment) with data-driven re- search (education) and the possibility for data curation (crowdsourcing), showcasing cutting- edge technologies such as SPARQL and Deep Learning API, with the help of Open Data in the framework of DH. In addition, it presents an ex- ample of an application that automatically gener- ates simple quizzes, based on semantic question- answer capability. Moreover, it displays a poten- tial for a new digital research ecosystem for hu- manities research and digital technologies, con- necting various stakeholders including humani- ties researchers and the public. As Terras, (2016) puts it, ‘the Digital Human- ities can aid in creating stronger links with the public and humanities research, which, in turn, means that crowdsourcing becomes a method of advocacy for the importance of humanities scholarship, involving and integrating non- academic sectors of society into areas of human- istic endeavour.’ It would be interesting to use the same or a similar method for different types of quiz. Given the variety and richness of information in Wik- ipedia, the automatized quiz can be about any interesting concept including buildings, objects, places, events, genres, and movements. Hence, Wikipedia seems to provide an exciting platform for edutainment, especially in the art and human- ities sphere. In addition, the project plans to con- tinue developing a more elaborate game applica- tion by taking advantage of semantically rich Open Data resources such as Europeana, VIAF, GeoNames, and Getty Vocabularies. If WiQiZi could cope with DBpedia multilingual chapters, it would be able to become a valuable prototype for the representation of LOD diversity. At the same time, the paper also acknowledg- es several challenges. For instance, technical de- velopments such as improvement of query per- formance are required in order to use LOD for practical day-to-day research business. Standard- ization may be required to lower the barrier of the complex technical environments especially for the ordinary yet majority of users. There are also data quality issues for Wikipedia and DBpe- dia to be fully useful for serious research, espe- cially in the context of automatization. Raising awareness is another social issue. Admittedly, although the application of this project is fairly simple, it is hoped that it helps to inspire and incentivize the DH researchers to actively use LOD as a new tool for our knowledge society in which any member of the society can become an active actor of knowledge creation, curation, and distribution. 
Notes
1 https://www.w3.org/RDF/
2 https://fedorarepository.org/
3 https://pro.europeana.eu/resources/apis/sparql
4 https://viaf.org/
5 http://www.geonames.org/
6 http://vocab.getty.edu/
7 https://www.w3.org/TR/turtle/
8 https://json-ld.org/
9 https://www.w3.org/TR/sparql11-overview/
10 https://www.w3.org/XML/
11 https://www.json.org/
12 https://en.wikipedia.org/wiki/Application_programming_interface
13 https://vlo.clarin.eu
14 https://www.clarin.eu/
15 https://www.wikipedia.org/
16 http://wiki.dbpedia.org/
17 https://wiqizi.acdh-dev.oeaw.ac.at
18 http://www.php.net/
19 https://laravel.com/
20 https://getbootstrap.com/
21 https://www.mediawiki.org/wiki/API:Main_page
22 http://dbpedia.org/sparql
23 http://www.json.org/
24 https://www.ibm.com/watson/services/visual-recognition/
25 The Wikipedia URL is easily inferred from the corresponding slug URL of DBpedia. The application excluded foaf:primaryTopic to fetch the Wikipedia link (partly in order to increase query performance).
26 https://www.nla.gov.au/content/newspaper-digitisation-program
27 http://www.birds.cornell.edu/citscitoolkit/projects/pwrc/nabirdphenologyprogram/
28 http://www.soldierstudies.org/
29 https://www.oldweather.org/
30 http://herbariaunited.org/atHome/
31 https://www.zooniverse.org/projects/zookeeper/galaxy-zoo/
32 https://www.wikidata.org

References

Alexiev, V. (2017). Getty Vocabularies LOD: Sample Queries. http://vocab.getty.edu/queries#Finding_Subjects (accessed 2 October 2018).
Berners-Lee, T. (2009). Linked Data - Design Issues. https://www.w3.org/DesignIssues/LinkedData.html (accessed 2 October 2018).
Boer, V. de, Penuela, A. M. and Ockeloen, C. J. (2016). Linked Data for Digital History: Lessons Learned from Three Case Studies. Anejos de La Revista de Historiografía (4): 139–62.
Brinkerink, M. (2010). Waisda? Video Labeling Game: Evaluation Report. Images for the Future – Research Blog. http://research.imagesforthefuture.org/index.php/waisda-video-labeling-game-evaluation-report/index.html (accessed 2 October 2018).
Carletti, L., Giannachi, G., Price, D., McAuley, D. and Benford, S. (2015). Digital humanities and crowdsourcing: an exploration. https://core.ac.uk/reader/43093962 (accessed 2 October 2018).
Daif, A., Dahroug, A., López-Nores, M., Gil-Solla, A., Ramos-Cabrer, M., Pazos-Arias, J. J. and Blanco-Fernández, Y. (2017). Developing Quiz Games Linked to Networks of Semantic Connections Among Cultural Venues. Metadata and Semantic Research (Communications in Computer and Information Science). Springer, Cham, pp. 239–46. doi:10.1007/978-3-319-70863-8_23 (accessed 2 October 2018).
DBpedia (2018). Chapters. https://wiki.dbpedia.org/join/chapters (accessed 2 October 2018).
DBpedia Association (2018a). DBpedia Chapters – Survey Evaluation – Episode One. https://wiki.dbpedia.org/blog/dbpedia-chapters-%E2%80%93-survey-evaluation-%E2%80%93-episode-one (accessed 2 October 2018).
DBpedia Association (2018b). DBpedia Chapters – Survey Evaluation – Episode Two. https://wiki.dbpedia.org/blog/dbpedia-chapters-%E2%80%93-survey-evaluation-%E2%80%93-episode-two (accessed 2 October 2018).
Dunn, S. and Hedges, M. (2012). Crowd-Sourcing Scoping Study: Engaging the Crowd with Humanities Research.
Edelstein, J., Galla, L., Li-Madeo, C., Marden, J., Rhonemus, A. and Whysel, N. (2013). Linked Open Data for Cultural Heritage: Evolution of an Information Technology. http://www.whysel.com/papers/LIS670-Linked-Open-Data-for-Cultural-Heritage.pdf (accessed 2 October 2018).
Edmond, J. and Garnett, V. (2014). Building an API is not enough! Investigating Reuse of Cultural Heritage Data. LSE Impact Blog. http://blogs.lse.ac.uk/impactofsocialsciences/2014/09/08/investigating-reuse-of-cultural-heritage-data-europeana/ (accessed 2 October 2018).
FOSTER consortium (n.d.). What is Open Science? Introduction. FOSTER: Facilitate Open Science Training for European Research. https://www.fosteropenscience.eu/content/what-open-science-introduction (accessed 2 October 2018).
Hacker, P. (2015). The Games Art Historians Play: Online Game-based Learning in Art History and Museum Contexts. The Chronicle of Higher Education Blogs: ProfHacker. https://www.chronicle.com/blogs/profhacker/the-games-art-historians-play-online-game-based-learning-in-art-history-and-museum-contexts/61263 (accessed 12 April 2018).
Heath, T. (2018). Linked Data – Connect Distributed Data across the Web. http://linkeddata.org/home (accessed 2 October 2018).
Holley, R. (2009). A success story – Australian Newspapers Digitisation Program. Online Currents. http://eprints.rclis.org/14176/ (accessed 2 October 2018).
Kelly, L. and Bowan, A. (2014). Gamifying the museum: Educational games for learning. MWA2014: Museums and the Web Asia 2014. https://mwa2014.museumsandtheweb.com/paper/gamifying-the-museum-educational-games-for-learning/ (accessed 2 October 2018).
Lincoln, M. (2017). Using SPARQL to access Linked Open Data. Programming Historian. https://programminghistorian.org/lessons/graph-databases-and-SPARQL (accessed 2 October 2018).
Marden, J., Li-Madeo, C., Whysel, N. Y. and Edelstein, J. (2013). Linked Open Data for Cultural Heritage: Evolution of an Information Technology. Columbia University Academic Commons. https://doi.org/10.7916/D89021QD (accessed 2 October 2018).
NYPL Labs (n.d.). What's on the menu? http://menus.nypl.org/about (accessed 2 October 2018).
Orgel, T., Höffernig, M., Bailer, W. and Russegger, S. (2015). A metadata model and mapping approach for facilitating access to heterogeneous cultural heritage assets. International Journal on Digital Libraries, 15(2–4): 189–207. doi:10.1007/s00799-015-0138-2.
Ridge, M. (2013). From tagging to theorizing: deepening engagement with cultural heritage through crowdsourcing. https://core.ac.uk/display/82977685 (accessed 2 October 2018).
Simou, N., Chortaras, A., Stamou, G. and Kollias, S. (2017). Enriching and publishing cultural heritage as linked open data. In Ioannides, M., Magenat-Thalman, N. and Papagiannakis, G. (eds), Mixed Reality and Gamification for Cultural Heritage. Springer. http://eprints.lincoln.ac.uk/26895/ (accessed 2 October 2018).
Sugimoto, G. (2017a). Who is open data for and why could it be hard to use it in the digital humanities? Federated application programming interfaces for interdisciplinary research. International Journal of Metadata, Semantics and Ontologies, 12(4): 204. doi:10.1504/IJMSO.2017.10014806.
Sugimoto, G. (2017b). Number game – Experience of a European research infrastructure (CLARIN) for the analysis of web traffic. CLARIN Annual Conference 2016. Aix-en-Provence, France: CLARIN ERIC and Laboratoire Parole et Langage and Laboratoire des Sciences de l'Information et des Systèmes (LSIS) and Aix-Marseille Université and Centre National de la Recherche Scientifique (CNRS). https://hal.archives-ouvertes.fr/hal-01539048 (accessed 2 October 2018).
Sugimoto, G. (2017c). Battle Without FAIR and Easy Data in Digital Humanities. Metadata and Semantic Research (Communications in Computer and Information Science). Springer, Cham, pp. 315–26. doi:10.1007/978-3-319-70863-8_30 (accessed 25 April 2018).
Terras, M. (2016). Crowdsourcing in the Digital Humanities. A New Companion to Digital Humanities. Wiley-Blackwell, pp. 420–439. https://hcommons.org/deposits/download/hc:15066/CONTENT/mterras_crowdsourcing20in20digital20humanities_final1.pdf/ (accessed 2 October 2018).
W3C (2016). Data on the Web Best Practices: Data Quality Vocabulary. https://www.w3.org/TR/vocab-dqv/ (accessed 2 October 2018).
Wikipedia (2018). List of Wikipedias. https://meta.wikimedia.org/wiki/List_of_Wikipedias (accessed 2 October 2018).

work_7jxf7vwbnfhojaz2jodkmyfmpq ----

RESEARCH ARTICLE Open Access

Opening the book: data models and distractions in digital scholarly editing

James Cummings, Newcastle University, Newcastle upon Tyne, UK (James.Cummings@newcastle.ac.uk)
© The Author(s) 2019
International Journal of Digital Humanities, https://doi.org/10.1007/s42803-019-00016-6

Abstract
This article argues that editors of scholarly digital editions should not be distracted by underlying technological concerns except when these concerns affect the editorial tasks at hand. It surveys issues in the creation of scholarly digital editions and the open licensing of resources and addresses concerns about underlying data models and vocabularies, such as the Guidelines of the Text Encoding Initiative. It calls for solutions which promote the collaborative creation, annotation, and publication of scholarly digital editions. The article draws a line between issues with which editors of scholarly digital editions should concern themselves and issues which may only prove to be distractions.

Keywords: Scholarly digital editions · Digital infrastructure · Textual editing · TEI XML ·
Markup and data models

1 Scholarly digital editions

The creation of scholarly digital editions is a complex endeavour which exposes and is dependent on our understanding of the theories of text, works, and documents that underlie our relationships with texts. My approach in textual editing projects tends towards the pragmatic, but there is a clear distinction between an object to which we often refer as a 'work', i.e. an abstraction as understood by readers (including authors and editors), and a 'document', which is a particular instance of a physical manifestation of this text. Not all documents are faithful copies of the work, nor do they represent all possible ways of understanding the text in question.

If we hold that each document is the work because without the document we would not have the work, we would have to see each different document as a different work. If one does not want to say that every copy of a work is a different work, then one must not say that the document and the work form an inseparable unity. If, on the other hand, one says a single work is represented differently by the variant texts in different documents, it seems necessary to also hold that one cannot apprehend the work as a whole without somehow holding its variant iterations in mind. Textual complexities resist simplification. How one conceives of the relationship between documents and works influences one's practice when editing; it is important to have a sense of the complexity of that relationship (Shillingsburg 2017, p. 188).

In editing a text, then, the editor must attempt to communicate the understanding he or she has of both the document (and any additional related documents considered in scope) and the work as a whole. It is in 'holding its variant iterations in mind' that I would argue the true editorial objects are formed, and the representation of this in editions is inherently a lossy translation from the mental construct, whether or not the editions are print or digital, scholarly or otherwise. There is a long history of representing these mental constructs, i.e. what I consider a set of conceptual edited objects, in print editions, and the systems of encoding editorial understanding have a rich and complex history.

As a side note, it should be recognised that in many discussions of editing, the assumption that it is solely concerning the relationship between multiple documents and their related works is often problematic for editors of works for which there is only a single witness. Single-witness editions are no less editions. However, in the discussion of scholarly digital editions, we tend to focus on scholarly editing within the Lachmannian paradigm, on texts for which multiple witnesses exist, or at least complex textual apparatuses of one form or another, precisely because we seek edge cases on the basis of which to test, problematize, and construct our view of the nature of editorial activity. My view as a digital editorial pragmatist is that these edge cases are interesting, but while they must be dealt with, they need not distract us in the course of projects which focus on the creation of scholarly digital editions which function within the limits of existing solutions.
I would argue that at times the academic investigation of scholarly digital editing focuses on the problems rather than the solutions, and as much as possible editors of scholarly digital editions should not be distracted from editorial tasks by technological concerns if these technological concerns do not affect their edition. There are cases, perhaps exacerbated by academic funding models, in which 'the perfect is the enemy of the good enough'. In the creation of scholarly digital editions, the primary responsibility is the production of a scholarly edition that is no less rigorous than its print equivalent, but there is also the secondary responsibility to be truly digital in nature. That is not to say that a digitised edition (an edition which represents nothing significantly more or less than the possibilities of a print edition) is not a useful object. One could easily argue that the world would be a better place if we had digitised the full texts of existing print editions as a starting point. But in itself, a digitised edition barely exploits the fundamental shift in medium. Print editing, or the equivalent in digital form (such as static PDF editions), is restricted in the methods with which it presents the edited text, most commonly to a single perspective on the work with accompanying editorial information encoding additional witnesses or documentary information using standard formats, which the reader decodes.

The edited text does not get closer to the documents, there is still no visual evidence, no making explicit of textual structures or semantic information, limited potential for multiple views on the text. This is why a digitised edition is not a digital edition (Sahle 2016, p. 33).

It is precisely the potential of a digital edition to be near-infinitely refactorable and dynamically to provide different views depending on external interactions that is one of its greatest strengths. However, far too much discussion of digital editions focuses on the presentation views of the edition. The real digital edition, that which best represents the set of conceptual editorial objects (whether textual, musical, image, or other forms of editorial object), is not represented by any one view of the edition.

Therefore in digital editions the encoded texts themselves are the most important long-term outcome of the project, while their initial presentation within a particular application should be considered only a single perspective on the data. Any given view will be far from unique or canonical, as different usage scenarios call for different presentations—ranging from 'reading text' to 'interactive version' with popup content, to chart, graph, or map representations and beyond. Furthermore, all initial presentations are also ephemeral, bound to be either modified over time as technologies and forms of digital publishing change, or languish in obsolescence on a forgotten server (Turska et al. 2016, para 4).

One way of looking at the encoded edition, for example an edition created in TEI XML, is to consider it the true edition rather than any particular output. However, and perhaps surprisingly given my long history helping maintain the TEI Guidelines, I do not view the encoded edition in TEI XML as the best form. Rather, it is, in my assessment, the best serialization format for the underlying conceptual data of scholarly digital editions.
The edition is in the encoding; this implies that encoded data is, in a certain sense, already a scholarly-mediated presentation of other data that exist in the original manuscript (Barabucci et al. 2017, p. 44).

While the encoded data is a good representation of the scholarly edition and one I care about deeply, a truly conceptual editorial object is malleable and recombinable, and an encoded edition, by itself, is not. The encoded edition is sufficient, for those literate in the method of encoding, to present an edition by itself, but this would substantially limit the audience of the edition. Editors should be able to understand the granularity and categorisation of an encoded edition, and they should recognise where they have abrogated any philological responsibility, but they should not necessarily be distracted by the underlying data format. However, by forcing us to formalise some of our assumptions, the structures and vocabularies of an encoded edition do help us foreground the theories of text we use when creating scholarly digital editions, and thus it is important that editors be at least familiar with the format of the encoded edition and any limitations placed by it on their activity (Cummings 2008).

If part of the point of editing a work is to make it more accessible, in all senses of that word, then usually some presentation view of the edition is required. With TEI XML-based digital editions this usually involves transforming the data to a web-based serialization format (such as HTML, or JSON being fed to an HTML container).

This distinction [between TEI data and HTML presentation] leads to an important question: what constitutes the core of an edition? Its data or its presentation? Is it possible to think of a critical edition as a collection of pieces of pure data? Or is a representation layer fundamental to the concept of 'edition'? (Barabucci et al. 2017, p. 37)

In order to maintain the malleability of a conceptual editorial object, it is not the presentation layer that is a requirement, as the presentation layer is merely one or more additional views on the data. Rather, it is the ability to reshape, query, transform, and reconceive of the data in the same way as (and in the case of a digital edition in more ways than) a reader might do when translating a printed critical apparatus into a mental construct representing the various document instances and their relationship to the work.

People developing complex IT systems for the publication of (usually specific) scholarly digital editions might believe that their systems provide the necessary infrastructure. However, these systems tend to focus on the rendering of one or more presentational views on their datasets rather than providing a more direct interface to the set of conceptual editorial objects encoded in the underlying data. Given current technologies, I would argue that the true form of a scholarly digital edition would be better expressed as a well-documented API for the manipulation and description of editorial objects following an open international standard for the representation of digital text. This would not necessarily provide the presentational view on the data that most readers would require, but views of scholarly digital editions could be constructed on top of it. This would enable all forms of examination, querying, subsetting, and recombination of all editorial objects of all types.
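No such generalised API yet exists, so any concrete illustration is necessarily hypothetical. Purely as a sketch of the kind of interface being argued for (every path, field, and identifier below is invented for the purpose), a request for a single word-level editorial object might return its readings across witnesses as data rather than as presentation:

    GET /editions/my-edition/objects/w42

    {
      "id": "w42",
      "type": "word",
      "readings": [
        { "witness": "MS-A", "text": "tyme" },
        { "witness": "MS-B", "text": "time" }
      ],
      "annotations": "/editions/my-edition/objects/w42/annotations"
    }

A reading view, a collation table, or a map could all be built as clients of such responses, which is the sense in which no single presentation would be canonical.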
This infrastructure would, in no way, stop a scholarly digital edition from being a publication of knowledge and commentary on an individual work, but it requires that its underlying framework meet at least basic criteria to enable the edition's involvement in inter-edition commentary and research in the future. There is a clear difference between the knowledge in a scholarly digital edition and the knowledge which can be created across a collection of interoperable editions, but the creation of one should not preclude the eventual development of the other. Clearly this is unlikely to be created fully formed as a complete solution, but efforts for API-based access to serialized editorial objects (for instance with open annotations or URI-referenced encoding as first-order objects) are a step in the correct direction which should be encouraged and amalgamated through a coherent infrastructure. Ideally, open data repositories for digital editions would build such APIs into their serving of the underlying data of digital editions. (The TAPAS archive seems to be moving in this direction, but even it has a long way to go.)

One of the limitations of publications of scholarly digital editions is that only so many views on the underlying encoded edition data can be realised. Even when access is given to an API foregrounding manipulation of all the editorial objects, the nature of the access will be limited to the methods of interrogation conceived of at the time of its construction. One approach is to define an API that 'sees digital documents as stacks of abstraction levels, each storing content according to a certain model' (Barabucci and Fischer 2017, p. 51). However, these digital documents will suffer the same restrictions based on the limitations of how and when they are created. Even when more sophisticated layers of abstraction are provided, there must also be methods for as many low-level operations on the encoded editorial data as possible. The editor of a scholarly digital edition, I would argue, should understand the separation of these levels of abstraction and the nature of the models they store on a conceptual level, but does not need to be distracted by the actual implementation of this system.

2 Opening the edition

In order for scholarly digital editions to reach their full potential as contributions to a wider academic environment of digital resources, it is not enough that they merely be accessible but (as thankfully is becoming a requirement of many funding bodies) they also need to be openly accessible. This may seem a minor difference, and it is often confused with them being 'freely' accessible, that is free at the point of use by researchers. In a world in which digital resources must become sustainable, there are cogent arguments against making them freely available to all, and while this is a regretfully retrograde step which mirrors the publication of print editions, online hosting of scholarly digital editions must still be resourced. However, ideally editions would be freely available to anyone who wants to use them, and institutions which resource such sustainability should be lauded for their attempts to do so. Nonetheless, 'openly' available here is meant to convey legal availability rather than financial accessibility, i.e. that a digital edition is openly pre-licensed for reuse with terms as open as possible, e.g.
licensing with a Creative Commons Attribution license where feasible, rather than one that includes Non-commercial or Non-derivative conditions, since these conditions significantly limit the potential reuse of the edition. Assuming that data is openly licensed, that the licenses follow open international standards, and that the open repositories of digital edition data will exist for an extended period of time, then the most interesting repurposing of any digital edition will likely not be done by the original creators. In other words, assuming the survival of a well-documented edition's data into the distant future, the edition (or the data) is much more likely to be repurposed as technology develops and exploited in ways which we could not have predicted. And yet, current reuse of digital edition data by others is very rare, and even with large text collections such as EEBO-TCP, the reuse often consists of making improvements to them in order to ensure they reach the minimum criteria for a scholarly digital edition. Editors of scholarly digital editions do not need to be distracted by the detailed legal implications of openly licensing resources, but they should understand the general categories of restrictions and how openly licensing their own project outputs benefits future research. True reuse of scholarly digital edition data is a laudable aim, and open licensing of data following well-documented open international standards is the necessary foundation of this (potentially overly-optimistic) open data utopia of the future.

While the following characterization may represent an idealistic view, one of the benefits of a scholarly digital edition is its infinite potential to be revisited, reformed, and updated in what is a perpetual beta state. This subverts the publishing hegemony under which scholarly editing produces editions and under which a significant revision of an edition (perhaps adding newly uncovered witnesses) would form a new 'second edition'.
That not all scholarly digital editing is intended to produce an ‘edition’ rendered in a presentation view is an important reminder that such editions are not merely publications, but are intended to be resources with which to venture answers to the research questions which prompted their initial funding. While this often leads to new and different ways of reading the edition, this is more a product of the context of digital editions. The digital edition allows readers to break away from mono-directional reading (as has also been vigorously discussed in relation to hypertext) (Vine and Verweij 2012, p. 134). However one reads the edition, the underlying data may in fact have been created not for a scholarly digital edition as a publication, but as a resource to be interro- gated, analysed, or queried, rather than published. The publication of a scholarly digital edition can, and perhaps more often should, be a mere byproduct of the real research undertaken. That such information resources, in this case datasets of editorial objects, become corpora for research analysis also enables us to work with reproducible methodologies where all aspects of the data, methodology, and results are transparent. Striving for reproducible research also enables us to publish in more transparent ways, where the data behind the graph which supports any research claims and even the tools used to undertake the analysis are provided. This enables others to check the conclusions in a way often perceived by society (falsely, I must note) as more ‘scientific’. When producing a scholarly edition, an article, or an introduction to an edition in a reproducible way, we publish not only the text in its final format including the prose with possible figures and tables, but also the data (in our case typically annotated transcriptions) as well as the computer code use in the analytic work. International Journal of Digital Humanities This enables other users– including our future selves – to redo, build upon and adjust the work without the need to start over (Speed Kjeldsen 2017, p. 135). It is not just the reproducibility of the research that is important, but the underlying approach. Such a transparent approach to scholarly editing is not a neo-liberal quanti- fication of computational literary studies as only containing objective data to be analysed (there is no such thing as neutral editorial encoding), but merely a foregrounding of our assumptions, methodologies, data, and results, whether we use Digital Humanities methodologies or ‘Experimental Humanities’. [W]e present a different approach to the application of digital techniques to humanities research, a branch of experimental humanities in which digital exper- iments bring insight and engagement with historical scenarios and in turn influence our understanding and our thinking today (De Roure and Willcox 2017, p. 194). Editors of a scholarly digital edition should not find the exposing of their research methods distracting; their editorial tasks produce a dataset upon which experiments which are core to a humanities research approach can be based. Moreover, as the humanities inevitably becomes an increasingly collaborative undertaking, any approach that assists us in making all aspects of scholarly digital editing more transparent from the outset can only be seen as useful. 3 Data models and the TEI The de-facto standard for a data model to be used in creating scholarly digital editions is the Guidelines of the Text Encoding Initiative (TEI, http://www.tei-c.org/). 
This is a community-developed open international standard which provides a set of recommen- dations for the encoding of digital texts. Yet it is inaccurate to say that the TEI is a data model itself. Used properly, it is more of a framework for constructing and documenting data models for particular editorial projects. In many cases, the TEI defines objects for encoding texts, but it does so in a way which has been called ontologically agnostic. That is, it defines a particular markup object for encoding a specific textual phenomenon, but it does not always prescribe how to determine the nature of that phenomenon. For example, the TEI’s < title> element is defined as ‘[containing] a title for any kind of work’, but TEI does not specify how to determine whether or not something is in fact a title of a work. This extends to all sorts of editorial interventions and encoding, where the editor is still left to determine whether a string of characters is indeed the textual phenomenon in question. In reality, this is a pragmatic level of indirection which enables the standard to be used by vastly different editorial communities. Moreover, TEI customisation can provide equivalences to existing on- tologies if the project is intended to relate an understanding of TEI encoding to particular real-world concepts. The individual encoding of textual phenomena repre- sents the editor’s interpretation of the objects which exist in the real world, and while these signs may be encoded according to different methods, the editor’s choice for how to encode any particular instance of a textual phenomenon has at its root a materialistic cause which we should not confuse with its conceptual categorisation. International Journal of Digital Humanities http://www.tei-c.org/ Note that here I am not negating the whole pluralist view of textuality: I am only denying the unlicensed (and undesirable, in my view) consequence that texts are not really existent objects. The fact that we can describe reality at different levels does not imply that the objects we describe do not exist in se: this fallacy is a direct consequence of the confusion between ontology and epistemology, a confusion that I want to get rid of (Ciotti 2017, p. 87). At the heart of creating a TEI data model is the process of customisation that the TEI framework uses to document, in a literate programming vocabulary, the relationship of the vocabulary of the TEI to the application that is being undertaken in any particular project. The TEI provides a processable form of customisation using the TEI ODD format, which enables both the constraining of the overall scheme and its extension into new areas. At time of writing, the TEI P5 Guidelines version 3.5.0 have 573 elements, but no particular scholarly digital edition would be expected to make use of all of them.1 Though, to be clear, it is not just the inclusion/exclusion of elements that might form part of a customisation. All aspects of the TEI framework (elements, attributes, classes, modules, prose, examples, content models, intended processing, and much more) can be modified for any particular project. Indeed, in proper use of the TEI, customization is not only recommended, but almost required for: These Guidelines provide an encoding scheme suitable for encoding a very wide range of texts, and capable of supporting a wide variety of applications. 
For this reason, the TEI scheme supports a variety of different approaches to solving similar problems, and also defines a much richer set of elements than is likely to be necessary in any given project. Furthermore, the TEI scheme may be extended in well-defined and documented ways for texts that cannot be conveniently or appropriately encoded using what is provided. For these reasons, it is almost impossible to use the TEI scheme without customizing it in some way (TEI Guidelines, Chapter 23: ‘Using the TEI’, Section 23.3 ‘Customization’ http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#MD). The nature of the TEI framework in providing methods for extensible meta-schemas (from which TEI users generate schemas to validate their document) can result in vastly different views of the TEI. These views may be so varied as to be almost mutually incompatible, and yet having the common framework at their basis is always going to be more beneficial than a multitude of different schemes. The documentation of the fragmentation found in a TEI ODD customisation file actually enables easier interop- erability and interchange between digital editions than if no such documentation existed. Such documentation of variance of practice and encoding methods as a TEI ODD meta-schema preserves then helps to enable real, though necessarily mediated, interchange between complicated textual resources (Cummings 2014). 1 The number of elements the TEI Guidelines currently include is available on the element reference page from version 3.2.0 onwards. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/REF-ELEMENTS.html International Journal of Digital Humanities http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#MD http://www.tei-c.org/release/doc/tei-p5-doc/en/html/REF-ELEMENTS.html The potential for different projects to define their own meta-schemas creates a frag- mentation of overall consistency among projects. However, because they all define their variance from the same source in a machine processable form, the divergences are not as great as one might expect. (Though to say ‘same source’ ignores the rolling releases of the TEI framework, i.e. the objects available to one customisation may have been greatly modified by the time another customisation is created.) The TEI ODD customisation methods provide a mechanism by which the meta-schema can document the version of the TEI on which the customisation is based. And yet, even the phrase ‘version of the TEI’ is inaccurate. In most uses of the TEI this is a sufficient description, but it is possible in a single TEI customisation to use multiple sources, which might be different versions of the TEI framework or indeed other standards entirely. The TEI customisation-literate programming mechanisms can also be used to document entirely non-TEI schemas. To further complicate matters, in some sophisti- cated uses a TEI ODD will ‘chain’ customisations in order to provide a variation on an existing TEI customisation. For example, a project might decide that its needs are very similar to the EpiDoc schema (a pure TEI P5 subset) but that it needs additional elements or different attribute values or wants to customise the examples to its materials. The project would indicate that the source is not the TEI directly, but the compiled EpiDoc customisation, which itself points to the TEI. In processing these chained customisations to generate schemas, each of the customisations is flattened in turn against its source. 
Any form of TEI customisation is potentially quite complicated because of the generalistic nature of the framework of which it is a part. And yet this also provides quite rigorous methods of documenting the variance between schemas in a way that can be processed on a more general level. An editor of a scholarly digital edition should understand, at least on a conceptual level, the customisation of any formal vocabulary they are using and the relationship of this vocabulary to the categories of textual phenomena and the editorial activity they are undertaking. How- ever, an editor whose well-resourced project team has included assistance in this area need not be distracted by the methods by which this customisation is implemented. A recent change in the TEI further extends the TEI ODD customisation language vocabulary with the ability to document intended processing models. This is a signif- icant departure for the TEI, which has more usually held that the processing of the encoded edition is a completely separate activity from that which it defines, the encoding of the textual phenomena according to an agreed framework. The TEI vocabulary for customisation now provides users a mechanism with which they can indicate, in an entirely implementation-agnostic method, how they intend a particular element (or other TEI object) to be processed for a variety of outputs. This mechanism does not specify precisely how to handle elements, but it gives a general behaviour recommendation and might indicate some details of formatting. For example, this processing model documentation might indicate that for abbreviations and expansions embedded inside a TEI element that something processing it should imple- ment a behaviour of ‘alternate’ which would somehow provide both the abbreviation and expansion to the user. Furthermore, the processing model documentation in the TEI customisation can indicates which of the abbreviations or expansions should be used as the ‘default’ content and which should provide the ‘alternate’. One can imagine using this instruction in web output to provide a tooltip with expansion and the text showing the abbreviation. In a print publication the same ‘alternate’ behaviour might generate a International Journal of Digital Humanities footnote to provide a similar effect. One of the reasons for doing this in a hands-off implementation agnostic manner is to predetermine not the nature of the processing, but instead the nature of the intended output. Another benefit is the shrinking of the code- base necessary to maintain publication solutions. Instead of writing code to deal with every occurrence of TEI elements, one could create a system which examines this documentation and reacts to the models it contains. Indeed, early experiments show that this can be beneficial in simplifying the maintenance of such code (Turska 2017, p. 364). One of the intentions behind the documentation of the processing model, however, is to benefit not just software developers but also editors of scholarly digital editions. The format is designed to be simple enough that an editor could easily change whether the abbreviation or expansion is shown or whether it is highlighted in bold or italics. While editors of digital scholarly editions need not be distracted by how the processing is implemented, if this processing model documentation is being exploited by their publication processing they will enjoy significant benefits if they understand the base format well enough to have control over the presentation. 
One popular area for discussion in explorations of scholarly digital editing is the handling of critical apparatuses of multi-witness texts. These are works that are represented in the edition by multiple documents (extant or theorised) in order to produce a coherent editorial view of the text. This is one area where people sometimes argue against XML as a serialization format. As this is the current format in which TEI is expressed, those making this argument often find themselves arguing against this open, international, community- developed set of recommendations. Instead, other formats are suggested by proponents of one solution or another, mostly based on hiding the serialization format from the user. And yet, once an interface is placed between the editor and the underlying code which represents their decisions, then in many ways it is only the granularity of information and its relationships which matter, not the serialization format. In such a system, though editors are assisted in their work, ‘the tools themselves and their heuristics are not questioned, as long as they do what they are Btold^’ (Pierazzo 2015, p. 109). One of the facile arguments that is often made by those opposed to an XML-based solution is that XML (and thus TEI) is unable to handle overlapping hierarchies. This is, of course, a falsehood. People who perpetuate this myth, however, are usually doing so innocently with a naive understanding of XML as a format assuming all XML representations are created as embedded in-line markup. (Cummings 2018) The crea- tion of markup structures that rely on an embedded hierarchical use of XML is an increasingly dated notion of how XML is used in complex resources such as scholarly digital editions. Increasingly, scholarly digital editions are based on distributed and multi-faceted sets of resources. The idea that all the markup of an edition is embedded within a single hierarchy of XML and encoded inline, while often the tempting when one is creating early digital editions, is now a strawman used for to propose a conflict between overlapping hierarchies. While much ink has been spilt on the merits and shortcomings of the various solutions to the problem, this discussion is pursued primarily by people seeking more elegant solutions for the markup languages of the future. For many projects, the in-built solutions, such as the use of milestones (such as to mark page breaks) where one hierarchy (usually the intellectual) is preferred over another (usually the physical), are sufficient. Swapping between hierarchies displayed as milestones is no longer the complex processing activity that was once imagined. International Journal of Digital Humanities There are numerous other methods for overcoming the supposed limitations of XML without departing from its specification, including out-of-line or standoff markup. It is perfectly reasonable in XML to employ a simple technique of remotely pointing into basic structures (with URI-based pointing or other standoff mechanisms) to provide encoding and annotations which might be at risk of overlapping. This is not a non-XML solution, as it is entirely possible for out-of-line markup to exist as pure XML. For example, the TEI Guidelines provide recommendations for how an element recording an editorial apparatus entry may be stored completely separately from the base text to which it refers. In addition, the apparatus readings may now surround larger structures (such as whole divisions or paragraphs), and not merely phrase-level content. 
With regard to the use of out-of-line markup, it does not matter if the objects being stored out-of-line are variant readings, physical vs intellectual structures of the document, or something else. If the text is encoded to a sufficient degree of granularity, then all of these supposedly conflicting hierarchies can be expressed in separate out-of-line markup that points to the site of overlap. My own stance as a pragmatic digital editor is to encode at an orthographic word level of granularity (whose markup can be added by simple scripts). While this might mean some redundancy when recording sub-word changes, this is balanced by the ease of processing at this level. While out-of-line markup is a very simple and powerful mech- anism that can be used to cut across an infinite number of hierarchies, it does so at the cost of human-readability. The underlying problem, which explains why solutions such as this, which employ out-of-line or standoff markup, are not popularly used by all digital editions, is that of support from tools, not only in the creation of editorial objects and annotations of data, but also in its processing. There are limitations in the creation of markup for scholarly digital editions that may cross the boundaries of common embedded markup structures, but these limitations are the result of a lack of tools with which to create the markup in standoff or out-of-line forms, rather than any particular serialization format. Other proposed formats, such as JSON (a very useful serialization for frontend manipulation) or RDF (a useful graph technology for conceptual annotation), have as many well-understood problems as formats like XML, and in the creation of scholarly digital editions in TEI, they are more accurately understood as generated outputs from the TEI source. Nonetheless, solutions to these problems are not beyond the scope of current technology, but when projects create solutions, the solutions are usually for very specific use-cases rather than generalised applications. When a scholarly digital edition project creates a significantly detailed frontend to hide the encoding structure, it becomes unnecessary to start proposing entirely new data formats and to eschew the vocabularies of existing open international standards. Significant user-friendly technology in this area would benefit the creation of scholarly digital editions no end, especially if these solutions built on the improvements to the recommendations of the TEI, such as the processing model documentation. The TEI framework is a mature, rich, and complex method of documenting our relationships with text (in its many forms). While editors of scholarly digital editions should not be overly distracted by the implementation of underlying technology for the creation and publication of their editions, they should not be dissuaded from using de facto standards, such as the TEI, merely because they do not wish to understand any of the technological background to their editions. Developers who would throw away frameworks like the TEI because they dislike the current serialization format (XML), because of their own technology choices, or want to reinvent the wheel (and do not realise that they can do so within the framework) are short-sighted. 
4 Publishing scholarly digital editions

Even where one is not worried about multiple hierarchies or complex out-of-line markup and is creating an edition which is straightforward, the publishing of a scholarly digital edition is still a needlessly complicated affair. Given the technology that already exists and the solutions which have been reinvented time and time again, it is unconscionable that public research funds are used to produce bespoke publication engines unnecessarily again and again. Slowly, generalised but customisable and detailed publication infrastructures (such as TEI Publisher, http://www.teipublisher.com) are being developed, but they still have a long way to go. It is unusual for an individual to have all the skills necessary to edit a work properly, create an encoded edition, and develop a publication framework. Some of us who have some skills in multiple aspects of these areas are usually less developed in other areas.

While there are scholars who have achieved such an impressive skillset, it also seems evident that they are setting the threshold very high and that it is not likely that this profile will become very common in the foreseeable future, if at all (Pierazzo 2015, p. 115).

The real answer, of course, is that the creation of scholarly editions, whether digital or not, has never been an individual enterprise. Just as an author in the age of incunabula had a sense of printing technology but did not fully understand the techniques printers used, editors should not be distracted by the publication infrastructure for their editions. Usually, the publishers and printers of print editions took on many of the activities that are now cognate to the frontend development and web hosting for digital editions. However, as mentioned earlier, so far no single generalised software for the publication of scholarly digital editions has had mass uptake by the community. And as the research for scholarly digital editions becomes more collaborative (though ignoring the potential of crowdsourcing and citizen science for digital editions), a solution that lowers the bar for the production and publication of digital editions would inherently need to be a collaborative platform. Instead of creating solutions that are individual to any specific project's needs, we need collaboratively to build small modular improvements on top of a generalised infrastructure for the creation, publication, and analysis of scholarly digital editions.

All of these tools, however, act like small unconnected islands. They expect input and output data to match their own data format and data model, both narrowly tailored to their task and following their own idiosyncratic vocabulary (Barabucci and Fischer, p. 48).

What is needed is a generalised infrastructure to which a larger community of scholarly editing projects contribute and which leverages existing technologies for handling scholarly digital editions. This infrastructure should require little or no specialised knowledge for its use by an editor of scholarly editions. Having the requisite skills for work as an academic researcher in a modern digital age should be sufficient to produce a digital edition. Even if the skills of encoding the edition in TEI XML are required by the editor (and my experience in teaching TEI is that this is a basic skill that
Even if the skills of encoding the edition in TEI XML are required by the editor (and my experience in teaching TEI is that this is a basic skill that International Journal of Digital Humanities http://www.teipublisher.com http://www.teipublisher.com all modern editors are more than capable of learning if they honestly have the desire to do so), the additional annotation, text-image interactions, collaboration with colleagues, and publication and interrogation of this data should be done through a standard easy- to-use interface based on the most common open international standards. If digital editing should become the standard practice for preparing editions, digital tools, which are easy to handle and do not require much technical or even programming skills are needed. Moreover, we need useful standardiza- tion processes, which lead to an unhindered and unrestricted usage of digital tools (Speer 2017, p. 199) That no single solution has been widely adopted by a majority of projects is an indication of the disparate nature of the desires of those producing scholarly digital editions, the strength of the ‘not invented here syndrome’, and the limitations of the existing software. But even when the software is available, it often does not meet the needs of those outside the specific project because it was created with very specific and often fragile approaches to the editorial endeavour. Let us state clearly that the described issues are not due to the fact that the implementations of the tools are incomplete. The root cause lies, instead, in the fragile theoretical foundations upon which these tools are built. (Barabucci and Fischer, p. 50) Editors of scholarly digital editions should not be distracted by the lack of single cohesive solutions to the creation, annotation, and publication of digital editions. Instead, a de facto community-based solution should be created to meet their needs. Scholarly digital editions and the solutions that support them must learn from the history of the print edition and fully exploit the digital medium through which they are expressed. 5 Conclusion The use of open international standards for the creation of scholarly digital editions is necessary if the resources spent on them are not to be squandered. The TEI does a good job in being flexible and customisable to individual scholarly digital editing projects. Where feasible, it is better to use this at least as a storage and preservation format than to invent even more standards. The search for better serialization formats and the reinvention of encoding formats, while an important endeavour for markup theorists, is a distraction pragmatic digital editors should ignore. Similarly, the creation of openly available resources is the future for any truly collaborative international research, and editors should adopt common legal solutions, such as creative commons, so as not to be distracted by unnecessary legal intricacies. The publishing of scholarly digital editions and the distractions of concerns about a particular presentation view of the edition should be discarded in favour of the adoption of consistent digital editorial publication methods where feasible. However, more work needs to be undertaken on the produc- tion of generalised software for editing tasks that truly supports the flexibility of out-of- line and standoff markup technologies within existing standards like the TEI. 
However, this work should not be undertaken by scholarly editorial projects, who are the customers in this enterprise. I would contend that large amounts of public funding should not be set aside merely for the open publication of digital editions, as there is no technological barrier to achieving this if a consortium of projects desires to do so. (I would want to see that funding used to create the generalised infrastructure proposed above that such projects would use.) Much as the TEI has become the de facto standard for the data of scholarly digital editions, it is time for software infrastructures to be adopted for a more consistent environment that benefits all. As editors of scholarly digital editions, we need to have some understanding of the mechanisms of the production of our editions without being distracted by the underlying technological issues, unless we are interested in this distraction.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

Barabucci, G., & Fischer, F. (2017). The formalization of textual criticism: Bridging the gap between automated collation and edited critical texts. In P. Boot, A. Cappellotto, W. Dillen, F. Fischer, A. Kelly, A. Mertgens, A.-M. Sichani, E. Spadini, & D. van Hulle (Eds.), Advances in digital scholarly editing: Papers presented at the Dixit conferences in The Hague, Cologne, and Antwerp (pp. 47–54). Leiden: Sidestone Press.

Barabucci, G., Spadini, E., & Turska, M. (2017). Data vs. presentation: What is the core of a scholarly digital edition? In P. Boot, A. Cappellotto, W. Dillen, F. Fischer, A. Kelly, A. Mertgens, A.-M. Sichani, E. Spadini, & D. van Hulle (Eds.), Advances in digital scholarly editing: Papers presented at the Dixit conferences in The Hague, Cologne, and Antwerp (pp. 37–46). Leiden: Sidestone Press.

Ciotti, F. (2017). Towards a new realism for digital textuality. In P. Boot, A. Cappellotto, W. Dillen, F. Fischer, A. Kelly, A. Mertgens, A.-M. Sichani, E. Spadini, & D. van Hulle (Eds.), Advances in digital scholarly editing: Papers presented at the Dixit conferences in The Hague, Cologne, and Antwerp (pp. 85–90). Leiden: Sidestone Press.

Cummings, J. (2008). The text encoding initiative and the study of literature. In S. Schreibman & R. Siemens (Eds.), A companion to digital literary studies (pp. 451–476). Oxford: Blackwell.

Cummings, J. (2014). The compromises and flexibility of TEI customisation. In C. Mills, M. Pidd, & E. Ward (Eds.), Proceedings of the Digital Humanities Congress 2012. Studies in the digital humanities. Sheffield: HRI Online Publications. https://www.dhi.ac.uk/openbook/chapter/dhc2012-cummings. Accessed 13 May 2019.

Cummings, J. (2018). A world of difference: Myths and misconceptions about the TEI. Digital Scholarship in the Humanities.

De Roure, D., & Willcox, P. (2017). Experimental humanities: An adventure with Lovelace and Babbage. In 2017 IEEE 13th International Conference on eScience, 978-1-5386-2686-3/17 (pp. 194–201). https://doi.org/10.1109/eScience.2017.32.

Pierazzo, E. (2015). Digital scholarly editing: Theories, models and methods. Farnham: Ashgate.

Sahle, P. (2016).
Sahle, P. (2016). What is a scholarly digital edition? In M. J. Driscoll & E. Pierazzo (Eds.), Digital scholarly editing: Theories and practices (pp. 19–40). Cambridge: Open Book Publishers. https://www.openbookpublishers.com/htmlreader/978-1-78374-238-7/ch2.xhtml. Accessed 13 May 2019.

Shillingsburg, P. (2017). Enduring distinctions in textual studies. In P. Boot, A. Cappellotto, W. Dillen, F. Fischer, A. Kelly, A. Mertgens, A.-M. Sichani, E. Spadini, & D. van Hulle (Eds.), Advances in digital scholarly editing: Papers presented at the Dixit conferences in The Hague, Cologne, and Antwerp (pp. 187–190). Leiden: Sidestone Press.

Speed Kjeldsen, A. (2017). Reproducible editions. In P. Boot, A. Cappellotto, W. Dillen, F. Fischer, A. Kelly, A. Mertgens, A.-M. Sichani, E. Spadini, & D. van Hulle (Eds.), Advances in digital scholarly editing: Papers presented at the Dixit conferences in The Hague, Cologne, and Antwerp (pp. 135–140). Leiden: Sidestone Press.

Speer, A. (2017). Blind spots of digital editions. In P. Boot, A. Cappellotto, W. Dillen, F. Fischer, A. Kelly, A. Mertgens, A.-M. Sichani, E. Spadini, & D. van Hulle (Eds.), Advances in digital scholarly editing: Papers presented at the Dixit conferences in The Hague, Cologne, and Antwerp (pp. 191–200). Leiden: Sidestone Press.

Turska, M. (2017). TEI simple processing model: An abstraction layer for XML processing. In P. Boot, A. Cappellotto, W. Dillen, F. Fischer, A. Kelly, A. Mertgens, A.-M. Sichani, E. Spadini, & D. van Hulle (Eds.), Advances in digital scholarly editing: Papers presented at the Dixit conferences in The Hague, Cologne, and Antwerp (pp. 361–364). Leiden: Sidestone Press.

Turska, M., Cummings, J., & Rahtz, S. P. Q. (2016). Challenging the myth of presentation in digital editions. jTEI. http://journals.openedition.org/jtei/1453. Accessed 13 May 2019.

Vine, A., & Verweij, S. (2012). Digitizing non-linear texts in TEI P5: The case of the early modern reversed manuscript. In B. Nelson & M. Terras (Eds.), Digitizing medieval and early modern material culture (pp. 113–136). New Technologies in Medieval and Renaissance Studies (Vol. 3). Toronto: Iter.

work_7oth2g2n3reg3n3uv5elarlqdy ----

Partially Automated Method for Localizing Standardized Acupuncture Points on the Heads of Digital Human Models

Research Article. Evidence-Based Complementary and Alternative Medicine, Volume 2015, Article ID 483805, 12 pages. http://dx.doi.org/10.1155/2015/483805

Jungdae Kim1,2 and Dae-In Kang2
1 Nano Primo Research Center, Advanced Institutes of Convergence Technology, Seoul National University, Suwon 443-270, Republic of Korea
2 Pharmacopuncture Medical Research Center, Korean Pharmacopuncture Institute, Seoul 157-801, Republic of Korea
Correspondence should be addressed to Jungdae Kim; tojdkim@gmail.com
Received 27 February 2015; Revised 6 May 2015; Accepted 13 May 2015
Academic Editor: Vitaly Napadow
Copyright © 2015 J. Kim and D.-I. Kang.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Having modernized imaging tools for the precise positioning of acupuncture points on the human body, where this traditional therapeutic method is applied, is essential. For that reason, we suggest a more systematic positioning method that uses X-ray computer tomographic images to precisely position acupoints. Digital Korean human data were obtained to construct three-dimensional head-skin and skull surface models of six individuals. Depending on the method used to pinpoint the positions of the acupoints, every acupoint was classified into one of three types: anatomical points, proportional points, and morphological points. A computational algorithm and procedure were developed for partial automation of the positioning. The anatomical points were selected by using the structural characteristics of the skin surface and skull. The proportional points were calculated from the positions of the anatomical points. The morphological points were also calculated by using some control points related to the connections between the source and the target models. All the acupoints on the heads of the six individuals were displayed on three-dimensional computer graphical image models. This method may be helpful for developing more accurate experimental designs and for providing more quantitative volumetric methods for performing analyses in acupuncture-related research.

1. Introduction

Regardless of regional differences between western and eastern medicine, a large amount of medical data has been accumulated over the centuries in the qualitative forms of texts and pictures [1]. Thanks to fast central processing units and vast sizes of computer memories, a new representation of medical knowledge, especially knowledge concerning human anatomy and function, can now be made [2]. Nowadays, the quest to modernize medicine seems to be more demanding on the eastern side than on the western side [3]. As one of the core therapeutic modalities in many Asian countries, acupuncture is still a subject that is actively being studied with various modernized tools and advanced techniques such as functional magnetic resonance imaging and positron emission tomography [4–6].

In countries such as China, Japan, and Korea, acupuncture has been practiced for more than 2500 years and has now become a global therapeutic method used across the world. Clinical studies have shown promising results for the efficacy of acupuncture in, for example, reducing both postoperative and chemotherapy nausea and vomiting in adults, as well as postoperative dental pain [7]. Although basic research on acupuncture has led to considerable progress over the past decades, its underlying mechanism is still an abstruse subject [8].

Because of the increasing demand for standardization of acupuncture point locations, the World Health Organization Western Pacific Regional Office initiated projects in the early 1990s to reach a consensus on those locations.
The WHO presented the general guidelines for acupuncture point locations in the forms of texts and figures, and it stipulated the methodology for locating acupuncture points on the surface of the human body, as well as the locations of 361 standard acupuncture points. This standard established by the WHO may be applied in teaching, research, clinical service, preparation of publications, and academic exchanges involving acupuncture [9].

In more recent studies on acupoint locations, researchers began to use more advanced equipment such as X-ray machines. As a convenient method for locating acupoints, the cun measurement methods have been widely used in the practice of acupuncture. However, the traditional cun measurement methods have been criticized for their lack of reliability. In one study a comparison of two different location methods was done by using dual-energy X-ray absorptiometry to measure the soft tissue and the bone mass independently [10]. Another study used computed tomography (CT) to provide a metric description of acupuncture points in the lumbar region and to give their relation to individual anatomical landmarks and structures [11]. Another study used X-ray radiography to provide experimental evidence to standardize the location of the acupoint in the hand [12]. Synchrotron radiation phase-contrast X-ray CT was also employed to investigate the three-dimensional (3D) topographic structures of acupuncture points [13]. The results of another study suggested that biomedical information about acupuncture treatment could be visualized in the form of a data-driven 3D acupuncture point system [14].

The accurate localization of acupoints is a key issue in acupuncture research. For more precise scientific research and development on acupuncture therapy, having definitions on how to localize acupoints by using a computer-based pictorial representation of the human body is critical. A method for localizing the acupoints on the head of a virtual body by using segmentation and a 3D visualization of the VOXEL-MAN software system was reported [15]. Our initial research on positioning all the 361 standardized acupuncture points was done with the digital data from a healthy Korean male with a normal body shape [16]. The cross-sectional images generated by X-ray CT were used for generating the 3D virtual models for the bones and the skin's surface of the entire human body. The volumetric 3D acupoint model was developed based on projective 2D descriptions of the standard acupuncture points.

Table 1: Characteristics of six subjects for the 3D digital CT images.

Subject | Gender | Age (years) | Height (cm) | Weight (kg) | BMI (kg/m2) | Category
1 | Male | 26 | 170 | 66 | 22.84 | Normal
2 | Male | 37 | 175 | 59 | 19.27 | Underweight
3 | Male | 39 | 170 | 75 | 27.34 | Obese
4 | Female | 30 | 158 | 55 | 20.20 | Normal
5 | Female | 28 | 161 | 48 | 18.52 | Underweight
6 | Female | 43 | 162 | 64 | 24.01 | Overweight

[Figure 1: Three-dimensional reconstructed surface models of Korean adults with various body types. Panels: obese man, normal man, underweight man, overweight woman, normal woman, underweight woman.]
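For reference, the BMI values in Table 1 follow from the standard definition; for subject 1, for example:

$$\mathrm{BMI} = \frac{\text{weight}\ [\mathrm{kg}]}{(\text{height}\ [\mathrm{m}])^{2}}, \qquad \mathrm{BMI}_{1} = \frac{66}{(1.70)^{2}} \approx 22.84\ \mathrm{kg/m^{2}}.$$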
Table 2: Categorized acupoints on the head. The acronyms for the acupoint names come from the names of meridians: GV (governor vessel), GB (gall bladder), ST (stomach), TE (triple energizer), BL (bladder), CV (conception vessel), LI (large intestine), and SI (small intestine). The values in the parentheses are the total numbers of points in the categories. *These points are not standard acupuncture points but are introduced for convenience.

Category | Acupoints on the head
Anatomical points (34) | GV15, GV16, GV17, GV25, GV26, GV27 (6); GB1, GB2, GB3, GB7, GB8, GB9, GB12, GB20 (8); ST1, ST2, ST3, ST4, ST5, ST6, ST7 (7); TE17, TE20, TE21, TE22, TE23 (5); BL1, BL2 (2); CV23, CV24 (2); LI19, LI20 (2); SI17, SI18 (2); Pupil*, Yintang*, TOP* (3)
Proportional points (24) | GV18, GV19, GV20, GV21, GV22, GV23, GV24 (7); GB13, GB14, GB15, GB16, GB17, GB18, GB19 (7); ST8 (1); BL3, BL4, BL5, BL6, BL7, BL8, BL9, BL10 (8); SI19 (1)
Morphological points (7) | GB4, GB5, GB6, GB10, GB11 (5); TE18, TE19 (2)

According to the general guidelines suggested by the WHO in the Western Pacific Region, three methods are used for determining acupuncture point locations: the anatomical landmark method, the proportional bone (skeletal) measurement method, and the finger-cun measurement method [7]. The anatomical landmark method utilizes some characteristics on the surface of the body that may be fixed or movable, such as protuberances or depressions formed by joints and muscles. The proportional bone (skeletal) measurement method also uses landmarks on the body's surface, that is, primarily joints, to measure the lengths and the widths of various parts of the body. The finger-cun measurement method refers to the proportional measurement method for locating acupuncture points based on the size of the fingers of the person to be measured. In some cases, these methods are used together for complementary decisions on the same acupoint.

[Figure 2: Coordinate systems based on the CT images of the head. For convenient numerical calculations, the x-, y-, and z-axes were chosen to access every voxel from the head skin and skeleton; the stack corners are at (0, 0, 0), (511, 0, 0), (0, 511, 0), (0, 0, 300), and (511, 511, 300), with the axes oriented front/back, left/right, and up/down. The model frame (x', y', z') was introduced by translating (x, y, z) to the center of mass (CMx, CMy, CMz) of the model and scaling down.]

In this study we developed systematic procedures and algorithms for positioning acupoints on the heads of six individuals. All the acupoints on the head were categorized into one of three types: anatomical points, proportional points, and morphological points. The anatomical acupoints are determined by using corresponding anatomical characteristics from the skin's surface and bone structure. The proportional acupoints are calculated by using prescribed proportional numbers between the anatomical points. Some remaining acupoints have only descriptive definitions for their positions, in which case a morphological technique relating the individuals to a standard model is introduced. The traditional methods for acupoint localization are based on measurements along the surface of the body with cun. Here, we do not use cun units, which may vary in length for different parts of the body. One of the basic assumptions for this study of the head is that the shape of the human head is approximately spherical.

2. Materials and Methods
2.1. Digital Korean Human Data. Digital Korean CT data for six human beings were obtained from the Korean Institute of Science and Technology Information (KISTI; http://dk.kisti.re.kr/). The CT images of the entire body were taken from three men and three women whose body sizes were measured. Based on the body mass index (BMI), the six individuals were categorized as underweight, normal, overweight, or obese, as listed in Table 1. The genders, ages, heights, and weights of the six individuals are shown in Table 1. A stack of CT images with 512 × 512 pixel resolution was taken from head to toe in 1 mm depth intervals, and the results were saved in the DICOM format.

2.2. Procedure for Surface Reconstruction of 3D Human Models. The procedure for reconstructing the surfaces of the skin and the skull from the stack of 2D images of the six individuals is basically the same as that used in [16] for a single person with a normal body shape. In brief, first, the DICOM images are converted to 8-bit BMP format because of memory issues. Then, the images for the surfaces of the skin and bone were put into binary format by using the proper threshold values for the pixel intensity, 10 for the skin and 110 for the skull. Unnecessary holes were filled by using a 3D binary dilation subroutine, and boundaries were extracted from the objects in the binary images. The boundary for the skin's surface or for the bone's surface was isolated, and a 3D dataset was made for the skin's surface or for the bone's surface by using the Marching Cube algorithm through the boundary. Actually, the Marching Cube algorithm can be applied to the 8-bit images directly. However, a procedure for binarizing and deleting unnecessary parts is essential in order to obtain model data that are more compact. The 3D image of the surface can be made smoother by averaging the normal vectors at every vertex during the triangulation, and OpenGL software (https://www.opengl.org/) was used to present the 3D surface models, as shown in Figure 1, for the skin's surface. The two surface models, one from the skin and the other from the skull of an individual, can be combined in such a way that their centers of mass coincide. The skull can be seen through the skin by computer-graphically giving it some opacity.
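A minimal sketch of this reconstruction pipeline, with NumPy, SciPy, and scikit-image as stand-ins for the tools actually used in the study (the thresholds are the values quoted above; the array layout is illustrative):

```python
import numpy as np
from scipy import ndimage
from skimage import measure

def reconstruct_surface(volume, threshold):
    """volume: 3D uint8 array stacked from the 2D slices.
    Returns a triangulated isosurface (vertices, faces, vertex normals)."""
    binary = volume >= threshold                 # binarize at the chosen intensity
    binary = ndimage.binary_dilation(binary)     # close small gaps in the mask
    binary = ndimage.binary_fill_holes(binary)   # fill unnecessary holes
    # Marching cubes extracts the boundary surface from the binary mask;
    # the normals returned by scikit-image are already vertex-averaged,
    # which gives the smoothing effect described above.
    verts, faces, normals, _ = measure.marching_cubes(
        binary.astype(np.float32), level=0.5)
    return verts, faces, normals

# volume = ...  # 8-bit stack read from the converted slices
# skin  = reconstruct_surface(volume, threshold=10)
# skull = reconstruct_surface(volume, threshold=110)
```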
Table 3: Descriptions of the locations of the anatomical acupuncture points and landmarks. *These landmarks are additionally positioned for localization of the acupoints.

Acupoints/landmarks | Surface models | Description of the position for the landmark
GV15 | Bone | In the depression superior to the spinous process of the second cervical vertebra (C2) on the posterior median line
GV16 | Bone | Directly inferior to the external occipital protuberance
GV17 | Bone | External occipital protuberance
GV25 | Skin | Tip of the nose
GV26 | Skin | Midpoint of the philtrum midline
GV27 | Skin | Midpoint of the tubercle of the upper lip
GB1 | Skin and bone | Outer canthus of the eye
GB2 | Bone | Depression between the intertragic notch and the condylar process of the mandible
GB3 | Bone | Depression superior to the midpoint of the zygomatic arch
GB7 | Skin | Junction of the vertical line of the posterior border of the temple hairline and the horizontal line of the apex of the auricle
GB8 | Skin | Directly superior to the auricular apex
GB9 | Skin | Directly superior to the posterior border of the auricular root
GB12 | Bone | Depression posteroinferior to the mastoid process
GB20 | Bone | Inferior to the occipital bone, in the depression between the origins of the sternocleidomastoid and the trapezius muscles
ST1 | Skin and bone | Between the eyeball and the infraorbital margin, directly inferior to the pupil
ST2 | Bone | In the infraorbital foramen
ST3 | Skin | Directly inferior to the pupil, at the same level as the inferior border of the ala of the nose
ST4 | Skin | The angle of the mouth
ST5 | Bone | Anterior to the angle of the mandible, in the depression anterior to the masseter attachment
ST6 | Bone | Angle of the mandible
ST7 | Bone | Depression between the midpoint of the inferior border of the zygomatic arch and the mandibular notch
TE17 | Skin | Posterior to the ear lobe, in the depression anterior to the inferior end of the mastoid process
TE20 | Skin | Auricular apex
TE21 | Skin and bone | In the depression between the supratragic notch and the condylar process of the mandible
TE22 | Skin | Anterior to the auricular root, posterior to the superficial temporal artery
TE23 | Skin | In the depression at the lateral end of the eyebrow (it is superior to GB1)
BL1 | Skin and bone | In the depression between the superomedial parts of the inner canthus of the eye and the medial wall of the orbit
BL2 | Skin | In the depression at the medial end of the eyebrow
CV23 | Bone | In the anterior region of the neck, superior to the superior border of the thyroid cartilage, in the depression superior to the hyoid bone, on the anterior median line
CV24 | Skin | In the depression in the center of the mentolabial sulcus
LI19 | Skin | At the same level as the midpoint of the philtrum
LI20 | Skin | In the nasolabial sulcus, at the same level as the midpoint of the lateral border of the ala of the nose
SI17 | Bone | Posterior to the angle of the mandible, in the depression anterior to the sternocleidomastoid muscle
SI18 | Skin and bone | Inferior to the zygomatic bone, in the depression directly inferior to the outer canthus of the eye
Pupil* | Skin | Center line of the pupil
Yintang* | Skin | Midpoint between the eyebrows
TOP* | Skin | Top of the head

[Figure 3: The baselines or planes for the head are determined by using four landmarks: Yintang and GV17 (the red line, the median plane) and TE20 and TOP (the green line, the frontal plane). A transverse plane (the blue line) is determined by using a plane orthogonal to the imaginary line connecting the center of the left and the right TE20 points to the TOP. The four landmarks can be positioned by using their anatomical descriptions on the skin and the skeleton. Yintang is the point between the eyebrows, and TOP is the highest point of the head. The acupoint GV17 is in the depression superior to the external occipital protuberance, and the acupoint TE20 is just superior to the auricular apex on both the left and the right sides.]

[Figure 4: Some proportional acupoints from GV18 to GV24 on the midsagittal plane that were obtained by dividing the angle between GV17 and Yintang (vertex TE20CP) by appropriate proportionality constants (increments 1.5, 1.5, 1.5, 1.5, 1.5, 1.0, 0.5, and 3.5 out of a total of 12.5) based on the guidelines of the WHO for standard acupuncture positioning. GV25, GV26, and GV27 are also marked.]

[Figure 5: (a) Morphological acupoints on the right side of the head (ST8, GB4, GB5, GB6, GB7; GB9, GB10, GB11, GB12; TE20, TE17, TE18, TE19, with spacings of 1/4 and 1/3 along the curves). (b) The distance formula for the unknown position X, with the known positions P′ and Q′ from the target model and P, Q, and R from the standard source model: L ≡ ‖(P′ − X) − (P − R)‖ + ‖(Q′ − X) − (Q − R)‖. The position X is adjusted in such a way that the distance L is minimized.]
2.3. Definition for the Reference Frames and the Anatomical Planes. If any point in the 3D digital models is to be described, proper reference frames should be introduced. A natural frame, called the stack frame (x, y, z), can be fixed by using the three indices (i, j, k) along the width, length, and height of the stack of 2D images, as shown in Figure 2. For a more model-dependent frame, the model frame (x', y', z') can be obtained by translating and scaling the stack frame as follows:

$$(x', y', z') = \alpha\,(x - \mathrm{CM}_x,\; y - \mathrm{CM}_y,\; z - \mathrm{CM}_z), \quad (1)$$

where the center of mass for a given stack of binary images $I^{\mathrm{bin}}_{ijk}$ is

$$\mathrm{CM}_x = \frac{\sum_{i,j,k} I^{\mathrm{bin}}_{ijk}\, i}{\sum_{i,j,k} I^{\mathrm{bin}}_{ijk}}, \qquad \mathrm{CM}_y = \frac{\sum_{i,j,k} I^{\mathrm{bin}}_{ijk}\, j}{\sum_{i,j,k} I^{\mathrm{bin}}_{ijk}}, \qquad \mathrm{CM}_z = \frac{\sum_{i,j,k} I^{\mathrm{bin}}_{ijk}\, k}{\sum_{i,j,k} I^{\mathrm{bin}}_{ijk}},$$

and we have set $\alpha = 4/512$, for convenience.

The anatomical planes for the head can be defined by choosing some obvious points anatomically. The sagittal plane, that is, the midsagittal plane or median plane more precisely, is determined by fixing three points in the head: Yintang (midpoint between the eyebrows), TOP (top of the head), and the acupoint GV17 (external occipital protuberance in the back of the head), as shown in Figure 3. Any point, P, which resides in the sagittal plane should satisfy the following equation:

$$[(\mathrm{GV17} - \mathrm{Yintang}) \times (\mathrm{TOP} - \mathrm{Yintang})] \cdot (P - \mathrm{Yintang}) = 0, \quad (2)$$

where all the points in the equation are considered as vectors with triplet values corresponding to the vector's three components and the cross and the dot in the equation mean the cross product and the scalar product, respectively. A coronal plane in the head can be defined by choosing three points on the head: the two acupoints TE20 at the auricular apex just above the left and the right ears and the TOP. Any point, P, in the coronal plane should satisfy the condition

$$[(\mathrm{TE20}_{\mathrm{left}} - \mathrm{TOP}) \times (\mathrm{TE20}_{\mathrm{right}} - \mathrm{TOP})] \cdot (P - \mathrm{TOP}) = 0. \quad (3)$$

The transverse plane at any level can be defined in such a way that the plane is perpendicular to the common axis that resides in the sagittal and the coronal planes simultaneously.
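These constructions reduce to a few vector operations; a compact NumPy sketch follows (an illustration, not the code used in the study; landmark coordinates are assumed to be given as 3-vectors in the stack frame):

```python
import numpy as np

def center_of_mass(binary):
    """Normalized center of mass of a binary voxel stack I_ijk, as in Eq. (1)."""
    idx = np.argwhere(binary)        # voxel indices (i, j, k) where I = 1
    return idx.mean(axis=0)

def to_model_frame(p, cm, alpha=4/512):
    """Translate to the center of mass and scale down, Eq. (1)."""
    return alpha * (np.asarray(p, dtype=float) - cm)

def plane_offset(p, a, b, c):
    """Signed offset of p from the plane through a, b, c; zero on the plane.
    With (a, b, c) = (Yintang, GV17, TOP) this is the left side of Eq. (2);
    with (a, b, c) = (TOP, TE20_left, TE20_right) it is the left side of Eq. (3)."""
    normal = np.cross(b - a, c - a)
    return np.dot(normal, p - a)
```

In practice a voxel is accepted as lying in a plane when the offset is zero up to a small tolerance, since the surface is discrete.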
2.4. Categorization for Acupuncture Points. All the acupoints on the head can be categorized into one of three types depending on how they are located, as shown in Table 2. In addition to the standard acupuncture points, three more points, Pupil, Yintang, and TOP, are introduced for positioning the acupoints. The anatomical acupoints have anatomical descriptions that are more or less clear based on the structures of the skin or the skull.

[Figure 6: Various views of the 3D models of an (a) obese, a (b) normal, and an (c) underweight man with positioned acupoints on their heads.]

Brief descriptions of the acupoints on the head are presented in Table 3. The corresponding surface models for their positioning are also given in the table. For the proportional acupoints, we first introduce the midpoint between the left and the right TE20 as follows:

$$\mathrm{TE20CP} \equiv \frac{\mathrm{TE20}_{\mathrm{left}} + \mathrm{TE20}_{\mathrm{right}}}{2}. \quad (4)$$

This imaginary point is introduced just for arithmetic convenience. In order to obtain the conditions for positioning the proportional acupoints, let us define the angle $\theta_0$ made by the three points Yintang, TE20CP, and GV17, with the vertex point being TE20CP, as follows:

$$\theta_0 = \cos^{-1} \frac{(\mathrm{Yintang} - \mathrm{TE20CP}) \cdot (\mathrm{GV17} - \mathrm{TE20CP})}{|\mathrm{Yintang} - \mathrm{TE20CP}|\,|\mathrm{GV17} - \mathrm{TE20CP}|}. \quad (5)$$

We can also calculate two angles depending on the point P, which is one of the voxels for the surface of the skin on the head. If the point P is on the skin's surface in the sagittal plane, we define the angle as follows:

$$\theta_1 = \cos^{-1} \frac{(P - \mathrm{TE20CP}) \cdot (\mathrm{GV17} - \mathrm{TE20CP})}{|P - \mathrm{TE20CP}|\,|\mathrm{GV17} - \mathrm{TE20CP}|}. \quad (6)$$

If the point P is on the skin's surface in the transverse plane, we define the angle as follows:

$$\theta_2 = \cos^{-1} \frac{(P - \mathrm{TE20CP}) \cdot (\mathrm{GV24} - \mathrm{TE20CP})}{|P - \mathrm{TE20CP}|\,|\mathrm{GV24} - \mathrm{TE20CP}|}. \quad (7)$$

The conditions for the proportional acupoints are displayed in Table 4, and the pictorial descriptions of the conditions are shown in Figure 4.

According to the standard acupuncture positioning in [9], some acupoints are described on the curved lines between two acupoints, as shown in Figure 5(a) and Table 5. For example, GB4, GB5, and GB6 are positioned along the curved line between ST8 and GB7. In this case, we use the morphological method for positioning the acupoints GB4, GB5, and GB6 by using the previously known points ST8 and GB7 as two control points. Precise information about positioning the morphological acupoints should be prepared from a "standard" source. We took that "standard" source to be the 3D model of the normal man, as displayed in Figure 5(a), and we obtained the positions of the anatomical and proportional acupoints in advance. Figure 5(b) shows that the target point X between the two control points P′ and Q′ can be calculated in such a way that the distance L = ‖(P′ − X) − (P − R)‖ + ‖(Q′ − X) − (Q − R)‖, which compares the offsets of the two control points in the target model with their offsets in the standard source, is minimized.

[Figure 7: Morphological acupoints and the lines connecting the two control points through them. The upper left corner shows the morphological acupoints on the head of a normal man; they exactly match the same positionings on our "standard" source model for the morphological technique. In the other models, the shapes of the curves are shown to be slightly different, depending on the individuals.]
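A minimal sketch of this minimization, restricting the candidate positions X to the vertices of the target skin surface (a brute-force search; the variable names follow Figure 5, and the surface-vertex restriction is our assumption):

```python
import numpy as np

def morphological_point(P, Q, R, P_t, Q_t, skin_vertices):
    """P, Q: control points and R: the acupoint on the standard source model;
    P_t, Q_t: the corresponding control points on the target model.
    Returns the skin vertex X minimizing
    L = ||(P_t - X) - (P - R)|| + ||(Q_t - X) - (Q - R)||."""
    X = np.asarray(skin_vertices, dtype=float)   # (N, 3) candidate positions
    L = (np.linalg.norm((P_t - X) - (P - R), axis=1)
         + np.linalg.norm((Q_t - X) - (Q - R), axis=1))
    return X[np.argmin(L)]
```

Note that when the target model coincides with the source (P_t = P and Q_t = Q), L reduces to 2‖X − R‖ and is minimized at X = R, which is exactly the consistency check on the normal man reported in Section 3.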
3. Results

All the acupoints on the head above the neck were calculated following the above procedure. We positioned the acupoints on the head models taken from the 3D digital CT images. Figures 6(a), 6(b), and 6(c) show various views of the heads of the obese, the normal, and the underweight man, respectively. The total of 65 standard acupoints on the head could be classified into 34 anatomical points, 24 proportional points, and 7 morphological points. The positions of the anatomical acupoints were described, and those descriptions are summarized in Table 3. Based on the descriptions, the anatomical acupoints were positioned manually, and their positions were saved for further calculations on positioning the proportional and the morphological acupoints.

The analytic and pictorial descriptions of the proportional acupoints are presented in Table 4 and Figure 4, respectively. If we assume that the shape of the surface of the head is spherical, the distance between consecutive acupoints in the midsagittal plane can be approximately determined by dividing the angle between the two points at a vertex into smaller angles. The acupoints from GV17 to GV24, for example, are positioned by dividing the angle between GV17 and Yintang, with the vertex TE20CP, into smaller subangles with proportional values, as shown in Figure 4. In the numerical code for positioning the proportional acupoints, proper small intervals are introduced for every condition in Table 4. If a pixel point on the skin's surface satisfied the condition within the interval, that pixel point was included, and an average point, which was taken to be the position for the proportional acupoint, was obtained. Traditionally, all the acupoints are classified according to the meridians to which they belong.

Table 4: Proportional acupoints. The left and the right sides of the acupoints are determined by the positions of P, depending on whether the acupoints are located on the left or the right side of the yz plane. †The point AUX is introduced for arithmetic convenience.

Conditions | Calculated points
θ1/θ0 = 1.5/12.5 | P = GV18
θ1/θ0 = 3.0/12.5 | P = GV19
θ1/θ0 = 4.5/12.5 | P = GV20
θ1/θ0 = 6.0/12.5 | P = GV21
θ1/θ0 = 7.5/12.5 | P = GV22
θ1/θ0 = 8.5/12.5 | P = GV23
θ1/θ0 = 9.0/12.5 | P = GV24
θ1/θ0 = 11.0/12.5 | P = AUX†
θ2/θ0 = 2.25/12.5 | P = ST8
θ2/θ0 = 1.5/12.5 | P = GB13
Px = Pupilx, Pz = AUXz | P = GB14
θ2/θ0 = 0.25/12.5 | P = BL3
θ2/θ0 = 0.5/12.5 | P = BL4
Px = Pupilx, Pz = GV24z | P = GB15
Px = Pupilx, Pz = GV23z | P = GB16
Px = Pupilx, Pz = (1/2)GV22z + (1/2)GV23z | P = GB17
Px = BL4x, Pz = GV23z | P = BL5
Px = BL4x, Py = (1/3)GV21y + (2/3)GV22y | P = BL6
Px = BL4x, Py = (1/3)GV20y + (2/3)GV21y | P = BL7
Px = BL4x, Py = (1/3)GV19y + (2/3)GV20y | P = BL8
Px = GV17x − (1.3/1.5)(BL4x − GV24x), Pz = GV17z | P = BL9
Px = GB17x, Pz = BL7z | P = GB18
Px = GB20x, Pz = GV17z | P = GB19
Px = (1/2)(GV16x + GB20x), Pz = GV15z | P = BL10
P = (1/2)(TE21 + GB2) | P = SI19

The results for the morphological acupoints for the six subjects are shown in Figure 7. The figures are displayed for the right sides of the heads, and the three lines in each head connect the two control points with the morphological acupoints along the curves. The morphological acupoints GB4, GB5, and GB6, for example, reside on the curve connecting the control points of ST8 and GB7. Our computer code for the morphological technique has been checked to start positioning the morphological acupoints with the model of the normal man (the panel in the upper left corner in Figure 7). By adopting the model of the normal man as the standard source model, we confirmed that the source points precisely match the target points in the case of the normal man. The same code was applied to the other models to obtain the morphological acupoints. Figure 7 shows lines with slightly different shapes that depend on the surface structures of the heads of the other two men (the obese man and the underweight man) and those of the three women (the overweight woman, the normal woman, and the underweight woman).
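The interval-based selection described above can be sketched as follows (a hedged illustration, not the study's code; the tolerance eps is an assumed value, and theta0 is precomputed from Eq. (5)):

```python
import numpy as np

def proportional_point(skin_pts, te20cp, gv17, theta0, ratio, eps=0.005):
    """skin_pts: (N, 3) array of skin-surface voxels in the relevant plane.
    Returns the average of the voxels whose angle ratio theta1/theta0 lies
    within a small interval around the prescribed value (e.g. 1.5/12.5)."""
    skin_pts = np.asarray(skin_pts, dtype=float)
    u = skin_pts - te20cp                          # P - TE20CP for every voxel
    v = gv17 - te20cp                              # GV17 - TE20CP
    cos = (u @ v) / (np.linalg.norm(u, axis=1) * np.linalg.norm(v))
    theta1 = np.arccos(np.clip(cos, -1.0, 1.0))    # Eq. (6)
    hits = skin_pts[np.abs(theta1 / theta0 - ratio) < eps]
    return hits.mean(axis=0)                       # average of qualifying voxels
```

The same routine serves for the θ2 conditions by replacing GV17 with GV24, per Eq. (7).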
4. Discussion

Recently, considerable efforts have been made to understand the characteristics of acupuncture points [17, 18]. In our previous study, acupuncture point positioning for the entire body of a single person was done by using the 3D digital model of the normal man. The reference points or the landmarks were positioned based on the standard descriptions of the acupoints, and the formulae for the proportionalities between the acupoints and the reference points were presented everywhere on the body. We found that 37% of the 361 standardized acupoints on the entire body were automatically linked to the reference points. The reference points accounted for 11% of the 361 acupoints, and the remaining acupoints (52%) were positioned point-by-point by using 3D computer graphics libraries [16]. In this study, we increased the number of subjects to six and confined the positioning area to their heads. For the locations of some acupoints, we used the morphological technique in which topographical constraints between acupoints were considered among the individuals. Many morphological approaches to handling medical image data have been developed for use in basic research, as well as in clinical studies, in various fields [19–22].

Currently, substantial interrater variability in acupoint location and intrarater variability within the clinical setting exists because the optimal location of the point therapeutically may deviate from the location of the standard textbook point and may vary from treatment to treatment. That is, in addition to the anatomical, proportional, and morphological considerations for locating a point, a clinical palpatory consideration may also exist, which may involve modifying the textbook location of the point based upon the texture or the tension of the tissue as detected by the practitioner or based upon the "Ashi" tenderness felt by the patient. Our approach can also accommodate the localization of any additional points on the body that might be clinically relevant for treatment.

The traditional methods for acupoint localization are based on measurements along the surface of the body with cun, the traditional Chinese measure, which varies in length for different parts of the body and for different directions of the longitude (vertical) and latitude (horizontal). In a contrasting approach to this work, the authors of [15] introduced three projection planes on the head for a 2D acupoint description system. For a definite viewing direction, all visible points on the body's surface have one projection plane that is perpendicular to that viewing direction, and the visible points have corresponding projection points in that projection plane. Each projection point may be back-projected to the body's surface to obtain the corresponding 3D coordinates. This projection matching between the 2D and the 3D images is necessary to satisfy the traditional descriptions on positioning acupoints.
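A hedged sketch of such a projection and back-projection step, as we read the description in [15] (this is our paraphrase under an orthographic-projection assumption, not that system's actual code):

```python
import numpy as np

def project(points, d):
    """Orthographic projection of surface points onto the plane through the
    origin perpendicular to the viewing direction d; also returns depths."""
    d = d / np.linalg.norm(d)
    depth = points @ d                        # signed distance along d
    return points - np.outer(depth, d), depth

def back_project(q, points, d, tol=1.0):
    """Return the visible surface point whose planar projection is closest
    to the projection-plane point q. Among candidates within tol, take the
    one with minimum depth along d (closest to the viewer, assuming d points
    from the viewer into the scene); assumes at least one candidate exists."""
    proj, depth = project(points, d)
    cand = np.where(np.linalg.norm(proj - q, axis=1) < tol)[0]
    return points[cand[np.argmin(depth[cand])]]
```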
Table 5: Seven acupoints categorized as morphological points with the two corresponding control points. The values in the parentheses are the "standard positions" of the acupoints on the left and the right sides of the head taken from the 3D digital model of the normal man. These values were used as the standard for calculating the morphological acupoints in the other models.

Acupoints (Left; Right) | Control points (Left; Right)
GB4: (−0.539, 0.274, −0.912); (0.560, 0.274, −0.912) | ST8: (−0.447, 0.278, −1.026); (0.457, 0.278, −1.026)
GB5: (−0.610, 0.272, −0.759); (0.628, 0.272, −0.759) | GB7: (−0.641, 0.155, −0.474); (0.651, 0.155, −0.474)
GB6: (−0.639, 0.239, −0.594); (0.655, 0.239, −0.594) |
GB10: (−0.584, −0.266, −0.467); (0.541, −0.266, −0.467) | GB9: (−0.641, −0.089, −0.714); (0.613, −0.089, −0.714)
GB11: (−0.556, −0.266, −0.220); (0.541, −0.266, −0.220) | GB12: (−0.529, −0.138, 0.027); (0.529, −0.138, 0.027)
TE18: (−0.555, −0.159, −0.128); (0.555, −0.134, −0.128) | TE17: (−0.546, −0.004, 0.027); (0.566, −0.004, 0.027)
TE19: (−0.604, −0.159, −0.284); (0.585, −0.159, −0.284) | TE20: (−0.628, −0.004, −0.459); (0.623, −0.004, −0.459)

In conclusion, we propose a partially automated method and procedure for localizing the standardized acupuncture points on the heads of 3D digital CT human models taken from six living individuals. In the future, we expect to be able to develop for practical clinical use a fully automated method for positioning acupuncture points on the entire body of an individual.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors used the Digital Korean data that were produced and distributed by the Catholic Institute for Applied Anatomy, College of Medicine, Catholic University of Korea, and the Korean Institute of Science and Technology Information.

References

[1] K. H. Höhne, B. Pflesser, A. Pommert, M. Riemer, R. Schubert, and U. Tiede, "A new representation of knowledge concerning human anatomy and function," Nature Medicine, vol. 1, no. 6, pp. 506–511, 1995.
[2] A. Pommert, K. H. Höhne, B. Pflesser et al., "Creating a high-resolution spatial/symbolic model of the inner organs based on the visible human," Medical Image Analysis, vol. 5, no. 3, pp. 221–228, 2001.
[3] Q. Xu, R. Bauer, B. M. Hendry et al., "The quest for modernisation of traditional Chinese medicine," BMC Complementary and Alternative Medicine, vol. 13, article 132, 2013.
[4] H. Liu, X. Shen, H. Tang, J. Li, T. Xiang, and W. Yu, "Using MicroPET imaging in quantitative verification of the acupuncture effect in ischemia stroke treatment," Scientific Reports, vol. 3, article 1070, 2013.
[5] H. Liu, J.-Y. Xu, L. Li, B.-C. Shan, B.-B. Nie, and J.-Q. Xue, "FMRI evidence of acupoints specificity in two adjacent acupoints," Evidence-Based Complementary and Alternative Medicine, vol. 2013, Article ID 932581, 5 pages, 2013.
[6] V. Napadow, N. Makris, J. Liu, N. W. Kettner, K. K. Kwong, and K. K. S. Hui, "Effects of electroacupuncture versus manual acupuncture on the human brain as measured by fMRI," Human Brain Mapping, vol. 24, no. 3, pp. 193–205, 2005.
[7] National Institutes of Health (NIH), "Consensus development panel on acupuncture—NIH consensus conference," The Journal of the American Medical Association, vol. 280, no. 17, pp. 1518–1524, 1998.
[8] V. Napadow, A. Ahn, J. Longhurst et al., "The status and future of acupuncture mechanism research," Journal of Alternative and Complementary Medicine, vol. 14, no. 7, pp. 861–869, 2008.
[9] WHO, "World Health Organization (WHO) standard acupuncture point location," in WHO Standard Acupuncture Point Locations in the Western Pacific Region, pp. 1–14, WHO Western Pacific Region, Geneva, Switzerland, 2008.
[10] D. H. W. Groenemeyer, L. Zhang, S. Schirp, and J. Baier, "Localization of acupuncture points BL25 and BL26 using computed tomography," Journal of Alternative and Complementary Medicine, vol. 15, no. 12, pp. 1285–1291, 2009.
[11] H.-J. Park, Y. Chae, M.-Y. Song et al., "A comparison between directional and proportional methods in locating acupuncture points using dual-energy X-ray absorptiometry in Korean women," American Journal of Chinese Medicine, vol. 34, no. 5, pp. 749–757, 2006.
[12] D. Zhang, X. Yan, X. Zhang et al., "Synchrotron radiation phase-contrast X-ray CT imaging of acupuncture points," Analytical and Bioanalytical Chemistry, vol. 401, no. 3, pp. 803–808, 2011.
[13] S. Koo, S. Kim, Y. Kim, S. Kang, and S. Choi, "Measuring the location of PC8 acupuncture point using X-ray radiography in healthy adults," Korean Journal of Oriental Medicine, vol. 16, pp. 123–126, 2010 (Korean).
[14] I.-S. Lee, S.-H. Lee, S.-Y. Kim, H. Lee, H.-J. Park, and Y. Chae, "Visualization of the meridian system based on biomedical information about acupuncture treatment," Evidence-Based Complementary and Alternative Medicine, vol. 2013, Article ID 872142, 5 pages, 2013.
[15] L. Zheng, B. Qin, T. Zhuang, U. Tiede, and K. H. Höhne, "Localization of acupoints on a head based on a 3D virtual body," Image and Vision Computing, vol. 23, no. 1, pp. 1–9, 2005.
[16] J. Kim and D.-I. Kang, "Positioning standardized acupuncture points on the whole body based on X-ray computed tomography images," Medical Acupuncture, vol. 26, no. 1, pp. 40–49, 2014.
[17] J. Kim, K.-H. Bae, K.-S. Hong, S.-C. Han, and K.-S. Soh, "Magnetic resonance imaging and acupuncture: a feasibility study on the migration of tracers after injection at acupoints of small animals," Journal of Acupuncture and Meridian Studies, vol. 2, no. 2, pp. 152–158, 2009.
[18] J. Kim, D.-I. Kang, K.-S. Soh, and S. Kim, "Analysis on postmortem tissues at acupuncture points in the image datasets of visible human project," Journal of Alternative and Complementary Medicine, vol. 18, no. 2, pp. 120–129, 2012.
[19] B. Golosio, A. Brunetti, and S. R. Amendolia, "A novel morphological approach to volume extraction in 3D tomography," Computer Physics Communications, vol. 141, no. 2, pp. 217–224, 2001.
[20] T. L. Weng, S. J. Lin, W. Y. Chang, and Y. N. Sun, "Voxel-based texture mapping for medical data," Computerized Medical Imaging and Graphics, vol. 26, no. 6, pp. 445–452, 2002.
[21] E. Stindel, J. L. Briard, P. Merloz et al., "Bone morphing: 3D morphological data for total knee arthroplasty," Computer Aided Surgery, vol. 7, no. 3, pp. 156–168, 2002.
[22] D. C. Barber and D. R. Hose, "Automatic segmentation of medical images using image registration: diagnostic and simulation applications," Journal of Medical Engineering & Technology, vol. 29, no. 2, pp. 53–63, 2005.

work_7r4tgxtkejf7dh3ricgg3ifcsi ----

América Crítica 4(2), 113–122, 2020, ISSN: 2532-6724, https://doi.org/10.13125/americacritica/4516

Digital Humanities at CUNY. Building Communities of Practice in the Public University
Stefano Morello1
1 The Graduate Center, CUNY, United States
Contact data: Stefano Morello, s.morello@me.com
Received: 10/04/2020; Accepted: 13/11/2020

Abstract—In this essay, I reflect on my experience working in the field of Digital Humanities at The Graduate Center (GC) of the City University of New York (CUNY) to refute the misconception that the point of intersection of humanities and computation is dependent on robust technological infrastructure and, therefore, outside of the reach of underfunded public institutions. On the contrary, my tenure as a GC Digital Fellow suggests that the development of DH communities of practice can be an especially valuable asset for public universities, due to the waterfall effect they can produce for both the academic and the local community. Finally, I present evidence of second and third-order effects of the GC's institutional DH culture by briefly introducing two projects developed at CUNY that both rely on and engage critically with technology: the CUNY Distance Learning Archive (CDLA), a GC class project, and QC Voices, a structured initiative established at one of the four-year CUNY colleges. — Digital humanities, digital praxis, critical university studies, community of practice, American studies.

INTRODUCTION

At a recent open house event for the PhD Program in English at The Graduate Center (GC) of the City University of New York (CUNY), a faculty member sketched a parallel between the graduate student experience and the quest of the protagonist of P. D. Eastman's children's book Are You My Mother? Born in an empty nest, Eastman's hatchling bird embarks on a journey to find his missing genitor. The search brings the baby bird to ask a number of animals and animated objects if they are his mother. The hatchling's quest resonates with that of a graduate student, the then-Deputy Executive Officer of the program noted: bouncing between disciplinary homes, methodologies, formal and informal mentors, and para-curricular activities, until they find their figurative nurturers and with them, their academic homes. The metaphor immediately resonated with me.
While my commitment to American Studies has been consistent throughout my – yet short – academic career, both the inherently speculative nature of scholarly research and the interdisciplinary anatomy of my work have pulled me in manifold directions during my time as a Ph.D. student. In addition to genuine intellectual curiosity and the need to overcome theoretical or practical research challenges, what further prompts graduate students to pose the proverbial "are you my mother?" question to different actors, methodologies, and disciplines, are the unstable nature of the job market that increasingly requires applicants to be fluent in multiple fields and disciplinary areas, and a desire for community in a context of ever-growing academic alienation.

Since the early stages of one's graduate career at the GC, students, especially those willing to break out of their disciplinary bubbles, are typically exposed to more opportunities than they can chew on. In the fall of 2015, when I began my PhD program, I was introduced, through a number of orientations that kicked off the academic year, to the manifold formal and informal resources the GC offers its students. Such initiatives included student and faculty-led cross-departmental research groups, certificate programs, and intra-institutional centers geared towards supporting different approaches to academic research, often through the employment of graduate students. I was first exposed to the field of Digital Humanities (DH) in the kinds of overwhelming circumstances that make new student orientations almost disorienting. Completely oblivious to over fifty years of scholarship in the field and parroting some of my colleagues' impressions, I distinctly remember dismissing what was being demoed at the event (distant reading, data visualization, and mapping projects) as an emphasis of form over content. Besides, because of my slight familiarity with computer programming and my confidence in my own digital literacy, I did not see the point of further investing in learning more digital skills when there was so much theory I had to master in my actual field (as a non-literature major in college and first-generation college student, I was especially affected by impostor syndrome).

Despite my appreciation for the liveliness of the DH community that surrounded me (I had often admired the warm and welcoming environment that characterized their events), it was not until two years later, when I found myself in need of what DH had to offer to my dissertation project, that I went back on my steps. In the fall of 2017, I had the opportunity of laying my hands on unearthed archival material documenting the punk scenes and the subcultural formations at the heart of my dissertation. Lawrence Livermore, countercultural figure and co-founder of the Berkeley-based record label Lookout Records, had made his zine collection and a number of artifacts from his days in the East Bay available to me. With an eye to the increasing institutionalization of punk (the acquisition of punk ephemera by academic institutions that often de facto prevents non-academic subcultural participants from accessing the material), I became intrigued by the idea of making the content of Livermore's archive available to both scholars and subcultural participants through an open access digital archive, mirroring my commitments to work with and for the community and to produce public-facing scholarship.
My first knock on the door of DH – when I first asked myself if it were, indeed, my metaphorical mother – was driven by pure utilitarian intentions: I viewed DH as a means (a set of methodologies and tools) to reach an end (curating and publishing Livermore's digital collection). However, what I discovered in the process of developing the East Bay Punk Digital Archive (EBP-DA) and through my further involvement with the DH community are otherwise modes of academic engagement: collaborative, praxis-driven, and public-facing. What follows is an account of my DH history at the GC (CUNY).1 Rather than producing a self-referential narrative of success, I aim to refute the misconception that the point of intersection of humanities and computation is dependent on robust technological infrastructure and, therefore, outside of the reach of underfunded public institutions. I argue, on the contrary, that DH hubs are not predominantly dependent on vanguard technology. The development of DH communities of practice can be an especially valuable asset for resource-scarce public universities, due to the waterfall effect they can produce for both the academic and the local community.

GCDI AND THE DIGITAL FELLOWS PROGRAM

The GC is the principal doctoral-granting institution of the City University of New York (CUNY), the largest public urban university system in the United States, comprising 25 campuses: eleven senior colleges, seven community colleges, one undergraduate honors college, and seven postgraduate institutions. As of 2019, the CUNY system counted more than 275,000 enrolled students (CUNY 2019). Not unlike other institutions, the GC offers training in DH methods through departmental or cross-departmental courses (including the Interactive Technology and Pedagogy certificate, a three-course sequence that offers interdisciplinary training in technology and pedagogy), fellowship programs, and para-curricular workshops. Within this constellation, GC Digital Initiatives (GCDI) is an intra-institutional initiative led by Lisa Rhody and Matthew K. Gold that offers opportunities to learn, support, and promote digital scholarship. The program is run by a group of graduate fellows, faculty, and staff, and central to its mission is the aim to build and sustain a community around the shared idea of a "digital GC," envisioning and actively devising productive, inclusive, and ethical ways to integrate technology in the curriculum and in the research process. The majority of GCDI's activities are conducted through the Digital Fellows program, "an in-house think-and-do tank for digital projects, connecting Fellows to digital initiatives throughout The Graduate Center" (GC Digital Fellows n.d.). The Digital Fellows team, a diverse group of doctoral students, offers events, workshops, office hours, faculty consultations, week-long institutes, and community-based working groups.

My first practical encounter with DH took place through GCDI's Digital Research Institute (DRI), a free week-long in-house training course usually held and taught the last week of Winter Break by the Digital Fellows to staff, students, and faculty of the GC.

1 See East Bay Punk Digital Archive at www.eastbaypunkda.com.
Taking a foundational approach, the institute introduces its participants to technical skills and a conceptual vocabulary that serve as a basis for further learning and engagement in the field.2 As pointed out by Rhody in a blog post on the Digital Humanities Research Institute (DHRI, a scaled-up version of the DRI aimed at training faculty from US universities with the goal of setting up similar courses in their home institutions), "knowing the underlying technologies will inform that choice and help with troubleshooting problems, asking for help on forums, collaborating with programmers and designers" (Rhody 2019). This pedagogical approach "also leads to second and third-order effects as students teach themselves and others, builds confidence, and flexibility" (Rhody 2019). In other words, by taking a foundational, as opposed to an instrumental approach (i.e., teaching students how to deploy a particular tool for a specific end), the DRI aims to teach its participants a forma mentis, rather than merely a modus operandi.

What I found most valuable, aside from being introduced to a number of tools, was indeed the institute's pedagogical model. Instead of relying solely on the expertise of the instructor, the Digital Fellows fostered a kind of learning-in-common by facilitating exchanges, relationship-building, and skill-sharing among learners from across the disciplines. In doing so, the institute put into practice a set of common values that digital humanists aspire to attain in concordance with its goals. In her popular essay in Debates in Digital Humanities, Lisa Spiro identified the values that inform the DH ethos as openness, collaboration, collegiality and connectedness, diversity, and experimentation (2012, 22).

My positive experience as a DRI participant and the autodidactic efforts that ensued (and eventually led to the development of the EBP-DA, with the support of the New Media Lab, a vital node of the DH ecosystem at the GC that provides access to technology and various forms of support to students and faculty seeking to integrate digital media into traditional academic practice) prompted me, shortly thereafter, to apply for the Digital Fellows program myself. Whereas the majority of DH graduate fellowships in the United States offer either formalized training (whereby individual or group projects are developed, often in response to an artificial prompt) or financial and technical support to bring a project of one's own design to realization,3 being a Digital Fellow is a rather unique employment opportunity that puts graduate students in the position of both receiving from and giving back to their community. Each fellow joins the program with a specific set of skills and, usually, a DH project that they are developing as part of their academic pursuit.

2 The curricula for the 2020 edition included workshops in Command Line, Digital Ethics and Data, Git, Python, Text Analysis, Introduction to R, Data Manipulation, Data Visualization, Mapping, Omeka, HTML and CSS, Platforms, and Twitter/API. See https://gcdri.commons.gc.cuny.edu/ for further information.
While graduate fellows receive training and support towards accomplishing their research goals, the fellowship allows them an extraordinary amount of freedom: in concert with the team they decide what tools, methods, and outputs are most conducive to their professional formation and desirable to different constituencies of the GC, as well as how to learn them, and how to disseminate the knowledge they produce. In other words, the program offers fellows an opportunity to learn while producing output for use of the community (rather than an artificial final product), in the form of workshops, working groups, events, and collaborative projects. Faculty and student consultations, usually hosted in the Digital Scholarship Lab, are further opportunities for the Digital Fellows to work with, rather than for, the GC population. Through their collaborative approach, the Digital Fellows foster sustainable training on anything from theoretical concerns to more practical issues and technical obstacles, with the ultimate goal of putting scholars in the best position possible to be the experts of their own projects. If the majority of funding schemes reproduce the empirical experience of institutions with generous funding models and extraordinary infrastructural capacity (especially in the form of well-equipped digital labs and dedicated personnel assisting individual projects), the Digital Fellows program aims to replicate an organic learning-by-doing process that prepares early career scholars for real-life scenarios likely to be found in public universities, community colleges, and even small liberal arts colleges.

While the development of the EBP-DA offered me the opportunity to put into practice and expand on some of the foundational skills I had learned as a DRI participant – the command line, HTML and CSS, and Git, among others – developing an expertise in Omeka and digital archiving led to my becoming an instructor at the following iteration of the institute. Omeka is a free Content Management System (CMS) and a web publishing system built by the Roy Rosenzweig Center for History and New Media (RRCHNM) at George Mason University (GMU) to create searchable online databases and scholarly online interpretations of digital collections. In addition to being used by archives, historical societies, libraries, and museums, Omeka is also employed by individual researchers and teachers to describe primary sources according to archival standards and publish online digital collections, as well as to curate interpretive online exhibits from those items. My workshop, built upon an open-access tutorial developed by DH scholar Amanda French, engaged with some of the conceptual challenges of digital archives before introducing participants to the nuts and bolts of the platform. By the end of two 75-minute sessions, participants had created a small digital collection, a short exhibit, and had been introduced to the resources available at the GC for those interested in pursuing such projects.

3 As of 2020, some of the distinguished centers that focus primarily on supporting and developing faculty projects include the Maryland Institute for Technology in the Humanities (MITH) at the University of Maryland, the Roy Rosenzweig Center for History and New Media (RRCHNM) at George Mason University, and the Center for Digital Humanities and Social Sciences (MATRIX).
Reflecting the increasing implementation of digital archives in both the classroom and scholarly research (a platform such as Omeka offers an invaluable opportunity for cultural preservation with little to no institutional funding),4 the workshop has since transcended the DRI setting and has become a staple of the Digital Fellows' offerings, along with "Getting Started with TEI," "Intro to Python," "Building Websites with Wordpress," "Data Privacy and Ethics," and "Introduction to Mapping." Held in the fall and spring semesters, GCDI's workshops are typically accompanied by material distributed in open access (e.g., web tutorials, PowerPoint slides, and GitHub repositories), allowing the scope of the Fellows' work to extend beyond the workshop setting and the GC. As Kathleen Fitzpatrick has suggested, open access work entails "free access not just in the sense of gratis, but also in the sense of libre work that, subject to appropriate scholarly standards of citation, is free to be built upon" (2019, 142). Many of GCDI's workshops live in open-access GitHub repositories, allowing future Digital Fellows and DH practitioners to update them, build upon them, or adapt them to their own learning settings. As per Fitzpatrick's understanding of free access, GCDI's approach to knowledge dissemination is informed by the same ethos of openness: knowledge is produced to be distributed to the community and to spur further knowledge production at both an intra- and extra-institutional level.

4 See especially projects that seek to preserve the cultural heritage of marginalized communities, such as "New Roots: Voices from Carolina del Norte!" (https://newroots.lib.unc.edu/), "Dawnland Voices: Writing of Indigenous New England" (https://dawnlandvoices.org/collections/), and "Wearing Gay History" (http://wearinggayhistory.com/).

As DH practitioners, rather than using the Do-It-Yourself (DIY) affordances of technology to replace other professional figures, we are interested in working with them to imagine and develop new and better methodologies. Aside from building a set of technical skills, developing the EBP-DA also involved familiarizing myself with archival theory and practice. I engaged in conversation with archivists, librarians, faculty, and fellow grad students to learn from their experience on matters such as metadata, file format standards, informational architecture (especially its relationship with discoverability and accessibility), rights and permissions, and sustainability. Through this process, I came to appreciate the extraordinary amount of work in and around digital archives at the GC, as well as the need for a platform to put its different constituencies in conversation.5 After further surveying the community about its needs and desires, as part of my Digital Fellows duties I spearheaded the Digital Archive Research Collective (DARC).

5 Among these are projects completed by the American Social History Project, developed at the New Media Lab, and in the context of the Praxis class of the ITP certificate. For a survey of digital archives developed at the GC, see "Projects – DARC (Digital Archive Research Collective)," https://darc.gcdiprojects.org/Projects.
In the Fall 2019 semester, the working group, co-led by Filipa Calado and supported by Param Ajmera and Di Yoong, created a wiki that contains information about various institutional resources, featured projects by students and faculty, and overviews of several digital archival methods, approaches, and tools.6 The MediaWiki platform allows the repository to be developed collaboratively by the community, allowing any user to add and edit content. In parallel with other working groups – such as the Python User Group (PUG), the R User's Group (RUG), and the GIS/Mapping Working Group – DARC also holds monthly meetings open to all members of the community, of all skill levels, disciplines, and backgrounds. During working group meetings, Digital Fellows do not cast themselves as the only experts in the room, but rather invite those with an interest in specific methodologies to congregate to work and learn together. Finally, in the spring of 2020, DARC held an event series that included talks by experts in the field and workshops on tools and platforms such as TEI, Tropy, Audacity, and HathiTrust.7

6 See "Digital Archive Research Collective (DARC) Wiki," https://darc.gcdiprojects.org/.

7 See https://darc.gcdiprojects.org/DARC_Event_Series.

By developing awareness around digital archival work and facilitating access to technical and academic support, DARC's goal, in accordance with GCDI's mission, is to foster the birth and development of a self-sustained community of practice. As defined by Lave and Wenger, communities of practice are groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly (1991). By emphasizing human relationships and common interests, communities of practice have the capacity to bring constituencies from across the disciplines together and to bridge frozen dialectics among different fields. Furthermore, according to Etienne and Beverly Wenger-Trayner, fostering two complementary forms of participation, competence and knowledgeability, allows higher education to foster a kind of knowing-in-practice (2016, vi). Especially in settings with rapid turnover (of either students or contingent faculty), communities of practice, born and developed through the very acts of learning and doing together, have the potential of producing a lasting impact, since expertise tends to be a shared asset and its dissemination a shared responsibility. This allows GCDI to extend the longevity of its communities of practice beyond the tenure of Digital Fellows with specific skills, as well as beyond institutional investment in specific technologies or methodologies.

Tagging the Tower, the blog used by the Digital Fellows to share resources and reflect on their experiences, abounds with accounts that resonate with mine and especially emphasize the desire not only to build community around technology-based scholarship, but also to build bridges across communities and disciplines. As early as 2012, former Digital Fellow Laura Kane wrote:

The Digital Fellowship program has sharpened my programming and web development skills, and has given me a new venue to employ such skills. [...]
I’ve found that my work in the Digital Fellows pro- gram has been based on collaboration and building a community around technology at the Graduate Center – this is exciting! [...] I’d like to see the Fel- lows working together with representatives from other programs at the Graduate Center to build an infrastructure for communication across disci- plines – a ‘Digital GC’ – and I think technology plays a crucial role in realizing that goal. (Keane 2012) As illustrated through the examples in edited volumes such as Debates in Digital Humanities and Digital Ped- agogy in the Humanities, as well as in journals like Jour- nal of Digital Humanities (JDH) and Journal of Inter- active Technology and Pedagogy (JITP), DH has often proved to foster successful interdisciplinary work, pro- duce new types of knowledge production, and devise curricular innovation. I thus urge the skeptical reader not to think of technology in higher education solely through a Marxist lens, i.e., as a means to relegate the intellectual worker as an appendage to both the machine and the neoliberal university, as part of a perpetual ef- fort to extract her fullest productive capacity. On the contrary, as Brian Greenspan has recently argued, the digital humanities involve a close scrutiny of the affordances and constraints that govern most scholarly work today, whether they are technical (relating to media, networks, platforms, interfaces, codes, and databases), social (involving collabo- ration, authorial capital, copyright and IP, censor- ship and firewalls, viral memes, the idea of “the book,” audiences, literacies, and competencies), or labor-related (emphasizing the often-hidden work of students, librarians and archivists, program- mers, techies, research and teaching assistants, and alt-ac workers). (2019: n.p.) https://darc.gcdiprojects.org/Projects https://darc.gcdiprojects.org/Projects https://darc.gcdiprojects.org/ https://darc.gcdiprojects.org/ https://darc.gcdiprojects.org/DARC_Event_Series 118 América Crítica 4(2): 113–122 As DH practitioners, we object to technological essen- tialism (technology as having an inherently good or bad nature) in favor of a praxis that uses digital means to- wards building academic practices that are better than the ones we have, more conducive of ethical and col- laborative work. In other words, as we think of “the digital” as a catalyst for research in the humanities, our technological praxis can and must be informed by new and better standards of humanity and care. Furthermore, as DH work often enables work geared towards non- academic publics, communities of practice can have a pivotal role in creating synergetic connections with non- academic communities and in promoting dialogues and collaborations across boundaries, emphasizing the pub- lic research agenda of city and state colleges.8 Especially in institutional contexts with limited fi- nancial, technological, and human resources, diverse communities of practice can thus be building blocks for a thriving DH hub. Despite its wide range of activi- ties, GCDI can rely on a rather limited budget, the im- pact of which has been extended through its community- oriented approach. For instance, the initial funding that supported the training materials built for the DRI came from a one-time Strategic Investment Initiative award, a state grant offered to CUNY for particular projects based on strategic infrastructure building. 
The impact of the grant was scaled up through the Digital Fellows program, sustained through funding from the Provost's Office, often in the context of the overall support packages offered to PhD students. Whereas at many other (especially private) institutions graduate funding packages often come with lower (or no) work requirements, being a Digital Fellow requires – as most GC fellowships do – a service commitment of 15 hours per week. Furthermore, as argued by Rhody (2019) and demonstrated by my personal experience, training provided through a foundational approach and developed through communities of practice often produces second- and third-order effects. In the next section, I will provide two examples of such effects by briefly introducing two projects developed at CUNY that rely on and engage critically with technology: the CUNY Distance Learning Archive (CDLA), a GC class project that outgrew its original scope, and QC Voices, a structured initiative established at one of the four-year CUNY colleges.

ON SECOND- AND THIRD-ORDER EFFECTS

In the spring of 2020, Gold, faculty in the English and Digital Humanities programs, led a graduate seminar on Knowledge Infrastructures that required, as a final project, "an intervention [...] into the knowledge infrastructures at the GC or in CUNY" (Gold 2020). The global COVID-19 pandemic urged the class to commit to a cause much earlier than anticipated. On March 11, the news of CUNY's switch to distance learning to mitigate the health risks posed by the pandemic broke just a few minutes before our last in-person class of the semester. Over the course of two hours, the students in the class unanimously decided that the intervention would have to be related to the unique moment we were experiencing as students and teachers. Over the rest of the semester, under Gold's supervision and through the extraordinary involvement of the students in the class,9 the CDLA was developed as

a crowdsourced archive that allows students, faculty, and staff from across the CUNY system's 25 campuses to submit personal narratives about the experience of moving online, emails, and communications related to the decisions to move online, documentation of online learning experiences (e.g., photos, narratives, screenshots), and links to digital media artifacts that capture the event in real time. (CUNY Distance Learning Archive, 2020)

9 The founding members of the CDLA team are Matthew K. Gold, Travis Bartley, Nicole Cote, Jean Hyemin Kim, Charlie Markbreiter, Zach Muhlbauer, Michael Gossett, and myself.

Furthermore, the CDLA also sought to preserve the social media posts and reactions (on Twitter, Reddit, Facebook, and Instagram) of the CUNY community to both the crisis and the shift to remote learning. Since the archive's initial conception, the class moved forward quickly, under pressure of the need to capture the moment. Within the first week of CUNY's transition to online instruction, the team developed a website through the CUNY Academic Commons (an academic social network created by and for CUNY that includes a customised installation of WordPress), an online submission system, and a social media presence via major digital platforms.
Over the following weeks, Gold's class partnered with the Core Interactive Technology and Pedagogy class of the ITP Program, whose students devised a number of suggested writing prompts for CDLA contributors. While moving the project forward allowed the team to learn by doing, students also studied the technical, ethical, and theoretical challenges faced by similar 'crisis archives' (such as The September 11 Digital Archive and Our Marathon) and learned from experts in the field (including Jim McGrath, former project director for Our Marathon; Ed Summers, Technical Lead for Documenting the Now; and Johnathan Thayer, assistant professor at Queens College's Graduate School of Library and Information Studies), invited as (remote) guest speakers in the remaining sessions of Gold's class. As of September 2020, without any funding and relying mostly on its original team's labour, the CDLA has collected dozens of contributions (in the form of personal narratives, correspondence, official email communications, and learning resources), and its social media collection efforts have yielded close to a hundred thousand scraped posts. If the goal of the CDLA is to "document this moment of crisis response from a critical approach to educational technology," collecting different forms of data from a wide range of sources aims at producing a multi-perspective narrative that includes both the institutional and the lived experiences of multiple actors occupying different positionalities and identities. Through their juxtaposition, the CDLA team hopes to enable researchers, students, and members of the community to understand, learn from, and engage critically with this moment. As Travis Bartley, one of the members of the team, noted:

With this archive, we hope to better understand the particular means through which the accommodation of distance learning has in some ways troubled educational instruction. Further, given the possibility that distance learning practices may become instituted as the norm for higher education, we hope to maintain a collection that acknowledges the human cost of such practices, assisting in the development of pedagogy that truly meets student needs through the digital medium. (2020)

Moving forward, the CDLA team hopes to find institutional backing to ensure the longevity of its archiving efforts, either by merging its collection with an established repository or through the provision of funds for the migration of data to a secure storage platform. It is also currently seeking external funding for the next stages of the project, geared towards curation and preservation solutions, metadata standardization, ethical practices for handling social media datasets, and the creation of an archive front-end to ensure accessibility.
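The article does not specify which tools the CDLA used for its social media collection, but the involvement of Documenting the Now's technical lead suggests what such a workflow could plausibly look like. Purely as an illustration, the sketch below uses twarc, Documenting the Now's open-source Twitter archiving library; the credentials, hashtags, and filename are hypothetical placeholders, not the project's actual queries.

    # Hypothetical sketch (not the CDLA's documented workflow): collecting
    # hashtag tweets with twarc, Documenting the Now's Twitter archiving
    # library. Requires Twitter API credentials.
    import json
    from twarc import Twarc

    twarc = Twarc(consumer_key="...", consumer_secret="...",
                  access_token="...", access_token_secret="...")

    with open("cuny_transition.jsonl", "w") as out:
        # search() pages through the Twitter search API and yields one
        # tweet at a time as a Python dict.
        for tweet in twarc.search("#CUNY OR #DistanceLearning"):
            out.write(json.dumps(tweet) + "\n")

Line-oriented JSON of this kind can then be deduplicated, described with metadata, and deposited alongside the archive's other materials, which is precisely where the curation, preservation, and ethics questions mentioned above begin.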
The case of the CDLA and its ongoing development from class assignment to public resource is further proof of the indissoluble relationship between DH practice and theory in both research and classroom settings, whereby community-oriented projects offer outstanding opportunities to develop a praxis that acts on the theoretical underpinnings of the field. It also allows me to emphasize the pivotal role of a human infrastructure – the result of a synergetic approach to building DH communities of practice that comprises both curricular and para-curricular activities – that relies on a set of foundational skills to approach, devise, and develop a DH project, and that contributes, on the one hand, to overcoming financial and technological scarcity and, on the other, to the development of a "digital GC."

QC VOICES: A COLLABORATIVE WRITING PLATFORM

Third-order effects of the presence of a community committed to integrating technology in its scholarship also percolate beyond the R1 setting of the GC and into undergraduate pedagogy. The benefits of the GC's digital knowledge infrastructure extend to other CUNY campuses and their populations, where funding of digital initiatives is not as robust. For one, as graduate students and alumni develop a sensibility for DH tools and methods during their graduate careers, they often carry it with them to the CUNY community and four-year colleges, where many of them find employment as faculty, teaching fellows, adjunct teachers, and staff. If the use of course sites and blogs has become somewhat widespread, digital tools such as digital archives or data visualization software are also making their way into undergraduate teaching.

As an example of this growing tendency, I want to bring to your attention some initiatives promoted at Queens College (QC), with which I have been affiliated for several years in different capacities. Over the past three years, as part of its efforts to further integrate technology in English courses, Writing at Queens (the program that supports and administers the college's writing curriculum) has run several faculty development workshops to encourage writing instructors to implement more multimodal assignments in their courses. As posited by Cynthia Selfe, "multimodal writing" extends traditional classroom composition work into "visual, audio, gestural, spatial, or linguistic means of creating meaning" beyond what is traditionally considered literature, and allows teachers to foster their students' multiliteracies (Selfe 2007, 195). A number of para-curricular activities also rely on the affordances of technology to promote alternative pedagogies and modes of engagement with writing. A particularly interesting case is that of QC Voices, a program that uses a local installation of WordPress (QWriting) as a platform for a collective blog featuring student writers. Currently on hiatus due to the budget cuts that resulted from the COVID-19 emergency, QC Voices was spearheaded in 2009 by GC alumnus Jason Tougaw (faculty in the QC English department) and Boone Gorges (at the time, QC's Educational Technologist and a PhD candidate in Philosophy at the GC). The project's generative questions were: first, since the domains of writing and information technology are increasingly intertwined, how is the latter influencing the purposes of writing, the genres of written communication, and the nature of audience and author?
Second, at a time when citizens are bombarded by media messages and information is delivered mostly through digital platforms, how can we further develop and channel digital writing fluency towards critical thinking, effective communication, and active citizenship? (Tougaw 2018). Rather than proficiency with specific software packages and technological devices, the goal of the program was effective collaboration, asynchronous and synchronous, across spatial barriers, to produce, analyze, and share information on a digital platform. Every semester, with these pedagogical goals in mind, QC Voices hired a diverse cohort of a dozen graduate and undergraduate students, selected from a large pool of applicants from across the disciplines, to each publish six non-fictional thematic columns. In addition to a stipend of $600 per semester, student participation was driven by the opportunity to be part of a program run like a professional public publication, with the support of Tougaw, in the role of faculty mentor, and two remunerated editors (usually an adjunct professor with experience as a professional content editor, and an early career DH scholar in the role of multimedia editor). As explained by Tougaw in a recent interview:

We try to structure it like a literary-magazine editing experience [...] We do all the steps that I would go through if I was publishing something. They submit the first draft, we give them notes, it usually takes them another week or so to revise, and then we do a round of more sentence level, detail-oriented editing. In the meantime, one of the technology fellows works with them on assembling the visual elements and doing layout. ("Sharing Student Perspectives" 2020)

Through writing workshops, a professional editorial process, and one-on-one mentoring, writers learned about the distinctive elements of writing online, including visual rhetoric, savvy linking, and media integration. The workshops were hosted every few weeks during free hour, when classes are not in session, in the Digital Writing Studio, a lab built through a grant earned by Kevin Ferguson (GC alumnus and faculty in English at QC and in the MA program in Digital Humanities at the GC), equipped with five round tables with dedicated screens and a laptop cart, and primarily used to promote multimodal writing in composition courses. Workshop topics included podcasting, digital editorial practices, visual rhetoric, online pitching, developing an online presence, online collaboration, and building a community of writers. The program's investment in technology was thus especially geared towards learning outcomes such as cooperation, discussion, and community-building. As per the collaborative ethos that informs the program, while writers benefited from one-on-one mentoring, peer networks were also often born out of the workshops. The QC Voices website still gets thousands of visits each month, making it both a public forum for members of the QC community and a highly visible online representation of some of the college's most outstanding students, speaking their minds through a range of styles (from poetic prose to journalism, from creative non-fiction to digital exhibits) on a plethora of topics (recent columns have focused on environmental activism, prison reform, nerd culture, immigrant life, local food culture, Afrocentricity, theater, hip hop, and Muslim-American identity).
The initiative can thus be framed as lying at the intersection of digital and public humanities, in that students produce public content pertinent to their lived experience and their community. In addition, it also operated as a kind of professional development, with alumni of the program working as professional writers, or using the digital literacy, communication skills, and collaborative approach to writing they developed through QC Voices in their professional work. In light of the CUNY-wide mass budget cuts under the COVID-19 crisis, Queens College has deemed QC Voices too expensive to run. The emphasis college administrators put on the cost of the editing fellows is further proof of a peculiar kind of shortsightedness: sustaining digital infrastructures (and computational humanities) through massive investments in technology – including million-dollar contracts to purchase licenses for platforms developed with little regard for ethics by for-profit corporations, including CUNYFirst, Blackboard, G Suite for Education, and the like – rather than in human capital.

CONCLUSION

Even within public universities, I am aware of the GC's privileged position in terms of human and intellectual capital, as well as the resources available to its affiliates through the ecosystem to which it belongs. Despite its pathological austerity blues – to quote Michael Fabricant and Stephen Brier (2016) – CUNY is the largest public urban university system in the nation, located in one of the largest urban technology hubs in the world. However, scaling up training in DH research methods is a desirable goal for both public institutions and the DH community itself. On the one hand, a truly diverse DH community – to this day still extremely white and male-dominated – can only coalesce when training in the field reaches higher education's largest pools of diverse talent: community colleges and public university systems. On the other hand, public institutions can benefit from DH's ability to promote horizontal collaborative research practices that foster mentorship and non-hierarchical relationships among diverse perspectives, training, and fields of expertise, to de-silo knowledge creation and public impact.

In an institutional context steeped in DH, such as that of the GC, the Digital Fellows program represents a sustainable funding scheme that employs and trains graduate students while also producing output for the community in the form of support for DH scholarship. Initiatives like the DRI and DHRI, aimed at teaching not only foundational computational skills but also at scaling up the pedagogical philosophy that informs GCDI's work, are another example of sustainable professional development that can produce a waterfall effect for the community. If DH practitioners at better-funded universities are more likely to have access to the newest technology and to professional assistance than those who are not, public universities can and must promote an institutional culture that nurtures the computational skills of graduate students, staff, and faculty, and devises opportunities for them to join forces across disciplines and hierarchies. Whereas communities of practice coalesce by doing together, they do not necessarily come together, or stay together, spontaneously. Public institutions need to actively stimulate, facilitate, or formalize such initiatives.
Investing in human, rather than merely technological, infrastructure is essential to build communities of practice and spark a virtuous circle that can lead to further infrastructural development, a larger scope of operations, an institutional DH culture, and eventually to formal and informal inter-institutional networks of practice.

REFERENCES

Bartley, Travis. Personal Interview. August 15, 2020.

CUNY. 2019. "Total Enrollment by Undergraduate and Graduate Level, Full-time/Part-time Attendance, and College, Fall 2019." Accessed July 2, 2020. https://www.cuny.edu/irdatabook/rpts2_AY_current/ENRL_0001_UGGR_FTPT.rpt.pdf.

CUNY Distance Learning Archive. 2020. "About." Accessed September 2, 2020. https://cdla.commons.gc.cuny.edu/about/.

Fabricant, Michael, and Stephen Brier. 2016. Austerity Blues: Fighting for the Soul of Public Higher Education. Baltimore: Johns Hopkins University Press.

Fitzpatrick, Kathleen. 2019. Generous Thinking: A Radical Approach to Saving the University. Baltimore: Johns Hopkins University Press.

Fragaszy Troyano, Joan, and Lisa M. Rhody. 2013. "Expanding Communities of Practice." Journal of Digital Humanities 2(2). Accessed July 1, 2020. http://journalofdigitalhumanities.org/2-2/expanding-communities-of-practice/.

GC Digital Fellows. "About." Accessed July 2, 2020. https://digitalfellows.commons.gc.cuny.edu/about/.

Gold, Matthew K. 2020. "Knowledge Infrastructure." Syllabus, The Graduate Center, CUNY. Accessed July 2, 2020. https://kinfrastructures.commons.gc.cuny.edu/syllabus/.

Greenspan, Brian. 2019. "The Scandal of Digital Humanities." In Debates in the Digital Humanities, edited by Matthew K. Gold and Lauren F. Klein. Minneapolis: University of Minnesota Press. Accessed November 5, 2020. https://dhdebates.gc.cuny.edu/read/untitled-f2acf72c-a469-49d8-be35-67f9ac1e3a60/section/4b6be68c-802c-41f4-a2a5-284187ec0a5c#ch09.

Kane, Laura. 2012. "A Fresh Perspective." Tagging the Tower. Accessed August 30, 2020. https://digitalfellows.commons.gc.cuny.edu/2012/11/12/am-i-an-author/.

Lave, Jean, and Etienne Wenger. 1991. Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press.

Rhody, Lisa. 2019. "DHRI: Notes Toward Our Pedagogical Approach." Accessed July 2, 2020. http://www.lisarhody.com/dhri-notes-toward-our-pedagogical-approach/.

Selfe, Cynthia L., ed. 2007. Multimodal Composition. New York: Hampton.

"Sharing Student Perspectives." 2020. The QView, 77. Accessed August 31, 2020. https://www.qc.cuny.edu/communications/Documents/QView/QView77.pdf.
Spiro, Lisa. 2012. "'This Is Why We Fight': Defining the Values of the Digital Humanities." In Debates in the Digital Humanities, edited by Matthew K. Gold, 16-34. Minneapolis: University of Minnesota Press. Also accessible at https://dhdebates.gc.cuny.edu/read/untitled-88c11800-9446-469b-a3be-3fdb36bfbd1e/section/9e014167-c688-43ab-8b12-0f6746095335#ch03.

Tougaw, Jason. Personal Interview. September 25, 2018.

Wenger-Trayner, Etienne, and Beverly Wenger-Trayner. 2016. "Foreword." In Implementing Communities of Practice in Higher Education, edited by Jacquie McDonald and Aileen Cater-Steel, v-viii. Singapore: Springer.

work_7tepdpob3nh2jch3vrij5th4n4 ---- Guest Column: On Disciplinary Finitude | Jeffrey T. Schnapp | PMLA 132.3 (2017)

JEFFREY T. SCHNAPP holds the Carl A. Pescosolido Chair in Romance Languages and Literatures and Comparative Literature at Harvard University, where he serves as faculty director of metaLAB (at) Harvard and faculty codirector of the Berkman Klein Center for Internet and Society. His most recent book is FuturPiaggio: Six Italian Lessons on Mobility and Modern Life (Rizzoli International, 2017).

The year 2008 was one of fruitful disjunctions. I spent the fall teaching at Stanford but commuting to the University of California, Los Angeles, to cochair the inaugural Mellon Seminar in Digital Humanities. During the same period, I was curating—at the Canadian Center for Architecture, in Montreal—an exhibition devised to mark the centenary of the publication of "The Founding Manifesto of Futurism," by Filippo Tommaso Marinetti.
Whereas other centennial shows (at the Centre Pompidou, in Paris, and at the Palazzo Reale, in Milan) sought to celebrate the accomplishments and legacies of Marinetti's avant-garde, the Canadian exhibition, Speed Limits, was critical and combative in spirit, more properly futurist (though thematically antifuturist). It probed the frayed edges of futurism's narrative of modernity as the era of speed to reflect on the social, environmental, and cultural costs. An exhibition about limits, it looked backward over the architectural history of the twentieth century to look forward beyond the era of automobility. My commitments, two pedagogical, the other curatorial, seemed fated to collide. And collide they did in the form of a document I initially drafted as an insider joke during the forty-five minutes I spent in the jet stream between San Francisco and Los Angeles: "A Digital Humanities Manifesto."1 I had nurtured a fondness for the bluster of the manifesto genre since high school days, and digital humanists had jested about belonging to some sort of avant-garde. So, I asked myself, why not stir the pot by writing a manifesto that indulged in some academic politicking and philosophizing with a hammer while concluding with a call to transcend the digital humanities? In the final version of the manifesto, the valediction "Let's get our hands dirty" hovers over a fourfold repetition of John Heartfield's Five Fingers Has the Hand (1928), a photograph famously employed in a 1928 electoral poster reveling in the power of manual labor (Digital Humanities Manifesto 2.0).
The original draft was animated by enough philofuturist nose thumbing to whip up a dust storm or two once it was placed in circulation, and it would undergo two collaborative rewritings: collaborative to the degree that the final document includes the voices of dozens of coauthors. But amid the ludic posturing, one provocation hasn't abandoned me over subsequent years of work at the confines of the arts and humanities: a section devoted to the question of disciplinary finitude. Do disciplines end, or do they just adapt, absorb, and mutate? What are their ends, in the sense of boundaries but also in the sense of their ability to undergo knowledge transfers? What, if anything, comes after or lies beyond disciplinarity: new disciplines? new disciplinary containers? always-shifting interdisciplinary grounds?

Disciplina (or, in the old French, decepline) is a word with a complex classical and medieval Christian lineage. Whereas the classical meaning emphasizes the objects of instruction and cognition, the medieval Christian meaning focuses on the means of enforcing the successful transmission of a teaching through penance or punishment. Both meanings were already present in the Greek term παιδεία (paideia).2 As the two merge and assume the sorts of secular institutional forms that proliferated in nineteenth- and twentieth-century universities, they associate a given corpus of knowledge and set of standardized procedures and routines for its acquisition and performance with a social hierarchy and system of control, even a system of rewards and punishments. The above may sound like the beginnings of a complaint against disciplinarity. But, if anything, it is the opposite.

Before graduate school, my interpretive engagements with modern and contemporary literature and art came a bit too effortlessly (and were thus less than deep or satisfying). I devoted those early years to swimming in the stream of contemporary art as a wannabe abstract painter and to studying classical and modern languages, as well as nineteenth- and twentieth-century French and Spanish literature. What drew me more meaningfully into the academy, like a time-tested armchair that gradually and gratifyingly engulfs your body, releasing you only after strenuous effort, was a longing for something more challenging and exacting: not freedom but constraint.

Brilliant teachers who served up a foretaste of the feast that awaits the fully disciplined led me to fall in love with the rigors of thirteenth-century texts that played by alien cultural rules; with the endless puzzles posed by codicology and paleography; with the demands of reconstructing a cultural record reduced to fragments by time's depredations; and with the strange beauty of parchment and inks made of gum and gall, colored with lampblack or iron salts. Here was a galaxy of knowledge forms pulsating with learned reference works that could be marshaled to defend this or that position, a universe made up of vast silences as well as hot zones animated by multicentury stratigraphies of commentary, annotation, and emendation. And here was a world of inquiry where interpretation was never a given but rather the result of arduous reconstruction. Sometimes these reconstructions required near lifetimes of devotion, prompting equations (fair or not) between the asceticism of the philological method and monastic forms of piety. The defining experiences of my academic life were training for, becoming part of, and participating in this disciplinary community. They have remained so, even as the compass of my research and teaching, as well as the worldly commitments to which both led, migrated from medieval Italian literary history to twentieth-century cultural history (media, architecture, and design) and then to twenty-first-century technologies (interaction design, data science, and—most recently—artificial intelligence and robotics). The passage was hardly frictionless, and medieval studies was, like any enduring and tightly woven disciplinary domain, not always irenic. There were clan rivalries, battles over everything from the macro to the micro level (from models, methods, and masters to textual cruxes), efforts to police the discipline's boundaries or to enforce orthodoxies that had run their course, and clashes between disciplinary generations. Eventually, I found some of the wellsprings that had initially nourished me running dry and encountered unexpected resistances: to theoretical engagements, to personal research interests in transversal literary- or art-historical ties, to excursuses into the anthropology of everyday medieval life and material culture.
But, amid the contentiousness and the (often fruitful) frictions, I did more than chafe: I acquired a knowledge base, a corpus of procedures, and a sense of craft, not to mention what I'd describe as a disciplinary imagination, which has served me well in subsequent trans- or extradisciplinary peregrinations: whether as a twentieth-century cultural historian, a curator involved in the design and development of experimental history museums like the Trento Tunnels, or an experimentally minded humanist engaged in the forms of experimental work that I have come to define as knowledge design.3

So the question of disciplinary finitude that I am posing here is less concerned with why or when disciplines close up shop or come under threat—worthy topics of concern, to be sure—than with how disciplines spill over into other disciplinary, institutional, cultural, or social realms. Otherwise phrased, I'm wondering about the nature of disciplinary innovation and the ability of skills, knowledge, and experience that are based and bound in a discipline: from an intramural perspective, it's the question of cross-, inter-, or transdisciplinarity; from an extramural one, it's that of applicability or extensibility—the ability of a given skill and knowledge base to interoperate with disconnected domains, vocational or other. Both are familiar questions to researchers and educators; in neither case are the answers simple or ready at hand.

Cross-disciplinarity, interdisciplinarity, transdisciplinarity: these terms surely figure among the most inelegant of academic neologisms. Yet all have become the familiar banners of change during the past half century as disciplines have grown beyond their existing confines; as novel domains of research and teaching have sought recognition; and as new challenges and demands have been posed by shifting socioeconomic, cultural, political, and technological circumstances. Whether in the humanities or in the sciences, rare is the field that hasn't experienced an upheaval cast in this sort of mold. The reason seems straightforward (and well-acknowledged at least since the seminal reflections of Thomas S. Kuhn on the nature of scientific revolutions). Change that gradually bubbles up from within a given disciplinary domain is unlikely to rattle that domain's foundations. But extraneous models, unanticipated collisions and combinations, disciplinary invasions from the outside, can effect momentous transformations. Think of the impact of evolutionary biology on debates over literary stemmatics in the development of nineteenth-century textual criticism. Or consider the sudden emergence of fields like bioinformatics, built around the use of computational techniques in the analysis and interpretation of biological data, or cultural analytics (well-documented in the special feature on Franco Moretti in this issue), which mines cultural data sets on varying scales using computational methods and visualization tools. In such cases and most others, exogenous tools and techniques (network analysis, data visualization, machine vision, artificial intelligence) arise and come into dialogue with endogenous objects of analysis that become available under new conditions or on altered scales (DNA sequencing, genomics, digital text repositories, and image databases), giving birth to a new domain. (And to plenty of polemics.)
Far from resolving the question of disciplinarity, cross-fertilization, interchange, and transmutations pose the question afresh. For a new disciplinary domain may indeed spring forth from an exogenous-endogenous collision or even, as it were, from the brow of Zeus. More likely, the outcome is evolutionary, not revolutionary, and has consequences with respect to institutional arrangements. Interdisciplinary programs are the standard institutional expression of cross-, inter-, and transdisciplinary change in universities today, just as departments are the classic expression of a consecrated, historically sustained discipline. Interdisciplinary programs are characteristically more fragile and less well funded than departments, relying heavily on departmental labor and resources. Most are built on top of departments, operating as shared platforms, junction boxes that extend departments' reach. This reach reasserts itself at key moments of hiring, promotion, and evaluation, for disciplines possess well-established, if sometimes contested, standards of quality, depth, and rigor, whereas emergent interdisciplinary domains tend by their nature to be unstable and ill-defined: all the more so ones that diverge from established disciplinary norms.

The foregoing argues for a more trenchant distinction: between modes of cross-, inter-, and transdisciplinarity that explore disciplinary conjunctions or adjust their contours, leaving largely intact the shapes that research, training, and publication assume, and modes that are resolutely experimental, revolutionary (not evolutionary), imposing different professional language, altered research protocols, new models of teaching and training, and alternative methods of dissemination. During the past decades, the revolutionary, higher-risk approach has shaped a growing array of ventures that include SpecLab and the Scholars' Lab, at the University of Virginia; Humlab, at Umeå University; Humanities + Design and the Literary Lab, at Stanford University; McGill's .txtLAB; Maryland Institute for Technology in the Humanities (MITH), at the University of Maryland; and the Group for Experimental Methods in the Humanities, at Columbia University, to name only a few.4 Experimentation was and remains the ethos of the Stanford Humanities Laboratory, which I directed from 1999 to 2009, and of metaLAB (at) Harvard, which I've directed since 2011.

The experimental initiatives just adumbrated suggest that an expanded notion of cross-, inter-, or transdisciplinarity—call it what you will—requires a different sort of institutional container than a department or an interdisciplinary program. To my mind, that container is the laboratory. When, in 1999, I had the good fortune to be asked by Stanford's leadership to develop a visionary venture in the arts and humanities, the apparent challenge was to build bridges between the disciplines in question and the cultural and technical revolution that was under way in the Silicon Valley, perhaps along the same lines as the productive entanglements of the counterculture with cyberculture in the 1960s and 1970s (Turner).
I felt well-enough equipped to do so, having tinkered with mainframe computing in high school and having served as the on-campus director for the first digital pilot project of the National Endowment for the Humanities: the Dartmouth Dante Project—a database of the seven centuries of line-by-line commentaries on Dante's Divine Comedy, from Boccaccio to the present.5 But technology per se was never the object (note the absence of digital from any of the cited lab titles).6 A survey of knowledge production and training practices in other fields and schools, accompanied by an informal poll regarding the dreams that my most adventurous colleagues aspired to realize but couldn't under current conditions, confirmed that new tools, technologies, and media were only one means, however powerful and laden with potential, to a greater end: to expand the compass, impact, appeal, scope, and scale of humanistic work; to complement individualized models of training and scholarship with collaborative, project-based, hands-on models similar to those encountered in the experimental sciences; to test and model alternatives to the current knowledge-distribution system in the arts and humanities.

Laboratoria are places of labor; they are workshops where an infrastructure made up of facilities, tools, instruments, and knowledge resources supports the integrated, collaborative production of knowledge in a hierarchically structured community. As Bruno Latour and Steve Woolgar long ago observed with respect to the research laboratories of the industrial era, laboratory productivity has long been measured in scholarly writing. But what is learned writing? Where does such writing start, and where does it end? Is it restricted to the creation of scholarly books, monographs, and journal essays disseminated as industrialized print artifacts? Surely not: such a notion would have struck our eighteenth- and nineteenth-century predecessors as unduly limiting, even as stifling.

In sketching out an institutional blueprint for humanities innovation, I found myself thinking a great deal about the laboratories of the avant-garde, from constructivism and the Bauhaus to Black Mountain College. But most of all, I found my mind repairing back, time and again, to the medieval predecessors of Latour and Woolgar's laboratories: scriptoria.7 Scriptoria, like the renowned sixth-century ones found at Cassiodorus's Vivarium in Squillace or Benedict of Norcia's monastery of Monte Cassino, combined research, study, and contemplation with functions that we'd associate today with the art studio, the maker space, the chemistry lab, the model farm, and the publishing house. They were sites of gathering, hands-on teaching, and collaborative fabrication, animated appendages to libraries where the arts of the hand and the life of the mind were understood as one. Writing in scriptoria was an encompassing—today, we'd say a transmedia—activity that included copying, indexing, annotation, and commentary, across the full disciplinary grid, as well as decoration, layout, illustration, and bookbinding.
Writing was discovery, preservation, and exploration, and, as scribes are wont to remind us in their marginalia, it was also hard labor to the drip, drip, drip of water clocks.8

At the Stanford Humanities Laboratory, laboratory connoted the belief that "some crucial questions—about what it is to be human, about experience in a connected world, about the boundaries of culture and nature—transcend old divisions between the arts, sciences, and humanities; between the academy, industry, and the cultural sphere."9 This copy, composed in 2000 for the lab's home page with my archaeologist colleague Michael Shanks, now feels a bit dated and overreaching. It went on to state: "We engage in experimental projects with a 'laboratory' ethos—collaborative, co-creative, team-based—involving a triangulation of arts practice, commentary/critique, merging research, technology, pedagogy, outreach, publication, and practice."

Overreaching or not, pedagogy loomed large in the lab's collaborative universe. Projects spanned from an experiment in the multimedia capture of the entire life cycle of a theater performance (dpResearch) to an art installation for the San José Public Library (The Rosetta Screen) to a "big humanities" project (Crowds) to a Christian-Jewish-Islamic Web resource on the Spanish Middle Ages (Medieval Spains) to a mapping platform (Temporal Topographies Berlin). They typically involved recurring course or seminar components that allowed students from all disciplinary walks of the university to learn "not only by studying existing knowledge in the traditional manner, but also by producing knowledge: by being assigned responsibility for the realization of a piece of research within a larger research mosaic, overseen (as in natural science
he competition became departments, cen- ters, and institutes: units with a more easily identifiable disciplinary terrain and firmer bases of faculty support. he Harvard metaLAB arose not out of the ashes of the Stanford Humanities Labora- tory but as a second cross-, inter-, or trans- disciplinarity chapter. hat chapter is being written in diferent times—digital humani- ties is now less the unkempt upstart than a force to be contended with in the academy— and under altered circumstances: metaLAB didn’t have to start from scratch because it found an ideal, ready- at- hand institutional home in the thriving and highly variegated intellectual community of the Berkman Klein Center for Internet and Societ y. Like the Stanford Humanities Laboratory, metaLAB is a small community of scholars, designers, thinkers, and creative technologists working on a portfolio of projects that share a com- mitment to experimentalism, teamwork, and project- based pedagogy designed to promote students’ translational skills. Unlike the Stan- ford Humanities Laboratory, metaLAB does not aspire (at least for the moment) to build an academic program. It’s a lean and scrappy entrepreneurial operation, physically hosted in Harvard’s Graduate School of Design. In the absence of words like digital and humanities in its title, metaLAB describes it- self as an “idea foundry, a knowledge- design lab, and a production studio” whose aim is to model (not just theorize) answers to the question of what shapes knowledge could or should assume in the twenty- irst century.10 Those answers include experiments in cre- ative coding and multimedia scholarship, critical and expressive data use, exhibition design and curation as ty pes of extended scholarly practice, and print publications that have a digital component and that span ev- erything from design- driven scholarly books (the publication series metaLABprojects) to critical editions (the expanded reprint of Blueprint for Counter Education).11 When it comes to sotware projects, metaLAB’s man- tra is modest: prototype rather than perfect. It approaches questions of knowledge design not just from the perspective of so- called con- tent but also from that of knowledge contain- ers: the design of future libraries, museums, and archives remains an abiding concern: no less so than curricular data sets, rare- book inventories, or collections databases. 510 Guest Column [ P M L A I may seem to have strayed far from my initial questions regarding the powers and limits of disciplines by describing two personal chapters, among the many being authored by creative colleagues throughout the world, from a collective work in progress dedicated to experimentation in the humani- ties. In so doing, my aim has been to circle back to the second extramural question posed earlier—that of disciplinary extensibility or the aptitude of a given skill set and knowl- edge base to prove efective in a distant do- main—from the perspective of the sorts of cross-, inter-, or transdisciplinary ventures just evoked. It’s a question of pedagogical, cognitive, and epistemological consequence, too complex to adequately address in these brief closing thoughts. As the faculty director of a research and training initiative, I am led to ask: What sort of students should we seek to educate, train, and involve in the life of the lab? How to balance disciplinary depth with interdisciplinary reach, rigor with imagina- tion? 
As the leader of a robotics startup, I am prompted to extend those same questions out into the work world: What sort of employees do we wish to hire when it comes to taking on complex, collaborative, real-world tasks for which the training received in university classrooms can never be adequate? How to balance expertise with ingenuity? Irrespective of which side of the fence I'm standing on, for me the answer remains the same: disciplinary homelessness is like a meal without textures, smells, or flavors. Innovators need to come from somewhere to go somewhere beyond. But to thrive, disciplinarity requires a counterforce, and such counterforces are fed, in turn, by discipline-based modes of inquiry. The paradox is irresolvable because it's productive: whether in the classroom, the laboratory, or the workplace, depth plus reach equals greater mental agility than either pursued in isolation can hope to provide. Disciplines may come and go, they may rejuvenate from within or without, but the great mosaics of twenty-first-century knowledge will be built from the tesserae of domain expertise, not from a scattering of skills.

NOTES

1. The two main redactions of the text—with significant contributions by Todd Presner, my faculty collaborator at the University of California, Los Angeles, and by fellow Mellon seminar presenters Johanna Drucker and Peter Lunenfeld, along with paragraph-by-paragraph reader commentary and criticism—are available on the Web ("Digital Humanities Manifesto"; "Digital Humanities Manifesto 2.0"). The finalized version, with images, is available as a PDF (Digital Humanities Manifesto 2.0). The manifesto prompted the writing of a collaborative book (Burdick et al.).

2. The bibliography on discipline is vast, extending from overall accounts of the foundations of Western pedagogy, like Jaeger's Paideia, to Foucault's Discipline and Punish, in which associations between schooling practices and the structure of correctional institutions are a recurring topic.

3. The Trento Tunnels, known as Le Gallerie di Piedicastello, are a six-thousand-square-meter pair of highway tunnels in the northern Italian city of Trent repurposed as an experimental history museum. They were featured in the Italian pavilion of the 2010 Venice Biennale of Architecture; for more on the tunnels, see La Biennale. I first articulated the notion of knowledge design in a keynote address I gave in December 2013 for the Herrenhausen Conference (Digital) Humanities Revisited—Challenges and Opportunities in the Digital Age. The talk was published in the pamphlet Knowledge Design (Schnapp).

4. This list should surely be amplified with references to media studies and history centers like the Signallabor and Medienarchäologischer Fundus, of the media studies program at the Humboldt University of Berlin, or the Media Archeology Lab, at the University of Colorado, Denver.

5. The Dartmouth Dante Project (dante.dartmouth.edu/) was founded and led by Robert Hollander at Princeton but run out of Dartmouth because of Dartmouth's advanced computing infrastructure. Today the project remains one of the defining reference works in the field of Dante studies.

6. The debate over the value of digital in the phrase digital humanities is long-standing. The Digital Humanities Manifesto 2.0 embraced it only for reasons of "strategic essentialism": "We wave the banner of 'Digital Humanities' for tactical reasons . . .
not out of a conviction that the phrase adequately describes the tectonic shifts embraced in this document. But an emerging transdisciplinary domain without a name runs the risk of finding itself defined less by advocates than by critics and opponents, much as cubism became the label associated with the pictorial experiments of Picasso, Braque, and Gris" (13). For thoughtful reflections on the debate, see the essays in Gold and Klein, particularly the contribution by Jentery Sayers. 7. The best overall introduction to medieval scriptoria remains Reynolds and Wilson. 8. In his famous account of the virtues of scribal activity, Cassiodorus writes: "We have not allowed you to be ignorant in any way of the measurement of time which was invented for the great use of the human race. I have, therefore, provided a clock for you which the light of the sun marks, and another, a water clock which continually indicates the number of the hours by day and night, because on those days when the brightness of the sun is missing, the water traces marvelously on earth the course that the fiery power of the sun runs on its path above. Thus, things which are divided in nature, men's art has made to run together; in these devices the trustworthiness of events stands with such truth that their harmonious function seems to be arranged by messengers" (sec. 30, par. 5). 9. All quotations about the lab appeared on the now-defunct Stanford Humanities Laboratory Web site, circa 2000, and are taken from the author's personal archives. Two linear feet and 10.9 gigabytes of materials documenting the history of the lab are present in the Special Collections of the Stanford University Libraries; for more information, see Search Works (searchworks.stanford.edu/view/9333717). Internet Archive (archive.org) also contains ample documentation, particularly regarding the lab's work in interactive media and machinima. 10. See metaLAB (metalabharvard.github.io/). 11. The Harvard University Press series metaLABprojects has published six titles to date, including Presner et al. and Drucker. Stein and Miller's seminal work of radical pedagogy is supported by the Web site Blueprint for Counter Education (blueprintforcountereducation.com/). WORKS CITED La Biennale de Venezia: XII. Mostra Internazionale di Architettura. jeffreyschnapp.com/wp-content/uploads/2011/07/definitivo.pdf. Accessed 19 June 2017. Burdick, Anne, et al. Digital_Humanities. MIT P, 2012. Cassiodorus. Institutiones. Translated by James W. Halporn and Barbara Halporn, bk. 1, faculty.georgetown.edu/jod/inst-trans.html. Accessed 31 May 2017. "A Digital Humanities Manifesto." A Digital Humanities Manifesto, 15 Dec. 2008, manifesto.humanities.ucla.edu/2008/12/15/digital-humanities-manifesto/. "The Digital Humanities Manifesto 2.0." A Digital Humanities Manifesto, 29 May 2009, manifesto.humanities.ucla.edu/2009/05/29/the-digital-humanities-manifesto-20/. The Digital Humanities Manifesto 2.0. www.humanitiesblast.com/manifesto/Manifesto_V2.pdf. Accessed 16 June 2017. Drucker, Johanna. Graphesis. Harvard UP, 2014. Foucault, Michel. Discipline and Punish: The Birth of the Prison. Translated by Alan Sheridan, Vintage Books, 1979. Gold, Matthew K., and Lauren F. Klein, editors. Debates in the Digital Humanities, 2016. U of Minnesota P, 2016, dhdebates.gc.cuny.edu/debates/2. Jaeger, Werner. Paideia: The Ideals of Greek Culture.
Translated by Gilbert Highet, Oxford UP, 1945. 3 vols. Kuhn, Thomas S. The Structure of Scientific Revolutions. 3rd ed., U of Chicago P, 1996. Latour, Bruno, and Steve Woolgar. Laboratory Life: The Construction of Scientific Facts. Princeton UP, 1986. Presner, Todd, et al. Hypercities: Thick Mapping in the Digital Humanities. Harvard UP, 2014. Reynolds, Leighton Durham, and Nigel Guy Wilson. Scribes and Scholars: A Guide to the Transmission of Greek and Latin Literature. Clarendon Press, 1974. Schnapp, Jeffrey T. Knowledge Design. VolkswagenStiftung, 2017, www.volkswagenstiftung.de/en/news-press/publications/details-publications/news/detail/artikel/herrenhausen-lecture-knowledge-design/marginal/4295.html. PDF download, accessed 25 May 2017. Stein, Maurice, and Larry Miller. Blueprint for Counter Education. 1970. Inventory Books, 2016. Turner, Fred. From Counterculture to Cyberculture: Stewart Brand, the Whole Earth Network, and the Rise of Digital Utopianism. U of Chicago P, 2006. work_7tqfly2aurcuzfasapzyzdredy ---- What's Under the Big Tent?: A Study of ADHO Conference Abstracts. Research. How to Cite: Weingart, Scott B. and Nickoal Eichmann-Kalwara. 2017. "What's Under the Big Tent?: A Study of ADHO Conference Abstracts." Digital Studies/Le champ numérique 7(1): 6, pp. 1–17, DOI: https://doi.org/10.16995/dscn.284. Published: 13 October 2017. Peer Review: This is a peer-reviewed article in Digital Studies/Le champ numérique, a journal published by the Open Library of Humanities. Digital Preservation: The Open Library of Humanities and all its journals are digitally preserved in the CLOCKSS scholarly archive service. RESEARCH What's Under the Big Tent?: A Study of ADHO Conference Abstracts. Scott B. Weingart (Carnegie Mellon University, US) and Nickoal Eichmann-Kalwara (University of Colorado, Boulder, US). Corresponding author: Nickoal Eichmann-Kalwara (nickoal.eichmann@colorado.edu). This study identifies how the flagship Digital Humanities conference has evolved since 2004 and continues to evolve by analyzing the topical, regional, and authorial trends in its presentations. Additionally, we explore the extent to which Digital Humanists live up to the characterization of being diverse, collaborative, and global using the conference as a proxy. Given the increased popularization of "digital humanities" within the last decade, and especially recent successes in popular press and grant initiatives, this study tempers the sometimes utopic rhetoric that appears alongside mentions of the term.
Keywords: ADHO; authorship; disciplinarity. Introduction "Digital Humanities" is a fraught term, on whose definition rests funding decisions, tenure lines, and institutional power dynamics. Its (or their) public face is multifaceted: New York Times articles (Cohen 2010), museum exhibits (Quirk 2015), popular tools (DiRT Directory 2016), and tech industry partnerships (Google Research Blog 2010; Kirschenbaum 2007) all contribute to how the Digital Humanities (DH) interact with the wider world.1 In academic circles, the term is often associated with backchannel chatter (Holmberg and Thelwall 2014), grey literature (Huggett 2012), and informal workshops and conferences (French 2015). DH has too many definitions to be well-defined (Terras, Nyhan, and Vanhoutte 2013), but its influence is great enough to warrant an exploration of how it appears to newcomers, to scholars, and to the world. The annual Alliance of Digital Humanities Organizations (ADHO) conference provides one important vantage point whence to launch such an exploration (Earhart 2015; Sugimoto and Weingart 2015). As the largest and most public DH-labeled event,2 the conference reflects and constructs many of the visible contours of DH, even (or especially) when it fails to adequately represent all aspects of the community, the scholarship, or the pedagogy. The first Digital Humanities conference was held in 2006 following the founding of ADHO, but its roots are in the joint Association for Literary and Linguistic Computing (ALLC)/Association for Computers and the Humanities (ACH) conference first held in 1989 (ADHO.org 2016).3 This essay reflects on an ongoing quantitative analysis of this conference to trace its changing shape since 1989. The analysis investigates whether the common characterization of DH as collaborative, inclusive,
and globally-minded appears true through data-driven methods, keeping in mind that while ADHO's conference is not a synecdoche for the entire digital humanities community, the conference does represent the community's most public face. We present results in modest visualizations and simple statistics for greatest accessibility. Preliminary results reveal a growing conference, growing research team sizes, poor gender diversity, poor (but recently improving) regional diversity, and some shifts in topical focus of presentations. In light of recent controversies in which self-identified digital humanists have become increasingly worried that they and their work are not adequately represented, a topic discussed at length at DH2015 (Terras 2015 and 2011), we conclude that the annual DH conference has more work to do in reflecting its broad constituency and ethos for inclusion and diversity, though we save improvement suggestions for the companion piece referenced in Footnote 1. 1 This is an extension of work presented at the DHSI 2015 Colloquium (Eichmann and Weingart 2015). The research began as a blog series by Weingart (see his blog, scottbot.net). A companion piece focusing on representation at DH is forthcoming (Eichmann-Kalwara, Jorgensen, and Weingart forthcoming). 2 The ADHO DH conference draws publishers, students, faculty, librarians, museum curators, and archivists, among others. It is not always the most populated event (e.g., in 2014 the Digital Humanities Summer Institute (DHSI) itself hosted more attendees than ADHO's conference), but it is undoubtedly the highest-profile annual DH event. 3 Although international conferences from the same tradition and community were held as early as the 1960s, and alternating European (ALLC)/North American (ICCH) conferences began in the early 1970s, the first truly joint conference was held in 1989. Its successors represent the largest international digital humanities conferences, and as such are the focus of this analysis. Methods and Data The DH conference and its joint ALLC/ACH predecessor began in 1989. We have collected schedules or programs from each, and have entered their contents into a spreadsheet to analyze trends across geography and time. By the writing of this piece, we have no data entered from before 2004. From DH2004–DH2013, we entered presentation title, author names, author institutional affiliations (if provided), author country affiliations (if provided), author academic departments (if provided), presentation type (panel, poster, plenary, etc.), presentation text (abstract or full paper depending on availability), and keywords (if provided). In addition to this 2004–2013 dataset of publicly available conference information, we created an additional dataset from conference submissions for 2013, 2014, and 2015, which contains the same fields as the above dataset. By checking submissions against the final programs for 2013–2015, we could analyze acceptance rates across several variables. During and after data collection, we hand-cleaned names, institutions, and departments, ensuring as best as possible that different people with similar names were given separate unique IDs, and that identical people with spelling variations in their names were given the same unique ID. We did the same for departments and institutions. We appended gender information (m/f/other/unknown) to authors by a combination of hand-entry and automated inference using Lincoln Mullen's "gender" package for R (Mullen 2016). This is problematic for many reasons, including a lack of possible gender options, the inability to encode gender changes over time, and the possibility of our matching incorrect genders to authors—especially those with names poorly represented on U.S. census and birth records (Posner 2015). We are working to improve this process (see an extended discussion in our forthcoming companion piece with Jeana Jorgensen), but feel even uncertain information is better than no information in this context.
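To make the inference step concrete, the short R sketch below shows roughly how name-based coding with Mullen's gender package might look. This is a minimal sketch, not the authors' actual pipeline: the sample names, the year range, and the confidence threshold are our own illustrative assumptions, and the "ssa" method additionally requires the package's companion genderdata package to be installed.

# Minimal sketch of name-based gender inference (illustrative, not the authors' code).
library(gender)

first_names <- c("Scott", "Nickoal", "Jeana")  # hypothetical sample of author first names
inferred <- gender(first_names, years = c(1932, 2012), method = "ssa")

# Treat low-confidence calls as "unknown" so they fall back to hand coding;
# names absent from the reference data return no row and must be hand-coded as well.
low_conf <- pmax(inferred$proportion_male, inferred$proportion_female) < 0.95
inferred$gender[low_conf] <- "unknown"
inferred[, c("name", "gender")]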
Finally, we used a combination of Google Spreadsheets, Microsoft Excel, Notepad++, OpenRefine, and the R and RStudio development environment to collect and analyze the data for trends. We opt to present simple visualizations, counts, and comparisons rather than more rigorous statistical results in the interest of clarity, but at the expense of certainty. Readers should interpret these results as indicative rather than conclusive. Findings The number of presentations and unique authors at the annual conference has increased nearly every year in the last decade (see Figure 1). Although the data do not appear in Figure 1, preliminary analysis shows even greater acceleration in 2014 and 2015. Figure 1: Rate of DH conference growth over 10 years (2004–2013). Weingart and Eichmann-Kalwara: What’s Under the Big Tent? 5 This matches other analyses of digital humanities (Terras 2012), showing increasing DH activity and participation across the board, with no signs of slowing down. The conference is healthy and attendance rotates, with 60 ± 10% of each year’s authors never having attended previously. This suggests a core of about 200 authors, as of 2013, orbited by a constellation of digital humanists who do not regularly attend the conference, disciplinary tourists (perhaps humanities or computer science researchers or librarians with one-off DH projects), and short-term collaborators on multi-authored projects. Such a large portion of attendees appearing only once raises the question of whether “big tent digital humanities” itself should be considered a discipline in its own right, or simply a meeting place that some steer closer to than others. That is: is DH made up entirely of tourists? Although data for earlier years are unavailable due to privacy standards in many countries, data from the conference in Sydney, Australia in 2015 show that attendance and author lists do not perfectly overlap. Only 70% of pre-registered attendees were also authors of conference presentations. The other 30% of attendees, nearly 150 people, likely included local participants, ADHO committee members, university administrators, and industry professionals. Between attendees and authors, by 2015 we suspect a core community of around 300 returning participants, and a periphery numbering in the several thousands (THATCamp n.d.; @DHNow n.d.).4 That not every author attends, and not every attendee is an author, is itself unsurprising. The demographic difference between the two groups is worth mention, however. We found at DH2015 that ≈35% of authors were women, yet women comprised ≈46% of attendees (Weingart 2015).5 Work must be done to improve representation at future conferences to combat this disparity. 4 This matches with other numbers measured in late 2015 that has since grown to over 7,000 registered users at THATCamp.org, over 24,000 followers of @DHNow on Twitter, etc. 5 See http://scottbot.net/acceptances-to-digital-humanities-2015-part-3/ for a more detailed discussion. Next steps include checking the extent to which this ratio matches the conference “core” of 200 participants, and the various other digital humanities communities and conferences. https://twitter.com/dhnow http://www.thatcamp.org/ https://twitter.com/dhnow http://scottbot.net/acceptances-to-digital-humanities-2015-part-3/ Weingart and Eichmann-Kalwara: What’s Under the Big Tent?6 Topics When submitting to the DH conference, authors must attach author-supplied keywords and ADHO-assigned topics to their presentations. 
Conference committees rarely made this data public before 2013, meaning topical analysis over the last few decades requires hand-coding or algorithmic assistance, neither of which are complete at the time of this writing. Preliminary results are available, however, combining coded data after 2013 (see Figure 2 and Weingart n.d.)6 with anecdotal evidence from preceding years. In recent years, DH presentations have shifted away from project-based to principle- and skill-focused topics. For instance, interface and user-experience design, scholarly editing, and information architecture, among other project-based topics, have declined. Conversely, text analysis, visualization, and data modeling have increased, especially in the last few years. The exception to this is the rise of topics associated with digitization and GLAM (Galleries, Libraries, Archives, & Museums). The most prominent topics covered recently have related to literary studies, text analysis/mining, visualization, archives, and interdisciplinary collaboration. History, linguistics, philosophy, and gender studies have found a home at DH in the past, but their presence fluctuates, especially in comparison with the dominance of literary studies. This dominance should not be surprising given digital humanities' cultural origins (Schreibman, Siemens and Unsworth 2004),7 though it often comes at the expense of representing other equally rich traditions combining technology with the humanities (Leon 2015; Sloman 1978).8 Historical studies jumped from comprising 10% of presentations in 2013 to 17% in 2014, and down to 15% in 2015. It remains unclear whether this indicates random fluctuations, trends over time, or differing regional profiles of DH. Other recently growing topics include semantic analysis and cultural studies. 6 More exhaustive post-2013 topical analyses appear in Weingart's blog (http://scottbot.net/tag/dhconf/). 7 Susan Schreibman, Ray Siemens, and John Unsworth's A Companion to Digital Humanities popularized the term Digital Humanities around a strongly literary tradition. 8 Examples of underrepresented communities include digital public history (Leon 2015) and computational philosophy (Sloman 1978). Figure 2: Topical change at DH Conferences 2013–2015. The most visible drops in coverage came in topics related to pedagogy, scholarly editions, user interfaces, and research involving social media and the web. Between 2013 and 2015, the conference lost a quarter of its coverage related to pedagogy. "Scholarly Editing" dropped from 11% to 7% of the conference proceedings, and "Interface and User Experience Design" from 13% to 8%. Among the more surprising drops were those in "Internet/World Wide Web" (12% to 8%) and "Social Media" (8.5% to 5%). We mention these specifically because the trends are fairly clear across the three years for which we have data, and conform to our anecdotal awareness of previous years. That said, three years of analysis is not enough to form solid conclusions about shifts in topical coverage, and more collection will be required to confirm these results. Authorship Between 2004 and 2013, nearly 2,000 total authors presented at DH, with the most rapid introduction of new authors after 2010 (see Figure 3). Even after taking the growth of the conference itself into account, new authors are appearing faster than we might expect. Figure 4 shows the rate of introduction of new authors normalized by the growth of the conference itself, such that values above 1 mean authors are entering the conference faster than the conference is growing.
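The text does not spell out the normalization behind Figure 4, so the following is one plausible formalization rather than the authors' stated formula. Writing N_t for the number of first-time authors and A_t for the total number of authors in year t, a normalized rate with the stated "values above 1" property is

\[ r_t \;=\; \frac{N_t / N_{t-1}}{A_t / A_{t-1}}, \]

which exceeds 1 exactly when the pool of new authors grows faster year over year than the author population as a whole.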
The rate of new authors is increasing, suggesting the conference is becoming less insular, or perhaps there are more disciplinary tourists, submitting one presentation and never doing so again. The percentage of returning authors is consequently decreasing, while the sheer volume of core authors is still slowly increasing. This suggests, possibly, that the DH conference is growing in popularity and encouraging more tourists faster than it is growing in core members. DH often self-identifies as innately collaborative, yet our study indicates that over one-third of presenters at the DH conference remain close to their disciplinary humanistic roots by adhering to the single-authorship tradition (Spiro 2009).9 It is unclear whether other humanities conferences hold a similar co-authorship ratio. 9 While future research will investigate diversity within presentations (i.e. ask whether individual multi-authored works include collaborators from other countries and institutions), Lisa Spiro compared digital humanities scholarship and disciplinary scholarship to determine the extent of collaboration in DH-oriented and discipline-specific journals (Spiro 2009). Even so, with nearly two-thirds of DH presentations signed by multiple authors, the data indicate a tendency toward collaboration, whether or not that collaboration is innate to all DH work. The co-authorship rate does not likely represent a true account of collaborative work, but rather a lower bound. Collaboration in digital humanities research may often go uncredited, with invisible work contributed by students, interns, or hired assistance. Given this, single-authored DH presentations may have uncredited authors, and perhaps multi-author presentations do not represent their full collaborative scope in the authorship credits. This confusion will continue as long as DH lacks an agreed-upon standard for credit, though work is being done in this direction (Crymble and Flanders 2013). Figure 3: Increasing number of authors at DH conferences who never authored at the conference before. Figure 4: First-authorship rate normalized by conference growth. While the insular nature of humanities research is unlikely to disappear from DH, a time-based analysis shows that the number of single-authored presentations is decreasing, as the average number of authors per presentation steadily grows (see Figure 5). Regional Diversity Since ADHO is a collection of international organizations, we were interested in the regional diversity of conference authors. We inferred author countries based on their institutional affiliations (e.g., University of Victoria is coded as Canada) and clustered them by U.N. macro regional standards (e.g., Canada = Americas). In doing this, our analysis shows the conference lacks regional diversity, which may be attributed to the locations in which the conference is held.10 Between 2004 and 2013, 1,056 authors originated from the Americas (US: 851; Canada: 202; Mexico: 1; Peru: 1; Uruguay: 1), and 794 were from Europe (see Figure 6).
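A minimal R sketch of the affiliation-to-region coding described above; the lookup tables are illustrative stand-ins for the hand-cleaned tables such a pipeline would actually need, not the authors' data.

# Map institutional affiliations to countries, then countries to U.N. macro regions,
# via hand-maintained lookup tables (entries here are illustrative).
institution_country <- c(
  "University of Victoria"     = "Canada",
  "Carnegie Mellon University" = "United States",
  "King's College London"      = "United Kingdom"
)
country_region <- c(
  "Canada"         = "Americas",
  "United States"  = "Americas",
  "United Kingdom" = "Europe"
)

affiliations <- c("University of Victoria", "King's College London")
countries    <- unname(institution_country[affiliations])
regions      <- unname(country_region[countries])
table(regions)  # author counts per U.N. macro region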
Figure 7 shows the prominence of American authors occurred not only in the odd years when the conference was held in the Americas (with ≈65% American authors), but also in the even years when it was held in Europe (with ≈50% American attendees). While the conference remains Americas-centric overall, regional diversity is on the rise, with notable increases of authors from Asia and Oceania, although no scholars affiliated with African countries appeared in this analysis. 10 Between 2004 and 2015, in each odd-numbered year, the DH conference was held in North America, and all even years took place in Europe, with the exception of Australia in 2015. The host country, from 2004–2015, has been: Sweden, Canada, France, USA, Finland, USA, UK, USA, Germany, USA, Switzerland, Australia. Figure 5: Average number of co-authors on a single presentation in a given DH conference year. Figure 6: Authors per region 2004–2013. Authors we were unable to locate are aggregated under "(blank)". Figure 7: Country of author institutions to DH conferences 2004–2013. Preliminary analysis shows greater regional diversity in 2014, and unsurprisingly the most diverse yet in 2015, when the conference was held in Sydney. We feel ADHO's decision to bring the conference farther afield was a step in the right direction. Gender Distribution With women playing increasingly central leadership roles in the DH community, we hoped to see similarly improved representation among DH authors. After coding for author gender, we looked at the percentage of authors each year who were women (or at least who registered as women according to our hand-corrected algorithmic approach), as well as the percentage of first authors who were women (see Figure 8). With minor fluctuations per year but an unchanging average over time, about a third of all authors from 2004–2013 were women. The ratio is only slightly (though consistently) better for first-authorships, such that a higher percentage of first authors were women. The critique may be raised that this is not a problem of representation, but of interest—though even if this were a broadly valid criticism, it is not true in this case. As mentioned earlier, ≈35% of DH2015 authors appear to be women, contrasted against ≈46% of attendees. Thus attendees are not adequately represented among conference authors. From 2004–2013, North American men seem to represent the largest share of authors by far. Figure 8: Percentage of female authors at each annual ADHO conference 2004–2013. Conclusions and Future Analysis The data show that over the last decade, ADHO's international conference has become slightly more collaborative and regionally diverse, that text and literature currently reign supreme, and that women are underrepresented with no signs of improvement thus far. This is at odds with many of our anecdotal experiences with colleagues online and at home, a group that is more diverse and multidisciplinary than the annual conference reflects. We hope for ADHO to take this disparity into account when organizing future conferences. For instance, if conference location correlates to regional diversity of authors, ADHO might consider hosting the DH conference less often in North America and more often in non-Anglocentric countries.
Certainly to some extent, the onus is on the authors and reviewers themselves to promote diversity and broader representation in their panels and projects, and ADHO might find ways to encourage diverse panels and multi-author presentations, or discourage many presentations from the same author. Finally, diversifying the reviewer pool could broaden the topical scope and geographic representation of presentations and attendees. These suggestions reflect efforts already underway in ADHO, which we applaud. We do not make these suggestions as a gesture towards reaching an international conference whose demographics exactly match the global population, but to ensure DH scholarship remains healthy through the inclusion of a broad range of perspectives and approaches. While the preliminary results are useful and telling, we continue to expand our dataset to include DH abstracts since 1989, and with that, we will look deeper into our initial findings. For instance, while we can anecdotally conclude that there has been a shift in the focus of topics presented at DH, from project- to skill-based, we plan to provide a quantitative assessment of these shifts over time and space. It would be interesting to see how topics distribute geographically, to determine whether regional differences contribute to various differences over self-definitions of digital humanities. Furthermore, we hope to examine authorship with more granularity, to interrogate the diversity of multi-authored presentations for cross-institutional and international collaboration. We also plan to analyze the relationships between new and repeat authors with topics and the fields they come from, as well as correlating topic with gender. Preliminary results suggest gender does skew what topic is being discussed, with topics more often written by women less likely to appear in the conference. Finally, we will open our dataset so authors can edit their own information, allowing a more sensitive gender analysis beyond the male/female binary and taking into account the fluidity of the category over time. Author Typology The authors of this article are credited in descending order by significance of contribution. The corresponding author is Nickoal Eichmann-Kalwara (ne). Author contributions, described using the CASRAI CRediT typology ("CRediT – CASRAI" 2016), are as follows (authors identified by initials): Corresponding author: ne; Conceptualization: sw; Methodology: sw, ne; Validation: sw, ne; Formal Analysis: sw, ne; Investigation: sw, ne; Data Curation: ne, sw; Writing – Original Draft Preparation: sw, ne; Writing – Review & Editing: ne, sw; Visualization: sw, ne; Project Administration: sw, ne; Funding Acquisition: sw. Competing Interests The authors have no competing interests to declare. References @DHNow. n.d. Accessed January 30, 2017. https://twitter.com/dhnow. Alliance of Digital Humanities Organizations. 2016. "Conference." Accessed December 6. http://adho.org/conference. Cohen, Patricia. 2010. "Humanities Scholars Embrace Digital Technology." New York Times, November 17. http://www.nytimes.com/2010/11/17/arts/17digital.html. Crymble, Adam, and Julia Flanders. 2013. "FairCite." Digital Humanities Quarterly 7(2). http://www.digitalhumanities.org/dhq/vol/7/2/000164/000164.html. DiRT Directory. 2016. Accessed January 19, 2017.
http://dirtdirectory.org. Earhart, Amy. 2015. "Take Back the Narrative: Rethinking the History of Diverse Digital Humanities." Presentation at Digital Humanities Forum 2015, University of Kansas, September 26. Eichmann-Kalwara, Nickoal, Jeana Jorgensen, and Scott B. Weingart. Forthcoming. "Representation at Digital Humanities Conferences (2000–2015)." DOI: https://doi.org/10.6084/m9.figshare.3120610.v1. Eichmann, Nickoal, and Scott B. Weingart. 2015. "What's Under the Big Tent? ADHO Conference Abstracts, 2004–2015." Presentation at the Digital Humanities Summer Institute Colloquium, Victoria, British Columbia, June 16. DOI: https://doi.org/10.6084/m9.figshare.1461760.v4. French, Amanda. 2015. "THATCamp, Me, and Virginia Tech Libraries." Amandafrench.net (blog), April 13. http://amandafrench.net/2015/04/13/thatcamp-me-and-virginia-tech-libraries/. Google Research Blog. 2010. "Our Commitment to the Digital Humanities." July 14. http://googleresearch.blogspot.com/2010/07/our-commitment-to-digital-humanities.html. Holmberg, Kim, and Mike Thelwall. 2014. "Disciplinary Differences in Twitter Scholarly Communication." Scientometrics, January. DOI: https://doi.org/10.1007/s11192-014-1229-3. Huggett, Jeremy. 2012. "Core or Periphery? Digital Humanities from an Archaeological Perspective." Historical Social Research/Historische Sozialforschung 37: 3(141): 86–105. Kirschenbaum, Matthew G. 2007. "The Remaking of Reading: Data Mining and the Digital Humanities." The National Science Foundation Symposium on Next Generation of Data Mining and Cyber-Enabled Discovery for Innovation (NGDM'07): Final Report. http://www.almaden.ibm.com/cs/projects/iis/hdb/Publications/papers/ngdm07report.pdf. Leon, Sharon. 2015. "User-Centered Digital History: Doing Public History on the Web." Brackett (blog), March 3. https://www.6floors.org/bracket/2015/03/03/user-centered-digital-history-doing-public-history-on-the-web/. Mullen, Lincoln. 2016. "Predict Gender from Names Using Historical Data." https://github.com/ropensci/gender. Posner, Miriam. 2015. "What's Next: The Radical, Unrealized Potential of Digital Humanities." Miriam Posner's Blog, July 27. http://miriamposner.com/blog/whats-next-the-radical-unrealized-potential-of-digital-humanities/. Quirk, Kathy. 2015. "Digital Humanities Lab launches first exhibit." Today@UWM, January 23. http://uwm.edu/humanities/digital-humanities-lab-launches-first-exhibit/. Schreibman, Susan, Ray Siemens, and John Unsworth, eds. 2004. A Companion to Digital Humanities. Oxford: Blackwell. http://www.digitalhumanities.org/companion/. Sloman, Aaron. 1978.
The Computer Revolution in Philosophy: Philosophy, Science and Models of Mind. Highlands, NJ: Humanities Press. Spiro, Lisa. 2009. "Collaborative Authorship in the Humanities." Digital Scholarship in the Humanities (blog), April 21. https://digitalscholarship.wordpress.com/2009/04/21/collaborative-authorship-in-the-humanities/. Sugimoto, Cassidy R, and Scott Weingart. 2015. "The Kaleidoscope of Disciplinarity." Journal of Documentation 71(4): 775–94. DOI: https://doi.org/10.1108/JD-06-2014-0082. Terras, Melissa. 2011. "Peering Inside the Big Tent: Digital Humanities and the Crisis of Inclusion." Melissa Terras' Blog, July 26. http://melissaterras.blogspot.com/2011/07/peering-inside-big-tent-digital.html. Terras, Melissa. 2012. "Infographic: Quantifying Digital Humanities." UCL Centre for Digital Humanities (blog), January 20. http://blogs.ucl.ac.uk/dh/2012/01/20/infographic-quantifying-digital-humanities/. Terras, Melissa. 2015. "Why I Do Not Trust Frontiers Journals, Especially Not @FrontDigitalHum." Melissa Terras, July 21. http://melissaterras.org/2015/07/21/why-i-do-not-trust-frontiers-journals-especially-not-frontdigitalhum/. Terras, Melissa, Julianne Nyhan, and Edward Vanhoutte, eds. 2013. Defining Digital Humanities: A Reader. Ashgate Publishing. THATCamp.org. n.d. "People." Accessed January 30, 2017. http://thatcamp.org/people/. Weingart, Scott. 2015. "Acceptances to Digital Humanities 2015 (part 3)." Scottbot.net (blog), June 27. http://scottbot.net/acceptances-to-digital-humanities-2015-part-3/. Weingart, Scott. n.d. "Tag: dhconf." Scottbot.net (blog). Accessed January 30, 2017. http://scottbot.net/tag/dhconf/. How to cite this article: Weingart, Scott B. and Nickoal Eichmann-Kalwara. 2017. "What's Under the Big Tent?: A Study of ADHO Conference Abstracts." Digital Studies/Le champ numérique 7(1): 6, pp.
1–17, DOI: https://doi.org/10.16995/dscn.284. Submitted: 20 October 2015. Accepted: 31 October 2016. Published: 13 October 2017. Copyright: © 2017 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/. Digital Studies/Le champ numérique is a peer-reviewed open access journal published by Open Library of Humanities. work_7uefqbkdyzgupbodq53dlxecme ---- CDH2020 BENEVOLENCE AND EXCELLENCE: DIGITAL HUMANITIES AND CHINESE CULTURE, 20-21, Shanghai, China. President of ADHO Constituent Organizations Board, Elisabeth Burr, 20.10.2020. 1. SHORT INTRODUCTION A very good morning to all of you. I have the great honour and joy to bring the greetings from the Alliance of Digital Humanities Organizations to you. As my Internet connection is shaky, you will watch the video I have recorded beforehand. Please go to ADHO's website if you want to know which associations make up ADHO, as this part of my speech was, sadly, cut for time reasons. Have a very good conference! 2. RECORDED SPEECH A very good morning to you all, whether you are taking part in the conference in person or via computer screens. When I received the invitation to attend the opening ceremony of the 2020 Chinese Digital Humanities Conference from your colleague, Dr Jing Chen, on behalf of the executive committee of this conference, I was thrilled. I could, however, not accept the invitation to be your distinguished guest straight away. As President of the Alliance of Digital Humanities Organizations (ADHO) I obviously had to inform, first of all, those whom I represent and ask them for their opinion. As the reaction was unanimously positive, I now have the great pleasure and honour to welcome you to this conference, to bring to you the collegial greetings of the Alliance of Digital Humanities Organizations, and to wish your conference, with the wonderful theme "Benevolence and Excellence: Digital Humanities and Chinese Culture", a huge success. Allow me to say a few words about the Alliance of Digital Humanities Organizations. The first Digital Humanities organisations, or better, organisations for Humanities Computing, as the field was called at the beginning, were founded, as far as we know, in the 1970s. In fact, in 1973 the European Association for Literary and Linguistic Computing (ALLC) was founded at King's College London, and in 1978 followed the foundation of the Association for Computers and the Humanities (ACH) in the United States of America. From 1988 onward, these two Associations celebrated joint conferences, which took place one year in Europe and the other year in the United States.
In 2002, discussions started about an umbrella organization which would foster closer collaboration and exchange within the field of digital humanities more widely and which other organisations might also want to join. These discussions led to the foundation of the Alliance of Digital Humanities Organizations, or ADHO as it is generally called, in 2005. The first ADHO Digital Humanities Conference was celebrated in Paris in 2006. ADHO's aim is to promote and support digital research and teaching across all arts and humanities disciplines, and to foster excellence in research, publication, collaboration and training. Over the years, more and more Digital Humanities associations applied to become Constituent Organisations of ADHO. In 2007, the Canadian Society for Digital Humanities / Société canadienne des humanités numériques (CSDH / SCHN) joined ADHO; in 2012 CenterNet and the Australasian Association for Digital Humanities (aaDH) followed; in 2013 the Japanese Association for Digital Humanities (JADH) became a constituent organisation of ADHO; 2016 saw the arrival of Humanistica, L'association francophone des humanités numériques / digitales; and in 2019 ADHO welcomed the Taiwanese Association for Digital Humanities (TADH), la Red de Humanidades Digitales (RedHD), based in Mexico, and the Digital Humanities Association of Southern Africa (DHASA). As the legal entity for ADHO, The Stichting ADHO Foundation, was established in the Netherlands, ADHO needs to respect Dutch laws. ADHO sponsors a whole range of Special Interest Groups (SIGs) which enable members of ADHO Constituent Organisations who have similar professional specialties and interests to exchange ideas, keep themselves up to date on developments in their specific field, and develop related activities. These include: • Digital Literary Stylistics (SIG-DLS) • Audiovisual Data in Digital Humanities (SIG AVinDH) • Global Outlook::Digital Humanities (GO::DH) • GeoHumanities SIG • Libraries and Digital Humanities • Linked Open Data (DH-LOD). ADHO offers its own Constituent Organisations and SIGs a common infrastructure where web pages, mailing lists, and email addresses can be hosted, and where tools like Wordpress, Drupal, Mediawiki and others are made available for the community. This infrastructure is also used by affiliated bodies like the Text Encoding Initiative (TEI) and by some of the digital scholarly journals which are published by ADHO Constituent Organisations and which ADHO sponsors: the open access peer-reviewed Digital Humanities Quarterly (DHQ), Digital Studies / Le champ numérique, and Humanités numériques. The Journal of the Alliance of Digital Humanities Organizations is the peer-reviewed and Impact Factor-holding Digital Scholarship in the Humanities (DSH), published by Oxford University Press. Among the many countries from which proposals for publication were submitted during the last year, the People's Republic of China, with 38 submissions, actually holds the top place on the list. Every year ADHO organises the Digital Humanities Conference. At first, this conference took place either in Europe or in the United States. When more associations joined, ADHO's conference started to travel to other continents and countries. This year's conference was supposed to be in Ottawa, Canada, but because of Covid-19 it had to be cancelled. This was a very sad and frustrating experience for us all.
That we managed to have a virtual conference in the end was only possible because some of our colleagues were prepared to invest a lot of their time and energy in its realisation. Obviously, this virtual conference could not do away with the loss we feel. It would have been so much better if we could have met in person, as at least some of you can do at this conference. We were longing all through the year to meet our colleagues, exchange ideas with them, construct networks and collaborations and, above all, get to know each other better. We come, after all, from very different countries and continents, belong to diverse cultures, speak different languages, and have different views and perspectives on Digital Humanities. As ADHO's next Digital Humanities conference, scheduled for 2021 in Japan, had to be postponed for a year because of the pandemic, we will now have to wait until 2022 before we can meet the global Digital Humanities community again in person. This also means that we have to wait much longer than planned before we can welcome the Japanese language among the conference languages. Over the years, and because of the hard work of the Standing Committee on Multi-Lingualism & Multi-Culturalism (MLMC), ADHO experienced a process of growing awareness that English is not everything and that the diversity of our community cannot be bridged by having English-only conferences. We had to acknowledge instead that the languages and cultures we call our own determine our doing and our concepts of Digital Humanities. Slowly but continuously, languages which are spoken by the people or important communities of the country where our conferences took place were admitted as conference languages, and ways were found to be inclusive when we present our papers. We were really looking forward to taking up the challenges which Japanese will certainly present for most of us. ADHO is governed by two boards, a Constituent Organizations Board, composed of one representative from each of the Constituent Organizations (COs) and the Special Interest Groups (SIGs) coordinator. The role of this board is to establish vision, strategy, and policy for ADHO. The second board is the Executive Board, which enacts the decisions taken by the Constituent Organisations Board and administers the day-to-day running of the organisation. As president of the board which represents all the Digital Humanities organisations that together make up ADHO, and also personally, I would have wished to be able to bring you ADHO's greetings in person and to get to know the Digital Humanities community which gathers at this conference at least a little bit, but the pandemic makes this impossible. I hope very much that the present virtual encounter will not remain the only one and that sometime in the future I will meet members of this community and will be able to exchange ideas with them. I also hope that at some point in the future the Digital Humanities community of the People's Republic of China will be part of the ADHO "family" and will, by contributing its own perspectives and cultures, help ADHO to become ever more embracing and sensitive to the diversity of the field and its scholarly community. May you have a great conference and enjoy very enriching scholarly debates and friendly human encounters. work_7utnn5jkqbbbnka5is3wn3wrdm ---- Datenbank Spektrum (2014) 14:169–172. DOI 10.1007/s13222-014-0169-7. EDITORIAL Editorial Theo Härder · Jens Teubner Published online: 15 October 2014
© Springer-Verlag Berlin Heidelberg 2014. T. Härder, AG Datenbanken und Informationssysteme, TU Kaiserslautern, 67663 Kaiserslautern, Germany, E-Mail: haerder@cs.uni-kl.de; J. Teubner, Fakultät für Informatik, Lehrstuhl Datenbanken und Informationssysteme, TU Dortmund, 44227 Dortmund, Germany, E-Mail: jens.teubner@cs.tu-dortmund.de. Special Topic: Data Management on New Hardware. For years, innovative processor architectures, groundbreaking advances in storage technologies, and the rapid evolution of infrastructures have been opening up new scientific questions and research fields for the database community. Because of the dramatic increases in storage capacities and I/O rates, accompanied by falling costs, these developments also have great economic significance in the age of Big Data. On the other hand, vastly improved opportunities for exploiting parallelism through multi-core processors or cluster architectures, as well as larger bandwidths and higher transfer rates in data transmission, constantly expose new bottlenecks and increased blocking potential in existing systems. These developments therefore also force a continuous adaptation and optimization of the available methods and techniques in software solutions, especially in database systems. Whereas such adaptations could formerly often be achieved through mere I/O optimization, the efficient exploitation of the diverse and complex characteristics of modern hardware today requires a coordinated approach that affects several or all components at once. Moreover, whereas the various aspects of performance optimization used to be the sole focus for database systems, nowadays a DBMS is expected to deliver not only high performance but increasingly also energy efficiency or even energy proportionality. A further important goal of this DBMS evolution is automatic adaptation to the highly developed hardware components, transparent to the application. Important areas for new solutions include, for example, hardware-supported query processing, data management using co-processors or GPUs, novel applications of new and future storage technologies (flash, PCM, NVRAM, etc.), DBMS architectures for transactional memory, low-power computing, embedded devices, and so on. In addition, suitable tools for analyzing and optimizing such new DBMS architectures, and for measuring the performance and energy consumption of individual components and of the system as a whole, must be provided. So that the goals stated above for data management on new hardware can be verified, and different approaches to their optimization compared, appropriate benchmarks ultimately have to be developed as well, benchmarks that are not only performance-centric but that in particular also take important aspects of energy efficiency into account. A few years ago, an issue of Datenbank-Spektrum already chose this special topic. At that time, the submissions dealt exclusively with aspects of increasing the performance of the database system and its applications, with the use of flash memory (SSDs) and optimization opportunities in main-memory-based database processing in the foreground. In the four contributions of this issue, the focus has shifted noticeably.
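Since the editorial leans on the term without defining it, here is a standard gloss from the systems literature (ours, not the editors'): a system is energy-proportional when its power draw scales with its utilization, i.e., for utilization u in [0, 1] and power draw P(u),

\[ P(u) \approx u \cdot P(1), \qquad \text{energy efficiency} \;=\; \frac{\text{work completed}}{\text{energy consumed}}. \]

An idle but powered-on server that still draws a large fraction of its peak power is therefore far from energy-proportional, which is precisely the gap that cluster approaches like the WattDB work discussed below try to close by powering nodes on and off.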
Two contributions, exploiting new hardware architectures, also concentrate on questions of energy efficiency. The other contributions primarily examine database-related optimization opportunities when employing GPUs, FPGA cores, many-core NUMA-organized database servers, and the like. In the first contribution, titled HyPer beyond Software: Exploiting Modern Hardware for Main-Memory Database Systems, Florian Funke, Alfons Kemper, Tobias Mühlbauer, Thomas Neumann, and Viktor Leis (TU München) examine the use of novel and diverse hardware capabilities for optimizing main-memory database systems in the context of the HyPer project. In particular, virtual memory management is employed to separate OLAP queries from parallel OLTP transactions on the database data. The authors further investigate concepts and techniques for separating the database data into "hot" and "cold" partitions, for adaptive parallelization and partitioning in order to achieve increased data locality on processor cores, and for improving the synchronization of OLTP transactions. Finally, they report how the heterogeneous processors of low-power machines can be used for high-performance and energy-efficient query processing. In the following contribution, Daniel Schall and Theo Härder (TU Kaiserslautern) summarize the work of their DFG project on energy-efficient processing in database systems. Under the title WattDB—A Journey towards Energy Efficiency, they describe the development of WattDB, a distributed DBMS that runs on a dynamic cluster of low-power machines. The project investigates how and whether the performance of a centralized database server can be provided by a cluster of machines, while approximating energy proportionality in database processing. WattDB approaches this goal by automatically switching machines on and off depending on the database workload. An essential problem is the reachability of all database data from every active cluster node, which implies flexible, dynamic data partitioning and data allocation. With an experiment on a large server and a dynamic ten-node cluster, with comparable resources in terms of CPU power, main-memory and cache size, and external storage, and with an identical version of WattDB running the same workload on both, the exact differences in transaction performance and energy consumption could be measured. While the cluster consistently achieved better values for energy efficiency, it could only keep up with the large server in transaction performance under medium or low OLAP loads. The third contribution, The Design and Implementation of CoGaDB: A Column-oriented GPU-accelerated DBMS by Sebastian Breß (TU Dortmund), provides insight into the problems and techniques of designing and implementing a main-memory database system that employs a built-in GPU as a co-processor in order to process OLAP workloads in an optimized fashion. CoGaDB uses the optimizer framework HyPE to realize a hardware-independent query optimizer that is able to learn cost models for database operators and to distribute workloads efficiently across the available processors.
CoGaDB furthermore implements efficient algorithms, in particular the star join, for combined execution on CPU and GPU. The contribution makes clear how these new techniques interact within a single system. Finally, empirical experiments show that, at run time, CoGaDB quickly adapts to the concretely available hardware through the increasing accuracy of its cost models. The fourth contribution, Heterogeneity-aware Operator Placement in Column-Store DBMS, comes from TU Dresden with the authors Thomas Karnagel, Dirk Habich, Benjamin Schlegel, and Wolfgang Lehner. Assuming a multi-core CPU as a homogeneous execution platform, existing query optimizers determine the most efficient evaluation order of the physical operators required for an SQL query. Today, however, hardware heterogeneity is increasing, so that a multi-core CPU is more and more complemented by diverse computing units such as GPU or FPGA cores. Because of this heterogeneity, optimizing the placement of physical operators becomes ever more important. In their contribution, the authors propose a corresponding strategy, called HOP (Heterogeneity-aware physical Operator Placement), for main-memory, column-oriented database systems. To enable optimal placement decisions at run time, the cost model evaluates characteristics of the computing units involved, execution properties of the operators, and runtime data for each computing unit. The experimental evaluation of the HOP strategy with TPC-H queries showed considerable response-time gains resulting solely from the optimized placement according to the HOP model. The four contributions on this issue's special topic are complemented by a research article, Eine Erweiterung des Relationalen Modells zur Repräsentation räumlichen Wissens (an extension of the relational model for the representation of spatial knowledge). In it, Norbert Paul and Patrick E. Bradley (KIT Karlsruhe) describe how the close kinship between topology and the relational data model can be exploited to introduce topological concepts into the relational data model. They show that the relational closure of the relational algebra corresponds to a kind of "spatial closure" in topology. With a prototypical implementation of this topological-relational algebra, they illustrate how relations can become topological spaces and how a correspondingly extended relational algebra operates on these spaces. Using an example from spatial knowledge processing, the Region Connection Calculus (RCC-8), the authors finally demonstrate the usefulness of this generic approach. Under the heading "Datenbankgruppen vorgestellt" (database groups introduced), you will find a contribution by H.-Jürgen Appelrath and Marco Grawunder on Die Abteilung Informationssysteme der Universität Oldenburg, the information systems group of the University of Oldenburg. After a look at the historical development of the group within the university and the affiliated OFFIS institute, this contribution sketches larger projects in the field of intelligent data management, with applications in the energy sector and in health care, as well as a framework for building data stream management systems. The authors also give an overview of a multitude of further current research topics of their group. In this issue, the "Dissertationen" section offers six summaries of dissertations from the German DBIS community.
The four contributions on this issue's special topic are complemented by a technical paper, Eine Erweiterung des Relationalen Modells zur Repräsentation räumlichen Wissens. In it, Norbert Paul and Patrick E. Bradley (KIT Karlsruhe) describe how the close kinship between topology and the relational data model can be exploited to introduce topological concepts into the relational model. They show that the relational closure of the relational algebra corresponds to a kind of "spatial closure" in topology. With a prototypical implementation of this topological-relational algebra, they illustrate how relations can become topological spaces and how a correspondingly extended relational algebra operates on these spaces. Using an example from spatial knowledge processing, the Region Connection Calculus (RCC-8), the authors finally demonstrate the benefit of this generic approach.

Under the rubric "Datenbankgruppen vorgestellt" you will find a contribution by H.-Jürgen Appelrath and Marco Grawunder on Die Abteilung Informationssysteme der Universität Oldenburg. After a look at the historical development of the group within the university and the affiliated institute OFFIS, this contribution sketches major projects in the field of intelligent data management, with applications in the energy industry and in health care, as well as a framework for building data stream management systems. The authors also give an overview of a variety of further current research topics in their group.

In this issue, the "Dissertationen" rubric offers six summaries of dissertations from the German DBIS community. Finally, the "Community" rubric contains, under News, further up-to-date information from the DBIS community.

Upcoming special topics

1 Information Management for Digital Humanities

The humanities are producing digital research data in ever larger quantities. The specific conditions of this field pose numerous challenges for databases and IR systems: the data and documents are heterogeneous in language, structure, and quality. Although a multitude of standards and methods exists, an overarching view is largely missing. Relevant collections of electronic texts, metadata, images, and other multimedia resources are held by different disciplines and institutions and form a highly distributed and heterogeneous information landscape, whose processing often takes place in the context of specific humanities research questions. Of particular importance are the indexing, publication, and management of digital resources within specific applications, e.g. in archaeology, history, linguistics, or religious studies, but especially also in the context of interdisciplinary research. With introductory and survey articles as well as current research results on selected topics, the special issue is intended to give a broad picture of the current state of information management for the digital humanities. Possible topics from this area include:
• Integrated analysis, processing, and visualization of distributed or heterogeneous collections
• Use, development, and evaluation of vocabularies, thesauri, and ontologies
• Long-term archiving and data provenance
• Cataloguing, annotation, and documentation of resources (data curation)
• Detection, analysis, and visualization of relationships within or across collections, e.g. by analysing place and time, topics, or named entities
• Usability aspects of working with distributed and heterogeneous resources
• Applications for data management, search, and analysis in specific humanities fields
• Big data technologies for the digital humanities
• Research infrastructures for the digital humanities

Guest editors:
Andreas Henrich, Otto-Friedrich-Universität Bamberg andreas.henrich@uni-bamberg.de
Gerhard Heyer, Universität Leipzig heyer@informatik.uni-leipzig.de
Christoph Schlieder, Otto-Friedrich-Universität Bamberg christoph.schlieder@uni-bamberg.de

2 Data Management for Mobility

Mobility is a major factor in our society and daily life. Thus, approaches for data management need to address the resulting dynamics, geospatial and temporal relationships, and distribution of resources. In Web design, the methodology of "mobile first" – developing new Web applications for mobile usage first and adapting them later for the desktop case – is widely embraced by industry. However, it often only considers the user interface and not the data management. This special issue addresses novel approaches and solutions for mobile data management.
We invite submissions on original research as well as overview articles covering topics from the following non-exclusive list:
• Data management for mobile applications
• Context awareness in mobile applications
• Analytic techniques in mobile applications
• Management of moving objects
• Data-intensive mobile computing and cloud computing
• Data stream management
• Complex event processing
• Case studies and applications
• Foundations of data-intensive mobile computing

Expected size of the paper: 8–10 pages (double-column)
Important dates:
• Notice of intent for a contribution: December 15th, 2014
• Deadline for submissions: February 1st, 2015
• Issue delivery: DASP-2-2015 (July 2015)
Guest editors:
Bernhard Mitschang, University of Stuttgart bernhard.mitschang@ipvs.uni-stuttgart.de
Daniela Nicklas, University of Bamberg daniela.nicklas@uni-bamberg.de

3 Best Workshop Papers of BTW 2015

This special issue of the "Datenbank-Spektrum" is dedicated to the best papers of the workshops held at BTW 2015 at the University of Hamburg. The selected workshop contributions should be extended to match the format of regular DASP papers.
Paper format: 8–10 pages, double column
Selection of the best papers by the workshop chairs and the guest editor: April 15th, 2015
Guest editor:
Theo Härder, University of Kaiserslautern, haerder@cs.uni-kl.de
Deadline for submissions: June 1st, 2015

4 Big Data & IR

The term Big Data refers to data – and the corresponding processing strategies – which, due to their sheer size, require a data center for processing, and which become available through ubiquitous computer and sensor technology in many facets of everyday life. Interesting scientific questions in this regard concern the organization and management of Big Data, but also the identification of problems that can now be studied and better understood through the collection and analysis of Big Data. In the context of information retrieval as the purposeful search for relevant content, there are two main challenges: 1) retrieval in Big Data and 2) improved retrieval because of Big Data.

Retrieval in Big Data focuses on the organization, the management, and quick access to Big Data, but also addresses the creative process of identifying interesting research questions that can only be understood and answered in Big Data. Besides the development of powerful frameworks for the maintenance and analysis of text, multimedia, sensor, and simulation data, an important research direction is the question of what kinds of insights Big Data may give us today and in the future. The second challenge in the context of Big Data & IR is the improvement of retrieval approaches through Big Data. Examples include the classic question of improving Web or eCommerce search via machine learning on user behavior data, the use of user context for retrieval, or the exploitation of semantic data like Linked Open Data or knowledge graphs. We are looking for contributions from researchers and practitioners in the context described above. Contributions may be submitted in German or in English and should observe a length of 8–10 pages in the Datenbank-Spektrum format (cf. the author guidelines at www.datenbank-spektrum.de).
Important dates:
• Notice of intent for a contribution: August 15th, 2015
• Deadline for submissions: October 1st, 2015
• Issue delivery: DASP-1-2016 (March 2016)
Guest editors:
Matthias Hagen, Universität Weimar matthias.hagen@uni-weimar.de
Benno Stein, Universität Weimar benno.stein@uni-weimar.de
work_a536k3ux6zfina57hcqvb3wgge ---- SEMI-SUPERVISED TEXTUAL ANALYSIS AND HISTORICAL RESEARCH HELPING EACH OTHER: SOME THOUGHTS AND OBSERVATIONS FEDERICO NANNI, HIRAM KÜMPER AND SIMONE PAOLO PONZETTO

Abstract Future historians will describe the rise of the World Wide Web as the turning point of their academic profession. As a matter of fact, thanks to an unprecedented number of digitization projects and to the preservation of born-digital sources, for the first time they have at their disposal a gigantic collection of traces of our past. However, to understand trends and obtain useful insights from these very large amounts of data, historians will need more and more fine-grained techniques. This will be especially true if their objective turns to hypothesis-testing studies, in order to build arguments by employing their deep in-domain expertise.
For this reason, we focus our paper on a set of computational techniques, namely semi-supervised computational methods, which could potentially provide a methodological turning point for this change. Due to their potential to be both knowledge-driven and data-driven at the same time, these approaches could become a solid alternative to some of the most widely employed unsupervised techniques of today. However, historians who intend to employ them as evidence for supporting a claim can no longer use computational methods as black boxes, but must treat them as a series of well-understood methodological approaches. For this reason, we believe that while developing computational skills will be important for historians, a solid background knowledge of the most important data analysis and evaluation procedures will be even more essential.

International Journal of Humanities and Arts Computing 10.1 (2016): 63–77 DOI: 10.3366/ijhac.2016.0160 © Edinburgh University Press 2016 www.euppublishing.com/journal/ijhac

Keywords: semi-supervised methods; historical studies; data analysis; born-digital archives

1. introduction

In December 2010 Google presented a service called 'Google Ngram Viewer'.1 This tool allows us to look at the occurrence of single words or sentences in specific subsets of the immense corpus digitized by the Google Books project. A few weeks later, Erez Lieberman Aiden and Jean-Baptiste Michel, team leaders of the prototype Viewer, offered a demonstration of the tool at the annual meeting of the American Historical Association in Boston.2 In front of around 25 curious historians, they noted the enormous potential of conducting historical research by extracting information from large corpora. In particular, they revealed a way to deal with one of the biggest issues for historians exploring large datasets, namely rapidly detecting the distribution of specific words in a corpus.3 Interestingly, the development and the functionalities of this tool demonstrate some of the most relevant characteristics of the current interactions between the practice of historical research and the use of computational methods: Firstly, no historian has been directly involved in any step of the development of this project.4 This is particularly significant, given that historians would likely be the primary targets of a tool able to process information from a corpus spanning five hundred years.
As Aiden and Michel remarked, this is due to two well-known reasons: historians traditionally do not have solid computational skills, and they are usually skeptical about the development of quantitative approaches for the analysis of sources.5 Secondly, others have noted that the Ngram Viewer offers an over-simplified research tool, which usually leads to coarse-grained explorative analyses and a few simple historical discoveries.6 Finally, the way in which the Ngram Viewer has been presented and identified outside academia as a representative tool of the digital humanities also reveals the growing enthusiasm for methodology studies and big-data-driven research in this community.7 However, as already remarked, researchers in digital humanities need to bear in mind their long-term purpose, which is to use the computer to answer specific and relevant research questions, not simply to build tools.8

But while the Ngram Viewer symbolises a currently widespread way of employing computational methods for studying historical corpora, namely for data exploration and general hypothesis-confirmation analyses, we believe that a change is about to come. In our opinion, new generations of historians will need more and more fine-grained techniques to conduct inspections of large datasets. This will be especially true if their objective turns from exploratory analyses to hypothesis-testing studies, in order to build arguments by employing their deep in-domain expertise. For this reason, we focus here on a set of computational techniques, namely semi-supervised computational methods, which could potentially provide us with a methodological turning point for this change.9 These approaches make it possible to actively include the human expert in the computational process. Therefore, due to their potential of being both knowledge- and data-driven at the same time, they could become a solid alternative to some of the most common unsupervised techniques currently used. However, historians who intend to employ computational methods as evidence for supporting a claim have to use them as a series of well-known methodological approaches rather than as 'black boxes' whose workings are unknown.10 For this reason, while developing computational skills will be important for historians, a solid background knowledge of the most important data analysis and evaluation procedures will become even more necessary.

Starting from these assumptions, this paper is organized as follows: firstly, a few basic concepts of machine learning are introduced. Then, a diachronic description of the use of computational methods in historical research is presented. Following this, our focus on a specific technique, namely Latent Dirichlet Allocation topic modeling, is defined. Next, the advantages and consequences of semi-supervised topic modeling approaches for the historian's craft are described. Finally, a future project on the use of these methodological frameworks for analysing the different semantic dimensions of specific concepts in a collection of around 1,000 French legal books from the 17th and 18th century is introduced. Our essay focuses on a precise potential of the complex datasets of sources that historians now have at their disposal.
This is the possibility of exploiting the results of fine-grained analyses as historical evidence through the combination of specific in-domain research interests and the scientifically sound employment of computational methods. This will help researchers deal with the abundance of digital materials by extracting precise information from them, and move from exploratory studies to hypothesis-testing analyses. However, now that both large datasets and text mining methods are at our disposal, other challenges are emerging, such as multilingual corpora or the evolution of language in extended diachronic datasets. In the near future this will raise further issues for new generations of historians, increasing the need for advanced computational approaches (e.g. specific language models for machine translation) and demanding ever more advanced competencies of the humanities researcher.

2. supervised and unsupervised text analyses

Before going into the details of how these methods have previously been employed in historical research and how they could be used in the near future, it is important to clarify a few key concepts in data analysis and machine learning that have already been mentioned in the previous paragraphs.11 As described earlier, an initial requirement of many historical studies is to identify semantic similarities and recurrent lexical patterns in a collection of documents. In machine learning there are two main kinds of approaches that allow us to do this.

The first consists of supervised learning methods, which focus on classification tasks. In classification tasks, humans identify a specific property of a subset of elements in the dataset (for example, articles about foreign policy in a newspaper archive) and then guide the computer, by means of an algorithm, to learn how to find other elements with that characteristic. This is done by providing the machine with a dataset of labeled examples ('this is an article about foreign policy', 'this is not'), called a 'gold standard', which are described by a set of other 'features' (for instance, the frequency of each word in each document). Moreover, the learning process is typically divided into two main phases, namely: i) a training phase, in which the predictive model is learnt from the labeled data; ii) a testing phase, in which the previously learnt model is applied to unseen, unlabeled data in order to quantify its predictive power, specifically its ability to generalize to data other than the labeled ones seen during training. Additionally, a validation phase can take place to fine-tune the model's parameters for the specific task or domain at hand – e.g., classifying foreign policy articles from newspaper sources, as opposed to websites. The potential of a good classifier is immense, in that it offers a model that generalizes from labeled to (a potentially very large set of) unlabeled data. However, building such models can also be extremely time-consuming. In fact, researchers not only need a dataset with specific annotated examples to train the classifier but, perhaps even more fundamentally, they need to have an extremely clear sense of what they are looking for, since this is what leads them to define the annotation guidelines and the learning task itself. For this reason, classification methods are arguably not the most convenient approaches for conducting data exploration in situations where a researcher sets out to investigate a dataset with no clear goal in mind other than searching for any phenomenon they deem interesting a posteriori.
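To make this concrete, here is a minimal sketch of such a supervised pipeline in Python with scikit-learn. The tiny labeled 'gold standard' is invented for illustration; a real study would need hundreds of expert-annotated examples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical gold standard: snippets labeled by a domain expert
# (1 = about foreign policy, 0 = not).
texts = [
    "the embassy announced new sanctions against the regime",
    "the senate debated the trade embargo with cuba",
    "local team wins the championship after extra time",
    "new bakery opens downtown to long queues",
    "diplomats met to discuss the border treaty",
    "city council approves funding for the new park",
]
labels = [1, 1, 0, 0, 1, 0]

# Features: word frequencies (here TF-IDF weighted).
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Training phase on labeled data, testing phase on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.33, random_state=0, stratify=labels)
classifier = LogisticRegression().fit(X_train, y_train)

# The report quantifies how well the model generalizes to unseen data.
print(classification_report(y_test, classifier.predict(X_test)))
```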
The second class of methods is unsupervised, and addresses the problem of clustering. In a nutshell, clustering methods aim at grouping elements of a dataset on the basis of their similarity, as computed from their set of features (for example, by looking at patterns in the frequency of words across different documents). This is achieved by computing likenesses across features without relying on labeled examples, unsupervised by humans. Crucially for digital humanities scholars, researchers can study the resulting clusters in order to understand what the (latent) semantic meaning of the similarities between the elements is. Clustering techniques are extremely useful for analyzing large corpora of unlabeled data (i.e., consisting of 'just text'), since they rapidly offer researchers a tool to get a first idea of their content in a structured way (i.e., as clusters of similar elements, which can optionally be hierarchically arranged by using so-called hierarchical clustering methods). This is primarily because, as they do not require labeled data, they can be applied without having in mind a specific phenomenon or characteristic of the dataset to mine (i.e., learn). However, even if scholars have noted their potential, for example for creating serendipity, and different metrics have been proposed for evaluating the number and correctness of the resulting clusters, this remains an extremely challenging task, typically due to the difficulty of interpreting the clusters output by the algorithms.12
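By contrast, a clustering run needs no labels at all. A minimal sketch (again with invented documents) groups texts by their word-frequency profiles; what the resulting groups mean is left entirely to the researcher's interpretation:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Unlabeled documents (invented): no gold standard is required.
documents = [
    "embargo sanctions treaty diplomats border",
    "inflation prices market trade economy",
    "sanctions embassy treaty regime diplomats",
    "market economy inflation trade prices",
]

X = TfidfVectorizer().fit_transform(documents)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Inspect which documents were grouped together; interpreting the
# (latent) meaning of each cluster is the humanist's task.
for cluster, doc in zip(kmeans.labels_, documents):
    print(cluster, doc)
```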
3. studying the past, in the digital world

The potential of computational methods for the study of primary sources has been a recurrent topic in the humanities. As Thomas remarked, already in 1945 Vannevar Bush, in his famous essay 'As We May Think', pointed out that technology could be the solution that would enable us to manage the abundance of scientific and humanistic data; in his vision, the Memex could become an extremely useful instrument for historians.13 The use of the computer in historical research consolidated between the Sixties and the Seventies with its application to the analysis of economic and census data. The advent of cliometrics gave birth to a long discussion on the use of the results of quantitative analysis as evidence in the study of the past.14

Due in part to this long debate on the application of quantitative methods in historical research, and in part to the new potential of the Web as a platform for the collection, presentation, and dissemination of material, a different research focus emerged during the Nineties in what was already at that time identified as digital history.15 As Robertson recently pointed out, this specific attention to the more 'communicative aspects' of doing research in the humanities could be recognized as one of the main differences between the ways in which historians have interpreted the digital turn compared to their colleagues in literary studies over the last twenty years.16 However, regardless of whether historians of the 21st century are interested in employing computational methods for analysing textual documents or not, it is evident that the never-ending increase of digitized and born-digital sources is no longer manageable with traditional close-reading hermeneutic approaches alone.17 For this reason, two different activities have consolidated in the digital humanities community during the last decade. On one side, digital historians started creating tools in order to help other, traditionally trained colleagues employ computational methods.18 On the other side, more recently a small but strongly connected community of historians has decided to focus their efforts on teaching the basics of programming languages and the potential of different textual analysis techniques for conducting exploratory studies of their datasets. As Turkel remarked: 'My priority is to help train a generation of programming historians. I acknowledge the wonderful work that my colleagues are doing by presenting history on the Web and by building digital tools for people who can't build their own. I know that the investment of time and energy that programming requires will make sense only for one historian in a hundred'.19

a. Computational History

The work conducted by William J. Turkel at the University of Western Ontario, with particular attention to his blog 'Digital History Hacks' and his project 'The Programming Historian', could be identified as a starting point of these digital interactions.20 Following Turkel's approaches and advice, a group of historians has begun experimenting with these different computational methods to explore large historical corpora.21 The use of Natural Language Processing and Information Retrieval methods, combined with network analysis techniques and a solid set of visualization tools, are the points around which this new wave of quantitative methods in historiography has consolidated. In recent years several interesting examples of these interactions between historical research and computational approaches have been presented.22 In addition, thanks to collaborations with other digital humanities colleagues (i.e. literary studies researchers and digital archivists), the words 'text mining' and 'distant reading' have become buzzwords of this new trend in digital history.
If we were to look more closely at how these techniques have been applied, we would notice that the first objective of digital humanities researchers has been to show the exploratory potential of these methods and to confirm their accuracy by re-evaluating already well-known historical facts.23 As we will remark in the next sections, this is due to the unsupervised nature of the specific textual analysis techniques most widely used in historical research (e.g., topic modeling), which do not need (but at the same time cannot benefit from) human supervision and in-domain knowledge during the computational process.

b. Topic modeling

Topic modeling is arguably the most popular text mining technique in the digital humanities.24 Its success is due to its ability to address one of the deepest needs of a historian, namely to automatically identify, with as little human supervision as possible (none, ideally), a list of topics in a collection of documents, and how these are intertwined with specific document sources in the collection. At first sight this technique seems to be the methodological future of historical research. However, as researchers rapidly discovered, working with topic modeling toolboxes is neither easy nor guaranteed to yield satisfactory results. First of all, Latent Dirichlet Allocation (LDA, the main topic modeling algorithm), like other unsupervised techniques, needs to be told in advance the number of topics (i.e. clusters) that the researcher is interested in.25 However, knowing the number of topics is itself a non-trivial issue, which leads researchers to a chicken-and-egg problem in which they use LDA to find some interesting topics, while being required to explicitly state the exact number of such topics they are after. Moreover, as this technique looks at the distribution of topics by document, the results will differ greatly depending on the number of topics chosen. Thus, topic modeling highlights both the advantages and the limitations of unsupervised techniques. In fact, the obtained topics are, as others have noticed, usually difficult to decode; each of them is presented as a list of words, and being able to identify it with a specific concept generally depends on the intuitions of the researcher.26
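For readers who have never run a topic model, the following minimal sketch with the gensim library (toy documents, invented for illustration) shows the mechanics, including the fact that the number of topics must be fixed in advance:

```python
from gensim import corpora, models

# Toy corpus; a real study would use thousands of documents.
texts = [
    ["embargo", "sanctions", "treaty", "diplomats"],
    ["inflation", "prices", "market", "trade"],
    ["sanctions", "treaty", "regime", "diplomats"],
    ["market", "economy", "inflation", "trade"],
]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# The researcher must choose num_topics up front.
lda = models.LdaModel(corpus, id2word=dictionary,
                      num_topics=2, random_state=0, passes=10)

# Each topic is only a weighted word list; labeling it as a concept
# is left to the intuition of the researcher.
for topic_id, words in lda.print_topics():
    print(topic_id, words)
```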
The first paper on LDA was published in 2003; however, before 2010 there were just a few publications on humanities topics in which this technique was employed.27 We could identify a turning point in the digital humanities community between 2011 and 2012, when a remarkable number of blog posts, online discussions, workshops and then publications suddenly focused on how to deal with and employ this technique.28 As we will describe later, in the same period Owens observed the risks for humanists of using topic modeling results as justification for a theory, and in general suggested limiting its use to exploratory studies.29

4. semi-supervised textual analysis

Today, if there is something more criticized than the use of quantitative methods in the humanities, it is data-driven research.30 More specifically, we agree that the practice of employing unsupervised computational approaches to analyse a dataset and then relying on their automatically generated results to build a scholarly argument could reduce the role of the humanist in the research process. This is due to two main reasons: firstly, even the more technically skilled historians do not have as solid a statistical background as the computational linguists, computer scientists or other kinds of researchers who are currently implementing these methods; this consequently limits their understanding of both the techniques and the obtained results.31 Secondly, by employing unsupervised techniques, historians will not draw on their background knowledge, and will not directly use these methods for answering specific research questions they have in mind. This is because, since unsupervised methods do not rely on human supervision and are mainly targeted at generating serendipity, they do not, and are not meant to, include human feedback to guide the process of model creation. However, on the other side of the spectrum, supervised classification approaches are particularly time-consuming to build, and their usefulness depends on specific research purposes (i.e., what is the scholar trying to discover by classifying documents into different categories?). Therefore, it is evident that historians interested in performing more fine-grained explorations need a different computational technique, one able to stake out a middle ground between explicit human supervision and serendipitous searching and exploration; a method that could help the researcher switch from general exploratory analyses to more specific ones, from getting a first idea of the contents of a corpus to starting to evaluate theories by employing her/his domain expertise. For this purpose, we argue that a series of semi-supervised topic modeling algorithms, adopted in recent years in the fields of machine learning and natural language processing, could also become established research methods in digital history.

The first is Supervised LDA, originally presented by Mcauliffe and Blei.32 This method makes it possible to derive a distribution of topics by considering a label associated with each document. In their paper the authors note the potential of this method when the prediction of a specific value is the ultimate goal; to this end, they combine movie ratings and text reviews to predict the score of unrated reviews. However, as remarked by Travis Brown, historians could also experiment with this technique, for example to identify the relation between topics and labels (i.e. to find the most relevant topics for 'economics' articles).33 A conceptual extension of this technique is Labeled LDA, developed by Ramage et al.34 This method makes it possible to highlight the distribution of labeled topics in a set of multi-labeled documents. If we imagine a corpus where every document is described by a set of meta-tags (for example, a newspaper archive with articles associated with 'economics', 'foreign policy', and so on), Labeled LDA will identify the relation between topics, documents and tags, and its output will consist of a list of topics, one for each tag. This, in turn, could be used to identify which part of each document is associated with each tag.
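The core idea of Labeled LDA, namely that each document may only draw on topics corresponding to its human-assigned labels, can be conveyed in a deliberately tiny collapsed Gibbs sampler. The following is a pedagogical sketch with invented data, not Ramage et al.'s implementation:

```python
import random
from collections import defaultdict

# Invented multi-labeled corpus: (tokens, expert-assigned labels).
docs = [
    ("embargo cuba sanctions treaty".split(), ["foreign_policy"]),
    ("market inflation prices trade".split(), ["economics"]),
    ("cuba embargo trade market".split(), ["foreign_policy", "economics"]),
]
topics = sorted({l for _, ls in docs for l in ls})   # one topic per label
K, ALPHA, BETA = len(topics), 0.1, 0.01
V = len({w for ws, _ in docs for w in ws})

n_dk = [defaultdict(float) for _ in docs]            # document-topic counts
n_kw = [defaultdict(float) for _ in range(K)]        # topic-word counts
n_k = [0.0] * K
state = []

random.seed(0)
for d, (words, labels) in enumerate(docs):
    allowed = [topics.index(l) for l in labels]      # the Labeled-LDA constraint
    zs = []
    for w in words:
        z = random.choice(allowed)
        n_dk[d][z] += 1; n_kw[z][w] += 1; n_k[z] += 1
        zs.append(z)
    state.append((words, allowed, zs))

for _ in range(200):                                 # collapsed Gibbs sweeps
    for d, (words, allowed, zs) in enumerate(state):
        for i, w in enumerate(words):
            z = zs[i]                                # unassign current token
            n_dk[d][z] -= 1; n_kw[z][w] -= 1; n_k[z] -= 1
            weights = [(n_dk[d][k] + ALPHA) * (n_kw[k][w] + BETA)
                       / (n_k[k] + BETA * V) for k in allowed]
            z = random.choices(allowed, weights)[0]  # restricted resampling
            n_dk[d][z] += 1; n_kw[z][w] += 1; n_k[z] += 1
            zs[i] = z

for k, label in enumerate(topics):                   # top words per label-topic
    top = sorted(n_kw[k], key=n_kw[k].get, reverse=True)[:3]
    print(label, top)
```

The only difference from plain LDA is that each token's topic is resampled from the document's allowed label set rather than from all K topics.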
Another relevant approach is Dirichlet-multinomial regression, proposed by Mimno and McCallum.35 As the authors describe, rather than generating metadata (as, for example, the ratings in Supervised LDA) or estimating topical densities for metadata elements (the topics related to metadata, as in Labeled LDA), this method learns topic assignments by considering a set of pre-assigned document features. In their paper the researchers show how authors, paper citations and dates of publication can be useful features of external knowledge for improving the topic model representation of a dataset of academic publications. Finally, a last method is Seeded LDA.36 Instead of using a prior set of descriptive labels for each document or topic, as in the previous approaches, Seeded LDA offers the possibility of manually defining a list of seed words for the topics the researcher is interested in. Let us imagine, for instance, that we are after a specific topic within the corpus of interest (e.g., news related to the relations between the USA and Cuba in a newspaper archive): using Seeded LDA, the researcher could guide the topic model in a specific direction, receiving as output the distribution of topics that she/he is interested in.

A thorough comparison of these different semi-supervised topic modeling techniques is beyond the scope of this paper. However, the fact that all of these methods make it possible to include the human (i.e., the humanities scholar) in the loop (i.e., the learning process), by requiring the expert to provide either labeled metadata or a set of initial seed words to guide the topic acquisition process, is crucial for our argument. We argue that this last option, in particular, is very attractive for digital historians in that it forces them to explicitly state the lexical components of the specific topics they are after, while requiring a minimal amount of supervision. That is, the scholar has to input a small set of seed words he/she deems important on the basis of her/his expertise, as opposed to merely labeling documents with a pre-compiled set of class labels.
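Seeded LDA has no single canonical implementation in common Python libraries; one widespread approximation, sketched here with invented seed words, is to bias the topic-word prior (gensim's eta parameter) so that each guided topic starts with extra probability mass on its seed terms:

```python
import numpy as np
from gensim import corpora, models

texts = [
    ["embargo", "cuba", "sanctions", "treaty"],
    ["market", "inflation", "prices", "trade"],
    ["cuba", "embargo", "trade", "sanctions"],
]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

# Seed words chosen by the domain expert (hypothetical).
seeds = {0: ["cuba", "embargo"], 1: ["market", "prices"]}

# Start from a small symmetric prior, then boost the cells of the
# seed words in "their" topic.
K = len(seeds)
eta = np.full((K, len(dictionary)), 0.01)
for topic, words in seeds.items():
    for w in words:
        eta[topic, dictionary.token2id[w]] = 1.0

lda = models.LdaModel(corpus, id2word=dictionary, num_topics=K,
                      eta=eta, random_state=0, passes=20)
for topic_id, words in lda.print_topics():
    print(topic_id, words)
```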
5. how data becomes evidence

In the previous section we gave a brief overview of different semi-supervised topic modeling techniques, and argued that they could help historians exploit sources of information like metadata and seed words, stemming from their human expertise as scholars, in order to perform fine-grained exploration analyses. Topic modeling is a fascinating way of navigating through large corpora, and it can become even more interesting for the researcher when the tool is made to consider specific labels or seed words. Regarding this, Owens remarked: 'If you shove a bunch of text through MALLET and see some strange clumps clumping that make you think differently about the sources and go back to work with them, great'.37 Then he continues: 'If you aren't using the results of a digital tool as evidence then anything goes'.

In the second sentence Owens perfectly describes the current main problem for digital humanities scholars employing text mining methods. As others have already remarked, on the one hand the research community wants to see the humanistic relevance of these analyses, and not only the computational benefits.38 On the other hand, digital humanists are aware that they cannot present the results of their studies as evidence without a solid evaluation of the performance of the methods. For instance, if the purpose is to detect articles related to a specific subject (i.e. the relations between the USA and Cuba), the documents obtained by looking at the distribution of specific (LDA-derived) topics are nothing more than an innovative way of searching through the dataset. Thus, it is important to keep in mind that these documents are not the only articles about the subject, and that maybe they are not even about that specific subject at all, due to errors in the automatic learning process. Therefore, if we want to transform our data into evidence for supporting a specific argument or for confirming a hypothesis, we always have to evaluate our approach first. It is interesting to notice that this process would sound perfectly ordinary if we were not talking about machine learning methods, computers and algorithms. When a researcher wants to be sure that a viewpoint is correct ('I believe this article is focused on the relations between the USA and Cuba'), she/he will ask other colleagues.39 The process described here is the same: we need human annotations (for example, articles marked as 'being focused on the relations between the USA and Cuba' or not) in order to confirm that our hypothesis (that what the machine is showing us are articles related to the relations between the USA and Cuba) is correct. Moreover, since humanists work on extremely specific in-domain research tasks, they cannot rely on Amazon Mechanical Turk annotations as others usually do.40 For this specific issue, they cannot even rely on computer scientists or data mining experts: they need the help of their peers.

Therefore, we believe that future advances in historical research on large corpora will essentially be achieved by exploiting deep human expertise, such as that provided by history scholars, as a key component within weakly-supervised computational methods, in two different ways.

Figure 1. In this figure the methodological framework we suggest for analysing large historical corpora is summarized. Both the in-domain knowledge of the researcher and a solid expertise in data analysis are key components.

In our vision (Fig. 1), a first stage will still consist of exploratory studies, which are extremely useful for developing an initial idea of arbitrary datasets. During this process, both standard LDA and especially the semi-supervised methods presented earlier could be particularly useful, as they will help researchers manage the vastness of the digital data at their disposal. Following the exploratory phase, when the interest in a specific phenomenon has been established, we envision researchers moving on to develop models that quantify such a phenomenon in text, and creating a gold standard for evaluation based on human ground-truth judgments – again, based on input from domain experts, i.e., scholars. During this second part of the study it may turn out that methods useful for exploratory studies (such as LDA) are not always as helpful when the task is to precisely identify specific phenomena. For this reason, the new generation of historians needs to learn how to employ text classification algorithms and to become more and more confident with data analysis evaluation procedures.41 As a matter of fact, these practices have the potential to sustain and improve our comprehension of the past when dealing with digital sources.
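What such an evaluation step might look like in code is straightforward; the point is the workflow, not the few lines. All annotations below are invented for illustration:

```python
from sklearn.metrics import precision_recall_fscore_support

# Expert annotations (gold standard) vs. model predictions for ten
# documents: 1 = "about the USA-Cuba relations", 0 = not.
gold =      [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

precision, recall, f1, _ = precision_recall_fscore_support(
    gold, predicted, average="binary")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```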
6. case study: applying these procedures in a well-defined historical research project

In this final section we describe how we intend to employ the methodological framework presented above in an interdisciplinary research project that, in the near future, will bring together researchers from the Historical Institute and the Data and Web Science Group of the University of Mannheim. Our case study will focus on circa 1,000 legal books from the 17th and 18th century, comprising over 310,000 pages of text. This is of course a large corpus for a historian, but only a small one for current research in computerized text analysis. Therefore, testing computational methods for specific analyses may prove insightful for both disciplines. These volumes form the 'Juridica' part of a book collection brought to Mannheim by the learned Jesuit François-Joseph Terrasse Desbillons (1711–1798) in the 1770s. They cover a broad variety of legal matters with a special, but not very surprising, interest in canon law, and another, slightly more surprising interest in legal history, or more precisely: the old (French) law.

Based on this corpus, we want to know more about this old French law, the 'ancien droit'. Yet we do not trace legal institutions, ideas, or regulations. Rather, we ask about the fundamental terms that old French law rested upon. These terms lay the conceptual groundwork upon which concrete institutions, rules, and distinctions of legal thinking were built. Hence, they are usually not technical in a stricter sense (i.e. not exclusively legal), or bear multiple semantic dimensions largely depending upon their uses in specific contexts, e.g. terms like volonté ('will'), origin ('origin'), or liberté ('liberty'). We aim to find these terms and their specific contexts, cluster similar contexts together, and weight them against each other, iteratively reaching a broad yet precise spectrum of their meanings.

Traditionally, dictionaries like these are compiled by domain experts (i.e. historians) by reading large amounts of contemporary texts, and by analysing these texts in what we, broadly speaking, term a 'hermeneutical' fashion. The selection of texts rests upon the researcher and his or her scope of reach, its amount on what he or she can physically read/bear, and its results rest largely on what he or she can find by physically reading either line by line or hastily flipping through the texts. This is not to say that this traditional method cannot or will not lead to fruitful conclusions.42 In the end, however, these projects are largely based on the presuppositions of the researcher about what she/he can (or will) actually find in the texts, and which texts will be more likely to give fruitful results. In other words, the researcher predefines both search terms and contexts. Our approach, in contrast, will also start with presuppositions, but will iteratively enlarge them by finding both new contexts and probably even new search terms. It could, for instance, well be that notions of 'will' (volonté) and its faculties are discussed in contexts of compulsion (contrainte, compulsion, coercition) without even using a word deriving from volonté. Term-based textual analysis will not find such instances, but concept-based analysis will, even in far less obvious examples than the one given here.
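A first, very simple building block for such concept-based exploration is a keyword-in-context view driven by an expert-provided lexicon, so that volonté can be studied alongside its counter-concepts. A toy sketch (the French snippet is invented):

```python
import re

# Invented snippet standing in for OCR'd text from the corpus.
page = ("la volonté du roi ... nul ne peut être soumis à la contrainte "
        "sans titre ... la coercition exercée contre sa volonté ...")

# Expert-provided lexicon for 'will' and its counter-concepts.
lexicon = ["volonté", "contrainte", "compulsion", "coercition"]

# Keyword-in-context: a fixed character window around each match.
for term in lexicon:
    for m in re.finditer(term, page):
        start, end = max(m.start() - 20, 0), m.end() + 20
        print(f"{term:11} ...{page[start:end]}...")
```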
As described before, our work will proceed through different steps. In the beginning, coarse-grained exploratory analyses (i.e. using standard LDA) will offer us a general idea of the content of the volumes and their similarities. Then, by combining different weakly-supervised techniques like Supervised LDA and Seeded LDA, we will exploit domain-expert knowledge to identify the semantic contexts in which these relevant concepts appear and to detect other similar patterns in the corpus. Finally, in order to use the results of these analyses as historical evidence, we will test, compare and improve our methods on a gold standard that will be built for this specific purpose.

7. conclusions

In this paper, we have discussed the applicability of a set of computational techniques for conducting fine-grained analyses of historical corpora. Furthermore, we have remarked on the importance of an evaluation step when the data are exploited as evidence to support specific hypotheses. We believe that these practices will allow us to deepen our understanding of the historical information embedded in digital data.

acknowledgements

The authors want to thank Laura Dietz (Data and Web Science Group) and Charlotte Colding Smith (Historical Institute) for their valuable methodological advice.

end notes

1 https://books.google.com/ngrams; all the URLs mentioned in this research were last checked on November 13th 2015.
2 J.B. Michel et al., 'Quantitative analysis of culture using millions of digitized books', Science, 331.6014 (2011), 176–182; A. Grafton, 'Loneliness and Freedom', Perspectives on History, online edition, March 2011, http://www.historians.org/publications-and-directories/perspectives-on-history/march-2011/loneliness-and-freedom.
3 G. Crane, 'What do you do with a million books?', D-Lib Magazine, 12.3 (2006).
4 Grafton, 'Loneliness and Freedom'.
5 See: http://www.culturomics.org/Resources/faq/thoughts-clarifications-on-grafton-s-loneliness-and-freedom; F. Gibbs and T. Owens, 'The hermeneutics of data and historical writing', in J. Dougherty and K. Nawrotzki ed., Writing History in the Digital Age (Ann Arbor, MI, 2013).
6 D. Cohen, 'Initial Thoughts on the Google Books Ngram Viewer and Datasets', Dan Cohen's Digital Humanities Blog, 19/10/2010, http://www.dancohen.org/2010/12/19/initial-thoughts-on-the-google-books-ngram-viewer-and-datasets/.
7 See the answer to 'How does this relate to "humanities computing" and "digital humanities"?' in the Culturomics FAQ section: http://www.culturomics.org/Resources/faq; C. S. Fisher, 'Digital Humanities, Big Data, and Ngrams', Boston Review, 20/06/2013, http://www.bostonreview.net/blog/digital-humanities-big-data-and-ngrams; C. Blevins, 'The Perpetual Sunrise of Methodology', 05/01/2015, http://www.cameronblevins.org/posts/perpetual-sunrise-methodology/
8 I. Gregory, 'Challenges and opportunities for digital history', Frontiers in Digital Humanities, 1 (2014); M. Thaller, 'Controversies around the Digital Humanities: An Agenda', Historical Social Research/Historische Sozialforschung (2012), 7–23.
9 O. Chapelle et al., ed., Semi-Supervised Learning (Cambridge, MA, 2006).
10 T. Owens, 'Discovery and justification are different: Notes on science-ing the humanities', 19/11/2012, http://www.trevorowens.org/2012/11/discovery-and-justification-are-different-notes-on-sciencing-the-humanities/; D. Sculley and B. M. Pasanek, 'Meaning and mining: the impact of implicit assumptions in data mining for the humanities', Literary and Linguistic Computing, 23.4 (2008), 409–424.
11 R. S. Michalski, J. G. Carbonell and T. M.
Mitchell, Machine Learning: An Artificial Intelligence Approach (Heidelberg, 1983).
12 E. Alexander et al., 'Serendip: Topic model-driven visual exploration of text corpora', Proceedings of the IEEE Conference on Visual Analytics Science and Technology (Paris, 2014); M. Steinbach, G. Karypis and V. Kumar, 'A comparison of document clustering techniques', KDD Workshop on Text Mining, 400-1 (2000), 525–526.
13 W. G. Thomas III, 'Computing and the historical imagination', in S. Schreibman, R. Siemens and J. Unsworth, ed., A Companion to Digital Humanities (Oxford, 2004), 56–68.
14 D. N. McCloskey, 'The achievements of the cliometric school', The Journal of Economic History, 38.01 (1978), 13–28.
15 D. J. Cohen and R. Rosenzweig, Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web (Philadelphia, 2006).
16 S. Robertson, 'The differences between digital history and digital humanities', 23/05/2014, http://drstephenrobertson.com/blogpost/the-differences-between-digital-history-and-digital-humanities/.
17 S. Graham, I. Milligan and S. Weingart, The Historian's Macroscope - working title, Open Draft Version, Autumn 2013, http://themacroscope.org.
18 For example the TAPoR project: http://www.tapor.ca/.
19 In D. J. Cohen et al., 'Interchange: The promise of digital history', The Journal of American History (2008), 452–491.
20 William J. Turkel's blog: http://digitalhistoryhacks.blogspot.com/; The Programming Historian: http://programminghistorian.org/.
21 For example, I. Milligan, 'Mining the "Internet Graveyard": Rethinking the Historians' Toolkit', Journal of the Canadian Historical Association/Revue de la Société historique du Canada, 23.2 (2012), 21–64.
22 For instance, C. Blevins, 'Space, Nation, and the Triumph of Region: A View of the World from Houston', Journal of American History, 101.1 (2014), 122–147 and M. Kaufman, 'Everything on Paper Will Be Used Against Me: Quantifying Kissinger', 2014, http://blog.quantifyingkissinger.com/.
23 For example, C. Au Yeung and A. Jatowt, 'Studying how the past is remembered: towards computational history through large scale text mining', Proceedings of the 20th ACM International Conference on Information and Knowledge Management (Glasgow, 2011).
24 E. Meeks and S. Weingart, 'The digital humanities contribution to topic modeling', Journal of Digital Humanities, 2.1 (2012), 1–6.
25 D. M. Blei, A. Y. Ng and M. I. Jordan, 'Latent dirichlet allocation', The Journal of Machine Learning Research, 3 (2003), 993–1022.
26 J. Chang et al., 'Reading tea leaves: How humans interpret topic models', Advances in Neural Information Processing Systems, 2009.
27 R. Brauer, M. Dymitrow and M. Fridlund, 'The digital shaping of humanities research: The emergence of Topic Modeling within historical studies', Enacting Futures: DASTS 2014 (Roskilde, 2014).
28 T. Underwood, 'Topic modeling made just simple enough', The Stone and the Shell, 07/04/2012, http://tedunderwood.com/2012/04/07/topic-modeling-made-just-simple-enough/; Storify of the DH Topic Modeling Workshop: https://storify.com/sekleinman/dh-topic-modeling-seminar; Meeks and Weingart, 'The digital humanities contribution to topic modeling'.
29 Owens, 'Discovery and justification are different: Notes on science-ing the humanities'.
30 S. Marche, 'Literature is not data: Against digital humanities', LA Review of Books (2012); L. Wieseltier, 'Crimes against humanities', New Republic, 244.15 (2013), 32–39.
31 D. Hall, D. Jurafsky and C. D.
Manning, 'Studying the history of ideas using topic models', Proceedings of the Conference on Empirical Methods in Natural Language Processing (Honolulu, 2008); D. Mimno, 'Computational historiography: Data mining in a century of classics journals', Journal on Computing and Cultural Heritage, 5.1 (2012); M. Schich et al., 'A network framework of cultural history', Science, 345.6196 (2014), 558–562.
32 J. D. Mcauliffe and D. M. Blei, 'Supervised topic models', Advances in Neural Information Processing Systems (2008).
33 T. Brown, 'Telling New Stories about our Texts: Next Steps for Topic Modeling in the Humanities', DH2012: Topic Modeling the Past, http://rlskoeser.github.io/2012/08/10/dh2012-topic-modeling-past/
34 D. Ramage et al., 'Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora', Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (Singapore, 2009).
35 D. Mimno and A. McCallum, 'Topic models conditioned on arbitrary features with Dirichlet-multinomial regression', Uncertainty in Artificial Intelligence, 2008.
36 J. Jagarlamudi, H. Daumé III and R. Udupa, 'Incorporating lexical priors into topic models', Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (Avignon, 2012).
37 Owens, 'Discovery and justification are different: Notes on science-ing the humanities'.
38 M. Thaller, 'Controversies around the Digital Humanities: An Agenda'.
39 The examples presented here describe an oversimplified case study. However, the complexity of the evaluation process can easily be shown by turning to more complex, realistic tasks, for example identifying how the different meanings of 'will' evolve within a reasonably sized historical corpus.
40 In computational linguistics and natural language processing, the use of human non-expert annotators for the construction of labeled datasets has become an established practice during the last decade. To learn more about the online labor market Amazon Mechanical Turk: https://www.mturk.com/mturk/welcome.
41 F. Sebastiani, 'Machine learning in automated text categorization', ACM Computing Surveys, 34.1 (2002), 1–47.
42 For example R. Koselleck, W. Conze and O. Brunner, ed., Geschichtliche Grundbegriffe, 8 vols. (Stuttgart, 1972–1997) and R. Reichardt, E. Schmitt and H. J. Lüsebrink, Handbuch politisch-sozialer Grundbegriffe in Frankreich, 1680–1820 (Berlin et al., 1985ff).
work_a6ekywpsd5ahlk57zn2bhownzy ---- Microsoft Word - Syllabus.doc >>> mappingthedigitalhumanities.org <<< room: OUGL 102

Welcome to Comparative History of Ideas 498, Mapping the Digital Humanities! What is the role of digital technologies in learning and taking humanities classes at the university? How are these technologies influencing humanities scholarship and research practices, as well as facilitating critical, collaborative, and creative inquiry? With these questions as a framework, this course provides you with the opportunity to develop your own digital humanities project throughout (and ideally beyond) an entire quarter. More specifically, the class is structured around two approaches to “mapping” in the digital humanities: geographical mapping and textual mapping. In the first instance, as a class, you will collaboratively compose an interactive, digital map of the University of Washington’s Seattle campus through a combination of photography, video, sound, text, and Google Maps and Earth. In the second instance, you will pursue individual projects, where you will use a blend of qualitative and quantitative approaches to produce a digital model of your own research on a particular text or texts. Put this way, both the collaborative and individual projects will function as vehicles for “animating” information and moving audiences toward new ways of engaging humanities research.

This class is an introduction to the digital humanities. No technical competences are required, and the course content stresses technology-focused critical methods and computer-aided approaches to culture, history, and literature. That said, while I will assume that you have no technical competences in computing (specifically in XHTML, CSS, GIS, or data modeling), I will ask you to further the humanities work you have already done. Regardless of what individual project you ultimately choose, I ask that you think of this class both as a direct extension of your previous studies and as a tangible means of preparing you for future studies at the intersection of things digital and things humanistic. Try being a computer geek and a book nerd, simultaneously, if only for a quarter.

“Mapping the Digital Humanities” will be a quarter-long project on a number of registers—individual and collaborative, methodical and experimental, technical and critical. And as for that peculiar title: “mapping” the digital humanities implies not just the maps you will be producing, but also locating possibilities for the digital humanities in your own undergraduate education. This act of locating should allow you a great deal of leeway in making your own choices in this class; it should also allow me to learn a great deal with you in the process.

So what is “the digital humanities,” exactly?

“Don’t teach skills. Teach competences. . . . Computers can do better things than that.” – Sandy Stone, during a July 16, 2007 talk at the European Graduate School

The digital humanities is not a discipline.
So what is “the digital humanities,” exactly?

“Don’t teach skills. Teach competences. . . . Computers can do better things than that.” – Sandy Stone, during a July 16, 2007 talk at the European Graduate School

The digital humanities is not a discipline. It is best understood as a field of study that often requires interdisciplinary work across departments and learning spaces—for instance, here at the UW, this class emerged through a collaboration between faculty, graduate students, and staff in Geography, English, Comparative History of Ideas, and the Simpson Center for the Humanities, and with the input of some undergraduates, I should add. While I am teaching the course, over the last year the development of the curriculum demanded practices, approaches, and experiences that could not be situated solely in the discipline of English.

With that brief history of the course in mind, the digital humanities is the synthesis of technical competences in computing with critical practices in the humanities. Yes indeed, humanists do use computers. In fact, scholars in the digital humanities often:

• “Refashion” print, or digitize and encode print texts for preservation and searching,
• Generate digital models (e.g., graphs, diagrams, and charts) that re-present and re-think the book,
• Study the history of computers and computing practices in humanities contexts,
• Use computers for storing, transmitting, and mining humanities research,
• Work in collaborative teams consisting of, say, literary critics, historians, information scientists, and designers, and
• Assess the cultural implications of new media and technologies.

True, not every digital humanities scholar practices all of the above, and there are many more things to be added to that list. Nevertheless, what each has in common is the fact that technology is never understood as merely a means to rehearse particular skills. Technology is more than that, more than the thing through which input generates output. It is a culturally embedded, contextual catalyst for producing knowledge. Technology shapes us, and we shape technology. Call that a “feedback loop,” if you wish.

Why teach this course, in particular, on “mapping”?

“there can be no true maps” – Fredric Jameson, in Postmodernism, Or, the Cultural Logic of Late Capitalism

My colleague (Matt Wilson, Geography) and I noticed something commonplace in undergraduate education and research at the University of Washington, namely that students in departments such as English and Comparative History of Ideas often learn about how technology is embedded in culture, yet they rarely have the opportunity to acquire technical competences in media production. On the other hand, students in departments such as Geography do often acquire the technical competences they need, but they do so without the chance to learn some critical perspectives on technology. With this apparent polarity or gap in mind, this class asks you to blend the technical with the critical—to see how both function in any technology-focused project. However, by the calendar we must abide. We have just one quarter. Consequently, of all the things digital humanities scholars often do, we’ll narrow them down to two things: (1) modeling and (2) refashioning print.
Both of these are knotted together through “mapping.” This quarter, mapping both the campus and a text (or a group of texts) will allow you to:

• Learn how to use new media and technologies, as well as computer-aided approaches to the humanities, to identify and analyze patterns (e.g., everyday habits on campus, word occurrence, and lines of thought) that you perhaps overlooked in your previous studies and experiences,
• Examine the complex relationships between print and digital texts, particularly how digital humanities scholars do not simply “digitize” print—they reconfigure and reshape it, and
• Understand how all maps and classifications are inherently biased and how to pressure that bias toward critical readings of history, place, literature, and culture.

While there may be no true maps, some are much more persuasive than others, with far more palpable effects. Abstractions are material, and they are not divorced from the actual, felt goings-on of everyday life. Keep this in mind as you progress through the quarter. One challenge will be how—in all the modeling and refashioning—to make your work matter for particular audiences. Another might be how—in all the classifying, coding, and locating—to generate a surprise, or something that an audience does not expect from a map.

By focusing on maps and mapping, we’ll attend to how maps are simultaneously:

• Abstractions and idealized forms,
• Material objects,
• Negotiations between social forces, cultures, practices, structures of power, and people, and
• Classification systems and means of producing and sustaining order.

How are the projects graded?

“The digital pioneers in American literature are beginning to take stock of their achievements. They are asking questions about how the new technology is affecting analysis itself, rather than focusing only on its scope, speed, or convenience.” – Kathlin Smith, in “American Literature E-Scholarship: A Revolution in the Making”

To reiterate: in this class, you will outline, execute, revise, and present your own digital humanities research project that is not only feasible in a quarter, but also builds upon work you’ve already done. What’s more, you’ll be asked to use a method that’s flexible enough to allow you to further develop your project after the class is finished. Your project will emerge in steps, which will include opportunities for you to comment on your peers’ projects, receive feedback from them and me, and experiment with ideas. By the quarter’s end, you should:

• Become familiar with a markup language (XHTML) and a stylesheet language (CSS) and write in both of them (at a novice level) without the use of a computer.
• Collaboratively construct a geographical map (of the UW, Seattle campus) through a set of shared and agreed-upon standards for composing in a networked environment.
• Individually produce a textual map (e.g., of a city depicted in a novel, or of the relations between texts in an archive) and articulate (in an abstract of no more than 300 words) the map’s critical motivation, its classification system, and the method used to produce it.
• Research aspects of a print text (e.g., a novel, a geographical map), refashion and animate them in a digital text, and assess (in 750–1250 words) how that animation affords a novel way for audiences to perceive, navigate, and interpret your research.
• Sample a variety of software and systems (e.g., ArcGIS, WordPress, and Google Visualization, Earth, and Maps) and identify what software and systems are most appropriate for your own digital humanities project.

Note that these learning outcomes are not based simply on making humanities research easier or speedier. Instead, they stress how new technologies afford new analysis, which requires both technical competences and critical practices. Based upon these outcomes, your work will be graded as follows:

• Class participation (30% of the grade): Class time will include hands-on modules on humanities computing, group conversations, short talks, workshops, and critiques. Aside from these components, the class participation grade will also include the timeliness of your work, your participation in three conferences with me, and the quality of your collaboration with your peers.
• Blogging and collaborative project (20% of the grade): You will be blogging throughout the quarter. Since the collaborative project is for the most part housed on the blog, it is also included in this portion of your grade. Factors for assessing the blogging and collaborative project include timeliness, how persuasively your work responds to the prompt at hand, and how concretely the ideas and applications from class modules are mobilized in your writing and compositions.
• Quiz (5% of the grade): There will be one quiz—announced in advance—administered and taken in class. It will emerge from the modules and will cover the basics of XHTML and CSS. You can only take it once.
• Final presentation (5% of the grade): At the quarter’s end, you will present your individual project (see next bullet point) to the class or to a wider audience. (We’ll decide on the audience at the beginning of the quarter.) That presentation will be graded on how concisely you articulate your work, the clarity of your method, and the appropriateness of the presentation’s content for the context.
• Individual project (40% of the grade): Individual projects will consist of six stages (i.e., thought piece, needs assessment, work flow, data model, abstract, and final digital model and assessment). Aside from the final digital model and assessment, you will be able to revise each stage of the project based upon the criteria in the prompt, comments from and conferences with me, and feedback from your peers.

These five components of the class will each be graded on a 4.0 scale and then, for your final grade, averaged according to the percentages I provide above.
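To see how the averaging works, take a hypothetical (and entirely invented) set of grades: 3.5 for participation, 3.8 for blogging and the collaborative project, 4.0 for the quiz, 3.6 for the final presentation, and 3.2 for the individual project. The final grade would then be:

(3.5 × 0.30) + (3.8 × 0.20) + (4.0 × 0.05) + (3.6 × 0.05) + (3.2 × 0.40) = 1.05 + 0.76 + 0.20 + 0.18 + 1.28 = 3.47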
How does the individual project work?

“Now imagine that the forest is a huge information space and each of the trees and bushes are classification systems. . . . Your job is to describe this forest. You may write a basic manual of forestry, or paint a landscape, compose an opera, or improve the maps used throughout. What will your product look like? Who will use it?” – Geoffrey C. Bowker & Susan Leigh Star, in Sorting Things Out

To elaborate on your individual project, each stage will be graded, outcome by outcome, on the 4.0 scale. These stages will allow you to continuously revise what your project will look like and who will be its audience(s). At the end of the quarter, your individual project will be treated as a six-stage portfolio and will receive one grade on the 4.0 scale. To receive credit for the class, all six stages must be included in your final portfolio (which will be housed at mappingthedigitalhumanities.org). Here is how I will calculate the grade for your portfolio:

• Thought piece (10% of portfolio, can be revised once after it’s graded),
• Needs assessment (10% of portfolio, can be revised once after it’s graded),
• Work flow (10% of portfolio, can be revised once after it’s graded),
• Data model (15% of portfolio, can be revised once after it’s graded),
• Abstract (15% of portfolio, can be revised once after it’s graded), and
• Final prototype and assessment (40% of portfolio, cannot be revised after it’s graded).

Please note that I will probably revise the prompts as the class progresses. Needs and demands change. Such is life.

Here is a map, then, of what the course includes. It’s reductive, in a productive way.

• On the left are your collaborative mapping assignments (part of your participation grade).
• On the right are the assignments for the textual map (part of your individual project grade).
• In the middle (top) and middle (bottom) are the critical practices and technical competences you’ll be asked to acquire, respectively.
• In the middle of the map is the course goal.
• On the bottom (left) are the critical traditions and practices used to generate the curriculum. (“STS” stands for “Science and Technology Studies.”)

What are the course materials or textbooks?

There is no textbook for the class. The course material consists mostly of ten modules. These, too, are subject to change. The purpose of the modules is to work toward technical issues in the digital humanities through the lenses of history, culture, and literature. Generally speaking, a single module will take one class period (roughly two hours), with half of the class dedicated to lecture and conversation and the other half to technical application. What I ask of you, then, is to review each module prior to class (including the links provided), actively participate during class, and chat with me whenever questions or concerns arise.

Other than the modules, the bulk of out-of-class reading, studying, and research will be project specific. For your individual projects and with advice from your peers, I will work with you (in class, during conferences, and by appointment) to help you determine what texts, materials, and methods you might consider to produce a textual map by the quarter’s end. Occasionally, I will ask you to read a tad between classes in order to prepare for a module. Those readings will be provided in class, on the class blog, or via the class listserv.

Other than the readings and modules, you will occasionally need access to a digital camera, mobile phone, and/or camcorder. If you do not have any of these, then I suggest reserving a digital camera or camcorder from Classroom Support Services. (More at http://www.css.washington.edu/.) I will keep you posted on when would be a good time to make those reservations. I realize there are time restrictions.

Where’s the calendar?

First off, it’s subject to change and quite elastic. That said, I provide it via a Google calendar, which is available via the course website (mappingthedigitalhumanities.org).
During class, I will generally announce what we’ll be attending to in the next few classes. I often echo that in-class announcement with an email to the class listserv. In advance, thanks for your willingness to be flexible here. As an instructor, I find that flexibility pays off for both students and me.

What are the course policies?

>>> Participation

Since conversations are essential to the quality of this class, I expect that we shall work together to create an atmosphere of respect. College-level discourse does not shy away from sensitive issues, including questions of race, gender, class, sexuality, politics, art, and religion, and neither will we. There are going to be differences in opinions, beliefs, and interpretations when we question texts, technology, and cultural issues. You need not agree with the arguments in what we read or with what others—including me—have to say. In fact, it is important to think critically and question approaches. Still, you must do so intelligently and with respect. Respect for difference is instrumental to creating a classroom in which a variety of ideas can be exchanged and points of view can be explored. What is crucial to CHID 498 is that you are enjoying and are comfortable participating in the course. If for whatever reason you are not, then please talk with me. I understand that some people are more comfortable speaking in the classroom than others. That said, additional blogging, visits to class colloquia (see below), and individual meetings with me will also improve your participation grade.

>>> Conferences

During the quarter, you are required to individually meet with me three times to discuss your project. The conferences are really conversations: they are informal ways of checking in, saying hello, and talking face-to-face about particular aspects of your project (e.g., your thought piece, data model, and final presentation). Before each round of conferences, I’ll circulate a sign-up sheet.

>>> Attendance

While I do not take attendance, attending CHID 498 will greatly enhance your chances of submitting a persuasive final project, learning about the material, engaging in modules, collaborating with others, and sharing your ideas. Communication is key. If possible, then get in touch with me before you miss class, but most certainly after. I am not a detective. I will not hunt you down to tell you what you missed. Please rely on your peers and the course blog for that information. Thanks!

>>> Late Work

The best policy is to never turn anything in late. But things happen. The things to remember are:

• If you are falling behind, then just talk with me. We can make arrangements.
• Late work decreases your participation grade. The later the work, the greater the decrease.
• If you miss class when something’s due, then just submit it (e.g., via the blog) ASAP.
• Assignments that are not turned in (e.g., via the blog) by the beginning of class on the due date are considered late and decrease your participation grade. However, you still need to complete and submit late work, as your project portfolio must include all six stages of the process.

>>> Drops

Before a specific date, you can withdraw from courses without an entry being made on your transcript. After a specific date, fees ensue. See the University’s withdrawal policy for more information and those dates.
>>> Incompletes

I rarely consider giving a grade of “I” (for Incomplete). To receive an incomplete:

• A special request must be made to me,
• All of your work must be complete through the seventh week of the quarter,
• There must be a documented illness or extraordinary situation,
• A written contract, stipulating when course work will be completed, must be arrived at between you and me, and
• Failure to complete the course by the end of the following quarter (summer term excepted) will result in a failing grade of 0.0.

If, without explanation, you leave the class at any time during the quarter, an incomplete grade will not be considered. In such cases, I determine the grade based on the work you submitted.

>>> Plagiarism

Plagiarism, or academic dishonesty, is presenting someone else’s ideas or writing as your own. In your writing for this class, you are encouraged to refer to other people’s thoughts and writing—as long as you cite them. Many students do not have a clear understanding of what constitutes plagiarism. It includes:

• Failing to cite the source of an idea,
• Failing to cite sources of paraphrased material,
• Failing to cite sources of specific language and/or passages, and
• Submitting someone else’s work as your own.

If you have doubts about whether to cite or acknowledge another person’s writing, then just let me know. Better safe than sorry. I would rather not report an act of plagiarism to the College of Arts and Sciences for review. And think about it: Google, databases galore, and the fact that I was a student, too, make it really, really easy for me to spot plagiarized work. So don’t do it. For more information, refer to the UW’s Student Conduct Code.

I will update and revise these policies if the quarter so requires.

How can students find help with 498 and find other support on campus?

>>> Digital Humanities Colloquia, Office Hours, and Appointments

My spring quarter office hours are Wednesdays, 3–5 p.m., or by appointment (preferably on Mondays or Wednesdays), in Parnassus Café (in the basement of the Art building). For appointments, I cannot meet on a Thursday, since I will be teaching another course, at UW Bothell, on that day.

Additionally, during this quarter there will be at least three “digital humanities colloquia” related to the class. These colloquia will occur during my office hours and are open to everyone in the class, as well as to others who might be interested in what we’re talking about. Spread the word. For each colloquium, I invited UW graduate students and friends who are familiar with the course content to participate and offer insight. Currently, the possible topics for the three colloquia are:

• Contriving Rules: On Generative Constraints in Poetry
• Plots and Patterns: Mapping Practices in Detective Fiction
• Re-Mapping the University: The Race/Knowledge Project at the UW

The colloquia are optional and intended to be conversational in character. You are invited to come quibble, ask questions, chat, or just listen. The first ninety minutes of a given colloquium will be geared toward group conversation. The final thirty minutes will give you the chance to individually meet with me. If the class colloquia or my office hours are not amenable to your schedule, then please don’t hesitate to ask for an appointment. I’m around.
I may ask you to meet with me when I think a conference would be useful. I invite you to meet with me whenever you have questions, concerns, or suggestions. This quarter, there are also a number of talks relevant to the digital humanities occurring on campus and elsewhere. I’ll keep you posted. If you attend, then I’ll give you extra participation credit.

>>> E-mail and Class Listserv

You can e-mail me at jentery@u.washington.edu. I will generally respond to e-mail within twenty-four hours, unless I am out of town giving a talk or the like. The course listserv is chid498b_sp09@u.washington.edu. When you send an e-mail to it, everyone in the class will receive your message. Remember: if I send a message via the listserv (which I will do about once per week), reply to me (jentery@u.washington.edu) and not the listserv, unless you want everyone on the list to read your e-mail.

>>> Q Center

The University of Washington Q Center builds and facilitates queer (gay, lesbian, bisexual, two-spirit, trans, intersex, questioning, same-gender-loving, allies) academic and social community through education, advocacy, and support services to achieve a socially-just campus in which all people are valued. More at http://depts.washington.edu/qcenter/.

>>> Office of Minority Affairs and Diversity

The mission of the Office of Minority Affairs and Diversity is to ensure the access and academic success of a diverse student population through the advancement of knowledge, academic excellence, diversity, and the promotion of values, principles, and a climate that enriches the campus experience for all. More at http://depts.washington.edu/omad/.

>>> Center for Experiential Learning

The University of Washington’s Center for Experiential Learning (EXP) is home to seven programs (the Undergraduate Research Program, the Mary Gates Endowment for Students, the Carlson Center, Pipeline, Jumpstart, Global Opportunities Advising, and the Office of Merit Scholarships, Fellowships & Awards), each of which connects UW undergraduates to compelling and invigorating opportunities to expand and enrich their learning. More at http://exp.washington.edu/.

>>> The Counseling Center

The Counseling Center exists to support UW students in all aspects of their development. They provide personal counseling, career counseling, study skills assistance, and other services to currently enrolled UW students. The Counseling Center also provides consultation to faculty, staff, and parents who have concerns about a student. More at http://depts.washington.edu/counsels/.

>>> Writing Centers

You can find additional writing help at:

• The English Department Writing Center, located in B-12 Padelford Hall (http://depts.washington.edu/wcenter/)
• The CHID Writing Center, also in Padelford (http://depts.washington.edu/chid/wcenter/about.php)

If you make an appointment to see a writing center tutor, then you will receive extra participation credit.

>>> The DSO

Please let me know if you need accommodation of any sort. I can work with the UW Disability Service Office (DSO) to provide what you require. I am very willing to take suggestions specific to this class to meet your needs. The course syllabus, prompts, and modules are available in large print, as are other class materials.
>>> My Contact Information

Department of English
Box 354330
University of Washington
Seattle, WA 98195-4330
jentery at u.washington.edu
Office Hours: MW, 3–5, Parnassus Café

Thanks! And please let me know what questions or concerns you have! In the meantime, I’m looking forward to this quarter!

A thank you and nod of appreciation to:

• The Comparative History of Ideas Program
• The Simpson Center for the Humanities at the UW
• UW English
• UW Geography
• The Huckabay Teaching Fellowship program at the UW Graduate School
• Matthew W. Wilson
• Curtis Hisayasu
• Sarah Elwood
• Phillip Thurtle
• The Humanities, Art, Science, and Technology Advanced Collaboratory

Assignment 1: Thought Piece

Walter Benjamin, in “Theses on the Philosophy of History”: Thinking involves not only the flow of thoughts, but their arrest as well.

While it’s tempting to spend the balance of the quarter aggregating data and piling on media, I say we stop for a second and start building things. But! This one’s not the whole idea. It’s a thought piece. And it should consist of the following:

• As a field of study, what you think the digital humanities does,
• How you think its practitioners do what they do, and
• Initial and interesting ideas for at least one digital humanities project that you could develop this quarter. At least one. By “you,” I mean you in particular. Be selfish, people.

How you shape this information is up to you. You can essay, diagram, video, draw . . . The medium is not the matter. Pick what you prefer. However, you should figure this in: your medium will influence how you (and your audience) create and think through a message. (Consider “remediation” and “intermediation” from Module 4, as well as “syntagms” and “paradigms” from Module 2.) And remember: a thought piece is a riff. The point is to conjecture. Speculate. Toss out a rich idea or two or three, and later we’ll talk about making the whole thing happen.

Outcomes

Your thought piece should:

• Demonstrate a general understanding of how Modules 1 through 4 relate to the digital humanities as a field and a set of practices (e.g., apply some of the concepts from the modules, think through how to use new media for new forms of scholarship, or unpack the distinctions between print and digital texts).
• Give your audience (that is, your 498 peers and me) a sense of why your project(s) would be filed under “digital humanities” and what’s interesting—provocative, even—about your idea(s).

Before and during the process, consider:

• Reviewing the visualization/diagram of the class (in the syllabus). What’s familiar? What isn’t?
• Giving the class modules another gander. What appeals? What confounds?
• Looking back at some of your old work from other classes. What have you written on? Studied? What do you care about? What’s curious, and what could be developed?

Conversation Coming Soon

Your thought piece is due—on the class blog (embedded, via a link, or as text)—before class on Wednesday, April 15th. It will serve as a vehicle for conversation during your first conference with me. Which is to say: I’ll attend to it before we meet. That way, we don’t start cold. I swear. The thought piece will be graded on the 4.0 scale, and it can be revised once. It’s part of your individual project grade. If you have problems with the blog, then let me know.
Assignment 2: Needs Assessment

Ishmael, in Herman Melville’s Moby-Dick: God keep me from ever completing anything. . . . Oh, Time, Strength, Cash, and Patience!

Michel Eyquem de Montaigne, in “Of Cannibals”: I am afraid our eyes are bigger than our bellies and that we have more curiosity than capacity. We grasp at all, but catch nothing but wind.

You’ve made a thought piece. We’ve talked about it. Now it’s time to sketch out what—aside from time, strength, cash, and patience—is needed to put a thought in motion. Of course, a thought moving isn’t a thought complete. Keep that pithy line in mind as you respond to this prompt. Or, to contextualize: the goal for the quarter isn’t to finish a research project; it’s to build one worth developing in the future. Recall Shelley Jackson, from Module 1: “there can be no final unpacking.” Determine, then, what you can grasp—what’s feasible—between now and June-ish. How practical, especially for humanists.

Let’s give such practicality a name: “needs assessment.” However! Your “needs” here won’t simply be downloaded for regurgitation later. You’ll have to come up with them on your own, with some guidelines. As with the first prompt, the medium is yours. But please respond to the following:

• What do you want from your emerging project? Or, what is your objective, and what’s motivating it?
• What do you need (e.g., knowledge, experience, materials, and practice) to pull everything off? Or, to return to Moretti and Module 5 for a sec: for now, what knowledge are you taking for granted?
• Where are you going for evidence or data? That is, what texts will you be working with?

Outcomes

Your needs assessment should be:

• Specific, pointing to the particular knowledge you need and want (e.g., XHTML, GIS, literature review, and media theory/history) and what materials you should have (e.g., software, time, and books).
• More refined and focused than your thought piece. (If the thought piece was about broad possibilities, then your needs assessment is about concrete ones.)
• A way of responding to your first conference with me. (Reference our conversation and expound upon it.)
• Aware that its audience consists of your peers and me. (Feel free to use names or speak to particular bits from class.)

Before and during the process, consider:

• What is realistic for a quarter?
• How do you avoid reinventing the wheel? What did you learn from another course or project that could be developed and re/intermediated?
• When the spring’s finished, what kind of project will be most useful for you? Think before and beyond now.

Critiques Soonish

Your needs assessment is due—on the class blog (embedded, via a link, or as text)—before class on Monday, April 20th. You will share it during in-class critiques. During those critiques, you’ll also respond to your peers’ assessments.
The needs assessment will be graded on the 4.0 scale, and it can be revised once. It’s part of your individual project grade. If you still have problems with the blog, then let’s talk. I might need to revise or address something.

Assignment 3: Work Flow

Carl von Clausewitz, in On War: Everything in strategy is very simple, but that does not mean that everything is very easy.

Fair enough, Carl, but that doesn’t mean we can’t at least try to make things a tad easier, right? Despite the fact that plans and thoughts and needs and life are all subject to change, sketching out an agenda, through some simple elements, is rarely a bad idea. The key is—to borrow from Chris Kelty—“planning in the ability to plan out; an effort to continuously secure the ability to deal with surprise and unexpected outcomes” (12). So how about what we’ll call a “workflow”? Again, the medium is yours, but please transmit the following:

• What is your research question? (Try one that starts with “how.”)
• What are the data elements for your project? (We have already discussed these in class; and, if all’s on par, then you should have already drafted them.)
• How are you animating these elements (e.g., through what medium—for example, a motion chart, a geomap, or a timeline—are you shaping information)?
• What do you expect to emerge from this animation (e.g., what will information look like, how will the audience interpret it, or what might you learn from it)?
• Ultimately, what are you going to do with it (e.g., how will it influence your current work, how might you use it in other classes, how will it persuade audiences, or how will it change the ways in which you perceive the text(s) you’re working with)?

Outcomes

Your workflow should:

• Be driven by a concrete and provocative research question, which emerges from your responses to Prompts 1 and 2.
• Be very specific about the data elements you are using. Name them. List them out.
• Be very specific about the kind of animation you are using, including some knowledge of how that animation allows you and your audience to produce knowledge—or how that animation is a “swervy thing.”
• Demonstrate that you are aware of why you are using the data elements and animation you’re using and what might be the implications of your decision (e.g., what are the benefits and deficits, or the hot ideas worth some risk and the not-so-hot possibilities that are deterring you).
• Be aware that its audience consists of your peers and me.

Before and during the process, consider:

• How your digital project—through computational animation—demands a different mode of thought than, say, writing a paper. How might you take advantage of this difference? What does it afford?
• What options you have for animation, what you are most comfortable with, and—again, again, again—what seems feasible for a quarter.
• In your previous work, what terms or concepts pop up most often, which ones interest you the most, which ones you’d rather do without, and how those terms would translate in a computational approach.

Toward Making Animation Matter

Your workflow is due—on the class blog (embedded, via a link, or as text)—before class on Monday, April 27th. In class, we’ll get theoretical and address the “stakes” of your animation and data elements, or how you can make them matter and for whom. The workflow will be graded on the 4.0 scale, and it can be revised once. It’s part of your individual project grade. Keep me posted with questions and quibbles.

Assignment 4: Data Model

From Dyeth, in Samuel R. Delany’s Stars in My Pocket Like Grains of Sand: Someone once pointed out to me that there are two kinds of memory (I don’t mean short- and long-term, either): recognition memory and reconstruction memory. The second is what artists train; and most of us live off the first—though even if we’re not artists we have enough of the second to get us through the normal run of imaginings.

A constant challenge in academic work, then, is to model something that reshapes the material with which you and others are already familiar—to re-construct and re-imagine history, culture, texts, territories, and places through new paradigms, without simply recognizing them as what you already know, using the same blueprints, strategies, and maps as before. To produce a contrivance. To project a world and animate it. To swerve. I’m not saying it’s easy. It’s not. But give it a whirl.

You’ve thought about your project (in your Thought Piece), assessed its possibilities (in your Needs Assessment), made it elemental (in your Work Flow), and speculated on what might happen come June (during in-class workshops). Now’s the time to give people the classification system for your information collecting and some results—that is, your data model and some data.

This time around, the medium isn’t yours. Sorry. Please complete the data model worksheet. However, when you provide your data, you can choose the medium. For instance, feel free to use a spreadsheet, provide copies of a log, or complete the table I provide at the end of the worksheet. (For a sense of what a few rows of well-structured data might look like, see the sketch at the end of this prompt.)

Outcomes

Your data model should be:

• Extremely specific, providing your audience with exact details for each of your data elements, following the form provided, and leaving no necessary field blank.
• A cogent means of giving a reader who is not familiar with your project a sense of how you are collecting and organizing your data.
Your elaboration on your data model should be:

• A mobilization of terms and concepts from class (e.g., classification, paradigms, re/intermediation, collecting, affordance, intent, procedures, bias, discourse, animation, and distant reading), putting them to work in the context of your project.
• Concrete and situated in your project. Abstract language should be avoided. Responses to each question should be based on examples from and exact instances in your project.
• Aware of the limits and benefits of the decisions you are making and how those decisions will affect your target audience and your own learning. Remember: you can’t do everything, but you should be able to account for how you are mapping your project.

Your data should be:

• Well-organized and specific, based upon the framework outlined in your data model.
• Sufficient enough to—at this juncture in your project—allow you to make some preliminary findings based upon your research. (However, the data does not need to be complete. You might still be in the process of collecting more. In the worksheet, I require three rows of data. I recommend collecting much more, if possible. For some projects, twenty to forty rows will be necessary.)

Before and during the process, consider:

• What you expect to emerge from your animation at the quarter’s end. How do those expectations resonate with your data model?
• Returning to what you churned out in response to Prompts 1 through 3. What’s your trajectory, collector?
• How, broadly speaking, this approach to humanities work relates to your previous coursework and experiences, and to what effects.
• Revisiting the modules and contacting me and/or your peers with any questions you have about the terms and concepts used.

Another Review Coming Soon

Your data model worksheet is due—on the class blog (embedded, via a link, or as text)—before class on Monday, May 11th. During that class, your worksheet will be peer reviewed, and I will grade your worksheet based on that peer review. The data model will be graded on the 4.0 scale, and it can be revised once. It’s part of your individual project grade. Hope all’s coming along well. As always, let me know about your concerns.
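As promised in the prompt, here is a sketch of what three rows of well-structured data might look like, marked up as a simple XHTML table. Everything in it is hypothetical: the imagined project (tracking place-names in a novel), the column names, and the values. Your own elements will come from your worksheet.

<table>
  <!-- each column is a data element from the (hypothetical) data model -->
  <tr><th>Place-name</th><th>Chapter</th><th>Who invokes it</th><th>Connotation</th></tr>
  <tr><td>Whitechapel</td><td>3</td><td>Narrator</td><td>Threatening</td></tr>
  <tr><td>The Strand</td><td>5</td><td>Protagonist</td><td>Nostalgic</td></tr>
  <tr><td>Limehouse</td><td>7</td><td>Narrator</td><td>Exoticized</td></tr>
</table>

Notice that each column works like a paradigm (a set you select from) and each row like a syntagm; the same distinction from Module 2 is at work in your data model.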
Assignment 5: Abstract

From Hervé Le Tellier’s “All Our Thoughts”: I think the exact shade of your eyes is No. 574 in the Pantone color scale.

Ah . . . the abstract: the oh so academic act of summarizing work that’s often still in progress. Your project’s not finished, you’re still not sure if everything coheres, and the thing’s so deep you can’t dare reduce it to a single paragraph. I know this. I don’t particularly enjoy writing abstracts, either. But abstracts are necessary beasts. Aside from giving your readers a quick snapshot of your research, they also force you to articulate—in a precise fashion and in exact numbers—what, exactly, you are up to. To the details, then.

Your abstract should include:

• The aim of your project and its motivation/purpose,
• Your research question (although it does not need to be articulated as a question),
• Your method (how you did what you did),
• Your results (what you learned),
• The implications of your results (or why your research matters), and
• The trajectory of your project (what you plan to do with it in the future).

This one should be in words. Despite Blake’s abstract of humans, we’re going with the industry standard here.

Outcomes

Your abstract should:

• Be no more than three hundred words.
• Be one concise and exact paragraph.
• Include a title for your project, three keywords for it, and a one-sentence tagline describing it. (The keywords and tagline are not part of the three-hundred-word limit.)
• Be written for educated, non-expert audiences (e.g., academic types who might not be familiar with the digital humanities) and avoid jargon.
• Summarize your work as it stands, instead of becoming an idea hike into unventured regions (that is, avoid speculations).
• Mobilize terms and concepts from the class, again, for educated, non-expert audiences.
• Demonstrate, through clear language, how your project’s motivation, question, method, results, and trajectory are related.
• Follow the form below.

Before and during the process, consider:

• How your data model is one way of thinking through your method.
• Returning to your response to Prompt 3, which asked you for your research question, and to Prompt 2, which asked you what you want from your project.
• Module 7 (on making your project matter) and how it speaks to your project’s motivation and the implications of your results.
• How to write for people who would have absolutely no clue what, exactly, the digital humanities is.
• How terms common in the course thus far (e.g., paradigm, syntagm, model, distant reading, remediation, and intermediation) might be helpful when articulating your project.
• When terms should be defined.

Contextualizing the Thing

Your abstract is due—on the class blog (attached as a Word document)—before class on Wednesday, May 20th. On May 27th, we’ll consider how to integrate your abstract into the presentation of your project. An abstract is nothing without what it’s abstracting. The abstract will be graded on the 4.0 scale, and it can be revised once. It’s part of your individual project grade. If you need help condensing, then let me know.

Form for the Abstract

Project Title
Your Name, Your Major
Tagline
Three keywords
Body of abstract (300 words, one paragraph)

Examples

View some sample abstracts at http://www.sccur.uci.edu/sampleabstracts.html (which do not necessarily follow the format and outcomes for this prompt, but are nevertheless good references).
Assignment 6: Project Assessment

From DJ Spooky’s Rhythm Science: As George Santayana said so long ago, “Those who cannot remember the past are condemned to repeat it.” That’s one scenario. But what happens when the memories filter through the machines we use to process culture and become software—a constantly updated, always turbulent terrain more powerful than the machine through which it runs? Memory, damnation, and repetition: That was then, this is now. We have machines to repeat history for us. . . . The circuitry of the machines is the constant in this picture; the software is the embodiment of infinite adaptability, an architecture of frozen music unthawed.

Reflection, reflection, reflection. Instructors often like the word. I’m not sure it fits here, though. The purpose of this project assessment isn’t for you to ruminate on whether you’re good enough or smart enough. We know you are, and people like you. It’s for you to articulate what—over the course of the quarter—ultimately emerged from your project and what you think of it. The thing began as an idea. You then converted it into an agenda, with a model, compiling pieces of data, and ultimately animating those pieces. That said, I hope you collected something you’re happy with. The project goal was for you to think through “generative constraints” as strict as computation and data models to produce provocative questions, new knowledge, and reconfigurations of literature, culture, and history. After all, the hardware of history needn’t determine its interpretation, and the wiring of culture is never neutral. Infinite adaptability.

With that adaptability in mind, please unpack this list, without, of course, the brazen assumption that your unpacking is final. The quarter just so happens to be over. (And I’m really sad about that.)

• How—for better and for worse—does your animation project differ from an academic paper (especially one intended for print)? What does it ask of audiences and to what effects?
• How does your project produce new knowledge and about what?
• Considering the brevity of a quarter, how was your project a success? What did you learn from it? What will others?
• How could you improve your project? What do you want to continue learning from it?
• How, if at all, do you plan on developing (or using) your project in the future? Do you plan to circulate it to others or make it public? Why or why not?

Unless you are going for writing credit, I’ve decided to let you choose the medium or media here. You can make—or blend together—video, a website, audio, word docs, or what-have-you. Be creative. Just do me two favors:

1. With your assessment, include three outcomes upon which I should assess your project and your assessment of it. Those outcomes should include references to your method for collecting data, your awareness of your own bias/intent/procedures, your project’s design, and how your project produces knowledge (instead of just re-presenting known information).
2. Provide me with your final animation project. Upload it to the blog, provide a link, or the like. (See more below.)

Outcomes

By focusing on your project as a process, your project assessment should:

• Be composed for educated, non-expert audiences (e.g., academic types who might not be familiar with the digital humanities).
• Demonstrate your understanding of the digital humanities as a field, using material from the class when appropriate.
• Reference specific aspects of your project and draw upon it for evidence.
• Exhibit critical approaches to your own project (e.g., show that you know how you did what you did, what worked, and how you could have done things differently).
• If applicable, include a works cited page of texts quoted, paraphrased, or the like.

Before and during the process, consider:

• Returning to your responses to all prompts. How has your project—and your framing of it—changed since then?
• Returning to the course syllabus and assessing what you’ve learned in the class since day one of the quarter.
• Returning to the user’s guide for CHID 498.
• Circulating a draft assessment to me and your peers. (Use the blog!)
• How to write for people who would have absolutely no clue what, exactly, the digital humanities is.
• Doing something that will keep you interested. It’s finals week, in spring, just before summer, y’all.

This One Will Not Be Revised

Your project assessment and final portfolio are due—on the class blog (filed under your name)—by the end of the day, Wednesday, June 10th. Here’s what (ideally) should be uploaded to your author page on the blog:

• Mapping 1,
• Thought Piece (first draft and revision, if applicable),
• Needs Assessment (first draft and revision, if applicable),
• Work Flow (first draft and revision, if applicable),
• Mapping 2,
• Data Model (first draft and revision, if applicable),
• Abstract (first draft and revision, if applicable),
• Animation (all versions, including the one presented on June 3rd),
• Project Assessment, and
• Anything else you think is relevant.

As a reminder, here’s how your work in 498 will be graded:

• Class participation (30% of the grade)
• Blogging and collaborative mapping (20% of the grade)
• HTML quiz (5% of the grade)
• Final exhibition (5% of the grade)
• Individual project (40% of the grade)

These five components of the class will each be graded on a 4.0 scale and then, for your final grade, averaged according to the percentages I provide above. And here’s how the portfolio is graded:

• Thought piece (10% of portfolio, can be revised once after it’s graded),
• Needs assessment (10% of portfolio, can be revised once after it’s graded),
• Work flow (10% of portfolio, can be revised once after it’s graded),
• Data model (15% of portfolio, can be revised once after it’s graded),
• Abstract (15% of portfolio, can be revised once after it’s graded), and
• Final prototype and assessment (40% of portfolio, cannot be revised after it’s graded).

See me with questions! Have a rad summer break, people. It’s been a pleasure, and—to reiterate—make this last bit interesting. After all, CHID 498 was, from the get-go, an experiment.

Module 1

From Shelley Jackson’s my body—a Wunderkammer (http://collection.eliterature.org/1/works/jackson__my_body_a_wunderkammer.html): I have found every drawer to be both bottomless and intricately connected to every other drawer, such that there can be no final unpacking. But you don’t approach a cabinet of wonders with an inventory in hand. You open drawers at random. You smudge the glass jar in which the two-headed piglet sleeps. You filch one of Tom Thumb’s calling cards. You read page two of a letter; one and three are missing, and you leave off in the middle of a sentence.

Learning Outcomes for the Module

• Make the distinction between information and knowledge and articulate how a given medium will influence that distinction.
• Historicize contemporary trends in the digital humanities through the Wunderkammer and consider some conceptual relations between the two.
• Through a hands-on example, unpack the differences between “top-down” approaches to media and emergent media.

About the Wunderkammer (or Cabinet of Curiosity, or Wonder-Room)

• A European phenomenon, beginning in the mid-16th century.
• First mention: Vienna, in 1553.
• Natural science before the 18th century, prior to the modern notion of science as a system of ordering and separating objects.
• An inhabitable, miniature world that allows people to engage the world as a macrocosm.
• Included natural objects (preserved animals, skeletons), man-made artifacts (works of art, scientific instruments), and myths (the Scythian Lamb, the debunking of the Unicorn).
• Example: Museum Wormianum (1655) by Ole Worm (University of Copenhagen).

Related to the digital humanities, the Wunderkammer suggests that knowledge:

• Does not exist in objects themselves, but rather in relationships, which are often contrivances (that is, they don’t have their intended effects).
• Can emerge from random (rather than strictly ordered) and situational (rather than universal) relationships between objects, subjects, and places, where objects are ripped from their original contexts (e.g., place of invention) and re-contextualized (e.g., in a museum) in novel juxtapositions.
• Is a negotiation between a macrocosm (e.g., the world) and a microcosm (e.g., the Wunderkammer).
• Implies both what is perceivable (the found object) and what is possible (the uncharted territory, the surprise, the unexplored).
• Is not always top-down (the application of a universal concept in the particular instance), but also emergent (what comes about, what is the potential of a given relationship, what is the exception to the rule).
• Consists of abstract reason (“the mind”) without opposition to sensation (touch) and matter (the body).
• Cannot be simply downloaded and acquired (The Matrix) as information. (Sorry! It’s just true!)

Anna Munster, from Materializing New Media, on the Wunderkammer: “knowing about an object required a knowledge that involved getting to know: a familiarity with its location, the stories one could elicit from and about it, and its own association with a wide range of other objects in the world.” (76)

What Now?: Applications

• Visit Day Life (http://www.daylife.com/) and search for “Seattle.”
• Visit Doodlebuzz (http://www.doodlebuzz.com/), search for “Seattle,” draw some link branches, and occasionally press the spacebar to view your map.
• What did you learn, and how did you learn it differently from these two interfaces?
• How did your perception of how you controlled and navigated information change?
• What does something like Doodlebuzz afford? And how does it differ from more common websites?

What’s Next?: Modules Ahead

• Thinking in Association Blocks, Collecting Idea Pockets

What to Consider during Future Modules

• How does composing digitally affect our perceptions of physical (e.g., print) objects?
• How can something “digital” also be “material”?
• How can something as strict as binary code, computation, or organization enable curiosity?
• How can we do more than simply “use” digital technologies and media for information?
Module 2: Thinking in Association Blocks

William S. Burroughs, in The Third Mind: The scrapbooks and time travel are exercises to expand consciousness, to teach me to think in association blocks rather than words.

Learning Outcomes for the Module

• Distinguish between “paradigm” and “syntagm” and articulate their roles in reading print and new media.
• Practice some basics in XHTML and CSS.

Gertrude Stein (1874–1946) wrote in boxes. But, for the purposes of this module, what are boxes?

• Words (nested in even more words, including connotations and denotations)
• Containers (to be unpacked)
• Means of concealing something else
• Vectors (or transmission devices)
• Poems

Consider “A BOX,” from Tender Buttons: “Out of kindness comes redness and out of rudeness comes rapid same question, out of an eye comes research, out of selection comes painful cattle. So then the order is that a white way of being round is something suggesting a pin and is it disappointing, it is not, it is so rudimentary to be analysed and see a fine substance strangely, it is so earnest to have a green point not to red but to point again.”

Note how repetition (or what Stein calls “insistence”) adds texture to language. This texture stresses how words always refer to something else. They are ways of mapping the world—of referencing that with this. Words are association blocks. Not, of course, that they capture everything. Think back to Shelley Jackson: “there can be no final unpacking.” Importantly, readings of “A BOX” change depending upon its medium, its context, and how we understand the terms “paradigm” and “syntagm” in the digital humanities.

Borrowing from Ferdinand de Saussure on natural languages:

• The syntagm is a series of words or concepts strung together in a line. In print, these words appear in horizontal lines and are explicit (e.g., “A BOX” contains the phrase “out of kindness comes”).
• The paradigm is the set of elements (e.g., nouns) from which a given word is selected. In print, these words are implicit and inferred (e.g., associating “box” with “word” or “cattle” with “animals”).

As Lev Manovich points out, new media reverse this relationship of explicit syntagm and implicit paradigm, with the “horizontal” syntagm emerging from a structured (or encoded) “vertical” paradigm. For example, most websites are not read exactly like a book, left to right, from page to page. Instead, the syntagm emerges from how the reader selects from the structured choices provided. This structure influences interpretation. Relating language with mapping here, the syntagm is comparable to a territory, and the paradigm is what’s ostensibly included within that territory. (A ha! Now we can see why language and mapping are so important to scholars who, for example, study colonialism! Both are ways of enabling what—and who!—is included/excluded!)

Implications on writing in code (e.g., in XHTML and CSS) and the digital humanities
When opened, a "box" (or an element) must always be closed (e.g., if <div>, then </div>).
• Encoding a "box" of code is not simply a technical matter. It has social dynamics, including influencing how people make sense of culture and texts. Sometimes the technical and the social are at odds.
• Still, the shift from syntagm-focused print text to paradigm-oriented digital text need not imply "dumbing" down a text (e.g., all links go to denotations in the dictionary) or determining how a reader interprets it. (See Module 1 on curious relationships. Here, it might be productive to think of Stein's style through "boundary" or "hybrid" objects, like the Scythian Lamb. Often, the words in her poetry fit, quite purposefully, in multiple categories simultaneously. Her writing's wonderfully monstrous.)

What Now?: Applications
• Select a specific paradigm for reading "A BOX"—a rule for reading, if you will. This will be your generative constraint for encoding your interpretation into the poem. Let's look at William Gass's etymology cluster of the poem for an example.
• Now let's review an example of a page written in XHTML. Note how the text is written in nested "boxes." Again, the boxes, when opened, must be closed.
• And let's review an example page in CSS. Note how CSS stylizes the boxes written in XHTML.
• In Notepad, practice encoding "A BOX" in XHTML and CSS (in two separate files) for the web. In the XHTML, include, at a minimum, the <html>, <head>, <title>, and <body> tags and elements. In your CSS, stylize the XHTML <body> and at least one other element. (For a starting point, see the sketch at the end of this module.)
• After your encoding, in the XHTML file, please write a sentence or two explaining what your generative constraint for encoding was.
• How did encoding the text influence your interpretation of it? How did that interpretation manifest in the encoding? How would your encoding influence how a reader interprets the poem?

What's Next?: Modules Ahead
• Collecting Idea Pockets, Do You Believe in Angels?

What to Consider during Future Modules
• For a module in the near future, you'll start thinking about refashioning a print-based project you've already started. How might paradigms and syntagms play a role in this refashioning?
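A minimal sketch of the two files, for orientation only: the markup below encodes an abridged line of "A BOX" under one hypothetical generative constraint (tagging color words). The file names (box.html, box.css) and the class name are placeholders for your own choices, not requirements.

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
      <title>A BOX, encoded</title>
      <!-- The stylesheet lives in a second file, linked here. -->
      <link rel="stylesheet" type="text/css" href="box.css" />
    </head>
    <body>
      <h1>A BOX</h1>
      <!-- Every box that opens must close; note the nesting. -->
      <p>Out of kindness comes <em class="color">redness</em> and out of
      rudeness comes rapid same question . . .</p>
    </body>
    </html>

And box.css, which stylizes the boxes written in the XHTML:

    /* Stylize the body and one element nested within it. */
    body { font-family: serif; margin: 2em; }
    em.color { color: red; font-style: normal; }

Notice that the constraint does interpretive work: marking "redness" as a color word (rather than, say, an emotion word) is a reading, not a neutral technicality.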
Stillman, in Paul Auster's City of Glass: My brilliant stroke has been to confine myself to physical things, to the immediate and tangible. My motives are lofty, but my work now takes place in the realm of the everyday. That's why I'm so often misunderstood. But no matter. I've learned to shrug these things off. . . . You see, I am in the process of inventing a new language.

Learning Outcomes for the Module
• Understand how new media can be integrated into collecting information for, and collaborating in, digital humanities research projects.
• Practice some basics in WordPress and Google Books, Maps, & Reader.

The Paris arcades (iron and glass structures popular in the 1820s and 1830s) are, according to Walter Benjamin (1892–1940) in The Arcades Project:
• "a center of commerce in luxury items" (3)
• "a world in miniature" (Illustrated Guide to Paris, qtd. in the text, 3)
• "buildings that serve transitory purposes" (4)

The collector and collecting play prominent roles in the arcades. Benjamin on collecting:
• "What is decisive in collecting is that the object is detached from all its original functions in order to enter into the closest conceivable relation to things of the same kind" (204).
• "Collecting is a form of practical memory, and of all the profane manifestations of 'nearness' it is the most binding" (205).
• "The true method of making things present is to represent them in our space (not to represent ourselves in their space)" (206).
• "The collector dreams his way not only into a distant or bygone world but also into a better one—one in which, to be sure, human beings are not better provided with what they need than in the everyday world, but in which things are freed from the drudgery of being useful" (9).

For The Arcades Project, Benjamin's method is collecting: snippets of writing put into juxtaposition, pockets of ideas that are contrived. (See Module 1 on contrivances, hybrid objects, and practicality, as well as Module 2 on association blocks and paradigms.) This method corresponds with the form of Benjamin's book (see the hard copy), not to mention his research practices. In a way, Benjamin gave theory a new language, with his dictionary of collections.

Implications for blogging and digital humanities research projects in this class
• Research as Wunderkammer-making (see Module 1)
• Relevance of the everyday to academic research and new media
• The habit of documenting work (archive it now, arrange it later, delete nothing)
• Articulating thoughts through paradigms first, then organizing the syntagms (e.g., compiling things before making a claim ("X causes Y"), rather than making a claim and finding the evidence to "fill it in" or support it) (see Module 2 on paradigms and syntagms)
• Embracing a type of experimentation in your academic work—as you collect, being open to change, flexibility, and failure and avoiding the "theory hammer," where everything in sight becomes a nail
• Class blog: Collaborative collection of microcontent in a networked space, which offers juxtapositions across our individual collections
• Conjecturing (per Willard McCarty in Humanities Computing): "a collecting or throwing together of particulars" in an attempt to make sense of them (47)

What Now?: Applications
• Log in to the class blog. (I'll give you your username and password.)
• Post your first entry, categorized under "introductions" and tagged as you find appropriate. Before you publish it:
  o Introduce yourself to the class in whatever way you wish.
  o Provide a link to your XHTML and CSS exercise (which should be at students.washington.edu/[yourUWnetID]/chid498/).
  o Include an image of the book or text you encoded in your exercise. (If you can't find one, then tell me. We'll think of something relevant.) Of note, all images on the blog must be 400 pixels or less in width. You can always use a program to shrink them accordingly.
• When you are finished, I will also show you how to post a video. Of note, all videos on the blog must be 200 pixels or less in width.
• Now log in to the class Google account ("mappingthedigitalhumanities"):
  o Note how a majority of our online class content is aggregated at iGoogle. Peruse it to see what's there.
  o In Google Books, add a book that you'll likely be using this quarter or that you think is relevant to the class.
  o In Google Maps, add something (e.g., a comment, an image, or a video) to the class map. We'll also have to decide by what standards we'll be collaborating to map the campus this quarter.
  o Time permitting, in Google Reader, add a relevant snippet from the web.
• How is each of these a form of collecting? Of research? Of everyday life?
• How is each of these a form of collaboration? And what kind of collaboration, exactly? Consider other ways you've collaborated that might differ from what we're doing here.
What's Next?: Modules Ahead
• Do You Believe in Angels?, Oh How Reductive

What to Consider during Future Modules
• When you have so much to collect for a given research project, then how do you refine your options? Data, but how to gather it?

From Steve Tomasula's The Book of Portraiture: The unexamined life is not worth living —Socrates, and www.homecams.com, the site that lets you see inside 1,024 private homes….

Learning Outcomes for the Module
• Explore media differences between print and digital texts and the implications of these differences on remediation and intermediation projects.
• Examine the distinctions between "remediation" and "intermediation" through some examples.

Let's take a look at an animation of the first newsreel from John Dos Passos's The 42nd Parallel in tandem with a digitized version of it and its print version. Now, let's unpack the relations between these three "versions" of the text through two terms: remediation and intermediation.

Per Jay David Bolter and Richard Grusin, remediation is
• "the representation of one medium in another" (45)
• nearly synonymous with "'repurposing:' to take a 'property' from one medium and reuse it in another" (45)

Per N. Katherine Hayles, intermediation
• is the "complex transactions between bodies and texts as well as between different forms of media" (7)
• includes "interactions between systems of representations, particularly language and code, as well as interactions between modes of representation, particularly analog and digital" (33)
• "denotes mediating interfaces connecting humans with the intelligent machines that are our collaborators in making, storing, and transmitting informational processes and objects" (33)

How do the two terms offer different readings of our three versions of Dos Passos? Consider what they emphasize (e.g., "medium," "representation," "bodies," and "collaborators"). To help us along, we might consider what Hayles, in a different text, says are the characteristics of computer-mediated text.
It:
• is "layered" (e.g., layer of text on a screen and code layer) (163)
• "tends to be multimodal" (e.g., including "text, images, video, and sound") (164)
• exists such that "storage is separate from performance" (e.g., store files on a server in Seattle, read them in Santiago) (164)
• "manifests fractured temporality" (e.g., reader does not control "how quickly the text becomes readable") (164)

Implications for Your Digital Humanities Project
When thinking of "remediating" or "intermediating" print, the characteristics of computer-mediated text should factor into what remediation or intermediation will afford—how either invites or pressures certain readings and engagements. (See Module 1 on curious relationships and Module 3 on the class blog as a collection.)

What Now?: Applications
• Check out Marsha's Throne Angels! As a parody of old school, low-tech personal web pages, what media is it remediating? How does it achieve humor in this remediation?
• In the above line, what happens to our interpretations when we revise "remediating" and "remediation" to "intermediating" and "intermediation"?

What's Next?: Modules Ahead
• Oh How Reductive, Making Swervy Things

What to Consider during Future Modules
• How might these angels, not to mention these distinctions between intermediation and remediation, inform your project? Which of the two terms do you prefer? Why?

From Marianne Moore's "The Student": "When will your experiment be finished?" "Science is never finished." And from her "People's Surroundings": there is something attractive about a mind that moves in a straight line—

Learning Outcomes for the Module
• Explore the implications of "reduction" and classification in digital humanities research.
• Consider ways you might use specific data elements to methodically reduce the primary text(s) in your research project.

Franco Moretti is a cartographer of sorts. He makes literary maps, with a science. In Graphs, Maps, Trees, he writes: "What do literary maps do . . . First, they are a good way to prepare a text for analysis. You choose a unit—walks, lawsuits, luxury goods, whatever—find its occurrences, place them in space . . . or in other words: you reduce the text to a few elements, and abstract them from the narrative flow, and construct a new, artificial object . . .
And with a little luck, these maps will be more than the sum of their parts: they will possess 'emerging' qualities, which were not visible at the lower level" (53). Literary maps also afford what Moretti calls a "distant reading," "where distance is however not an obstacle, but a specific form of knowledge: fewer elements, hence a sharper sense of their overall interconnection. Shapes, relations, structures. Forms. Models" (1). To flesh out "distant reading," let's look at a couple of examples (1 and 2) from Moretti's Atlas of the European Novel: 1800–1900. What's mapped? What's not?

One trick: How to avoid assuming that a distant reading fully accounts for its territory. Alfred North Whitehead called this slippage "misplaced concreteness." Abstractions such as maps—in their richness and utility—are used to explain the territory. They come to be treated as objectifying media that always generate reliable results (e.g., facts from maps) or uniform products (e.g., the same houses from a single blueprint). As Matthew Fuller observes: "The ruse of concrete misplacedness, of an ideally isolatable element, produces its offspring—but they are unruly" (104). Frankenstein's creature animates this very unruliness (e.g., the uncontrollable monster of science), as does Stein's poetry (e.g., "a rose is a rose is a rose," where the definition of a rose is historically and culturally dependent). (See Module 2.) So does the image of Astaire's unruly movement; he looks positioned in the still shot, but photography needn't give us the illusion that this event is isolatable and easily repeated. (I certainly couldn't pull it off.) Consider, too, syntagms from Module 2. This shot of Astaire is in a sequence of shots. What comes before and after is crucial.

Abstraction here is not what Ezra Pound means when he writes (in Poetry, 1913), "Go in fear of abstractions." Pound's on a different register. For him, the idea is to avoid writing in imprecise language what someone else already wrote precisely. Treat the thing directly. Use the exact word. For Whitehead and Moretti, abstractions are quite useful for collecting elements and showing their relations. When they are understood as the causes that produce homogeneous territories, then misplaced concreteness occurs. (Consider, too, Nietzsche on how the cause is generated after the effect.)
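Before turning to implications, here is what Moretti's "reduction" can look like in practice: a minimal, hypothetical JavaScript sketch. The sample text and unit list are stand-ins; your own data elements go in their place.

    // A sketch of "reduction": choose a unit, count its occurrences,
    // and abstract the counts from the narrative flow.
    var text = "Out of kindness comes redness and out of rudeness comes rapid same question.";
    var units = ["kindness", "rudeness", "redness"];  // your chosen data elements
    var counts = {};
    for (var i = 0; i < units.length; i += 1) {
      var matches = text.match(new RegExp(units[i], "g"));
      counts[units[i]] = matches ? matches.length : 0;
    }
    // counts is now the "new, artificial object": fewer elements,
    // abstracted from the narrative flow and ready to be placed in space.

The point is not the code but the operation: the same loop could count places in a novel or dates across a corpus, and the resulting tally, not the full text, is what gets mapped.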
Implications of the reductive method for your digital humanities project:
• Textual/Literary maps are not only geographical maps. Think broadly about how to map the space of your text(s) (e.g., places in a novel, recurrence of concepts in a poem, publication dates in a genre/corpus).
• Novel questions, complex issues, and creativity can emerge from reduction and classification. (Consider Oulipo!) In fact, reduction and classification can help generate interpretations you may have never considered. Moretti writes, "I had found a problem for which I had absolutely no solution. And problems without a solution are exactly what we need . . . we are used to asking only those questions for which we already have an answer" (26).
• Reduction is a practical way of narrowing rich research projects, of keeping them simple. It forces you to not only isolate elements of the text, but to also articulate how you isolated them and how you are assessing/quantifying them.
• Distant reading runs contrary (in some ways) to "close reading" in the humanities. Keep this in mind. How will some audiences object to the distant reading you're conducting?

What Now?: Applications
• In your clusters, work together so that each student selects three data elements that reduce the primary text(s) of her/his project. These elements would ostensibly lead to a textual mapping.
• On the blog, list your three elements and address three things about each: (1) what kind of interpretation would it afford? (2) what of importance might it ignore?
(3) how does it relate to—or join—the other two elements?

What's Next?: Modules Ahead
• Making Swervy Things, Mapping in Stakes

What to Consider during Future Modules
• How does the kind of map you ultimately produce influence your choice of data elements and vice versa?

From Donna Haraway's Modest_Witness@Second_Millennium.FemaleMan_Meets_OncoMouse: Feminism and Technoscience: In Greek, trópos is a turn or a swerve; tropes mark the nonliteral quality of being and language. Metaphors are tropes, but there are many more kinds of swerves in language and in worlds. Models, whether conceptual or physical, are tropes in the sense of instruments built to be engaged, inhabited, lived.

Learning Outcomes for the Module
• Consider the implications of modeling for humanities research through examples from the Google Visualization API.
• Become familiar with how digital models enable the organization of difference and patterns.
• Explore some possible options for modeling the data from your own project.

According to Willard McCarty, a model is "either a representation of something for purposes of study, or a design for realizing something new" (24). These two understandings of models correspond with Clifford Geertz's "denotative 'model of', such as a grammar describing the features of a language, and an exemplary 'model for', such as an architectural plan" (24). Here, models relate to maps. McCarty suggests that, like modeling, mapping "can be either of or for a domain, either depicting the present landscape or specifying its future—or altering how we think about it, e.g., by renaming its places. A map is never entirely neutral, politically or otherwise" (33). (For more, see his "Modeling: A Study in Words and Meanings.")

McCarty also suggests that there are two features of modeling as a practice:
• Take knowledge for granted and just start modeling. Eventually, meaningful surprise occurs when the model generates an occurrence that cannot be explained (e.g., something is where it shouldn't be), or when the model fails to generate the expected occurrence (e.g., something isn't where it should be) (25-26). Both of these examples could also be called "contrivances," or the bringing about of unintended events. (See Module 1 on knowledge production, curiosity, and the Wunderkammer.)
• Perceive the manipulability of information. Models are repeatedly altered and must be interactive (26). Digital models are arguably more flexible, interactive, and manipulable than print ones.

How, then, does a map become Haraway's nonliteral swervy thing, or Geertz's "model for"? How might it alter common perceptions of history, of landscape, of culture, of literature? Or how might it become a vehicle for humor or political action? (We're really going to unpack these questions in the next module.)

Implications for your digital humanities research projects
Modeling entails the
• Introduction of, and interaction between, media layers (e.g., the spreadsheet, the motion chart, the notes, the text, and the essay) in the stages of research and collecting data. (See Module 4 on intermediation and remediation, and Module 3 on collecting and conjecturing.)
• Mobilization of theory through what McCarty calls "the continual process of coming to know by manipulating things" (28). In other words, the swervy thing is also a theory thing: it's a material object (that has force and is used by people in certain ways) and a concept repeatedly put into action.
• Integration of quantitative approaches and classifications into critical approaches to history, culture, and literature.
• "Distant reading" of texts and discovering a problem without a solution. (See Moretti's comments in Module 5.)
• Challenges of:
  o (1) Synthesizing various modes of perceiving, storing, and transmitting information,
  o (2) Selecting the most effective data elements (for a swervy thing),
  o (3) Finding the most persuasive model for your audience(s) and purpose(s), and
  o (4) Determining whether you are representing information ("model of") or designing for the realization of the new ("model for").

What Now?: Applications
• Check out the Google Visualization API gallery. Scroll through the options (e.g., motion chart, geomap, and annotated timeline) with your project in mind.
• For each that interests you, look (at least) at the examples provided, the data format, and configuration options. Considering the aims of your project, as well as your elements (from Module 5), do any of the visualizations work for you? Why or why not?
• As a class, we'll work through an example motion chart using a spreadsheet as a data source. (For a sense of the moving parts, see the sketch at the end of this module.)
• When we are finished, on the blog, respond to the following in your own entry:
  o (1) Given this cursory look at modeling, what obstacles do you foresee?
  o (2) For your project, are you more invested in modeling for or modeling of? Why?
  o (3) How do the visualizations affect your perception of your elements (from Module 5)? What might need to change from that last module?
  o (4) What other kinds of visualizations or models would you like to work with in class?

What's Next?: Modules Ahead
• Mapping in Stakes, What's Data?

What to Consider during Future Modules
• Soon, you'll be submitting data for your project. Regardless of whether you are modeling for or modeling of, how will you make your data interesting, and how will it be organized? What audience(s) do you have in mind, and what matters to them?
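By way of preview, here is a minimal, hypothetical page for the kind of motion chart we will build. The column names and counts below are invented stand-ins; in class, the data will come from our spreadsheet rather than from hard-coded rows.

    <div id="chart_div" style="width: 600px; height: 300px;"></div>
    <script type="text/javascript" src="http://www.google.com/jsapi"></script>
    <script type="text/javascript">
      // Load the (Flash-based) Motion Chart package, then draw once ready.
      google.load("visualization", "1", {packages: ["motionchart"]});
      google.setOnLoadCallback(drawChart);
      function drawChart() {
        var data = new google.visualization.DataTable();
        data.addColumn("string", "Place");   // entity column (must come first)
        data.addColumn("date", "Date");      // time column (must come second)
        data.addColumn("number", "Mentions");
        data.addRows([
          ["Lighthouse", new Date(1927, 0, 1), 12],
          ["London", new Date(1927, 0, 1), 7],
          ["Lighthouse", new Date(1931, 0, 1), 3],
          ["London", new Date(1931, 0, 1), 15]
        ]);
        var chart = new google.visualization.MotionChart(
            document.getElementById("chart_div"));
        chart.draw(data, {width: 600, height: 300});
      }
    </script>

Note the constraint built into the form: a motion chart expects an entity column first and a time column second, so your data elements (from Module 5) have to be reduced to exactly that paradigm before anything will animate.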
Protagonist, from Ralph Ellison's Invisible Man: All things, it is said, are duly recorded—all things of importance, that is. But not quite, for actually it is only the known, the seen, the heard and only those events that the recorder regards as important that are put down, those lies his keepers keep their power by. . . . Where were the historians today? And how would they put it down?

Learning Goals for the Module
• Become familiar with some critical approaches to technology and how to apply one or two of those approaches to your own project, especially to how you are gathering data.
• Determine—through examples and an assessment of your data elements—how those critical approaches might help you increase the stakes of your project.

What or who a map excludes, as well as what or who it enables, are arguably its most important aspects. Often, humanities research projects attend to how objects, such as maps, function in certain social or cultural domains—how, for example, maps render invisible certain people, places, and events and how to change existing maps or create new ones accordingly. Indeed, maps are ways of writing and classifying history, of putting it down. A question, then, is how to recognize what's missing from your own work, why what's missing matters, and how to revise, if need be.

Before we start there, let's look at an example mapping project, "Queering the Map: The Productive Tensions of Colliding Epistemologies," by Michael Brown and Larry Knopp. Here's the abstract from their article:

"Drawing on and speaking to literatures in geographic information systems (GIS), queer geography, and queer urban history, we chronicle ethnographically our experience as queer geographers using GIS in an action-research project. We made a map of sites of historical significance in Seattle, Washington, with the Northwest Lesbian and Gay History Museum Project. We detail how queer theory/activism and GIS technologies, in tension with one another, made the map successful, albeit imperfect, via five themes: colliding epistemologies, attempts to represent the unrepresentable, productive pragmatics, the contingencies of facts and truths, and power relations. This article thus answers recent calls in the discipline for joining GIS with social-theoretical geographies, as well as bringing a spatial epistemology to queer urban history, and a cartographic one to queer geography."

With this project as a case study, how might we consider how "Queering the Map" could emerge from different critical approaches to the map as a technology? Below are five possible approaches, which are broadly framed and adapted from Roel Nahuis and Harro van Lente's "Where Are the Politics? Perspectives on Democracy and Technology."
• Intentionalist: How is a map (as an artifact representing the values of mapmakers and specific social groups) a materialization of power and authority?
• Proceduralist: How is mapping (as a set of social practices with rules and agreed-upon guidelines) a negotiation between interested groups? And who do these groups represent?
• Actor-Network: How is the map (as an artifact that affords and forbids certain actions) the result of a struggle between forces or programs, and how does it affect people's actions on a local level?
• Interpretivist: How are the map (as a text with multiple meanings) and the mapmaker (as a participant with certain investments) influencing and influenced by the discourse in which they are embedded?
• Performative: How is the setting of mapping practices (as activities influenced by particular biases) enabling people to act the way that they do, and what other approaches to the setting would somehow surprise or lay bare biased mapping practices?

As a class, let's unpack these approaches a bit. Then, in your clusters, you can decide—in the context of the "Queering the Map" case study—which two critical approaches you find most relevant. After you chat and blog (with one entry per group) about your decisions, then we'll reconvene and discuss.

Implications for your digital humanities research projects
• Digital projects that are motivated by and well aware of their specific critical approaches to technology will be more persuasive—they will have higher stakes—than those projects where the critical approach is loosely articulated or even nonexistent.
• Critical approaches to technology allow digital humanities projects to do more than simply "represent" information in new forms (e.g., digitize print texts). They allow them to produce new knowledge.
• Note how these five critical approaches relate to Module 6 (on modeling "of" and "for") and Module 1 (on emergent media and knowledge production).
• Selecting one or two of the approaches above and mobilizing it in your own work might be a way of focusing your project.
• These critical approaches affect both how projects are theorized and how they are practiced (e.g., your project as an idea and your project as a process of gathering and organizing data).

What Now?: Applications
• Return to your data elements from Module 5 and to your workflow. In your own blog entry, please respond to the following questions:
  o How, if at all, are your data elements emerging from one or several of the critical approaches listed above, and to what effects? If they don't appear to be emerging from one of these approaches, then explain why you think that is the case.
  o If you were to revise your data elements along the lines of one of these approaches, then what would change? (For example, would you cut an element? Add one? Revise them so that they relate differently? Change how they are worded?)
• Time permitting, let's discuss your entries in your clusters and as a class.

What's Next?: Modules Ahead
• What's Data?, Close Reading

What to Consider during Future Modules
In the next module, you'll be gathering data based upon the data elements you selected in your workflow. Given this module, what kind of data do you expect?
How might you make that data more interesting? Riskier? More provocative?

From Linda Nagata's Limit of Vision: Virgil squeezed his eyes shut, wondering if they ever would have the power to heal death. The human body was a machine; he knew that. He had looked deep into its workings, all the way down to the level of cellular mechanics, and there was no other way to interpret the processes there than as the workings of an intricate, beautiful, and delicate machine. Machines, though, could be repaired. They could be rebuilt, copied, and improved—and sometimes it seemed inevitable that all of that would soon be possible for the human machine too.

Learning Outcomes for the Module
• Understand how data elements (as categorizations of data) are imbricated in material practices, which are associated with actual people and places.
• Consider the importance of scope in assessing your data.
• Learn some "textured" language for assessing your own data and data sources.

In Nanovision, Colin Milburn writes about how nanotechnologists and nanoscientists can "fashion their work as a mapping practice, an effort to contain novel territory within a representational topography that is pictorial, rhetorical, and numerical all at the same time—a 'data map,' a visual rendering, and a descriptive survey of the landscape that transforms its various physical properties into property as such" (65-66). Put broadly, nanovision, or, for instance, a researcher's ability to see objects and bodies at the atomic level, translates the microscopic world into a landscape to be explored, mapped, and territorialized—to visualize it, give it a language, and quantify it. The world as we know it is rendered strange through a new scale. For one, bodies and objects behave differently when we zoom in, when we use technologies such as scanning tunneling microscopes to see what the human eye cannot. What's more, if we can now map what we cannot see with the naked eye, then we can also start to manipulate and shape it. In short, the nanoworld becomes a world of new affordances and possibilities. And as Milburn points out: "Indeed, a vocabulary of western exploration and 'Manifest Destiny' plays a powerful epistemic role in nanoscience research" (67). Expand vision? Expand human control and domain over the world (67). (Martin Jay, among others, refers to this as "ocularcentrism.") Perhaps a video spells it out better. Let us see.

Implications for your digital humanities research projects
• With maps, we tend to think of how to make things that are larger than us (e.g., the whole world) smaller than us (e.g., a map of the world). Yet nanotechnology demonstrates how mapping is really a matter of scope—of expanding our scale (e.g., applicability) and range (e.g., breadth) of knowledge, whether that is seeing the entire world or seeing the minute, inner workings of the body. The scope of your data (and not necessarily the amount of it) is thus always something to consider.
Of course, thinking big isn't always the best option, and your acute knowledge of your project's scope—of why you are setting its scale and range the way that you are—will only enhance how persuasive audiences find it.
• While nanotechnologies afford us increasing freedom (e.g., of choice, of movement), freedom is not the same as control. For instance, our bodies still function in ways we cannot see, let alone grasp. Increased access to information about them does not imply that all material problems will be easily remedied. Put another way, political issues cannot be resolved technologically. (See Wendy Chun, as well as Module 7.) Persuasive digital projects often recognize that knowledge does not exist in objects, bodies, technologies, or information alone, but rather in the material relationships between them. (Some refer to these relationships as ecologies.)

What Now?: Applications
• For this module, I asked you to bring in some data. More specifically, I asked you to actually cut up your print project—to cut into print, gather what you need, and consequently cut out the rest. I also asked you to arrange your data according to your data elements. Now, with that arranged data in front of you, let's ask the following questions of what we'll call your data's "texture." These metaphors, borrowed in part from Sorting Things Out by Bowker and Star, will be means of reminding ourselves of your data's materiality and its scope. Comparable to how nanotechnologists speak of carbon nanotubes, let's speak of your data as threads:
  o How "thick" of a thread is it? (That is, how well does it account for the range of possibilities suggested by your data elements?)
  o How "durable" of a thread is it? (That is, how would it hold up to critique? To what critical approaches (see Module 7) is it accountable?)
  o How "tightly or loosely woven" is it? (That is, how broadly or narrowly does it describe the place, people, or things it's describing?)
  o How well are your data sets "knotted" or "tied" together? (That is, how do they relate, and how do they contradict/complement each other?)
• With these questions in mind, please, in your own entry, blog about miscellany. But by "miscellany," I'm being quite specific. After conducting the above material assessment of your data's scope:
  o What do you think you "cut out" from the data sources and archive you've been working with? What's in the remnants? In "zooming in" on specific elements of the text, what did your nanovision occlude, and to what effects on your project? Especially consider how tightly or loosely woven the data is.
  o What are the limits of your data sources and archive? Their limits of vision? Do you need to look to more texts? Why, or why not? Especially consider the thickness and durability of your threads.
  o Now that you have some data, how, if at all, did the data elements (as constraints) help you gather data that surprised you? Put another way, what, if anything, did you think you had under control and all mapped out that, in fact, you do not? Especially consider the ties and knots across your data sets. If you were not surprised, then why?

What's Next?: Modules Ahead
• Close Reading, Assessing Your Project

What to Consider during Future Modules
• In the near future, you'll be producing a data model, which is essentially an abstraction of how you are organizing and processing your data. In composing such an abstraction, what are some ways to remind yourself of your data's texture? Of its material embeddedness and implications? Good luck, humans.

From The Verbal Icon, by W.K. Wimsatt and Monroe C. Beardsley: One must ask how a critic expects to get an answer to the question about intention. How is he to find out what the poet tried to do? If the poet succeeded in doing it, then the poem itself shows what he was trying to do.

Learning Outcomes for the Module
• Understand what might be some critiques of "distant reading" and how to engage those critiques.
• Collaboratively annotate a text that has been popular in the class thus far and see what collaborative annotation affords.
• Recognize some possible tensions between "distant reading" and "close reading" and articulate why that tension is productive.

Put this possibility on the table: For the entire quarter, you've been compiling data on an author's entire corpus—let's say Virginia Woolf's. More specifically, you're studying what places are referenced in her novels, and you're locating those places, together with relevant quotes from those texts, on a single map. When the quarter's finished, it's quite possible that you haven't read—in its entirety—a single book by Virginia Woolf.

My first suggestion? Read a book by Virginia Woolf. My next suggestion? Consider what someone (e.g., a literary critic, a fan of Woolf) would value as "close reading," where careful attention is paid to the words and ideas of a text (and often just the text alone). Select passages of the text are then scrutinized in a work of criticism. (You've likely done this, no?)

Actually, for this module, let's conduct a close reading of a text that's been popular in the class. For now—of course, subject to change—I'll go with Martin Heidegger's "The Question Concerning Technology," first published in 1954. I select it primarily because it's essentially a canonical (or ubiquitous) text as far as the culture, philosophy, history, and sociology of technology are concerned. Regardless of the text (which should be only a chapter or an article), we'll go through it, in class, line by line, and annotate it using Microsoft Word. I'll then circulate that annotated text for your future reference. During the module, it might not be a bad idea for a number of us to play the role of transcriber, taking down the annotations, in the margins, as they emerge. After all, transcription is a matter of interpretation, and it's labor-intensive. Switching up transcribers will thus give people breaks and generate a broader range of experiences and questions during the exercise.
Once the text is annotated, we'll ask what we've learned from the close reading and how it differs, if at all, from the work you've been doing all quarter.

Implications for your digital humanities research projects
• Distant readings are often, fairly enough, critiqued as ignoring the principles and benefits of close reading. While assessing your project and speaking to it, keeping these critiques in mind is a smart practice.
• Rather than eschewing close reading for distant reading (or vice versa), a more complex response is to note how the two differ, to what effects, and why. For instance, a literary historian might be more invested in a distant reading, while a New Critic might be more invested in a close reading. Both afford distinct and (when done persuasively) equally important readings.
This one’s not the whole idea. It’s a thought piece. And it should consist of the following: • As a field of study, what you think the digital humanities does, • How you think its practitioners do what they do, and • Initial and interesting ideas for at least one digital humanities project that you could develop this quarter. At least one. By “you,” I mean you in particular. Be selfish, people. How you shape this information is up to you. You can essay, diagram, video, draw . . . The medium is not the matter. Pick what you prefer. However, you should figure this in: your medium will influence how you (and your audience) create and think through a message. (Consider “remediation” and “intermediation” from Module 4, as well as “syntagms” and “paradigms” from Module 2.) And remember: A thought piece is a riff. The point is to conjecture. Speculate. Toss out a rich idea or two or three, and later we’ll talk about making the whole thing happen. Outcomes Your thought piece should: • Demonstrate a general understanding of how Modules 1 through 4 relate to the digital humanities as a field and a set of practices (e.g., apply some of the concepts from the modules, think through how to use new media for new forms of scholarship, or unpack the distinctions between print and digital texts). • Give your audience (that is, your 498 peers and me) a sense of why your project(s) would be filed under “digital humanities” and what’s interesting—provocative, even—about your idea(s). Before and during the process, consider: • Reviewing the visualization/diagram of the class (in the syllabus). What’s familiar? What isn’t? • Giving the class modules another gander. What appeals? What confounds? • Looking back at some of your old work from other classes. What have you written on? Studied? What do you care about? What’s curious, and what could be developed? Conversation Coming Soon Your thought piece is due—on the class blog (embedded, via a link, or as text)—before class on Wednesday, April 15th. It will serve as a vehicle for conversation during your first conference with me. Which is to say: I’ll attend to it before we meet. That way, we don’t start cold. I swear. The thought piece will be graded on the 4.0 scale, and it can be revised once. It’s part of your individual project grade. If you have problems with the blog, then let me know. http://books.google.com/books?id=AFJ7dvSdXPgC&dq=illuminations&ei=Q4e6Sd6_O5WWkATG6oiCDA&pgis=1 http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-2.pdf http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-4.pdf Ishmael, in Herman Melville’s Moby Dick: God keep me from ever completing anything. . . . Oh, Time, Strength, Cash, and Patience! Michel Eyquem de Montaigne in “Of Cannibals”: I am afraid our eyes are bigger than our bellies and that we have more curiosity than capacity. We grasp at all, but catch nothing but wind. You’ve made a thought piece. We’ve talked about it. Now it’s time to sketch out what—aside from time, strength, cash, and patience—is needed to put a thought in motion. Of course, a thought moving isn’t a thought complete. Keep that pithy line in mind as you respond to this prompt. Or, to contextualize: The goal for the quarter isn’t to finish a research project; it’s to build one worth developing in the future. Recall Shelley Jackson, from Module 1: “there can be no final unpacking.” Determine, then, what you can grasp—what’s feasible—between now and June-ish. How practical, especially for humanists. 
Let’s give such practicality a name: “needs assessment.” However! As opposed to the image below, your “needs” here won’t simply be downloaded for regurgitation later. You’ll have to come up with them on your own, with some guidelines. As with the first prompt, the medium is yours. But please respond to the following: • What do you want from your emerging project? Or, what is your objective, and what’s motivating it? • What do you need (e.g., knowledge, experience, materials, and practice) to pull everything off? Or, to return to Moretti and Module 5 for a sec: For now, what knowledge are you taking for granted? • Where are you going for evidence or data? That is, what texts will you be working with? Outcomes Your needs assessment should be: • Specific, pointing to the particular knowledge you need and want (e.g., XHTML, GIS, literature review, and media theory/history) and what materials you should have (e.g., software, time, and books). • More refined and focused than your thought piece. (If the thought piece was about broad possibilities, then your needs assessment is about concrete ones.) • A way of responding to your first conference with me. (Reference our conversation and expound upon it.) • Aware that its audience consists of your peers and me. (Feel free to use names or speak to particular bits from class.) Before and during the process, consider: • What is realistic for a quarter? • How do you avoid reinventing the wheel? What did you learn from another course or project that could be developed and re/intermediated? • When the spring’s finished, what kind of project will be most useful for you? Think before and beyond now. http://books.google.com/books?id=cYKYYypj8UAC&dq=moby+dick&ei=Boq6SbL7K5r6kASbl6yJCA http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-1.pdf http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-5.pdf http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-4.pdf Mapping the Digital Humanities Assignment 2, Page 2 __________________________ Critiques Soonish Your needs assessment is due—on the class blog (embedded, via a link, or as text)—before class on Monday, April 20th. You will share it during in-class critiques. During those critiques, you’ll also respond to your peers’ assessments. The needs assessment will be graded on the 4.0 scale, and it can be revised once. It’s part of your individual project grade. If you still have problems with the blog, then let’s talk. I might need to revise or address something. Carl von Clausewitz in On War: Everything in strategy is very simple, but that does not mean that everything is very easy. Fair enough, Carl, but that doesn’t mean we can’t at least try to make things a tad easier, right? Despite the fact that plans and thoughts and needs and life are all subject to change, sketching out an agenda, through some simple elements, is rarely a bad idea. The key is—to borrow from Chris Kelty—“planning in the ability to plan out; an effort to continuously secure the ability to deal with surprise and unexpected outcomes” (12). So how about what we’ll call a “workflow”? Again, the medium is yours, but please transmit the following: • What is your research question? (Try one that starts with “how.”) • What are the data elements for your project? (We have already discussed these in class; and, if all’s on par, then you should have already drafted them.) 
• How are you animating these elements (e.g., through what medium—for example, a motion chart, a geomap, or a timeline—are you shaping information)? • What do you expect to emerge from this animation (e.g., what will information look like, how will the audience interpret it, or what might you learn from it)? • Ultimately, what are you going to do with it (e.g., how will it influence your current work, how might you use it in other classes, how will it persuade audiences, or how will it change the ways in which you perceive the text(s) you’re working with)? Outcomes Your workflow should: • Be driven by a concrete and provocative research question, which emerges from your responses to Prompts 1 and 2. • Be very specific about the data elements you are using. Name them. List them out. • Be very specific about the kind of animation you are using, including some knowledge of how that animation allows you and your audience to produce knowledge—or how that animation is a “swervy thing.” • Demonstrate that you are aware of why you are using the data elements and animation you’re using and what might be the implications of your decision (e.g., what are the benefits and deficits, or the hot ideas worth some risk and not-so-hot possibilities that are deterring you). • Aware that its audience consists of your peers and me. Before and during the process, consider: • How your digital project—through computational animation—demands a different mode of thought than, say, writing a paper. How might you take advantage of this difference? What does it afford? • What options you have for animation, what you are most comfortable with, and—again, again, again—what seems feasible for a quarter. http://books.google.com/books?id=fXJnOde4eYkC&printsec=frontcover&dq=on+war&ei=7ZG6SfWZLIWekwS-3LT8Cw#PPA110,M1 http://books.google.com/books?id=wC2stJS83rYC&printsec=frontcover&dq=kelty+free+software&ei=QoLJSdLcDI6QkASH1pCCDg#PPA12,M1 http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-5.pdf http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-5.pdf http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-6.pdf http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-6.pdf http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/assignment-1.pdf http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/assignment-2.pdf Mapping the Digital Humanities Assignment 3, Page 2 __________________________ • In your previous work, what terms or concepts pop up most often, which ones interest you the most, which ones you’d rather do without, and how those terms would translate in a computational approach. Toward Making Animation Matter Your workflow is due—on the class blog (embedded, via a link, or as text)—before class on Monday, April 27th. In class, we’ll get theoretical and address the “stakes” of your animation and data elements, or how you can make them matter and for whom. The workflow will be graded on the 4.0 scale, and it can be revised once. It’s part of your individual project grade. Keep me posted with questions and quibbles. http://mappingthedigitalhumanities.org/wp-content/uploads/2009/01/module-7.pdf From Dyeth in Samuel R. Delany’s Stars in My Pocket Like Grains of Sand: Someone once pointed out to me that there are two kinds of memory (I don’t mean short- and long-term, either): recognition memory and reconstruction memory. 
The second is what artists train; and most of us live off the first—though even if we're not artists we have enough of the second to get us through the normal run of imaginings.

A constant challenge in academic work, then, is to model something that reshapes the material with which you and others are already familiar—to re-construct and re-imagine history, culture, texts, territories, and places through new paradigms, without simply recognizing them as what you already know, using the same blueprints, strategies, and maps as before. To produce a contrivance. To project a world and animate it. To swerve. I'm not saying it's easy. It's not. But give it a whirl. You've thought about your project (in your Thought Piece), assessed its possibilities (in your Needs Assessment), made it elemental (in your Work Flow), and speculated on what might happen come June (during in-class workshops). Now's the time to give people the classification system for your information collecting and some results—that is, your data model and some data.

This time around, the medium isn't yours. Sorry. Please complete the data model worksheet (http://mappingthedigitalhumanities.org/wp-content/uploads/2009/04/data-model.doc). However, when you provide your data, you can choose the medium. For instance, feel free to use a spreadsheet, provide copies of a log, or complete the table I provide at the end of the worksheet.

Outcomes

Your data model should be:

• Extremely specific, providing your audience with exact details for each of your data elements, following the form provided, and leaving no necessary field blank.
• A cogent means of giving a reader who is not familiar with your project a sense of how you are collecting and organizing your data.

Your elaboration on your data model should be:

• A mobilization of terms and concepts from class (e.g., classification, paradigms, re/intermediation, collecting, affordance, intent, procedures, bias, discourse, animation, and distant reading), putting them to work in the context of your project.
• Concrete and situated in your project. Abstract language should be avoided. Responses to each question should be based on examples from and exact instances in your project.
• Aware of the limits and benefits of the decisions you are making and how those decisions will affect your target audience and your own learning. Remember: you can't do everything, but you should be able to account for how you are mapping your project.

Your data should be:

• Well-organized and specific, based upon the framework outlined in your data model.
• Sufficient to—at this juncture in your project—allow you to make some preliminary findings based upon your research. (However, the data does not need to be complete. You might still be in the process of collecting more. In the worksheet, I require three rows of data. I recommend collecting much more, if possible. For some projects, twenty to forty rows will be necessary.)

Before and during the process, consider:

• What you expect to emerge from your animation at the quarter's end. How do those expectations resonate with your data model?
• Returning to what you churned out in response to Prompts 1 through 3. What's your trajectory, collector?
• How, broadly speaking, this approach to humanities work relates to your previous coursework and experiences, and to what effects.
• Revisiting the modules and contacting me and/or your peers with any questions you have about the terms and concepts used.

Another Review Coming Soon

Your data model worksheet is due—on the class blog (embedded, via a link, or as text)—before class on Monday, May 11th. During that class, your worksheet will be peer reviewed, and I will grade your worksheet based on that peer review. The data model will be graded on the 4.0 scale, and it can be revised once. It's part of your individual project grade. Hope all's coming along well. As always, let me know about your concerns.

From Hervé Le Tellier's "All Our Thoughts": I think the exact shade of your eyes is No. 574 in the Pantone color scale.

Ah . . . the abstract: the oh so academic act of summarizing work that's often still in progress. Your project's not finished, you're still not sure if everything coheres, and the thing's so deep you can't dare reduce it to a single paragraph. I know this. I don't particularly enjoy writing abstracts, either. But abstracts are necessary beasts. Aside from giving your readers a quick snapshot of your research, they also force you to articulate—in a precise fashion and in exact numbers—what, exactly, you are up to. To the details, then. Your abstract should include:

• The aim of your project and its motivation/purpose,
• Your research question (although it does not need to be articulated as a question),
• Your method (how you did what you did),
• Your results (what you learned),
• The implications of your results (or why your research matters), and
• The trajectory of your project (what you plan to do with it in the future).

This one should be in words. Despite Blake's abstract of humans, we're going with the industry standard here.

Outcomes

Your abstract should:

• Be no more than three hundred words.
• Be one concise and exact paragraph.
• Include a title for your project, three keywords for it, and a one-sentence tagline describing it. (The keywords and tagline are not part of the three-hundred-word limit.)
• Be written for educated, non-expert audiences (e.g., academic types who might not be familiar with the digital humanities) and avoid jargon.
• Summarize your work as it stands, instead of becoming an idea hike into unventured regions (that is, avoid speculations).
• Mobilize terms and concepts from the class, again, for educated, non-expert audiences.
• Demonstrate, through clear language, how your project's motivation, question, method, results, and trajectory are related.
• Follow the form below.

Before and during the process, consider:

• How your data model is one way of thinking through your method.
• Returning to your response to Prompt 3, which asked you for your research question, and to Prompt 2, which asked you what you want from your project.
• Module 7 (on making your project matter) and how it speaks to your project's motivation and the implications of your results.
• How to write for people who would have absolutely no clue what, exactly, the digital humanities is.
• How terms common in the course thus far (e.g., paradigm, syntagm, model, distant reading, remediation, and intermediation) might be helpful when articulating your project.
• When terms should be defined.
Contextualizing the Thing

Your abstract is due—on the class blog (attached as a Word document)—before class on Wednesday, May 20th. On May 27th, we'll consider how to integrate your abstract into the presentation of your project. An abstract is nothing without what it's abstracting. The abstract will be graded on the 4.0 scale, and it can be revised once. It's part of your individual project grade. If you need help condensing, then let me know.

Form for the Abstract

Project Title
Your Name, Your Major
Tagline
Three keywords
Body of abstract (300 words, one paragraph)

Examples

View some sample abstracts at http://www.sccur.uci.edu/sampleabstracts.html (which do not necessarily follow the format and outcomes for this prompt, but are nevertheless good references).

From DJ Spooky's Rhythm Science: As George Santayana said so long ago, "Those who cannot remember the past are condemned to repeat it." That one's scenario. But what happens when the memories filter through the machines we use to process culture and become software—a constantly updated, always turbulent terrain more powerful than the machine through which it runs? Memory, damnation, and repetition: That was then, this is now. We have machines to repeat history for us. . . . The circuitry of the machines is the constant in this picture; the software is the embodiment of infinite adaptability, an architecture of frozen music unthawed.

Reflection, reflection, reflection. Instructors often like the word. I'm not sure it fits here, though. The purpose of this project assessment isn't for you to ruminate on whether you're good enough or smart enough. We know you are, and people like you. It's for you to articulate what—over the course of the quarter—ultimately emerged from your project and what you think of it. The thing began as an idea. You then converted it into an agenda, with a model, compiling pieces of data, and ultimately animating those pieces. That said, I hope you collected something you're happy with. The project goal was for you to think through "generative constraints" as strict as computation and data models to produce provocative questions, new knowledge, and reconfigurations of literature, culture, and history. After all, the hardware of history needn't determine its interpretation, and the wiring of culture is never neutral. Infinite adaptability.

With that adaptability in mind, please unpack this list, without, of course, the brazen assumption that your unpacking is final. The quarter just so happens to be over. (And I'm really sad about that.)

• How—for better and for worse—does your animation project differ from an academic paper (especially one intended for print)? What does it ask of audiences and to what effects?
• How does your project produce new knowledge and about what?
• Considering the brevity of a quarter, how was your project a success? What did you learn from it? What will others?
• How could you improve your project? What do you want to continue learning from it?
• How, if at all, do you plan on developing (or using) your project in the future? Do you plan to circulate it to others or make it public?
Why or why not?

Unless you are going for writing credit, I've decided to let you choose the medium or media here. You can make—or blend together—video, a website, audio, word docs, or what-have-you. Be creative. Just do me two favors:

1. With your assessment, include three outcomes upon which I should assess your project and your assessment of it. Those outcomes should include references to your method for collecting data, your awareness of your own bias/intent/procedures, your project's design, and how your project produces knowledge (instead of just re-presenting known information).
2. Provide me with your final animation project. Upload it to the blog, provide a link, or the like. (See more below.)

Outcomes

By focusing on your project as a process, your project assessment should:

• Be composed for educated, non-expert audiences (e.g., academic types who might not be familiar with the digital humanities).
• Demonstrate your understanding of the digital humanities as a field, using material from the class when appropriate.
• Reference specific aspects of your project and draw upon it for evidence.
• Exhibit critical approaches to your own project (e.g., show that you know how you did what you did, what worked, and how you could have done things differently).
• If applicable, include a works cited page of texts quoted, paraphrased, or the like.

Before and during the process, consider:

• Returning to your responses to all prompts. How has your project—and your framing of it—changed since then?
• Returning to the course syllabus and assessing what you've learned in the class since day one of the quarter.
• Returning to the user's guide for CHID 498.
• Circulating a draft assessment to me and your peers. (Use the blog!)
• How to write for people who would have absolutely no clue what, exactly, the digital humanities is.
• Doing something that will keep you interested. It's finals week, in spring, just before summer, y'all.

This One Will Not Be Revised

Your project assessment and final portfolio are due—on the class blog (filed under your name)—by the end of the day, Wednesday, June 10th. Here's what (ideally) should be uploaded to your author page on the blog:

• Mapping 1,
• Thought Piece (First Draft and Revision, if applicable),
• Needs Assessment (First Draft and Revision, if applicable),
• Work Flow (First Draft and Revision, if applicable),
• Mapping 2,
• Data Model (First Draft and Revision, if applicable),
• Abstract (First Draft and Revision, if applicable),
• Animation (all versions, including the one presented on June 3rd),
• Project Assessment, and
• Anything else you think is relevant.

As a reminder, here's how your work in 498 will be graded:

• Class participation (30% of the grade)
• Blogging and collaborative mapping (20% of the grade)
• HTML quiz (5% of the grade)
• Final exhibition (5% of the grade)
• Individual project (40% of the grade)

These five components of the class will each be graded on a 4.0 scale and then, for your final grade, averaged according to the percentages I provide above.
And here's how the portfolio is graded:

• Thought piece (10% of portfolio, can be revised once after it's graded),
• Needs assessment (10% of portfolio, can be revised once after it's graded),
• Work flow (10% of portfolio, can be revised once after it's graded),
• Data model (15% of portfolio, can be revised once after it's graded),
• Abstract (15% of portfolio, can be revised once after it's graded), and
• Final prototype and assessment (40% of portfolio, cannot be revised after it's graded).

See me with questions! Have a rad summer break, people. It's been a pleasure, and—to reiterate—make this last bit interesting. After all, CHID 498 was, from the get-go, an experiment.

work_a734rm3mxbf4pn5ghgxadmhjja ---- Preparing Non-English Texts for Computational Analysis

Dombrowski, Q 2020 Preparing Non-English Texts for Computational Analysis. Modern Languages Open, 2020(1): 45, pp. 1–9. DOI: https://doi.org/10.3828/mlo.v0i0.294

ARTICLE – DIGITAL MODERN LANGUAGES

Preparing Non-English Texts for Computational Analysis
Quinn Dombrowski, Stanford University, US (qad@stanford.edu)

Most methods for computational text analysis involve doing things with "words": counting them, looking at their distribution within a text, or seeing how they are juxtaposed with other words. While there's nothing about these methods that limits their use to English, they tend to be developed with certain assumptions about how "words" work – among them, that words are separated by a space, and that words are minimally inflected (i.e. that there aren't a lot of different forms of a word). English fits both of these assumptions, but many languages do not. This tutorial covers major challenges for doing computational text analysis caused by the grammar or writing systems of various languages, and ways to overcome these issues.

Introduction

Most methods for computational text analysis involve doing things with 'words': counting them, looking at their distribution within a text or seeing how they are juxtaposed with other words. While there's nothing about these methods that limits their use to English, they tend to be developed with certain assumptions about how 'words' work – among them, that words are separated by a space, and that words are minimally inflected (i.e. that there aren't a lot of different forms of a word). English fits both of these assumptions, but many languages do not.

Depending on the text analysis method, a sufficiently large corpus (on the scale of multiple millions of words) may sufficiently minimize issues caused by inflection, for instance at the level commonly found in Romance languages. But for many highly inflected Slavic and Finno-Ugric languages, Arabic, Quechua, as well as historical languages such as Latin and Sanskrit, repetitions of what you think of as a 'word' will be obscured to algorithms with no understanding of grammar, when that word appears in different forms, due to variation in the number, gender or case in which that word occurs. To make it possible for an algorithm to count those various word forms as the same 'word', you need to modify the text before running the analysis. Likewise, if you're working with Japanese or Chinese, which don't typically separate words with spaces, you need to artificially insert spaces between 'words' before you can get any meaningful result. For example, 'I went to Kansai International Airport' is written in Japanese as 関西国際空港に行きました.
The lack of spaces between words means that tools dependent on spaces to differentiate (and then count) words will treat this entire sentence as a single 'word'. Segmentation – the process of adding spaces – is not always an obvious or straightforward process; on one hand, it's easy to separate 'to' and 'went' from the name of the airport (関西国際空港 'Kansai International Airport' に 'to' 行きました 'went'), but depending on what sorts of questions you are attempting to answer with the analysis, you may want to further split the proper name to separate the words 'international' and 'airport', so that they can be identified as part of a search, or contribute to instances of those words in the corpus: 関西 'Kansai' 国際 'international' 空港 'airport' に 'to' 行きました 'went'.

Goals

This tutorial covers major challenges to doing computational text analysis caused by the grammar or writing systems of various languages, and offers ways to overcome these issues. This often involves using a programming language or tool to modify the text – for instance by artificially inserting spaces between every word for languages such as Chinese that aren't regularly written that way, or replacing all nouns and verbs with their dictionary form in highly inflected languages such as Finnish. In both of these situations, the result is a text that is less easy to parse for a human reader. Removing inflection may have the effect of making it impossible to decipher the meaning of the text: if a language has relatively flexible word order, removing cases renders it impossible to differentiate subjects and objects (e.g. who loved whom). But for some forms of computational text analysis, the 'meaning' of any given sentence (as readers understand it) is less important; instead, the goal is to arrive at a different kind of understanding of a text using some form of word frequency analysis. By modifying a text so that its 'words' are more clearly distinguishable using the same conventions as found in English (spaces, minimal word inflection etc.), you can create a text derivative that is specifically intended for computation and will lead to much more interpretable computational results than if you give the algorithm a form of the text intended for human readers. While this lesson provides pointers to code and tools for implementing changes to the text in order to adapt it for computation, the landscape of options is evolving quickly and you should not feel limited to those presented here.

Audience

Text analysis methods are most commonly used in research contexts, and frequently appear as part of 'an introduction to digital humanities' and similar courses and workshops. While these courses are taught worldwide, the example texts are, most often, in English, and the application of these text analysis methods may not be as straightforward for students working in other languages. This tutorial is intended for instructors of such workshops, to help them be better informed about the challenges and needs of students working in other languages and to provide them with pointers for how to troubleshoot issues that may arise.

For instructors of modern languages, text analysis methods can also have a place in intermediate to advanced language courses (see Cro & Kearns).
For instance, while many digital humanities researchers now use more nuanced methods than word clouds, they can still be employed in a language pedagogy context to provide a big-picture visualization of word frequency – starting with the generic and obvious (prepositions, articles, pronouns etc.) and becoming more and more related to the content of the text as students apply and refine a stopword list (a list of words that should be removed prior to doing the word counts and generating the visualization). Depending on the text, even a word cloud may make visible the impact of inflection, as it may contain multiple forms of a given 'word', which can spur discussion about what constitutes a 'word'. Intuitively, we think of saber ('to know' in Spanish) as the 'same word' as sé 'I know', sabemos 'we know', sabía 'knew' and so on, but what do we gain and lose if we treat them as 'different words', the way a computer would by default?

Text encoding

Text encoding – or how the low-level information about how each letter/character is actually stored on a computer – is important when working with any text that involves characters beyond unaccented Latin letters, numerals and a small number of punctuation marks. (Note that 'encoding' here refers to the comparatively low-level technical process of standardizing which bits represent which letters in various alphabets. This is a different use of the term than the 'encoding' in the Text Encoding Initiative (TEI), https://tei-c.org, which captures structural and/or semantic features of text in a potentially machine-readable way.) It may be tempting to think languages that use the Latin alphabet are safe from a particular set of challenges faced by other writing systems when it comes to computational text analysis. In reality though, many writing systems that use the Latin alphabet include at least a few letters with diacritics (e.g. é, ñ, or ż), and these letters cause the same issues as a non-Latin alphabet, albeit on a smaller scale. While a text in French, Spanish or Polish may be decipherable even if all of these characters are mangled (e.g. ma□ana for mañana is unlikely to cause confusion, and even a less obvious case such as a□os for años is often distinguishable by context), issues with text encoding may cause bigger problems later in your analysis – including causing code to not run at all. For languages with a non-Latin alphabet, text encoding problems will render a text completely unreadable and must be resolved before doing anything at all with the text. Unicode (UTF-8) encoding is the best option when working with text in any language, but particularly non-English languages.

What is Unicode?

Unicode is the name of a computing industry standard for encoding and displaying text in all writing systems of the world. While there are scripts that are not yet part of Unicode as of 2020 (including Demotic and some Egyptian hieroglyphs), researchers affiliated with the Unicode consortium have done a tremendous amount of work starting in the late 1980s to differentiate characters (graphemes, the smallest units of a writing system) versus glyphs (variant renderings of a character, which look a little different but have the same meaning) for the world's writing systems, and assign unique code points to each character. With some writing systems – including Chinese and various medieval scripts – the decision of what constitutes a character as opposed to a glyph is at times controversial. Scholars who disagree with previous decisions or who feel that they have identified a character that is not represented in Unicode, can put forward proposals for additions to the standard.
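To make the idea of code points concrete, here is a minimal Python sketch (my addition, using only the standard library's unicodedata module) that prints the numeric code point and official Unicode name of a few arbitrary characters:

    import unicodedata

    # Every character has a unique numeric code point and an official name.
    for char in ["n", "ñ", "ż", "Д", "関"]:
        print(f"{char}  U+{ord(char):04X}  {unicodedata.name(char)}")

    # For example: ñ  U+00F1  LATIN SMALL LETTER N WITH TILDE

Note that ñ and n are different characters with different code points, which is exactly why a mangled encoding can break later processing.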
While the Unicode consortium that shapes the development of the standard is primarily made up of large tech companies, scholars and researchers play a significant role in shaping decision-making at the language level (Anderson).

Why is Unicode important?

Before Unicode was widely adopted, there were many other standards that developed and were deployed in language-specific contexts. Windows-1251 is an encoding system that was widely used for Cyrillic and is still used on 11% of websites with .ru (Russian) domain names (W3Techs). A competing, but less common, Cyrillic encoding for Russian was KOI8-R, and a similar one, KOI8-U, was used for Ukrainian. For Japanese, you may still encounter websites using Shift JIS encoding. For Chinese, you can find two major families of encoding standards prior to Unicode, Guobiao and Big5. A major advantage of Unicode, compared to these other encoding standards, is that it makes it possible to seamlessly read text in multiple languages and alphabets. Previously, if you had a bilingual parallel edition of a text on a single webpage with languages that used two different writing systems, you would have to toggle between multiple text encodings – reducing one side of the text, then the other, to gibberish as you switched between them.

If you work in a language with a non-Latin alphabet, odds are good that you'll encounter text that doesn't use Unicode encoding at some point in your work. Long-running digital text archives, in particular, are likely candidates for not having migrated to Unicode. If you try to open a text file using the wrong kind of encoding, you won't see text in the alphabet you're expecting to see, but rather a kind of gibberish that will soon become familiar. (For instance, Windows-1251 Cyrillic looks like Latin characters with diacritics: "Äîñòîåâñêèé Ôåäîð Ìèõàéëîâè÷. Ïðåñòóïëåíèå è íàêàçàíèå" for "Достоевский Федор Михайлович. Преступление и наказание" – Dostoevsky Fyodor Mikhailovich. Crime and Punishment.)

Making sure your text uses Unicode encoding

Most computational text analysis tools and code assume that the input text(s) use UTF-8 (Unicode) encoding. If the input text is not in UTF-8, you may get an error message, or the tool may provide an 'analysis' of the unreadable gibberish (Figure 1).

[Figure 1: Voyant (https://voyant-tools.org/) 'analysis' of Windows-1251 encoded Russian text.]

It is not obvious what encoding a text file uses: that information isn't included in the file properties available on Windows or Mac. There isn't even an easy way to write Python code to reliably detect a file's encoding. However, most plain text editors have some way to open a text file using various encodings until you find one that renders the text readable, as well as some way to save a text file with UTF-8 encoding. A plain text editor is software that natively reads and writes .txt files, without adding in its own additional formatting (which Notepad does in Windows).
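That said, if you do know (or have guessed by trial and error) the source encoding, the conversion itself can be scripted rather than done in an editor. A minimal Python sketch, assuming a hypothetical Windows-1251 file named dostoevsky.txt:

    # Decode the bytes using the encoding they were actually saved with...
    with open("dostoevsky.txt", encoding="windows-1251") as infile:
        text = infile.read()

    # ...then write the same text back out as UTF-8.
    with open("dostoevsky_utf8.txt", "w", encoding="utf-8") as outfile:
        outfile.write(text)

The same trial-and-error logic as in a text editor applies: if the result reads as gibberish, try another candidate encoding (e.g. koi8-r for Russian) until the text looks right.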
Atom (https://atom.io/) is a cross-platform (Windows/Mac/Linux) plain text editor that you can install if you don't already have a preferred editor. There are numerous packages (add-ons) for Atom that provide additional functionality. One of these is called convert-file-encoding (https://atom.io/packages/convert-file-encoding). Download and install this add-on following the instructions in the Atom documentation (https://flight-manual.atom.io/using-atom/sections/atom-packages/). Once you've installed the convert-file-encoding package, open your text file in Atom. By default, Atom tries to open everything as UTF-8. If everything displays correctly, your file already uses Unicode encoding. If the text is gibberish, go to Edit > Select encoding, and choose a possible candidate encoding. The encodings are listed in Atom by what languages they cover, so you can try different options for your language if you're not sure. Once your text appears normally, go to Packages > Convert to encoding and select UTF-8. Then save your file.

Segmentation

For Chinese and Japanese text, you need to segment your text, or artificially insert spaces between 'words', before you can use it for computational text analysis. For Chinese, some scholars treat every character as a 'word'. This destroys compounds but is more predictable than using a segmenter. For both Chinese and Japanese, segmenters work best when the text does not contain a lot of jargon or highly specialized vocabulary, or non-standard orthography (e.g. Japanese children's writing, which often uses the hiragana syllabary where a fully literate adult would use kanji).

Stanford NLP (natural language processing) provides a Chinese segmenter (https://nlp.stanford.edu/software/segmenter.shtml) with algorithms based on two different segmentation standards; a Chinese part-of-speech tagger tutorial that begins with a step-by-step guide to this segmenter is available at https://github.com/quinnanya/dlcl204/blob/master/chinese/pos_chinese.md. For Japanese, segmentation is available through the mecab software (https://taku910.github.io/mecab/). Rakuten MA (https://github.com/rakuten-nlp/rakutenma) is a Javascript-based segmenter that supports Chinese and Japanese. There is also a Python implementation, Rakuten MA Python (https://github.com/ikegami-yukino/rakutenma-python). If you have trouble with mecab but aren't comfortable writing Python code yourself, there's a Jupyter Notebook available for segmenting Japanese (https://github.com/quinnanya/japanese-segmenter). See the Programming Historian tutorial 'Introduction to Jupyter Notebooks' (Dombrowski et al.) for a description of Jupyter Notebooks and how to use them.
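As a quick illustration of what segmentation produces, here is a minimal sketch using jieba, a widely used open-source Chinese segmenter that is my addition rather than one of the tools named above; the sample sentence (a Chinese rendering of the airport example) is also mine.

    import jieba  # pip install jieba

    sentence = "我去了关西国际机场"  # 'I went to Kansai International Airport'

    # jieba.cut() yields the words it finds; joining them with spaces
    # produces the space-delimited text that counting tools expect.
    segmented = " ".join(jieba.cut(sentence))
    print(segmented)  # e.g. 我 去 了 关西 国际 机场 (exact splits depend on jieba's dictionary)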
Stopwords

Stopwords are words that are filtered out as the first step of text analysis. Many tools have a configuration option where you can define which words should be treated as stopwords (see, for example, the settings for the Topic Modeling Tool, https://senderle.github.io/topic-modeling-tool/documentation/2018/09/27/optional-settings.html, or the general-purpose text exploration environment Voyant, https://voyant-tools.org/docs/#!/guide/stopwords). Stopword removal is essential for some methods (including word clouds and topic modelling), to avoid having your results flooded with articles, copulas, prepositions and the like. Other methods, such as word vectors (which analyse words in their context as a way to explore semantic relationships within large corpora), rely on stopwords for important information about the semantic value of words, and stopwords should be retained in the text.

Stopwords are language specific, and more nuanced use of stopwords can involve text-specific lists that also exclude things like character names (which are likely to occur with high frequency, but that frequency may or may not be meaningful depending on your research question). If you're using a tool that supports the use of stopword lists, you should check to make sure that a default, almost certainly English, stopword list isn't being applied to your non-English text. Some tools provide reasonable built-in stopword lists for multiple languages. Voyant offers generally reasonable lists for thirty-four languages, along with a combined 'multilingual' setting, and an option for defining your own list. These lists are not identical: the Russian list includes the words for many numbers (including пятьдесят 'fifty'), the Spanish list has no numbers but does include various forms of emplear 'use', and the Czech list includes no numbers whatsoever but does have a number of words related to news (e.g. články 'articles'), hinting at the domain and context of its origins. Is it the right thing to do to eliminate written-out numbers from a Russian text, or any references to 'articles' in a Czech text? It all depends on what you're trying to learn from the text analysis. Students should examine – and, if necessary, modify liberally – any stopword list before applying it to their text. If you're a digital humanities instructor, be careful about uncritically recommending stopword lists for languages you can't read yourself. As an initial vetting step, at least run any list you find through Google Translate first, and read through it.
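Applying a stopword list in your own code is a short filter, which also makes the list easy to inspect and revise before use. A minimal sketch, with a deliberately tiny made-up Spanish list and sample phrase:

    # A tiny illustrative stopword list; a real one is much longer and should
    # be read through and adapted to the text and research question.
    stopwords = {"el", "la", "los", "las", "de", "y", "en", "que"}

    tokens = "la historia de los gatos en la ciudad".split()
    content_words = [t for t in tokens if t not in stopwords]
    print(content_words)  # ['historia', 'gatos', 'ciudad']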
There are many resources online that aggregate stopword lists for any number of languages, without considering that many of those lists were developed for very particular use cases, and might, for instance, remove all words about computers, along with the more-expected prepositions.

Your stopword list should be influenced by other changes you make to your text. In general, stopword lists are all lower case, due to the lower-casing that is typically part of the text analysis process. If you lemmatize your text (as described below), you won't need to include every possible form of pronouns: just the lemma. If you don't plan to lemmatize your text before the stopword list is applied, you'll need to work through every number, gender and/or case of undesired pronouns, adjectives, verbs and so forth, to ensure they are all excluded. Remember, these methods are matching, character-for-character, what you put on the list, and including the dictionary form of a word does not by extension include all conjugations, declensions or other variant forms.

Lower-casing

Capital letters and lower-case letters, in bicameral writing systems (those that have the concept of capitalization, unlike Japanese, Hebrew, Georgian or Korean), are different characters from the point of view of text analysis algorithms. Dad, dad and Sad are all treated as separate words, where the latter two are both parsed as having a different first letter from the first. To address this issue, texts are commonly 'lower-cased', or converted to all lower-case characters, before they are further processed with stopword removal or used for analysis. Most text analysis tools (e.g. with graphical user interfaces, like Voyant and the Topic Modeling Tool) handle this automatically, even for non-Latin alphabets. If you're writing analysis code yourself, don't forget this step.

Punctuation removal

What we easily recognize as punctuation is just another character from the point of view of most algorithms. This leads to problems when the following are all treated as different 'words':

• cats
• "cats
• "cats,
• (cats)
• cats!
• cats!!
• cats?!
• cats.

Some tools automatically remove punctuation as part of pre-processing, some tools include punctuation on the stopwords list and others require you to remove it from the text yourself. For tools that remove punctuation automatically, you should check to make sure that all the punctuation present in your language is being removed successfully. Punctuation removal may be based on English, so punctuation not found in English (such as « » or 「 」, the Russian and Japanese quotation marks, respectively) may not be included. Running the text through a tokenizer algorithm (such as the one provided by the Stanford NLP library for Python, which currently supports fifty-three languages) can also separate punctuation from text, but may make other changes you haven't anticipated. For instance, in English, a contraction like 'she's' gets split into two 'words', she and 's, which is a reasonable choice reflecting the word's origins, but can lead to initial confusion when you discover the 'word' 's in the results of your analysis.
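If you're writing your own pre-processing code, both steps are short. Here is a minimal sketch that lower-cases a string and then strips punctuation with a Unicode-aware regular expression, so that non-English marks such as « » are removed as well; the sample string is mine:

    import re

    text = "«Кошки», (cats) and CATS!"

    lowered = text.lower()
    # \w matches letters and digits in any script and \s matches whitespace;
    # everything else (punctuation, quotation marks) is removed.
    cleaned = re.sub(r"[^\w\s]", "", lowered)
    print(cleaned)  # кошки cats and cats

Note that Python's .lower() handles non-Latin bicameral alphabets such as Cyrillic, and that the regular expression approach avoids hard-coding an English-only punctuation list.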
Lemmatizing

If you're working with a highly inflected language (i.e. if your language has multiple grammatical cases, or a complex verbal system where different persons and numbers have different forms), you may need to lemmatize your text to get meaningful results from any text analysis method. Lemmatization attempts to convert the word forms actually found in a text into their dictionary form. For languages with less inflection (including Romance languages), many scholars don't feel the need to lemmatize because some methods, such as topic modelling, end up successfully clustering together different forms of a word, even given a small amount of variation. It could be a worthwhile activity with students to compare text analysis results with and without lemmatization for these languages.

A lot of work goes into developing NLP code for lemmatizing text, and not all lemmatizers perform equally well on all kinds of text: the informal language of tweets and the formal language of newspapers are different, to say nothing of literary and historical language. English is by far the best-resourced language, given the longstanding academic and commercial interest in improving NLP tools for at least modern English. Many languages lack effective lemmatizers, or any lemmatizers at all. If there's no lemmatizer for the language that you want to work with, another possibility is to look for a stemmer. Stemmers are a shortcut to the same fundamental goal as lemmatizers: reducing variation within a text, in order to more effectively group similar words. Rather than replacing the word forms in a text with the proper dictionary form, a stemmer looks for patterns of letters to chop off at the beginning and/or end of words, to get to something similar to (but often distinct from) the root of the word. Stemmers don't effectively handle suppletive word forms (e.g. 'children' as a plural of 'child'), or other word forms that diverge from the usual grammatical 'rules', but they may work well enough to reduce overall variation in the word forms present in a text, if no lemmatizer is available. The truncated forms produced by a stemmer may, however, be harder to recognize and connect back to the original form when you're looking at the results of your analysis.

The current state-of-the-art (whatever state that may be) for lemmatizing most languages is usually not available through an easy-to-use tool: you should expect to use the command line and/or write code. As a few illustrative examples:

• For Russian, Yandex (the major Russian search engine) has released software called MyStem for lemmatizing Russian (https://yandex.ru/dev/mystem/). A wrapper is available that makes this code usable in Python, PyMyStem (https://github.com/nlpub/pymystem3); a usage sketch follows this list.
• For Basque, eustagger-lite (Ezeiza et al.; http://ixa2.si.ehu.es/eustagger/) processes text using the following steps: tokenization, segmentation, identifying grammatical part-of-speech, treatment of multiword expressions and morphosyntactic disambiguation.
• While the concept of lemmatization doesn't quite carry over to Korean grammar, the KoNLPy package (http://konlpy.org/en/latest/) can be used for some kinds of potentially helpful text pre-processing (Kim); a tutorial on using it for text pre-processing is available at https://lovit.github.io/nlp/2019/01/22/trained_kor_lemmatizer/.
• The Classical Languages Toolkit (http://cltk.org) provides lemmatization for Latin, Greek and Old French, with other languages under development.
• Lemmatization isn't enough for agglutinative languages such as Turkish, where very long words can be constructed by stringing together morphemes. The resulting complex words (e.g. Çekoslovakyalılaştıramadıklarımızdanmışsınız, 'you are reportedly one of those that we could not make Czechoslovakian') are rare, and therefore not ideal to use for word counts, but may consist of morphemes that are repeated with a frequency in the text that more closely resembles other languages' concept of a 'word'. Byte-pair encoding (Mao) is one algorithm that has been used as a reasonably effective shortcut to 'subword encoding' (similar to lemmatization, but for linguistic components smaller than a word, such as Turkish morphemes) without requiring tokenization or morphological analysis. Scholars have also worked on more nuanced, linguistically motivated segmentation using supervised morphological analysis as a way of addressing the challenges posed by agglutinative languages (Ataman et al.).
• Lemmatization isn't applicable to Chinese. (At the same time, see this discussion about attempts to decompose characters into radicals as if the radicals were lemmas: https://www.quora.com/Does-the-Chinese-language-have-concepts-of-lemmatization-and-stemming-just-as-English-has.)
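To make the effect of lemmatization concrete, here is the minimal sketch referenced in the Russian bullet above, using the PyMyStem wrapper; the sample sentence is mine, chosen to include two case forms of the same noun.

    from pymystem3 import Mystem  # pip install pymystem3

    mystem = Mystem()

    # кошку (accusative) and кошка (nominative) are forms of the same noun.
    text = "Я видел кошку, и кошка видела меня"

    # lemmatize() returns a list of tokens: lemmas plus whitespace and
    # punctuation; the filter below drops the pure-whitespace tokens.
    lemmas = [t for t in mystem.lemmatize(text) if t.strip()]
    print(lemmas)

    # Both кошку and кошка come back as the dictionary form кошка,
    # so frequency counts will now treat them as one word.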
Conclusion

Text preparation is essential for computational text analysis but how, specifically, you need to modify the text – and how best to go about doing that – will vary based on the research question, the method and the language. To even begin making sense of the output of computational text analysis, it is important to understand how the input text was processed, and to take precautions to ensure that default settings derived from English were not applied to languages with very different grammar or orthography.

Fortunately, there is a growing community of scholars working on computational text analysis, and other digital humanities methods, as applied to languages other than English. For scholars working with digital humanities methods, a community has begun to form around the mailing list and resources posted on the Multilingual DH website (https://www.multilingualdh.org), which is applying to become a special interest group of the Alliance of Digital Humanities Organizations. These resources, and their applications to digital humanities research as well as language pedagogy, continue to be refined, and self-identified 'newcomers' are welcome and encouraged to join the conversation.

Author Information

Quinn Dombrowski supports digitally-facilitated research in the Division of Literatures, Cultures & Languages at Stanford University in the USA. In addition to working on digital humanities projects for a wide variety of non-English languages, Quinn serves on the Global Outlook::DH executive board and leads Stanford's Textile Makerspace. Quinn's publications include "What Ever Happened to Project Bamboo?" about the failure of a digital humanities cyberinfrastructure initiative, "Drupal for Humanists", and "Crescat Graffiti, Vita Excolatur: Confessions of the University of Chicago" about library graffiti.

References

Anderson, Deborah. 'The Script Encoding Initiative, the Unicode Consortium, and the Character Encoding Process'. Signa, no. 6, April 2004. https://www.signographie.de/cms/upload/pdf/SIGNA_Anderson_SEI_1.0.pdf. Accessed 30 January 2020.

Ataman, Duygu, Matteo Negri, Marco Turchi and Marcello Federico. 'Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English'. Prague Bulletin of Mathematical Linguistics, vol. 108, no. 1, 2017, pp. 331–42. DOI: https://doi.org/10.1515/pralin-2017-0031

Cro, Melinda A. and Sarah K. Kearns. 'Developing a Process-Oriented, Inclusive Pedagogy: At the Intersection of Digital Humanities, Second Language Acquisition, and New Literacies'. Digital Humanities Quarterly, vol. 14, no. 1, 2020. http://www.digitalhumanities.org/dhq/vol/14/1/000443/000443.html. Accessed 30 April 2020.

Dombrowski, Quinn, Tassie Gniady and David Kloster. 'Introduction to Jupyter Notebooks'. The Programming Historian, 12 December 2019. https://programminghistorian.org/en/lessons/jupyter-notebooks. Accessed 30 January 2020. DOI: https://doi.org/10.46430/phen0087

Ezeiza, Nerea, Iñaki Alegria, Jose Maria Arriola, Ruben Urizar and Itziar Aduriz. 'Combining Stochastic and Rule-Based Methods for Disambiguation in Agglutinative Languages'. Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, vol. 1, 1998, pp. 380–4. DOI: https://doi.org/10.3115/980845.980909

Kim, Hyunjoong. 말뭉치를 이용한 한국어 용언 분석기 (Korean Lemmatizer), 22 January 2019. https://lovit.github.io/nlp/2019/01/22/trained_kor_lemmatizer/. Accessed 30 January 2020.

Mao, Lei. 'Byte Pair Encoding'. Lei Mao's Log Book, 2019. https://leimao.github.io/blog/Byte-Pair-Encoding/. Accessed 30 January 2020.

W3Techs. Distribution of character encodings among websites that use .ru. Updated 30 January 2020. https://w3techs.com/technologies/segmentation/tld-ru-/character_encoding. Accessed 30 January 2020.
work_aeyo6w7szva7lgp53sg2u7jkma ---- Ergonomic Design of a Main Control Room of Radioactive Waste Facility Using Digital Human Simulation

Baekhee Lee (1), Yoon Chang (2), Kihyo Jung (3), Ilho Jung (4), and Heecheon You (1)
(1) Division of Mechanical and Industrial Engineering, Pohang University of Science and Technology, Pohang, South Korea
(2) Department of Production System, LG Electronics, Pyeongtaek, South Korea
(3) School of Industrial Engineering, University of Ulsan, Ulsan, South Korea
(4) Department of Nuclear, Power & Energy Plant Division, Hyundai Engineering, Seoul, South Korea

The present study evaluated a preliminary main control room (MCR) design of a radioactive waste facility using the JACK® digital human simulation system. Four digital humanoids (5th, 50th, 95th, and 99th percentiles) were used in the ergonomic evaluation. The first three were selected to represent 90% of the target population (Korean males aged 20 to 50 years) and the last to reflect the secular trend of stature for the next 20 years in South Korea. The preliminary MCR design was assessed by checking its compliance with the ergonomic guidelines specified in NUREG-0700 and conducting an in-depth ergonomic analysis with a digital prototype of the MCR design and the digital humanoids in terms of postural comfort, reachability, visibility, and clearance. For identified design problems, proper design changes and their validity were examined using JACK. The revised MCR design suggested in the present study would contribute to effective and safe operation of the MCR as well as operators' health in the workplace.

INTRODUCTION

A radioactive waste facility (RWF) is a facility for managing radioactive waste, which is usually a by-product of nuclear power generation and other applications of nuclear fission or nuclear technology, such as research and medicine. Most radioactive waste in South Korea has been stored in temporary facilities at nuclear power plants (NPPs), so the Korean government planned to establish an RWF in Gyeongju by the year 2012, considering the projected saturation of the temporary storage facilities (KRMC, 2009). The main control room (MCR) of the RWF needs to be designed with ergonomics in mind at the initial design stage, both for effective monitoring by operators and to reduce development cost. Hwang et al. (2009) analyzed three usability issues (the operating interface of the displays and controls in the MCR, the usability of procedures, and the layout of the MCR) through an ergonomic evaluation of the MCR. Ku et al.
(2007) evaluated and analyzed the MCRs of NPPs (units 1-4 of the Kori NPP and units 1-2 of the Yeonggwang NPP) by applying an ergonomic evaluation checklist as part of the periodic safety review (PSR). Evaluating an already developed MCR is effective for identifying design improvements; on the other hand, developing an improved MCR requires considerable time and cost. Therefore, ergonomic evaluation at the initial design stage is needed for effective MCR design and development.

Digital human simulations (DHS) using humanoids have been used for ergonomic design of workplaces. Lee et al. (2005) and Park et al. (2008) carried out ergonomic evaluations using DHS and analyzed design improvements of an overhead crane and a helicopter cockpit, respectively (Figure 1). Ergonomic design and evaluation using virtual mockups in DHS at the initial design stage has been recommended as a useful method for reducing development time and cost (Chaffin, 2005; You, 2007).

[Figure 1. Ergonomic evaluation using digital human simulation: (a) overhead crane, (b) helicopter cockpit]

The present study evaluated preliminary designs of the MCR of the RWF and analyzed design improvements. 3D virtual mockups of the MCR of the RWF were developed for use in DHS. We used JACK® for DHS and generated four representative human models (5th, 50th, 95th, and 99th percentiles) based on the anthropometric data of Size Korea (2004) and the secular trend of stature over the next 20 years. The preliminary designs of the MCR of the RWF were evaluated in terms of four ergonomic aspects (postural comfort, reachability, visibility, and clearance) and analyzed to identify design components needing improvement and the direction of improvement.

METHODS

Representative Human Models

Four representative human models were generated for ergonomic evaluation using DHS, considering an accommodation percentage of 90% (5th ~ 95th percentiles) for the target population and the secular trend of stature over the next 20 years. The target population, males aged 20 to 50, was determined considering workforce planning for the MCR of the RWF. Three representative human models (5th, 50th, and 95th percentiles) were generated to accommodate 90% of Size Korea (2004)'s anthropometric sample (n = 1,992). Additionally, a representative human model at the 99th percentile (equivalent to the 95th percentile over the next 20 years) was generated, reflecting three considerations (domestic stature growth of males, international stature growth of males, and conservative estimation of stature) to account for the secular trend. In the last 25 years (from 1979 to 2004), the stature of Korean males grew by 4.4 cm (Figure 2; Size Korea, 2004). On the other hand, secular growth has been reported to differ among nations according to economic conditions and nutrition (Roche, 1995). For example, the secular growth of Korea (GNP: $9,287) was 1.65 cm over the last 10 years, while that of Japan (GNP: $42,657; 4.5 times larger than Korea's) was 1.32 cm. Finally, we applied a conservative secular growth estimate to meet the target accommodation percentage (90%) for the MCR of the RWF over the next 20 years, based on domestic and international stature growth.

[Figure 2. Stature growth trend of males aged 20 ~ 50 (Size Korea, 2004)]

Humanoids in JACK were generated by inputting the body sizes of the generated representative human models, as shown in Figure 3. In JACK, input of 27 body sizes is needed to generate a humanoid; however, Size Korea (2004) provides only 24 of them. The present study therefore used the 24 body sizes provided by Size Korea (2004); the other 3 body sizes (hand breadth, head length, and thumb-tip reach) were estimated using JACK's regression equations based on the 24 body sizes.

[Figure 3. Generated representative human models: 5th %ile (160.5 cm), 50th %ile (170.2 cm), 95th %ile (180.2 cm), 99th %ile (184.4 cm)]
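The percentile-and-estimation logic behind these models is easy to prototype outside JACK. Here is a minimal Python sketch of that logic (my addition); the sample data are randomly generated stand-ins loosely matching the reported statures, not the Size Korea measurements, and the regression is a generic least-squares line rather than JACK's actual equations.

    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-in sample of n = 1,992 statures (cm); synthetic, not Size Korea data.
    stature = rng.normal(170.2, 6.0, 1992)

    # Percentile statures for the representative human models.
    for p in (5, 50, 95):
        print(f"{p}th percentile stature: {np.percentile(stature, p):.1f} cm")

    # Adding a conservative secular-growth offset (illustrative value) turns the
    # 95th percentile into a "next 20 years" 99th-percentile-style model.
    print(f"95th percentile + 4.0 cm growth: {np.percentile(stature, 95) + 4.0:.1f} cm")

    # Estimating a missing body dimension (e.g., thumb-tip reach) from stature
    # with a least-squares line, in the spirit of JACK's regression equations.
    reach = 0.44 * stature + rng.normal(0.0, 1.5, stature.size)  # synthetic
    slope, intercept = np.polyfit(stature, reach, 1)
    print(f"estimated reach = {slope:.2f} * stature + {intercept:.2f}")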
Reference Posture for Evaluation

The present study established an operators' monitoring posture for DHS evaluation, referring to existing studies of computer workstation postures, as shown in Figure 4. The existing studies observed and analyzed reference postures at computer workstations (ANSI/HFES, 2007; Chaffin and Andersson, 1984; Grandjean et al., 1983; Salvendy, 1987). The reference posture shown in Figure 4 was chosen considering that the operator's posture for monitoring tasks in the MCR of the RWF is similar to postures at a computer workstation. For example, the degree of shoulder abduction was set to 13°, the median of the range (0 ~ 25°) recommended by Chaffin and Andersson (1984).

[Figure 4. Reference posture of operators in the MCR: (a) side view, (b) front view, with joint angles of 90°, 95°, 13°, 80°, 10°, and 35°]

Ergonomic Evaluation Criteria

The present study established a relationship matrix between four ergonomic evaluation criteria and seven design components in the MCR (Table 1). The ergonomic evaluation criteria were postural comfort, reachability, visibility, and clearance, which have been used in existing DHS studies (Bowman, 2001; Nelson, 2001; Park et al., 2008). The evaluation criteria were selectively applied to the target design components. For example, Table 1 shows that the console, at which the operator is seated, was evaluated using postural comfort and clearance, and the large display panel (LDP), which provides information about the RWF, was analyzed using postural comfort and visibility. The design components of the MCR of the RWF were evaluated against the NUREG-0700 design guidelines. The NUREG-0700 design guidelines (O'Hara et al., 2002) provide ergonomic design parameters for each design component in an NPP. For example, according to NUREG-0700, the console's clearance should provide adequate height, depth, and knee clearance for 5th to 95th percentile adults, the LDP's visibility should permit operators at the consoles a full view of all display panels, and the LCD's vertical viewing angle should not be more than 20° above and 40° below the operator's horizontal line of sight.

Table 1. Relationship matrix between ergonomic evaluation criteria and design components (O: related, X: not related)

No.  Design component                     Postural comfort  Reachability  Visibility  Clearance
1    Console                              O                 X             O           O
2    Large display panel (LDP)            O                 X             O           X
3    LCD                                  O                 X             O           X
4    Security access control sub-console  O                 O             X           X
5    CCTV master control rack             O                 O             X           X
6    Main fire control panel              O                 O             X           X
7    Printers                             O                 O             X           X
The console's minimum clearance, analyzed as 1.6 ~ 6 cm across the 4 humanoids, was evaluated as adequate in terms of NUREG-0700. Minimum clearance was calculated as the least distance between the operator's leg and the console. As the humanoid's body size increases, the console clearance decreases. For example, Figure 5 shows that the minimum clearances of the 95th and 99th percentiles were 3.5 cm and 1.6 cm, respectively.

Figure 5. Clearance of the console for the operator's upper leg: (a) 95th percentile, (b) 99th percentile

The LCD's vertical gaze range (VGR) was found to satisfy the NUREG-0700 design guideline. The LCD's VGR was calculated as the humanoid's vertical viewing angle when the humanoid, in the reference posture, monitored the top and bottom of the LCD. For example, as shown in Figure 6, the LCD VGRs of the 5th and 95th percentiles (5th percentile: -29 ~ 1°; 95th percentile: -34 ~ -4°) satisfied the -40 ~ 20° range recommended by the NUREG-0700 design guidelines.

The LDP's VGR could cause postural discomfort when operators monitor for a long time, because it lies above the horizontal line of sight (0°). The LDP's VGR was calculated as the humanoid, in the reference posture, monitored the top and bottom of the LDP, which was mounted above the LCD at a height of 125 cm. For example, as shown in Figure 7, the 5th percentile's LDP VGR (2 ~ 23°) was evaluated as adequate because it was formed over the top of the LCD (Figure 7.a).

Figure 6. Vertical gaze analysis, LCD: (a) 5th percentile, (b) 50th percentile, (c) 95th percentile, (d) 99th percentile
Figure 7. Vertical gaze analysis, LDP (125 cm): (a) 5th percentile, (b) 50th percentile, (c) 95th percentile, (d) 99th percentile

The LDP VGRs of all humanoids (-1 ~ 23°) met the NUREG-0700 design guideline that the LDP should permit operators at the consoles a full view (Figure 7). However, the current LDP design, which lies above the horizontal line (0°), could cause fatigue and postural discomfort during long monitoring tasks according to existing studies of recommended display gaze ranges (-26 ~ -2°, Grandjean et al., 1983; -56 ~ -1°, Kim et al., 1991; -40 ~ 20°, O'Hara et al., 2002). To improve the LDP's VGR by decreasing the LDP's height, the analysis showed that the LCD's height should be decreased along with the LDP's height. It was found that the LDP's VGR could be improved by reducing the LDP's height; however, interference between the LDP's and LCD's VGRs could then appear, as shown in Figure 8. To solve this interference effectively, we designed a groove located in the console, as shown in Figure 9. When the LDP's height became 115 cm through the LCD installation groove, which has a height of 10 cm, the LDP's VGR was improved to -3 ~ 19° (Figure 10). As a result, the improved LDP VGR in this study became lower than the existing LDP VGR (-1 ~ 23°). For example, the LDP VGR of the 5th percentile was improved from 2 ~ 23° to 0 ~ 19°. Meanwhile, the LCD's VGR (-31 ~ 2.5°) satisfied the NUREG-0700 design guideline (-40 ~ 20°) in the improved design.

Figure 8. Vertical gaze interference between LCD and LDP
Figure 9. Installation groove of LCD in console: (a) LCD installation groove, (b) installed LCD
Figure 10. Vertical gaze analysis, improved LDP (115 cm): (a) 5th percentile, (b) 50th percentile, (c) 95th percentile, (d) 99th percentile
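The vertical gaze ranges reported above are simple geometry: the angle from the eye point to the top and bottom edges of a display. A minimal sketch with hypothetical eye and display coordinates (not the JACK computation itself):

```python
import math

def vertical_gaze_range(eye_height_cm, horizontal_distance_cm,
                        display_bottom_cm, display_top_cm):
    """Vertical viewing angles (degrees) from the eye to the bottom and top edges
    of a display; positive = above the horizontal line of sight, negative = below."""
    def angle(dy):
        return math.degrees(math.atan2(dy, horizontal_distance_cm))
    return angle(display_bottom_cm - eye_height_cm), angle(display_top_cm - eye_height_cm)

# Hypothetical seated eye height and panel geometry (cm) for an LDP-like display:
print(vertical_gaze_range(eye_height_cm=120, horizontal_distance_cm=300,
                          display_bottom_cm=125, display_top_cm=245))
```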
The LDP's horizontal gaze range (HGR) was found to satisfy the NUREG-0700 design guideline that the operator's HGR should be within 30° from the center of the LDP. The MCR of the RWF is planned to be managed by an operator (operating the 7 consoles on the left) and a supervisor (operating the 3 consoles on the right), as shown in Figure 11. The LDP's HGR was calculated as the horizontal gaze interval when both the operator and the supervisor monitored the LDP's left and right points from the center of the LDP. The operator's and the supervisor's HGRs were analyzed as 12 ~ 27° and 14 ~ 26°, respectively, according to their assigned console positions.

Figure 11. Horizontal gaze analysis, LDP: (a) operator, (b) supervisor

DISCUSSION

The present study analyzed the preliminary design of the MCR of the RWF through ergonomic evaluation against the NUREG-0700 design guideline in a digital environment using JACK. The evaluation of the MCR of the RWF considered four ergonomic aspects (postural comfort, reachability, visibility, and clearance), the NPP design guidelines provided by NUREG-0700, and references related to ergonomic computer workstation design. For the design components that needed improvement, ergonomic solutions were developed and evaluated through digital human simulation to analyze the improvement effects. The improved preliminary design in this study can contribute to the MCR design of the RWF in the future.

The present study applied representative human models to create humanoids in JACK, considering Korean anthropometric characteristics and the secular trend of stature. Three representative human models were generated based on the demographic characteristics of the operators in the MCR of the RWF, to accommodate 90% (5th ~ 95th percentiles) of males aged 20 to 50 in Size Korea (2004). Additionally, one representative human model at the 99th percentile for the next 20 years was generated to reflect the secular trend of operator stature, based on Korean stature data from 1979 to 2004.

The present study used estimated body sizes for three anthropometric variables (hand breadth, head length, and thumb-tip reach) as provided by JACK; these variables are, however, highly correlated with other variables. JACK generates a humanoid from 27 input body sizes; body sizes not entered are estimated automatically. The present study conducted a post hoc analysis through stepwise regression (pin = 0.05, pout = 0.1) relating the 3 missing anthropometric variables to the other 24 anthropometric variables, using US Army anthropometric data (Gordon et al., 1988). As a result, the regression equations for the 3 missing anthropometric variables had high adjusted coefficients of multiple determination (adj. R2 = 52%, hand breadth; 83%, head length; 84%, thumb-tip reach).

The present study established the reference posture for evaluation based on the computer workstation postures provided by existing studies. However, the reference posture in the MCR could differ from recommended postures at a computer workstation (which has only one display), because more than two displays (LDP and LCD) are installed in the MCR. Therefore, consideration of the monitoring tasks for the LDP and LCD could be needed for a more appropriate evaluation of the MCR of the RWF.
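The stepwise selection mentioned in the Discussion (pin = 0.05, pout = 0.1) can be sketched as follows; this is a generic forward/backward procedure operating on a pandas DataFrame X of the 24 available variables and a target y (one of the 3 missing variables), not the authors' actual analysis script:

```python
import statsmodels.api as sm

def stepwise_ols(X, y, p_in=0.05, p_out=0.10):
    """Forward/backward stepwise OLS: add the most significant candidate while its
    p-value < p_in, drop selected predictors whose p-value rises above p_out.
    X: pandas DataFrame of candidate predictors; y: target variable."""
    selected = []
    while True:
        changed = False
        remaining = [c for c in X.columns if c not in selected]
        if remaining:
            pvals = {c: sm.OLS(y, sm.add_constant(X[selected + [c]])).fit().pvalues[c]
                     for c in remaining}
            best = min(pvals, key=pvals.get)
            if pvals[best] < p_in:
                selected.append(best)
                changed = True
        if selected:
            fit = sm.OLS(y, sm.add_constant(X[selected])).fit()
            worst = fit.pvalues[selected].idxmax()
            if fit.pvalues[worst] > p_out:
                selected.remove(worst)
                changed = True
        if not changed:
            return sm.OLS(y, sm.add_constant(X[selected])).fit()

# fit = stepwise_ols(X, y); fit.rsquared_adj would give the adjusted R² reported above.
```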
ACKNOWLEDGMENTS

This research was supported by Korea Power Engineering Company (KOPEC).

REFERENCES

ANSI/HFES (2007). Human Factors Engineering of Computer Workstations. California, USA: Human Factors and Ergonomics Society.
Arcaleni, E. (2006). Secular trend and regional differences in the stature of Italians, 1854-1980. Economics and Human Biology, 4, 24-38.
Bielicki, A. and Szklarska, A. (1999). Secular trends in stature in Poland: national and social class-specific. Annals of Human Biology, 26(3), 251-258.
Bowman, D. (2001). Using digital human modeling in a virtual heavy vehicle development environment. In Chaffin, D. B. (Ed.), Digital Human Modeling for Vehicle and Workplace Design. Warrendale, PA: SAE International.
Chaffin, D. B. (2005). Improving digital human modeling for proactive ergonomics in design. Ergonomics, 48(5), 478-491.
Chaffin, D. B. (2001). Digital Human Modeling for Vehicle and Workplace Design. Pennsylvania, USA: SAE International.
Chaffin, D. B. and Andersson, G. (1984). Occupational Biomechanics (2nd ed.). New York, USA: Wiley-Interscience.
Gordon, C. C., Bradtmiller, B., Churchill, T., Clauser, C., McConville, J., Tebbetts, I. and Walker, R. (1988). 1988 Anthropometric Survey of US Army Personnel: Methods and Summary Statistics (Technical Report NATICK/TR-89/044). Natick, MA: US Army Natick Research Center.
Grandjean, E. (1987). Ergonomics in Computerized Offices. Philadelphia, USA: Taylor & Francis.
Grandjean, E., Hunting, W. and Pidermann, M. (1983). VDT workstation design: Preferred settings and their effects. Human Factors, 25, 161-175.
Hedge, A. and Powers, J. A. (1995). Wrist postures while keyboarding: Effects of a negative slope keyboard system and full motion forearm supports. Ergonomics, 38, 508-517.
Hwang, S.-L., Liang, S.-F. M., Liu, T.-Y. Y., Yang, Y.-J., Chen, P.-Y. and Chuang, C.-F. (2009). Evaluation of human factors in interface design in main control rooms. Nuclear Engineering and Design, 239, 3069-3075.
Kim, C., Lee, N., Jang, M., and Kim, J. (1991). Research on Ergonomic Design and Evaluation Technology for VDT Workstation. Korea Research Institute of Standards and Science.
Korea Radioactive Waste Management Corporation (KRMC) (2009). Radioactive Waste. Retrieved August 21, 2009 from http://www.krmc.or.kr.
Ku, J., Jang, T., Lee, J., and Lee, Y. (2006). A review of Human Factors Criteria for the Main Control Room MMI in Nuclear Power Plants. In Proceedings of the 2006 Fall Conference of the Ergonomics Society of Korea.
Lee, S., Kwon, O., Park, J., Cho, Y., Lee, M., You, H., and Han, S. (2005). Development of a Workload Assessment Model for Overhead Crane Operation. In Proceedings of the 2005 Fall Conference of the Ergonomics Society of Korea.
NASA (2006). Man-system integration standards. Retrieved September 22, 2009 from http://msis.jsc.nasa.gov/Volume1.htm.
National Institute of Advanced Industrial Science and Technology (AIST) (2006). Secular change in Japan. Retrieved January 11, 2009, from http://www.dh.aist.go.jp/research/centered/anthropometry/secular.php.en.
Nelson, C. (2001). Anthropometric Analyses of Crew Interfaces and Component Accessibility for the International Space Station. In Chaffin, D. B. (Ed.), Digital Human Modeling for Vehicle and Workplace Design. Warrendale, PA: SAE International.
O'Hara, J. M., Brown, W. S., Lewis, P. M. and Persensky, J. J. (2002). Human-System Interface Design Review Guidelines (DC 20555-0001). U.S. Nuclear Regulatory Commission, Office of Nuclear Regulatory Research.
Park, J., Jung, K., Lee, W., Kang, B., Lee, J., Eom, J., Park, S., and You, H. (2008). Development of an Ergonomic Assessment Method of Helicopter Cockpit using Digital Human Simulation. In Proceedings of the 2008 Spring Conference of the Ergonomics Society of Korea.
Padez, C. and Johnston, F. (1999). Secular trends in male adult height 1904-1996 in relation to place of residence and parent's educational level in Portugal. Annals of Human Biology, 26(3), 287-298.
National Center for Health Statistics. Hyattsville, Maryland, 1995.
Size Korea (2004). Statistics of Korean anthropometry. Retrieved September 26, 2009 from http://sizekorea.kats.go.kr.
You, H. (2007). Digital Human Model Simulation for Ergonomic Design of Tangible Products and Workplaces. In Proceedings of the 2007 Fall Conference of the Ergonomics Society of Korea.

work_af7sa3spenctfit5v264bdayny ----

Mariana_Ou_INM380_2017
CITY, UNIVERSITY OF LONDON, MSC LIBRARY SCIENCE
INM 380 LIBRARIES & PUBLISHING IN AN INFORMATION SOCIETY, ERNESTO PRIEGO
MAY 2017, ASSIGNMENT OPTION 3
IDENTIFY THE MAIN WAYS IN WHICH TRANSFORMATIONS IN PUBLISHING ARE CHANGING THE WAY PEOPLE DO RESEARCH. WHAT ARE THE RELATIONSHIPS BETWEEN PUBLISHING AND DIGITAL SCHOLARSHIP? AND WHAT DO THESE RELATIONSHIPS MAKE POSSIBLE? WHAT ARE SOME CHALLENGES AND OPPORTUNITIES FOR PUBLISHERS AND/OR LIBRARIES IN THE CONTEXT OF THE NEW DEVELOPMENTS IN DIGITAL SCHOLARSHIP?
WORD COUNT 3499, INCLUDING TITLES; ESTIMATED READING TIME: 18 MIN

Mariana Strassacapa Ou

Publishing as Sharing: OBSERVATIONS FROM ORAL HISTORY PRACTICES IN THE DIGITAL HUMANITIES

Despite the evident general feeling that we experience an information deluge in our daily lives, whether ours is an 'information society' is a subject of great debate. The term implies that 'information' is the very defining aspect of today's society, rather than 'agriculture', for example (Bawden & Robinson, 2012); it also implies that at some point in the twentieth century a revolution took place, one that substituted a previous 'industrial society' for the current 'information society' as it fundamentally disrupted technologies and cultural practices related to human communication. Even though I am not convinced by the idea that we live in a 'new' kind of society, and rather prefer interpretations that identify all the continuities of modernism and capitalism developments through the last century, it is undeniable that recently, in the last decades, transformations in mediated communication have accelerated the production and dissemination of information enormously, increasing the complexity of ways people interact (Borgman et al., 2008). The widespread use of the Internet and the World Wide Web through cheap, personal digital computing devices is largely to blame for these profound transformations; the term 'digital', originally applied as synonymous with discrete electronic processing techniques, came to refer to anything related to computers, from electronics to social descriptors (digital divides, digital natives), to emerging fields of inquiry (digital art, digital physics) (Peters, 2016). 'Digital scholarship' fits the latter category; according to Christine Borgman, it 'encompasses the tools, services, and infrastructure that support research in any and all fields of study' (2013). Clearly this is a quite broad definition, but it does express the essential idea that scholarly practices and research opportunities have been widened through many new supporting ways. As I will argue here, a leading force defining digital scholarship has been the generalisation, in the digital milieu, of publishing as sharing.
'Sharing' as the new rhetoric of publishing

In the book Digital Keywords: A Vocabulary of Information Society & Culture, Nicholas John scrutinises the term 'sharing' in the meanings it has recently acquired through use in the digital realm. Non-metaphorically, John explains, to share is to divide, and at least from the sixteenth century it refers to the distribution of scarce resources; recently, though, it has also been attributed a more abstract communicative dimension: 'a category of speech, a type of talk, characterised by the qualities of openness and honesty, and commonly associated with the values and virtues of trust, reciprocity, equality, and intimacy, among others'; it has become 'the model for a digitally based readjustment of our interactions with things (sharing instead of owning) and with others' (John, 2016). Furthermore, 'sharing' would also mean a positive attitude with regards to future society; John talks in terms of the promise of sharing:

The promise of sharing is at least twofold. On the one hand, there is the promise of honest and open (computer-mediated) communication between individuals; the promise of knowledge of the self and of the other based on the verbalisation of our inner thoughts and feelings. On the other hand, there is the promise of improving what many hold to be an unjust state of affairs in the realms of both production and consumption; the promise of an end to alienation, exploitation, self-centred greed, and breathtaking wastefulness. (John, 2016)

Publishing after the digital boom, and specifically after the Internet and the World Wide Web took over a large share of our usual communication routines, has, I argue, a meaning that is becoming more and more intersected with that of 'sharing' we are referring to here. Digital publishing and 'sharing' are intertwined, as both follow a 'distributive logic' more sustainable than, and alternative to, capitalist models of production and consumption (John, 2016); publishing has had its definition widened, as well as its actors and subjects, and, just like 'sharing', it 'plays heavily on interpersonal relations, promising to introduce you to your neighbours, for instance, or to reinstate the sense of community that has been driven out by, say, the alienation supposedly typical of modern urban life' (John, 2016): it is now part of everybody's daily activities, and not just a specialised profession.

This new 'publishing as sharing' notion is in accordance with the new paradigm of openness in digital scholarship. Publishing processes had to be readapted, some of them radically, both to developments in digital technologies and to the pervasive digital 'sharing'; when it comes to academic publishing and research practices, that means 'open scholarship', as in making your research data available in a repository for consultation and reuse; 'open access', as in publishing free of charge academic articles that would initially be charged for in digital journals; and 'open dissemination', as in the idea behind institutional websites like the Oxford University Research Archive (two screenshots below), a friendly, searchable repository of research outputs, including many open-access articles.

In this essay, I use the debates on Oral History in the Digital Humanities to present some of the relationships between publishing and digital scholarship and their implications, as well as challenges and opportunities that should concern those involved in both publishing and library & information science.
NEW STANDARDS IN ORAL HISTORY: widening scholarship practices through digital publishing

The transformations in scholarship brought about by the universe of digital possibilities and the World Wide Web abound, but not many fields have been impacted as much as oral history. In the introduction to Oral history in the digital humanities: voice, access, and engagement (Boyd & Larson, 2014), the authors provide an overview of the developments in oral history and highlight how heavily they were influenced by the changing recording technologies of the last decades. If affordable and accessible new analogue technologies helped establish oral history as a compelling methodology for historical research in the 1960s, the transcript of the audio recordings still posed a great challenge from the library/archival perspective: as text, transcripts were considered a more efficient form of communication than the recording, easier to go through looking for specific bits of information; 'without the transcript, the archive might have no more information about an oral history interview on its shelves beyond a name, a date, and the association with a particular project', and oral history collections (of cassettes) were always under the threat of obscurity, with no perspective of use or discovery (Boyd & Larson, 2014). Digital technologies, however, came to solve not only these problems but also, with the World Wide Web, to give new and widened meanings to access; as the authors pointed out, 'Digital technologies posed numerous opportunities to explore new models for automating access and providing contextual frameworks to encourage more meaningful interactions with researchers as well as with community members represented by a particular oral history project'. In this essay, I present the main changes in publishing after the 'digital shift' (publishing = sharing) as we can identify them in oral history's new practices in research and dissemination:

1 • the 'democratic spirit'

Boyd & Larson talk about a 'democratic spirit' found in both oral history and the digital humanities as 'the sense that the materials created, shared, generated, or parsed belong to everyone—not just to the educated or the well-to-do, but to those outside the university walls as well as those within'. Indeed, oral historians are obviously interested in history 'from the bottom up', the kind that can be found and captured in common people's voices, and are thus characterised by a more 'democratic' approach to historical inquiry, one that assumes collective participation in the creation of materials; in combination with the digital humanities, this inclusion of people in the creation process extends also to people's access to these materials (Boyd & Larson, 2014); oral history's 'democratic' values and preconditions are enhanced and find fertile ground in digital publishing. As we can read from the Founding Statement of The Journal for MultiMedia History of the University at Albany, a website that used to publish oral history collections:

[it is] because so much of what we were doing as professional historians seemed so isolating that we wanted to "get out on the Web", to reach not only academicians, but an entire universe of interested readers.
We wanted to bring serious historical scholarship and pedagogy under the scrutiny of amateurs and professionals alike, to utilise the promise of digital technologies to expand history's boundaries, merge its forms, and promote and legitimate innovations in teaching and research that we saw emerging all around us (Zahavi & Zelizer, 1998)

I understand this 'democratic spirit', as Boyd & Larson put it, as a manifestation of one of the transitions in authorship in the digital realm, 'From Intellectual Property to the Gift Economy', suggested by Kathleen Fitzpatrick in her book Planned obsolescence: publishing, technology, and the future of the academy. If academics and publishers are to restore scholarly communication's origins and work towards genuinely open practices of producing and sharing academic content, she argues, then scholars must embrace the Creative Commons licenses for their work, 'thus defining for themselves the extent to which they want future scholars to be able to reuse and remix their texts, thereby both protecting their right to be credited as the author of their texts and contributing to a vibrant intellectual commons that will genuinely "promote the Progress of Science and useful Arts"' (Fitzpatrick, 2011; citing the U.S. Constitution).

Oral history research output has always been a complicated type of material in terms of authorship, ownership, and rights; whole collections cannot be made accessible because of copyright issues, e.g. when the interviewer is deceased and did not leave any documentation on the matter behind. Online, however, it is becoming more common to apply CC licenses to oral history interviews through the interviewees' consent forms; in the words of an oral historian, 'it clearly keeps the copyright in the hands of the oral history interview participant, but allows us to freely share the recording and transcript on our open-access public history website and library repository, where individuals and organisations may copy and circulate it, with credit to the original source' (Simpson, 2012). The 'democratic' solution seems to be already available for academics, but the challenge now is to promote the CC license as such; academic and librarian Jane Secker seems to be on the right track when she refers to 'copyright literacy' as closely related to information literacy, of concern to everyone who 'owns a device with access to the internet' (Secker, 2017).

2 • 'share your story': authorship, collaboration, crowdsourcing

Co-authorship in interviewing projects is nothing new, but collaborative work tends to become the norm when we consider oral history as related to and part of the digital humanities. If oral history has always been distinct from other practices in the humanities, since it often holds a certain complexity with regards to authorship (who is the author of an interview: the interviewer, the interviewee, or both? Or neither?), this complexity has been successfully embraced in the digital realm. With crowdsourced websites like StoryCorps.org and AntiEvictionMappingProject.net (below), anyone is encouraged to 'share their story' and take part as author of a larger narrative, comprised of the collection of stories that assemble an inconstant, growing whole.
Furthermore, as an oral history collection is published online and becomes a website, new roles that can arguably correspond to that of an author become essential: 'While there are always two (and sometimes more) participants in the initial recording of an oral history, I would argue that there are three primary players in the presentation and preservation of a digital oral history once it has been recorded—the oral historian, the collection manager, and the Information Technology (IT) specialist. These three roles may, in some programs, actually be represented by the same person, but there are specific concerns and responsibilities particular to each' (Schneider; in Boyd & Larson, 2014). In that sense, oral history is indeed in conformity with the basis of the digital humanities, understood in contrast to the essentially mono-authorial and monographic traditional processes and outputs of research in the humanities; as The DH Manifesto 2.0 states: 'Digital Humanities = Co-creation' (The Digital Humanities Manifesto 2.0, 2009; in Boyd & Larson, 2014).

This is not to say that the digital humanities have not been disruptive to previous practices in the humanities; on the contrary, it appears that the sciences have found continuity and enhancement of their procedures and methods in the digital realm, given that, as Gross & Harmon argue, in the sciences 'collaboration was already flourishing; the Internet greatly facilitated it, among not only networked scientists from around the globe but also armies of citizen-scientists participating through websites like GalaxyZoo' (Gross & Harmon, 2016). Knowledge in the humanities, in contrast, the authors argue, builds up as 'a chain of individual achievements. Even in the 21st century, collaboration in the humanities, though more common than previously, is not common at all. When it does occur, only two scholars are usually involved. There is a sense that these achievements ought to be individual.' The humanities seem to be lagging behind the sciences in terms of embracing the web's possibilities, as we can see from some online journals: The Oral History Review by Oxford Academic, for example, presents no audio recording files or any other interactive feature, just the traditional pdf, authorial, text-based article. Institutional digital publishing in the humanities would greatly benefit from more 'digital' explorations of content and linking, but that obviously involves difficult changes in well-established mindsets and practices with regards to the notion of the strong individual author and the acclaimed, recognition-providing, conventional text-based academic journal article.

3 • 'archive everything'

A habit that is being abandoned, thanks to the possibilities of digital archiving and storage, is that of getting rid of the audio recordings of oral history once they have been transcribed. Now, researchers are not only able to keep the audio recordings and their many versions and editions, but can also house and organise the interview collections using digital repositories and content management systems like CONTENTdm, and enhance access to the interviews with OHMS (Oral History Metadata Synchronizer), which connects search terms with the online audio or video (website screenshot below) (Boyd & Larson, 2014). Usability and discoverability issues are being sorted out by the 'archive everything' (Giannachi, 2016) trend that comes with publishing-as-sharing practices.
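To make the OHMS idea concrete: a synchronizer essentially stores keyword-to-timestamp links so that a search hit can seek the audio or video player to the right moment. A minimal sketch of such a time-coded index (a hypothetical structure, not OHMS's actual data format):

```python
# Hypothetical time-coded index: keywords mapped to positions (seconds) in a recording.
index = {
    "eviction": [75, 610],
    "neighbourhood": [132],
}

def seek_points(term: str) -> list:
    """Return mm:ss positions in the recording where a search term occurs."""
    return [f"{t // 60:02d}:{t % 60:02d}" for t in index.get(term.lower(), [])]

print(seek_points("eviction"))  # ['01:15', '10:10']
```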
The 'archive everything' paradigm is becoming such a norm in digital scholarship that Fitzpatrick talks about a 'database-driven scholarship', which refers to new kinds of research questions made possible through the online availability of collections of digital objects (Fitzpatrick, 2014). Nyhan & Flinn also mention a 'rubric' in the present research agenda of the digital humanities as one that looks back at humanities questions long asked and attempts to ask them in new ways, and to identify new questions that could not be conceived of or explored before (Nyhan & Flinn, 2016); academic digital datasets, databases and archives are largely responsible for, and the enablers of, these new opportunities. Gross & Harmon use a prize-winning monograph as an example of how current possibilities help 'historians see anew': Pohlandt-McCormick's research on the Soweto uprising uses 'photographs and official documents as an archive that can supplement, even interrogate the traditional historical archive. Her monograph contains 743 images and reproductions of some 200 written documents in all, a trove hard to imagine in a conventional book. These images and documents are reproduced in an "Archive" in her e-book, and select ones are integrated into the text and hyperlinked to supplementary information.' (Gross & Harmon, 2016).

Of course, database and archival academic websites are not just products of research; increasingly, they are made available as opportunities for other researchers to come up with new inquiries from them. That is one of the ideas behind making research data accessible as a requirement in journal publications; Gross & Harmon cite Science's stated policy as now typical: 'As a condition of publication, authors must agree to make available all data necessary to understand and assess the conclusions of the manuscript to any reader of Science'. With the 'archive everything' practices and the emergence of digital collections of data and documents comes the increasing significance of the activity of curation, meaning 'making arguments through objects as well as words, images, and sounds' (Digital Humanities Manifesto 2.0, 2009). For Fitzpatrick, curation relates to another shift in authorship that she identifies as 'from originality to remix':

We might, for instance, find our values shifting away from a sole focus on the production of unique, original new arguments and texts to consider instead curation as a valid form of scholarly activity, in which the work of authorship lies in the imaginative bringing together of multiple threads of discourse that originate elsewhere, a potentially energising form of argument via juxtaposition. (Fitzpatrick, 2011)

But just as difficult as establishing this kind of curation as legitimate academic work is enhancing the reusability of these valuable datasets and digital archives; merely requiring data sharing seems not to be enough. If we want to 'archive everything', discoverability and dissemination are essential, but they cannot happen without a solid institutional base and support: storage must be big, URLs must always work, metadata and indexing must be precise and efficient.
CONCLUSION: academic publishing should be about sharing

Layers of London is a project being undertaken in the University of London's Institute of Historical Research, funded by the Heritage Lottery Fund; it 'will bring together, for the first time, digitised heritage assets provided by key partners across London including: the British Library, London Metropolitan Archives, Historic England, The National Archives, MOLA. These will be linked in an innovative new website which will allow you to create and interact with many different layers of London's history from the Romans to the present day. The layers include historic maps, images of buildings, films as well as information about people who have lived and worked in London over the centuries.' (screenshot below) (Layers of London, 2017). It is still being developed at this moment, but it is working hard on its dissemination, as 'a major element of the project will be work with the public at borough level and city-wide, through crowd-sourcing, volunteer, schools and internship programmes. Everyone is invited to contribute material to the project by uploading materials relating to the history of any place in London. This may be an old photograph, a collection of transcribed letters, or the results of local research project' (Layers of London, 2017).

So, instead of an individual piece of historical research on London mapping that would traditionally be published as a textual product, Layers of London is an open, funded website being built in an academic institution as a platform for voluntary contributions; it has a blog, a twitter account, and, instead of an 'author', a team of director, development officer, administrator, and digital mapping advisor. It represents all the shifts in authorship proposed by Fitzpatrick: 'from product to process'; 'from individual to collaborative'; 'from originality to remix'; 'from intellectual property to the gift economy'; and 'from text to… something more' (Fitzpatrick, 2011); and just like contemporary oral history projects, its success will be 'measured by metrics pertaining to accessibility, discovery, engagement, usability, reuse, and … impact on both community and scholarship.' (Boyd & Larson, 2014). As an open digital humanities work that fully embraces the possibilities of the web, however, it faces all the challenges that this kind of academic digital publication usually does today, including gaining recognition that it even counts as academic research. Fitzpatrick points out: 'The key, as usual, will be convincing ourselves that this mode of work counts as work—that in the age of the network, the editorial or curatorial labor of bringing together texts and ideas might be worth as much as, perhaps even more than that, production of new texts.' (Fitzpatrick, 2011). This 'convincing ourselves' effort involves the difficult task of rethinking university practices and the academic career, which simply cannot afford to shy away from the disruptive impact of digital publishing as sharing. The humanities in particular have been trying to work this out with the digital humanities; according to Nyhan & Flinn, another 'rubric' of the DH 'has a distinct activist mission in that it looks at structures, relationships and processes that are typical of the modern university (for example, publication practices, knowledge creation and divisions between certain categories of staff and faculty) and questions how they may be reformed, re-explored or re-conceptualised.' (Nyhan & Flinn, 2016).
It must be a concern and responsibility of the university to establish and guarantee academic publishing as sharing, addressing today's unsustainable models of publishing and embracing the shifting, more open forms of scholarly communication and research; I agree with Fitzpatrick: 'Publishing the work of its faculty must be reconceived as a central element of the university's mission.' (Fitzpatrick, 2011). Librarians have significant roles to perform in this mission; the web is not a library, but librarians can help ensure it is used to its full potential as a world-wide networked communication system, and they can help to let publishing be about sharing.

REFERENCES

Antieviction Mapping Project: Documenting the dispossessions and resistance of SF Bay Area residents, (2014-2017). Home. [online] Available at: http://www.antievictionmap.com/#/we-are-here-stories-of-displacement-and-resistance/ [Accessed 02 May 2017].
Bawden, D. and Robinson, L. (2012). Introduction to Information Science. London: Facet.
Borgman, C. (2013). Digital scholarship and digital libraries: past, present, and future. Keynote Presentation, 17th International Conference on Theory and Practice of Digital Libraries, Valletta, Malta. Available at: http://works.bepress.com/borgman/273/ [Accessed 01 May 2017].
Borgman, C., Abelson, H., Dirks, L., Johnson, R., Koedinger, K., Linn, M., … Szalay, A. (2008). Fostering Learning in the Networked World: The Cyberlearning Opportunity and Challenge. National Science Foundation. Available at: https://www.nsf.gov/pubs/2008/nsf08204/nsf08204.pdf [Accessed 01 May 2017].
Boyd, D. and Larson, M. (2014). Introduction. In: Boyd, D. and Larson, M., eds., Oral history and digital humanities: voice, access, and engagement. New York: Palgrave Macmillan US.
The Digital Humanities Manifesto. (2009). [online] Available at: http://manifesto.humanities.ucla.edu/2009/05/29/the-digital-humanities-manifesto-20/ [Accessed 04 May 2017].
Dougherty, J. and Simpson, C. (2012). Who owns oral history? A creative commons solution. In: Boyd, D., Cohen, S., Rakerd, B. and Rehberger, D., eds., Oral history in the digital age. Institute of Library and Museum Services. Available at: http://ohda.matrix.msu.edu/2012/06/a-creative-commons-solution/ [Accessed 02 May 2017].
Giannachi, G. (2016). Archive everything: mapping the everyday. Cambridge, Massachusetts: The MIT Press.
Fitzpatrick, K. (2011). Planned obsolescence: publishing, technology, and the future of the academy. New York: New York University Press.
Gross, A. and Harmon, J. (2016). The Internet revolution in the sciences and humanities. 1st ed. New York: Oxford University Press.
John, N. (2016). Sharing. In: Peters, B., ed., Digital Keywords: A Vocabulary of Information Society & Culture. Princeton: Princeton University Press.
The Journal for MultiMedia History, (2000, 2001). Current issue. [online] Available at: http://www.albany.edu/jmmh/ [Accessed 01 May 2017].
Layers of London, (2017). Home.
[online] Available at: https://layersoflondon.blogs.sas.ac.uk [Accessed 03 May 2017].
Nyhan, J. and Flinn, A. (2016). Computation and the humanities: towards an oral history of digital humanities. Springer Open. DOI 10.1007/978-3-319-20170-2
Oral History Metadata Synchronizer: enhance access for free, (2017). Home. [online] Available at: http://www.oralhistoryonline.org [Accessed 01 May 2017].
Oxford University Research Archive, (2008). Home. [online] Available at: https://ora.ox.ac.uk [Accessed 03 May 2017].
Pohlandt-McCormick, H. (2002). 'I saw a nightmare…' Doing violence to memory: the Soweto uprising, June 16, 1976. [online] Columbia University Press and Gutenberg-e. Available at: http://www.gutenberg-e.org/pohlandt-mccormick/index.html [Accessed 03 May 2017].
Secker, J. (2017). Digital, information or copyright literacy for all? [Blog] Libraries, Information Literacy and E-learning: reflections from the digital age. Available at: https://janesecker.wordpress.com/2017/02/08/digital-information-or-copyright-literacy-for-all/ [Accessed 01 May 2017].
Schneider, W. (2014). Oral history in the age of digital possibilities. In: Boyd, D. and Larson, M., eds., Oral history and digital humanities: voice, access, and engagement. New York: Palgrave Macmillan US.
StoryCorps. (2003). Stories. [online] Available at: https://storycorps.org/listen/ [Accessed 03 May 2017].

work_afxycoctbrbcpkbkn7gub2qnpa ----

Raemy_Schneider_VKKS2019_quidproquo_AssigningPID_Art_Design

ASSIGNING PERSISTENT IDENTIFIERS TO ART AND DESIGN ENTITIES
Julien A. Raemy & René Schneider
Fourth Swiss Congress for Art History, VKKS, Mendrisio, 07.06.2019
Quid pro quo: linked data in art history research

Agenda
1. Introduction to Persistent identifiers (PIDs)
2. Cool URIs and PIDs
3. The rationale and main results of the ICOPAD project
4. ICOPAD possible follow-up project: INCIPIT

1. INTRODUCTION TO PIDS

Persistent identifiers (PID)
A persistent identifier is a long-lasting and biunique reference to a digital resource. It usually has two parts:
1. A unique identifier (to ensure the provenance of a digital resource)
2. A location for the resource over time (to ensure that the identifier resolves to the correct location)
(https://www.slideshare.net/AustralianNationalDataService/fsci-persistent-identifiers)

In order to…
- Create long lasting (not permanent) access
- Avoid error messages
(https://www.interserver.net/tips/kb/404-error-fix/)
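The two-part definition above is what a resolver implements: the published identifier stays stable while the location it points to can be maintained, which is how dead links (404s) are avoided. A minimal sketch of the mechanism (hypothetical identifiers and mapping; real resolvers such as the DOI or N2T services are full HTTP infrastructures):

```python
# A toy PID resolver: the identifier is permanent, the location is maintainable.
locations = {"ark:/99999/fk4example": "https://example.org/objects/42"}

def resolve(pid: str) -> str:
    """Return the current location for a PID (an HTTP resolver would 302-redirect)."""
    try:
        return locations[pid]
    except KeyError:
        raise LookupError(f"unknown identifier: {pid}")

# If the object moves, only the mapping changes; the published PID does not.
locations["ark:/99999/fk4example"] = "https://archive.example.org/objects/42"
print(resolve("ark:/99999/fk4example"))
```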
PIDs are essential and indispensable to create FAIR data.
F1 Principle: (meta)data are assigned a globally unique and eternally persistent identifier.
FAIRness: http://www.dit.ie/dsrh/data/fairdata/

PID ≠ PID
§ Publications
§ Data
§ Persons
§ Organisations
§ Citations
and more: (antibodies, fictitious characters, places, plants, e-books, …)

Persistence
« Persistence is not dependent on the identifier itself, but on legal, organisational and technical infrastructure. » (Hakala 2005)
[Diagram: http://andrew.treloar.net/research/diagrams/recording-to-archiving-architecture.jpg]

2. COOL URIS AND PIDS

Cool URIs
§ Cool URIs don't change: https://www.w3.org/Provider/Style/URI (Tim Berners-Lee, 1998)
§ Cool URIs for the Semantic Web: https://www.w3.org/TR/cooluris/ (W3C Interest Group Note, 2008)

PIDs and cool URIs (Bazzanella, Bortoli, Bouquet 2013)

Feature                     PIDs                     Cool URIs
Resolver                    YES                      NO
Authority                   YES                      NO
Naming authorities          YES                      NO
Level of trust              HIGH                     LOW
Policies                    YES                      NO
Persistence                 YES                      NO
Actionability of IDs        Partially                YES
Uniqueness                  YES                      NO
Content change              NO                       YES
Content negotiation         NO                       YES
Cross linkage               NO                       YES
Effort for implementation   HIGH                     LOW
Costs for users             Potentially HIGH         LOW
Sustainability issues       MANY                     FEW
Identified entities         Mainly digital objects   Everything
Bridge metadata             NO                       YES

Motivation
§ SARI
§ ICOPAD
PID / LOD – cool URIs

PIDs and LOD at the BnF
https://gallica.bnf.fr/ark:/12148/btv1b10542304w/f13.item

3. THE RATIONALE AND MAIN RESULTS OF THE ICOPAD PROJECT

ICOPAD Project (https://campus.hesge.ch/id_bilingue/projekte/icopad/index_fr.asp)
§ Identités de confiance pour les données de l'art et du design (ICOPAD) – June 2017 to December 2018
o Haute école de gestion de Genève (HEG-GE) – Instigator and Project Manager
o Zentralbibliothek Zürich (ZB) / Zurich Central Library
o Zürcher Hochschule der Künste (ZHdK) / Zurich University of the Arts
o Schweizerisches Institut für Kunstwissenschaft (SIK-ISEA) / Swiss Institute for Art Research
o Goal: feasibility of a suitable PID model (prototype)
o Requirements and workflow – link between research data and Linked Data based on PIDs
o Dedicated to the disciplines of art, design, and digital humanities to derive conjectures
o Transferability of the model to other disciplines

Swiss PID Landscape [diagram; ARK]

ICOPAD use cases from our project partners

Institution   Data set types/entities                                                      Needs
SIK-ISEA      Artists, artworks, dictionary entries                                        Diverse PIDs and links to normed data.
ZB            Digital surrogates                                                           Fine level of granularity.
ZHdK          Artists, artworks, events, films, glossary entries, projects, research data  Further development of applications such as eMuseum and Medienarchiv.
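The Gallica address shown above (https://gallica.bnf.fr/ark:/12148/btv1b10542304w/f13.item) illustrates the anatomy of an ARK: a Name Assigning Authority Number (NAAN, 12148 for the BnF), an opaque name, and an optional qualifier. A small parsing sketch (a hypothetical helper, not an official ARK library):

```python
import re

# ark:/NAAN/Name[/Qualifier] — e.g. the Gallica example above.
ARK_PATTERN = re.compile(r"ark:/(?P<naan>\d+)/(?P<name>[^/?#]+)(?:/(?P<qualifier>.+))?")

def parse_ark(uri: str) -> dict:
    """Extract NAAN, name and qualifier from an ARK embedded in a URL."""
    m = ARK_PATTERN.search(uri)
    if not m:
        raise ValueError(f"no ARK found in {uri!r}")
    return m.groupdict()

print(parse_ark("https://gallica.bnf.fr/ark:/12148/btv1b10542304w/f13.item"))
# {'naan': '12148', 'name': 'btv1b10542304w', 'qualifier': 'f13.item'}
```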
Approaches
- DOI:          C(doi) = x
- DOI + 1:      C(doi) = a
- DOI + n:      C(doi) = x_1, x_2, …, x_n
- DOI + 1 + LD: C(doi) = a, with a → owl:sameAs x_1, x_2, …, x_n
where a = ark (Archival Resource Key provided by the California Digital Library)

Archival Resource Key (ARK)
§ ARK identifiers are free
§ ARKs are built using a completely different theoretical model, consisting of a decentralised and domain- (i.e. DNS-) agnostic approach
§ ARKs make it easy to use LOD on top of them
§ ARKs can effortlessly be combined with other specifications such as the International Image Interoperability Framework (IIIF) canonical URI syntax
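The "DOI + 1 + LD" approach above can be expressed directly in RDF: one canonical ARK linked by owl:sameAs to the entity's other identifiers. A sketch with rdflib, using invented identifiers purely for illustration (this is not project code):

```python
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

g = Graph()

# One canonical ARK (a), linked to further PIDs (x_1 ... x_n) for the same entity.
ark = URIRef("https://n2t.net/ark:/99999/fk4example")
others = [
    URIRef("https://doi.org/10.9999/example"),          # hypothetical DOI
    URIRef("http://www.wikidata.org/entity/Q999999"),   # hypothetical Wikidata item
]
for x in others:
    g.add((ark, OWL.sameAs, x))

print(g.serialize(format="turtle"))
```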
Solution approaches
[Diagram: a Swiss PID Hub mediates ARK service requests; institutions such as DaSCH and Uni Basel send ARK requests; the CDL's existing ARK service and a multitude of PIDs feed in; an ARK service to create one's own PIDs acts as an attribution service if there is no DOI at ETH/FORS and the data are not archived at DaSCH.]

Conclusion
o PIDs are a key element in the research data management process and should be assigned to any entities as soon as possible
o Trusted identity, FAIR data
o PID for the Semantic Web is possible
o BnF platforms
o A large variety of PIDs → DOIs are not sufficient
o Most interesting complement: ARKs (LOD, free, decentralised, granularity, etc.)
o Need for an infrastructure/service in Switzerland
o National Hub that can mint ARKs

4. ICOPAD POSSIBLE FOLLOW-UP PROJECT: INCIPIT

INCIPIT
§ Infrastructure nationale d'un complément pour les identifiants pérennes, interopérables et traçables (INCIPIT)
§ Project submission (August 2019)
§ 3 phases:
1. Attribution service (by the end of 2019) – ArODES
2. Fusion of ArODES and SONAR (2020)
3. Creation of a Hub (2021)
Partners welcome (see you at Bits and Bites)!

Contacts
Julien A. Raemy, Research and Teaching Assistant in Information Science, julien.raemy@hesge.ch
René Schneider, Full Professor of Information Science, rene.schneider@hesge.ch

Bibliography
• BAZZANELLA, Barbara, BORTOLI, Stefano and BOUQUET, Paolo, 2013. Can persistent identifiers be cool? International Journal of Digital Curation. 14 June 2013. Vol. 8, no. 1, p. 14–28. DOI 10.2218/ijdc.v8i1.246.
• BERMÈS, Emmanuelle, 2006. Des identifiants pérennes pour les ressources numériques : l'expérience de la BnF [online]. Paris, France: Bibliothèque nationale de France. [Accessed 20 May 2019]. Available from: https://web.archive.org/web/20181006042857/http://www.bnf.fr/documents/ark_presentation_bermes_2006.pdf
• ESPASANDIN, Kate, JAQUET, Aurélie, LEFORT, Lise and SCHNEIDER, René (dir.), 2018. TRMASID 14: Panorama et modélisation d'identifiants pérennes pour la création d'identités de confiance [online]. Genève, Suisse: Haute école de gestion de Genève. [Accessed 20 May 2019]. Available from: https://doc.rero.ch/record/309479
• EU. DIRECTORATE-GENERAL FOR RESEARCH AND INNOVATION, 2018. KI-06-18-206-EN-N: Turning FAIR into reality. Final Report and Action Plan on FAIR Data [online]. Brussels, Belgium. [Accessed 20 May 2019]. Available from: https://doi.org/10.2777/1524
• HILSE, Hans-Werner and KOTHE, Jochen, 2006. Implementing persistent identifiers: overview of concepts, guidelines and recommendations. London: CERL. ISBN 978-90-6984-508-1.
• LA TRIBUNE DES ARCHIVISTES, 2018. Choisir des URL persistantes pour la mise en ligne de sa base de données : ARK pas à pas... La Tribune des Archivistes [online]. 21 October 2018. [Accessed 20 May 2019]. Available from: http://latribunedesarchives.blogspot.com/2018/10/choisir-des-url-persistantes-pour-la.html
• MEADOWS, Alice, 2017. PIDapalooza – the open festival for persistent identifiers. Insights. 8 November 2017. Vol. 30, no. 3, p. 161–164. DOI 10.1629/uksg.393.
• NICHOLAS, Nick, WARD, Nigel and BLINCO, Kerry, 2009. A policy checklist for enabling persistence of identifiers. D-Lib Magazine [online]. January 2009. Vol. 15, no. 1/2. [Accessed 20 May 2019]. DOI 10.1045/january2009-nicholas. Available from: http://www.dlib.org/dlib/january09/nicholas/01nicholas.html
• PEYRARD, Sébastien, KUNZE, John A. and TRAMONI, Jean-Philippe, 2014. The ARK Identifier Scheme: Lessons Learnt at the BnF and Questions Yet Unanswered. International Conference on Dublin Core and Metadata Applications. 8 October 2014. P. 83–94.
• PRONGUÉ, Nicolas and RAEMY, Julien A., 2017. Revue de la littérature : identifiants pérennes (PID), Linked Data, Données de la recherche [online]. Carouge, Suisse: Haute école de gestion de Genève. [Accessed 20 May 2019]. Available from: https://campus.hesge.ch/id_bilingue/projekte/icopad/doc/Prongue_Raemy_Revue_Litterature_2017.pdf
• RAEMY, Julien A., 2018. Identifiants pérennes (PID) : Processus d'obtention, mapping et approches d'attribution, modélisation, glossaire [online]. Carouge, Suisse: Haute école de gestion de Genève. [Accessed 20 May 2019]. Available from: https://campus.hesge.ch/id_bilingue/projekte/icopad/doc/Raemy_PID_Processus_Approches_Modelisation_2018.pdf
• SCHNEIDER, René and RAEMY, Julien A., 2019a. Résultats du projet ICOPAD. ID Bilingue [online]. February 2019. [Accessed 20 May 2019]. Available from: https://campus.hesge.ch/id_bilingue/projekte/icopad/results_fr.html
• SCHNEIDER, René and RAEMY, Julien A., 2019b. Towards Trusted Identities for Swiss Researchers and their Data. 14th International Digital Curation Conference (IDCC) [online]. Melbourne, Australia. 6 February 2019. [Accessed 20 May 2019]. Available from: https://doi.org/10.5281/zenodo.2415995
• VAN DE SOMPEL, Herbert, KLEIN, Martin and JONES, Shawn M., 2016. Persistent URIs Must Be Used To Be Persistent. arXiv:1602.09102 [cs] [online]. 29 February 2016. [Accessed 20 May 2019]. Available from: http://arxiv.org/abs/1602.09102

work_ai7mlk2lq5htzcv2pmbhjhhgiu ----

Ergonomic Assessment for DHM Simulations Facilitated by Sensor Data
Available online at www.sciencedirect.com
© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Procedia CIRP 41 (2016) 702-705. doi: 10.1016/j.procir.2015.12.098
48th CIRP Conference on Manufacturing Systems - CIRP CMS 2015

Ergonomic assessment for DHM simulations facilitated by sensor data
Dan Gläser, Lars Fritzsche, Sebastian Bauer, Vipin Jayan Sylaja
imk automotive GmbH, 09128 Chemnitz, Germany

Abstract

The digital factory with its innovative tools is experiencing increasing importance, not only in experimental but also in productive domains. One of these tools is the digital human model (DHM). In the field of production, the focus of using DHMs lies in the planning and evaluation of processes and products in terms of plausibility, productivity and ergonomics. Up to now, ergonomic assessment within DHM simulations has been mostly limited to static evaluations of reachability and postures. INTERACT is a running R&D project working on the main weak points of DHM software tools. The industry-driven requirements are mainly the reduction of input effort, the increase of movement quality and a quick and intuitive way to create simulation variations in a workshop environment. The utilization of sensor data to create high-quality simulations is another point of development. Next to the addressed improvement in productivity and plausibility, these latest advancements also enable automatic ergonomic assessments, including process-oriented standards like EAWS, OCRA and the NIOSH lifting index. The inclusion of these standards will allow a more holistic ergonomic assessment and therewith expand the fields of application in the industrial environment. This paper will give an insight into the latest developments and the performance of current implementations of automatic ergonomic assessment within digital human models.

Keywords: Ergonomic assessment for DHM simulations facilitated by sensor data

1. Introduction

The interactive nature and the flexibility are the main advantages of digital simulations. Especially in the environment of process planning for manual work tasks, where the classic methods have been using paper boxes as mock-ups and string to plan body postures and walking paths, the advantages of a virtual environment become clear. The creation of process variations within seconds, the exchange of objects in the workplace, or the shifting of tasks from one worker to another are just a few of many examples. Next to that, software systems possess the ability to measure precisely when it comes to path lengths, times or joint angles. Thus, the full incorporation of ergonomic assessment methods into DHM software tools may improve evaluation efficiency, objectivity and validity. Nevertheless, the simulation of manual processes and the ergonomic assessment of these processes have not been used widely in the past. The simulation of manual manufacturing processes has been very time-consuming work, since the definition of body postures and the motions in between had to be defined on the level of individual limbs and joints.
The massive time effort which has been needed hindered the digital human model as a technology from becoming the intuitive and interactive tool it could be. The INTERACT approach tries to focus explicitly on these weaknesses, to raise the digital human model onto a higher level of intuitiveness and interactiveness. This paper focuses mainly on the ergonomic assessment function of the INTERACT software prototype. In the following, the three included assessment methods EAWS, NIOSH lifting index and OCRA will be described, followed by the methodology and the implementation of the corresponding software modules.

2. Methodology

The automatic ergonomic assessment with the previously mentioned methods EAWS, NIOSH lifting index and OCRA requires a certain amount of information about the process:
- Body postures
- Handled loads
- Forces applied to the body
These parameters have to be analyzed discretely, to be able to assign the parameters to each other at every point of the process. The body posture is retrieved through the measuring of joint angles and/or distances of joints, limbs and body marks, as required by the relevant ergonomic assessment method. The information on the handled loads is retrieved from the geometry data, which includes information about the mass of the used geometry. Whether a load in the scene is handled is retrieved from an 'attached'/'detached' signal for the right, left or both hands. The forces are measured and interactively assigned to the process through sensor data. This can be done in advance of the simulation or interactively in the workshop environment. Next to that, it is possible to assign forces manually to individual processes. The three methods also allow the definition of 'extra points' for special ergonomic risks like throwback, sitting on hanging surfaces, walking on sticky floors, etc.

3. Ergonomic assessment modules

3.1. EAWS

The Ergonomic Assessment Work Sheet (EAWS) [1] is a widely used method in the German automotive industry. It is based on a holistic analysis of the work process, considering all executed work tasks in the context of a whole working day. EAWS is separated into five modules, which are assessed separately. The first module is related to body postures, which are assessed as static (duration > 4 sec.) or dynamic (freq. > 2/min.). A posture is only assessed if, during its occurrence, no significant force (> 40 N) or load (> 3 kg) is applied to the worker. If a relevant force or load occurs, the related parts of the process are assessed with the corresponding modules. A further module addresses the extra points, which can't be, or at least not easily, quantified within a 'standard' assessment. The last module is related to upper limb movements at high frequencies. This module results in an extra index, which is displayed separately. Due to its complex nature and focus on relatively difficult-to-observe body parts, such as the wrist, this module isn't used widely.

Fig. 1. Skeleton of the INTERACT avatar
Fig. 2. Graphic representation of hand location
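The per-frame analysis described in the methodology can be pictured as a stream of posture/load/force samples that the EAWS module classifies. The following simplified sketch uses a hypothetical data layout and encodes only the static-posture rule with the thresholds quoted above, not the full EAWS rules:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Frame:
    posture: str     # discretised posture class for this frame, e.g. "trunk_bent"
    load_kg: float   # mass currently attached to the hands
    force_n: float   # external action force in this frame

def static_postures(frames: List[Frame], dt: float,
                    min_duration_s: float = 4.0) -> List[Tuple[str, float]]:
    """Collect posture episodes that count as 'static' under the rule quoted above:
    duration > 4 s and no significant load (> 3 kg) or force (> 40 N) during the
    episode. (The dynamic rule, freq. > 2/min, is omitted in this sketch.)"""
    episodes, start = [], 0
    for i in range(1, len(frames) + 1):
        if i == len(frames) or frames[i].posture != frames[start].posture:
            chunk = frames[start:i]
            duration = len(chunk) * dt
            loaded = any(f.load_kg > 3.0 or f.force_n > 40.0 for f in chunk)
            if duration > min_duration_s and not loaded:
                episodes.append((frames[start].posture, duration))
            start = i
    return episodes
```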
3. Ergonomic assessment modules

3.1. EAWS

The Ergonomic Assessment Work Sheet (EAWS) [1] is a widely used method in the German automotive industry. It is based on a holistic analysis of the work process, considering all executed work tasks in the context of a whole working day. EAWS is separated into five modules, which are assessed separately. The first module is related to body postures, which are assessed as static (duration > 4 sec.) or dynamic (freq. > 2/min.). A posture is only assessed if, during its occurrence, no significant force (> 40 N) or load (> 3 kg) is applied to the worker. If a relevant force or load is occurring, the related parts of the process are assessed with the corresponding modules. A further module addresses the extra points, which cannot be, or at least not easily, quantified within a 'standard' assessment. The last module is related to upper limb movements at high frequencies. This module results in an extra index, which is displayed separately. Due to its complex nature and its focus on body parts that are relatively difficult to observe, such as the wrist, this module is not widely used.

3.2. NIOSH lifting index

The NIOSH lifting index (LI) is a standard assessment method for load handling and, together with OCRA, one of three ergonomic assessment tools that are part of the ISO 11228 standard and are therewith international standards [2]. The LI applies to lifting and lowering, without considering any walking or carrying in between. The result of the assessment – the lifting index – displays the quotient between the handled load and a recommended load for the reviewed task. The recommended load is calculated by the following equation, which combines a load constant (LC) with multipliers for the horizontal (HM) and vertical (VM) hand locations, the travel distance (DM), the angle of asymmetry (AM), the frequency and duration of lifting (FM) and the coupling (CM) between hands and object:

RWL = LC × HM × VM × DM × AM × FM × CM

Fig. 3. Equation to calculate the recommended weight for the NIOSH lifting index

3.3. OCRA

The OCRA system is a set of tools enabling different levels of risk assessment based on the desired specificity, variability and objectives [3]. As mentioned above, it is part of ISO 11228. OCRA consists of three modules: the OCRA Mini-Checklist, the OCRA Checklist and the OCRA Index. For an automatic assessment, the OCRA Index is the one that is used, because only the Index is developed to quantify the work-related exposure and risks on a detailed level. Like the NIOSH lifting index, the OCRA index is a quotient, of actual technical actions (ATA) to recommended technical actions (RTA). The definition of technical actions is shown in Fig. 4. The recommended technical actions are calculated from a number of multipliers covering the number of repetitive tasks per shift, force exertion, posture, recovery and an additional multiplier.

Fig. 4. Technical actions in OCRA
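Since both indices reduce to a simple quotient once their multipliers are known, the calculations can be sketched compactly. The following is an illustration under stated assumptions, not INTERACT code: the NIOSH multipliers use the standard metric form of the revised lifting equation, while the frequency (FM) and coupling (CM) multipliers, which come from published lookup tables, are passed in as plain numbers.

    # Illustrative sketch (not INTERACT code) of the NIOSH lifting index and
    # the OCRA index. NIOSH multipliers follow the standard metric formulas
    # of the revised lifting equation; FM and CM come from lookup tables.

    LC_KG = 23.0  # load constant of the revised NIOSH equation

    def niosh_rwl(h_cm, v_cm, d_cm, a_deg, fm, cm):
        """Recommended weight limit: RWL = LC * HM * VM * DM * AM * FM * CM."""
        hm = 25.0 / max(h_cm, 25.0)          # horizontal multiplier
        vm = 1.0 - 0.003 * abs(v_cm - 75.0)  # vertical multiplier
        dm = 0.82 + 4.5 / max(d_cm, 25.0)    # vertical travel distance multiplier
        am = 1.0 - 0.0032 * a_deg            # asymmetry multiplier
        return LC_KG * hm * vm * dm * am * fm * cm

    def niosh_li(load_kg, **lift):
        """Lifting index: handled load over the recommended load."""
        return load_kg / niosh_rwl(**lift)

    def ocra_index(ata, rta):
        """OCRA index: actual over recommended technical actions."""
        return ata / rta

    # Example: 12 kg lifted with hands 40 cm out and 30 cm up, 60 cm of
    # travel, 30 degrees of asymmetry, table values FM = 0.88, CM = 0.95.
    print(round(niosh_li(12.0, h_cm=40, v_cm=30, d_cm=60, a_deg=30,
                         fm=0.88, cm=0.95), 2))
    # prints 1.43, i.e. the handled load slightly exceeds the recommended load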
4. Results

All assessment tools have been analyzed with regard to the quantification and measurement of their input parameters. The current prototypes of the assessment tools contain only those parameters which are measurable within the INTERACT prototype's functionality. There is still a number of additional parameters which have to be put in manually, since they are not assigned to the process or the geometry yet. Some of these additional parameters are the coupling between hand and object during load handling, temperatures or vibration. The workflow for the development and implementation of the tools has been the same for all three methods: method analysis and preparation, GUI draft, program flow chart, implementation, validation through a test scenario.

4.1. EAWS

Besides the extra points, EAWS has been transferred into a fully automated assessment tool. The body postures are assessed in every frame of the simulation. The loads are retrieved from the masses which are assigned to the handled geometry, while forces are assigned to tasks via sensor data in the workshop. The results are displayed through the INTERACT GUI (see Fig. 5). On the right, the overall score is displayed, with the distribution of points into the several assessment modules: posture, action forces, load handling and extra points. The EAWS result is ranked in the three categories green (0-25 pts.), yellow (25-50 pts.) and red (>50 pts.), which indicate low risk, intermediate risk or an urgent need for adaptation of the working conditions. In the left part of the GUI, several detailed representations of the individual modules (posture, forces, loads) can be displayed according to the requirements of the user.

Fig. 5. GUI of the EAWS module

4.2. NIOSH lifting index

The NIOSH lifting index can be processed almost fully automatically, except for the coupling multiplier between hands and objects. In the long term, this parameter can be assigned directly to the geometry as meta-information. With this further improvement, the NIOSH lifting index will be available as a fully automatic assessment tool. It has to be mentioned that the NIOSH lifting index shows several weaknesses as a holistic assessment tool, since it only assesses lifting and lowering tasks and comes with a number of restrictions. For example, a switch of hands, sitting down, tool handling and other tasks are not allowed to be assessed.

4.3. OCRA

The OCRA method is suitable for an automatic assessment in principle, but there are several challenges coming with it. Not every technical action is irrevocably defined, which makes it difficult to determine them explicitly. For the identification of 'putting in/pulling out' it is necessary to be able to differentiate them from a simple 'moving'. For the technical action 'start-up' the software has to know whether a tool is manual or automatic and whether it requires the pressing of a start button or not. There are concepts for these problems to be solved, since most of the required information can be assigned either to objects or to processes in the future, but the current INTERACT prototype does not allow all of the required features to be implemented. Nevertheless, a tool is ready for a semi-automatic OCRA assessment, which requires some manual input (see Fig. 6).

Fig. 6. GUI draft for the OCRA assessment tool

5. Conclusion and discussion

With the automatic assessment based on the three process-oriented ergonomic assessment tools EAWS, NIOSH lifting index and OCRA, INTERACT makes a big contribution to promoting the work with digital human models for the ergonomic evaluation of processes in manufacturing. While all methods show the ability to be used automatically in a virtual environment, there are still problems to solve. Some parameters which are required by the methods are not yet part of the current virtual representations of products and processes. Properties like surface conditions, temperatures or vibrations are not yet assigned to virtual objects. The INTERACT project strengthens the idea that the targeted goals of higher efficiency, objectivity and validity in ergonomic assessment can be achieved with digital human modelling in the near future.

References

[1] Schaub K, Caragnano G, Britzke B, Bruder R. The European Assembly Worksheet. Theoretical Issues in Ergonomics Science. 2012. DOI: 10.1080/1463922X.2012.678283
[2] ISO 11228-1:2007
[3] ISO 11228-3:2007

work_ajlebeewlbelvegajsgyamggoq ---- Toward Sustainable Growth: Lessons Learned Through the Victorian Women Writers Project

Research Article. How to Cite: Borgo, Mary Elizabeth. 2017. "Toward Sustainable Growth: Lessons Learned Through the Victorian Women Writers Project." Digital Studies/Le champ numérique 7(1): 4, pp. 1-8, DOI: https://doi.org/10.16995/dscn.276. Published: 13 October 2017.
RESEARCH ARTICLE

Toward Sustainable Growth: Lessons Learned Through the Victorian Women Writers Project

Mary Elizabeth Borgo
Department of English, Indiana University, US
meborgo@umail.iu.edu

This case study offers strategies for TEI-based projects with limited funding. By focusing on the needs of our volunteers, the Victorian Women Writers Project has developed truly collaborative relationships with the project's partners. Contributions to the project's resources have grown out of digital humanities survey courses, literature classes, and independent work. The paper concludes with a brief sketch of our efforts to support continued work by rethinking our social media outreach and our online presence.

Keywords: TEI encoding; feminist DH; sustainability

By age 20, Juliana Horatia Ewing had published her first children's story in the Monthly Packet, "A Bit of Green" (1895). It features a selfish child who learns Christian charity by visiting with his father's patients, and exhibits all of the hallmarks of a typical Victorian children's story. While this sentimental tale was unremarkable
By assessing their needs, we have become better prepared to create and support mutually-beneficial partnerships. This learning process has shed light on logistical difficulties inherent in collaborative encoding projects, ultimately inspiring a more student-centered approach. Our first step towards sustainable growth was to identify potential contributors who had some familiarity with coding and with nineteenth-century texts. Since its inception, our project has been the result of close partnerships between faculty, students, and librarians at Indiana University. Perry Willett, then Head of Library Electronic Text Resource Service (LETRS), founded the project in 1995 after being approached by an undergraduate, Felix Jung, who requested additional resources to study Victorian poetry, a genre dominated by women. Through close collaboration with Donald Gray from the English department, the founders identified, encoded, and launched new digital editions of rare materials authored by women that had been largely overlooked in subscription-based services. After lying fallow for a few years, the project was revived in 2007 by Angela Courtney, IU’s English Literature Borgo: Toward Sustainable Growth 3 Librarian, and Michelle Dalmau, then Digital Projects Librarian. Their outreach efforts ultimately resulted in one of the first Digital Humanities courses taught at Indiana University in the fall of 2010. Co-teacher Joss Marsh, a Victorianist, and Adrianne Wadewitz, then a graduate student at IU, transformed the VWWP into a powerful pedagogical tool. Encoding texts for the project as part of course objectives gave students the opportunity to practice traditional editorial skills alongside emergent methodologies in the digital humanities (For more information about the project’s founding and development see Courtney et al. 2015). As a student in this course, I saw first-hand how digital preservation projects can lead to exponential professional growth, particularly at a graduate student level. Learning how to code through the VWWP gave me the advanced TEI skills needed for digital preservation projects. This experience laid the groundwork for building my own digital projects and contributing to others. By incorporating digital resource-building into my writing process, I have created publically-accessible versions of my dissertation research. This aspect of my work has made me a more competitive candidate for travel funding and research grants. When I assumed the role of managing editor of the VWWP in the spring of 2011, I did not yet know how formative digital humanities would be for my own approach to nineteenth-century literature, but I was (and still am) passionate about helping undergraduate and graduate students professionalize through their work with the VWWP. Since students have been a key facet of the project’s growth, we then looked for resources which would help us to expand our partnerships with students at a graduate level. Our research included identifying relevant models for classroom engagement. Many successful projects deliberately target the classroom as the primary site of contributions. The Victorian Web and the Map of Early Modern London, for example, includes entries written as part of daily class objectives. Graduate-level digital humanities courses taught at IU since the fall of 2010 include the VWWP, the Swinburne Project, and the Chymistry of Isaac Newton as part of a more general survey of DH projects. 
The courses taught in the Fall of 2014 and 2015 used Scalar to preserve the classes' work. Yet, the first class was a bit of an outlier in its focus on editorship and on TEI. By nature, digital humanities survey courses have little room for extended TEI-encoding projects. Since most students enroll in these courses without prior knowledge of XML encoding and TEI guidelines, it is difficult to devote a significant portion of the class to technical training.

Learning how to encode seemed to be the biggest logistical challenge for graduate volunteers. When coupled with the fact that most graduate students are also juggling teaching responsibilities and dissertations, devoting time to learning a coding language seems like a daunting task. Until there are institutional changes to dissertation criteria, it's difficult to convince graduate students to engage with digitization projects as an extension of their research because this kind of work is not needed to graduate. IU has taken steps toward changing this perception by modifying the language requirement of the Ph.D. to include code. Positioning TEI as a language prepares Ph.D. candidates like myself to engage with a broader range of critical work, much in the same way that one would grapple with criticism in German or French. As a language, TEI also shapes the way that an encoder interacts with the texts. In my own work, looking for place names has made me more attuned to the role of space in shaping narrative. Encoding creates an experience of close-reading a text that both prepares the text for digital publication and generates new interpretations of nineteenth-century material.
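To make this concrete: once place names are tagged, they become directly queryable data. The following minimal sketch (illustrative only, not the VWWP's actual workflow; the sample markup is invented) extracts every placeName element from a TEI document using Python's standard library:

    # A minimal sketch, not VWWP code: pulling tagged place names out of a
    # TEI document, the kind of spatial reading that encoding makes possible.
    import xml.etree.ElementTree as ET

    TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

    sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
      <text><body>
        <p>We left <placeName>London</placeName> at dawn and reached
        <placeName>Keswick</placeName> by nightfall.</p>
      </body></text>
    </TEI>"""

    root = ET.fromstring(sample)
    places = [p.text for p in root.findall(".//tei:placeName", TEI_NS)]
    print(places)  # ['London', 'Keswick']

Even this small example hints at why encoding place names sharpens attention to space: the markup turns an interpretive act into data that can be counted, mapped, or compared across a corpus.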
In order to better support work that combined editing with encoding, we had to cater encoding tasks to fit the requirements and time constraints of the classroom. This was a particularly daunting undertaking since many of the books in our current workflow span over 200 pages. With the help of teaching workshops offered through the Women Writers Project (Northeastern University) and the Digital Humanities Summer Institute (University of Victoria), we developed different strategies for sharing the work of encoding. In some cases, encoders complete only a portion of the text; while this is well-suited for short-term projects, it's challenging to maintain a level of continuity between each part (and among all of the texts in the repository). Since our encoders have found it easier to work with a whole text, we are gravitating toward adding shorter texts into our digitization workflow and toward dividing encoding tasks into phases. Having several encoders make multiple passes through a text increases chances for peer review and thus reduces the number of errors in the encoding.

As we worked on strategies to market encoding tasks to graduate students, we also considered expanding contributions to the project that did not require encoding. While this move does not help us expand our collection of TEI-encoded texts, it allows us to develop partnerships with undergraduate students and to increase our outreach efforts. Much to our delight, we were able to partner with Chris Hokanson at Judson College in the spring of 2012 in order to add supplemental scholarly material to the archive. As part of an undergraduate course on Victorian women's writing, Hokanson asked students to write brief scholarly biographies for authors in the collection. These submissions were then edited and encoded by the project's managers.

The greatest challenge that we face during the next phase of the project's development is not a logistical problem but an ethical one. Since the VWWP is, and will continue to be, an open-access resource, we lack the revenue generated by subscriptions. To further complicate matters, encoding a 300-page Victorian novel or writing a scholarly introduction to an obscure tract on suffrage requires a significant amount of time, energy, and expertise. We are morally obligated to compensate our contributors for their time, especially since their work requires advanced technical skills and knowledge of the subject material, but we are unable to financially reimburse the project's partners and thus must rely on the good will of contributors.

The citizen science model provides one way to address this issue. By simplifying tasks, projects like Science Gossip and Ancient Lives broaden the range of potential contributors. Because many hands make light work, labor-intensive projects like transcription can be accomplished in a fraction of the time. More importantly, these projects reward volunteer efforts by positioning contributors as shareholders in the final product. Clearly articulating the goals of the project gives citizen-scientists a better sense of how these small-scale tasks contribute to our understanding of history.

Citizen-science projects have helped us to re-evaluate our classroom model. As Emily Murphy and Shannon Smith have argued, teacher-apprentice models lend student projects focused structure, but they risk reinforcing traditional hierarchies rather than giving students opportunities to join the DH community (2015). The VWWP encourages its students to become what Murphy and Smith describe as the "scholar-citizen," a position which allows students to shape the project's content at both a textual and an encoding level. Graduate students in particular have worked with librarians and English department faculty to add new texts to the archive and to make emendations to the project's encoding guidelines. This collaboration between the VWWP's editorial board and contributors has resulted in TEI encoding which more accurately represents the material. From a feminist perspective, the scholar-citizen model adopted by the VWWP not only places women more centrally in the literary canon but also empowers women to be leaders in digital scholarship. Performing both encoding and editorial tasks has allowed junior scholars to actively participate in conversations about encoding best practices and archive-building.

Though our most dynamic periods of growth have stemmed from close partnerships with faculty, the opportunities to teach TEI encoding through the VWWP's texts are too few at IU to sustain the project's continued growth. In light of limited course offerings, we have explored options that extend beyond the classroom model. Contributors working independently of a class have allowed us to extend our pool of contributors beyond IU. These long-distance partners have revealed the need for more streamlined project guidelines and for continued support in the form of regular meetings to maintain momentum. For our particular project, contributors must find their work professionally and intellectually rewarding. Locating and digitizing texts which intersect with our contributors' research interests attracts a broader spectrum of students.
One of our most recent collaborators, Rachel Philbrick (Brown University), has been encoding Victorian classical scholarship as an extension of her dissertation research on ancient Greek literature. Since most of the graduate student encoders will be entering the job market soon, they are concerned that their contribution won't "count" as a publication. We have been working to create a more robust editorial review in order to add weight to their work with the project. Furthermore, we are developing surveys to track how the website is being used so that we can build stronger partnerships with those actively using the collection. We've also discussed at length how we can preserve the ownership and self-direction integral to the "citizen-scholar" model in non-encoding-based tasks, particularly at the undergraduate level. These strategies stem from undergraduate student-driven research projects. By offering students the option to work with the VWWP as part of professional writing courses, we've been working with undergraduates from marketing, business, and events management to create outreach events and internships. Thanks to Rachel Sharp, Evan Garthus, and Katelyn Kass, we will be hosting the 21st birthday party for the VWWP in Spring 2017. Research performed by two other groups has shown that students are looking for social media marketing experience. In response to this need, we will be offering a social media internship where students tweet, develop blog posts, and design marketing campaigns. Increasing our social media presence will help us to reach potential collaborators and identify projects with similar thematic foci.

By identifying our contributors' needs and finding models for sustainable growth, the VWWP has been developing new methods to expand TEI-based projects with limited funding. Catering project tasks to fulfill the professional and pedagogical objectives of our contributors has created partnerships which benefit volunteers and the project. As we move forward, we will continue to explore ways to support collaboration through coursework, through independent efforts, and through our online presence. In the years to come, we hope to attract an even more diverse range of contributors in order to foreground underrepresented voices in Victorian studies and digital scholarship.

Competing Interests

Mary is the Managing Editor of the Victorian Women Writers Project. There are no other competing interests.

References

Courtney, Angela, Arianne Hartsell-Grundy, et al. 2015. "Second Time Around; or The Long Life of the Victorian Women Writers Project." In Digital Humanities in the Library: Challenges and Opportunities for Subject Specialists, 263-75. Chicago: ACRL.
Ewing, Juliana Horatia. 1895. "A Bit of Green." In Melchior's Dream, 118-33. London: Society for Promoting Christian Knowledge.
Murphy, Emily, and Shannon Smith. 2015. "'Productive Failure' for Undergraduates: How to Cultivate Undergraduate Belonging and Citizenry in the Digital Humanities." Digital Pedagogy Institute - Improving the Student Experience. University of Toronto Scarborough, August 20.
work_alynjplhunbqnmppz32wm5fojm ---- What Ever Happened to Project Bamboo?

What Ever Happened to Project Bamboo?

Quinn Dombrowski
Research IT, UC Berkeley, Berkeley, CA 94720, USA
Literary and Linguistic Computing, Vol. 29, No. 3, 2014, pp. 326-339. doi:10.1093/llc/fqu026
Permalink: https://escholarship.org/uc/item/6jq660tm

Abstract

This paper charts the origins, trajectory, development, challenges, and conclusion of Project Bamboo, a humanities cyberinfrastructure initiative funded by the Andrew W. Mellon Foundation between 2008 and 2012, in order to enhance arts and humanities research through the development of infrastructure and support for shared technology services. Its planning phase brought together scholars, librarians, and IT staff from a wide range of institutions, in order to gain insight into the scholarly practices Bamboo would support, and to build a community of future developers and users for Bamboo's technical deliverables. From its inception, Bamboo struggled to define itself clearly and in a way that resonated with scholars, librarians, and IT staff alike. The early emphasis on a service-oriented architecture approach to supporting humanities research failed to connect with scholars, and the scope of Bamboo's ambitions expanded to include scholarly networking, sharing ideas and solutions, and demonstrating how digital tools and methodologies can be applied to research questions. Funding constraints for Bamboo's implementation phase led to the near-elimination of these community-oriented aspects of the project, but the lack of a shared vision that could supersede the individual interests of partner institutions resulted in a scope around which it was difficult to articulate a clear narrative. When Project Bamboo ended in 2012, it had failed to realize its most ambitious goals; this article explores the reasons for this, including technical approaches, communication difficulties, and challenges common to projects that bring together teams from different professional communities.

1 Introduction

Project Bamboo was a humanities cyberinfrastructure initiative funded by the Andrew W.
Mellon Foundation between 2008 and 2012, in order to enhance arts and humanities research through the development of infrastructure and support for shared technology services. In 2008, the Mellon Foundation funded a joint proposal for UC Berkeley and the University of Chicago to conduct a planning process that would gather feedback from scholars, librarians, and IT staff from a wide range of institutions, and build a community of future developers and users for Bamboo's technical deliverables. Where project staff anticipated 200 attendees representing 75 institutions, over 600 ultimately participated, representing more than 115 institutions.1

This article charts the origins, trajectory, development, challenges, and conclusion of Project Bamboo, from its initial funding through the months immediately following its conclusion. The article is an expansion of the author's presentation at Digital Humanities 2013, with the goal of providing background and context for further discussion within the digital humanities community about lessons that can be learned from this project.

Material for this article has been drawn from a number of sources, most prominently the public Bamboo wikis,2 supplemented by the author's own memory, that of colleagues, and email records.3 While this article largely deals with the facts of Project Bamboo, a layer of interpretation is inevitable, particularly as pertains to the factors contributing to the project's failure to realize its most ambitious goals. The conclusions drawn are the author's own, and neither a product of consensus among the participants nor an official statement on behalf of Project Bamboo, the University of Chicago, UC Berkeley, or the Mellon Foundation.

2 Origins

In the mid-2000s, discussions about cyberinfrastructure emerged in higher education IT circles, including EDUCAUSE and the Coalition for Networked Information (CNI). Future Bamboo project co-director Chad Kainz, then the senior director of Academic Technologies within the University of Chicago's central IT unit, saw a role for cyberinfrastructure, and what would come to be known as cloud computing, in addressing the following issues he had encountered while supporting digital humanities projects: (1) at least two-thirds of the time spent on typical humanities technology projects was spent on developing the technology rather than focusing on the scholarship, (2) many of the projects centered on either 'yet another database' or 'yet another website', and (3) the technologies that were ultimately created for the projects in question had been developed before, but for different contexts, thus 're-inventing the wheel' (Kainz, 2010).

At the 2006 EDUCAUSE Seminar on Academic Computing, Kainz discussed support for digital humanities with Chris Mackie, at that time an Associate Program Officer for the Research in Information Technology (RIT) program at the Mellon Foundation.
For Mackie, the issues that Kainz identified also led to frustrations for funding agencies: foundation funds were being directed toward the development of software that would likely not be reused and the creation and presentation of data that could spread no further than a single Web site or database, rather than substantively furthering humanities scholarship. Mackie encouraged Kainz to partner with David Greenbaum, the UC Berkeley Director of Data Services and future Bamboo co-director, to initiate a Mellon-funded project that would address these issues. Based on feedback from Mackie, Kainz and Greenbaum revised an initial technology development proposal into a community-driven technology planning project.

3 Bamboo planning project proposal

The Bamboo Planning Project proposal identified five key communities whose participation was seen as crucial for the project's success: humanities researchers, computer science researchers, information scientists, librarians, and campus technologists.4 Anticipating, if understating, the root of many of the challenges that would arise in the workshops, the proposal noted that '[e]ach community has distinctive practices, lingo, assumptions, and concerns; and clearly there is much diversity within each community as well' (Project Bamboo, 2008, p. 6). The proposal drew extensively on information and examples shared by 50 representatives of these five communities at UC Berkeley who attended an all-day focus group at the Townsend Center for the Humanities in November 2007. Perspectives from University of Chicago faculty and staff also contributed to the view of the then-current landscape of digital humanities depicted in the proposal. While both UC Berkeley and the University of Chicago are leading research institutions with strong programs in the humanities and a number of longstanding digital humanities projects (e.g. ARTFL at the University of Chicago, and the Sino-Tibetan Etymological Dictionary and Thesaurus at UC Berkeley), these projects were more the exception than the norm, and faculty members at these institutions were not highly involved in the leadership of large digital humanities organizations in 2007. As such, while the depiction of the digital humanities landscape in the proposal may have been accurate for some faculty members at research institutions, it reflected neither the experiences and concerns of many noteworthy digital humanists, nor those of scholars at small liberal arts schools, though both groups participated in Bamboo's workshops. This omission, while difficult to avoid at such an early stage, opened the project up to criticism.5

In the context of the Bamboo Planning Project, the role of the humanities scholar was to share information about methods, practices, and workflows, paying particular attention to 'pain points' and areas where current tools and services were inadequate. Technologists and librarians would then construct a proposal for the development of new services and underlying infrastructure to support scholarship in the humanities. The Bamboo planning proposal did not significantly treat the possibility that humanists might focus on needs that could not meaningfully be addressed through the development of technology.
The proposal specified the two models that the infrastructure and scholarly services would draw from: large enterprise SOA practices for scalability, management, cost-effectiveness, and long-term stability on one hand, and mash-ups, which emphasize ease, flexibility, and fast innovation, on the other (Project Bamboo, 2008, pp. 15-16).

The Bamboo planning proposal charted a direct path from the expression of scholarly practices6 within and across disciplines (in the first workshop) to systematizing those practices into defined scholarly workflows that could be used 'to derive commonalities and unique requirements related to practices, functions, barriers, needs, and existing and potential transformations at the disciplinary level', to developing 'a community-endorsed technology services roadmap for scholarship', along with organizational, staffing, and partnership models to support those services. It anticipated that 'arts and humanities scholars [would] begin to shape technology options by questioning impacts of potential technological choices, clarifying misinterpreted goals and ultimately co-determining a roadmap of goals to pursue, tools to provide, platforms on which to run, and architecture to use' (Project Bamboo, 2008, p. 24). SOA would play an increasingly prominent role as the workshops progressed.7

Between the workshops, participants would propose pilot projects that would be undertaken by Bamboo program staff. These pilot projects would 'be based on industry-accepted practice and open standards for a services-oriented architecture' and would 'present . . . a tangible expression of how services can function . . . facilitate understanding and critique . . . our process, as well as clarify our semantics and goals' (Project Bamboo, 2008, p. 28). According to the plan, by the end of the Bamboo Planning Project, the initial group of 200 participants from 75 institutions would be narrowed down to 30 participants from the 15 institutions that would move ahead with implementing a robust, scalable web services framework and a set of services that aligned with scholarly practice in the humanities, as defined by participating scholars. In reality, this plan changed dramatically when faced with the interests and priorities of actual humanities scholars.

4 Bamboo planning workshops

One of the hallmark traits of the Bamboo planning workshops was their flexibility: on more than one occasion, plans and agendas that had been painstakingly prepared over weeks were discarded and completely rewritten after a frustrating morning session. This began with the first iteration of workshop 1 (held in Berkeley, 28-30 April 2008). After high-level presentations on Bamboo, its approach, and its methodology, participants were asked to name abstracted scholarly practices (as verb + direct object), provide a description, identify applicable domains, cluster those practices, and then repeat the process for emerging scholarly practices, while scribes filled in an Excel spreadsheet template with different tabs for each exercise. Faculty participants were particularly turned off by the technical jargon in the presentations (including 'services', as commonly understood by IT staff), and the program staff's pushing for immediately abstracting
28---30, `` '' verbþdirect object ‘scholarly practices’ instead of facilitating a conversation about what scholars do. The spreadsheet was emblematic of the disconnect between the plan for workshop 1 and what scholars believed was needed, as it was unable to capture the narrative of their discussions. By the second day of the workshop, the exercises took on a less rigidly structured form, and this informed the process used with greater success in the subsequent three iterations of workshop 1.8 At the time, the incident at the first workshop 1 was largely interpreted as a tactical misstep, rather than the beginnings of a challenge to the entire premise and planned approach of Project Bamboo. After the completion of the workshop 1 series (28 April–16 July 2008), work continued as defined in the proposal: program staff aggregated the notes taken during the workshop 1 meetings, and distilled from that material a set of ‘themes of scholarly prac- tice’9 to present at workshop 2 (15–18 October 2008). Program staff also prepared and presented an introduction to SOA in the context of Bamboo, intended to link the themes of scholarly practice to the planning for future technical development that would be the focus of subsequent workshops. This approach to workshop 2 backfired. While developing the themes of scholarly practice, pro- gram staff had created accounts for over 400 work- shop 1 participants on the project wiki, anticipating that they would actively contribute to the process of theme distillation. The minimal uptake (six con- tributors, each making a few edits) was interpreted as a consequence of humanists being unaccustomed to using a wiki for scholarly discussions, com- pounded by the unintuitive interface of the Confluence wiki platform. In person, however, it quickly became clear that what scholars found unin- tuitive was the program staff’s approach of present- ing their livelihood back to them as a set of ‘scholarly practices’. Already frustrated by the seem- ingly purposeless decontextualization and misrepre- sentation of scholarship in the humanities, many workshop 2 attendees were not disposed to attempt to make sense of the technical language and the ‘wedding cake’ diagram used to present the SOA component of the project. In heated Q&A sessions, some participants went so far as to challenge the legitimacy of a cyberinfrastructure initiative for the humanities led by IT staff rather than by humanists themselves. During workshop 2, it became clear that ‘com- munity design’ could not simply mean that the community would deliberate the details of a web services framework. The community had spoken and made it clear that continuing to emphasize SOA would alienate the very members of the com- munity Bamboo was intended to benefit most: the scholars themselves. While a web services frame- work would continue to play an important role in the project, it was represented in only one or two of the six working groups 10 established at workshop 2. The other groups focused on topics drawn from the themes of scholarly practice, with the exception of ‘Stories’ (later renamed ‘Scholarly Narratives’), a last-minute addition to address concerns about the decontextualization inherent in the process of iden- tifying themes of scholarly practice. 
Participants were allowed to choose the working group in which they would participate, but the program staff strove to balance group membership, so that IT staff were not the only participants in Shared Services, librarians were not the only participants in the Tools & Content Partners, etc. Professional homogeneity within working groups would have made the discussions easier, but mixing up the membership was seen as a productive step toward developing a single community that bridged profes- sional divides, with a shared vision informed by a diverse range of perspectives. After workshop 2, working groups focused on specific needs, opportunities, and challenges for Bamboo in relation to their working group topic. Working group findings were presented and discussed at workshop 311 (12–14 January 2009), along with a straw proposal outline12 and straw con- sortial model.13 The straw proposal outline intro- duced the idea that the Bamboo Implementation Project would be a 7–10 year endeavor that would need to be split into two phases. The straw proposal outline did not attempt to prioritize the foci of the different working groups, treating them all as part of the first phase (2010–2012). The resulting highly am- bitious scope drew criticism from workshop at- tendees, who also noted the lack of specifics about What Ever Happened to Project Bamboo? Literary and Linguistic Computing, Vol. 29, No. 3, 2014 329 `` '' felt . 28 -- 16, `` '' 15- 18 , services-oriented architecture ' staff's `` '' `` '' service-oriented architecture `` '' , `` '' `` '' weren't weren't s 12-14, . - - what exactly Bamboo would do, and the lack of defined criteria for success.14 At workshop 4 (16–18 April 2009), the Bamboo staff presented a more detailed articulation of a ‘Bamboo Program Document’,15 which outlined the 7–10 year vision and defined the activities to be carried out in the first development phase. The major activities for Bamboo were divided into three areas, with the first two major areas slated for im- plementation in the first phase16: (1) The Forum (a) Scholarly Network (b) Scholarly Narratives (c) Recipes (workflows) (d) Tools and Content Guide (e) Other Educational and Curricular Materials (f) Bamboo Community Environment(s) (2) The Cloud (a) Services Atlas (b) Bamboo Exchange (c) Shared Services Lifecycle (d) Tool and Application Alignment Partnerships (e) Content Interoperability Partnerships (3) Bamboo Labs (a) Diversity, Innovation, and Labs (b) Ecosystem of Projects and Initiatives (c) Structure (Explore, Plan, and Build) (d) Liaisons (e) Governance While the workshop discussion draft of the pro- gram document had already benefited from two rounds of asynchronous feedback from participants, concerns remained about the lack of specificity in each of these areas.17 However, this did not hinder participants from expressing their enthusiasm for the areas of work proposed for the first phase of development. Grouped by institution, participants voted on each sub-area of the ‘Forum’ and the ‘Cloud’, to indicate interest (none/low/medium/ high/potential leadership).18 Every topic except Tools and Content Guide had at least one potential leader, and Content Interoperability (CI) Partnerships, Services Atlas, and Scholarly Network all received a significant number of ‘high’ votes. Workshop 5 (17–19 June 2009) featured presen- tations of demonstrator projects19 and discussions of the draft Bamboo Implementation Proposal20 in- tended to be submitted to the Mellon Foundation that fall. 
The proposal, as discussed at the workshop, had the following major areas of work21:

(1) Scholarly Networking - comprising the earlier Scholarly Networking and Bamboo Exchange from the program document.
(2) Bamboo Atlas - comprising Scholarly Narratives, Recipes (workflow), Tool and Content Guide, Educational and Curricular Materials, and Services Atlas from the program document.22
(3) Bamboo Services Platform - the major area of technical development for the project, comprising Tool and Application Alignment Partnerships, CI Partnerships, and Shared Services Lifecycle from the program document.

At workshop 5, the participants (comprising 43% arts and humanities faculty, 41% technologists, and 12% 'content partners', primarily librarians and archivists) were asked to vote (yes/no/abstain) on these areas of work. Participants overwhelmingly voted yes on all three,23 while a handful of abstainers continued to voice strong concerns about scope,24 particularly with regards to the Bamboo Atlas.

5 Bamboo implementation proposal

During the summer and fall of 2009, the Bamboo program staff engaged in an iterative feedback process with Chris Mackie from the Mellon Foundation on the proposal that developed out of workshop 5. The program staff intended to submit the proposal to the Mellon Foundation by the end of 2009, for consideration at the Mellon Board meeting in March 2010, with work beginning shortly thereafter. Instead, an organizational restructuring at the Mellon Foundation in December 2009 brought Bamboo proposal development to a halt. In this restructuring, the Mellon Foundation merged the RIT program that funded Bamboo into the
The resulting Bamboo implementation proposal more closely resembled the one suggested by the SOA-oriented planning project proposal than the document dis- cussed at workshop 5. Even as the project’s scope contracted through the elimination of almost all of the community-oriented aspects, it expanded in other ways. Two new areas of work that had previ- ously received minimal attention were ‘work spaces’—virtual research environments intended to provide basic content management capabilities and/ or access to the tools on the services platform—and planning and design work for Corpora Space, ‘applications that will allow scholars to work on dispersed digital corpora using a broad range of powerful research tools and services’ (Project Bamboo, 2010, p. 11). Corpora Space was to be built on top of the Bamboo infrastructure during a subsequent technical development phase. In the Bamboo implementation proposal, UC Berkeley alone served as managing partner, with nine other universities contributing to the project: Australian National University, Indiana University, Northwestern, Tufts, University of Chicago, Univer- sity of Illinois—Urbana-Champaign, University of Maryland, Oxford, and University of Wisconsin— Madison. The University of Chicago PI for the Bamboo Planning Proposal, vice president and CIO Greg Jackson, left that institution in August 2009, followed by Chad Kainz, Bamboo Planning Project co-director, a year later. None of the Chi- cago-based staff who were actively involved in the management of the planning process reprised those roles in the implementation phase. In addition, UC Berkeley hired a new project manager, and had to develop new relationships with staff at the Univer- sities of Wisconsin and Maryland who took on areas of the project that Chicago had previously managed. These staffing changes led to a loss of the project’s organizational memory, which had particularly negative consequences for the message and tone of the project’s communication with scholarly communities. 6 Bamboo technology project It remains difficult to articulate succinctly what Project Bamboo was, without either resorting to barely informative generalities (‘humanities cyberin- frastructure, particularly for working with textual corpora’) or a list of the areas of work. The project struggled to identify a coherent vision that neatly encapsulated all the work being done in the name of Bamboo, or to clearly describe what future state the work would collectively realize. The lack of a shared vision was compounded by the staffing model for the different areas. Most institutions focused on one area or subarea, giving them little exposure to the work going on elsewhere in the pro- ject. Unlike the planning project working groups, What Ever Happened to Project Bamboo? Literary and Linguistic Computing, Vol. 29, No. 3, 2014 331 . six s `` '' `` '' `` '' -- -- -- -- a year later . - where membership represented a mix of scholars, technologists, and librarians, the different areas of the Bamboo technology project were each staffed by the ‘usual suspects’—technologists focusing on shared services and work spaces, librarians focusing on interoperability, and scholars focusing on Corpora Space. This arrangement helped lead to a sense of mutual mistrust among the different groups26—not atypical in project development,27 but corrosive nonetheless. Effective communication with scholarly and pro- fessional communities was never one of Project Bamboo’s greatest strengths. 
Even during the plan- ning project, most activity took place on a public wiki whose complex organization was a barrier to access. The news feed on the project Web site had always been updated sporadically, but the complete lack of updates to the public Web site between August 2010 and April 2011—a period including the first 6 months of the 18 month technology project—fueled confusion and doubt about what, if anything, Bamboo was doing. Once periodic com- munication resumed in April 2011 with the launch of a new rebranded Web site, the lack of a clear shared vision became more apparent, as did the challenges of having such a widely distributed pro- ject team; some areas of the project received much more visibility than others. Outside observers’ com- bined uncertainty and lack of agreement about what Bamboo was doing were detrimental to the project’s reputation, to the point where it became a source of concern for the project staff and Mellon Foundation alike. Nonetheless, a considerable amount of technical development and planning work took place under the auspices of the Bamboo Implementation Project between 2010 and 2012. Major accomplishments included the following: � Development of identity and access management (IAM) services,28 which also made possible ac- count linking (e.g. of a user’s university and Google accounts). � Development of a CI hub29 that normalized texts using the Bamboo Book Model.30 � Development of utility and scholarly services,31 and their deployment along with IAM services on a centrally hosted Bamboo Services Platform.32 � Investigation of HUBzero, Alfresco ECM, and the OpenSocial API as platforms for ‘work spaces’ or research environments for scholars33 that could be integrated with the Bamboo Services Platform. � Partnering with the long-running Digital Research Tools (DiRT) wiki to develop Bamboo DiRT (http://dirt.projectbamboo.org), which would serve as Bamboo’s ‘Shared Tools and Services Information Registry’. � The Corpora Space design process, where huma- nities scholars and tool developers conceptua- lized a set of applications that would allow scholars to work on dispersed digital corpora using a broad range of powerful research tools and services.34 7 The end of Project Bamboo Between December 2011 and December 2012, the UC Berkeley Bamboo program staff drafted two nearly complete proposals for a second development phase. The first, written in partnership with teams at the University of Wisconsin and the University of Maryland, directly followed from the Corpora Space planning process. The proposal was abandoned in June 2012, after it became clear that insufficient re- sources would be available. When the Mellon Foundation’s technical review of Bamboo empha- sized Bamboo’s place as an infrastructure project (rather than an application development project), Berkeley started over on a new proposal in that spirit. The new version, developed with a team from Tufts, focused on extending the infrastructure and demonstrating its utility through a ‘Classical philology reference implementation’. On 13 December 2012, days before the anticipated final submission, the Mellon Foundation declined to move ahead with inviting the Bamboo proposal, citing the project’s track record of failing to define itself or achieve adoption for its code, the fact that it had not retained its partners, as well as dissatisfac- tion with the proposal itself. 
The Mellon Foundation requested that the team bring the pro- ject to a close, with an eye toward making the pro- ject’s legacy visible to and usable by others. Q. Dombrowski 332 Literary and Linguistic Computing, Vol. 29, No. 3, 2014 `` '' , w six , , - project's , Content Interoperability ( ) , - `` '' DiRT ( ) http://dirt.projectbamboo.org `` '' - `` '' 13, s Between January and March 2013, the remaining Bamboo staff worked with partners to develop and publish a documentation wiki that would serve as a sort of ‘reliquary’ for the project, alongside the code repository, issue tracker, the archived Web site, email lists, and social media accounts. Respecting the Mellon Foundation’s preferences, the Bamboo staff never publicly announced that Bamboo was over. Word simply spread informally and un- evenly35 beyond the notification of project partners, until the day when the Web site was replaced by the reliquary. 8 Bamboo’s afterlife Some of the components of Bamboo are still in use in other contexts. 8.1 Perseids The Perseids project at the Perseus Digital Library (http://www.perseus.tufts.edu/hopper/) integrates a variety of open-source tools and services to provide a platform for collaborative editing and annotation of classical texts and related objects. An instance of the Bamboo Services Platform is deployed as part of Perseids to provide access to the Tufts Morphology and Annotation Services, and the supporting Cache and Notification Services developed at Berkeley. Under new funding from the Mellon Foundation, Perseids developers will be exploring approaches, including those offered by Bamboo IAM compo- nents, for enabling the platform to better support cross-project and cross-institution collaboration. In addition, the Perseus Digital Library is currently exploring the viability of the Bamboo IAM infra- structure to support a centralized user model for the Perseus ecosystem of distributed applications and services. 8.2 CIFER Designs and technologies for account linking (part of Bamboo’s IAM work) have become the acknowl- edged basis of several items on the development roadmap for Community Identity Framework for Education and Research (CIFER, http://www.cifer project.org/), a collaborative effort across a large number of research institutions and consortia to provide an ‘agile, comprehensive, federation- and cloud-ready IAM solution suite’. 8.3 DiRT directory In October 2013, the Mellon Foundation funded a proposal for additional work on Bamboo DiRT, which would be rebranded as the DiRT directory. This new project included the development of an API that will facilitate data sharing with other digital humanities directories and community sites, includ- ing DHCommons (http://dhcommons.org) and the Commons-In-A-Box (http://commonsinabox.org/) platform, which powers sites such as the MLA Commons (http://commons.mla.org/). The DiRT directory continues to thrive as a community- driven project. 9 Conclusion Project Bamboo began with the ambitious dream of advancing arts and humanities research through the development of shared technology services. Conscious of the challenges for humanities cyberin- frastructure identified in the 2006 Our Cultural Commonwealth report (Unsworth et al., 2006) (e.g. ephemerality, copyright, and conservative academic culture), the Bamboo program staff identified those issues as out-of-scope for Bamboo after workshop 1,36 but they continued to impact the project none- theless (e.g. copyright as the fundamental motivat- ing force behind IAM work). 
Prior work on social science infrastructure development suggests that Bamboo's mode of engagement (bringing together people from the scholarly, technology, and library communities after Bamboo already had a conceptual and technical trajectory, while nonetheless expecting 'participatory design') would be a source of tension. Indeed, the wide range of responses to the initial technology-oriented proposal put Bamboo in a bind. Technologists and some librarians tended to see it as important and necessary, while many scholars felt that their needs lay elsewhere entirely. Changing scholars' minds would not be quick; as noted in Ribes and Baker (2007), 'conceptual innovation is an extended process: one cannot simply make claims about the importance of ... [e.g. cyberinfrastructure] and expect immediate meaningful community uptake'. Accommodating the interests of all three groups would necessarily mean a broader scope, but additional supporters could bring with them additional resources to make such a scope possible. It also seemed more promising than the alternative of creating a new group of like-minded technologists and librarians who would move forward with an SOA-focused development effort without focusing on scholarly outreach and adoption. In retrospect, doing so may have led the project to greater technical success, but it is arguable whether taking such an approach from the start was even a real option, given Bamboo's public commitment to a 'community design process'.

From the early planning workshops to the Mellon Foundation's rejection of the project's final proposal attempt, Bamboo was dogged by its reluctance and/or inability to concretely define itself. In the early days, avoiding a concrete definition was motivated by a desire for the project to remain flexible and responsive to its community. The tendency toward generality persisted long after it had ceased being adaptive, even after it became a source of criticism. An infrastructure project like Bamboo could be expected to name the tools and corpora it would integrate as a way to be more concrete, but it became apparent that very few of the tools in use by digital humanists at that time were being refactored to fit the model Bamboo was architected to support (i.e. scholarly web services running on nonprofessionally managed servers). If 'true infrastructures only begin to form when locally constructed, centrally controlled systems are linked into networks and internetworks governed by distributed control and coordination processes' (Edwards et al., 2007), the shortage of locally constructed systems with wide scholarly uptake that were technically compatible with Bamboo was problematic.[37]

The work done in the Bamboo technology project was pitched as laying the infrastructure for top-to-bottom support for working with textual corpora. Bamboo would support a complete scholarly workflow, from accessing and ingesting texts from repositories, to analyzing and curating them using scholarly web services, all within an environment that facilitated collaboration. This vision was complicated by the decision to include integration with three different research environment systems, each with a distinct approach and feature set.
This choice was partly pragmatic (allowing partners to focus on whatever platform their institution had already invested in[38]), and partly in keeping with Bamboo's philosophy (the infrastructure was intended to be flexible, not tied to any one user-facing platform).

Flexibility and scalability were part of the early value proposition for Bamboo, and they remained influential considerations in the architecture and development of the infrastructure. However, the infrastructure was architected in such a way that made it difficult to complete and release stand-alone components that could be tested and used while other parts were incomplete. As a result, it was nearly impossible to create demonstrator projects that scholars or digital humanities developers could try out and that potential funders could evaluate. Demonstrator projects could have effectively and concretely shown that Bamboo was producing something useful, or provided an opportunity for feedback at a stage where it could have been incorporated productively. The technical team and the scholarly team had very different perspectives on what was needed, which led to frustration and communication failures on both sides. Consequently, the technical team relied on hypothetical scholarly use cases. Given the emphasis placed on the importance of communication between technical and nontechnical communities in the literature on cyberinfrastructure development (e.g. Freeman, 2007), addressing this communication breakdown should have been a higher priority. The extensive development time required for infrastructure components, without opportunities to confirm that the components successfully fulfilled real needs, may have proven even more problematic had Bamboo continued.

The resources allocated to Bamboo were significantly smaller than the amounts provided to similarly scoped infrastructure projects in the sciences. Bamboo's struggle to produce value within these constraints was made more challenging by a failure to differentiate between needs essential to the humanities and needs unique to the humanities. It is crucial in the long run for scholars to be able to work with texts in access-restricted repositories, but the prerequisite IAM infrastructure represents a common need across all universities. Seeing that existing consortia dedicated to working on this problem would not have a solution ready in time for Bamboo to adopt, it might have been wiser for Bamboo to redefine its initial scope to include only free-access textual repositories, allowing it to demonstrate success by sidestepping the encumbrance of copyright as identified by Our Cultural Commonwealth. While Bamboo's IAM work did make significant technical contributions, it came at the cost of diverting limited resources from other areas of the project, and became a 'reverse salient' (Edwards et al., 2007) for the entire Bamboo infrastructure.

Deferring a decision on Bamboo's sustainability plan and operational model until the second phase of development was consequential on multiple fronts. From a technical angle, it risked path dependency problems: the best technology choices for a centrally run enterprise-level platform may have made it considerably harder for individual universities to run the platform under a different model.
From the social perspective, postponing decisions about what 'membership' would mean, how much it would cost, and what it would provide made it difficult for institutions to assess whether they would be 'winners' or 'losers' (Edwards et al., 2007) if Bamboo succeeded. While Bamboo program staff saw Bamboo as freeing up local staff to provide more hands-on consulting about the application of scholarly tools (rather than spending time configuring and managing locally run tools and environments), some groups were concerned that university administration might see those staff as redundant in the face of Bamboo, and lay them off rather than transition them to new kinds of faculty support. Particularly for liberal arts colleges that had participated in the planning project, there was no way to engage with Bamboo to increase one's chances of ending up a 'winner', other than joining an occasional invite-only 'community' conference call. Given the expansive scope of Bamboo's other deliverables, it was unrealistic for Bamboo program staff to have additionally taken on the work of establishing a sustainability plan during the first phase of technical development. Still, deferring or constraining the scope of some of the technical work (e.g. reducing the number of work space platforms) in order to redirect resources toward determining a viable operational and membership model before the second phase of development might have made more institutions willing to invest in Bamboo.

Perhaps the greatest impediment to Bamboo's success was the lack of a shared vision among project leaders, development teams, and communications staff. In the beginning, Bamboo had multi-university, cross-professional teams whose members faced challenges in communication and culture but helped one another understand Bamboo's goals in more nuanced ways. During the development phase, teams were formed on the basis of profession and institution, each one working according to its own status quo, with little connection to a bigger picture. The Bamboo planning project asked participants 'what's in it for you?', an important consideration often overlooked in consortial efforts. Without a shared vision to counterbalance the pull of self-interest, a complex multi-faceted project like Bamboo becomes little more than a funding umbrella for individual initiatives. As the likelihood of those initiatives intersecting in a coherent way decreases, project messaging becomes muddled, and the resulting decrease in public confidence and comprehension can jeopardize a project's continued existence.

Brett Bobley, director and CIO of the Office of Digital Humanities at the National Endowment for the Humanities, offered his own interpretation of and eulogy for Bamboo at Digital Humanities 2013, which may serve as a fitting conclusion here. He suggested that, if nothing else, Bamboo brought together scholars, librarians, and technologists at a crucial moment for the emergence of digital humanities. The conversations that ensued may not have been what the Bamboo program staff expected, but they led to relationships, ideas, and plans that have blossomed in the years that followed (e.g. DiRT and the TAPAS project), even as Bamboo itself struggled to find a path forward.

References

Dombrowski, Q. and Denbo, S. (2013). TEI and Project Bamboo.
Journal of the Text Encoding Initiative, 5. http://jtei.revues.org/787 (accessed 12 November 2013).

Edwards, P., Jackson, S., Bowker, G., and Knobel, C. (2007). Understanding Infrastructure: Dynamics, Tensions, and Design. Report from 'History & Theory of Infrastructure: Lessons for New Scientific Cyberinfrastructures', Designing Cyberinfrastructure for Collaboration and Innovation. http://cyberinfrastructure.groups.si.umich.edu//UnderstandingInfrastructure_FinalReport25jan07.pdf (accessed 30 April 2014).

Freeman, P. (2007). Is 'Designing' Cyberinfrastructure – or, Even, Defining It – Possible? Designing Cyberinfrastructure for Collaboration and Innovation. http://cyberinfrastructure.groups.si.umich.edu//OECD-Freeman-V2-2.pdf (accessed 30 April 2014).

Kainz, C. (2010). The Engine that Started Project Bamboo. Friday Sushi. http://fridaysushi.com/2010/01/30/the-engine-that-started-project-bamboo (accessed 12 November 2013).

Project Bamboo. (2008). Bamboo Planning Project: an arts and humanities community planning project to develop shared technology services for research. Grant proposal to the Andrew W. Mellon Foundation. http://dx.doi.org/10.7928/H6J10129 (accessed 12 November 2013).

Project Bamboo. (2010). Bamboo Technology Proposal (Public). Grant proposal to the Andrew W. Mellon Foundation. http://dx.doi.org/10.7928/H6D798B1 (accessed 12 November 2013).

Ribes, D. and Baker, K. (2007). Modes of Social Science Engagement in Community Infrastructure Design. In Steinfield, C., Pentland, B. T., Ackerman, M., and Contractor, N. (eds), Communities and Technologies 2007. London: Springer, pp. 107–30.

Terras, M. (2008). Bamboozle. Melissa Terras' Blog. http://melissaterras.blogspot.com/2008/05/bambooozle.html (accessed 12 November 2013).

Unsworth, J., Courant, P., Fraser, S. et al. (2006). Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for Humanities and Social Sciences. American Council of Learned Societies. http://www.acls.org/cyberinfrastructure/cyber.htm (accessed 30 April 2014).

Notes

1 Despite later impressions to the contrary, early participation in Bamboo was open to any interested college or university (http://web.archive.org/web/20080706131357/http://projectbamboo.org/colleges-universities), museum or library (http://web.archive.org/web/20080706131442/http://projectbamboo.org/museums-libraries), or organization, society, or agency (http://web.archive.org/web/20080706131346/http://projectbamboo.org/organizations-societies-agencies) that could pay for their own travel and lodging. The university and library-oriented calls for participation mentioned the possibility of 'limited travel support' that could be arranged on a case-by-case basis; in practice, Bamboo covered lodging for participating teams during the nights of the workshops.

2 As of November 2013, archived versions of the Bamboo Planning Project wiki (http://dx.doi.org/10.7928/H6RN35SK) and Bamboo Technology Project wiki (http://dx.doi.org/10.7928/H6MW2F28) are hosted at UC Berkeley.

3 Project Bamboo was one of the first initiatives the author was involved in when employed by the Academic Technologies group of central IT at the University of Chicago, shortly after leaving a Ph.D. program in the humanities and while concurrently pursuing an MLIS degree.
The author was a member of Bamboo's core program staff throughout the planning process; while she was minimally engaged in the early stages of Bamboo's implementation phase, by 2011 she was involved in both development and planning, and in 2012 she again joined the program staff at UC Berkeley, where she is still employed.

4 Later prose would reduce this number to three by collapsing the distinction between information scientists and librarians and eliminating computer science researchers. The latter group was barely represented in the attendees of workshop 1, let alone subsequent workshops.

5 One representative example, from a 2008 blog post entitled 'Bamboozle' (which also exemplifies the unfortunate wordplay on the project's name that persisted throughout its duration):

...an interesting proposal to sort out What Needs To Be Done to aid scholars in using computational power and tools in their research. But there is very little evidence that they have done their homework to what efforts have gone into this before, and no mention of the digital humanities community/communities (such as Alliance of Digital Humanities Organizations (ADHO);
6 Scholarly practice as defined by Bamboo: ‘For example, authoring might be considered a scholarly practice that is comprised of many component tasks; these tasks may include a literature review, documenting citations, acquiring peer review, etc.’ (Project Bamboo, 2008, p. 27). 7 The stated goal of workshop 2 was to ratify the findings of a report on scholarly practice written based on feed- back from the first workshop, and ‘aggregate the initial list of component tasks required to complete these practices along with desired automation capabilities’ (Project Bamboo, 2008, p. 29). As a requirement for attending the second workshop, each institution had to send ‘at least one arts and humanities scholar and one enterprise-level technologist with, if possible, either serious interest in or experience with Services-Oriented Architecture (SOA)’ (Project Bamboo, 2008, p. 28). In workshop 3, ‘a professional SOA consultant will train participants to leverage our task lists by converting them to services. We will then attempt to describe scholarly practices as a sequence of identified service capabilities (in comparison, at the end of the previous workshop scholarly practices were described as a set of component tasks)’ (Project Bamboo, 2008, p. 30). In workshop 4, participants would ‘assign some type of initial grouping of scholarly practices, and prioritiza- tion as to the order in which services should be de- veloped’ (Project Bamboo, 2008, p. 31), and begin discussing organizational issues for a Bamboo consor- tium and requirements for being a partner institution in the next phase; these topics would also serve as the focus for the 5 th and final workshop. 8 At workshops 1b (Chicago, 15–17 May), 1c (Paris, 9–10 June), and 1d (Princeton, 14–16 July), there were six exercises: (1) Initial impressions: What do you hope Bamboo will accomplish? What questions do you have re- garding Bamboo? We are gathering together repre- sentatives from a range of backgrounds—scholars, libraries, IT staff, presses, and funding agencies— around the theme of how technology can better serve arts and humanities research. Based on what you have heard at the table and read from the proposal, what one or two questions, observa- tions, and hopes would your table like to share with the group? (2) Exploring scholarly practice: As a researcher, librar- ian, IT professional, computer scientists, etc., during a really good day, term, research cycle, etc. what productive things do you do in relation to humanities research? (3) Common and uncommon: What are common themes that have emerged from your exploration of scholarly practices? Based on your discussion of scholarly practices, what are two themes that piqued the curiosity of those at your table, or are uncommon? What makes these themes common and uncommon? (4) Unpacking a commonality: What discrete practices are involved in this theme? What outstanding issues need to be addressed in regards to this theme? (5) Unpacking the uncommon: For whom/which dis- ciplines or areas of study is this theme helpful? What discrete practices are involved in this theme? What outstanding issues need to be ad- dressed in regards to this theme? (6) Identify future scholarly practices/magic wand: When you look at new-hires or up-and-coming graduate students, what practices do they use that are different from yours? If you had a magic wand, what would make your day, term, research cycle, etc. more productive in relation to research? 
9 See http://dx.doi.org/10.7928/H6H41PBV for a list of the themes that were identified.

10 Education (professional development of faculty and staff around digital tools and methodologies for teaching and research), Institutional Support (identifying service models and articulating the scope and value proposition of Bamboo), Scholarly Networking (evaluating existing social networking and Virtual Research Environment platforms for potential adoption by Bamboo), Shared Services (comprising much of the original SOA vision), and Tools & Content Partners (identifying models and standards for tool and content discovery and integration). See http://dx.doi.org/10.7928/H6CC0XM4 for more information about the working groups, and links to the wiki pages of individual working groups.

11 The agenda and notes for workshop 3 are available at http://dx.doi.org/10.7928/H67P8W9K.

12 Slides from the implementation proposal presentation and notes on the discussion that followed are available at http://dx.doi.org/10.7928/H63X84K7.

13 Slides from the consortial model presentation and notes on the discussion that followed are available at http://dx.doi.org/10.7928/H6057CVT.

14 These criticisms emerged in the discussion of the proposal: 'Focused on value proposition; really needs to start saying what it is. Need to be more specific concrete things on the table. Lots of things involving text processing. For this to have clearly perceived value—need to start saying what those things are. Also some consensus that just from social perspective begins to be important to go back home after receiving funding to go to these things, "here's what we're going to do"' (Table 10); 'Finiteness of resources, and realities of what have to be accomplished. Have to tell stories about people who could put resources in. Need more finite sense of what is involved. A little concerned that we haven't had that focusing-in phase.' (Table 12); 'Need to iterate - if Bamboo is ambitious, will fail over and over. Will succeed only if there's a sustainability model that will allow for tweaking and redesigning' (Table 13). See http://dx.doi.org/10.7928/H63X84K7.

15 All released versions of the Bamboo Program Document are available here: http://dx.doi.org/10.7928/H6VD6WCJ.

16 For full descriptions of each of these areas, see http://dx.doi.org/10.7928/H6QN64N6.

17 Notes are available on the discussions about the Forum (http://dx.doi.org/10.7928/H6KW5CXG), Cloud (http://dx.doi.org/10.7928/H6G44N6G), and Labs (http://dx.doi.org/10.7928/H6BG2KW2).

18 See http://dx.doi.org/10.7928/H66Q1V5R for full results and discussion notes.

19 Notes on these presentations are available at http://dx.doi.org/10.7928/H62Z13FD. A larger list of demonstrators is available in the Demonstrator Report: http://dx.doi.org/10.7928/H6Z60KZ1. Dombrowski and Denbo (2013) includes a discussion of some of the challenges that the 'NYX/Barlach bibliography' project encountered when attempting to demonstrate a service for processing TEI.

20 All versions of the draft implementation proposal are available at http://dx.doi.org/10.7928/H6TD9V75. Version 0.5 was discussed at workshop 5.
21 A more thorough description of the areas of work in version 0.5 of the draft Bamboo Implementation Proposal can be found here: http://dx.doi.org/10.7928/H6PN93HT. There was originally a fourth area of work, 'Bamboo Community', a repackaging of 'Bamboo community environments' from the program document. Participants largely agreed that this should not be treated as an area of work, but a component of the larger section on community and governance. As a result, this section was not put up for a vote.

22 In response to feedback from workshop 5, the Scholarly Networking area of work was merged with the Bamboo Atlas, and this combined entity was renamed the 'Bamboo Commons'.

23 See http://dx.doi.org/10.7928/H6JW8BS3 for full results.

24 'Direction of Bamboo Atlas is fine, but I have big reservations about the scope, both as it was described in original document and fear discussions haven't narrowed scope at all'; '[W]hen you're reading texts or doing markup, when you find a place that doesn't make sense, it's a place of interest but also a place where if you slice/dice differently, problem goes away. Atlas is a confusing chunk—what's in it, what does it do, trying to tease it out, etc. Not clear exactly what the atlas does; pieces of it that one has associated with it are useful. Not trying to eliminate what it's doing. But might make it cleaner to take pieces of Atlas (esp. ones that have to do with Bamboo users) and move to scholarly networking, and rename the whole thing.' (http://dx.doi.org/10.7928/H6JW8BS3)

25 This was reported publicly in the Chronicle of Higher Education: http://chronicle.com/blogs/wiredcampus/in-potential-blow-to-open-source-software-mellon-foundation-closes-grant-program/19519. On 7 January, the following message was posted to the 'News' section of the Project Bamboo Web site:

On 5 January 2010, the Chronicle of Higher Education published on its blog an article regarding recent changes at the Mellon Foundation and in particular, the closure of the Research in Information Technology (RIT) program. Although the planning project had been supported by RIT, the changes have had a minimal impact on Bamboo. At the end of December, both the University of California, Berkeley, and the University of Chicago were contacted by the Foundation, and Bamboo was smoothly migrated into the Scholarly Communications program. In short, the transition has gone well, and we look forward to working with Scholarly Communications into the future. (http://web.archive.org/web/20101231171544/http://projectbamboo.org/news?page=2)

26 This frequently manifested itself in the concern that the scholars would be unable to design sufficiently scalable applications, and that the technologists would spend inordinate amounts of resources on systems with minimal scholarly utility. These concerns were never raised through official channels, but had a real presence in informal conversations among members of each professional group.

27 This topic often arose over the course of the planning project workshops. Some examples: 'sees huge gulf between librarians/faculty and technologists; so here is an opportunity to communicate with each other' (Ex 1, 1b-B); 'hope bamboo moves beyond the usual conversation between humanities scholars and digital technology, i.e. "What do you want?", "What can you do?" Also troubled by formula of service, that digital technology folk and librarians are there just to "service" the humanities faculty; should be a partnership of equals, both have research goals they want to pursue' (Ex 1, 1b-D); 'Libraries, Publishing and Faculty are not talking. IT in the background. Efficiency and Effectiveness are not entirely a humanities priority.' (Ex 1, 1b-E); 'Humanities and IT people have different definitions of Effectiveness v Efficiency? Humanities has "productive inefficiency"' (Ex 1, 1b-E). See http://quinndombrowski.com/projects/project-bamboo/data/building-partnerships-between-it-professionals-and-humanists for more quotes from the planning project workshops that refer to this phenomenon.

28 For further information about Bamboo's IAM work, see http://dx.doi.org/10.7928/H6F769GD.

29 For more information about the architecture and implementation of the CI hub, see http://dx.doi.org/10.7928/H69G5JRP.

30 See http://dx.doi.org/10.7928/H65Q4T1C for a description of the Bamboo Book Model, including its implementation through a CMIS binding. The Bamboo Book Model is also discussed in Dombrowski and Denbo (2013).

31 See http://dx.doi.org/10.7928/H61Z4291 for a list of service APIs that were developed by Bamboo.
32 By proxying access through the Bamboo Services Platform, remotely running scholarly services could take advantage of IAM and utility services (e.g. result set caching and notification) hosted on the Platform. See http://dx.doi.org/10.7928/H6X63JTN for more about the architecture, development, and invocation of centrally hosted Bamboo services.

33 See http://dx.doi.org/10.7928/H6SF2T3B for details about the type and extent of integration accomplished for each platform.

34 See http://dx.doi.org/10.7928/H6NP22C0 for information about the design process.

35 During this transition period, the author received an email from a Bamboo planning project participant inquiring after upcoming opportunities for his liberal arts institution to become more involved. Even a few months after the Project Bamboo Web site was replaced, at Digital Humanities 2013, the author fielded multiple questions about the status of Bamboo.

36 An 'Advocacy' working group was discussed at workshop 2 (http://dx.doi.org/10.7928/H6RF5RZJ), but participants were concerned that it failed to make a clear distinction between the self-promotion necessary for Bamboo's adoption and advocacy with regards to larger issues facing digital humanities, such as those laid out in Our Cultural Commonwealth. Ultimately, a working group was not formed around this topic after workshop 2; the key issues for Bamboo in this area were reframed as 'principles for leadership', and explicitly put on hold (http://dx.doi.org/10.7928/H6MS3QNJ).

37 The Bamboo program staff members were aware that a good deal of scholarly functionality was only available as desktop software (e.g. Juxta) or as systems that required complex installation (e.g. PhiloLogic) in 2008. They anticipated that software development in digital humanities would evolve toward a web services model, following trends in enterprise software development. Some tools have moved in this direction: Juxta released a web service in 2012 (http://www.juxtasoftware.org/on-the-juxta-beta-release-and-taking-collation-online/), and PhiloLogic 4 includes web services (http://dx.doi.org/10.7928/H6H12ZX4). However, as of 2014, scholarly tools are still not expected to be delivered as web services, and a great deal of work is done using stand-alone web applications such as Voyant Tools (http://voyant-tools.org/), or locally run packages such as MALLET (http://mallet.cs.umass.edu/).

38 The modest duration of these institutional commitments came into conflict with the longer development, deployment, and support timelines for a large cyberinfrastructure initiative. While the level of Bamboo infrastructure integration for HUBzero came closest to achieving the vision of the 'work space', by 2012 the University of Wisconsin-Madison was moving away from supporting HUBzero. Work was underway to port the integration code to Drupal (which had been selected as the 'work space' platform for the second phase of technical development) when Bamboo was shut down.
2 Patterns of Neglect

2.1 Patterns of participation

A major indicator of the neglect of digital humanities as a humanities discipline is the lack of participation, particularly by influential or high-impact scholars. As an example, the flagship (or at least, longest running) journal in the field of "humanities computing" is Computers and the Humanities, which has been published since the 1960s. Despite this, the impact of this journal has been minimal. The Journal Citation Reports database suggests that for 2005, the impact factor of this journal (defined as "the number of current citations to articles published in the two previous years divided by the total number of articles published in the two previous years"[1]) is a relatively low 0.196. (This is actually a substantial improvement from 2002's impact factor of 0.078.) In terms of averages from 2002-4, CHum was the 6494th most cited journal out of a sample of 8011, scoring in only the 20th percentile. By contrast, the most influential journal in the field of "computer applications," Bioinformatics, scores above 3.00; Computational Linguistics scores at 0.65; the Journal of Forensic Science at 0.75. Neither Literary and Linguistic Computing, Text Technology, nor the Journal of Quantitative Linguistics even made the sample.

[1] http://jcrweb.com/www/help/hjcrgls2.htm, accessed June 15, 2006

In other words, scholars tend not to read, or at least cite, work published under the heading of humanities computing. Do they even participate? In six years of publication (1999-2004; volumes 33-38), CHum published 101 articles, with 205 different authorial affiliations (including duplicates) listed. Who are these authors, and do they represent high-profile and influential scholars? The unfortunate answer is that they do not appear to. Of the 205 affiliations, only 5 are from "Ivy League" universities, the single most prestigious and influential group of US universities. Similarly, of the 205 affiliations, only sixteen are from the universities recognized by US News and World Report [USNews, 2006] as among the top 25 departments in any of the disciplines of English, history, or sociology. Only two affiliations are among the top ten in those disciplines. While it is of course unreasonable to expect any group of American universities to dominate a group of international scholars, the conspicuous and almost total absence of faculty and students from top-notch US schools is still important. Nor is this absence confined to US scholars; only one affiliation from the top 5 Canadian doctoral universities (according to the 2005 MacLean's ranking) appears. (Geoff Rockwell has pointed out that the MacLean's rankings do not necessarily identify the "best" research universities in Canada, and that a better list of elite research universities would be the so-called "Group of 10" or G-10 schools. Even with this list, only three papers (two from Alberta, one from McMaster) appear.) Australian elite universities (the Go8) are slightly better represented; three affiliations from Melbourne, one from Sydney.

School                       Papers (2005)              Papers (2006)
USNews Top 10                7                          4
  Harvard
  Cal-Berkeley               1                          1
  Yale
  Princeton                  1
  Stanford                   1                          2
  Cornell
  Chicago
  Columbia                   1
  Johns Hopkins
  UCLA
  Penn
  Michigan-Ann Arbor         2
  Wisconsin-Madison
  UNC-Chapel Hill            1                          1
MacLean's top 5              2                          3
  McGill
  Toronto                    1 (3 authors)              1
  Western                                               1
  UBC                        1                          1
  Queen's
Ivies not otherwise listed   4                          6
  Brown                      4 (one paper, 2 authors)   6
  Dartmouth

Table 1: Universities included for analysis of 2005 ACH/ALLC and 2006 DH proceedings
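The tallies behind Table 1 involve nothing more exotic than membership tests over hand-collected affiliation lists. A minimal sketch of that counting (my illustration only; the school lists are abbreviated and the input affiliations are invented, standing in for the hand-collected data described above) might look like this:

```python
from collections import Counter

# Abbreviated stand-ins for the elite-school lists discussed above.
USNEWS_TOP10 = {"Harvard", "Cal-Berkeley", "Yale", "Princeton", "Stanford"}
MACLEANS_TOP5 = {"McGill", "Toronto", "Western", "UBC", "Queen's"}

def tally(affiliations, group):
    """Count affiliations (duplicates included, one entry per author
    per paper, as in the CHum analysis) that fall within `group`."""
    hits = Counter(a for a in affiliations if a in group)
    return sum(hits.values()), hits

# Invented example data:
affiliations_2005 = ["Toronto", "Toronto", "Toronto", "Brown", "Lancaster"]
total, by_school = tally(affiliations_2005, MACLEANS_TOP5)
print(total, dict(by_school))   # -> 3 {'Toronto': 3}
```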
Only in Europe is there broad participation from recognized elite universities such as the LERU. The English-speaking LERU universities (UCL, Cambridge, Oxford, and Edinburgh) are all represented, as are the universities of Amsterdam, Leuven, Paris, and Utrecht despite the language barrier. However, students and faculty from Harvard, Yale, Berkeley, Toronto, McGill, and Adelaide (in many cases, the current and future leaders of the fields) are conspicuously absent.

Perhaps the real heavyweights are simply publishing their DH work elsewhere, but are still a part of the community? A study of the 118 abstracts accepted to the 2005 ACH/ALLC conference (Victoria) shows that only 7 included affiliations from universities in the "top 10" of the USNews ranking. Only two came from universities in the "top 5" of the MacLean's ranking, and only 6 from Ivies (four of those six were from the well-established specialist DH program at Brown, a program unique among Ivies). A similar analysis shows low participation among the 151 abstracts at the 2006 DH conference (Paris). The current and future leaders seem not to participate in the community, either.

2.2 Tools and awareness

People who do not participate in a field cannot be expected to be aware of the developments it creates, an expectation sadly supported by recent survey data. In particular, [Siemens et al., 2004, Toms and O'Brien, 2006] reported on a survey of "the current needs of humanists" and announced that, while over 80% of survey respondents use e-text and over half use text analysis tools, they are not even aware of "commonly available tools such as TACT, WordCruncher and Concordancer." The tools of which they are aware seem to be primarily common Microsoft products such as Word and Access. This lack of awareness is further supported by [Martin, 2005] (emphasis mine):

Some scholars see interface as the primary concern; [electronic] resources are not designed to do the kind of search they want. Others see selection as a problem; the materials that databases choose to select are too narrow to be of use to scholars outside of that field or are too broad and produce too many results. Still others question the legitimacy of the source itself. How can an electronic copy be as good as seeing the original in a library? Other, more electronically oriented scholars, see the great value of accessibility of these resources, but are unaware of the added potential for research and teaching. The most common concern, however, is that scholars believe they would use these resources if they knew they existed. Many are unaware that their library subscribes to resources or that universities are sponsoring this kind of research.

Similarly, [Warwick, 2004a] describes the issues involved with the Oxford University Humanities Computing Unit (HCU). Despite its status as an "internationally renowned centre of excellence in humanities computing,"

[P]ersonal experience shows that it was extremely hard to convince traditional scholars in Oxford of the value of humanities computing research. This is partly because so few Oxford academics were involved in any of the work the HCU carried out, and had little knowledge of, or respect for, humanities computing research. Had there been a stronger lobby of interested academics who had a vested interest in keeping the centre going because they had projects associated with it, perhaps the HCU could have become a valued part of the humanities division.
That it did not demonstrates the consequences of a lack of respect for digital scholarship amongst the mainstream.

3 Killer Apps and Great Problems

One possible reason for this apparent neglect is a mismatch between the expected needs of the audience (market) for the tools and the community's actual needs. A recent paper [Gibson, 2005] on the development of an electronic scholarly edition of Clotel may illustrate this. The edition itself is a technical masterpiece, offering, among other things, the ability to compare passages among the various editions and even to track word-by-word changes. However, it is not clear who among Clotel scholars will be interested in using this capacity or this edition; many scholars are happy with their print copies and the capacities print grants (such as scribbling in the margins or reading on a park bench). Furthermore, the nature of the Clotel edition does not lend itself well either to application to other areas or to further extension. The knowledge gained in the process of annotating Clotel does not appear to generalize to the annotation of other works (certainly, no general consensus has emerged about "best practices" in the development of a digital edition, and the various proposals appear to be largely incompatible and even incomparable). The Clotel edition is essentially a service offered to the broader research community in the hope that it will be used, and runs a great risk of becoming simply yet another tool developed by the DH specialists to be ignored. Quoting further from [Martin, 2005]:

[Some scholars] feel there is no incentive within the university system for scholars to use these kinds of new resources.

— let alone to create them. This paper argues that for a certain class of resources, there should be no need for an incentive to get scholars to use them. Digital humanities specialists should be in a unique position both to identify the needs of mainstream humanities scholars and to suggest computational solutions that the mainstream scholars will be glad to accept.

3.1 Definition

The wider question to address, then, is what needs the humanities community has that can be dealt with using DH tools and techniques, or equivalently what incentive humanists have to take up and to use new methods. This can be treated in some respects like the computational quest for the "killer application": a need of the user group that can be filled, and by filling it, create an acceptance of that tool and the supporting methods/results. Digital Humanities needs a "killer application."

"Killer application" is a term borrowed from the discipline of computer science. In its strictest form, it refers to an application program so useful that users are willing to buy the hardware it runs on, just to have that program. One of the earliest examples of such an application was the spreadsheet, as typified by VisiCalc and Lotus 1-2-3. Having a spreadsheet made business decisionmaking so much easier (and more accurate and profitable) that businesses were willing to buy the computers (Apple IIs or IBM PCs, respectively) just to run spreadsheets. Gamers by the thousands have bought Xbox gaming consoles just to run Halo. A killer application is one that will make you buy, not just the product itself, but also invest in the necessary infrastructure to make the product useful. For digital humanities, this term should be interpreted in a somewhat broader sense.
Any intellectual product (a computer program, an abstract tool, a theory, an analytic framework) can and should be evaluated in terms of the "affordances" [Gibson, 2005, Ruecker and Devereux, 2004] it creates. In this framework, an "affordance" is simply "an opportunity for action" [Ruecker and Devereux, 2004]; spreadsheets, for instance, create opportunities to make business decisions quickly on the basis of incomplete or hypothesized data, while Halo creates the opportunity for playing a particular game. Ruecker provides a framework for comparing different tools in terms of their "affordance strength," essentially the value offered by the affordances of a specific tool. In this broader context, a "killer app" is any intellectual construct that creates sufficient affordance strength to justify the effort and cost of accepting, not just the construct itself, but the supporting intellectual infrastructure. It is a solution sufficiently interesting to, by itself, retrospectively justify looking at the problem it solves: a Great Problem that can both empower and inspire.

Three properties appear to characterize such "killer apps". First, the problem itself must be real, in the sense that other humanists (or the public at large) should be interested in the fruits of its solution. For example, the organizers of a recent NSF summit on "Digital Tools for the Humanities" identified several examples of the kinds of major shifts introduced by information technology in various areas. In their words,

When information technology was first applied [to inventory-based businesses], it was used to track merchandise automatically, rather than manually. At that time, the merchandise was stored in the same warehouses, shipped in the same way, depending upon the same relations among producers and retailers as before[...]. Today, a revolution has taken place. There is a whole new concept of just-in-time inventory delivery. Some companies have eliminated warehouses altogether, and the inventory can be found at any instant in the trucks, planes, trains, and ships delivering sufficient inventory to re-supply the consumer or vendor — just in time. The result of this is a new, tightly interdependent relationship between suppliers and consumers, greatly reduced capital investment in "idle" merchandise, and dramatically more responsive service to the final consumer.

A killer application in scholarship should be capable of effecting similar change in the way that practicing scholars do their work. Only if the problem is real can an application solving it be a killer. The Clotel edition described above appears to fail under this property precisely because only specialists in Clotel (or in 19th-century or African-American literature) are likely to be interested in the results; a specialist in the Canterbury Tales will not find her work materially affected.

Second, the problem must get buy-in from the humanities computing community itself, in that humanities computing specialists will be motivated to do the actual work. The easiest and probably cheapest way to do this is for the process of solution itself to be interesting to the participating scholars. For example, the compiling of a detailed and subcategorized bibliography of all references to a given body of work would be of immense interest to most scholars; rather than having to pore through dozens of issues of thousands of journals, they could simply look up their field of interest.
(This is, in fact, very close to the service that Thomson Scientific provides with the Social Science Citation Index, or that Penn State provides with CiteSeer.) The problem is that though the product is valuable, the process of compiling it is dull, dreary, and unrewarding. There is little room for creativity, insight, and personal expression in such a bibliography. Most scholars would not be willing to devote substantial effort (perhaps several years of full-time work) to a project with such minimal reward. (By contrast, the development of a process to automatically create such a bibliography could be interesting and creative work.) The process of solving interesting problems will almost automatically generate papers and publications, draw others into the process of solving it, and create opportunities for discussion and debate. We can again compare this to the publishing opportunities for a bibliography: is "my bibliography is now 50% complete" a publishable result?

Third, the problem itself must be such that even a partial solution or an incremental improvement will be useful and/or interesting. Any problem that meets the two criteria above is unlikely to submit to immediate solution (otherwise someone would probably already have solved it). Similarly, any such problem is likely to be sufficiently difficult that solving it fully would be a major undertaking, beyond the resources that any single individual or group could likely muster. On the other hand, being able to develop, deploy, and use a partial solution will help advance the field in many ways. The partial solution, by assumption, is itself useful. Beyond that, researchers and users have an incentive to develop and deploy improvements. Finally, the possibility of supporting and funding incremental improvements makes it more likely to get funding, and enhances the status of the field as a whole.

3.2 Some historical examples

To more fully understand this idea of a killer app, we should first consider the history of scholarly work, and imagine the life of a scholar c. 1950. He (probably) spends much of his life in the library, reading paper copies of journal articles and primary sources to which he (or his library) has access, taking detailed notes by hand on index cards, and laboriously writing drafts in longhand which he will revise before finally typing (or giving to a secretary to type). His new ideas are sent to conferences and journals, eventually to find their way into the libraries of other scholars worldwide over a period of months or years. Collaboration outside of his university is nearly unheard-of, in part because the process of exchanging documents is so difficult.

Compare that with the modern scholar, who can use a photocopier or scanner to copy documents of interest and write annotations directly on those copies. She can use a word processor (possibly on a portable computer) both to take research notes and to extend those notes into articles; she has no need to write complete drafts, can easily rearrange or incorporate large blocks of text, and can take advantage of the computer to handle "routine" tasks such as spelling correction, footnote numbering, bibliography formatting, and even pagination. She can directly incorporate the journal's formatting requirements into her work (so that the publisher can legitimately ask for "camera-ready" manuscripts as a final draft), eliminating or reducing the need both for typists and typesetters.
She can access documents from the comfort of her own office or study via an electronic network, and use advanced search technology to find and study documents that her library does not itself hold. She can similarly distribute her own documents through that same network and make them available to be found by other researchers. Her entire work-cycle has been significantly changed (for the better, one hopes) by the availability of these computational resources.

We thus have several historical candidates for what we are calling "killer apps": xerographic reproduction and scanning, portable computing (both arguably hardware instead of software), word processing and desktop publishing (including subsystems such as bibliographic packages and spelling checkers), networked communication such as email and the Web, and search technology such as Google. These have all clearly solved significant issues in the way humanities research is generally performed (i.e. met the first criterion). In Ruecker's terms, they have all created "affordances" of the sort that no modern scholar would choose to forego. The amount of research work (journals, papers, patents, presentations, and books) devoted to these topics suggests that researchers themselves are interested in solving the problems and improving the technologies, in many cases incrementally (e.g., "how can a search engine be tuned to find documents written in Thai?").

Of course, for many of these applications, the window of opportunity has closed, or at least narrowed. A group of academics is unlikely to have the resources to build and deploy a competing product to Microsoft and/or Google. On the other hand, the very fact that humanities scholars are something of a niche market may open the door to incremental killer apps based upon (or built as extensions to) mainstream software: applications focused specifically on the needs of practicing scholars. The next section presents a partial list of some candidates that may yield killer applications in the foreseeable future. Some of these candidates are taken from my own work, some from the writings of others.

3.3 Potential current killer apps

3.3.1 Back of the Book Index Generation

Almost every nonfiction book author has been faced with the problem of indexing. For many, this will be among the most tedious, most difficult, and least rewarding parts of writing the book. The alternative is to hire a professional indexer (perhaps a member of an organization such as the American Society of Indexers, www.asindexing.org) and pay a substantial fee, which simply shifts the uncomfortable burden to someone else, but does not substantially reduce it.

A good index provides much more than the mere ability to find information in a text. The Clive Pyne book indexing company[2] lists some aspects of what a good index provides.
According to them, "a good index:

• provides immediate access to the important terms, concepts and names scattered throughout the book, quickly and efficiently;
• discriminates between useful information on a subject, and a passing mention;
• has headings which are concise, accurate and unambiguous reflecting the contents and terminology used in the text;
• has sufficient cross-references to connect related terms;
• anticipates how readers will search for information;
• reveals the inter-relationships of topics, concepts and names so that the reader need not read the whole index to find what they are looking for;
• provides terminology which might not be used in the text, but is the reference point that the reader will use for searching through the index;
• can make the difference between a book and a very good book"

A traditional back-of-the-book (BotB) index is a substantial intellectual accomplishment in its own right. In many ways, it is an encapsulated and stylized summary of the intellectual structure of the book itself. "A good index is an objective guide to the text, a link between the author's ideas and the reader. It should be a road map that leads readers to every relevant idea without frustrating detours and dead ends."3 And it is specifically not just a concordance or a list of terms appearing in the document.

It is thus surprising that a tedious task of such importance has not yet been computerized. This is especially surprising given the effectiveness of search engines such as Google at "indexing" the unimaginably large volume of information on the Web. However, the tasks are subtly different; a Google search is not expected to show knowledge of the structure of the documents or the relationships among the search terms. As a simple example, a phrasal search on Google (May 31, 2006) for "a good index" found, as expected, several articles on back of the book indexing. It also found several articles on financial indexing and index funds, and a scholarly paper on glycemic control as measured ("indexed") by plasma glucose concentrations. A good text index would be expected to identify these three subcategories, to group references appropriately, and to offer them to the reader proactively as three separate subheadings. A good text index is not simply a search engine on paper, but an intellectual precis of the structure of the text.

This is therefore an obvious candidate for a killer application. Every humanities scholar needs such a tool. Indeed, since chemistry texts need indexing as badly as history texts do, scholars outside of the humanities also need it. Unfortunately, not only does it not (yet) exist, but it isn't even clear at this writing what properties such a tool would have. Thus there is room for fundamental research into the attributes of indices as a genre of text, as well as into the fundamental processes of compiling and evaluating indices and their expression in terms of algorithms and computation. I have presented elsewhere [Juola, 2005, Lukon and Juola, 2006] a possible framework to build a tool for the automatic generation of such indices. Without going into technical detail, the framework identifies several important (and interesting) cognitive/intellectual tasks that can be independently solved in an incremental fashion.

2 http://www.cpynebookindexing.com/what makes a good index.htm, accessed 5/31/2006.
3 Kim Smith, http://www.smithindexing.com/whyprof.html, accessed 5/31/2006.
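To make the incremental-solution argument concrete, here is a minimal sketch of the most naive possible indexer: it maps human-supplied terms to the pages on which they occur. Everything a "good index" adds (subheadings, cross-references, judgements of significance) is exactly what such a baseline leaves open for incremental research. The code is an illustration of our own, not the framework of [Juola, 2005, Lukon and Juola, 2006].

```python
from collections import defaultdict

def build_index(pages, terms):
    """Naive back-of-the-book indexer: map each candidate term to the
    sorted list of pages on which it occurs. `pages` maps page number
    to page text; `terms` is supplied by a human editor, which is the
    step a real tool would have to automate."""
    index = defaultdict(list)
    for number, text in sorted(pages.items()):
        lowered = text.lower()
        for term in terms:
            if term.lower() in lowered:
                index[term].append(number)
    return dict(index)

pages = {1: "Grading essays with Latent Semantic Analysis.",
         2: "Grading by hand is tedious."}
print(build_index(pages, ["grading", "Latent Semantic Analysis"]))
# -> {'grading': [1, 2], 'Latent Semantic Analysis': [1]}
```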
Furthermore, this entire problem clearly admits of an incremental solution, because a less-than-perfect index, while clearly improvable, is still better than no index at all, and any time saved by automating the more tedious parts of indexing will still be a net gain to the indexer. Thus all three components of the definition of killer app given above are present, suggesting that the development of such an indexing tool would be beneficial both inside and outside the digital humanities community.

3.3.2 Annotation tools

As discussed above, one barrier to the use of E-texts and digital editions is the current practices of scholars with regard to annotation. Even when documents are available electronically, many researchers (myself included) will often choose to print them and study them on paper. Paper permits one not only to mark text up and to make changes, but also to make free-form annotations in the margins, to attach PostIt notes in a rainbow of colors, and to share commentary with a group of colleagues. Annotation is a crucial step in recording a reader's encounter with a text, in developing an interpretation, and in sharing that interpretation with others.

The recent IATH Summit on Digital Tools for the Humanities [IATH Summit, 2006] identified this process of annotation and interpretation as a key process underlying humanistic scholarship, and specifically discussed the possible development of a tool for digital annotation, a "highlighter's tool," that would provide the same capacities of annotation of digital documents, including multimedia documents, that print provides. The flexibility of digital media means, in fact, that one should be able to go beyond the capacities of print — for example, instead of doodling a simple drawing in the margin of a paper, one might be able to "doodle" a Flash animation or a .wav sound file.

Discussants identified at least nine separate research projects and communities that would benefit from such a tool. Examples include "a scholar currently writing a book on Anglo-American relations, who is studying propaganda films produced by the US and UK governments and needs to compare these with text documents from on-line archives, coordinate different film clips, etc."; "an add-on tool for readers (or reviewers) of journal articles," especially of electronic journal systems (the current system of identifying comments by page and line number, for example, is cumbersome for both reviewers and authors); and "an endangered language documentation project that deals with language variation and language contact," where multilingual, multialphabet, and multimedia resources must be coordinated among a broad base of scholars. Such a tool has the potential to change the annotation process as much as the word processor has changed the writing and publication process.

Can community buy-in be achieved? There is certainly room for research and for incremental improvements, both in defining the standards and capacities of the annotations and in expanding those capacities to meet new requirements as they evolve. For example, early versions of such a project would probably not be capable of handling all forms of multimedia data; a research-quality prototype might simply handle PDF files and sound, but not video.
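As an illustration of what the data behind such a "highlighter's tool" might look like (a sketch of our own, not the IATH design), an annotation can be modelled as a body attached to an addressable region of a target document, with the region selector varying by medium:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Annotation:
    """One scholarly annotation: a body (text, doodle, audio clip...)
    attached to a selectable region of a target document. The selector
    is deliberately medium-specific: a character range for text, a
    timestamp range for audio/video, a bounding box for images."""
    target: str      # URI of the annotated document (illustrative)
    media_type: str  # e.g. 'text', 'audio', 'video'
    selector: dict   # medium-specific region description
    body: str        # the annotation content itself
    author: str
    created: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

# Hypothetical example echoing the Anglo-American relations scenario:
note = Annotation(
    target="http://example.org/archive/propaganda-film-07",
    media_type="video",
    selector={"start": "00:02:10", "end": "00:02:45"},
    body="Compare with the memo of 12 March; same phrasing.",
    author="reviewer-42",
)
```

A shared record format of this kind is what would let annotations on film clips, transcripts and journal proofs be exchanged and coordinated across a group of colleagues, rather than living on PostIt notes.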
It's not clear that the community support is available for building early, simple versions — although "a straw poll showed that half of [the discussants] wanted to build this kind of tool, and all wanted to use it" [IATH Summit, 2006]. However, responding to a straw poll is one thing and devoting time and resources is another altogether; it is not clear that any software development on this project has yet happened. Nevertheless, given the long-term potential uses and research outcomes from this kind of project, it clearly has the potential to be a killer application.

3.3.3 Resource exploration

Another issue raised at the summit is that of resource discovery and exploration. The huge amount of information on the Web is, of course, a tremendous resource for all of scholarship, and companies such as Google (especially with new projects such as Google Images and Google Scholar) are excellent at finding and providing access. On the other hand, "such commercial tools are shaped and defined by the dictates of the commercial market, rather than the more complex needs of scholars" [IATH Summit, 2006]. This raises issues about access to more complex data, such as textual markup, metadata, and data hidden behind gateways and search interfaces. Even where such data is available, it is rarely compatible from one database to another, and it's hard to pose questions that take advantage of the markup. In the words of the summit report,

What kinds of tools would foster the discovery and exploration of digital resources in the humanities? More specifically, how can we easily locate documents (in multiple formats and multiple media), find specific information and patterns in across [sic] large numbers of scholarly disciplines and social networks? These tasks are made more difficult by the current state of resources and tools in the humanities. For example, many materials are not freely available to be crawled through or discovered because they are in databases that are not indexed by conventional search engines or because they are behind subscription-based gates. In addition, the most commonly used interfaces for search and discovery are difficult to build upon. And, the current pattern of saving search results (e.g., bookmarks) and annotations (e.g., local databases such as EndNote) on local hard drives inhibits a shared scholarly infrastructure of exploration, discovery, and collaboration.

Again, this has the potential to effect significant change in the day-to-day working life of a scholar, by making collaborative exploration and discovery much more practical and rewarding, possibly changing the culture by creating a new "scholarly gift economy in which no one is a spectator and everyone can readily share the fruits of their discovery efforts." "Research in the sciences has long recognized team efforts. ... A similar emphasis on collaborative research and writing has not yet made its way into the thinking of humanists." But, of course, what kind of discovery tools would be needed? What kind of search questions should be supported? How can existing resources such as lexicons and ontologies be incorporated into the framework? How can it take advantage of (instead of competing with) existing commercial search utilities? These questions illustrate many of the possible research avenues that could be explored in the development of such an application.
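One building block that already addresses part of this problem is the OAI-PMH protocol, which many scholarly repositories use to expose their metadata for harvesting. The sketch below is illustrative only: the endpoint is hypothetical, and a real harvester would also follow resumptionTokens and handle errors and missing elements.

```python
import requests
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def harvest_titles(endpoint):
    """Fetch one page of Dublin Core records from an OAI-PMH
    repository and yield (identifier, title) pairs."""
    response = requests.get(endpoint, params={
        "verb": "ListRecords", "metadataPrefix": "oai_dc"})
    root = ET.fromstring(response.content)
    for record in root.iter(OAI + "record"):
        header = record.find(OAI + "header")
        identifier = header.findtext(OAI + "identifier")
        title = record.findtext(".//" + DC + "title")
        yield identifier, title

# 'example.org' stands in for any real repository endpoint.
for ident, title in harvest_titles("http://example.org/oai"):
    print(ident, "-", title)
```

Aggregating such harvests into a shared, searchable pool is one plausible path from privately hoarded bookmarks towards the collaborative discovery infrastructure the report envisages.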
Jockers' idea of "macro lit-o-nomics (macro-economics for literature)" [Jockers, 2005] is one approach that has been suggested for developing useful analysis from large datasets; Ruecker and Devereux [Ruecker and Devereux, 2004] and their "Just-in-Time" text analysis is another. In both projects, the researchers showed that interesting conclusions could be drawn by analyzing the large-scale results of automatically-discovered resources and looking at macro-scale patterns of language and thought.

3.3.4 Automatic essay grading

The image of a bleary-eyed teacher, bent over a collection of essays at far past her bedtime is a traditional one. Writing is a traditional and important part of the educational process, but most instructors find the grading of essays to be time-consuming, tedious, and unrewarding. This applies regardless of the subject; essays on Shakespeare are not significantly more fun to grade than essays on the history of colonialism. The essay grading problem is one reason that multiple choice tests are so popular in large classes. We thus have another potential "killer app," an application to handle the chore of grading essays without interfering with the educational process.

Several approaches to automatic essay grading have been tried, with reasonable but not overwhelming success. At a low enough level, essay grading can be done successfully just by looking at aspects of spelling, grammar, and punctuation, or at stylistic continuity [Page, 1994]. Foltz [Foltz et al., 1999] has also shown good results by comparing semantic coherence (as measured, via Latent Semantic Analysis, from word co-occurrences) with that of essays of known quality:

LSA's performance produced reliabilities within the range of their comparable inter-rater reliabilities and within the generally accepted guidelines for minimum reliability coefficients. For example, in a set of 188 essays written on the functioning of the human heart, the average correlation between two graders was 0.83, while the correlation of LSA's scores with the graders was 0.80. ... In a more recent study, the holistic method was used to grade two additional questions from the GMAT standardized test. The performance was compared against two trained ETS graders. For one question, a set of 695 opinion essays, the correlation between the two graders was 0.86, while LSA's correlation with the ETS grades was also 0.86. For the second question, a set of 668 analysis of argument essays, the correlation between the two graders was 0.87, while LSA's correlation to the ETS grades was 0.86. Thus, LSA was able to perform near the same reliability levels as the trained ETS graders.

Beyond simply reducing the workload of the teacher, this tool has many other uses. It can be used, for example, as a method of evaluating a teacher for consistency in grading, or for ensuring that several different graders for the same class use the same standards. More usefully, perhaps, it can be used as a teaching adjunct, by allowing students to submit rough drafts of their essays to the computer and re-write until they (and the computer) are satisfied. This will also encourage the introduction of writing into the curriculum in areas outside of traditional literature classes, and especially into areas where the faculty themselves may not be comfortable with the mechanics of teaching composition. Research into automatic essay grading is an active area among text categorization scholars and computer scientists for the reasons cited above [Valenti et al., 2003].
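The holistic approach Foltz describes can be sketched with present-day libraries: project previously graded essays and a new essay into a low-dimensional "semantic" space derived from word co-occurrences, then score the new essay by its similarity to essays of known grade. This is a minimal illustration under our own simplifying assumptions, not Foltz's implementation:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

def lsa_grade(graded_essays, grades, new_essay, k=2):
    """Score `new_essay` as the similarity-weighted average of the
    grades of previously marked essays, with similarity measured in
    a k-dimensional LSA space."""
    tfidf = TfidfVectorizer(stop_words="english")
    X = tfidf.fit_transform(graded_essays + [new_essay])
    lsa = TruncatedSVD(n_components=k).fit_transform(X)
    sims = cosine_similarity(lsa[-1:], lsa[:-1])[0]
    # Clip negative similarities; epsilon avoids a zero weight sum.
    weights = np.clip(sims, 0.0, None) + 1e-9
    return float(np.average(grades, weights=weights))

essays = ["The heart pumps blood through arteries and veins.",
          "Blood is pumped by the heart to the lungs and the body.",
          "Shakespeare wrote sonnets and plays in London."]
print(lsa_grade(essays, [5, 4, 1], "The heart moves blood around the body."))
```

As the discussion that follows makes plain, nothing here checks factual accuracy: the score reflects only lexical-semantic proximity to the marked essays.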
From a philosophical point of view, though, it's not clear that this approach to essay grading should be acceptable. A general-purpose essay grader can do a good job of evaluating syntax and spelling, and even (presumably) grade "semantic coherence" by counting if an acceptable percentage of the words are close enough together in the abstract space of ideas. What such a grader cannot do is evaluate factual accuracy or provide discipline-specific information. Furthermore, the assumption that there is a single grade that can be assigned to an essay, irrespective of context and course focus, is questionable. Here is an area where a problem has already been identified, applications have been and continue to be developed, and uptake by a larger community is more or less guaranteed, but the input of humanities specialists is crucially needed to improve the service quality provided.

4 Discussion

The list of problems in the preceding section is not meant to be either exclusive or exhaustive, but merely to illustrate the sort of problems for which killer apps can be designed and deployed. Similarly, the role for humanities specialists to play will vary from project to project — in some cases, humanists will need to play an advisory role to keep a juggernaut from going out of control (as might be needed with the automatic grading), while in others, they will need to create and nurture a software project from scratch. The list, however, shares enough to illustrate both the underlying concept and its significance. In other words, we have an answer to the question "what?" — what do I mean by a "killer application," what does it mean for the field of digital humanities, and, as I hope I have argued, what can we do to address the perennial problem of neglect by the mainstream.

An equally important question, of course, is "how?" Fortunately, there appears to be a window opening, a window of increased attention and available research opportunities in the digital humanities. The IATH summit cited above [IATH Summit, 2006] is one example, but there are many others. Recent conferences such as the first Text Analysis Developers Alliance (TADA), in Hamilton (2005), the Digital Tools Summit for Linguistics in East Lansing (2006), the E-MELD Workshops (various locations, 2000–6), the Cyberinfrastructure for Humanities, Arts, and Social Sciences workshop at UCSD (2006), and the recent establishment of the Working Group on Community Resources for Authorship Attribution (New Brunswick, NJ; 2006) illustrate that digital scholarship is being taken more seriously. The appointment of Ray Siemens in 2004 as the Canada Research Chair in Humanities Computing is another important milestone, marking perhaps the first recognition by a national government of the significance of Humanities Computing as an acknowledged discipline.

Perhaps most important in the long run is the availability of funding to support DH initiatives. Many of the workshops and conferences described above were partially funded by competitively awarded research grants from national agencies such as the National Science Foundation. The Canadian Foundation for Innovation has been another major source of funding for DH initiatives. But perhaps the most significant development is the new (2006) Digital Humanities Initiative at the (United States) National Endowment for the Humanities.
From the website:4

NEH has launched a new digital humanities initiative aimed at supporting projects that utilize or study the impact of digital technology. Digital technologies offer humanists new methods of conducting research, conceptualizing relationships, and presenting scholarship. NEH is interested in fostering the growth of digital humanities and lending support to a wide variety of projects, including those that deploy digital technologies and methods to enhance our understanding of a topic or issue; those that study the impact of digital technology on the humanities — exploring the ways in which it changes how we read, write, think, and learn; and those that digitize important materials thereby increasing the public's ability to search and access humanities information.

The list of potentially supported projects is large:

• apply for a digital humanities fellowship (coming soon!)
• create digital humanities tools for analyzing and manipulating humanities data (Reference Materials Grants, Research and Development Grants)
• develop standards and best practices for digital humanities (Research and Development Grants)
• create, search, and maintain digital archives (Reference Materials Grants)
• create a digital or online version of a scholarly edition (Scholarly Editions Grants)
• work with a colleague on a digital humanities project (Collaborative Research Grants)
• enhance my institution's ability to use new technologies in research, education, preservation, and public programming in the humanities (Challenge Grant)
• study the history and impact of digital technology (Fellowships, Faculty Research Awards, Summer Stipends)
• develop digitized resources for teaching the humanities (Grants for Teaching and Learning Resources)

Most importantly, this represents an agency-wide initiative, and thus illustrates the changing relationship between the traditional humanities and digital scholarship at the very highest levels.

Of course, just as windows can open, they can close. To ensure continued access to this kind of support, the supported research needs to be successful. This paper has deliberately set the bar high for "success," arguing that digital products can and should result in substantial uptake and effect significant changes in, as NEH put it, "how we read, write, think, and learn." The possible problems discussed earlier are an attempt to show that we can effect such changes. But the most important question, of course, is "should we?" Why should scholars in the digital humanities try to develop this software and make these changes?

The first obvious answer is simply one of self-interest as a discipline. Solving high-profile problems is one way of attracting the attention of mainstream scholars and thereby getting professional advancement. Warwick [Warwick, 2004b] illustrates this in her analysis of the citations of computational methods, and the impact of a single high-profile example. Of all articles studied, the only ones that cited computational methods did so in the context of Don Foster's controversial attribution of "A Funeral Elegy" to Shakespeare.

The Funeral Elegy controversy provides a case study of circumstances in which the use of computational techniques was noticed and adopted by mainstream scholars. The paper argues that a complex mixture of a canonical author (Shakespeare) and a star scholar (Foster) brought the issue to prominence. ...

4 http://www.neh.gov/grants/digitalhumanities.html, accessed 6/18/2006.
The Funeral Elegy debate shows that if the right tools for textual analysis are available, and the need for, and use of, them is explained, some mainstream scholars may adopt them. Despite the current emphasis on historical and cultural criticism, scholars will surely return in time to detailed analysis of the literary text. Therefore researchers who use computational methods must publish their results in literary journals as well as those for humanities computing specialists. We must also realize that the culture of academic disciplines is relatively slow to change, and must engage with those who use traditional methods. Only when all these factors are understood and are working in concert may computational analysis techniques truly be more widely adopted.

Implicit in this, of course, is the need for scholars to find results that are publishable in mainstream literary journals, as well as to do the work resulting in publication — the two main criteria of killer apps.

On a less selfish note, the development of killer applications will improve the overall state of scholarship as a whole, without regard to disciplinary boundaries. While change for its own sake may not necessarily be good, solutions to genuine problems usually are. Creating the index to a large document is not fun — it requires days or weeks of painstaking, detailed labor that few enjoy. The inability to find or access needed resources is not a good thing. By eliminating artificial or unnecessary restrictions on scholarly activity, scholars are freed to do what they really want to do — to read, to write, to analyze, to produce knowledge, and to distribute it.

Furthermore, the development of such tools will in and of itself generate knowledge, knowledge that can be used not only to generate and enhance new tools but to help understand and interpret the humanities more generally. Software developers must be long-term partners with the scholars they serve, but digital scholars must also be long-term partners, not only with the software developers, but with the rest of the discipline and its emerging needs. In many cases, the digital scholars are uniquely placed to identify and to describe the emerging needs of the discipline as a whole. With a foot in two camps, the digital scholars will be able to speak to the developers about what is needed, and to the traditional scholars about what is available as well as what is under development.

5 Conclusion

Predicting the future is always difficult, and predicting the effects of a newly-opened window is even more so. But recent developments suggest that digital humanities, as a field, may be at the threshold of a new series of significant developments that can change the face of humanities scholarship and allow the "emerging discipline of humanities computing" finally to emerge.

For the past forty years, humanities computing has more or less languished in the background of traditional scholarship. Scholars lack incentive to participate in (or even to learn about) the results of humanities computing. This paper argues that DH specialists are placed to create their own incentives by developing applications with sufficient scope to materially change the way humanities scholarship is done. I have suggested four possible examples of such applications, knowing well that many more are out there. I believe that by actively seeking out and solving such Great Problems — by developing such killer apps — scholarship in general, and digital humanities in particular, will be well-served.
References

[Foltz et al., 1999] Foltz, P. W., Laham, D., and Landauer, T. K. (1999). Automated essay scoring: Applications to educational technology. In Proceedings of EdMedia '99.

[Gibson, 2005] Gibson, M. (2005). Clotel: An electronic scholarly edition. In Proceedings of ACH/ALLC 2005, Victoria, BC, Canada. University of Victoria.

[IATH Summit, 2006] IATH Summit (2006). Summit on digital tools for the humanities: Report on summit accomplishments.

[Jockers, 2005] Jockers, M. (2005). XML aware tools — CaTools. In Presentation at Text Analysis Developers Alliance, McMaster University, Hamilton, ON.

[Juola, 2005] Juola, P. (2005). Towards an automatic index generation tool. In Proceedings of ACH/ALLC 2005, Victoria, BC, Canada. University of Victoria.

[Lukon and Juola, 2006] Lukon, S. and Juola, P. (2006). A context-sensitive computer-aided index generator. In Proceedings of DH 2006, Paris. Sorbonne.

[Martin, 2005] Martin, S. (2005). Reaching out: What do scholars want from electronic resources? In Proceedings of ACH/ALLC 2005, Victoria, BC, Canada. University of Victoria.

[Page, 1994] Page, E. B. (1994). Computer grading of student prose using modern concepts and software. Journal of Experimental Education, 62:127–142.

[Ruecker and Devereux, 2004] Ruecker, S. and Devereux, Z. (2004). Scraping Google and Blogstreet for Just-in-Time text analysis. In Presented at CaSTA-04, The Face of Text, McMaster University, Hamilton, ON.

[Siemens et al., 2004] Siemens, R., Toms, E., Sinclair, S., Rockwell, G., and Siemens, L. (2004). The humanities scholar in the twenty-first century: How research is done and what support is needed. In Proceedings of ALLC/ACH 2004, Gothenberg. U. Gothenberg.

[Toms and O'Brien, 2006] Toms, E. G. and O'Brien, H. L. (2006). Understanding the information and communication technology needs of the e-humanist. Journal of Documentation, (accepted/forthcoming).

[USNews, 2006] USNews (2006). U.S. News and World Report: America's best graduate schools (social sciences and humanities).

[Valenti et al., 2003] Valenti, S., Neri, F., and Cucchiarelli, A. (2003). An overview of current research on automated essay grading. Journal of Information Technology Education, 2:319–330.

[Warwick, 2004a] Warwick, C. (2004a). No such thing as humanities computing? An analytical history of digital resource creation and computing in the humanities. In Proceedings of ALLC/ACH 2004, Gothenberg. U. Gothenberg.

[Warwick, 2004b] Warwick, C. (2004b). Whose funeral? A case study of computational methods and reasons for their use or neglect in English studies. In Presented at CaSTA-04, The Face of Text, McMaster University, Hamilton, ON.

work_ankxmqrvibhnznbna2irncou5q ---- Pre-print of: Dunn, S., & Hedges, M. (2013). Crowd-sourcing as a Component of Humanities Research Infrastructures. International Journal of Humanities and Arts Computing, 7(1), 147-169. https://doi.org/10.3366/ijhac.2013.0086
Crowd-sourcing as a Component of Humanities Research Infrastructures

Stuart Dunn, Mark Hedges
Centre for e-Research, Department of Digital Humanities, King's College London, 26-29 Drury Lane, London, UK
mark.hedges@kcl.ac.uk, stuart.dunn@kcl.ac.uk

Abstract: Crowd-sourcing, the process of leveraging public participation in or contribution to a project or activity, is relatively new to academic research, but is becoming increasingly important as the Web transforms collaboration and communication and blurs the boundaries between the academic and non-academic worlds. At the same time, digital research methods are entering the mainstream of humanities research, and there are a number of initiatives addressing the conceptualisation and construction of research infrastructures for the humanities. This paper examines the place of crowd-sourcing activities within such initiatives, presenting a framework for describing and analysing academic humanities crowd-sourcing, and using this framework of 'primitives' as a basis for exploring potential relationships between crowd-sourcing and humanities research infrastructures.

Keywords: crowd-sourcing, research infrastructures, citizen science, scholarly primitives, typology.

Introduction

Crowd-sourcing,1 the process of leveraging public participation in or contribution to a project or activity, is relatively new to academic research, and even more so to the humanities. However, at a time when the Web is transforming the way in which people collaborate and communicate, and is blurring boundaries between the spaces inhabited by the academic and non-academic worlds, it has never been more important to examine the role that public communities are beginning to play in academic humanities research. At the same time, digital research methods are starting to enter the mainstream of humanities research, and there are a number of initiatives addressing the conceptualisation and construction of research infrastructures that would support a shift from ad hoc projects and centres to an environment that is more integrated and sustainable. Such an environment will inevitably be distributed, integrating knowledge, services and people in a loosely-coupled, collaborative 'digital social marketplace'.2 The question naturally arises as to where crowd-sourcing activities fit within this framework. More specifically, what contributions can public participants, and the communities to which they belong, make to a humanities research infrastructure, and conversely how can these participants and communities, and the academic researchers who make use of the knowledge and effort that they contribute, benefit from such participation?
To begin to address these questions is one of the aims of this paper. The paper is organised as follows: we begin by describing the context in which the work was carried out, and the methodology used. We then review a number of existing terminologies and typologies for crowd-sourcing and related concepts, and follow this with an analysis of the main motivations for engaging with crowd-sourcing, from both the volunteer's and the academic's points of view. Finally, we build upon this by presenting the outline of a framework for describing and analysing academic humanities crowd-sourcing projects, and use this framework of 'primitives' as a basis for exploring the potential relationships between various forms of crowd-sourcing activity and humanities research infrastructures.

Background and Methodology

The research described in this paper was mostly carried out as part of the Crowd-sourcing Scoping Study project (Ref. AH/J01155X/1), which ran for nine months from February–November 2012, and was funded by the Arts and Humanities Research Council as part of its Connected Communities programme. The study's methodology had four main components:

• a literature review covering academic humanities research that has incorporated crowd-sourcing, research into crowd-sourcing as a method, and less formal outputs such as blogs and project websites;
• two workshops facilitating discussion between, respectively, humanities academics who have used crowd-sourcing, and contributors to crowd-sourcing projects;
• an online survey of contributors to crowd-sourcing projects, exploring their backgrounds, histories of participating in such projects, and motivations for doing so;
• interviews with academics and contributors.

The study does not claim to be comprehensive: there are bound to be important projects, publications, individuals and activities that have been omitted, and there is a strong UK and Anglophone focus on the activities studied. In particular, while the survey was widely publicised, it was self-selecting and makes no claim to being statistically representative; it functioned rather as a means of gathering qualitative information about contributors' backgrounds and motivations.

Crowd-sourcing and related concepts

The term crowd-sourcing was coined in a Wired article by Jeff Howe,3 in which he draws a parallel between reducing labour costs by outsourcing to cheaper countries, and utilising 'the productive potential of millions of plugged-in enthusiasts'. In an academic context, the term has developed from an economic focus to an information focus, in which this productive potential is used to achieve research aims. However, the term is problematic and requires further analysis.

It is first necessary to distinguish crowd-sourcing from some related concepts. It is broader and less easy to define than 'citizen science', which is commonly understood to refer to activities whereby members of the public undertake well-defined and (individually) small-scale tasks as part of larger-scale scientific projects.4 Another related concept is the 'Wisdom of Crowds',5 which holds that large-scale collective decision-making can be superior to that of individuals, even experts. Although academic crowd-sourcing can be about decision-making, the decisions involved are rarely as neatly packageable as those implied in the world of business, where the 'good' or 'bad' nature of a decision can be evaluated on the basis of profitability.6
Such collective decision-making also lacks the elements of collaboration around activities conceived and directed for a common purpose that characterise crowd-sourcing as commonly understood.

Another important distinction is that between crowd-sourcing and 'social engagement'.7 According to Holley, social engagement involves 'giving the public the ability to communicate with us and each other', and is 'usually undertaken by individuals for themselves and their own purposes', whereas crowd-sourcing 'uses social engagement techniques to help a group of people achieve a shared, usually significant, and large goal by working collaboratively together as a group'. Holley also notes that crowd-sourcing is likely to involve more effort, and implies a level of commitment and participation that goes beyond casual interest, whereas social engagement is an extension of the kinds of online activities — Tweeting, commenting — that millions do on a daily basis anyway. In one way, this aligns crowd-sourcing with 'citizen science'. Indeed, Wiggins and Crowston develop this theme by highlighting a distinction between citizen science and community science, and stating as a key ingredient of the former that it is not self-organising and 'does not represent peer production ... because the power structure of these projects is usually hierarchical'.8 A fundamental aspect of citizen science is thus that the goal is defined by a particular person or group (almost always as part of a professional academic undertaking), and the participants (recruited through an open call) provide some significant effort towards achieving that goal.

However, the different intellectual traditions of the sciences and the humanities embrace, and are embraced by, different kinds of non-academic community. Indeed, as Trevor Owens has noted, most successful crowd-sourcing activities in the humanities and cultural sectors are not really about crowds at all, in the sense of 'large anonymous masses of people', but are about 'participation from interested and engaged members of the public'.9 While a crowd-sourcing project may have the capacity for involving large numbers of people, in many cases only a few contributors end up being actively engaged, and these contribute a large percentage of the work. While there may be a centralised recruitment process, at this level the body of contributors is self-organising and self-selecting.

A number of attempts have been made to identify the key characteristics, or to formulate a typology, of crowd-sourcing and related activities. Estellés-Arolas and González-Ladrón-de-Guevara identify eight characteristics, distilled from 32 distinct definitions identified in the literature: the crowd; the task at hand; the recompense obtained; the crowdsourcer or initiator of the crowdsourcing activity; what is obtained by the crowdsourcing process; the type of process; the call to participate; and the medium.10 This extremely processual definition is comprehensive in identifying stages that map easily to business processes. For the humanities, the 'type of process' is both more significant and more problematic, given the great diversity of processes in the creation of humanities research material. A more task-oriented approach is taken by Wiggins and Crowston,11 who construct a typology for 'citizen science' activities, identifying five areas of application: Action, Conservation, Investigation, Virtual, and Education.
The factors that lead to an activity being assigned to a category are multivariate, and the identification of the categories was based on whether there is an occurrence in a category or not, rather than frequency of those occurrences. The coverage is therefore extremely broad; 'Action', for example, covers self-organising citizen groups that use web technologies to achieve a common purpose, often to do with campaigns on local issues. Moreover, the use of the word 'science' (at least in the usual Anglophone sense) confines the activities reviewed (in terms of both the methods and the content) to a particular epistemic bracket, which inevitably excludes some aspects of humanities research.

One widely-quoted set of definitions for citizen science projects was presented by Bonney et al.12 This divided the field into three broad categories: contributory projects, in which members of the public, via an open call, contribute along lines that are tightly defined and directed by scientists; collaborative projects, which have a central design but to which members of the public contribute data, and may also help to refine project design, analyze data, or disseminate findings; and co-created projects, which are designed by scientists and members of the public working together and for which at least some of the public participants are actively involved in the scientific process. This approach shares important characteristics with the 'task type' described below, in that it is rooted in the complexity of the task, and the amount of initiative and independent analysis required to make a contribution.

The Galleries, Libraries, Archives and Museums (hereafter GLAM) sectors have in particular seen efforts to develop crowd-sourcing typologies. One such typology has been proposed by Mia Ridge in a blog post,13 and includes the following categories: Tagging, Debunking (i.e. correcting/reviewing content), Recording a personal story, Linking, Stating preferences, Categorizing, and Creative responses. Again, these categories imply a processual approach, concerning the type of task being carried out, and are potentially extensible across different types of online and physical content and collections.

Another typology from the GLAM domain was developed by Oomen and Aroyo.14 Their categories include Correction and Transcription, defined as inviting users to correct and/or transcribe outputs of digitisation processes (a category that Ridge's 'Debunking' partially, but not entirely, covers); Contextualisation, or adding contextual knowledge to objects, by constructing narratives or creating User Generated Content (UGC) with contextual data; Complementing Collections, which is the active pursuit of additional objects to be included in a collection; Classification, defined as the gathering of descriptive metadata related to objects in a collection (Ridge's 'Tagging' is a subset of this); Co-curation, which is using the inspiration/expertise of non-professional curators to create (Web) exhibits (somewhat analogous to the co-created projects of Bonney et al., but more task-oriented); and Crowdfunding, or the collective cooperation of people who pool their money and other resources together to support efforts initiated by others.15 Ridge explicitly rejects crowdfunding as a component of crowd-sourcing.16
These typologies from the GLAM world perhaps represent best the different crowd-sourcing activities examined by the study, although such lists of categories do not reflect fully the complexity of the situations encountered. Instead, we propose a typology that is orientated along four distinct, although inter-dependent, facets, as described in 'Crowd-sourcing and research infrastructures' below.

Motivations

Motivations of participants

Overview

Most studies have concluded that crowd-sourcing contributors typically do not have a single motivation; our own survey indicated overwhelmingly (79%) that the contributors who responded have both personal and altruistic motivations. However, in many cases it is possible to identify a dominant motivating factor, which is almost always concerned directly with the activity's subject area. In an analysis of 207 forum posts and interview responses, for example, the Galaxy Zoo project found that the top motivations were an interest in astronomy (39%), a desire to contribute (13%) and a concern with the vastness of the universe (11%).17 A study of volunteers for the Florida Fish and Wildlife Conservation Commission's Nesting Beach Survey found that concern for turtle conservation was the overwhelming motivating factor.18 Moreover, studies of the motivations of the contributors to academic crowd-sourcing projects have emphasised personal interest in the subject area concerned, and the opportunities provided to exercise that interest and to engage with people who share it, without material benefit. Such interest is usually concerned with the outcome, but it can also be in the process, or some combination of both. For example, in her 2009 assessment of volunteers to the TROVE project, Holley notes that 'a large proportion was family history researchers', who were highly motivated and had 'a sense of responsibility towards other genealogists to help not only themselves but other people where possible'.19 In general, it may be said that research into crowd-sourcing motivations suggests a clear primary, although not exclusive, focus on the subject or activity area, and that motivations can be personal or altruistic, and extrinsic or intrinsic.

Rewards

For the most part, crowd-sourcing projects do not reward their contributors directly in material or professional terms, and conversely contributors to crowd-sourcing projects are not subject to discipline (in either sense) or sanction in the way that members of conventionally-configured research projects are. Indeed, it is clear that the motivations of participants in academic crowd-sourcing tend to be intrinsic to the activity. However, we may regard more indirect benefits as constituting a form of reward: the fulfilment of an interest in the subject; personal gains such as skills, experience or knowledge; some form of status; or a feeling of gratification. In our survey, contributors mentioned a number of skills gained, including general IT competencies, such as editing wikis and using Skype for distributed collaboration, as well as specialised skills such as TEI encoding. Many contributors gained domain knowledge, for example through the opportunity to edit historical documents (ships' histories) resulting from participation in the Old Weather project.
This project showed that the domain interests of the participants can differ from those of the project team, which in this case is solely interested in those parts of the documents being transcribed that relate to climate history,20 whereas several contributors became interested in the histories of individual ships, and in addressing niches of history that had been hitherto unexplored. Participants can also pick up a basic grounding in research methods of collation, synthesis and analysis in the area of interest to them.

Less concrete benefits also function as rewards. It was frequently noted that some form of 'feedback loop', through which a participant is informed that their contributions were correct and valuable, is a very important motivating factor for engaging with crowd-sourcing projects, and conversely that a lack of feedback can be very frustrating and discouraging to the participant. Feedback also plays a key role in building a sense of community, and making participants feel that they have a stake in the project. For complex tasks, feedback may also be a necessary part of improving volunteers' work practices, as in Transcribe Bentham.21 This feedback can be immediate and specific to an individual contribution — for example, participants in the British Library's Georeferencer project (BLG)22 could see the results of their work immediately — or it can be deferred and cumulative, for example by means of rankings.

Contributors may receive various 'social' rewards, for example through rankings, increased standing in the crowd-sourcing community, or (in the case of Galaxy Zoo) being credited and named in publications. Similarly, contributors may be subjected to social sanctions, such as banning (e.g. removal of pages or blocking of accounts on Wikipedia), which can adversely affect their reputation and enjoyment, and may even in rare cases reflect on their professional standing.

As well as simple feedback interactions between the project and an individual user, the ability to interact with other participants, for example via a project forum, is an extremely important motivation. Such project-based social networks are used both for 'exchanging chit-chat' and for discussing and sharing information on the practical and technical issues raised, and can foster a sense of community among the participants that can extend beyond the immediate activities of the project itself. A good example of this is the Old Weather forum,23 which contains exchanges among participants that are indicative of a high degree of collaborative, communal working in addressing problems that arise during the process. The importance of forums was also noted by participants in Transcribe Bentham and British Library Georeferencer.

Gamification

Some approaches have emphasised the importance of tasks being enjoyable, and have focused on the development of games for crowd-sourcing of different kinds. Prestnopnik and Crowston discuss the role of games, and in particular possible approaches to creating an application for crowd-sourced natural history taxonomy classification using design science.24 The Bodiam Castle project provides an example of the potential for games in the context of archaeological analysis of buildings, although this had a greater emphasis on visualisation than on competition.25 However, Prestnopnik and Crowston also note that 'gamification' can act as a disincentive to contributors who have expert knowledge or deep interest in the subject.26
Gamification can also be a barrier for users who simply want to engage with the assets or processes in question, and can trivialise the process of acquiring or processing data.27 In their analysis of The Bird Network project, in which participants gathered data about the use of bird-boxes by birds, which was then shared with the scientific team, Brossard et al. note that participants' interest in ornithology was likely to overshadow awareness of scientific process,28 and thus stymie efforts by the Lab to contribute to scientific awareness and education.29

Competition

Although very few participants in our survey admitted to being motivated by competition with each other, among those who attended our workshop competition featured strongly as a factor, although this should be qualified by the fact that those present tended to be 'super contributors', who are likely to feel more competitive than those in the 'long tail' of the crowd. For many projects it is possible to track individual participants' contributions and to acquire statistics on contributions, and in such cases projects can establish 'leader boards' indicating which participants have made the biggest contributions (in whatever terms the project is using). For example, the British Library's Georeferencer project displayed the handles of the users who processed the most maps, and the 'winner' was invited to meet the Library's head of cartography. The Old Weather project also encouraged competition by assigning roles to contributors based on the number of pages transcribed.

However, in order for competition to be a significant motivating factor, the tasks and their outcomes must be sufficiently quantifiable to allow mutual comparison; matters can become complex when tasks are not directly comparable. For example, in BLG some maps were more complex than others, and the team felt that this affected the meaningfulness of comparing the effort needed to georeference them. Where more creative or interpretive outputs are being created, this lack of commensurability is a still greater issue, and there may even be conflicts between outputs; simple rankings seem inappropriate to such scenarios. In any case, the encouragement of competition should not be at the cost of alienating potential participants who are not by nature competitive, nor of favouring speed and volume at the expense of quality and care. Indeed, competition can be defined not just in this quantitative sense; volunteers may compete to produce more high-quality work, although in the absence of metrics this can amount to competing only against oneself. Note also that competition is not incompatible with a sense of common purpose; for example, Old Weather participants often 'feel like part of the ship' on which they are working.

Motivations of academics

At least part of the success of Galaxy Zoo and other Zooniverse projects is that they catered to clear and present academic needs. In the case of Galaxy Zoo itself, the assets — photographs of galaxies — were far too numerous to be examined individually by any research team, and the task — the classification of those galaxies — was not one that could be performed by computer software, although for the most part it could be carried out by a person without specialist expertise.30 Quite simply, this is work that could not have been carried out without large-scale public engagement and participation.
Most cases where humanities academics have engaged with crowd-sourcing have been driven by specific research questions or the need for a particular resource. For example, the Transcribe Bentham project was motivated by the fact that 40,000 folios of Bentham's work were untranscribed, and thus these valuable primary sources were inaccessible to people researching eighteenth- or nineteenth-century thought.31 BLG was motivated by the desire to make its map collections more searchable and thus more exploitable. In Old Weather, researchers were motivated by the desire to be able to use information contained within the assets to explore historic weather patterns, although these motivations may not necessarily be shared by the participants.32 Although the research motivations are various, the key characteristic leading these projects to use crowd-sourcing is that each involves tasks that a computer could not carry out, and that a research team could do only with prohibitively large resources. Note, however, that during the initial six-month testing period of the project, the rate of volunteer transcription compared unfavourably with that of professional researchers,33 possibly due to the complexity of the material and the difficulty of Bentham's handwriting. There was also an extremely high moderation overhead, with significant staff time needed to validate the outputs and provide feedback to the contributors. Since then, the volunteer transcription rate has improved significantly, so there is potential for avoiding significant costs in the future.34 However, this example can serve as a warning against assumptions that crowd-sourcing provides free labour.

Other researchers, particularly those in the GLAM sector, see crowd-sourcing as a means of filling gaps in the coverage of their collections,35 as it can be an effective way of obtaining information about assets (or the assets themselves) to which only certain members of the public have access, for example through personal or family connections. However, in order to be usable for academic purposes, a degree of curation is required, and this may involve expert input.

It is clear that public engagement and community building is frequently an unintentional by-product of crowd-sourcing projects. In some cases it is seen as an explicit motivation, with the aim of encouraging public engagement with scholarly archives and research, and thus increasing the broader impact of academic research activities.36

Crowd-sourcing and research infrastructures

A conceptual framework for crowd-sourcing

One of the outcomes of our study is a typology for crowd-sourcing in the humanities, which brings together the earlier work cited above with the experiences and processes uncovered during the study. It does not seek to provide an alternative set of categories specifically for the humanities, in competition with those considered above. Rather, we propose a model for describing and understanding crowd-sourcing projects in the humanities by analysing them in terms of four key facets — asset type, process type, task type, and output type — and of the relationships between them, and in particular by observing how the applicable categories in one facet are dependent on those in other facets. Figure 1 illustrates the four facets and their interactions.

• A process is composed of tasks through which an output is produced by operating on an asset.
It is conditioned by the kind of asset involved, and by the questions that are of interest to project stakeholders (both organisers and volunteers) and can be answered, or at least addressed, using information contained in the asset.

• An asset refers to the content that is, in some way, transformed as a result of processing by a crowd-sourcing activity.

• A task is an activity that a project participant undertakes in order to create, process or modify an asset (usually a digital asset). Tasks can differ significantly as regards the extent to which they require initiative and/or independent analysis on the part of the participant, and the difficulty with which they can be quantified or documented. The task types were identified with the aim of categorising this complexity, and are listed below in approximately increasing order.

• The output is what is produced as the result of applying a process to an asset. Outputs can be tangible and/or measurable, but we make allowance also for intangible outcomes, such as awareness or knowledge etc.

Tables 1–4 list the categories that the study identified under each facet; these are based for the most part on an examination of existing crowd-sourcing practice, so it is to be expected that the lists will be extended and/or challenged by future work. Detailed descriptions of each category may be found in the report by Dunn and Hedges; 37 in the rest of this paper, we examine the framework specifically in relation to humanities research infrastructures.

From crowd-sourcing primitives to research infrastructures

Rather than attempting to map the elements of this crowd-sourcing framework to specific infrastructures or infrastructural components, we note instead that it may be thought of as a framework of 'primitives', in a sense analogous to that of 'scholarly primitives'. Scholarly primitives may be defined as 'basic functions common to scholarly activity across disciplines', 38 and they provide a conceptual framework for classifying scholarly activities. Given the diversity of humanities research, it is not surprising that there are various sets of candidates – in addition to Palmer et al. there are, for example, Unsworth, 39 Benardou et al. 40 and Anderson et al. 41 – and such a structure has in particular been used as a framework for conceptualising and developing infrastructure for supporting humanities research. 42 The process facet in particular may be regarded as providing a set of primitives in this sense, and the output type composite digital collection with multiple meanings may in particular be regarded as a form of humanities 'research object', in the sense used by Bechhofer et al. 43 and Blanke and Hedges. 44 Of course, the categorisation into primitives described above is quite different to those in the works cited; this is only to be expected, as it represents the activities of quite different stakeholders, namely interested members of the public rather than professional scholars (although of course one person can play different roles in different circumstances). In particular, there is a greater emphasis on creating or enhancing digital assets in some way, rather than using these assets in research (although again these activities can overlap).
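To make the shape of the framework concrete, the following sketch models the four facets and their relationships as simple Python types. This is our own illustration rather than part of the study's formal apparatus; the category values are those listed in Tables 1–4 at the end of this paper, abbreviated here for space, and the example facet assignment for Transcribe Bentham is ours rather than the study's own coding.

from dataclasses import dataclass, field
from enum import Enum

class ProcessType(Enum):   # full list in Table 1
    COLLABORATIVE_TAGGING = 'collaborative tagging'
    TRANSCRIBING = 'transcribing'
    GEOREFERENCING = 'georeferencing'
    TRANSLATING = 'translating'

class AssetType(Enum):     # full list in Table 2
    GEOSPATIAL = 'geospatial'
    TEXT = 'text'
    IMAGE = 'image'

class TaskType(Enum):      # Table 3, in approximately increasing order of initiative required
    MECHANICAL = 1
    CONFIGURATIONAL = 2
    EDITORIAL = 3
    SYNTHETIC = 4
    INVESTIGATIVE = 5
    CREATIVE = 6

class OutputType(Enum):    # full list in Table 4
    TRANSCRIBED_TEXT = 'transcribed text'
    ENHANCED_TEXT = 'enhanced text'
    METADATA = 'metadata'

@dataclass
class CrowdSourcingActivity:
    """A process, composed of tasks, operates on an asset to produce an output."""
    process: ProcessType
    asset: AssetType
    tasks: list = field(default_factory=list)  # TaskType values
    output: OutputType = None

# Illustrative facet assignment (our own, for demonstration only):
transcribe_bentham = CrowdSourcingActivity(
    process=ProcessType.TRANSCRIBING,
    asset=AssetType.TEXT,
    tasks=[TaskType.MECHANICAL, TaskType.EDITORIAL],
    output=OutputType.TRANSCRIBED_TEXT,
)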
For the remainder of this paper, we will look in more detail at each of the process types in turn, using specific examples examined by the study with a view to seeing how crowd-sourcing can contribute effectively to humanities research infrastructures.

COLLABORATIVE TAGGING

Collaborative tagging may be regarded as crowd-sourcing the organisation of information assets by allowing users to attach tags to those assets. Tags can be based on existing controlled vocabularies, but are more usually derived from free text supplied by the users themselves. Such 'folksonomies' are distinguished from deliberately designed knowledge organisation systems by the fact that they are self-organising, evolving and growing as contributors add new terms. It is possible to extract more formal vocabularies from folksonomies. 45 Collaborative tagging can result in two concrete outcomes: it can make a corpus of information assets searchable using keywords applied by the user pool, and it can highlight assets that have particular significance, as evidenced by the number of repeat tags they are accorded by the pool. Research in this area has examined the patterns and information that can be extracted from folksonomies. Golder found that patterns generated by collaborative tagging are, on the whole, extremely stable, meaning that minority opinions can be preserved alongside more highly replicated, and therefore mainstream, concentrations of tags. 46 Other research has shown that user-assigned tags in museums may be quite different from vocabulary terms assigned by curators, and that relating tags to controlled vocabularies can be very problematic, 47 although it could be argued that this allows works to be addressed from a different perspective than that of the museum's formal documentation. In any case, such approaches to knowledge organisation are likely to play a significant part in the organisation of humanities data in the future. An example is the BBC's YourPaintings project, 48 developed in collaboration with the Public Catalogue Foundation, which has amassed a collection of photographs of all paintings in public ownership in the UK. The public is invited to apply tags to these, which both improves discovery and enables the creation of an aggregation of specialised knowledge. A more complex example is provided by the Prism project. 49 Collaborative tagging typically assumes that the assets being tagged are themselves stable and clearly identifiable as distinct objects. Prism allowed readers to highlight significant areas of a text and apply tags to them, and thus build up a collective interpretation of the text. Unlike many humanities crowd-sourcing activities, such as transcribing texts according to well-defined procedures, which have identifiable completions, interpretation can go on indefinitely, and there are no right or wrong answers.

LINKING

Linking covers the identification and documentation of relationships (usually typed) between individual assets. Most commonly, this takes the form of linking via semantic tags, where the tags describe binary relationships, in which case it is analogous to collaborative tagging. In principle, this could also include the identification of n-ary relationships.

TRANSCRIBING

Transcribing is currently one of the most prominent areas of humanities crowd-sourcing, as it can be used to address a fundamental problem with digitisation, namely the difficulty of rendering handwriting into machine-readable form using current technology.
Typically, such transcription requires the human eye and, in many cases, human interpretation. In terms of our typology, the output of a transcribing process will typically be transcribed text. Two projects have contributed significantly to this prominence: Old Weather (OW) and Transcribe Bentham (TB). OW involved the transcription of ships' log-books held by The National Archives, in order to obtain access to the weather observations they contain, information that is of major significance for climate research. 50 TB encouraged volunteers to transcribe and engage with unpublished manuscripts by the philosopher and reformer Jeremy Bentham, by rendering them into text marked up using TEI XML. 51 The collaborative model needed for successful crowd-sourced transcription depends on the complexity of the source material. Complex material, such as these two cases, requires a high level of support, whether from the project team or a participant's peers. Simpler material is likely to require less support; for example, when transcribing the more structured data found in family records, 52 the information (text or integers) to be transcribed is presented to the user in small segments – e.g. names, dates, addresses – and transcription requires different cognitive processes that are less dependent on interaction with peers and experts. Note that this category includes marked-up transcriptions, e.g. using TEI XML, as well as simple transcription of characters. There will be a point, however, at which the addition of semantic mark-up will go beyond mere transcription, and will count as a form of collaborative tagging or linking, and the output will typically be enhanced text.

CORRECTING/MODIFYING CONTENT

While content is increasingly 'born digital', projects for digitising analogue material abound. Many mass-digitisation technologies, such as Optical Character Recognition (OCR) and speech recognition, can be error-prone, and any such enterprise needs to factor in quality control and error correction, which can make use of crowd-sourcing. The TROVE project, which produced OCR-ed scans of newspapers from the National Library of Australia, is an excellent example of this. 53 The volume of digitised material precluded the corrections being undertaken by the Library's own staff, and using uncorrected text would have significantly reduced the benefits of digitisation, as search capability would have been very restricted. Another potential application in this category is for correcting automated transcriptions of recorded speech, as such transcription is currently highly error-prone, with error rates of 30% or more. 54

RECORDING AND CREATING CONTENT

Processes in this category frequently deal with ephemera and intangible cultural heritage. The latter covers any cultural manifestation that does not exist in tangible form; typically, crowd-sourcing is used to document such heritage through a set of processes and tasks, resulting in some form of tangible output. The importance of preserving intangible cultural heritage has been recognised by the UN, 55 and the ways in which this can be documented and curated by distributed communities is an important area for future research. Frequently this takes the form of a cultural institution soliciting memories from the communities it serves, for example the Tenbury Wells Regal Cinema's Memory Reel project. 56 Such processes can incorporate a form of editorial control or post hoc digital curation, and their outputs can be edited into more formal publications.
Another example is the Scots Words and Place-names (SWAP) project, 57 which gathered words in Scots, determining which words were in current use and where/how they were used, with the ultimate aim of offering selected words for inclusion in the Scottish Language Dictionaries resource. 58 Candidate words were gathered via the project website as well as via social media – Facebook in particular was an important venue for developing conversations around the material – and words that the project felt were suitable were passed to lexicographers for further scrutiny. By ephemera, we understand cultural objects that are tangible, but are at risk of loss because of their transitory nature, for example home videos or personal photographs. 59 There are a number of projects addressing such assets, for example the Europeana 1914-1918 project, 60 which is collecting digitised personal artefacts relating to the First World War. The ubiquity of the Web, and access to content creation and digitisation technologies, has led to the creation of non-professionally curated online archives. These have a clear role to play in enriching, augmenting and complementing collections held by memory institutions, and in developing curatorial narratives independent from those of library and archive professionals. 61 Processes in this category are also likely to have elements of the 'social engagement' model, in terms of Holley's distinction. 62

COMMENTING, CRITICAL RESPONSES AND STATING PREFERENCES

Processes of this type are likely to count as crowd-sourcing only if there is some specific purpose around which people come together. One example of this is the Shakespeare's Global Communities project, 63 which captured audience responses to the 2012 World Shakespeare Festival, with the aim of investigating how 'social networking technologies reshape the ways in which diverse global communities connect with one another around a figure such as Shakespeare'. 64 The question provides a focus for the activity, which, although not itself producing an academic output, provides a dataset for addressing research questions on the modern reception of Shakespeare. Appropriately managed blogs can provide a platform for focused scholarly interactions of this type. For example, a review by Sonia Massai of King Lear on the Year of Shakespeare site attracted controversial responses, leading to an exchange about critical methods as well as content. 65 What differentiates such exchanges from amateur blogging is the scholarly focus and context provided by the project, and its proactive directing of content creation. The project thus provides a tangible link between the crowd and the subject.

CATEGORISING

Categorising involves assigning assets to predefined categories; it differs from collaborative tagging in that the latter is unconstrained.

CATALOGUING

Cataloguing – or the creation of structured, descriptive metadata – is a more open-ended process than categorising, but is nevertheless constrained to following accepted metadata standards and approaches. It frequently includes categorising as a sub-activity, e.g. by LoC subject headings. Cataloguing is a time- and resource-consuming process for many GLAM institutions, and crowd-sourcing has been explored as a means of addressing this. For example, the What's the Score project at the Bodleian investigated a cost-effective approach to increasing access to music scores from their collections through a combination of rapid digitisation and crowd-sourcing descriptive metadata. 66
Cataloguing is related to contextualising, as ordering, arraying and describing assets will also make explicit some of their context.

CONTEXTUALISING

Contextualising is typically a more broadly-conceived activity than the related process types of cataloguing or linking, and it involves enriching an asset by adding to it or associating with it other relevant information or content.

GEOREFERENCING

Georeferencing is the process of establishing the location of un-referenced geographical information in terms of a modern coordinate system such as latitude and longitude. Georeferencing can be used to enrich geospatial assets – datasets or texts, including maps, gazetteers or travelogues, that refer to locations on the earth's surface – that do not include such explicit information. A major example of crowd-sourcing activity in this area is the British Library Georeferencer project, which aimed to 'geo-enable' historical maps in its collections by asking participants to assign spatial coordinates to digitised map images, a task that would have been too labour-intensive for Library staff to undertake themselves. Once georeferenced, the digitised maps are searchable geographically due to the inclusion of latitude and longitude coordinates in the metadata. 67

MAPPING

Mapping (in the sense of this typology) refers to the process of creating a spatial representation of some information asset(s). This could involve the creation of map data from scratch, but could also be applied to the spatial mapping of concepts, as in a 'mind map'. The precise sense will depend on the asset type to which mapping is being applied. There is an important distinction between maps and related geospatial assets created by expert organisations, such as the Ordnance Survey, and those created by community-based initiatives. The former may have the authority of a governmental imprimatur, and the distinction of official endorsement. However, the recent emergence of crowd-sourced geospatial assets – a product of the recent global growth in the ownership of hand-held devices with the ability to record location using GPS 68 – has led to the emergence of resources such as Open Street Map, 69 which has in turn led to a discussion about the reliability of such resources. In general, it has been found that Open Street Map in particular is extremely reliable, 70 but that the specifications for such resources must be carefully defined. 71 The impact of Open Street Map on the cartographic community generally has been noted. 72 The importance of mapping as a means of conveying spatial significance means that this kind of asset is particularly open to different discourses, and possibly conflicting narratives. The digital realm, with its potential for accommodating multiple, diverse contributions and interpretations, holds great potential for such material. 73

TRANSLATING

This covers the translation of content from one language to another. In many cases, a crowd-sourced translation will require a strongly collaborative element if it is to be successful, given the semantic interdependencies that can occur between different parts of a text. However, in cases where a large text can be broken up naturally into smaller pieces, a more independent mode of work may be possible; for example, Suda On-Line, 74 which is translating the entries in a 10th-century Byzantine lexicon/encyclopaedia.
A more modern, although non-academic, example is the phenomenon of 'fansubbing', where enthusiasts provide subtitles for television shows and other audiovisual material. 75

Conclusions

One of the main conclusions of our study is that research involving humanities crowd-sourcing can best be framed and understood through an analysis in terms of four fundamental facets – asset type, process type, task type, and output type – and of the relationships between them. Depending on the activity in question, and what it aims to do, some categories, or indeed some facets, will have primacy. Outputs might be original knowledge, or they might be more ephemeral and difficult to identify: however, considering the processes of both knowledge and resource creation as comprising these four facets gives a meaningful context to every piece of research, publication and activity we have uncovered in the course of this review. We hope the lessons and good practice we have identified here will, along with this typology, contribute to the development of new kinds of humanities crowd-sourcing in the future. Significantly, we have determined that most humanities scholars that have used crowd-sourcing as part of some research activity agree that it is not simply a form of 'cheap labour' for mass digitisation or resource enhancement; indeed, in a narrowly cost-benefit sense it does not always compare well with more conventional mechanisms of digitisation. In this sense, it has truly left its economic roots, as defined by Howe (2006), behind. The creativity, enthusiasm and alternative foci that communities outside the academy can bring to academic research are a resource that is now ripe for tapping, and the examples above illustrate the rich variety of forms that this tapping can take. We have noted the similarity between some aspects of our typology and the concept of the 'scholarly primitive', which has proved valuable in humanities e-research for providing a conceptual framework of fundamental building blocks for describing scholarly activities and modelling putative research infrastructures for the humanities. We have used this relationship to investigate how crowd-sourcing activities falling under various process types can contribute effectively to such research infrastructures.

Acknowledgements and additional information

A list of the projects investigated by the study, and a description of the survey (including the questions and a summary of the results), may be found in Appendices B and A, respectively, of Dunn and Hedges (2012). The project website is at http://humanitiescrowds.org, and additional information (in 'raw' form) from the workshops organised as part of the study may be found at http://humanitiescrowds.org/wp-uploads/2012/09/workshop_report1.pdf. We are very grateful to all those who have shared their knowledge and experience with us during the study, and in particular those who agreed to be interviewed, or participated in the workshops, or provided feedback on the project report.

1 We follow the convention of hyphenating 'crowd-sourcing'; other authors use 'crowdsourcing' or 'crowd sourcing'. In quotations, we preserve the original form.

2 T. Blanke, M. Bryant, M. Hedges, A. Aschenbrenner and M. Priddy, 'Preparing DARIAH', 7th IEEE International Conference on e-Science, Stockholm, Sweden (2011), 158-165, http://dx.doi.org/10.1109/eScience.2011.30.

3 J. Howe, 'The rise of crowdsourcing', Wired, 14.06 (2006), http://www.wired.com/wired/archive/14.06/crowds.html.
4 J. Silvertown, 'A new dawn for citizen science', Trends in Ecology & Evolution, 24, No. 9 (2009), 467-71. D. P. Anderson, J. Cobb, E. Korpela, M. Lebofsky and D. Werthimer, 'SETI@home: an experiment in public-resource computing', Communications of the ACM, 45, Issue 11 (2002), 56-61.

5 J. Surowiecki, The Wisdom of Crowds: Why the Many Are Smarter than the Few (2004).

6 D. Brabham, 'Crowdsourcing as a model for problem solving: an introduction and cases', Convergence: The International Journal of Research into New Media Technologies, 14, Issue 1 (2008), 75-90.

7 R. Holley, 'Crowdsourcing: how and why should libraries do it?', D-Lib Magazine, 16, No. 3/4 (2010), http://www.dlib.org/dlib/march10/holley/03holley.html.

8 A. Wiggins and K. Crowston, 'From conservation to crowdsourcing: a typology of citizen science', System Sciences (HICSS), 2011 44th Hawaii International Conference, http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5718708.

9 http://www.trevorowens.org/2012/05/the-crowd-andthe-library

10 E. Estellés-Arolas and F. González-Ladrón-de-Guevara, 'Towards an integrated crowdsourcing definition', Journal of Information Science, 38, No. 2 (2012), 189-200.

11 A. Wiggins and K. Crowston, 'From conservation to crowdsourcing: a typology of citizen science'.

12 R. Bonney, H. Ballard, R. Jordan, E. McCallie, T. Phillips, J. Shirk and C. Wilderman, Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education, Center for Advancement of Informal Science Education, Washington D.C. (2009), http://caise.insci.org/uploads/docs/PPSR%20report%20FINAL.pdf.

13 http://openobjects.blogspot.co.uk/2012/06/frequently-asked-questions-about.htm

14 J. Oomen and L. Aroyo, 'Crowdsourcing in the cultural heritage domain: opportunities and challenges', Proceedings of the 5th International Conference on Communities and Technologies (2011), 138-149, http://www.cs.vu.nl/~marieke/OomenAroyoCT2011.pdf.

15 A. Agrawal, C. Catalini and A. Goldfarb, 'The geography of crowdfunding', NET Institute Working Paper Series, 10-8 (2011), 1-57, http://ssrn.com/abstract=1692661.

16 http://openobjects.blogspot.co.uk/2012/06/frequently-asked-questions-about.htm

17 M. J. Raddick, G. Bracey, P. L. Gay, C. J. Lintott, P. Murray, K. Schawinski, A. S. Szalay and J. Vandenberg, 'Galaxy Zoo: exploring the motivations of citizen science volunteers', Astronomy Education Review, 9 (2010), http://aer.aas.org/resource/1/aerscz/v9/i1/p010103_s1.

18 B. M. Bradford and G. D. Israel, 'Evaluating volunteer motivation for sea turtle conservation in Florida', Agricultural Education (2004), 1-9.

19 R. Holley, Many hands make light work: public collaborative OCR text correction in Australian historic newspapers, National Library of Australia (2009), http://www.nla.gov.au/ndp/project_details/documents/ANDP_ManyHands.pdf.

20 http://crowds.cerch.kcl.ac.uk/wp-uploads/2012/04/Brohan.pdf

21 T. Causer, J. Tonra and V. Wallace, 'Transcription maximized; expense minimized? crowdsourcing and editing The Collected Works of Jeremy Bentham', Literary and Linguistic Computing, 27, Issue 2 (2012), 1-19. Similar conclusions were drawn by the authors of the current article, based on their interviews with staff and volunteers from the Old Weather project and the British Library's Georeferencer project.

22 http://www.bl.uk/maps/

23 http://forum.oldweather.org
24 N. R. Prestopnik and K. Crowston, 'Gaming for (citizen) science: exploring motivation and data quality in the context of crowdsourced science through the design and evaluation of a social-computational system', Proceedings of the 'Computing for Citizen Science' workshop at the 7th IEEE eScience Conference (2011), http://crowston.syr.edu/sites/crowston.syr.edu/files/gamingforcitizenscience_ver6.pdf.

25 http://crowds.cerch.kcl.ac.uk/wp-uploads/2012/04/Masinton.pdf

26 N. R. Prestopnik and K. Crowston, 'Gaming for (citizen) science: exploring motivation and data quality in the context of crowdsourced science through the design and evaluation of a social-computational system' (2011).

27 See http://blog.tommorris.org/post/3216687621/im-not-an-experience-seeking-user-im-a for a combative assertion of this position.

28 D. Brossard, B. Lewenstein and R. Bonney, 'Scientific knowledge and attitude change: the impact of a citizen science project', International Journal of Science Education, 27, Issue 9 (2005), 1029-1121.

29 D. J. Trumbull, R. Bonney, D. Bascom and A. Cabral, 'Thinking scientifically during participation in a citizen-science project', Science Education, 84, Issue 2 (1999), 265-275.

30 C. J. Lintott, K. Schawinski, A. Slosar, K. Land, S. Bamford, D. Thomas, M. J. Raddick, R. Nichol, A. Szalay, D. Andreescu, P. Murray and J. Vandenberg, 'Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey', Monthly Notices of the Royal Astronomical Society, 389, Issue 3 (2008), 1179-1189.

31 http://humanitiescrowds.org/wp-uploads/2012/04/Causer.pdf

32 http://humanitiescrowds.org/wp-uploads/2012/04/Brohan.pdf

33 T. Causer, J. Tonra and V. Wallace, 'Transcription maximized; expense minimized? crowdsourcing and editing The Collected Works of Jeremy Bentham' (2012).

34 T. Causer and V. Wallace, 'Building a volunteer community: results and findings from Transcribe Bentham', Digital Humanities Quarterly, 6, No. 2 (2012), http://www.digitalhumanities.org/dhq/vol/6/2/000125/000125.html.

35 M. Terras, 'Digital curiosities: resource creation via amateur digitisation', Literary and Linguistic Computing, 25, No. 4 (2010), 425-438, doi:10.1093/llc/fqq019.

36 M. Moyle, J. Tonra and V. Wallace, 'Manuscript transcription by crowdsourcing: Transcribe Bentham', Liber Quarterly - The Journal of European Research Libraries, 20, Issue 3/4 (2011).

37 S. Dunn and M. Hedges, 'Crowd-sourcing scoping study: engaging the crowd with humanities research', Arts and Humanities Research Council report (2012), http://humanitiescrowds.org/wp-uploads/2012/12/Crowdsourcing-connected-communities.pdf.

38 C. L. Palmer, L. C. Teffeau and C. M. Pirmann, 'Scholarly information practices in the online environment: themes from the literature and implications for library service development' (2009).

39 J. Unsworth, 'Scholarly primitives: what methods do humanities researchers have in common, and how might our tools reflect this', 'Humanities Computing, Formal Methods, Experimental Practice' Symposium, King's College London (2000), http://people.lis.illinois.edu/~unsworth/Kings.5-00/primitives.html.

40 A. Benardou, P. Constantopoulos, C. Dallas and D. Gavrilis, 'Understanding the information requirements of arts and humanities scholarship', International Journal of Digital Curation, 5, No. 1 (2010), 18-33.
41 S. Anderson, T. Blanke and S. Dunn, 'Methodological commons: arts and humanities e-science fundamentals', Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 368, No. 1925 (2010), 3779-3796.

42 T. Blanke and M. Hedges, 'Scholarly primitives: building institutional infrastructure for humanities e-science', Future Generation Computer Systems, 29, Issue 2 (2013), 654-661, http://dx.doi.org/10.1016/j.bbr.2011.03.031.

43 S. Bechhofer, I. Buchan, D. De Roure, P. Missier, J. Ainsworth, J. Bhagat, P. Couch, D. Cruickshank, M. Delderfield, I. Dunlop, M. Gamble, D. Michaelides, S. Owen, D. Newman, S. Sufi and C. Goble, 'Why linked data is not enough for scientists', Future Generation Computer Systems, 29, Issue 2 (2013), 599-611, http://dx.doi.org/10.1016/j.future.2011.08.004.

44 T. Blanke and M. Hedges, 'Scholarly primitives: building institutional infrastructure for humanities e-science' (2013).

45 H. Lin and J. Davis, 'Computational and crowdsourcing methods for extracting ontological structure from folksonomy', The Semantic Web: Research and Applications, Lecture Notes in Computer Science, 6089 (2010), 472-477, DOI:10.1007/978-3-642-13489-0_46.

46 S. Golder, 'Usage patterns of collaborative tagging systems', Journal of Information Science, 32, Issue 2 (2006), 198-208.

47 J. Trant, Tagging, Folksonomy, and Art Museums: Results of steve.museum's research (2009), http://conference.archimuse.com/blog/jtrant/stevemuseum_research_report_available_tagging_fo; J. Trant, D. Bearman and S. Chun, 'The eye of the beholder: steve.museum and social tagging of museum collections', Proceedings of the International Cultural Heritage Informatics Meeting (ICHIM07), Toronto, Canada (2007).

48 http://www.bbc.co.uk/arts/yourpaintings/

49 http://www.scholarslab.org/category/praxis-program/

50 P. Brohan, R. Allan, J. E. Freeman, A. M. Waple, D. Wheeler, C. Wilkinson and S. Woodruff, 'Marine observations of old weather', Bulletin of the American Meteorological Society, 90, Issue 2 (2009), 219-230.

51 T. Causer, J. Tonra and V. Wallace, 'Transcription maximized; expense minimized? crowdsourcing and editing The Collected Works of Jeremy Bentham' (2012).

52 For example, http://www.familysearch.org

53 R. Holley, Many hands make light work: public collaborative OCR text correction in Australian historic newspapers (2009).

54 M. Wald, 'Crowdsourcing correction of speech recognition captioning errors', Proceedings of the International Cross-Disciplinary Conference on Web Accessibility - W4A '11 (2011), http://eprints.soton.ac.uk/272430/1/crowdsourcecaptioningw4allCRv2.pdf.

55 R. Kurin, 'Safeguarding intangible cultural heritage in the 2003 UNESCO convention: a critical appraisal', Museum International, 56, Issue 1-2 (2004), 66-77.

56 http://www.regaltenbury.org.uk/memory-reel/

57 C. Hough, E. Bramwell and D. Grieve, Scots Words and Place-Names Final Report, JISC (2011), http://www.jisc.ac.uk/media/documents/programmes/digitisation/swapfinalreport.pdf. See also http://swap.nesc.gla.ac.uk/.

58 http://www.scotsdictionaries.org.uk/

59 This usage differs from the standard usage of the term by museums.

60 http://www.europeana1914-1918.eu/en/contributor

61 M. Terras, 'Digital curiosities: resource creation via amateur digitisation' (2010).

62 R. Holley, 'Crowdsourcing: how and why should libraries do it?' (2010).

63 www.yearofshakespeare.com

64 http://humanitiescrowds.org/wp-uploads/2012/09/workshop_report1.pdf

65 http://bloggingshakespeare.com/year-of-shakespeare-king-lear-at-the-almeida

66 http://www.whats-the-score.org; http://scores.bodleian.ox.ac.uk
67 C. Fleet, K. C. Kowal and P. Pridal, 'Georeferencer: crowdsourced georeferencing for map library collections', D-Lib Magazine, 18, No. 11/12 (2012), http://www.dlib.org/dlib/november12/fleet/11fleet.html.

68 M. Goodchild, 'Editorial: citizens as voluntary sensors: spatial data infrastructure in the world of Web 2.0', International Journal of Spatial Data Infrastructures Research, 2 (2007), 24-32.

69 http://www.openstreetmap.org/

70 M. Haklay and P. Weber, 'OpenStreetMap: user-generated street maps', Pervasive Computing, IEEE, 7, Issue 7 (2008), 12-18. M. Haklay, 'How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets', Environment and Planning B: Planning and Design, 37, Issue 4 (2010), 682-703.

71 C. Brando and B. Bucher, 'Quality in user generated spatial content: a matter of specifications', Proceedings of the 13th AGILE International Conference on Geographic Information Science, Guimarães, Portugal (2010), 1-8.

72 S. Chilton, 'Crowdsourcing is radically changing the geodata landscape: case study of OpenStreetMap', Proceedings of the 24th International Cartographic Conference, Santiago, Chile (2009), http://w.icaci.org/files/documents/ICC_proceedings/ICC2009/html/nonref/22_6.pdf.

73 C. Fink, 'Mapping together: on collaborative implicit cartographies, their discourses and space construction', Journal for Theoretical Cartography, 4 (2011), 1-14. M. Graham, 'Neogeography and the palimpsests of place: Web 2.0 and the construction of a virtual earth', Tijdschrift voor Economische en Sociale Geografie, 101, Issue 4 (2010), 422-436.

74 http://www.stoa.org/sol/

75 J. D. Cintas and P. M. Sanchez, 'Fansubs: audiovisual translation in an amateur environment', Journal of Specialised Translation, 6 (2006), 37-52.

Figure 1: Typology framework

Table 1: Process Types — Collaborative tagging; Linking; Correcting/modifying content; Transcribing; Recording and creating content; Commenting, critical responses and stating preferences; Categorising; Cataloguing; Contextualisation; Mapping; Georeferencing; Translating.

Table 2: Asset Types — Geospatial; Text; Numerical or statistical information; Sound; Image; Video; Ephemera and intangible cultural heritage.

Table 3: Task Types — Mechanical; Configurational; Editorial; Synthetic; Investigative; Creative.

Table 4: Output Types — Original text; Transcribed text; Corrected text; Enhanced text; Transcribed music; Metadata; Structured data; Knowledge/awareness; Funding; Synthesis; Composite digital collection with multiple meanings.

work_asoxottesnejraao7uisnuotsy ---- humanities Article
Digital Humanities' Shakespeare Problem
Laura Estill
Department of English, St. Francis Xavier University; P.O. Box 5000, Antigonish, NS B2G 2W5, Canada; lestill@stfx.ca
Received: 28 January 2019; Accepted: 23 February 2019; Published: 4 March 2019

Abstract: Digital humanities has a Shakespeare problem; or, to frame it more broadly, a canon problem. This essay begins by demonstrating why we need to consider Shakespeare's position in the digital landscape, recognizing that Shakespeare's prominence in digital sources stems from his cultural prominence. I describe the Shakespeare/not Shakespeare divide in digital humanities projects and then turn to digital editions to demonstrate how Shakespeare's texts are treated differently from his contemporaries—and often isolated by virtue of being placed alone on their pedestal.
In the final section, I explore the implications of Shakespeare's popularity to digital humanities projects, some of which exist solely because of Shakespeare's status. Shakespeare's centrality to the canon of digital humanities reflects his reputation in wider spheres such as education and the arts. No digital project will offer a complete, unmediated view of the past, or, indeed, the present. Ultimately, each project implies an argument about the status of Shakespeare, and we—as Shakespeareans, early modernists, digital humanists, humanists, and scholars—must determine what arguments we find persuasive and what arguments we want to make with the new projects we design and implement.

Keywords: digital humanities; Shakespeare; early modern drama; literary canon; English literature; Renaissance

1. Introduction

Digital humanities has a Shakespeare problem; or, to frame it more broadly, a canon problem. Too many digital projects and sites focus on Shakespeare alone. Some sites highlight Shakespeare to the exclusion of other writers; other projects set their bounds at Shakespeare and "not Shakespeare". Digital humanities' Shakespeare problem both stems from and reifies Shakespeare's centrality to the canon of English literature. While this problem is, indeed, a digital humanities problem, it is also a problem in the arts and humanities more generally. Shakespeare is one of the few writers regularly featured in single-author undergraduate courses (alongside, perhaps, Chaucer, Milton, and Austen, albeit to a lesser extent). Shakespeare's works are so often produced on the twenty-first century stage that American Theatre excludes Shakespeare from their annual list of top-produced American plays in order to "make more room on our list for everyone and everything else" (Tran 2018). Digital humanities has often been heralded as the solution to the canonicity problem, but that is a great burden that it cannot bear alone.

This essay begins by demonstrating why we need to consider Shakespeare's position in the digital landscape, recognizing that Shakespeare's prominence in digital sources stems from his cultural prominence. I describe the Shakespeare/not Shakespeare divide in digital humanities projects and then turn to digital editions to demonstrate how Shakespeare's texts are treated differently from his contemporaries—and often isolated by virtue of being placed alone on their pedestal. In the final section, I explore the implications of Shakespeare's popularity to digital humanities projects, some of which exist solely because of Shakespeare's status. Shakespeare's centrality to the canon of digital humanities reflects his reputation in wider spheres such as education and the arts. No digital project will offer a complete, unmediated view of the past, or, indeed, the present. Ultimately, each digital
Although the definition of digital humanities (and perhaps even the definition of Shakespeare) is subject to disagreement, for this essay, I limit my scope to digital humanities resources for pedagogy and research. This excludes games such as Richard III Attacks! (P. 2015), online performances such as Such Tweet Sorrow (Silbert 2010), and social media hashtags like #ShakespeareSunday. Cultural studies often informs New Media Shakespeare scholarship to show Shakespeare’s continued prominence online (see O’Neill 2015 for an overview): consider recent issues of Shakespeare Quarterly (Rowe 2010) and Borrowers and Lenders (Calbi and O’Neill 2016) on this topic. Stephen O’Neill (2018), drawing on Douglas Lanier’s notion of “Shakespearean rhizomatics” (Lanier 2014), equates “Our contemporary Shakespeares” to “digital Shakespeares”, describing both as “fully rhizomatic in their extraordinary and seemingly endless flow of relations.” Christy Desmet suggests that we need to encounter all digital Shakespeares (both digital humanities and new media) through the lens of Ian Bogost’s “alien phenomenology” (Bogost 2012), considering “material objects and networks as models for posthuman relations” (Desmet 2017, p. 5). Although Digital Humanities and New Media are often paired, for the purpose of this essay it is useful to differentiate the two: new media endeavors that participate in or create digital culture versus digital humanities projects that announce themselves as contributing to our general and scholarly knowledge. This article focuses on digital humanities projects for two reasons: first, as one way of limiting the scope of the “seemingly endless flow of relations” in Digital Shakespeares, and second, because the majority of digital humanities projects exist primarily to educate rather than to entertain. Digital humanities projects provide the resources we use to study and teach the early modern period: digital editions, bibliographies, digitizations, catalogs, and more. Often, digital humanities projects are expanded from earlier print resources: consider, for instance, the online English Short Title Catalogue (British Library 2006) and its print antecedents, the short-title catalogs by Pollard and Redgrave (1926) and Donald Goddard Wing (1945). Nondigital scholarly resources frequently skew towards Shakespeare; even the library catalogs we use to access archival resources are not neutral and emphasize Shakespeare above his contemporaries (Estill 2019a). Many digital humanities resources replicate this Shakespeare-centric focus, and, as such, misrepresent the materials they provide or offer a skewed perspective on early modern literature, theatre, and culture. Biased sources can only lead to biased scholarship; and while some professors will be able to see the biases of the sites they visit, many students will not. This is particularly problematic because, as Christie Carson and Peter Kirwan explain, “Students are some of the key ‘users’ of digital Shakespeare” (Carson and Kirwan 2014a, p. 244). It has been well-documented that major digital literary studies projects often focus on canonical authors. 
There is excellent work on the biases of digital humanities projects, particularly in relation to the status of women writers (see, for instance, Wernimont and Flanders 2010; Mandell 2015; Bergenmar and Leppänen 2017) and the canon of American literature (Earhart 2012; Price 2009), yet comparatively few scholars have critiqued how digital humanities overrepresents perhaps the most canonical figure in all of English literature: Shakespeare. "Shakespeare and Digital Humanities" has been and continues to be a fruitful area of research, with special issues of Shakespeare (Galey and Siemens 2008), the Shakespearean International Yearbook (Hirsch and Craig 2014), RiDE: Research in Drama Education (Bell et al., forthcoming), and this issue of Humanities. The prevalence of digital humanities tools in Shakespeare teaching and research leads Carson and Kirwan to wonder, "are all Shakespeares digital now?" (Carson and Kirwan 2014a, p. 240). The questions less often asked are: when we focus on Shakespeare(s) in our digital projects, what is excluded by our Shakespeare-centrism? And how does that shape how we access and understand early modern drama? Digital Shakespeare studies often focuses on Shakespeare's place in the digital world, without questioning why he is given such primacy and the ramifications of his continued canonization.

A decade ago, Matthew Steggle (2008) showcased how digital projects were "developing a canon" of early modern literature. Building on the "interrelated cycles" that Gary Taylor identified as supporting Shakespeare's centrality to textual studies, Brett Greatley-Hirsch describes "the long shadows cast by the cultural, scholarly, and economic investments in Shakespeare" (Hirsch 2011, p. 569), specifically as it pertains to digital editions of early modern plays. This essay furthers the work by Steggle, Greatley-Hirsch, and others by arguing that we must continually assess the landscape of digital projects available for teaching and researching the early modern period in order to understand and shape the future of the field.

As the argument goes, traditional anthologies and resources are constricted by page counts and other limited resources, unlike digital projects, which can be democratizing due to their lack of—or, more realistically, different—limitations. In that vein, Neil Fraistat, Steven E. Jones, and Carl Stahmer (Fraistat et al. 1998, p. 2) suggest that "one of the strengths of Web publishing is that it facilitates—even favors—the production of editions of texts and resources of so-called non-canonical authors and works." Earhart (2015, esp. chp. 3), however, traces the familiar pattern of discovery then loss for noncanonical writers: their work is digitized, declared as recuperated, and then the site disappears. Another way digital humanities has been announced to recover noncanonical writers is by projects that digitize on a large scale. Julia Flanders (2009) explains:

It is now easier, in some contexts, to digitize an entire library collection than to pick through and choose what should be included and what should not: in other words, storage is cheaper than decision-making. The result is that the rare, the lesser known, the overlooked, the neglected, and the downright excluded are now likely to make their way into digital library collections, even if only by accident.
Indeed, it is the decision-making where Shakespeare too often gets pulled artificially to the fore: sometimes even in the foundational decisions about project scope. The next section of the essay explores how single authors are represented in small-scale digital resources versus large-scale digital resources, thinking about them in terms of labor, funding, and project scope.

2. The Shakespeare/Not Shakespeare Divide in Digital Humanities Resources

There is a lopsidedness to early modern online resources: some, such as the English Short Title Catalog (ESTC; British Library 2006) and the Database of Early English Playbooks (DEEP; Lesser and Farmer 2007) deliver breadth of coverage that is, due to their large scope, necessarily shallow; others, such as The Shakespeare Quartos Archive (Bodleian Library 2009) or MIT's Global Shakespeares (Donaldson 2009), provide deep coverage of a much narrower topic. Both approaches are needed to support different avenues of early modern scholarship, but the latter, I contend, too often begins and ends with Shakespeare.

The logistical reasons for these very different kinds of projects (broad coverage versus deep coverage) are readily apparent. The notion of "Shakespeare" offers a convenient scope and bounds for a given project. Many projects that include detailed metadata, extensive editorial annotation or encoding, expensive-to-create facsimiles, or streaming media center on the work of a single author. The Pulter Project (Knight and Wall 2018), for instance, is an example of a new project that focuses on a single author, and, indeed, a single manuscript, in order to offer a hypertext edition with multiple layers of editorial intervention, linked related texts, and comparative viewing options. The Digital Cavendish Project (Moore and Tootalian 2013) offers a range of ways to interact with Margaret Cavendish's life and texts: site visitors can explore Margaret Cavendish's social network, search the bibliography-in-progress of Cavendish scholarship, and make use of reference works such as a list of Cavendish's printers and booksellers and a spreadsheet locating all known copies of Cavendish's early publications. We can imagine extending these projects by adding another analysis section, another manuscript, or even another individual author. However, to extend these projects by any order of magnitude, by, say, covering all seventeenth-century women writers or all previously unpublished manuscript poetry would be to undertake significant amounts of labor and would require both time and money.
The Shakespeare Quartos Archive (Bodleian Library 2009) is an example of extraordinary attention to primary sources: the site's goal is to "reproduce at least one copy of every edition of William Shakespeare's plays printed in quarto before the theatres closed in 1642." Where possible, however, they include digitizations of as many copies of each Shakespeare quarto as possible. Their prototype offers thirty-two quartos of Hamlet (from Q1–Q5), carefully digitized and painstakingly encoded.1 With their attention to primary sources, the Shakespeare Quartos Archive project argues that scholars must pay attention to copy-specific details. The Shakespeare Quartos Archive text encoding highlights different marginalia in each copy, the binding, and even the library ownership stamps.2

While the Shakespeare Quartos Archive can be used as an exemplar of a "boutique" project, it is not the labor of a single scholar. This project emerged from the collaboration of multiple major institutions, including, most notably, the Bodleian Library of the University of Oxford, the British Library, the University of Edinburgh Library, the Folger Shakespeare Library, the Huntington Library, and the National Library of Scotland. The project was made possible by major grant funding from the United States's National Endowment for the Humanities (NEH) and the United Kingdom's Joint Information Systems Committee (JISC).

The well-supported Shakespeare Quartos Archive raises another reason for author-centric approaches, namely, existing funding models. As Jamie "Skye" Bianco (2012) explains, "digital humanities is directly linked to the institutional funding that privileges canonical literary and historiographic objects and narratives" (see also Price 2009). In her review, Desmet unpacks the project's "rationale for a focus on Shakespeare's quartos" (Desmet 2014, p. 143): the rarity and fragility of the material objects; their locations in libraries around the world; and the lack of Shakespearean manuscript texts. This rationale, while a compelling argument for why we need to digitize and encode all early modern play quartos, hardly touches on why Shakespeare is the focus of the project. We lack authorial manuscripts of many plays by many playwrights. The Shakespearean focus of the Shakespeare Quartos Archive is taken for granted. It is hard to imagine the Ford Quartos Archive receiving much enthusiasm from funders, despite the fact that John Ford's plays are still edited, anthologized, taught, and performed today. There are many ongoing editorial projects focused on individual early modern playwrights, such as Oxford University Press's The Complete Works of John Marston (Butler and Steggle, forthcoming); yet to imagine digitizing and encoding all known early printings of Marston's work for a Marston Quartos Archive seems far-fetched, and the notion of turning to an even less canonical playwright—say, the Glapthorne Quartos Archive—hardly bears thinking about. Shakespeare sells. Shakespeare's name is itself a valuable commodity (Hodgdon 1998; McLuskie and Rumbold 2014; Olive 2015).

1 Just as Digital Humanities has a Shakespeare problem, Shakespeare studies has a Hamlet problem, although the prominence of Hamlet in Shakespeare studies, both digital and otherwise, is a topic for another essay. For evidence of Hamlet's prominence, see Bernice W. Kliman et al.'s HamletWorks (Kliman et al. 2004) and Estill, Klyve, and Bridal (Estill et al. 2015).

2 The Shakespeare Quartos Archive uses the Text Encoding Initiative (TEI) for their XML (eXtensible Markup Language), which includes elements such as <add>, <binding>, and <fw> (form work, for running heads, as an example). For more on their detailed encoding, see Desmet 2014.
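To give a sense of what this level of encoding makes possible, here is a minimal Python sketch – our own illustration, not the Archive's code, and the file names are hypothetical – that counts copy-specific features such as marginal additions (<add>) and forme work (<fw>) across TEI transcriptions of different copies, the kind of copy-by-copy comparison the encoding is designed to support.

import xml.etree.ElementTree as ET
from pathlib import Path

TEI_NS = {'tei': 'http://www.tei-c.org/ns/1.0'}

def copy_specific_features(path):
    """Count elements recording copy-specific detail in one TEI transcription."""
    root = ET.parse(path).getroot()
    return {
        'marginal_additions': len(root.findall('.//tei:add', TEI_NS)),
        'forme_work': len(root.findall('.//tei:fw', TEI_NS)),
    }

# Hypothetical file names standing in for two copies of the same quarto.
for copy_file in ['hamlet_q1_huntington.xml', 'hamlet_q1_bl.xml']:
    if Path(copy_file).exists():
        print(copy_file, copy_specific_features(copy_file))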
Digital project funders and creators recognize its value as much as academic publishers who push for Shakespeare's name in book titles. Martin Mueller pointed to tenure and promotion as part of the reason for the scholarly focus on major, canonical plays. He asked, "You can see why professional scholars stay away from minor plays, unless they explicitly deal with hot topics. A play may interest them, but how will an entry about it look on a c.v.?" (Mueller 2014). While there is a wealth of valuable scholarship on minor plays, as Mueller points out, "the annual number of publications about Shakespeare dwarfs—by at least an order of magnitude—the number of publications about his contemporaries." Before scholars can achieve tenure and promotion, they must first land that tenure-track job, which, in many cases, means demonstrating that they can teach the single-author undergraduate Shakespeare course(s). Just as work on noncanonical playwrights can be met with institutional skepticism, digital humanities publication has tended to be undervalued by tenure and promotion committees, prompting scholarly bodies such as the Modern Language Association (MLA) to publish interventions like "Guidelines for Authors of Digital Resources" and "Guidelines for Evaluating Work in Digital Humanities and Digital Media" (MLA Committee on Information Technology 2012a, 2012b). For scholars creating digital projects, both funding and institutional structures of tenure and promotion can offer disincentives to go beyond Shakespeare.

Shakespeare is so privileged in early modern digital humanities projects that some projects market themselves as a corrective. Mueller's now-defunct Shakespeare His Contemporaries (Mueller 2016) described itself as "a project devoted to the collaborative curation of non-Shakespearean plays from Shakespeare's world."3 Despite offering a digital humanities project that recognizes and pushes back against Shakespeare's centrality to early modern drama studies, Shakespeare His Contemporaries's self-definition ("non-Shakespearean"), title (Shakespeare His Contemporaries), and scope ("Shakespeare's world") all gravitate around Shakespeare. This is hardly unique. Similarly, the "Beyond Shakespeare" project (a podcast and blog) has a Twitter bio announcing their interest in "anything but the Bard", just as their handle, @BeyondShakes, and the project title evokes his name (Crighton 2013). Andy Kesson, Lucy Munro, and Callan Davies's "Before Shakespeare" reveals valuable insights about mid-sixteenth century London theatres. In their article, "DH and Non-Shakespearean Theatre History", Davies and Kesson (forthcoming) explain how the digital components of their project are an integral part of their outreach mission:

The digital presence of "Before Shakespeare" is centered around showcasing various media at once: archives, discussion, videos, images, performance, and song—from Soundcloud to YouTube—to increase the visibility of non-Shakespearean drama and diversify its availability and appeal beyond printed editions and text.
Despite their non-Shakespearean focus, or, indeed, perhaps because of it, their project title, URL (beforeshakespeare.com), "About" description, and Twitter account similarly centralize Shakespeare in the literary canon, even while resisting this positioning. The "About" page explains, "Before Shakespeare is also the first project to take seriously the mid-century beginnings of those playhouses, seeing them as mid-Tudor and early Elizabethan phenomena rather than becoming distracted by the second generation of people working in the playhouses, the most famous of whom is William Shakespeare himself" (Kesson et al. 2016, "About"). Their Twitter avatar (@B4Shakes, as of January 2019) is a picture of Shakespeare himself, though with the word "before" covering his eyes and with his mouth silenced by a series of decorative fleurons. There hardly seems to be an elegant solution for digital projects designed to push attention away from Shakespeare. As the most recognizable literary figure from his day, it could be argued that a site designed to appeal to the general public would be remiss to avoid naming him: there is no need to turn him into he-who-must-not-be-named, giving the name of Shakespeare even more power. Furthermore, for a project aiming to reach "wider audiences within and beyond scholarship", name-dropping Shakespeare can be an effective way to attract people to their site and social media, which will then offer "a powerful advertisement for the force and fascination of currently 'non-canonical' plays" (Davies and Kesson, forthcoming).

3 Shakespeare His Contemporaries can be accessed on the Internet Archive's Wayback Machine by inserting its former URL, http://shakespearehiscontemporaries.northwestern.edu/shc. The Shakespeare His Contemporaries XML—itself created by improving the encoding provided by the Early English Books Online Text Creation Partnership, or EEBO-TCP—is preserved in the Folger's Digital Anthology of Early Modern English Drama (Brown et al. 2016).

Despite the potential for democratization or canon expansion, digital projects too often reify canon, even when they attempt to subvert it. Emma Smith (2017) describes how this effect is not limited to the digital: drawing on examples from scholarship, culture, and online, she shows how "attempts to decentre Shakespeare are thus often self-defeating." She continues, "Do we privilege Shakespeare above other writers? Self-evidently and self-fulfillingly so." Smith contends that "Shakespeare studies have begun to reflect on the conditions and consequences of their own cultural supremacy"; this article contributes to these ongoing reflections. Although Smith acknowledges the "cultural, theatrical and educational disadvantages of Shakespeare-centrism," she concludes by positioning Shakespeare as "the apex predator in a cultural ecosystem where he has no rivals, only prey," suggesting our focus on Shakespeare is somehow required for the metaphoric ecosystems of culture and scholarship. Digital projects, however, have the potential to go beyond this status quo, by, for instance, positioning Shakespeare alongside his contemporaries or by highlighting the historical moments that led to Shakespeare's current position as cultural touchstone.

3. Digital Editions and the Privileging of Shakespeare's Text

When we turn to digital editions, those digital humanities stalwarts, we see the same "not Shakespeare" construction of projects as detailed above.
For instance, Greatley-Hirsch's Digital Renaissance Editions was "inspired by the Internet Shakespeare Editions" (Greatley-Hirsch 2015, homepage). That is to say, an online edition of Shakespeare's works inspired a site whose aim is to offer "electronic scholarly editions of early English drama and texts of related interest, from late medieval moralities and Tudor interludes, occasional entertainments and civic pageants, academic and closet drama, and the plays of the commercial London theaters, through to the drama of the Civil War and Interregnum" for all authors except Shakespeare (Greatley-Hirsch 2015, homepage). These sibling projects only reinforce the divide between Shakespeare and not-Shakespeare. Shakespeare's central position in the canon becomes exceptional: he no longer falls under the umbrella of "Renaissance" or "early English drama." By excluding Shakespeare, Digital Renaissance Editions follows the tradition of printed non-Shakespearean anthologies, such as Arthur F. Kinney's Renaissance Drama (1999) and David Bevington's English Renaissance Drama (2002). With digital editions, however, this Shakespeare/not-Shakespeare gulf can be bridged, for instance, with a federated search interface. It would be wonderful to see, in the future, a new way to access Digital Renaissance Editions, the Internet Shakespeare Editions (Jenstad 2018), and the Queen's Men Editions (Ostovich 2006), all of which are built on the same platform, where users can easily compare content from across all three sites, perhaps searching for keywords across plays from all three. There is, of course, a value to maintaining each site separately: each project makes an argument about how we need to approach early modern drama. The Internet Shakespeare Editions includes much non-Shakespearean content, such as the full text and facsimiles of the play A Yorkshire Tragedy; however, the non-Shakespearean content is provided as context for our understanding of Shakespeare. A Yorkshire Tragedy is included in the Internet Shakespeare Editions because of its status as "almost Shakespeare": although now accepted as apocryphal, it was once attributed to Shakespeare and was published in the second imprint of the 1664 folio. Similarly, the Internet Shakespeare Editions includes an extract from Robert Greene's Selimus, because Jessica Slights deemed Greene's play a valuable intertext for her edition of Othello (Slights 2017). A Yorkshire Tragedy, Selimus, and other non-Shakespearean works on the site are categorized as "resources" (the last option from the top menu), whereas Shakespeare's plays and poems are the "texts" the Internet Shakespeare Editions foregrounds (the first option from the top menu). The Internet Shakespeare Editions guides users to approach all non-Shakespearean content through the lens of Shakespeare, first and foremost. The argument of Digital Renaissance Editions emerges to counter this overreliance on Shakespeare, yet ends up making Shakespeare conspicuous in his absence. As a digital edition based on the plays performed by a single playing company, the Queen's Men Editions argues for the value of performance and the importance of repertory-based studies not defined by authorship (Ostovich 2006, "The QME brand"). As Scott McMillin and Sally-Beth MacLean (McMillin and MacLean 1998), Lucy Munro (2009), and others have demonstrated, repertory studies is a valuable field that could be bolstered with even further digital editions organized by theatre company or playing space.
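To make the imagined federated keyword search concrete, here is a minimal sketch, under stated assumptions: TEI-encoded texts from the three sibling sites are presumed to have been gathered into a local directory named editions (hypothetical), the collection() filtering syntax follows Saxon's conventions, and the keyword is arbitrary. This illustrates the concept only; none of the three projects currently offers such an interface.

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">
  <!-- Keyword to search for across every play gathered from the three sites. -->
  <xsl:param name="keyword" select="'usurp'"/>
  <xsl:template name="federated-search">
    <results keyword="{$keyword}">
      <!-- 'editions' is a hypothetical local directory of TEI files harvested
           from the ISE, DRE, and QME; ?select=*.xml is Saxon's collection syntax. -->
      <xsl:for-each select="collection('editions?select=*.xml')">
        <xsl:if test="contains(lower-case(string-join(//tei:text, ' ')),
                               lower-case($keyword))">
          <!-- Report the file and its title, so that hits from all three
               sites appear in a single result list. -->
          <hit source="{document-uri(.)}"
               title="{(//tei:titleStmt/tei:title)[1]}"/>
        </xsl:if>
      </xsl:for-each>
    </results>
  </xsl:template>
</xsl:stylesheet>

A real shared interface would also need common metadata and the cooperation of all three editorial teams, which is exactly the kind of coordination at issue here.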
At this point, the ISE, DRE, and QME offer three sites, three goals, and three uneven slices of early modern drama. While "maintaining the integrity of [the] sites" and "eliminat[ing] confusion" about their roles and boundaries (Ostovich 2006) is important, there is still a place for a federated search that would allow users to approach the content on all three sites at once. Although this imagined federated search would, at this moment, be far from a universal view of early modern English drama, it could offer a more comprehensive overview than each site currently provides as they stand alone, connected for users only by the occasional hyperlink. Diane K. Jakacki's thoughtful description of the Internet Shakespeare Editions tagset, its relation to its sibling sites, and the potential of linked open data considers how "acts of editorial disruption" can "allow us to move forward toward infinity while maintaining editorial stability across digital projects" (Jakacki 2018, p. 158). As digital editions of early modern drama "move forward toward infinity", we must assess whether we want Shakespeare to be the default number one. The Folger Shakespeare Library has also published digital projects defined by the presence or lack of Shakespeare: Folger Digital Texts (Mowat et al. 2012) and the Digital Anthology of Early Modern English Drama (Brown et al. 2016). However, unlike the Internet Shakespeare Editions and its sister sites, the Folger sites provide edited texts without critical introductions or notes. Folger Digital Texts offers editions of Shakespeare; the Digital Anthology includes editions of, and bibliographic information about, as their homepage announces, "other plays from Shakespeare's time" (emphasis in the original). The Digital Anthology Frequently Asked Questions page anticipates that users will want to know "Where is Shakespeare? And how does this relate to him?" Their response runs, in full:
William Shakespeare's plays are not part of EMED, for a simple reason: EMED was conceived as a way of showcasing all of the other playwrights writing in England's early modern era. By bringing together their plays, however, EMED recreates the theater world that made possible Shakespeare's career and influenced his work. Shakespeare knew many of the earlier plays as an actor or audience member. He also collaborated and competed with some of the playwrights. He directly influenced others. To read Shakespeare's works, we recommend another Folger resource: the Folger Digital Texts. Some of the plays in EMED have historically been attributed to Shakespeare, including The London Prodigal, Sir John Oldcastle, and The Yorkshire Tragedy. These are currently regarded as "Shakespeare Apocrypha" and are no longer attributed to Shakespeare. For an explanation of how The London Prodigal fits (or does not fit) into Shakespeare's corpus, see Peter Kirwan's article in Shakespeare Documented. (Hyperlinks removed from original.)
Even as they undertake important work on early modern drama beyond Shakespeare, the Digital Anthology repeatedly presents the non-Shakespearean plays at the center of their project as "other". They assert that their site is valuable because it adds to our knowledge of Shakespeare. Their anticipated users don't care about Sir John Suckling or even Christopher Marlowe. They highlight the value of their site's "almost Shakespeare" apocryphal content.
The Digital Anthology links to two Folger projects focusing entirely on Shakespeare: the Folger Digital Texts and Shakespeare Documented, both examples of "deep" digital humanities projects. Even if we consider the Digital Anthology of Early Modern English Drama and Folger Digital Texts as twinned projects, they are not identical twins but fraternal ones. The interfaces for both sites are quite different; one of the most notable differences is that the Folger Digital Texts Shakespeare editions are presented in modern spelling, whereas the rest of the early modern drama corpus is not. This is because the Folger Digital Texts are based on the Folger's print series, edited by Paul Werstine and Barbara Mowat, which means they have a different level of editorial intervention. The Folger digital projects do not neatly fit into the "deep" and "broad" categories: rather, they exist to serve different audiences. A nonspecialist will have an easier time navigating Shakespeare's texts on Folger Digital Texts than the plays on the Digital Anthology of Early Modern English Drama. Conversely, the Digital Anthology appeals to scholars by offering extensive links to existing resources, such as DEEP and the ESTC, as well as additional data about early performance and publication that allows easy comparison across the corpus. The artificial divide the Folger sites erect between Shakespeare and not Shakespeare, then, is only compounded when, for instance, a scholar searches for plays first performed in 1599 and receives a list of eleven plays, which, based on the Digital Anthology's scope, excludes Shakespeare's Julius Caesar, Henry V, and As You Like It. (A similar search in DEEP will include all results, but with multiple entries for each play that has more than one pre-1660 publication.) This is to say, the Folger's Digital Anthology of Early Modern English Drama is a digital project with both breadth (including bibliographic data about 403 plays) and depth (offering full texts of twenty-nine plays), yet it is the project's very exclusion of Shakespeare that warps the search results to offer an unrepresentative view of early modern drama and instead presents results with a Shakespeare-sized hole at their center. Indeed, the work of other writers is also omitted along with the Shakespearean: for instance, Fletcher's work in Henry VIII is cut out from the corpus simply because it is a collaboration with Shakespeare. Even in digital editions ostensibly focused on non-Shakespearean early modern drama, Shakespeare's shadow looms. The Queen's Men Editions currently provides performance editions of nine plays from the Queen's Men repertory—and four of these nine plays (Famous Victories of Henry V, King Leir, Troublesome Reign of King John, and True Tragedy of Richard III) have Shakespearean counterparts. The repertory of the Queen's Men company did not comprise 44% of plays directly related to Shakespeare (McMillin and MacLean 1998, esp. appendix A); yet this digital project has begun by privileging those texts. The Folger's Digital Anthology of Early Modern English Drama similarly offers an edition of The True Chronicle of King Leir, as well as the apocrypha they highlight in their FAQ. Some of the same apocryphal plays (including The London Prodigal) appear in both the "resources" section of the Internet Shakespeare Editions and the Digital Anthology of Early Modern English Drama.
Mere proximity to Shakespeare means these works receive more editorial attention than other plays. Richard Brome Online (Cave 2010) remains remarkable in the history of open-access online editions of early modern drama.4 Greatley-Hirsch notes, "Until the launch of Richard Brome Online in 2010, there were no electronic critical editions of non-Shakespearean Renaissance drama available" (Hirsch 2011, p. 574). Today, it still stands alone in the landscape of digital humanities projects as the only non-Shakespearean author-based online edition. (The Cambridge Edition of the Works of Ben Jonson Online (Butler 2014), which expands and supplements its printed play editions, is paywalled.) Richard Brome Online argues for the value of considering the works of a single playwright as an oeuvre—an approach often taken to Shakespeare. As with repertory-based editions, there is the idea that if we could expand this model to every author or every repertory, we would have a complete representation of the plays of the period. The realities of early modern collaborative playwriting and anonymous works, however, will complicate future author-based online editions, although author-based editions will certainly have their place in digital humanities projects; I, for one, look forward to Christopher Marlowe Online or John Webster Online.
4 See Hirsch (2010) for an insightful and extended review of this project.
Let us take Webster's The Duchess of Malfi as an exemplar of the status of non-Shakespearean plays online. The Duchess of Malfi is not Shakespearean apocrypha, nor is it a source or adaptation of one of Shakespeare's plays. (Webster's play, however, was performed by the King's Men, Shakespeare's company.) Despite having only marginal Shakespearean ties, The Duchess of Malfi is of continued scholarly interest and has an ongoing performance history, including a 2018 Royal Shakespeare Company production directed by Maria Aberg (Aberg 2018). Although currently Shakespeare's plays are performed at much higher rates than those by his contemporaries, performance and scholarship about performance offer one opportunity to effectively decenter Shakespeare. Even though The Duchess of Malfi is a relatively popular early modern play, it does not currently appear in any of the digital editions discussed in this essay so far (the Internet Shakespeare Editions, Queen's Men Editions, Digital Renaissance Editions, Folger Digital Texts, Digital Anthology of Early Modern English Drama); it does, however, appear in both printed anthologies mentioned (Kinney 1999; Bevington 2002). In future expansions, it could fall into the scope of Digital Renaissance Editions and the Digital Anthology. Yet today, in 2019, it can only be found freely available online in out-of-copyright editions (on HathiTrust (Furlough 2008), GoogleBooks (Google 2004), and the Internet Archive (Kahle 1996)), in its Early English Books Online–Text Creation Partnership (EEBO-TCP) transcription and derivatives, and in a single digital edition. The archived version on Renascence Editions (Moncrief-Spittle 2001) offers a transcription of William Hazlitt's 1857 edition; the 1910 Harvard Classics edition, edited by Charles W. Eliot, is available on Bartleby.com: Great Books Online (1993), Project Gutenberg (Hart 1971), and the ebooks@Adelaide (Thomas 2015) sites—though not all of these sites are transparent about their source texts.
St John's College Digital Archive offers an unannotated, undated facsimile of a typewritten Duchess of Malfi text (King William Players 1947), with no clue as to its origins except that it is posted in the "Playbills and Programs" digital collection, many of which are "from productions by The King William Players, the St. John's student theater troupe".5
5 The St. John's College Catalogue for 1947–1948 reveals that the King William Players produced The Duchess of Malfi in their 1946–47 season (St. John's College in Annapolis 1948).
The only online scholarly edition of The Duchess of Malfi less than a hundred years old is Larry Avis Brown's 2010 edition (last updated 2018), which includes glosses, commentary on each scene, and photos from a 1998 production at Lipscomb University in Nashville (Brown 2010). Brown's useful edition, however, exists separately from most of the sphere of early modern English drama online: it is a boutique project that stands alone, without links to and from many scholarly resources. Brown links to the Internet Shakespeare Editions, noting that his edition won their "swan" award in 2003, yet in the ISE rebuild, all mentions of Brown's site (still findable in their site search) now result in "Page Not Found" errors. The usefulness of Brown's Duchess of Malfi edition, then, is hampered by its lack of findability. I admit I only stumbled upon this edition because it is linked from the Wikipedia page for The Duchess of Malfi. "Boutique" editions created by individual scholars, particularly when peer-reviewed, have the potential to democratize our access to early modern plays—but this access must include findability. As Jakacki notes, however, "the ambition of a network of linked sources has significant implications for the editorial processes of not one, but all of the resources involved" (Jakacki 2018, p. 165). Previously, Early Modern Literary Studies (Steggle 2004) and Renascence Editions (Bear 1994) made efforts to host boutique editions of early modern literature edited to varying degrees; however, these attempts seem to have been largely abandoned. Shakespeare is separated from the other playwrights and poets of his day by our current scholarly digital editions. Greatley-Hirsch quantified the disproportionate number of digital editions of Shakespeare compared to his contemporaries (Hirsch 2011); this analysis suggests that the disparity extends beyond the amount of Shakespearean texts online to the very ways the texts are made accessible. As Katherine Rowe (2014) argues, scholars need to assess whether digital Shakespeare texts are "good enough" for the purposes to which we wish to apply them, including digital analysis.6 Furthermore, I assert that we need to bring this awareness to our use of digital projects about early modern drama more generally: what questions do we bring to the projects? What are our goals as users?7
6 See also the discussion, cited by Rowe, on the open review for Andrew Murphy's "Shakespeare Goes Digital" (Murphy 2010) about how Shakespeareans use digital texts.
7 See The Shakespeare User: Critical and Creative Appropriations in a Networked Culture, edited by Valerie M. Fazel and Louise Geddes (Fazel and Geddes 2017), particularly the chapter by Eric Johnson (2017).
4. Proliferating Shakespeares
Shakespeare's cultural prominence accounts for many of the factors discussed thus far: funders' pro-Shakespeare predilections, appeals to general audiences, and the "non-Shakespeare" project backlash. Shakespeare's preeminence also shapes the development and design of digital humanities projects themselves. Peter Donaldson's Global Shakespeares highlights Shakespeare's cross-cultural appeal and offers site visitors evidence of how Shakespeare's plays are adapted and performed around the world.
The nature of the Global Shakespeares site (and similar sites such as Shakespeare in Taiwan or Shakespeare in Spain) is only possible because Shakespeare is a global commodity.8
8 For a thoughtful discussion of the strengths and weaknesses of the Global Shakespeares project, as well as a consideration of the opportunities for and threats to the project, see Diana Henderson (2018). Henderson positions her in-depth analysis as "a case study that may assist others wrestling with the challenging, changing digital/Shakespeares studies landscape" (p. 70). For additional reflections on Global Shakespeares, including by its editors, see Henderson's citations.
A Global Peeles site would have precious little content, because George Peele's works are not as frequently rewritten and staged. Global Shakespeares does not strive to be comprehensive: it is not a repository of full-length filmed productions, nor is it a record of all international Shakespeare production. Rather, it is a gathering of curated videos, taken from a wealth of global Shakespeare materials; it is the very wealth of materials that makes the project possible. Other examples abound of digital projects that exist precisely because of Shakespeare's cultural prominence. The four hundredth anniversary of Shakespeare's death in 2016 led to the reimagining or launch of multiple new digital projects, many of which are devoted to Shakespeare's legacy. Shakespeare & The Players (Rusche and Shaw 2016), for instance, is a collection of nearly 1000 postcards of Shakespearean performances from 1880–1914. The Victorian Illustrated Shakespeare Archive (Goodman 2016) offers a repository of illustrations of Shakespeare's works by four Victorian illustrators. Performance Shakespeare 2016 (Massai and Bennett 2016) compiled a database of productions performed in honor of the quadricentennial anniversary. Exploring Shakespeare's ongoing and changing cultural impact is an important part of Shakespeare studies, which naturally lends itself to the creation of resources that, in turn, highlight Shakespeare's prominence. It is not surprising, then, that Shakespeare is overrepresented in scholarship about the early modern period. Shakespeare's prominence in digital humanities now contributes to this cycle: scholars write about Shakespeare because they can research him in innovative ways (easily comparing, for instance, early printed texts on the Shakespeare Quartos Archive, or watching a production on the Global Shakespeares site); the interest in Shakespeare, in turn, generates more Shakespeare-centric sites, often specifically designed for teaching and research. The World Shakespeare Bibliography Online (Estill 2019b) serves as a record of this research and as another element of the self-reinforcing cycle of Shakespeare publication. The World Shakespeare Bibliography is a database of performances of and publications about Shakespeare, which ultimately shapes how and what we research.9
9 For a history of the World Shakespeare Bibliography and its move online, see Estill (2014).
The boundaries of the World Shakespeare Bibliography (it includes only works that focus on Shakespeare) mean that scholars using the WSB will not be able to find related work about early modern literature or the professional Elizabethan stage more broadly, unless that scholarship includes a sustained focus on Shakespeare. Users of any other author-focused bibliography, such as the Marlowe Bibliography Online (McInnis and Allan 2019) or the Margaret Cavendish Bibliography Initiative (Siegfried 2019), will face similar limitations; however, the sheer scope of the World Shakespeare Bibliography (currently over 126,000 records) can lead scholars to forget about the world of scholarship beyond its scope, whereas the limits of smaller, boutique bibliographies are more readily apparent. The bibliography with breadth to complement the World Shakespeare Bibliography's depth is the MLA International Bibliography (MLA International Bibliography 2018). The World Shakespeare Bibliography, of course, covers much material outside the scope of the MLAIB, such as professional productions, podcasts, digital projects, and reviews. The World Shakespeare Bibliography's depth of scope leads to multiple benefits, including descriptive annotations and cross-referencing between items (for instance, a journal article about film adaptations of Hamlet would be cross-referenced to entries for each post-1960 film discussed, which in turn would have an annotation describing the cast as well as a list of reviews and other scholarly works that had discussed the film). Yet, even where their scopes are the same, the Shakespeare-centric focus of the World Shakespeare Bibliography means that there are items in the WSB that should appear in the MLAIB but simply aren't included. Books offer the most striking disparity: only 14% of the books published after 1960 annotated in the World Shakespeare Bibliography are indexed in the MLA International Bibliography.10
10 This figure was based on searches undertaken in January 2019. Choosing "book collection" and "book monograph" as document types in the World Shakespeare Bibliography yielded 28,706 entries, compared to 4416 results in the MLA International Bibliography, limited from 1960–2019 and by document type "book," "translation," and "edition," searching with the keyword "Shakespeare."
Despite being the MLA International Bibliography (emphasis added), it is too often the global, non-English contributions that are among the thousands of overlooked texts. As such, perhaps counterintuitively, it is the World Shakespeare Bibliography's specificity of focus that leads to its greater inclusivity of global materials. The digital projects that reflect Shakespeare's cultural prominence, in turn, reinforce his position in our scholarship by opening new avenues for research, often focused entirely on Shakespeare and his legacy. Indeed, digital humanities' Shakespeare problem extends beyond the framing and focus of existing and in-progress digital projects (what we study) by affecting the kinds of research we can undertake (how we study). For instance, Shakespeare, and the consideration of what is Shakespearean or not, has been central to stylometry, an area of study that now uses primarily digital methodologies. Shakespeare has long been the testing ground and often bellwether for new approaches to both literary criticism and textual studies (Parvini 2012; Machan 2000); new digital humanities approaches are no exception, and often turn to Shakespeare as a first case study. The cycle that reinforces Shakespeare's centrality continues into the digital: online projects about Shakespeare beget new research questions that are, in turn, focused on Shakespeare. The boundaries of Shakespeare-centric projects affect the very questions we can bring to our research and teaching and the new questions we are conditioned to develop.
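The kind of cross-referenced record structure described above, an article entry linked to each film it discusses and each film pointing back at the scholarship about it, can be pictured with a small sketch. The element names, identifiers, and wording below are invented for illustration and do not reflect the World Shakespeare Bibliography's actual data model; the point is only the web of reciprocal references that such a Shakespeare-centric scope makes possible.

<bibliography>
  <!-- A hypothetical article entry, cross-referenced to the film entries
       it discusses. -->
  <entry xml:id="article-0421" type="scholarship">
    <citation>A journal article about film adaptations of Hamlet.</citation>
    <annotation>Discusses three post-1960 Hamlet films.</annotation>
    <seeAlso target="#film-1964"/>
    <seeAlso target="#film-1990"/>
  </entry>
  <!-- Each film entry carries its own annotation and points back at the
       scholarship (and reviews) that discuss it. -->
  <entry xml:id="film-1964" type="production">
    <annotation>Describes the cast; lists reviews and related scholarship.</annotation>
    <seeAlso target="#article-0421"/>
  </entry>
</bibliography>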
5. Conclusions
If we could imagine an early modern digital project with both depth and breadth that positions Shakespeare in his changing historical contexts, it would have to reflect the rise of bardolatry: Shakespeare's growing cultural prominence over the past centuries. A synchronic project might choose to focus only on Shakespeare's lifetime or only on the heyday of Elizabethan and/or Jacobean professional theatre, yet such a digital project would not capture Shakespeare's legacy. Even if we could conceptualize (let alone realize) the most idealized, unbiased digital project, we would certainly not be able to navigate or query it without bringing in our conditioned, canonical biases. Too frequently, we protest that we are "Shakespeareans" or "early modernists" first and digital scholars second;11 yet, in order to be effective scholars, we must train ourselves and future generations in digital research methods, including how to determine scope and functionality.
11 See, for instance, the proud claim in the introduction to Shakespeare and the Digital World that "this is a collection that does not come out of the vanguard of digital humanities specialists, but from the trial-and-error approaches of committed Shakespearean professionals working within an evolving field" (Carson and Kirwan 2014b, pp. 3–4). This claim suggests that a scholar cannot be both a digital humanities specialist and a Shakespearean professional.
To make the most of those valuable early modern digital projects we have, scholars must understand what questions these resources can effectively answer. As John Lavagnino (2014) has observed, today all humanists undertake research with digital tools, whether they consider themselves digital humanists or not. In both building and using tools about the early modern period, we need to create and reference transparent and detailed project descriptions and guidelines. The future of early modern studies will be shaped by the digital tools that will change the way we research. The potential of linked open data or other digital advances, however, will not be realized if scholars do not critically analyze each digital project as we would a monograph, an edition, a performance, or a bibliography. This essay's critical engagement with digital projects, both individually and in their online ecosystem, demonstrates that digital humanities has a Shakespeare problem. As these projects evolve and depreciate and as new projects are built, we will have to continue our assessments. How we choose to respond to these early modern digital resources and how we design our future projects will, in turn, shape how we understand the literary canon.
Funding: This research was made possible by funding from the Canada Research Chair program.
Acknowledgments: I would like to thank Heidi Craig for her thoughtful feedback on this article. Thanks also to the Humanities blind peer reviewers and guest editor, Stephen O'Neill, for their constructive suggestions.
Conflicts of Interest: The author declares no conflicts of interest.
References
Aberg, Maria, dir. 2018. The Duchess of Malfi. Stratford-upon-Avon: The Royal Shakespeare Company.
Bartleby.com: Great Books Online. 1993. Available online: www.bartleby.com/hc/ (accessed on 20 November 2018).
Bear, Risa Stephanie. 1994. Renascence Editions. Archived 2018. Available online: https://scholarsbank.uoregon.edu/xmlui/handle/1794/507 (accessed on 28 February 2019).
Bell, Henry, Amy Borsuk, and Christie Carson, eds. Forthcoming.
RiDE: Research in Drama Education: The Journal of Applied Theatre and Performance. Shakespeare and Digital Pedagogy (Themed Issue). Available online: ridejournal.net/articles/5a79ae7c193311f3060d4ff4 (accessed on 10 December 2018).
Bergenmar, Jenny, and Katarina Leppänen. 2017. Gender and Vernaculars in Digital Humanities and World Literature. NORA: Nordic Journal of Feminist and Gender Research 25: 232–46.
Bevington, David. 2002. English Renaissance Drama: A Norton Anthology. New York: Norton.
Bianco, Jamie "Skye". 2012. This Digital Humanities Which Is Not One. In Debates in the Digital Humanities. Edited by Matthew K. Gold. Minneapolis: University of Minnesota Press. Available online: dhdebates.gc.cuny.edu/debates/text/9 (accessed on 15 December 2018).
Bodleian Library. 2009. Shakespeare Quartos Archive. Available online: quartos.org (accessed on 18 December 2018).
Bogost, Ian. 2012. Alien Phenomenology: Or, What It's Like to Be a Thing. Minneapolis: University of Minnesota Press.
British Library. 2006. English Short Title Catalogue. Available online: estc.bl.uk (accessed on 18 December 2018).
Brown, Larry Avis, ed. 2010. The Duchess of Malfi: The Complete Texts with Notes and Commentary. Available online: larryavisbrown.com/duchess-of-malfi/ (accessed on 27 January 2019).
Brown, Meaghan, Michael Poston, and Elizabeth Williamson, eds. 2016. A Digital Anthology of Early Modern English Drama. Folger Shakespeare Library. Available online: emed.folger.edu (accessed on 27 January 2019).
Butler, Martin, ed. 2014. The Cambridge Edition of the Works of Ben Jonson Online. Cambridge: Cambridge University Press. Available online: http://universitypublishingonline.org/cambridge/benjonson/ (accessed on 29 January 2019).
Butler, Martin, and Matthew Steggle. Forthcoming. The Complete Works of John Marston. Oxford: Oxford University Press. Available online: johnmarston.leeds.ac.uk (accessed on 15 January 2019).
Calbi, Maurizio, and Stephen O'Neill. 2016. Shakespeare and Social Media (special issue). Borrowers and Lenders: The Journal of Shakespeare and Appropriation 10. Available online: borrowers.uga.edu/29/toc (accessed on 25 January 2019).
Carson, Christie, and Peter Kirwan. 2014a. Conclusion: Digital Dreaming. In Shakespeare and the Digital World. Edited by Christie Carson and Peter Kirwan. Cambridge: Cambridge University Press, pp. 238–57.
Carson, Christie, and Peter Kirwan. 2014b. Shakespeare and the Digital World: Introduction. In Shakespeare and the Digital World. Edited by Christie Carson and Peter Kirwan. Cambridge: Cambridge University Press, pp. 1–7.
Cave, Richard, ed. 2010. Richard Brome Online. Available online: https://www.dhi.ac.uk/brome (accessed on 28 January 2019).
Conway, Paul. 2010. Preservation in the Age of Google: Digitization, Digital Preservation, and Dilemmas. Library Quarterly 80: 61–79. Available online: hdl.handle.net/2027.42/85223 (accessed on 25 January 2019).
Crighton, Robert. 2013. Beyond Shakespeare. Available online: beyondshakespeare.blogspot.com (accessed on 27 January 2019).
Davies, Callan, and Andy Kesson. Forthcoming. DH and Non-Shakespearean Theatre History. Shakespeare Newsletter.
Desmet, Christy. 2014. The Shakespeare Quartos Archive. Shakespearean International Yearbook 14: 143–54.
Desmet, Christy. 2017. Alien Shakespeares 2.0. In Shakespeare après Shakespeare/Shakespeare after Shakespeare. Edited by Anne-Valérie Dulac and Laetitia Sansonetti. Société Française Shakespeare. Available online: https://journals.openedition.org/shakespeare/3877 (accessed on 19 February 2019).
Donaldson, Peter S. 2009. Global Shakespeares: Video and Performance Archive. Available online: globalshakespeares.mit.edu/about/ (accessed on 25 January 2019).
Earhart, Amy. 2012. Can Information Be Unfettered? Race and the New Digital Humanities Canon. In Debates in the Digital Humanities. Edited by Matthew K. Gold. Minneapolis: University of Minnesota Press. Available online: dhdebates.gc.cuny.edu/debates/text/16 (accessed on 15 December 2018).
Earhart, Amy. 2015. Traces of the Old, Uses of the New: The Emergence of Digital Literary Studies. Ann Arbor: University of Michigan Press.
Early English Books Online-Text Creation Partnership (EEBO-TCP). 2009. Available online: quod.lib.umich.edu/e/eebogroup/ (accessed on 25 January 2019).
Estill, Laura. 2014. Digital Bibliography and Global Shakespeare. Scholarly and Research Communication 5. Available online: doi.org/10.22230/src.2014v5n4a187 (accessed on 27 January 2019).
Estill, Laura. 2019a. Shakespearean Extracts and the Misrepresentation of the Archive. In Rethinking Theatrical Documents in Shakespeare's England. Edited by Tiffany Stern. London: Bloomsbury.
Estill, Laura, ed. 2019b. World Shakespeare Bibliography. Created by Sidney Thomas, 1950. Moved online by James L. Harner, 2011. Available online: worldshakesbib.org (accessed on 27 January 2019).
Estill, Laura, Dominic Klyve, and Kate Bridal. 2015. "Spare your arithmetic, never count the turns": A Statistical Analysis of Writing about Shakespeare, 1960–2010. Shakespeare Quarterly 66: 1–28.
Fazel, Valerie M., and Louise Geddes. 2017. The Shakespeare User: Critical and Creative Appropriations in a Networked Culture. Cham: Palgrave Macmillan.
Flanders, Julia. 2009. The Productive Unease of 21st-Century Digital Scholarship. Digital Humanities Quarterly 3. Available online: digitalhumanities.org/dhq/vol/3/3/000055/000055.html (accessed on 15 January 2019).
Fraistat, Neil, Steven E. Jones, and Carl Stahmer. 1998. The Canon, the Web, and the Digitization of Romanticism. Romanticism on the Net 10. Available online: http://www.erudit.org/revue/ron/1998/v/n10/005801ar.html (accessed on 29 January 2019).
Furlough, Michael. 2008. HathiTrust Digital Library. Available online: hathitrust.org (accessed on 27 January 2019).
Galey, Alan, and Ray Siemens, eds. 2008. Reinventing Shakespeare in the Digital Humanities (special issue). Shakespeare 4: 201–207.
Goodman, John Michael. 2016. Victorian Illustrated Shakespeare Archive.
Available online: https://shakespeareillustration.org (accessed on 20 November 2018).
Google. 2004. GoogleBooks. Available online: books.google.com (accessed on 27 January 2019).
Greatley-Hirsch, Brett, ed. 2015. Digital Renaissance Editions. Available online: digitalrenaissance.uvic.ca (accessed on 27 January 2019).
Greene, Mark A., and Dennis Meissner. 2005. More Product, Less Process: Revamping Traditional Archival Processing. The American Archivist 68: 208–63.
Hart, Michael. 1971. Project Gutenberg. Available online: gutenberg.org (accessed on 27 January 2019).
Henderson, Diana. 2018. This Distracted Globe, This Brave New World: Learning from the MIT Global Shakespeares' Twenty-First Century. In Broadcast Your Shakespeare: Continuity and Change Across Media. Edited by Stephen O'Neill. London: Bloomsbury Arden Shakespeare, pp. 67–86.
Hirsch [now Greatley-Hirsch], Brett. 2010. Bringing Richard Brome Online. Early Theatre 13: 137–53.
Hirsch [now Greatley-Hirsch], Brett. 2011. The Kingdom Has Been Digitized: Electronic Editions of Renaissance Drama and the Long Shadows of Shakespeare and Print. Literature Compass 8: 568–91.
Hirsch [now Greatley-Hirsch], Brett, and Hugh Craig. 2014. Digital Shakespeares (special section). Shakespearean International Yearbook 14.
Hodgdon, Barbara. 1998. The Shakespeare Trade: Performances and Appropriations. Philadelphia: University of Pennsylvania Press.
Jakacki, Diane K. 2018. Internet Shakespeare Editions and the Infinite Editorial Others: Support Critical Tagsets for Linked Data. In Shakespeare's Language in Digital Media: Old Words, New Tools. Edited by Janelle Jenstad, Mark Kaethler and Jennifer Roberts-Smith. London and New York: Routledge, pp. 157–71.
Jenstad, Janelle, ed. 2018. Internet Shakespeare Editions. Created by Michael Best, 1996. Available online: ise.uvic.ca (accessed on 17 January 2019).
Johnson, Eric. 2017. Opening Shakespeare from the Margins. In The Shakespeare User: Critical and Creative Appropriations in a Networked Culture. Edited by Valerie M. Fazel and Louise Geddes. Cham: Palgrave Macmillan, pp. 187–206.
Kahle, Brewster. 1996. The Internet Archive. Available online: archive.org (accessed on 27 January 2019).
Kesson, Andy, Lucy Munro, and Callan Davies. 2016. Before Shakespeare. Available online: beforeshakespeare.com (accessed on 20 January 2019).
King William Players. 1947. The Duchess of Malfi. St. John's College Digital Archives. Available online: digitalarchives.sjc.edu/items/show/1261 (accessed on 9 January 2019).
Kinney, Arthur F. 1999. Renaissance Drama: Anthology of Plays and Entertainments. Malden: Blackwell.
Kliman, Bernice W., Frank Nicholas Clary, Hardin Aasand, Eric Rasmussen, Laury Magnus, and Marvin Hunt, eds. 2004. HamletWorks. Available online: hamletworks.org (accessed on 10 December 2018).
Knight, Leah, and Wendy Wall. 2018. The Pulter Project. Available online: pulterproject.northwestern.edu/ (accessed on 7 January 2019).
Lanier, Douglas. 2014. Shakespearean Rhizomatics: Adaptation, Ethics, Value. In Shakespeare and the Ethics of Appropriation. Edited by Alexa Huang [now Joubin] and Elizabeth Rivlin. New York: Palgrave, pp. 21–40.
Lavagnino, John. 2014. Shakespeare in the Digital Humanities. In Shakespeare and the Digital World. Edited by Christie Carson and Peter Kirwan. Cambridge: Cambridge University Press, pp. 238–57.
Lesser, Zachary, and Alan Farmer. 2007. Database of Early English Playbooks. Available online: deep.sas.upenn.edu (accessed on 2 December 2018).
Machan, Tim William. 2000. 'I Endowed Thy Purposes': Shakespeare, Editing, and Middle English Literature. Text 13: 9–25.
Mandell, Laura. 2015. Gendering Digital Literary History: What Counts for Digital Humanities. In A New Companion to Digital Humanities. Edited by Susan Schreibman, Raymond G. Siemens and John Unsworth. Malden: Wiley-Blackwell, pp. 511–23.
Massai, Sonia, and Susan Bennett, eds. 2016. Performance Shakespeare 2016. Available online: performanceshakespeare2016.org (accessed on 28 February 2019).
McInnis, David, and Gayle Allan. 2019. The Marlowe Bibliography Online. Marlowe Society of America Release 2017.1. Available online: marlowesocietyofamerica.org/mbo/ (accessed on 20 January 2019).
McLuskie, Kate, and Kate Rumbold. 2014. Cultural Value in Twenty-First Century England: The Case of Shakespeare. Manchester: University of Manchester Press.
McMillin, Scott, and Sally-Beth MacLean. 1998. The Queen's Men and Their Plays. Cambridge: Cambridge University Press.
MLA Committee on Information Technology. 2012a. Guidelines for Authors of Digital Resources. Available online: mla.org/About-Us/Governance/Committees/Committee-Listings/Professional-Issues/Committee-on-Information-Technology/Guidelines-for-Authors-of-Digital-Resources (accessed on 15 November 2018).
MLA Committee on Information Technology. 2012b. Guidelines for Evaluating Work in Digital Humanities and Digital Media. Available online: mla.org/About-Us/Governance/Committees/Committee-Listings/Professional-Issues/Committee-on-Information-Technology/Guidelines-for-Evaluating-Work-in-Digital-Humanities-and-Digital-Media (accessed on 15 November 2018).
MLA International Bibliography. 2018. Available online by subscription: https://www.mla.org/Publications/MLA-International-Bibliography/About-the-MLA-International-Bibliography (accessed on 27 January 2019). First published 1922.
Moncrief-Spittle, Malcolm, trans. 2001. The Duchess of Malfi. By John Webster. Hazlitt, William, ed. Renascence Editions. Available online: scholarsbank.uoregon.edu/xmlui/handle/1794/776 (accessed on 27 January 2019).
Moore, Shawn, and Jacob Tootalian, directors. 2013. The Digital Cavendish Project.
Available online: digitalcavendish.org (accessed on 15 January 2019).
Mowat, Barbara, Paul Werstine, Michael Poston, and Rebecca Niles, eds. 2012. Folger Digital Texts. Folger Shakespeare Library. Available online: folgerdigitaltexts.org (accessed on 19 December 2018).
Mueller, Martin. 2014. Shakespeare His Contemporaries: Collaborative Curation and Exploration of Early Modern Drama in a Digital Environment. Digital Humanities Quarterly 8. Available online: digitalhumanities.org/dhq/vol/8/3/000183/000183.html (accessed on 17 January 2019).
Mueller, Martin. 2016. Shakespeare His Contemporaries. Now defunct. Available online: web.archive.org/web/*/http://shakespearehiscontemporaries.northwestern.edu/shc/home.html (accessed on 23 January 2019).
Munro, Lucy. 2009. Children of the Queen's Revels: A Jacobean Theatre Repertory. Cambridge: Cambridge University Press.
Murphy, Andrew. 2010. Shakespeare Goes Digital: Three Open Internet Editions. Shakespeare Quarterly 61: 401–14. Available online: mcpress.media-commons.org/ShakespeareQuarterly_NewMedia/shakespeare-remediated/murphy-shakespeare-goes-digital/ (accessed on 19 February 2019).
O'Neill, Stephen. 2015. Shakespeare and Social Media. Literature Compass 12: 274–85.
O'Neill, Stephen. 2018. Shakespeare's Digital Flow: Humans, Technologies and the Possibilities of Intercultural Exchange. Shakespeare Studies 46: 120–33.
Olive, Sarah. 2015. Shakespeare Valued: Education Policy and Pedagogy 1989–2009. Bristol and Chicago: Intellect.
Ostovich, Helen, ed. 2006. The Queen's Men Editions. Available online: qme.internetshakespeare.uvic.ca (accessed on 27 January 2019).
P., Cildas. 2015. Richard III Attacks! Playfool.net; Thomas Jolly, La Piccola Familia. Available online: lapiccolafamilia.fr/richard-iii-attacks/ (accessed on 20 November 2018).
Parvini, Neema. 2012. Shakespeare and Contemporary Theory: New Historicism and Cultural Materialism. New York: Bloomsbury.
Pollard, Alfred W., and Gilbert Richard Redgrave. 1926. A Short-Title Catalogue of Books Printed in England, Scotland & Ireland, and of English Books Printed Abroad, 1475–1640. London: Bibliographical Society.
Price, Kenneth M. 2009. Digital Scholarship, Economics, and the American Literary Canon. Literature Compass 6: 274–90.
Rowe, Katherine, ed. 2010. Shakespeare and New Media (special issue). Shakespeare Quarterly 61.
Rowe, Katherine. 2014. Living with Digital Incunables, or a 'Good-Enough' Shakespeare Text. In Shakespeare and the Digital World. Edited by Christie Carson and Peter Kirwan. Cambridge: Cambridge University Press, pp. 144–59.
Rusche, Harry, and Justin Shaw. 2016. Shakespeare & The Players. Available online: shakespeare.emory.edu/ (accessed on 20 November 2018).
Siegfried, Brandie. 2019. Margaret Cavendish Bibliography Initiative. Digital Cavendish Project. Available online: digitalcavendish.org/resources/margaret-cavendish-bibliography-initiative/ (accessed on 10 December 2018).
Silbert, Roxana, director. 2010. Such Tweet Sorrow. Story grid by Beth Marlow and Tim Wright. Stratford-upon-Avon and Sheffield: Royal Shakespeare Company and Mudlark Production Company.
Slights, Jessica, ed. 2017. Othello. Internet Shakespeare Editions. Available online: http://internetshakespeare.uvic.ca/Library/Texts/Oth/ (accessed on 27 January 2019).
Smith, Emma. 2017. Shakespeare: The Apex Predator. TLS: Times Literary Supplement. May 4. Available online: https://www.the-tls.co.uk/articles/public/shakespeare-apex-predator/ (accessed on 19 February 2019).
St. John's College in Annapolis. 1948. St John's College Catalogue 1947–1948. St. John's College Digital Archives. Available online: digitalarchives.sjc.edu/items/show/181 (accessed on 9 January 2019).
Steggle, Matthew, ed. 2004. Early Modern Literary Studies. Hosted Resources. Available online: https://extra.shu.ac.uk/emls/iemls/resources.html (accessed on 28 January 2019).
Steggle, Matthew. 2008. "Knowledge Will Be Multiplied": Digital Literary Studies and Early Modern Literature. In A Companion to Digital Literary Studies. Edited by Ray Siemens and Susan Schreibman. Malden: Blackwell.
Thomas, Steve. 2015. Ebooks@Adelaide. Available online: ebooks.adelaide.edu.au (accessed on 27 January 2019).
Tran, Diep. 2018. The Top 10* Most Produced Plays of the 2018–2019 Season. American Theatre. September 20. Available online: americantheatre.org/2018/09/20/the-top-10-most-produced-plays-of-the-2018-19-season/ (accessed on 27 January 2019).
Wernimont, Jacqueline, and Julia Flanders. 2010. Feminism in the Age of Digital Archives: The Women Writers Project. Tulsa Studies in Women's Language and Literature 29: 425–35.
Wing, Donald. 1945.
A Short-Title Catalogue of Books Printed in England, Scotland & Ireland, and of English Books Printed Abroad, 1641–1700. New York: Columbia University Press.
© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
work_atos2ytygjdmbgkfeebe3jejr4 ---- Teaching TEI: The Need for TEI by Example
Melissa Terras, Ron van den Branden and Edward Vanhoutte
m.terras@ucl.ac.uk
Terras, M., Van den Branden, R., and Vanhoutte, E. (2009). "The need for TEI By Example". Literary and Linguistic Computing 24(3): 297-306. http://llc.oxfordjournals.org/cgi/content/abstract/fqp018?ijkey=deB17DJBT3YKBEX&keytype=ref
Abstract
The TEI (Text Encoding Initiative) 1 has provided a complex and comprehensive system of provisions for scholarly text encoding. Although a major focus of the "digital humanities" domain, and despite much teaching effort by the TEI community, there is a lack of teaching materials available which would encourage the adoption of the TEI's recommendations and the widespread use of its text encoding guidelines in the wider academic community. This paper describes the background, plans, and aims of the TEI by Example project, and why we believe it is a necessary addition to the materials currently provided by the TEI itself. The teaching materials currently available are not suited to the needs of self directed learners, and the development of stand alone, online tutorials in the TEI is an essential addition to the extant resources, in order to encourage and facilitate the uptake of TEI by both individuals and institutions.
1. Introduction
Over the past 20 years, the TEI (Text Encoding Initiative) has developed comprehensive guidelines for scholarly text encoding (TEI, 2007a). In order to expand the user base of TEI, it is important that tutorial materials are made available to scholars new to textual encoding. However, there is a paucity of stand alone teaching materials available which support beginner's level learning of TEI. Materials which are available are not in formats which would enable tutorials to be provided in classroom settings (such as part of a University course), or allow individuals to work through graded examples in their own time: the common way of learning new computational techniques through self-directed learning. As a result, there is an urgent need for a suite of TEI tutorials for the self directed learner. The "TEI by Example" 2 project is currently developing a range of freely available online tutorials which will walk individuals through the different stages in marking up a document in TEI.
1 http://www.tei-c.org/
2 http://www.teibyexample.org
To do so, the development environment will need to be explained, documented, and links to freely available software provided to allow users to undertake TEI based markup themselves. In addition to this, the tutorials will provide annotated examples of a range of texts, indicating the editorial choices necessary when marking up a text in TEI. Linking to real examples from projects which utilise the TEI will reaffirm the advice given to learners. In this paper, we discuss the current methods of teaching TEI, and why these do not cater for the lone scholar or self directed learner interested in learning TEI in their own environment and in their own time. We discuss the need for specifically designed online tutorials in the TEI, and why it is important to incorporate example material of TEI code in these tutorials. Finally, an overview of the TEI by Example project is given, discussing its aims, structure, deliverables, and future work.
2. Teaching the TEI
The Text Encoding Initiative, "an international organization founded in 1987 to develop guidelines for encoding machine-readable texts in the humanities and social sciences" (TEI, 2007b), has produced a variety of guidelines for the encoding of scholarly texts. The rigorous intellectual endeavour to create the guidelines ensures that "The TEI is a very extensive encoding language and is intended to support very complex encoding of very complex documents" (TEI, 2007b). As a result, the TEI has the potential to be used in a variety of situations. Markup projects often train their workforce in the principles and theory of markup, with encoders learning on the job. Students in Literature and Language and other Humanities based subjects may have a need for TEI. Students in Library, Archives, and Electronic Communication and Publishing, and Librarians and Archivists, may benefit from understanding how best to encode, document, and ultimately preserve electronic textual data (and the widespread inclusion of information professionals in the TEI community would further the aims of the initiative, by encouraging the uptake of TEI as an aid to preserve electronic textual data). Academics wishing to join the "digital revolution" may have an introduction to the field of Digital Humanities through the discipline of textual markup. Fundamentally, many of these individuals who come across the TEI may go on to teach or inform others, and as a result, if we wish to expand the user community and the use of TEI, it is important to provide training, and teaching materials, which foster and build confidence, and demonstrate the use and usefulness of the TEI guidelines. However, individuals wishing to learn the TEI are currently faced with the lengthy and technically descriptive guidelines, which are hardly written with the absolute beginner in mind. An alternative is to attend a taught course regarding the TEI, or to consult the materials made available on the TEI website.
University courses sometimes integrate TEI into their teaching (for example, the "Humanities Computing: Electronic Text" 3 undergraduate course at the University of Antwerp, or the "Digital Resources in the Humanities" 4 Masters level module in the School of Library, Archive and Information Studies, University College London), although this is rare, and access is limited to a few interested students. More commonly, short courses are sometimes sponsored and provided by the TEI, or related organisations: for example, the workshops organised by the Centre for Scholarly Editing and Document Studies 5, and those run by the Brown Women Writers Project, which "offers periodic hands-on workshops on text encoding and the TEI Guidelines. These range from one to five days and cover a range of topics in basic and intermediate TEI encoding, TEI customization, basic XSLT, and issues in text encoding theory" (WWP, 2007), for instance at the yearly Digital Humanities Summer Institute 6. An archive of a range of documentation from workshops, including presentations, exercises, and handout materials, is maintained on the TEI website 7. However, short term courses have their own pedagogical problems: they are rarely assessed, so it is difficult to know if the students have really learnt anything useful that will be retained. When the intensity of the course ends, students may go back to their old habits. When motivational tutors are no longer around to ask when things go wrong, it may be the case that students give up their attempt to learn the TEI. There is little room for a "holistic" approach to teaching (Bernold, n.d.), where what is learnt can be reinforced over a period of time through a variety of pedagogical methods such as evaluation, feedback, discussion, experimentation, and teamwork.
3 http://www.edwardvanhoutte.org/HC/
4 http://www.ucl.ac.uk/slais/teaching/modules/instg008/
5 http://www.kantl.be/ctb/META/
6 http://www.dhsi.org
7 http://www.tei-c.org/Support/Learn/tutorials.xml#body.1_div.3
Although the materials emanating from the TEI workshops are available online, until recently the design of the TEI website dissuaded many potential new users from learning TEI. The old website comprised a multitude of broken or misleading links, maze-like structures, and dated tutorials (in early 2007, the latest introductory material available on the site presented outdated and therefore erroneous material to novices: Sperberg-McQueen and Burnard (2002a), and Sperberg-McQueen and Burnard (2002b)). The new website design, launched in October 2007, presents a cleaner and more modern face to the TEI. An up to date generic tutorial, which also features in the TEI guidelines itself, is now available on the TEI website (Sperberg-McQueen and Burnard 2007). However, this tutorial, and links to workshop materials, are still buried deep within misleadingly titled menu items, and unlikely to be found instantly by potential new users. Although much time has been devoted to teaching TEI, and to the preparation of teaching materials for lectures and workshops, the online presence of the TEI suggests that reaching out to new users is not high on the TEI's agenda. This may not be the intention, but the design and focus of the website is not welcoming to those new to the concept of textual markup, or from outside the existing TEI community.
Additionally, there is the problem that the retrospective posting of workshop materials on a website is not the same, for users, as actually attending a workshop. The nuances of bullet points on PowerPoint slides are lost when the presenter is not there to explain their meaning. There is no room for feedback, or for communication of any sort with the course leader. Exercises which may have been clear in the classroom, where a computing environment was provided, may be impossible for those attempting them alone, on a different system. As a result most of the workshop materials posted online are intimidating rather than illuminating, and serve more to act as an archive of the TEI's teaching activities than to provide learning materials for those wishing to learn TEI unaided. Although it makes sense to offer online materials for distance learners, these have to be tailored to the needs of online users. Online materials need to take a different form from face-to-face teaching materials, as the online experience is different to that presented in the traditional classroom: learners are different; the communication is via computer and World Wide Web; the social dynamic of the learning environment is changed; feedback mechanisms function differently; there is the potential to reach a much wider audience; and there is the potential for re-use in other learning environments. As a result, instructors wishing to provide such materials online should "master design and delivery strategies, techniques, and methods for teaching online courses" (Yang and Cornelius 2005). If the digital humanities community wants to promote the TEI markup framework as a serious tool for digital humanities, humanities computing, digital culture, or humanities informatics, to name just a few of the labels this archipelago of disciplines gets (McCarty 2006), and to expand the use of the TEI guidelines beyond the reaches of the TEI community as it stands, then there is an urgent need for an online TEI course which is less generic than the introduction published within the TEI guidelines, and more user-friendly, comprehensive, and interactive than the online workshop materials which are currently presented as stand-alone teaching materials. The demand for such introductory material can be illustrated by the popularity of a paper published in a special edition of LLC, "An Introduction to the TEI and the TEI Consortium" (Vanhoutte, 2004), which Matthew Driscoll reviewed thus:

This is followed by a short introduction by Edward Vanhoutte to the Text Encoding Initiative (TEI) in general. There are many such introductions to the TEI available both in print and on the web and this one is fine so far as it goes, but one may wonder about its appropriateness here, given that few readers of LLC are likely to be so unfamiliar with the TEI as to require such an introduction. (Driscoll, 2005, p. 337)

However, this introductory paper to the TEI was consistently amongst the top ten articles requested online from LLC in the three years following publication.

3. The Need for TEI by Example

The TEI by Example project was conceived after a difficult teaching session.
(It should be noted that, in many cases at University level, those teaching a course may not be subject experts in all aspects of a field, and rely on appropriate resources and teaching materials to assist them in the areas in which they are weaker. This is a fact of University teaching life, where academics are asked to teach broadly across a discipline whilst tending to focus on one aspect of the discipline as a research topic.) In this case, TEI was being taught by a lecturer who had used TEI in the past, but not on a day-to-day basis. An intelligent and articulate Masters-level student asked whether "TEI was a theoretical exercise on the principles and theory of textual markup" since, although many projects purported to be using the TEI, there are very few examples of source code available for those learning the TEI to consult. Most projects marking texts up in TEI deliver their texts via the Internet, which means their code is transformed, via XSLT, into HTML or XHTML. Interested users can generally only see this transformed version, and so cannot inspect and learn from the underlying TEI code (a notable example is the Digital Library of Dutch Literature: http://www.dbnl.org). At the time of writing, the TEI wiki page which hosts sample texts from those utilising the TEI framework (http://www.tei-c.org/wiki/index.php/Samples_of_TEI_texts) features only eight projects willing to make some of their marked-up texts available to the general public. The Oxford Text Archive (http://ota.ahds.ac.uk/), which collects, catalogues, and preserves electronic text for use by the research community, has a few TEI marked-up texts available (of any useful granularity, excluding those with just a TEI Header, which is added to all texts by the OTA themselves), but it is impossible to find these through searching the website, and these texts can only be obtained by contacting the OTA and requesting TEI marked-up texts (Cummings, 2007). This is a loss to users, who would benefit from seeing both good and bad examples of markup, to learn to encode by example. It is understood that much intellectual and temporal effort goes into marking up textual material with suitable granularity to facilitate in-depth analysis and manipulation, and that projects may not wish to make this investment public. However, being able to view the markup approaches of established scholars and projects in the field is an essential tool for TEI teaching which is currently not utilised. Learning a computing language (especially through self-directed learning) is usually accomplished through examining and working through examples. Learning by example is effectively an implementation of problem-based learning, an efficient and useful approach to teaching individuals skills so that they can undertake similar tasks themselves successfully (for seminal literature regarding the effectiveness of problem-based learning as a pedagogic approach see Norman and Schmidt (1992), Garrison (1997), and Savin-Baden and Wilkie (2006)). The paucity of TEI examples currently available to learners can be contrasted with the teaching literature for computing: at the time of writing, there were 837 titles available on Amazon.co.uk with the words "by example" in the title, most of them featured in the Computers and Internet section, and 756 computing books had "case studies" in the title.
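To illustrate why the underlying encoding is invisible to readers: when a project publishes TEI via XSLT, a stylesheet along the following lines rewrites the markup into HTML, and only the HTML ever reaches the browser. This is a toy sketch of our own, assuming a verse text and HTML output, not any project's actual transform:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:tei="http://www.tei-c.org/ns/1.0">
      <!-- Each TEI line group (stanza) becomes an HTML div -->
      <xsl:template match="tei:lg">
        <div class="stanza"><xsl:apply-templates/></div>
      </xsl:template>
      <!-- Each verse line becomes a span followed by a line break;
           the tei:l element itself never appears in the output -->
      <xsl:template match="tei:l">
        <span class="line"><xsl:apply-templates/></span><br/>
      </xsl:template>
    </xsl:stylesheet>

A reader viewing the published page sees only div, span, and br elements; every encoding decision recorded in the TEI source has been flattened away.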
There has been particular consideration of the effectiveness of example- and problem-based learning in the teaching of computer programming (for example, see Mayer, 1981, 1988; Kelleher and Pausch, 2005). Even the fictional LOLcode programming language, constructed as a joke after the popularity of the internet meme LOLcats, where instant-messaging English is used to caption cute pictures of cats, has a variety of examples of code available which users can scrutinise to learn LOLcode for themselves (http://lolcode.com/). If learning by example is such a fundamental approach to learning a computing language, where is TEI by Example? Additionally, the developers of any online teaching course need to understand how to develop online materials successfully, and how this may differ from more traditional teaching and learning environments (Stephenson, 2001; Jochems et al., 2003). Understanding the nature of online tutorials, and grappling with the pedagogical issues these technologies present, is a core issue when beginning to consider the construction of a TEI by Example online course. The need for introductory training materials regarding text encoding within the Text Encoding Initiative framework, and the present lack of appropriate teaching resources, spurred us to create TEI by Example ourselves. We aim to produce an online TEI course by example which will introduce novice users to text encoding within the TEI framework, and serve as an introductory teaching package for instructors in the classroom, while presenting the user with real, annotated examples from encoding projects. Additionally, we will need to make a software toolkit available for teaching text encoding, to support interested trainers and learners. Investigating the affordances of online teaching tools (such as quizzes and interactive feedback) will also aid in the creation of useful learning materials for those who wish to undertake textual markup using the TEI guidelines.

4. TEI by Example Development

The "TEI by Example" project is currently developing a range of freely available online tutorials walking individuals through the different stages in marking up a document in TEI. Project partners are the Centre for Scholarly Editing and Document Studies (CTB, http://www.kantl.be/ctb/) of the Royal Academy of Dutch Language and Literature, the Centre for Computing in the Humanities (CCH, http://www.kcl.ac.uk/schools/humanities/cch/) of King's College London, and the School for Library, Archive, and Information Studies (SLAIS, http://www.slais.ucl.ac.uk/) of University College London, with an international advisory board consisting of experts in textual encoding and markup (http://www.kantl.be/ctb/project/2006/tei-ex.htm#t4). The development team consists of the project leaders, Melissa Terras and Edward Vanhoutte, and the executive project officer, Ron Van den Branden. The deliverables will be published and hosted by CCH (King's College London) under endorsement by the Association for Literary and Linguistic Computing (ALLC, http://www.allc.org/). A small amount of funding has been procured from the CCH, the ALLC, and the CTB, which allows for a few days of development time to construct the tutorials.
A major point of attention at the start of the project was the status of the TEI model. Since early 2002, the TEI Consortium has been engaged in a major (backward-incompatible) revision of the TEI specification, migrating it from version P4 (2002, see TEI, 2004) to P5 (see Burnard and Bauman, 2007). Featuring more than just changes in the markup model and the content of the guidelines, P5 entails an overhaul of the complete production process of the standard. Apart from the innovations regarding the content of the TEI markup scheme, adoption of P5 involves coping with peripheral technical innovations. The TEI Pizza Chef software for deriving P4 TEI DTDs has been superseded by the Roma system, allowing users to derive TEI customisations in a number of formal expressions, from the (innate) Relax NG schema to DTDs or W3C XML Schemas. By developing P5 as a SourceForge project (http://sourceforge.net/projects/tei/), early adopters could prepare for adoption of this revision via public access to the latest source code, and more or less stable intermediate snapshot code releases. Of course, the inherent instability of a long (public) transition period mortgages any teaching material covering its changing subject matter. It seems that the timing of the TEI by Example project coincides with a turning point in the transition from TEI P4 to P5: the advantages of P5 adoption for this project seemed to outweigh the disadvantages of P4. When undertaking the preliminary investigations into instigating the project, the most recent snapshot suggested that stability would soon be at hand (Van den Branden, 2006). As a result, the project began developing materials in P5. Indeed, in 2007, the P5 Guidelines were released (Burnard and Bauman, 2007), indicating that this was a prudent choice taken at the outset of the project. The deliverables of the project are: online "TEI by Example" tutorials, a printable PDF version of the tutorials, an online software toolkit for text encoding, a downloadable CD-ROM image for burning off-line toolkits for use by course participants, and adequate documentation to enable the tutorials to be used elsewhere if needed. Development of the tutorials began in October 2006 and has continued throughout 2007 and into 2008. It is conceded that development time has been slow; however, this is because the project is being undertaken with very little funding, and on top of the development team's full-time academic and research commitments. At present, the technical infrastructure of the project has been agreed and implemented. Work on the individual tutorials has begun, with the aim of a full project launch in summer 2008.
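To give a flavour of the backward incompatibility mentioned above: the most visible surface change between P4 and P5 is the renaming of the root element and the introduction of a namespace (our own minimal illustration; the actual revision ran far deeper, through the class system, datatypes, and customisation mechanism):

    <!-- A TEI P4 document: DTD-based, no namespace, root element TEI.2 -->
    <TEI.2>
      <teiHeader>...</teiHeader>
      <text>...</text>
    </TEI.2>

    <!-- The same skeleton in TEI P5: namespaced root element TEI -->
    <TEI xmlns="http://www.tei-c.org/ns/1.0">
      <teiHeader>...</teiHeader>
      <text>...</text>
    </TEI>

Every P4 document, stylesheet, and schema customisation therefore needed at least mechanical conversion, which is why basing new teaching materials on a moving P5 target was a genuine gamble.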
[Figure 1: The current TEI by Example home page, providing the user with an overview of the structure and contents of tutorials, exercises and quizzes.]

Eight tutorials are under construction. The first, an introduction to text encoding and the TEI, encourages the user to explore textual encoding and markup to foster an understanding of why this is useful, or even necessary, to allow texts to be processed automatically and used and understood by others. The TEI header tutorial covers the type of information and metadata captured in the header element. Three tutorials focus on examples of individual types of text: Prose, Poetry, and Drama, and a further two tutorials deal with examples of Manuscript Transcription and Scholarly Editing. The final tutorial will investigate how the TEI can be customised, and the use of ODD and Roma. The TEI by Example tutorials aim to provide examples of markup for users of all levels. Examples will be provided of different document types, with varying degrees of granularity in the markup, to provide a useful teaching and reference aid for those involved in the marking up of texts. Likewise, the availability of a software toolkit for teaching text encoding will support potential trainers in taking up the challenge of teaching TEI on several occasions. The first tutorial to be fully developed was the Poetry module. This was chosen as it was a relatively self-contained module, and it could be used to test the various options available for development. There were many editorial, technological, and pedagogical choices the authors had to make. The team had to understand the technical possibilities and limitations afforded by the online environment, and decide how best to integrate these into the tutorial materials. By juxtaposing static (pages, articles) and dynamic (quizzes, validation) functionality, the project aims to provide a holistic learning environment for those new to the TEI. Further linking to other markup examples, provided by the community and the project, extends the remit of the project into another, alternative viewpoint from which to start learning the TEI, aside from the TEI guidelines themselves (TEI, 2007a). Additionally, the role of user testing will be explored to feature feedback and comments from the TEI user community, to aid in the development of intuitive tutorial materials. The completed Poetry module has been circulated to the project board, and to potential users, for comment, and user requirements will inform the design and implementation of the remaining modules over the coming months.

[Figure 2: The Poetry validation exercise. The user is presented with a poem and given a set of tasks to carry out. The online validator checks whether they have carried out those tasks correctly.]
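For concreteness, the kind of document such a validation exercise works towards might look like the following skeletal TEI P5 file: a complete, well-formed sample of our own (not one of the project's tutorial texts), pairing a minimal header with a verse fragment:

    <TEI xmlns="http://www.tei-c.org/ns/1.0">
      <teiHeader>
        <fileDesc>
          <titleStmt>
            <title>A sample poem, encoded for demonstration</title>
          </titleStmt>
          <publicationStmt>
            <p>Unpublished demonstration text.</p>
          </publicationStmt>
          <sourceDesc>
            <p>Born-digital sample; no source edition.</p>
          </sourceDesc>
        </fileDesc>
      </teiHeader>
      <text>
        <body>
          <!-- One stanza: the line group carries the structure,
               each verse line is an l element -->
          <lg type="stanza">
            <l n="1">The rose is red, the violet blue,</l>
            <l n="2">and markup makes the structure true.</l>
          </lg>
        </body>
      </text>
    </TEI>

A validation exercise can then ask the learner to, say, number the lines or record a rhyme scheme, and check the resulting document against the schema.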
It has always been the TEI by Example project's aim to integrate real examples from the TEI community within the modules. In December 2006, a call for examples was sent out to the TEI community via the TEI-L email discussion list (http://listserv.brown.edu/archives/cgi-bin/wa?A0=TEI-L). Specific projects known to be using TEI to mark up interesting and complex texts were also contacted, to ascertain whether they would be able to provide examples of specific encoding approaches and to contextualise encoding theory with real-world examples. However, the response from the community so far has been disappointing. It is understood that much intellectual effort goes into marking up a text, and that the creators of marked-up texts may not want to make their TEI-based markup available. TEI is often used as a production standard, and as a result, users can be hesitant to let others glimpse the internal workings of a system or project. It is also clear that there is some concern that markup approaches would be criticised, and projects are not keen to "air their dirty laundry in public", even though showing users real-life examples of markup can be more instructive than perfect specimen cases. Additionally, learning good techniques from the observation of bad techniques is a well-used pedagogical approach which has some benefit: "An understanding of practical rhetoric as conduct … provides what a teacher cannot: a locus for questioning, for criticism, for distinguishing good practice from bad" (Miller, 1989, p. 23). It would be useful for projects to be able to provide examples where they do not feel the markup was well executed, and to comment on why this is the case. However, it is understood that individuals and projects do not wish to be open to criticism. It may also be that projects and individuals do not wish to contribute to the development of a resource which is operating outside the safe bounds of the established, limited TEI community: as yet, the TEI by Example project has no official relationship with the Text Encoding Initiative itself (until the tutorial development has been undertaken it was felt that it was better to keep the team small and focussed, although this may be revised in future; certain established members of the TEI community have been critical of the efforts of TEI by Example, perhaps because they do not like the implied criticism that their approaches and their teaching methods are not reaching a wide audience). Additionally, due to the paucity of examples available from the community, the TEI by Example tutorials have been written with examples created by the project itself, which has come under some criticism (although this has the added benefit of allowing all learners to start from the same carefully chosen point). Finally, providing TEI by Example with marked-up XML files is not enough for examples to be of use: it is important that the examples are accompanied by a brief introductory commentary on the editorial approaches used within each document's markup, so contribution to TBE requires an investment of time and effort from a project, which is an additional task for already hard-pressed people to add to their to-do lists. The TBE project continues to attempt to reach out to individuals and projects within the community and to encourage them to submit examples for use in the tutorials. Until this happens, a selection of texts is being marked up by the project itself, for use in the tutorials, providing textbook examples of markup approaches. It is acknowledged that this is second best to building up a library of real-life examples of markup.
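What a usable contribution might look like, in practice, is markup plus a short note on the editorial approach. The following is a hypothetical sketch of ours, not a prescribed TBE submission format:

    <!-- Contributor's note on editorial approach: abbreviations in the
         source are kept in abbr, with a regularised expansion supplied
         in expan, so that either reading can be displayed; the line
         break of the source is recorded with lb rather than silently
         normalised away. -->
    <p>Received of <persName>Mr. William Smith</persName> the sum of
      five shillings,<lb/>in full payment of his
      <choice><abbr>acct.</abbr><expan>account</expan></choice>.</p>

Even a note this brief turns a bare sample into something a learner can interrogate: it explains not just what was encoded, but why.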
A poster presented at Digital Humanities 2007 (Van den Branden et al., 2007) encouraged feedback regarding the implementation and design of the tutorials from the Digital Humanities community. A poster presented at Digital Humanities 2008 will encourage user testing from the Digital Humanities community and allow us to integrate feedback into the design process before launching the tutorials online.

5. Future Work

The response of the TEI community to TEI by Example has been muted: although many examples of TEI markup have been promised, few have been provided to the project. Any projects working on texts that they think may be of interest to a learner, or those who would like their texts to be considered, should get in touch with the TEI by Example project: teibyexample@kantl.be. In order to support multilingualism in the text encoding community, the online tutorials are being considered for translation into a number of languages from their original English. The translations proper, however, are outside the scope of the initial stages of the project, but the problems presented by internationalisation are important and pressing ones. Issues such as how to provide relevant examples of various text types in different languages must be addressed, as must the question of how to reach as wide an audience as possible through the translation of the tutorial teaching materials into various languages. Further user testing needs to be undertaken once the next phase of development of the project begins. Students from both University College London and the University of Antwerp will be asked to give feedback on the TEI by Example materials. Additionally, at some stage the tutorials will be open to feedback from the TEI community itself: we brace ourselves for the reaction.

6. Conclusion

TEI by Example is a modest but important project which aims to produce stand-alone tutorials in the use of the Text Encoding Initiative's guidelines for document markup, and which should be of use to the Digital Humanities audience and beyond. It is hoped that the project results will be relevant to the trainers of TEI, the students of TEI, the text encoding community, and the humanities computing community in general. To do this, it is important to involve the TEI community not only in the design and testing of the tutorials, but also in the provision of real-world examples of markup materials which can be used as an alternative inroad for interested individuals wishing to learn and understand the aims of the TEI. By making use of the possibilities afforded by the online teaching environment, and by creating and tailoring TEI-based teaching materials which can be used both by individuals and in classroom training sessions, the TEI by Example project aims to expand the user community of TEI by providing teaching materials which cater fully to learners, rather than materials provided for the small community of TEI experts who have little requirement for introductory materials.

References

Bernold, L.E. (n.d.). "Learning Oriented Teaching and Academic Success". Department of Civil Engineering, North Carolina State University. http://www2.ncsu.edu/CIL/CARL/Education/Classes/LearnResearch.html

Burnard, L. and Bauman, S. (eds) (2007). TEI P5: Guidelines for Electronic Text Encoding and Interchange. Text Encoding Initiative Consortium. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index.html

Cummings, J. (2007). "Quick Question Regarding OTA and TEI". Personal email communication to M. Terras, 23 October 2007.
Driscoll, M.J. (2005). "Review: Mats Dahlström, Espen S. Ore, and Edward Vanhoutte, Literary and Linguistic Computing, Vol. 19, No. 1, April 2004". In: A.M. Hansen (ed.), The Book as Artefact. Text and Border. Variants – The Journal of the European Society for Textual Scholarship, Volume 4, pp. 337-340.

Garrison, D.R. (1997). "Self Directed Learning, Towards a Comprehensive Model". Adult Education Quarterly, Vol. 48, No. 1, 18-33.

Jochems, W., van Merrienboer, J. and Koper, R. (eds) (2003). Integrated E-Learning: Implications for Pedagogy, Technology and Organization (Open & Flexible Learning). RoutledgeFalmer.

Kelleher, C. and Pausch, R. (2005). "Lowering the Barriers to Programming: A Taxonomy of Programming Environments and Languages for Novice Programmers". ACM Computing Surveys, June 2005, 37(2):83-137.

Mayer, R. (1981). "The Psychology of How Novices Learn Computer Programming". ACM Computing Surveys (CSUR), Volume 13, Issue 1, pp. 121-141.

Mayer, R. (1988). Teaching and Learning Computer Programming: Multiple Research Perspectives. Lawrence Erlbaum Associates Inc, USA.

McCarty, W. (2006). "Tree, Turf, Centre, Archipelago - or Wild Acre? Metaphors and Stories for Humanities Computing". Literary and Linguistic Computing, 21/1: 1-13.

Miller, C.R. (1989). "What's practical about technical writing". In B.E. Fearing & W. Keats Sparrow (eds), Technical Writing: Theory and Practice. New York: Modern Language Association, pp. 14-24.

Norman, G.R., and Schmidt, H.G. (1992). "The psychological basis of problem-based learning: a review of the evidence". Academic Medicine, 67(9): 557-65.

Savin-Baden, M. and Wilkie, K. (2006). Problem Based Learning Online. Open University Press.

Sperberg-McQueen, C.M. and Burnard, L. (2002a). "A Gentle Introduction to XML". In: Sperberg-McQueen, C.M. and Burnard, L. (eds) (2002). TEI P4: Guidelines for Electronic Text Encoding and Interchange (XML-compatible edition). Text Encoding Initiative Consortium: Oxford, Providence, Charlottesville, Bergen. http://www.tei-c.org/P4X/SG.html

Sperberg-McQueen, C.M. and Burnard, L. (2002b). "TEI Lite: An introduction to Text Encoding for Interchange". http://www.tei-c.org/Lite/teiu5_en.html

Sperberg-McQueen, C.M. and Burnard, L. (2007). "A Gentle Introduction to XML". In: Burnard, L. and Bauman, S. (eds) (2007). TEI P5: Guidelines for Electronic Text Encoding and Interchange. Text Encoding Initiative Consortium. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index.html

Stephenson, J. (2001). Teaching and Learning Online: New Pedagogies for New Technologies. Routledge.

TEI (2004). "P4 Guidelines for Electronic Text Encoding and Interchange, XML-compatible edition". http://www.tei-c.org/P4X/index.html

TEI (2005). Burnard, L. and Bauman, S. (eds). "TEI P5 Guidelines for Electronic Text Encoding and Interchange". http://www.tei-c.org.uk/P5/Guidelines/index.html

TEI (2007a). Burnard, L. and Bauman, S. (eds). "TEI P5 Guidelines for Electronic Text Encoding and Interchange". http://www.tei-c.org.uk/P5/Guidelines/index.html

TEI (2007b). "TEI Frequently Asked Questions". http://www.tei-c.org/About/faq.xml#body.1_div.1_div.1

Van den Branden, R. (2006). [TBE-R001] – "TEI by Example, initial report", 2006/06/09.
http://www.kantl.be/ctb/project/2006/TBE-R001.htm

Van den Branden, R., Vanhoutte, E., Terras, M. (2007). "TEI by Example". Digital Humanities 2007, University of Illinois at Urbana-Champaign, USA, June 2007. http://www.digitalhumanities.org/dh2007/abstracts/xhtml.xq?id=221

Vanhoutte, E. (2004). "An Introduction to the TEI and the TEI Consortium". Literary and Linguistic Computing 19(1):9-16.

WWP (Women Writers Project) (2007). "Workshops on Text Encoding with TEI". http://www.wwp.brown.edu/encoding/workshops/

Yang, Y. and Cornelius, L.F. (2005). "Preparing Instructors for Quality Online Instruction". Online Journal of Distance Learning Administration, Volume VIII, Number I, Spring 2005. http://www.westga.edu/~distance/ojdla/spring81/yang81.htm

work_aw467ogqnvajnifcoh3uaqtjyy ---- Untitled
Method: W e selected a sample of twenty one publicly fundedprojects with varying levels of use, covering different subject disciplines, to be studied in greater depth. We classified projects as well-used if the server log data from the Arts and Humanities Data Service (AHDS) and Humbul portals showed that they had been repeatedly and frequently accessed by a variety of users. We also mounted a questionnaire on these sites and asked which digital resources respondents found most useful. Although most nominated information resources, such as libraries, archives and reference collections for example the eDNB, three UK publicly funded research resources were mentioned, and thus we added them to the study. We also asked representatives of each AHDS centre to specify which resources in their collections they believed were most used. In the case of Sheffield University the logs showed that a large number of digital projects accessed were based at the Humanities Research Institute. We therefore conducted interviews about the HRI and its role in fostering the creation of digital humanities resources. The selected projects were studied in detail, including any documentation and reports that could be found on the project’s website. We also interviewed a representative of the project, either the principal investigator or a research assistant. Results: Institutional context: The majority of projects that we interviewed had been well supported in technical terms, and this had undoubtedly aided the success of the project, especially where it was associated with a centre of digital humanities excellence such as the Centre for Computing in the Humanities at Kings College London or the HRI at Sheffield. Critical mass aided the spread of good practice in the construction and use of digital resources in the humanities. Where a university valued such activities highly Page 1 Digital Humanities 2007 http://www.ucl.ac.uk/slais/circah/lairah/ http://www.ucl.ac.uk/slais/circah/lairah/ they tended to proliferate. More junior members of staff were inspired to undertake digital humanities research by the success of senior colleagues and early adopters respected for their traditional and digital research. Unfortunately such critical mass is relatively rare in UK universities and some PIs reported that their digital resource was not understood or valued by their departments, and thus their success had not lead to further digital research. Staffing: PIs also stressed how vital it had been to recruit the ideal RAs. These were however relatively difficult to find, as they had to have both disciplinary research expertise and good knowledge of digital techniques. Most RAs therefore required training, which many PIs often found lacking or of poor quality. A further frustration was the difficulty of finding funding to continue research, this meant that an expert RA might leave, necessitating further training of a new employee if the project was granted future funding. Dissemination: The strongest correlation between well-used projects and a specific activity was in the area of dissemination. In all the projects studied, staff had made determined efforts to disseminate information as widely as possible. This was a new challenge for many humanities academics, who were more used to writing books, marketed by their publishers. 
This might include giving papers at seminars and conferences both within the subject community and the digital humanities domain; sending out printed material; running workshops; and, in the most unusual instance, the production of a tea-towel!

User contact: Very few projects maintained contact with their users or undertook any organised user testing, and many did not have a clear idea how popular the resource was or what users were doing with it. However, one of the few projects that had been obliged to undertake user surveys by its funders was very well-used, and its PI had been delighted at the unexpected amount and range of its use. Another project came to the belated realisation that if it had consulted users, the process of designing the resource would have been simpler and less demanding.

Documentation: Few of the projects kept organised documentation, with the exception of those in archaeology, linguistics and archival studies, where such a practice is the norm in all research. Most projects had kept only fragmentary, internal documents, many of which would not be comprehensible to someone from outside. Documentation could also be difficult to access, with only a small minority of projects making this information available from their websites. This is an important omission, since documentation aids reuse of resources, and also provides vital contextual information about a resource's contents and the rationale for its construction, which users need to reassure them about the quality of the resource for academic research.

Sustainability: Another area of concern was the issue of sustainability. Although the resources were offered for deposit with the AHDS, few PIs were aware that, to remain usable, both the web interface and the contents of the resource would require regular updating and maintenance, since users tend to distrust a web page that looks outdated. Yet in most cases no resources were available to perform such maintenance, and we learnt of one ten-year-old resource whose functionality had already been significantly degraded as a result.

Conclusion and recommendations

Well-used projects do therefore share common features that predispose them to success. The effect of institutional and disciplinary culture in the construction of digital humanities projects was significant. We found that critical mass was vital, as was prestige within a university or the acceptance of digital methods in a subject. The importance of good project staff and the availability of technical support also proved vital. If a project was to be well-used, it was also essential that information about it should be disseminated as widely as possible. Even amongst well-used projects, however, we found areas that might be improved: these included organised user testing, the provision of and easy access to documentation, and the updating and maintenance of many resources.

Recommendations:

Documentation:
• Projects should keep documentation and make it available from the project web site, making clear the extent, provenance and selection methods of materials for the resource.
• Funding bodies might consider making documentation a compulsory deliverable of a funded project.
• Discussions could be held between relevant stakeholders and the funding bodies, with the aim of producing an agreed documentation template. This should specify what should be documented and to what level of detail.
Users:
• Projects should have a clear idea of whom the expected users might be; consult them as soon as possible; and maintain contact through the project via a dedicated email list, website feedback or another appropriate method.
• They should carry out formal user surveys, software and interface tests, and integrate the results into project design.
• Applicants for funding should show that they have consulted documentation of other relevant projects and discuss what they have learnt from it in their case for support. The results of such contact could then be included in the final report as a condition of satisfactory progress.

Management:
• Projects should have access to good technical support, ideally from a centre of excellence in digital humanities.
• Projects should recruit staff who have both subject expertise and knowledge of digital humanities techniques, then train them in other specialist techniques as necessary.
• Funding bodies might consider requiring universities to offer more training for graduate students and RAs in digital humanities techniques.

Sustainability:
• Ideally projects should maintain and actively update the interface, content and functionality of the resource, and not simply archive it with a data archive such as the AHDS. However, this is dependent on a funding model which makes this possible.

Dissemination:
• Projects should disseminate information about themselves widely, both within their own subject domain and in digital humanities.
• Information should be disseminated widely about the reasons for user testing and its benefits, for example via AHRC/AHDS workshops. Projects should be encouraged to collaborate with experts on user behaviour.

Acknowledgements:

This project was funded by the Arts and Humanities Research Council ICT Strategy Scheme. We would also like to thank all of our interviewees for agreeing to talk to us.

Bibliography

Barrett, A. "The Information Seeking Habits of Graduate Student Researchers in the Humanities." The Journal of Academic Librarianship 31.4 (2005): 324-331.

British Academy. E-resources for Research in the Humanities and Social Sciences: A British Academy Policy Review, section 3.5. 2005. http://www.britac.ac.uk/reports/eresources/report/sect3.html#part5

Herman, E. "End-users in Academia: Meeting the Information Needs of University Researchers in an Electronic Age. Part 2: Innovative Information-accessing Opportunities and the Researcher: User Acceptance of IT-based Information Resources in Academia." Aslib Proceedings (2001): 431-457.

Talja, S., and H. Maula. "Reasons for the Use and Non-use of Electronic Journals and Databases: A Domain Analytic Study in Four Scholarly Disciplines." Journal of Documentation 59.6 (2003): 673-691.

Warwick, C., M. Terras, P. Huntington, and N. Pappa. "If You Build It Will They Come? The LAIRAH Survey of Digital Resources in the Arts and Humanities." Paper presented at Digital Humanities 2006, Paris Sorbonne, 5-9 July 2006.

work_awm7ys2jurdglkgnbzs4sxlfgq ---- Untitled
The machines needed more people than anticipated to tend them, took lon- ger to get running, and proved less flex- ible. So why did hundreds of companies rush into computerization before its economic feasibility was established? Worthington had warned that “The first competitor in each industry to oper- ate in milliseconds, at a fraction of his former overhead, is going to run rings around his competition. There aren’t many businesses that can afford to take a chance on giving this fellow a five-year lead. Therefore, most of us have to start now, if we haven’t started already.”a Following his belief that “the omi- nous rumble you sense is the future coming at us.” Worthington was soon to give up his staff job at Hughes Aircraft in favor of a consulting role, promoting his own expertise as a guide toward the electronic future. He had promised that “We can set our course toward push-but- ton administration, and God willing we can get there.” Similar statements were being made on the pages of the Harvard Business Review and in speeches deliv- ered by the leaders of IBM and other business technology companies as a a W.B. Worthington. “Application of Electronics to Administrative Systems,” Systems and Proce- dures Quarterly 4, 1 (Feb. 1953), 8–14. Quoted in T. Haigh, “The Chromium-Plated Tabula- tor: Institutionalizing an Electronic Revolu- tion, 1954–1958,” IEEE Annals of the History of Computing 23, 4 (Oct.–Dec. 2001), 75–104. T H I S C O L U M N I S inspired by the fashionable concept of the “digital humanities.” That will be our destination rather than our starting point, as we look back at the long history of the idea that adoption of computer technology is a revolutionary moment in human histo- ry. Along the way we will visit the work of Nicholas Negroponte and Bruno Latour, whose books Being Digital and We Have Never Been Modern I splice to suggest that we have, in fact, never been digital. The computer is not a particularly new invention. The first modern com- puter programs were run in 1948, long before many of us were born. Yet for decades it was consistently presented as a revolutionary force whose immi- nent impact on society would utterly transform our lives. This metaphor of “impact,” conjuring images of a bulky asteroid heading toward a swamp full of peacefully grazing dinosaurs, presents technological change as a violent event we need to prepare for but can do noth- ing to avert. Discussion of the looming revolution tended to follow a pattern laid out in the very first book on electronic computers written for a broad audience: Edmund Callis Berkeley’s 1949 Giant Brains: Or Machines That Think.1 Ever since then the computer has been surrounded by a cloud of promises and predications, de- scribing the future world it will produce. The specific machines described in loving detail by Berkeley, who dwelled on their then-novel arrangements of re- lays and vacuum tubes, were utterly ob- solete within a few years. His broader hopes and concerns for thinking ma- chines, laid out in chapters on “what they might do for man” and “how soci- ety might control them” remain much fresher. 
For example, he discussed the potential for autonomous lawnmow- ers, automated translation, machine dictation, optical character recogni- tion, an “automatic cooking machine controlled by program tapes,” and a system by which “all the pages of all books will be available by machine.” “What,” he asked, “shall I do when a ro- bot machine renders worthless all the skills I have spent years in developing?” Computer systems have always been sold with the suggestion they represent a ticket to the future. One of my favorite illustrations of this comes from 1953, when W.B. Worthington, a business sys- tems specialist, promised at a meeting of his fellows that “the changes ahead appear to be similar in character but far beyond those effected by printing.” At DOI:10.1145/2644148 Thomas Haigh V viewpoints Computer systems have always been sold with the suggestion they represent a ticket to the future. http://dx.doi.org/10.1145/2644148 S E P T E M B E R 2 0 1 4 | V O L . 5 7 | N O . 9 | C O M M U N I C AT I O N S O F T H E A C M 25 viewpoints V viewpoints cal concept within computing. It began as one of the two approaches to high- speed automatic computation back in the 1940s. The new breed of “comput- ing machinery,” after which the ACM was named, was called digital because the quantities the computer calculated with were represented as numbers. That is to say they were stored as a series of digits, whether on cog wheels or in elec- tronic counters, and whether they were manipulated as decimal digits or the 0s and 1s of binary. This contrasted with the better-established tradition of ana- log computation, a term derived from the word “analogy.” In an analog device an increase in one of the quantities be- ing modeled is represented by a corre- sponding increase in something inside the machine. A disc rotates a little faster; a voltage rises slightly; or a little more fluid accumulates in a chamber. Tradi- tional speedometers and thermometers are analog devices. They creep up or down continuously, and when we read off a value we look for the closest num- ber marked on the gauge. Throughout the 1950s and 1960s ana- log and digital computers coexisted. The titles of textbooks and university classes would include the word “analog” or “dig- ital” as appropriate to avoid confusion. Eventually the increasing power and re- liability of digital computers and their broad social alliance assembled itself behind the new technology. After this initial surge of interest in computerization during the 1950s there have been two subsequent peaks of en- thusiasm. During the late 1970s and early 1980s the world was awash with discussion of the information society, post-industrial society, and the micro- computer revolution. There followed, in the 1990s, a wave of enthusiasm for the transformative potential of computer networks and the newly invented World Wide Web. Rupture Talk and Imaginaires Discussion of the “computer revolution” was not just cultural froth whipped up by the forces of technological change. Instead the construction of this shared vision of the future was a central part of the social process by which an unfa- miliar new technology became a central part of American work life. 
Patrice Flichy called these collective visions “imagi- naires” and has documented their im- portance in the rapid spread of the In- ternet during the 1990s.2 Rob Kling, a prolific and influential researcher, wrote extensively on the importance of “com- puterization movements” within orga- nizations and professional fields.5 Historian of technology Gabrielle Hecht called such discussion “rupture talk” in her discussion of the enthusi- asm with which France reoriented its co- lonial power and engineering talent dur- ing the 1950s around mastery of nuclear technology.4 This formulation captures its central promise: that a new technol- ogy is so powerful and far-reaching it will break mankind free of history. Details of the utopian new age get filled in accord- ing to the interests, obsessions, and po- litical beliefs of the people depicting it. That promise is particularly appealing to nations in need of a fresh start and a boost of confidence, as France then was, but its appeal seems to be universal. This dismissal of the relevance of experi- ence or historical precedent carries out a kind of preventative strike on those who might try to use historical parallels to ar- gue that the impact of the technology in question might in fact be slower, more uneven, or less dramatic than promised. Yet this fondness for rupture talk is itself something with a long history around technologies such as electric power, te- legraphy, air travel, and space flight. Enter “The Digital” One of the most interesting of the clus- ter of concepts popularized in the early 1990s to describe the forthcoming revo- lution was the idea of “the digital” as a new realm of human experience. Digital had, of course, a long career as a techni- 26 C O M M U N I C AT I O N S O F T H E A C M | S E P T E M B E R 2 0 1 4 | V O L . 5 7 | N O . 9 viewpoints falling cost squeezed analog computers out of the niches, such as paint mixing, in which they had previously been pre- ferred. Most analog computer suppliers left the industry, although Hewlett-Pack- ard made a strikingly successful transi- tion to the digital world. By the 1970s it was generally no longer necessary to prefix computer with “digital” and con- sequently the word was less frequently encountered in computing circles. “Digital” acquired a new resonance from 1993, with the launch of the in- stantly fashionable Wired magazine. In the first issue of Wired its editor pro- claimed the “the Digital Revolution is whipping through our lives like a Ben- gali typhoon,” just as enthusiasm was building for the information superhigh- way and the Internet was being opened to commercial use. Wired published lists of the “Digerati”—a short-lived coinage conservative activist and proph- et of unlimited bandwidth George Gilder used to justify something akin to People’s list of the sexiest people alive as judged on intellectual appeal to lib- ertarian techno geeks. The magazine’s title evoked both electronic circuits and drug-heightened fervor. As Fred Turner showed in his book From Counter Cul- ture to Cyberculture, Wired was one in a series of bold projects created by a shifting group of collaborators orbiting libertarian visionary Steward Brand.8 Brand had previously created the Whole Earth Catalog back in the 1960s and a pi- oneering online community known as the WELL (Whole Earth ‘Lectronic Link) in the 1980s. His circle saw technology as a potentially revolutionary force for personal empowerment and social transformation. 
In the early 1990s this held together an unlikely alliance, from Newt Gingrich who as House Speaker suggested giving laptops to the poor rather than welfare payments, to the fu- turist Alvin Toffler, U.S. Vice President Al Gore who championed government support for high-speed networking, and Grateful Dead lyricist John Perry Barlow who had founded the Electronic Fron- tier Foundation to make sure that the new territory of “cyberspace” was not burdened by government interference. One of the magazine’s key figures, Nicholas Negroponte, was particularly important in promoting the idea of “the digital.” Negroponte was the entrepre- neurial founder and head of MIT’s Me- dia Lab, a prominent figure in the world of technology whose fame owed much to a book written by Brand. Negroponte took “digital” far beyond its literal mean- ing to make it, as the title of his 1995 book Being Digital, suggested, the defin- ing characteristic of a new way of life. This was classic rupture talk. His central claim was that in the past things “made of atoms” had been all important. In the future everything that mattered would be “made of bits.” As I argued in a previous column, all information has an underlying ma- terial nature.3 Still, the focus on digi- tal machine-readable representation made some sense: the computer is an exceptionally flexible technology whose applications gradually expanded from scientific calculation to business administration and industrial control to communication to personal enter- tainment as their speed has risen and their cost fallen. Each new application meant representing a new aspect of the world in machine-readable form. Like- wise, the workability of modern com- puters depended on advances in digital electronics and conceptual develop- ments in coding techniques and infor- mation theory. So stressing the digital nature of computer technology is more revealing than calling the computer an “information machine.” Here is a taste of Being Digital: “Ear- ly in the next millennium, your left and right cuff links or earrings may com- municate with each other by low-orbit- ing satellites and have more computer power than your present PC. Your tele- phone won’t ring indiscriminately; it will receive, sort, and perhaps respond to your calls like a well-trained English butler. Mass media will be refined by systems for transmitting and receiv- ing personalized information and entertainment. Schools will change to become more like museums and playgrounds for children to assemble ideas and socialize with children all over the world. The digital planet will look and feel like the head of a pin. As we interconnect ourselves, many of the values of a nation-state will give way to those of both larger and small- er communities. We will socialize in digital neighborhoods in which physi- cal space will be irrelevant and time will play a different role. Twenty years from now, when you look out of a win- dow what you see may be five thousand miles and six time zones away…” Like any expert set of predictions this cluster of promises extrapolated social and technology change to yield a mix of the fancifully bold, the spot-on, and the overly conservative. Our phones do support call screening, although voice communication seems to be dwindling. Online communities have contributed to increased cultural and political po- larization. Netflix, Twitter, blogs, and YouTube have done more than “refine” mass media. 
As for those satellite cuff links, well the “Internet of Things” remains a fu- turistic vision more than a daily reality. As the career of the “cashless society” since the 1960s has shown, an imagi- naire can remain futuristic and excit- ing for decades without ever actually arriving.b However, when the cuff links of the future do feel the need to com- municate they seem more likely to chat over local mesh networks than precious satellite bandwidth. This prediction was perhaps an example of the role of future visions in promoting the interests of the visionary. Negroponte was then on the board of Motorola, which poured bil- lions of dollars into the Iridium network of low-earth orbit satellites for phone and pager communication. That busi- ness collapsed within months of launch in 1998 and plans to burn up the satel- lites to avoid leaving space junk were canceled only after the U.S. defense de- partment stepped in to fund their con- tinued operation. b A phenomenon I explore in more detail in B. Batiz-Lazo, T. Haigh, and D. Steans, “How the Future Shaped the Past: The Case of the Cash- less Society,” Enterprise and Society, 36, 1 (Mar. 2014), 4–17. A wave of enthusiasm for “the digital” has swept through humanities departments worldwide. S E P T E M B E R 2 0 1 4 | V O L . 5 7 | N O . 9 | C O M M U N I C AT I O N S O F T H E A C M 27 viewpoints Eroding the Future Of course we never quite got to the digital future. My unmistakably analog windows show me what is immediately outside my house. Whether utopian or totalitarian, imagined future worlds tend to depict societies in which ev- ery aspect of life has changed around a particular new technology, or everyone dresses in a particular way, or everyone has adopted a particular practice. But in reality as new technologies are assimi- lated into our daily routines they stop feeling like contact with an unfamiliar future and start seeming like familiar objects with their own special character. If a colleague reported that she had just ventured into cyberspace after booking a hotel online or was considering taking a drive on the information superhigh- way to send email you would question her sincerity, if not her sanity. These metaphors served to bundle together different uses of information technol- ogy into a single metaphor and distance them from our humdrum lives. Today, we recognize that making a voice or vid- eo call, sending a tweet, reading a Web page, or streaming a movie are distinct activities with different meanings in our lives even when achieved using the same digital device. Sociologist Bruno Latour, a giant in the field of science studies, captured this idea in the title of his 1993 book We Have Never Been Modern, published just as Ne- groponte began to write his columns for Wired. Its thesis was that nature, tech- nology, and society have never truly been separable despite the Enlightenment and Scientific Revolution in which their separation was defined as the hallmark of modernity. Self-proclaimed “mod- erns” have insisted vocally on these sepa- rations while in reality hybridizing them into complex socio-technical systems. Thus, he asserts “Nobody has ever been modern. Modernity has never begun. There has never been a modern world.”6 Latour believed that “moderns,” like Negroponte, see technology as some- thing external to society yet also as something powerful enough to define epochs of human existence. 
As Latour wrote, "the history of the moderns will be punctuated owing to the emergence of the nonhuman—the Pythagorean theorem, heliocentrism…the atomic bomb, the computer…. People are going to distinguish the time 'BC' and 'AC' with respect to computers as they do the years 'before Christ' and 'after Christ'." He observed that rhetoric of revolution has great power to shape history, writing that "revolutions attempt to abolish the past but they cannot do so…" Thus we must be careful not to endorse the assumption of a historical rupture as part of our own conceptual framework. "If there is one thing we are incapable of carrying out," Latour asserted, "it is a revolution, whether it be in science, technology, politics, or philosophy.…" Our world is inescapably messy, a constant mix of old and new in every area of culture and technology. In one passage Latour brought things down to earth by discussing his home repair toolkit: "I may use an electric drill, but I also use a hammer. The former is 35 years old, the latter hundreds of thousands. Will you see me as a DIY expert 'of contrasts' because I mix up gestures from different times? Would I be an ethnographic curiosity? On the contrary: show me an activity that is homogenous from the viewpoint of the modern time."

According to science fiction writer William Gibson, "The future is already here—it's just not very evenly distributed."c That brings me comfort as a historian because of its logical corollary, that the past is also mixed up all around us and will remain so.d Even Negroponte acknowledged the uneven nature of change. Back in 1998, in his last column for Wired, he noted that "digital" was destined for banality and ubiquity as "Its literal form, the technology, is already beginning to be taken for granted, and its connotation will become tomorrow's commercial and cultural compost for new ideas. Like air and drinking water, being digital will be noticed only by its absence, not its presence."7

c The sentiment is Gibson's, although there is no record of him using those specific words until after they had become an aphorism. See http://quoteinvestigator.com/2012/01/24/future-has-arrived/.

d Gibson himself appreciates this, as I have discussed elsewhere: T. Haigh, "Technology's Other Storytellers: Science Fiction as History of Technology," in Science Fiction and Computing: Essays on Interlinked Domains, D.L. Ferro and E.G. Swedin, Eds., McFarland, Jefferson, N.C., 2011, 13–37.

Digital Humanities

Even after once-unfamiliar technologies dissolve into our daily experience, rupture talk and metaphors of revolution can continue to lurk in odd and unpredictable places. While we no longer think of the Internet as a place called "cyberspace" the military-industrial complex seems to have settled on "cyber warfare" as the appropriate name for online sabotage. Likewise, the NSF has put its money behind the idea of "cyberinfrastructure." The ghastly practice of prefixing things with an "e" has faded in most realms, but "e-commerce" is hanging on. Like most other library schools with hopes of continued relevance my own institution has dubbed itself an "iSchool," copying the names of Apple's successful consumer products. There does not seem to be any particular logic behind this set of prefixes and we might all just as well have settled on "iWarfare," "cybercommerce" and "e-school." But these terms will live on, vestiges of the crisp future vision that destroyed itself by messily and incompletely coming true.
The dated neologism I have been hearing more and more lately is "the digital humanities." When I first heard someone describe himself as a "digital historian" the idea that this would be the best way to describe a historian who had built a website seemed both pretentious and oddly outdated. Since then, however, a wave of enthusiasm for "the digital" has swept through humanities departments nationwide.

According to Matthew Kirschenbaum, the term "digital humanities" was first devised at the University of Virginia back in 2001 as the name for a mooted graduate degree program. Those who came up with it wanted something more exciting than "humanities computing" and broader than "digital media," two established alternatives. It spread widely through the Blackwell Companion to the Digital Humanities issued in 2004. As Kirschenbaum noted, the reasons behind the term's spread have "primarily to do with marketing and uptake" and it is "wielded instrumentally" by those seeking to further their own careers and intellectual agendas. In this, humanists are not so different from Worthington back in the 1950s, or Negroponte and his fellow "digerati" in the 1990s, though it is a little incongruous that they appropriated "the digital" just as he was growing tired of it.

The digital humanities movement is a push to apply the tools and methods of computing to the subject matter of the humanities. I can see why young humanists trained in disciplines troubled by falling student numbers, a perceived loss of relevance, and the sometimes alienating hangover of postmodernism might find something liberating and empowering in the tangible satisfaction of making a machine do something. Self-proclaimed digital humanists have appreciably less terrible prospects for employment and grant funding as a humanist than the fusty analog variety. As Marge Simpson wisely cautioned, "don't make fun of grad students. They just made a terrible life choice."

It is not clear exactly what makes a humanist digital. My sense is the boundary shifts over time, as one would have to be using computers to do something that most of one's colleagues did not know how to do. Using email or a word processing program would not qualify, and having a homepage will no longer cut it. Installing a Web content management system would probably still do it, and anything involving programming or scripting definitely would. In fact, digital humanists have themselves been arguing over whether a humanist has to code to be digital, or if writing and thinking about technology would be enough. This has been framed by some as a dispute between the virtuous modern impulse to "hack" and the ineffectual traditional humanities practice of "yack." As someone who made a deliberate (and economically rather perverse) choice to shift from computer science to the history of technology after earning my first master's degree, I find this glorification of technological tools a little disturbing. What attracted me to the humanities in the first place was the promise of an intellectual place where one could understand technology in a broader social and historical context, stepping back from the culture of computer enthusiasm that valued coding over contemplating and technological means over human ends.

There is a sense in which historians of information technology work at the intersection of computing and the humanities.
Certainly we have attempted, with rather less success, to interest humanists in computing as an area of study. Yet our aim is, in a sense, the opposite of the digital humanists: we seek to apply the tools and methods of the humanities to the subject of computing (a goal shared with newer fields such as "platform studies" and "critical code studies"). The humanities, with their broad intellectual perspective and critical sensibility, can help us see beyond the latest fads and think more deeply about the role of technology in the modern world. Social historians have done a great job examining the history of ideas like "freedom" and "progress," which have been claimed and shaped in different ways by different groups over time. In the history of the past 60 years ideas like "information" and "digital" have been similarly powerful, and deserve similar scrutiny. If I were a "digital historian," whose own professional identity and career prospects came from evangelizing for "the digital," could I still do that work?

There are many ways in which new software tools can contribute to teaching, research, and dissemination across disciplines, but my suspicion is that the allure of "digital humanist" as an identity will fade over time. It encompasses every area of computer use (from text mining to 3D world building) over every humanities discipline (from literary theory to classics). I can see users of the same tools in different disciplines finding an enduring connection, and likewise users of different tools in the same discipline. But the tools most useful to a particular discipline, for example the manipulation of large text databases by historians, will surely become part of the familiar scholarly tool set just as checking a bank balance online no longer feels like a trip into cyberspace. Then we will recognize, to adapt the words of Latour, that nobody has ever been digital and there has never been a digital world. Or, for that matter, a digital humanist.

Further Reading

Gold, M.K., Ed. Debates in the Digital Humanities, University of Minnesota Press, 2012. Also at http://dhdebates.gc.cuny.edu/. Broad coverage of the digital humanities movement, including its history, the "hack vs. yack" debate, and discussion of the tension between technological enthusiasm and critical thinking.

Gibson, W. Distrust That Particular Flavor, Putnam, 2012. A collection of Gibson's essays and nonfiction, including his thoughts on our obsession with the future.

Latour, B. Science in Action: How to Follow Scientists and Engineers through Society. Harvard University Press, 1987; and B. Latour and S. Woolgar, Laboratory Life: The Construction of Scientific Facts. Princeton University Press, 1986. We Have Never Been Modern is not the gentlest introduction to Latour, so I suggest starting with one of these clearly written and provocative studies of the social practices of technoscience.

Marvin, C. When Old Technologies Were New: Thinking About Electric Communication in the Late Nineteenth Century. Oxford University Press, 1988. The hopes and fears attributed to telephones and electrical light when they were new provide a startlingly close parallel with the more recent discourse around computer technology.

Morozov, E. To Save Everything, Click Here, Perseus, 2013. A "digital heretic" argues with zest against the idea of the Internet as a coherent thing marking a rupture with the past.

Winner, L. The Whale and the Reactor: A Search for Limits in an Age of High Technology.
University of Chicago Press, 1986. A classic work in the philosophy of technology, including a chapter "Mythinformation" probing the concept of the "computer revolution."

References
1. Berkeley, E.C. Giant Brains or Machines That Think. Wiley, NY, 1949.
2. Flichy, P. The Internet Imaginaire. MIT Press, Cambridge, MA, 2007.
3. Haigh, T. Software and souls; Programs and packages. Commun. ACM 56, 9 (Sept. 2013), 31–34.
4. Hecht, G. Rupture-talk in the nuclear age: Conjugating colonial power in Africa. Social Studies of Science 32, 6 (Dec. 2002).
5. Kling, R. Learning about information technologies and social change: The contribution of social informatics. The Information Society 16, 3 (July–Sept. 2000), 217–232.
6. Latour, B. We Have Never Been Modern. Harvard University Press, Cambridge, MA, 1993.
7. Negroponte, N. Beyond digital. Wired 6, 12 (Dec. 1998).
8. Turner, F. From Counterculture to Cyberculture: Stewart Brand, the Whole Earth Network, and the Rise of Digital Utopianism. University of Chicago Press, Chicago, 2006.

Thomas Haigh (thaigh@computer.org) is an associate professor of information studies at the University of Wisconsin, Milwaukee, and chair of the SIGCIS group for historians of computing. Copyright held by author.

work_ayykq6wo5vh35eywdqi5mb77fm ----

EXPLORING LITERARY LANDSCAPES: FROM TEXTS TO SPATIOTEMPORAL ANALYSIS THROUGH COLLABORATIVE WORK AND GIS

DANIEL ALVES and ANA ISABEL QUEIROZ

Abstract
This article argues that the study of literary representations of landscapes can be aided and enriched by the application of digital geographic technologies. As an example, the article focuses on the methods and preliminary findings of LITESCAPE.PT—Atlas of Literary Landscapes of Mainland Portugal, an on-going project that aims to study literary representations of mainland Portugal and to explore their connections with social and environmental realities both in the past and in the present. LITESCAPE.PT integrates traditional reading practices and 'distant reading' approaches, along with collaborative work, relational databases, and geographic information systems (GIS) in order to classify and analyse excerpts from 350 works of Portuguese literature according to a set of ecological, socioeconomic, temporal and cultural themes. As we argue herein this combination of qualitative and quantitative methods—itself a response to the difficulty of obtaining external funding—can lead to (a) increased productivity, (b) the pursuit of new research goals, and (c) the creation of new knowledge about natural and cultural history. As proof of concept, the article presents two initial outcomes of the LITESCAPE.PT project: a case study documenting the evolving literary geography of Lisbon and a case study exploring the representation of wolves in Portuguese literature.

Keywords: Portuguese literature, interdisciplinary research, collaborative work, databases, GIS

International Journal of Humanities and Arts Computing 9.1 (2015): 57–73. DOI: 10.3366/ijhac.2015.0138. © Edinburgh University Press 2015. www.euppublishing.com/journal/ijhac
1. Literature and Cultural Geography

As the canonical work of John Wright, David Lowenthal, and Hugh Prince suggests, geographers have long valued literary texts as sources of geographical knowledge.1 Among Portuguese geographers, Amorim Girão long ago expressed a similar conviction in the value of literature in his foundational Geografia de Portugal, where he argues, for example, that no one has captured the spirit and the features of the Beira Transmontana (a region in the north of the country) more powerfully than the author Nuno de Montemor.2 In a similar way, contemporary cultural geographers advocate that works of literature are able not only to capture the significance of specific locations, but also to ascribe new kinds of significance to those locations and, in the process, to influence how readers understand and interpret them. By associating places with particular facts and events, it is claimed, writers foster 'processes for making places meaningful in a social medium'.3 Given the wide-ranging implications of such meaning-making activities, it comes as little surprise that studying literary representations of landscapes—and the changing cultural geographies those representations reveal—involves integrating concepts, ideas, and approaches from several fields, including ecology, geography, anthropology, and history.

For cultural geographers, the interdisciplinary study of literary texts is useful because literature often includes both objective descriptions of space and subjective accounts of place,4 as well as information about spatial patterns and processes.5 Literary texts can, moreover, be usefully integrated with other source materials to give new insights about not only how landscapes change, but also how those changes are culturally documented and perceived. Reading literary texts alongside land-use maps, for example, has been shown to reveal different kinds of information about content and scope, spatial scale, time scale and perspective.6

Applying literature in cultural-geographical research is, however, not without its potential challenges. One must, after all, take the inner subjectivity of literary texts into account and, accordingly, allow for the fact that each text presents a partial, and to some degree biased, vision of reality. In order to accommodate this particularity and bias, it is useful to work not with individual texts but with many texts that can then be compared and contrasted. For the purposes of the LITESCAPE.PT project (which we outline below), this was achieved by designing and creating a large, trans-historical corpus of Portuguese literature (350 works dating from 1843 to 2014), which in comprising a variety of different writers and genres—and, accordingly, an array of different ideas and perspectives—can be taken as a representative sample.

Assembling and working with a corpus of this kind poses a number of practical and interpretive challenges, which we will address in our discussion of the LITESCAPE.PT project below. For now, it is worth clarifying that
In this way, our research can be viewed as embedded both within the specific framework of ‘macroanalytic’ literary history (as advocated by Matthew Jockers)7 and within the wider framework of geo-criticism,8 as well as ‘literary cartography’,9 ‘literary geography’,10 and ‘literary GIS’.11 In addition, as will be clarified below, our approach also takes a theoretical orientation from ecological criticism12 in that it draws on literature as a resource for studying environmental history. 2. quantitative and qualitative analyses Quantitative digital methods are useful for working with large amounts of data. They allow the researcher not only to observe and compare patterns, but also to define goals and to test hypotheses.13 Such methods are, of course, most commonly associated with research in the social sciences. However, they have more recently begun to be championed by scholars in the humanities. Franco Moretti, for one, has advocated that literary historians should move away from the ‘close reading’ of individual texts and instead engage in the ‘distant reading’ of large text corpora.14 According to Moretti, one of the main limitations of close reading is that it tends to create blind spots in literary history. This is because the reading of individual texts often leads scholars to focus solely on major works and, accordingly, to ignore lesser-known texts. In response to this problem, Moretti has proposed distant reading as a more scientific and operational method for literary studies.15 The use of such scientific approaches in the humanities can be seem to reflect a desire to follow the example of disciplines, such as physics and evolutionary biology, in using large amounts of data to guide critical inquiry. Not without causing controversy, this ‘identification with new scientific methods gives the impression of a revolutionary new style of research emerging in the humanities’, as Paul Gooding, Melissa Terras, and Clair Warwick have put it.16 Beyond the understandable refusal to change scholarly paradigms and procedures, criticism of the application of scientific methods in the humanities is also based on the problems raised by the manipulation of large volumes of digital data, including both technical issues (such as the mixed quality of optical-character- recognition digitisation) and epistemological concerns (such as bias generated by the automated creation of textual metadata). Largely on account of this, the current trend in the humanities is to strive for a better integration of both quantitative and qualitative approaches.17 Underlying this trend is a desire to make more efficient use of large volumes of texts that 59 January 2, 2015 Time: 03:36pm ijhac.2015.0138.tex Daniel Alves and Ana Isabel Queiroz remain the fundamental sources for researchers in this field. Mass digitization efforts and the democratization of the World Wide Web have made digital texts more accessible than ever before. The tools needed to extract data and to perform spatial analyses of it are now also increasingly available through the capabilities of database management and GIS platforms, which are enabling researchers around the globe to generate new pathways for research and even, in some cases, to address old questions in new ways. With the development of GIS technology many areas of knowledge, the humanities included, have sought to use and to incorporate new methodologies. 
In the discipline of history, for example, the main methodological innovation has been the incorporation of time as a variable in historical geographic information systems (HGIS) research. As has elsewhere been shown, including temporal data enables researchers to join attributes commonly evaluated using GIS such as location, extent, and volume.18 Researchers have also sought to consolidate temporal components by spatialising historical information and by analysing the evolution of geo-referenced datasets. Some of the more recent developments along these lines have focused on the exploration of literary texts and the construction of spatial narratives.19 As yet, however, the use of GIS tools is not sufficiently widespread in the humanities. As a result, their potential is not fully recognised. There are many plausible explanations for this, not least the difficulties involved in planning, creating, and managing a GIS project. Moreover, the application of GIS in the humanities is often less than straightforward. GIS were designed to handle large volumes of quantitative data and, in many cases, this not the type of data that scholars in the humanities use. When joining information and geography to temporal attributes, for example, relations tend to multiply and generate datasets that contain more exceptions than rules, more arbitrariness than standards.20 Several models have been proposed to resolve these problems and to provide GIS with a better way of working with texts. Among the more recent is the proposal to integrate GIS with database management systems (DMS). This integration has only recently become possible, when it materialized Stephen Ramsay’s 2004 thinking on the evolution of relational database design and its future impact in research. As Ramsay explains a ‘database . . . can be set up in such a way as to allow multiple users access’, and data entry ‘from a number of different sources’.21 In this way, he concludes, ‘the logical statements that would flow from that ontology would necessarily exceed the knowledge of any one individual. The power of relational databases to enable the serendipitous apprehension of relationships would be that much more increased’. 22 Alongside these innovations, the recent advent of ‘cloud’ databases offers increasing functionalities for structuring, searching, and accessing information, allowing scholars greater freedom to engage in collaborative research. 60 January 2, 2015 Time: 03:36pm ijhac.2015.0138.tex Exploring Literary Landscapes 3. litescape.pt—atlas of the literary landscapes of mainland portugal Portuguese literature includes a wide variety of landscape representations that have never been studied systematically at a national scale. LITESCAPE.PT is an on-going interdisciplinary project that emerged in this context in order to aid the analysis of literary representations of the landscapes of mainland Portugal: see http://paisagensliterarias.ielt.org. The project works with a corpus of 350 texts (comprising nearly 1.4 million words) published between 1843 and 2014; its overarching aim is to investigate how this corpus can serve a resource for exploring not only environmental and sociological changes, but also how the landscapes of Portugal have evolved in the popular imagination over time. Embracing the idea that writers are also mapmakers,23 the project places the mapping the literary texts at its core. The identification of the geographical references within the corpus is therefore key to our research. 
In order to facilitate the identification of these references, each literary representation of mainland Portugal in our corpus has been digitized and then registered in a shared database as a discrete literary excerpt. These excerpts are passages that can be read and understood independently and that, moreover, give a clear sense of the aesthetic aspects of the works from which they derive. Once identified and extracted, these excerpts were classified into categories (to indicate whether they were concerned with geographic, ecological, socioeconomic, cultural, and/or temporal issues) and then geo-referenced. Ultimately, this will enable us to depict the spatial and thematic information the excerpts contain on an interactive map, called the 'Atlas of Literary Landscapes', that will serve as a source for the development of further interdisciplinary research.24

As the foregoing account suggests, the LITESCAPE.PT project uses a hybrid methodology. Specifically, it combines traditional close-reading methods with 'distant reading', collaborative work, a shared PostgreSQL database, GIS tools, and quantitative methods. The project, at this level, has similarities with other digital literary mapping projects around the globe.25 But in addition to being unique in the field of Portuguese literary studies, it also has some other special defining features. For instance, it works with a large, trans-historical corpus that includes not only contemporary works, but also writings from the Romantic era, when the appreciation of landscape as an aesthetic apprehension of nature came to the fore. LITESCAPE.PT, moreover, focuses not on a particular region or location, but instead embraces the entire national mainland territory of Portugal with all its natural and cultural diversity. The project, furthermore, focuses on landscape changes by using a set of analytical descriptors relating to five main categories. These descriptors include relief forms, land use, natural heritage, cultural heritage, and human activities.
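To make the registration-and-classification workflow concrete, the following is a minimal sketch of how such a shared excerpt database might be structured. The project itself uses PostgreSQL; the sketch relies on Python's built-in sqlite3 module only so that it runs self-contained, and every table name, column name, and sample row is an illustrative assumption rather than the project's actual schema.

    import sqlite3

    # Illustrative schema: one row per literary excerpt, linked to a work and
    # to any number of thematic descriptors and geo-referenced places.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE work (id INTEGER PRIMARY KEY, author TEXT, title TEXT, year INTEGER);
    CREATE TABLE excerpt (
        id INTEGER PRIMARY KEY,
        work_id INTEGER REFERENCES work(id),
        reader TEXT,   -- the collaborator who registered the excerpt
        text TEXT      -- the passage itself, preserved for later close reading
    );
    CREATE TABLE descriptor (  -- thematic classification
        excerpt_id INTEGER REFERENCES excerpt(id),
        category TEXT,         -- e.g. 'ecological', 'socioeconomic'
        term TEXT
    );
    CREATE TABLE place (       -- geographic references, geo-referenced when possible
        excerpt_id INTEGER REFERENCES excerpt(id),
        name TEXT, nuts3 TEXT, lat REAL, lon REAL
    );
    """)

    conn.execute("INSERT INTO work VALUES (1, 'Miguel Torga', 'Contos da Montanha', 1941)")
    conn.execute("INSERT INTO excerpt VALUES (1, 1, 'reader_A', '...')")
    conn.execute("INSERT INTO place VALUES (1, 'Douro', 'Douro', 41.1, -7.8)")

    # The kind of query the article describes: excerpts per geographic unit.
    for row in conn.execute("SELECT nuts3, COUNT(*) FROM place GROUP BY nuts3"):
        print(row)

Because every reader writes to the same tables, each contribution becomes immediately queryable by the whole team, which is the property the shared database is meant to provide.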
The subjects explored in these texts can cover a wide range of topics, including (1) the identification of fictional and non- fictional place names and their relation to human occupation in the territory; (2) the characterisation of land uses; (3) the exploitation of natural resources; (4) landscape processes associated with human activities; (5) landscape changes observed over suitably organised time periods; and (6) the identification of plant and animal species mentioned in the literary scenarios. The goal, in this sense, is not to engage in a philological approach to a small set of literary works, but instead to provide for a thorough analysis of texts to be carried out by the team members on a wide range of literary works. This approach has the virtue of retaining the advantages of the traditional reading methods while, in the process, overcoming some of the pitfalls related to other ‘distant reading’ approaches, such as the need to disambiguate place names, proper nouns and other errors that normally emerge from an automated computational process of text mining.26 Recognising that digital literary mapping is potentially an extremely broad field of research, LITESCAPE.PT defines its parameters by establishing the identification of a geographical unit as a minimum criterion for selecting and registering the literary excerpts. Three inclusive administrative divisions were considered. The larger of these (the so-called NUTS 3)27 is a cluster of twenty- eight municipalities in mainland Portugal. Whenever possible, municipalities or civil parishes have also been identified. In some cases (for instance in urban centres or in descriptive literary works) a precise location was registered by combining places mentioned in the texts with latitudinal and longitudinal coordinates extracted from Google Maps or other gazetteers.28 The resulting information can be read into a GIS application in order to analyse the different excerpts according to the five thematic categories mentioned above. The GIS can also facilitate spatiotemporal analysis of the excerpts and allow the research to integrate (and draw comparisons with) data from other sources. In this way, 62 January 2, 2015 Time: 03:36pm ijhac.2015.0138.tex Exploring Literary Landscapes LITESCAPE.PT has made it possible to analyse the deeper significance of each place or geographical unit registered in each excerpt. One methodological challenge facing the project at present is the difficulty of using more advanced computational-linguistic techniques for exploring the Portuguese texts. Computational linguistic software is mainly available for English. Although a research team is building a version for Portuguese, to date, only a very small sample of the Portuguese literature has been scanned and digitized.29Accordingly, it is difficult to take advantage of the automated text extraction tools that other mapping projects employ. Ultimately, the project aims to design a methodology that could overcome this and to extract from the literary texts their ‘absolute singularity, but with potential links to broader phenomena; [their] irreducible difference, but with similarities that may nonetheless be discerned’.30 Keeping up with all those links and similarities, the database will preserve the original text for subsequent interpretation while at the same time enabling a quantitative and spatial approach. In this process a literary excerpt become part of the relevant material for analysis with a set of structured metadata. 
Relevant links, trends and patterns are then detectable between many excerpts, even from different works and writers. In this sense ‘the corpus as entity shifts meaning away from the text and towards the network’.31 4. landscape changes in portuguese literature The results of the collaborative work can be summarized in a few numbers gathered from the LITESCAPE.PT database. On July 2014, the database comprised 172 authors (mainly Portuguese), 350 literary works (published between 1843 and 2014), 6,082 literary excerpts, and almost 1,400,000 words. All the excerpts have one or more mandatory geographical descriptors. In addition to the twenty-eight NUTS 3 municipalities, readers associated the literary excerpts with more than 2,500 locations. 77.3% of these locations were assigned exact geographical coordinates. The rest were either found to be fictional places, locations that no longer exist, or that are still in the process of being identified. Of all the excerpts, 87.4% were found to have at least one thematic descriptor. On average, readers classified each literary excerpt within two categories and around five thematic descriptors. There are therefore more than 4,000 thematic descriptors in the database, all organised into the five aforementioned categories and twenty-seven subcategories. Up to now, the project has involved thirty-six readers. Their overall contribution is displayed in Figure 1. Two of these readers have been responsible for over 40% of all the work in the database. The work of the five most active readers (all of whom have contributed more than 500 excerpts) accounts for more than two-thirds of the total. Most of the readers recorded excerpts from only one to three books, whereas only four failed to contribute a complete literary work. 63 January 2, 2015 Time: 03:36pm ijhac.2015.0138.tex Daniel Alves and Ana Isabel Queiroz Figure 1. Distribution of the excerpts registered in the LITESCAPE.PT database per readers. (Readers who recorded more than 50 registers are identified by their initials). An overall identical level of participation was observed in similar crowdsourcing projects that use multiple collaborators to foster digitization, despite their focus on different sources and different research questions.32 An overview of the entire corpus showed that landscape descriptions are spread across mainland Portugal, although some territorial units showed a higher concentration (Figure 2). Lisbon and its surroundings stood out with a maximum of 3,132 excerpts, followed by the Douro region with 1,116 excerpts. This different distribution results from the literary production itself (which privileges some regions) and the readers’ interest in certain regions, writers, or subjects. Understanding these two aspects helps avoid a biased conclusion about the distribution of literary landscapes, their meanings and scope throughout the Portuguese literature. 64 January 2, 2015 Time: 03:36pm ijhac.2015.0138.tex Exploring Literary Landscapes Figure 2. Distribution of the excerpts registered in the LITESCAPE.PT database per NUTS 3 municipality in mainland Portugal. 65 January 2, 2015 Time: 03:36pm ijhac.2015.0138.tex Daniel Alves and Ana Isabel Queiroz Lisbon has been widely portrayed in art and literature and was even considered one of the three world literary cities, along with Rome and Constantinople.33 Additionally, it was the scene of major urban, political, and cultural transformations, which is a relevant research topic in the context of the project. 
In addition to the beauty and cultural appeal of the literary landscapes of the Douro region depicted in many texts, including those of Aquilino Ribeiro and Miguel Torga, who were born in the Douro and portray the region in their works. This spatiotemporal reading, which has recently come to the attention of several researchers,34 was until recently commonly overlooked in Portugal. As for our project, several specific research projects have profited from the material stored in the database.35 Since it is not been feasible to document all these projects in this article, we have decided to focus on two exemplary case studies: one concerning literary representations of Lisbon and one concerning literary representations of wolves. The spatiotemporal analysis applied in both studies is representative of the potentials of LITESCAPE.PT project. 4.1 Lisbon as a literary space36 This case study reveals the benefits of integrating methods from different fields. It assesses the development of the literary space of Lisbon over time and it discusses how literary representations of the city resonate with Lisbon’s evolving identity as a cultural and social space. Bringing together 35 novels published from the mid-nineteenth century onwards, the study relied on an interdisciplinary approach to identify and to present literary geographical patterns and to combine these with other sources of information about Lisbon. The literary space of the city, which can be geo-referenced and drawn on a map, was assumed to be defined by the period setting or as evoked by the characters.37 All of the literary works analysed were chosen because they have Lisbon as the central stage in the narrative, and also according to the clearly identified historical period in which they were either written or published.38 Converting literary locations from points to spots, through density analysis, and then to polygons; developing the concepts of literary space cumulative literary space and common literary space; and using methods borrowed from animal ecology: taking each of these steps made it possible to introduce size calculations and to build on the findings of other researchers.39 From a comparative or evolutionary perspective, the polygon shown in Figure 3 enable sequential and overlapping visualizations, which in turn facilitate comparisons with demographic data from different sources that were spatially referenced using the same framework. The results of the study suggested that the literary space did not match the urban space and, furthermore, that it commonly took thirty to forty years for Lisbon’s literary geography to catch up with the city’s expanding urban 66 January 2, 2015 Time: 03:36pm ijhac.2015.0138.tex Exploring Literary Landscapes Figure 3. Common and cumulative literary space in Lisbon as calculated by the Minimum Convex Polygon (at 95%), in four historical periods: 1st (until 1910, the period of monarchic rule); 2nd (1910–1926, the Republic); 3rd (1926–1974, the Dictatorship); 4th (1974 onwards, the period of democratic rule). landscape. Accordingly, whereas the old commercial and political centre of the city, persists as its literary space in all the novels analysed, Lisbon’s peripheries are either absent or underrepresented until 1974. This occurs in spite of the fact that many of these peripheral areas were included in Lisbon’s administrative limits as far back 1886 and were intensively urbanized during the 1950s and 1960s. 
In this way, the study showed that the mapping of an enlarged literary corpus, collected collaboratively and analysed through a combination of qualitative and quantitative approaches (and of traditional and digital techniques), can produce new insights into the cultural evolution of urban landscapes. The methodology employed in this study is fully replicable; it could be applied to another literary corpus to study other cities in other times and to investigate the relationship between real and imagined geographies. Accordingly, it has the potential to lead to a better understanding of the process of 'how city boundaries shaped visions of the urban space as it was lived and experienced', and of how literature can elucidate 'the history of space becoming place'.40

4.2 Representations of wolves in Portuguese literature41

In this case study, the researchers created a 'lupine corpus': a collection of literary representations of mainland Portugal that contained representations of wolves. This corpus was then augmented by the work of seven other readers, who classified other works that had not been identified in the first stage. This common effort led to the creation of a corpus of 262 excerpts from 68 literary works by 29 writers, published between 1875 and 2010. A content analysis was performed using a grid with several categories. These categories encompassed the various forms that the relationship between humans and nature can take. All literary representations were spatially referenced to one or more NUTS 3 clusters and they were also associated with three time periods that applied to the first publication date and the time setting of the narrative. In order to study the literary representation of wolves, these time-stamps were then compared with three other time periods extrapolated from the historical knowledge about the trends of the Iberian wolf's range across Portugal and its different conservation statuses (Figure 4).42

Figure 4. Distribution of literary representations of wolves throughout time (main time of the narrative) by NUTS 3 cluster: Tn1, before 1940; Tn2, between 1940 and 1979; Tn3, after 1980.

Quantitative analysis revealed that although wolves have been represented in literature since the late nineteenth century, the proportion of representations was not independent of the time period of publication. Notably, a strong decline occurred in the works published after 1980. Literary representations of wolves were found to be combined with a variety of topics, approaches, and perspectives, although they were generally found to be less rich and less diverse in terms of their composition over time. The results also suggest that the literary representation of wolves is not homogeneous throughout mainland Portugal, and that the geographic distribution of these representations more-or-less matches that of the Iberian wolf's range and distribution over time.

The approach followed here was enhanced by teamwork, which facilitated the shared analytical effort of classifying and organising the contents of the database concerned with humans' relationships with wolves. By using quantitative and digital methods (namely, mapping with GIS) this explanatory analysis was able to highlight the structure and composition of literary representations of wolves across time and space.
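The independence claim above (that the proportion of wolf representations was not independent of the period of publication) is the kind of statement a chi-squared test on a period-by-occurrence contingency table supports. A minimal sketch follows, with made-up counts used purely for illustration; the study's real figures are in the cited article.

    from scipy.stats import chi2_contingency

    # Rows: publication periods; columns: excerpts with / without wolves.
    table = [[40, 400],    # before 1940
             [35, 900],    # 1940-1979
             [12, 1300]]   # after 1980

    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.3g}")
    # A small p-value indicates that the proportion of wolf representations
    # is not independent of the period of publication.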
From an eco-critical perspective, this approach can be seen as 'an example of the advantages of researching into an enlarged sample of literary texts, producing accurate and comparable results and discussing them using current ecological knowledge'.43

5. Bridging the Divide

The advance of digital methods is a challenge both for those who produce tools and for those who use them in the digital humanities. Despite technological advances, difficulties persist in their application within the humanities. These difficulties can no longer be viewed exclusively as a consequence of a refusal to embrace new technologies, as was the case some years ago.44 Instead, they must be seen to arise because most of the sources and methodologies used by humanists are hard to fit into the structured data embedded in the operational model of databases and GIS. This gap results, in part, from the fact that narrative texts are the main sources and outputs of literary or historical analysis, and these cryptic or nuanced texts may be difficult to read and analyse with digital tools, largely because of the tools' inherent standardisation, where everything has to fit into pre-formatted 'boxes'. But even if digital approaches present some pitfalls (supposedly reductionist features that some authors associate with the use of these tools in humanities research),45 researchers should not refrain from using them.

In this context, landscape representations in literature are challenging objects of study, not only because of their inner complexity, but also because of their continuous changes over time and space. If an appropriate procedure is applied, it necessarily results in converting texts to comparable and measurable features through digital methods and technologies. As we have tried to show above, these results are becoming available and the effort made during the compilation process will probably be fully rewarded. From an extensive archive of literary representations of landscapes, a varied range of inquiries with ambitious goals may be pursued. The studies of the literary space of Lisbon and of literary representations of wolves demonstrate but do not exhaust the full potential of the information we have compiled. The main value of these studies lies in combining traditional academic reading (a first stage of identification and selection) with 'distant reading' strategies (a second stage of analysis and outcomes), with the focus given to certain aspects of the text. These studies relied on a collaborative approach that improved the chances of analysing, on solid ground, the topics depicted in the literary texts, as well as increasing scientific productivity. Relying on a single reader would be unfeasible for exploring an enlarged literary corpus such as LITESCAPE.PT's, and might cause one either to overlook influential, contemporary works or even to ignore 'forgotten titles'.46 Furthermore, collaborative work helps one deal with large amounts of information and overcome a lack of time and funding.47 From this perspective, participatory and collaborative research can be seen to have far more benefits than drawbacks for the digital humanities, since it allows scholars to engage with more information and promotes interdisciplinary research.
There are, of course, concerns about projects based on an 'imperialistic division of labour among scholars'.48 But, as LITESCAPE.PT affirms, one can pursue collaborative work in a way that shares knowledge equally among participants and, in the process, addresses 'the common problem by giving other voices a chance to speak'.49 Academia would benefit from more research projects working from texts to spatiotemporal analysis using collaborative work and GIS. Technology-mediated approaches, such as the use of digital materials, methods, and perspectives, can become a fruitful trend. The methods and outcomes presented in this article contribute to that trend by fostering new insights and by modelling new interdisciplinary practices for bridging the divide between the sciences and the humanities.

End Notes
1 J. Wright, 'Geography in Literature', Geographical Review 14, no. 4 (1924), 659–660; D. Lowenthal and H. Prince, 'English Landscape Tastes', Geographical Review 55, no. 2 (1965), 186–222.
2 A. Girão, Geografia de Portugal (Porto, 1941), 402.
3 M. Crang, Cultural Geography (London, 1998), 44.
4 P. Lewis, 'Beyond description', Annals of the Association of American Geographers 75, no. 4 (1985), 465–477.
5 F. Moreira, 'Patterns and processes', working paper for 'Landscape Ecology: its methods and their applications', April–May 2005, Faro, Portugal.
6 A. I. Queiroz, 'Landscape and Literature: The Ecological Memory of Terras Do Demo, Portugal', in Z. Roca, T. Spek, T. Terkenli, T. Plieninger and F. Höchtl, eds., European Landscapes and Lifestyles: The Mediterranean and Beyond (Lisboa, 2006), 1–20.
7 M. L. Jockers, Macroanalysis: Digital Methods and Literary History (Illinois, 2013). See also F. Moretti, Atlas of the European Novel, 1800–1900 (London, 1998).
8 B. Westphal, La Géocritique, Réel, Fiction, Espace (Paris, 2007).
9 R. Tally, 'Literary Cartography: Space, Representation, and Narrative', Faculty Publications—English, 2008 <https://digital.library.txstate.edu/handle/10877/3932> [accessed 24 May 2011].
10 B. Piatti, A. Reuschel and L. Hurni, 'Literary Geography – or How Cartographers Open up a New Dimension for Literary Studies', in Proceedings of the 24th International Cartography Conference (Santiago: International Cartographic Association, 2009) <http://icaci.org/files/documents/ICC_proceedings/ICC2009/html/nonref/24_1.pdf> [accessed 14 July 2013].
11 D. Cooper and I. Gregory, 'Mapping the English Lake District: A Literary GIS', Transactions of the Institute of British Geographers 36, no. 1 (2011), 89–108.
12 L. Buell, The Future of Environmental Criticism: Environmental Crisis and Literary Imagination (Malden, 2005).
13 See, for instance, S. Dunn and M. Hedges, 'Crowd-Sourcing as a Component of Humanities Research Infrastructures', International Journal of Humanities and Arts Computing 7, no. 1–2 (2013), 147–69.
14 Moretti, 'Conjectures on World Literature', New Left Review 1 (2000), 54–68. Cited here at 56–58.
15 Moretti, ''Operationalizing': Or, the Function of Measurement in Modern Literary Theory', Pamphlets of the Stanford Literary Lab 6 (2013), 1–13 <http://litlab.stanford.edu/LiteraryLabPamphlet6.pdf> [accessed 27 January 2014].
16 P. Gooding, M. Terras and C. Warwick, 'The Myth of the New: Mass Digitization, Distant Reading, and the Future of the Book', Literary and Linguistic Computing 28, no. 4 (2013), 629–639. Cited here at 631.
17 Dunn and Hedges, 148.
18 Ian Gregory, 'Exploiting time and space: A challenge for GIS in the digital humanities', in D. Bodenhamer, J. Corrigan and T. Harris, eds., The Spatial Humanities: GIS and the Future of Humanities Scholarship (Bloomington, 2010).
19 See D. Bodenhamer, Corrigan, and Harris, eds., op. cit.
20 L. Silveira, 'Geographic Information Systems and Historical Research: An Appraisal', International Journal of Humanities and Arts Computing 8, no. 1 (2014), 28–45.
21 S. Ramsay, 'Databases', in S. Schreibman, R. Siemens and J. Unsworth, eds., Companion to Digital Humanities (Oxford, 2004), 177–197. Cited here at 195.
22 Ibid.
23 Tally, 'Literary Cartography'.
24 This interactive map is not yet available due to insufficient funding. Applications for funding were submitted in 2010 and again in 2012, and although the project was rated as 'outstanding', due to budget constraints the application was not successful. Since the beginning, the LITESCAPE.PT project has received financial support from the research unit (IELT-FCSH, UNL); further support was also given by the municipality of Lisbon, through EGEAC.
25 These projects include A Literary Atlas of Europe <http://www.literaturatlas.eu/en/>; Digital Literary Atlas of Ireland, 1922–1949 <http://www.tcd.ie/trinitylongroomhub/digital-atlas/>; GéoCulture, Le Limousin Vu Par Les Artistes <http://geo.culture-en-limousin.fr/?lang=fr>; LitMap Project <http://www.litmapproject.com/>; Mapping Canada: An Interactive Resource <http://aelang.net/projects/canada.htm>; Mapping Lake District Literature <http://www.lancaster.ac.uk/fass/projects/spatialhum.wordpress/?page_id=43>; Placing Literature <http://www.placingliterature.com/>; Stanford Literary Lab <http://litlab.stanford.edu/>; The Space of Slovenian Literary Culture <http://isllv.zrc-sazu.si/en/programi-in-projekti/the-space-of-slovenian-literary-culture-literary-history-and-the-gis-based> [accessed 23 July 2014].
26 I. Gregory and A. Hardie, 'Visual GISting: Bringing Together Corpus Linguistics and Geographical Information Systems', Literary and Linguistic Computing 26, no. 3 (2011), 297–314; see, especially, 301–305.
27 European Commission, 'NUTS - Nomenclature of Territorial Units for Statistics - Introduction', Eurostat, 2012 <http://epp.eurostat.ec.europa.eu/portal/page/portal/nuts_nomenclature/introduction> [accessed 29 January 2014].
28 This is a methodological approach applied in several research projects. See, for instance, R. Mostern and I. Johnson, 'From Named Place to Naming Event: Creating Gazetteers for History', International Journal of Geographical Information Science 22, no. 10 (2008), 1091–1108; H. Southall, R. Mostern, and M. Berman, 'On Historical Gazetteers', International Journal of Humanities and Arts Computing 5, no. 2 (2011), 127–145.
29 Creating computational linguistic software for Portuguese is challenging because the language has undergone several reforms since the beginning of the twentieth century. Our corpus, for example, contains a number of linguistic variations (including spelling variations). See, on this topic, P. Garcez, 'The Debatable 1990 Luso-Brazilian Orthographic Accord', Language Problems & Language Planning 19, no. 2 (1995), 151–178. See also A. Baron, P. Rayson and D. Archer, 'Automatic Standardization of Spelling for Historical Text Mining', in Digital Humanities 2009 (Maryland, 2009); I. Hendrickx and R.
Marquilhas, 'From Old Texts to Modern Spellings: An Experiment in Automatic Normalisation', in Proceedings of the Workshop on Annotation of Corpora for Research in the Humanities (Germany, 2012), 1–12.
30 'Close Reading: A Preface', SubStance 38, no. 2 (2009), 3–7. Cited here at 4.
31 Gooding, Terras and Warwick, 'The Myth of the New', 636.
32 See, for instance, T. Causer and M. Terras, 'Crowdsourcing Bentham: Beyond the Traditional Boundaries of Academic History', International Journal of Humanities and Arts Computing 8, no. 1 (April 2014), 46–64. Dunn and Hedges mention other projects and similar conclusions, stating that this kind of collaborative approach to research 'is relatively new to academic research, and even more so to the humanities' ('Crowd-Sourcing', 147).
33 J. C. Osório, ed., Cancioneiro de Lisboa (seculos XIII–XX) (Lisboa, 1956).
34 See, for example, the works of Moretti, Tally, Piatti et al., and Cooper and Gregory mentioned above.
35 A. I. Queiroz and J. Carrilho, 'Stone Metaphors about a Village: A "Stone Vessel" or "The Most Portuguese"?', Ecozon@: European Journal of Literature, Culture and Environment 2, no. 1 (2011), 19–33; A. Lavrador and A. C. Tavares, 'A Literary Ride through Bacchus' Landscapes', in Enometrics XVIII (Angers, 2011) <http://paisagensliterarias.ielt.org/config/paisagensliterarias/conteudo/pp/A%20literary%20ride%20through%20Bacchus%27%20landscapes_Anger11.pdf> [accessed 23 July 2014]; F. Cunha, A Paisagem e as palavras que lá estão. Levantado do Chão, um romance político (Lisboa, 2012); A. I. Queiroz and D. Alves, Lisboa, lugares da literatura: História e Geografia na Narrativa de Ficção do Século XIX à Actualidade (Lisboa, 2012); A. I. Queiroz, ed., Lisboa nas narrativas. Olhares exteriores sobre a cidade antiga e contemporânea (Lisboa, 2012) <http://paisagensliterarias.ielt.org/config/paisagensliterarias/conteudo/ebooks/Lisboa_nas_narrativas.pdf> [accessed 23 July 2014]; A. I. Queiroz, ed., Sofrimento, resistência e luta. Ressonâncias na Literatura Portuguesa do século XX (Lisboa, 2013) <http://paisagensliterarias.ielt.org/config/paisagensliterarias/conteudo/ebooks/anaisabel_2013_final.pdf> [accessed 23 July 2014]; D. Alves and A. I. Queiroz, 'Studying Urban Space and Literary Representations Using GIS: Lisbon, Portugal, 1852–2009', Social Science History 37, no. 4 (2013), 457–81; and A. I. Queiroz, M. L. Fernandes and F. Soares, 'The Portuguese Literary Wolf', Literary and Linguistic Computing (December 6, 2013), 1–17.
36 For more information about this case study, see Alves and Queiroz, 'Studying Urban Space'.
37 Alves and Queiroz, 'Studying Urban Space', 458.
38 See Alves and Queiroz, 'Studying Urban Space', 461–465.
39 Specifically, I. N. Gregory and D. Cooper, 'Thomas Gray, Samuel Taylor Coleridge and Geographical Information Systems: A Literary GIS of Two Lake District Tours', International Journal of Humanities and Arts Computing 3, no. 1–2 (2009), 61–84; and Cooper and Gregory, 'Mapping the English Lake District'.
40 Alves and Queiroz, 'Studying Urban Space', 478; see also L. Buell, The Future of Environmental Criticism, 63.
41 For more information about this case study, see Queiroz, Fernandes, and Soares, 'The Portuguese Literary Wolf'.
42 See Queiroz, Fernandes, and Soares, 'The Portuguese Literary Wolf', 3–5.
43 Queiroz, Fernandes and Soares, 'The Portuguese Literary Wolf', 13.
44 D. Cohen and R.
Rosenzweig, Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web (Philadelphia, 2005), Introduction <http://chnm.gmu.edu/digitalhistory/> [accessed 7 September 2012].
45 Bodenhamer, Corrigan, and Harris, eds., The Spatial Humanities, 90.
46 A. Khadem, 'Annexing the unread: a close reading of "distant reading"', Neohelicon 39, no. 2 (2012), 409–421: 411.
47 M. Simeone, J. Guiliano, R. Kooper and P. Bajcsy, 'Digging into Data Using New Collaborative Infrastructures Supporting Humanities-Based Computer Science Research', First Monday 16, no. 5 (2011) <http://firstmonday.org/ojs/index.php/fm/article/view/3372> [accessed 30 November 2013].
48 Khadem, 'Annexing the unread', 411.
49 W. McCarty, 'Collaborative Research in the Digital Humanities', in M. Deegan and W. McCarty, eds., Collaborative Research in the Digital Humanities (Surrey, 2012), 1–10. Cited here at 3.

work_b5i2fwqh2fhn3h37t24yvpnvpy ---- atari-go

Atari Go

Rules
1. Two teams, Black and White, take turns placing a stone (game piece) of their own color on a vacant point (intersection) of the grid on the board
2. Once placed, stones do not move
3. A vacant point adjacent to a stone is called a liberty for that stone
4. Connected stones form a group and share their liberties
5. A stone or group with no liberties is captured
6. Black plays first
7. The first team to capture anything wins

The board: a 7×7 grid with x and y axes numbered 0–6. Two example diagrams on the leaflet show that the White stone has 1 liberty, while the Black group has 6 liberties, and that once White has been captured (no more liberties available), Black wins.

Stones to play: cut all the black and white squares printed on the leaflet to get the stones.

Atari Go, a.k.a. Capture Go, is a simplified version of Go, usually proposed to beginners so as to learn the basic rules of Go. Improve the Wikipedia page about it: https://en.wikipedia.org/wiki/Capture_Go (For help in editing Wikipedia: https://en.wikipedia.org/wiki/Wikipedia:FAQ/Editing)

This work has been designed by Silvio Peroni (Twitter: @essepuntato) for the Computational Thinking and Programming course of the DHDK degree at the University of Bologna (Twitter: @UniboDHDK). All the rights have been waived worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law. License: https://creativecommons.org/publicdomain/zero/1.0/legalcode (Version 1.0, 11 December 2018)

Bonus
Implement the function below in Python, which takes as input the colour of the player who has to play the turn (parameter colour), and the sets of coordinates (i.e. sets of tuples) of all the black stones (parameter black) and white stones (parameter white) already positioned on the board, and returns the x, y coordinates (a tuple) of a free intersection where to place a new colour stone. The coordinates of the various positions of the board are those defined in "the board" in this paper.

    def place_stone(colour, black, white):
        # study the board and calculate the
        # best place where to position the stone
        return x, y  # the coordinates of the new stone
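As a companion to the exercise above (and not part of the original leaflet), here is one minimal sketch of a legal, if naive, strategy: it simply returns the first vacant intersection of the 7×7 board in reading order. A stronger solution would, for instance, prefer moves that reduce the liberties of the opponent's groups.

    def place_stone(colour, black, white):
        # Naive baseline: play on the first vacant intersection found.
        # colour is unused here; a smarter strategy would use it to weigh
        # attacking the opponent's liberties against defending one's own groups.
        occupied = black | white
        for y in range(7):
            for x in range(7):
                if (x, y) not in occupied:
                    return x, y

    print(place_stone("black", {(3, 3)}, {(2, 3)}))  # -> (0, 0)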
https://creativecommons.org/publicdomain/zero/1.0/legalcode
Version 1.0, 11 December 2018
[Board figure: a 7x7 grid with x and y axes numbered 0-6.]

work_b7souw6hjjhbbdw2yeigadgsbm ----

index.comunicación | no. 8(2) 2018 | Pages 83-102
E-ISSN: 2174-1859 | ISSN: 2444-3239 | Legal Deposit: M-19965-2015
Received 30-04-2018 | Accepted 16-05-2018

LAS PRINCESAS DISNEY Y LA CONSTRUCCIÓN DE HUMANIDADES DIGITALES «SILENCIADAS» EN EL CINE DE ANIMACIÓN
THE DISNEY PRINCESSES AND THE CONSTRUCTION OF "SILENCED" DIGITAL HUMANITIES IN ANIMATION CINEMA

Carmen Cantillo Valero | carmen.cantillo@invi.uned.es | Universidad Nacional de Educación a Distancia, UNED

Abstract. One of the main ways of controlling power is to prohibit access to speech. Indoctrination through silence places women in a lower stratum in the digital narrative of animation film and, therefore, in society. This article reflects, from a qualitative perspective, on various aspects of the Disney princesses' children's cinematography: the images of their characters, their representations, their speeches and their silences are analyzed with the intention of denaturalizing this type of fiction, since education, participation, transgression, perverse reading and the alternative gaze are the basis for recovering the freedom to create stories and identities. We start from making visible the mechanisms of audiovisual creation and the ways of influencing the audience as essential elements for avoiding manipulation, and we conclude with the need for media literacy in schools, so that the Digital Humanities can be built on pillars as solid as equality, critical thinking and humanist values, with which to give coherence to our existence.

Keywords: Digital Storytelling; animation film; media literacy; Digital Humanism; culture of silence.

To cite this article: Cantillo Valero, C. (2018). Las princesas Disney y la construcción de Humanidades Digitales «silenciadas» en el cine de animación. index.comunicación, 8(2), 83-102.
1. Introduction: from communication to affective participation

Communication is one of the most distinctive and defining characteristics of the human being. Through this process, two or more people establish a relationship in which they interact to share information in a process of continuous transformation (Aparici, 2003). Communication is most effective when the medium used to transmit information is supported by a story, since the human brain is built to tell and to listen to stories. Everywhere in the world, and for thousands of years, human beings have used storytelling to share knowledge, values and beliefs. Today, stories remain the most powerful tools of communication: they produce a state of attraction and entertainment strong enough to connect people with one another and with their environment.

There is a permanent connection between the discursive practices of our communicative environment and society. In the specific case of childhood, the study of the images and messages that proliferate across the most diverse representations is the starting point for building values and norms, and for developing the reflective capacities on which the Digital Humanities of twenty-first-century society will be consolidated. The study of digital storytelling is essential for assessing and judging what this kind of media discourse transmits. Knowing the mechanisms of audiovisual creation and the ways in which audiences are influenced is therefore a good starting point for avoiding informational manipulation or, at the very least, for detecting it.

The semiologist and film theorist Christian Metz (1964) opened a new way of separating the methodologies used in analyses of the language of cinema, distinguishing the "filmic" (technique, industry, directors, censorship, audiences, actors, etc.) from the "cinematographic" (the internal study of the mechanics of films, isolated from all context: how they construct and transmit meaning, and how a film, or a group of films, carries special significations) (INTEF, 2015).

In this article we analyze the cinematography of the Disney princesses from a double perspective, taking into account both the mercantile dimension of the Disney multinational and the symbols of its messages, which underpin an androcentric ideology represented in codes with a powerful charge of meaning. Starting from an open stance, we avoid rejecting or magnifying the prejudices that might contaminate this study, which is framed from a qualitative perspective, in order to propose guidelines for analyzing audiovisual language that educate viewers in reading and interpreting current narratives, and to obtain tools for denaturalizing these readings so that we can stop considering them innocent. Otherwise, we will be condemned to an audiovisual domestication that blocks the development of the human dimension.

At first glance, we find that the entertainment industries are a very effective educational agent for transmitting and maintaining society's dominant values. Media companies, however, are in the hands of global media giants with which they share a worldview, commercial criteria and values.
It is in these values that young people and children may discover their true models of behavior (Garabedian, 2014). George Orwell asserted in his novel 1984 that "whoever controls the media controls the world," and today the greatest domination is exercised by the most powerful communication systems in history, which, from earliest childhood, instill what to think, how to live and what image to show the world. The question is one of having the power to use instruments with which to create "patterns of thought, sets of images and ideas, and frames of reference for understanding how life should be lived" (Mander, 2009: 16).

As for the communicative aspects, digital storytelling establishes an interactive model in which the roles of sender and receiver can be exchanged, with both having access to the communication channel under the same material conditions and sharing the same temporal situation. The organization of the narrative establishes the relations between the story and its audiences along two dimensions: what to tell, and how to tell it. The challenge lies in bringing the plot close to the audience through a narrative and a medium that turn it into reality. The point is for the story to be credible, for "narratives surround us, but they also wander through the recesses of our minds" (Scolari, 2013).

The affective participation of the audience in stories has always been a subject of interest across artistic movements. In the 1950s, for instance, multidisciplinary and provocative forms such as Performance Art or the artistic happening urged audiences to abandon their passive position before the narrated story. Today, this provocative spirit can be found in digital storytelling, a changing context in which stories take on different forms, flow through variable contexts with different degrees of interaction and participation, adapt to multiple devices, and offer us a heterogeneous vision of a reality in which we too become active participants (Cantillo, 2015a).

Since the appearance of photography, the techniques used to involve the audience have evolved; ever more sophisticated aesthetic provocations have been introduced which, aimed at the realm of the emotions, have moved beyond the mere exegesis of everyday events. Digital storytelling has added value for spectators: to the movement already intrinsic to the cinematic image it has added new elements that set their active passivity in motion. Along with participation comes the affective factor, which conditions the comprehension of information. Special attention must be paid to the strength of the emotional attachments generated by certain images, since this phenomenon holds an immense potential to standardize our thinking, and it becomes worrying when aimed at an audience of children, who lack the resources to detect the influence of the images they watch. There is an open debate about the affects of these narratives, for it is impossible to separate emotions from the affective suggestion transmitted by the innocent images used by multinationals such as Disney, whose power of attraction is indisputable.
Giroux (2001: 118) warns us that "we must pay special attention to how children use and understand these films and visual media." We must therefore pause over those cinematic aspects in which filmic narration reconstructs an experience known only to our imagination, transporting us to a magical world that we are able to visualize almost without surprise. It is alarming to see how easily moods influence the interpretation of reality and decision-making; this emotional association conditions the evaluation made of reality, since "emotions can distort thought and thought can distort emotions, always through false, or at least arbitrary, associations" (Ferrés i Prats, 2014).

2. Research methodology: power relations and knowledge (in the Disney world) as a basis for setting objectives and delimiting the universe of study

The distinctive mark of power relations is domination. This power is held by the institutions and organizations charged with providing information and knowledge to society. The greatest risk of domination, however, lies in our own ignorance, which prevents us from detecting this power and, therefore, from finding its effects on childhood. Pierre Bourdieu (2000: 12) calls this violence symbolic, muffled and invisible, because it goes unnoticed by its own victims and "is exercised essentially through the purely symbolic channels of communication and knowledge or, more precisely, of misrecognition." Paulo Freire (1970: 32) also explained this relation when he stated that "what the oppressors seek is to transform the mentality of the oppressed, not the situation that oppresses them, in order to achieve a better adaptation to the situation which, at the same time, permits a better form of domination."

The media are currently the most authoritative tools for exercising power and establishing the symbolic order in the childhood imaginary. In this process of formation and indoctrination, a privileged place is occupied by multinationals with private and particular interests. Technologies, together with the culture of entertainment, have changed the way stories are told to children. Characters from fairy tales proliferate across media and platforms, in discourses grounded in the logic of spectacle and dramatization. All the classic tales have passed through the sieve of cinema and of the most varied devices, offering a renewed image adjusted to the audience of the moment. Moreover, these discourses are aimed at the emotions and exercise a symbolic violence on very young subjects whose mental structures are still under construction and who can easily end up in the dominated position.

Several studies (Llorens-Maluquer, 2001; Albarrán, 1996; Gomery, 1983; Cantillo, 2015) reveal that we are dealing with multinationals which, at their corporate core, put their commercial logic [1] before educational content,

[1] Nich Nicholas, former president of TIME Inc., commented on signing the merger with the Time Warner group in March 1989: "The entertainment and media industry will be made up of a limited number of global giants.
These companies will be vertically integrated, large enough to produce, market and broadcast worldwide, and flexible enough to absorb the costs of such activities through a vast and ever-growing distribution network" (quoted in Llorens-Maluquer, 2001: 109).

with the aggravating factor that the narratives before us flow through different domains, so that we face a very serious problem, far graver than we may perceive. The control exercised by this type of company over the media is of such magnitude that the products and games children use to internalize values are in the hands of a few companies that control the transmedia universe (television, radio stations, magazines, comics, social networks, video games, publishers, film production companies, film distribution, etc.). This is the greatest concentration of power and appropriation of intellectual property in history, and the most dangerous thing about it is that this power is exercised not over products but over consciences, in this case children's consciences, the most malleable and easily dominated.

2.1 Methods and analysis of the audiovisual material

We start from the idea that any research seeking to address audiovisual narratives must begin with the study of the image as a decisive element in the development of cultural history. Frazer [2] (1922) proposes a catalogue of the magical operations of images that allows us to know, on the figurative plane, how they are substituted or prolonged by human thought and, on the performative plane, how they produce the effects expected of them (Gubern, 2004). It is as important to investigate what images show as everything they hide, since every image constitutes a commentary, sometimes implicit and sometimes explicit. We must therefore devote special interest to the performative attribution of the images these narratives arouse in their young audience: by exhibiting characters capable of performing wonders, princesses worthy of adoration, and so on, they transmit not only an ideology but build the childhood imaginary with maps of meaning through which children make sense of the world.

[2] Román Gubern, in his book Patologías de la imagen, discusses the functions of images used in magical practices, citing the categories established by Frazer (1922) in The Golden Bough: A Study in Magic and Religion.

In short, these are images that will construct identities with which boys and girls will find a place in the adult world. To understand the different ways characters are presented according to their sex, we consider it necessary to analyze the discourses of the films, the audiovisual narrative, the characters' gestures and everything that communicates emotion and feeling to the audience, since we consider that language has the capacity to produce and reproduce society's power relations. In this way, we will make visible the peculiarities
surrounding the communicative situations represented, since "the problem lies not in the characteristics of the discourse, but in the fact that a discourse with such characteristics circulates and becomes dominant" (Alonso and Callejo, 1999: 48).

2.2 Setting objectives

In Disney's filmography we find sweetened stories in which princesses ultimately find their prince (their prize). These narratives transmit information about the roles proper to each sex, which their audience gradually assimilates, unconsciously, until it is integrated as something natural in their behavior, reproducing sexist stereotypes as the characters are imitated from childhood onward. This hypothesis, and the background discussed in the previous section, lead us to the main objective of the present study: to interpret the images and discourses of the Disney princess material in order to make visible the positions of power of the media oligopolies that spread among female audiences a "culture of silence" prescribed by their oppressors and affirmed in forms of androcentric domination. This main objective leads to a second, no less important one: to highlight the need to introduce critical analysis of audiovisual narratives in childhood in order to prevent manipulation and indoctrination.

Through these objectives we can reflect on how reality is adulterated in the representations of the Disney princesses. We will thus test another starting hypothesis: "audiovisual narrative chooses one point of view and cancels out others; reality is therefore adulterated. Behind the representation of reality there is an ideology," whose imprint on adult identity is propagated through the emotions. Fictional characters cease to be fictional the moment they are personified in our reality, a reality our senses perceive but our brain processes on the basis of memories and emotions. Hence the influence these innocent images can have on behavior: they exercise an outsized power of indoctrination that transcends the informal sphere and has a strong bearing on the construction of the Digital Humanity of the twenty-first century. To achieve these objectives, we present analyses of how the main characters are represented, how the figurative elements of the fiction relate to one another, and the differences between these figures and what they represent in reality, reflecting on a number of actions depicted in the filmography.

2.3 Delimiting the universe of study

This study draws on the graphic documents of a very specific part of the Disney filmography; since the study centers on gender stereotypes, the princess characters were chosen as the universe of research. We selected the films in which they appear, from the first to be released, Blancanieves y los siete enanitos (Snow White and the Seven Dwarfs, 1937), to the most recent, Frozen: El reino del hielo (2013).
Disney's filmography is extensive, and sexist stereotypes could be analyzed in many more films, but we started from a common pattern in which the main character under analysis carries the same specific weight in each film. This was achieved by choosing only films whose protagonist is a princess de facto, or a young woman who acquires that status by marrying a prince; films whose protagonist is an animal were excluded, as they do not fit the profile of a person who could form part of the childhood imaginary.

3. Results: we lost our freedom... and even our voice

One of the main ways of controlling power is to prohibit access to speech. To demonstrate this argument, this section recalls moments, scenes, dialogues and other devices that use female silence as a narrative element to position the film's characters in a particular space of the story. We thus ask: what do women's discourses tell us? This is visible above all at the beginning of each film, in the audiovisual narrative employed, in the characters' gestures, their expressions and their speech. The sensations transmitted by the scenes the animated characters perform were analyzed to verify that these simulated camera movements imprint an emotional charge that draws the audience into the story. For example, we likened the crane movements to the visual effect intended when Jasmine is caught stealing in the bazaar, and the accompanying tracking shot to the sweet image of Jasmine highlighted when she first appears in the film; both clearly display the position of defenselessness conferred on female characters, even leading ones, in this type of film.

The princesses' gestural code when they first appear before the camera is sweet and innocent or, failing that, they appear to be suffering, waiting, or doing housework. They are introduced into the narration by a guiding voice-over. They enter the scene through general shots that descriptively reveal the figure of the princess little by little, and a tracking shot moves in to close-ups that present the beauty and sweetness of these characters.

Figure 1: Scenes from the film Aladdín (1992).

Male characters, by contrast, are presented through accompanying tracking shots and close-ups that demonstrate the superior character of the dominant gender, so that the values of bravery, struggle and so on are reinforced from the very beginning of the narration. The analysis of the protagonists' gestural codes is revealing: in La Bella y la Bestia (1991) we find Belle in a dreamy attitude, while the Beast, whom we cannot consider her antagonist since he will be her prince, acts brutally and displays an aggressive gestural code throughout the filmic narration.
As in advertising, the image is granted a power in the communication of stereotyped emotions: "the image fulfills an essential function, a communicative value that has replaced argumentative reasoning with a visual rhetoric grounded essentially in stereotypes" (Correa, 1999: 191-197).

Figure 2: Scenes from La Bella y la Bestia (1991).

The presentation of the protagonist in her first remarks reveals female submission, a position treated as inherent to women in patriarchal society, though it is underscored in the comments they make in the presence of men. On arriving at the dwarfs' house, Snow White exclaims "So much dust!" and "What a pile of dirty dishes!", reviewing one by one the neglected household chores. Then, by induction, she constructs a reality reproduced in her androcentric mind and reconstructed in the imaginary of the audience ("Do you think their mother...? Maybe they have no mother...") until she offers her work in exchange for lodging: "and if you let me stay, I'll clean the house, I'll wash, and I'll cook too...".

Figure 3: Scenes from Blancanieves y los siete enanitos (1937).

Mulan (1998) likewise opens by rehearsing the qualities every woman must possess ("quiet and demure, elegant, refined, poised, delicate, graceful, punctual...") and always makes clear the position of vassalage with respect to men: "to please your future father-in-law" (this is the advice for pleasing and marrying that she receives from her elders). The most perverse case of silencing occurs in La Sirenita (1989), when the sea witch asks Ariel to give her her voice, singing: "Talking too much annoys men... they get bored and you leave a bad taste, for they take more pleasure in girls who are modest; you will be admired if you always stay silent." What these representations plainly reveal is the communicative, transmitting capacity that the excellent musical arrangements confer on the scenes, giving them the emotional touch that makes the experiences represented by animated drawings feel human, a touch Disney uses to transmit its conservative, sexist ideology.

The narration draws us into a story in which woman is a projection of male phantasms, where the image of "the other" postulated by Simone de Beauvoir is presented not as that of a different and independent sex but as an outward projection of repressed instincts (Mantelli, 2005). Thus we find Ariel, in La Sirenita (1989), transformed into a mute screen reflecting male fears, into the image of a terrifying masculinity in which the tail (the virile member) is replaced by legs and her singularity as a mermaid (an independent woman) becomes submission to the castrating phallus. Ariel becomes a silent mermaid, a mute axis that exhibits her defenseless before man, since, by not speaking, she allows the free play of male desire. This silence imposed by Disney on the mermaid character is significant, for according to legend the sirens were not silent: they seduced men with their deadly songs.
Yet by depriving her of voice, the possibility of women's independence is denied. This mute figure of a mermaid reveals the function of a sign imagined by and for man (Cantillo, 2015b). Elsa's room, in Frozen: El reino del hielo (2013), is likewise characterized by its silence: a discreet, ashamed silence that lends the scene a solemn air, letting show the mute desire of a forbidden sexuality expressed through isolation and an imposed haughtiness. This silence is also what allows the feminine world of Rapunzel (Enredados, 2010) to serve as a target for male desire, drawing male attention to a captivity and seclusion of unknown cause. Hence this feminine world represents a "magic sphere" or "mystical atmosphere": mysticism not in the sense of a feminine religious language expressed through silences, but in the sense of constituting a mute screen onto which male desire is projected in the narrative fiction.

Figure 4: Scenes from Frozen: El reino del hielo (2013).

Broadly speaking, the most striking element running through all the discourses, spoken and sung alike, is the use of the generic masculine to refer to both men and women (for example, in La Sirenita, Úrsula tells Ariel: "...to get what you want, you must become a human..."; in La Bella Durmiente (1959), the parents discussing their children's wedding say: "The boys need a home of their own, a place to raise their chicks..."; in Pocahontas (1995) the characters speak of "white men," leaving no room for women in their discourse; and so on). Likewise, the language of a narrative discourse that limits women's access to speech appears at carefully metaphorized moments; it is present in what a woman does not say or cannot say, and the filmic characters are traversed, internally and externally, by an androcentric blurriness that denotes a void, revealing the nonexistence of a neutral space on the interpretative and semantic plane.

Throughout literary and social history, when humanity is spoken of, not everyone (men and women) is included; rather, what is invoked or synthesized is the thought of a group of men who make their dominant, partial imaginary prevail (Mantelli, 2005: 162).

In literature as in popular culture, the archetype of the enigmatic, silent woman is valued above the irritating chatterbox, who, when she appears, does so as a comic character who brings humor to the narration. Here we find that the "culture of silence" is another indoctrinating paradigm used in cinematic narratives as an agent propagating woman's place in society. Silence, according to Christian Salmon, is associated with various qualities:

Modesty, respect for others, prudence, savoir-vivre. Because of deeply rooted rules of decorum, people keep quiet to avoid problems, conflicts and other perceived dangers. The social virtues of silence are reinforced by our survival instincts (Salmon, 2010: 69).

Woman accepts her role as mute object.
Her word has been sequestered, and the meaning of her words, should she manage to utter them, has also been taken over by the dominant (androcentric) ideologies, disguised and turned into something inherent to the dialectic between oppressors and oppressed. Woman has not even had the possibility of communicating, being silenced; and when she has been able to exercise this right, she has used a stolen word, replaced "by another that carries the idea of transmission [...] the theft, then, [hegemonic power has achieved] its objective: to sequester the meaning of the most beautiful words of our language: communication" (Aparici, 2003: 39). We must therefore be suspicious of any false possibility of communication granted to women, since it would form part of a "diffuse ideology" that serves to justify power and its practices.

All these discourses (and their absence too) pigeonhole and trap women within mental schemas of domination; without knowing why, they will feel themselves sufferers and will form part of relations of eternal domination, since "the dominated apply to the relations of domination categories constructed from the point of view of the dominators" (Bourdieu, 2000). In all these films we find the archetypes of the submissive, obedient woman, guided by her emotions, oriented toward love and marriage, the one who cares for family and home, and who on many occasions appears either in the role of the "wicked" woman or of the "innocent" girl (Giroux, 2001). Animated characters can become archetypes, since they offer the image of a world unknown to the spectator, who will feel drawn to its extraordinary character. The "merit" in turning a fictional character into an archetypal model lies in making that exotic story acquire an absolute character and present itself as a new model of reference.

As for the expression of emotions, we find an effervescence of emotivity in the female roles, which tend to follow irrational affective impulses, in contrast to the balance and pragmatism of the male ones. In La Sirenita (1989), for example, there are two female characters, Ariel and Úrsula, protagonist and antagonist respectively [woman is submissive and obedient, guided by her emotions, oriented toward love and marriage... presented in the role of the "wicked" woman or the "innocent" girl] (Giroux, 2001: 106-111), and four male ones: King Triton, Prince Eric, Grimsby the butler and the chef Louis, who enjoy clear recognition based on their socio-occupational position, their profile in each scene being shaped by their actions. Even in affective relationships a passive position is reserved for women: initiative is granted to the man and the role of passive waiting to the woman. Hence female characters are heard to remark, "How silly she is! She's crazy! He's so cute!", unable to believe that a woman would not enter the game of male seduction, where the next move, in La Bella y la Bestia (1991), would be the progressive granting of Belle's favors to Gastón.
For a woman, love, and a man's courtship of her, leads her to grant a secondary place to whatever activity she had been pursuing; before her opens the hope of escaping the desert to which she is condemned for being a woman, of attaining happiness thanks to male company, and of thus being able to conceive a more intense life.

Work in neuromarketing (Braidot, 2005; Damasio, 2006) shows the importance of stimulating the emotions in fiction, to the point where fictional situations are equated with situations lived in reality. Emotions shape the construction of identity and the laying down of memory. Children's stories are aimed at the emotions: with a commercial purpose, they seek to draw the audience closer to their products, but the techniques used by marketing distort reality, reinforcing the hypnotic state that automates decision-making. Watching Disney's children's images tends to set up a powerful stage, strategically built to "format" children's minds, on which dreams and hopes are represented. Even at the most sexist margin of these stories we have found discourses that display, under an amusing guise, genuine scenes of violence against women which, wrapped in a halo of joking, pass unnoticed, for "as a tendency, fiction tends to make itself agreeable and seeks to sweeten events. After all, fiction's first and primary function is precisely that: to entertain" (Marta-Lazo and Abadía-Urbistondo, 2018).

Walt Disney's ability to connect with the emotions of his audience (children and adults alike) is well known; joined to the power of cinema, which "influences our emotions directly, without needing to pass through the intellect" (Ferrés i Prats, 2014), it produces in his films a montage of images that generates an attractive story through which that emotional association is also achieved and through which ideas and beliefs are transmitted.

When politics is dressed up in the image of innocence, more is at stake than simple deception. It is a question of cultural power and of how it influences public forms of understanding the past. Innocence in the world of Disney becomes the ideological vehicle through which history is written anew. [...] The Disney Company does not ignore culture; it reinvents it as a pedagogical and political instrument of its own interests, authority and power (Giroux, 1996: 55).

In Disney's latest productions, however, we find the spectator being urged to control his or her emotions (perhaps those that do not fit an established model) so as not to disturb the natural order. Thus Elsa's triumph as queen in Frozen: El reino del hielo (2013) is learning to control the emotions that stimulate her cryokinesis. This suggests that when powerful women need training, it is not to develop their abilities but to keep them from abusing their power (Streiff and Dundes, 2017).
4. Discussion and conclusions: when digital storytelling silences and enslaves

Cinema is a vehicle of values and counter-values and can be considered the best medium for influencing the integral education of the human being. A film can be an example of what is correct and socially established or, on the contrary, can become a destructive element; everything depends on its reference points, on how its narrative is projected and on the context in which it is shown (Cantillo and Gil-Quintana, 2017).

The nature of silence has no voice of its own; this voice bears the reflection of man and appears legitimizing a given cultural product by representing it as a natural object that could not be otherwise, that is, the naturalized silence of woman. Transferring this muting (tacit or explicit) to the narrative of children's cinema, we find that the Disney princesses also tend to appear silenced, condemned to a mutism through which their feminine submission is transmitted; these figures act as "teaching machines" resting on the register of women's communicative isolation. Their characters are silenced without regard for the fact that silence can carry an enormous psychological cost for individuals, creating feelings of frustration and isolation. These stories are perfectly narrated so that they penetrate children's minds and are absorbed almost imperceptibly (Salmon, 2010). They enter the childhood imaginary as one more display of male domination, since words (or their absence) form the language that contributes to the acceptance of reality; that is, the androcentric worldview continues to be naturalized as the representation of the social order, in which language is male patrimony and silence is female. These asymmetries, imposed from the realms of fantasy, mean that self-denial, resignation and silence have been negative virtues learned differentially by boys and girls from the hidden and manifest curriculum of androcentric pedagogy, to the point of being assumed to belong to the natural order of things (Correa, in Aguiar and Farray, 2007: 29).

The digital narrative of animation cinema has reproduced female silence and confinement as yet another cultural example establishing the spaces proper to each gender. Its universal language has evolved, developing a communicative capacity across diverse sectors, and its narrative has become a key factor in today's world of communication, since the great majority of audiovisual productions take its expository structure as their basis. The creative energy of cinematic narrative consolidates the hierarchy of universal human beings and propagates the particular cultural realities that will come to be considered natural.

The power granted to images of subjugated, imprisoned women, together with (unspoken) words, leads us to idealize the characters and internalize certain lifestyles. We find contradictions in a discourse that represents at once the voice of the submissive woman and that of the vigorous female, letting "the feminine" show through the silences of a contradictory discourse rather than as an essential reality.
In the case of the narratives of animated children's films, just as Vladimir Propp (1928) did with wonder tales, we can isolate their constituent parts, analyze them according to particular patterns, and set them in relation to one another to uncover the exact description of the narrated plot. Among the functions this author established for analyzing tales is the proclamation in which "a prisoner asks the hero to free him." Paulo Freire likewise reminds us that the cause of liberation is a commitment of a loving, dialogical character: "for this very reason, the dominated, the oppressed, cannot accommodate themselves to the violence imposed on them, but must fight so that the objective conditions in which they are crushed disappear" (Freire, 1970: 72). And yet "the oppressed, as objects, as 'things,' have no purposes of their own. Their purposes are those prescribed for them by the oppressors [...] which the exploited almost always carry within themselves, conditioned by the 'culture of silence'" (Freire, 1970: 41).

Fortunately, thanks to the interaction and participation offered by digital social networks, "the city is beginning to perceive its own communicative potential on a local and global scale, to the point of creating information networks parallel, and often independent, to those established by the big media and the culture industries" (Aparici and Osuna-Acedo, 2013). We have reached a new diegetic threshold that places us before a system of audiovisual symbolic exchange that inverts the roles in the construction of reality. Today's consumer society offers "a great diversity of products for escaping the pressures and anxieties of everyday life, for getting away through play and entertainment, for trying to satisfy secret hopes and desires" (Romano, 2006: 144). In this sense, we believe that the study of the Digital Humanities cannot stop at the mere analysis of children's narratives or "at the simple analysis of productions; our attention must focus on the communicative model carried out, which becomes the determining element of new forms of storytelling" (Gil-Quintana, 2016: 2).
If, as Paulo Freire (1970: 71) said, "existence, insofar as it is human, cannot be mute, silent [...] to exist, humanly, is to 'pronounce' the world, to transform it [...] men [and women] are not made in silence, but in the word, in work, in reflection," then one possible way to prevent these messages from building silenced Digital Humanities lies in media literacy, backed by professionals who show us how to respond to an industry which, although we already suspect it is behind these practices, faces no legitimized channels through which its indoctrinating, consciousness-standardizing messages can be contested. In short: we must rethink digital storytelling with alternative gazes that reach into the deepest layers of signification.

5. Bibliography

Albarrán, A. (1996). Media Economics: Understanding Markets, Industries and Concepts. Ames, Iowa: Iowa State University Press.
Alonso, L. E. and Callejo, J. (1999). El análisis del discurso: del postmodernismo a las razones prácticas. Revista Española de Investigaciones Sociológicas, 88, 37-73. Madrid: Centro de Investigaciones Sociológicas. Retrieved from: http://www.reis.cis.es/REIS/PDF/REIS_088_04.pdf
Aparici, R. (Coord.) (2003). Comunicación educativa en la sociedad de la información. Madrid: UNED.
Aparici, R. and Osuna-Acedo, S. (2013). La Cultura de la participación. Revista Mediterránea de Comunicación, 4(2), 137-148.
Bourdieu, P. (2000). La dominación masculina. Barcelona: Anagrama.
Braidot, N. P. (2005). Neuromarketing, Neuroeconomía y Negocios. Madrid: Puerto Norte-Sur.
Cantillo Valero, C. (2015a). Del cuento al cine de animación: Semiología de una Narrativa Digital. Revista de Comunicación de la SEECI, 38, 133-145.
Cantillo Valero, C. (2015b). Imágenes infantiles que construyen identidades adultas. Los estereotipos sexistas de las princesas Disney desde una perspectiva de género. Efectos a través de las generaciones y en diferentes entornos: digital y analógico. Doctoral thesis. Madrid: UNED. Retrieved from: http://e-spacio.uned.es/fez/eserv/tesisuned:Educacion-Ccantillo/CANTILLO_VALERO_Carmen_Tesis.pdf
Cantillo Valero, C. and Gil-Quintana, J. (2017). Una experiencia práctica de análisis audiovisual en educación primaria. In Marta-Lazo, C. (coord.), Nuevas realidades en la comunicación audiovisual. Madrid: Tecnos-Grupo Anaya.
Correa, R. I. (1999). Del razonamiento argumental a la retórica de las imágenes. Revista Comunicar, 6(12), 191-197.
Correa, R. I. (2007). Mujer, ¿La sal de la tierra, la luz del mundo? In Aguiar, M. V. and Farray, J. I. (2007), Sociedad de la Información, Educación para la Paz y Equidad de Género. La Coruña: Netbiblo.
Damasio, A. (2006). El Error de Descartes. La Emoción, la Razón y el Cerebro Humano. Barcelona: Crítica.
Ferrés i Prats, J. (2014). Las pantallas y el cerebro emocional. Barcelona: Gedisa.
Freire, P. (1970). Pedagogía del oprimido. México: Siglo XXI.
Garabedian, J. (2014). Animating Gender Roles: How Disney is Redefining the Modern Princess. James Madison Undergraduate Research Journal, 2(1), 22-25. Retrieved from: http://commons.lib.jmu.edu/jmurj/vol2/iss1/4/
Gil-Quintana, J. (2016). Narrativa digital e infancia: Es la hora de la Generación CC. Revista Mediterránea de Comunicación, 7(1), 79-90. Retrieved from: http://mediterranea-comunicacion.org/.
DOI: http://dx.doi.org/10.14198/MEDCOM2016.7.1.5
Giroux, H. A. (1996). Placeres inquietantes. Aprendiendo la cultura popular. Barcelona: Paidós.
Giroux, H. A. (2001). El ratoncito feroz: Disney o el fin de la inocencia. Madrid: Fundación Germán Sánchez Ruipérez.
Gomery, D. (1983). Who Owns the Media? In Alexander, A., Owers, J. and Carveth, R. (eds.), Media Economics. Theory and Practice (pp. 47-70). Hillsdale, NJ: Lawrence Erlbaum Associates.
Gubern, R. (2004). Patologías de la imagen. Barcelona: Anagrama.
INTEF. Instituto Nacional de las Tecnologías Educativas y de Formación del Profesorado (2015). El cine como recurso didáctico. Módulo 6b: Lenguaje cinematográfico: Teorías. Semiótica. Retrieved from: http://www.ite.educacion.es/formacion/materiales/24/cd/m6_2/semitica.html
Irigaray, L. (1974). Spéculum de l'autre femme. Paris: Les Éditions de Minuit.
Llorens-Maluquer, C. (2001). Concentración de empresas de comunicación y pluralismo: la acción de la UE. Doctoral thesis, UAB. Retrieved from: http://www.tesisenred.net/handle/10803/4095
Mander, J. (2009). Cuatro buenas razones para eliminar la televisión. Barcelona: Gedisa.
Mantelli, N. (2005). La cautiva como mujer modélica. Revista de Estudios de la Mujer. Retrieved from: http://0-www.ebrary.com.jabega.uma.es
Marta-Lazo, C. and Abadía-Urbistondo, A. (2018). La hibridación entre el género policiaco y la comedia en la ficción televisiva norteamericana. Estudio de caso de Castle. index.comunicación, 8(1), 11-29.
Metz, C. (1964). Le cinéma: langue ou langage? Communications, 52-90. Retrieved from: https://www.persee.fr/doc/comm_0588-8018_1964_num_4_1_1028
Propp, V. (1928). Morfología del cuento. Madrid: Fundamentos.
Romano, V. (2006). La formación de la mentalidad sumisa. Venezuela: D - Ministerio de Comunicación e Información. Retrieved from: http://0-www.ebrary.com.jabega.uma.es
Salmon, C. (2010). Storytelling. La máquina de fabricar historias y formatear las mentes. Barcelona: Península.
Scolari, C. (2013). Narrativas transmedia. Cuando todos los medios cuentan. Barcelona: Planeta.
Streiff, M. and Dundes, L. (2017). Frozen in Time: How Disney Gender-Stereotypes its Most Powerful Princess. Social Sciences, 6, 38. Retrieved from: http://www.mdpi.com/2076-0760/6/2/38/htm

6. Filmography

Conli, R. (producer) and Greno, N. and Howard, B. (directors) (2010). Enredados [film]. United States: Walt Disney Animation Studios and Walt Disney Pictures.
Disney, W. (producer) and Hand, D., Cottrell, W., Morey, L., Pearce, P. and Sharpsteen, B. (directors) (1937). Blancanieves y los siete enanitos [film]. United States: Walt Disney Pictures.
Disney, W. (producer) and Geronimi, C., Clark, L., Larson, E. and Reitherman, W. (directors) (1959). La Bella Durmiente [film]. United States: Walt Disney Pictures.
Disney, W. (producer) and Clements, R. and Musker, J. (directors) (1989). La Sirenita [film]. United States: Walt Disney Pictures and Walt Disney Feature Animation.
Disney, W. (producer) and Wise, K. and Trousdale, G. (directors) (1991). La Bella y la Bestia [film]. United States: Walt Disney Pictures and Walt Disney Feature Animation.
Disney, W. (producer) and Clements, R. and Musker, J. (directors) (1992). Aladdín [film]. United States: Walt Disney Pictures and Walt Disney Feature Animation.
Disney, W. (producer) and Gabriel, M. and Goldberg, E. (directors) (1995). Pocahontas [film]. United States: Walt Disney Pictures and Walt Disney Feature Animation.
Disney, W. (producer) and Cook, B. and Bancroft, T. (directors) (1998). Mulan [film]. United States: Walt Disney Pictures and Walt Disney Feature Animation.
Sarafian, K. (producer) and Andrews, M., Chapman, B. and Purcell, S. (directors) (2012). Brave. Indomable [film]. United States: Walt Disney Pictures, Pixar.
Vecho, P. del (producer) and Buch, Ch. and Lee, J. (directors) (2013). Frozen: El reino del hielo [film]. United States: Walt Disney Animation Studios and Walt Disney Pictures.

To cite this article: Cantillo Valero, C. (2018). Las princesas Disney y la construcción de Humanidades Digitales «silenciadas» en el cine de animación. index.comunicación, 8(2), 83-102.

work_bdivlozw6fa7fnzin3ntuelr5m ----

White Paper Report
Report ID: 106077
Application Number: HD-51548-12
Project Director: Marie-Claire Beaulieu (Marie-Claire.Beaulieu@tufts.edu)
Institution: Tufts University
Reporting Period: 4/1/2012-8/31/2014
Report Due: 12/31/2014
Date Submitted: 12/19/2014

Digital Humanities Start-Up Grant Final Performance Report
Grant Number: HD-51548-12
Digital Humanities in the Classroom: Bridging the Gap between Teaching and Research
Project Director: Prof. Marie-Claire Beaulieu
Tufts University
Report Submitted: 12/19/14
Period covered: 04/01/2012 to 08/31/2014

A. Project Activities

Perseids in the Classroom

As indicated in our original proposal, this project aims to bring digital scholarship into the classroom by means of our online editing platform. Accordingly, project director Marie-Claire Beaulieu started to use Perseids in class in September 2013 in her Classical Mythology course. A dynamic syllabus was created which collected all the readings assigned in the class, whether they were specific passages of ancient texts offered in Perseus or entire works. 1 In addition, three optional texts were assigned each week. These readings concerned the same myth or mythical complex studied during the week. The students' task was to choose one of the optional readings and analyze it with respect to other sources on the same myth. They were encouraged to address questions such as: "How does this text/artifact compare to other testimonies on the same myth? Why is it different/similar? Did the author/artist have a particular purpose in producing such a rendering (political, social, artistic, etc.)?" The students then produced short essays (maximum 500 words) and typed these essays in the Perseids annotation system. To receive full credit, they had to submit a minimum of two essays graded as satisfactory over the course of the semester. In their essays, the students were encouraged to include links to further materials such as parallel texts/artwork or bibliography.
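Each essay then moved through a cycle of submission, review, and revision before publication. As a rough sketch of that cycle (illustrative only, and not the actual Perseids/SoSOL data model; the class and state names here are assumptions):

    class Submission:
        # states: draft -> submitted -> (returned -> submitted)* -> approved
        def __init__(self, author, text):
            self.author, self.text = author, text
            self.state, self.feedback = "draft", []

        def submit(self):
            assert self.state in ("draft", "returned")
            self.state = "submitted"           # now visible to reviewers

        def review(self, comment, approve=False):
            assert self.state == "submitted"
            if approve:
                self.state = "approved"        # ready to publish as an annotation
            else:
                self.feedback.append(comment)  # returned to the student to revise
                self.state = "returned"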
After undergoing several review and feedback cycles through Perseids, the commentaries were published as student annotations on the Perseus Digital Library. 2 The interactive nature of these assignments, as well as the prospect of seeing their work published online, has proved to be a motivating factor for students. However, this first trial round demonstrated the need for strict planning regarding grading. The students were allowed to submit their essays at any time during the semester and to re-submit them any number of times after receiving feedback. This system, while allowing maximum flexibility for the students, overloaded the instructor and teaching assistants with grading, particularly at the end of the semester. In subsequent iterations of the class, mandatory submission and grading periods were imposed in order to produce a more regular grading cycle.

In the fall of 2013, Prof. Beaulieu also used Perseids in her intermediary Greek class to edit and translate Greek inscriptions with the students. The inscriptions are now published to the web as a demo for the collection. 3 The texts were marked up in TEI XML using the EpiDoc standard for maximum interoperability with the international community of epigraphists. 4 The EpiDoc standard is ultimately based on the Leiden conventions, which have long been in use by scholars to render the characteristics of epigraphical or manuscript texts. Similarly, in January 2014, students in Prof. Beaulieu's Medieval Latin class edited and translated sections of the fourteenth-century compendium of English Forest Law preserved at Tufts University in Tisch Library using the EpiDoc markup standards. We are currently preparing the students' work for publication, as well as preparing a similar workflow for the upcoming 2015 iteration of Prof. Beaulieu's Medieval Latin class.

1 http://sosol.perseids.org/syllabi/tuftsmythf13.html#module-0
2 http://sites.tufts.edu/perseusupdates/2014/05/29/student-commentaries-published-in-perseus/
3 http://perseids.org/sites/epifacs/
4 http://sourceforge.net/p/epidoc/wiki/Home/

This work continued after the end of the grant period. In the fall of 2014, Prof. Beaulieu continued using Perseids in her intermediary Greek class and in her Classical Mythology class. The Greek students are currently editing and translating Greek funerary inscriptions, which will be published along with last year's at the end of the term. In Classical Mythology, the 65 students enrolled in the class were asked to use Perseids as part of their term projects. The term project for this class consists in analyzing the presentation of a given myth on an assigned object in the Greek and Roman collection of the Boston Museum of Fine Arts. The students, who are organized in teams of two or three, must observe their assigned objects and compare their depiction of the myth with other depictions of the same myth in ancient texts and ancient art. Once this dossier has been assembled, students prepare an interpretative research paper which seeks to explain the evolving meaning of the myth to ancient audiences. During the course of their research, students use Perseids to upload timelines and Timemaps which help them organize the primary sources for their myth from a chronological and geographical standpoint.

Students create these timelines and Timemaps using Timemapper, 5 a utility created by the Open Knowledge Foundation Labs. 6 Timemapper relies on spreadsheets created through Google Docs. We provided an input form on Perseids which the students use to submit a link to their Google spreadsheet data. We also created an XSL 7 stylesheet to transform the data provided by the Google Spreadsheets API 8 to RDF 9 triples adhering to the Open Annotation standard 10 upon ingest into Perseids (a sketch of this transformation appears at the end of this subsection). This allows us to preserve the students' work in a way that is interoperable with Perseids (as well as with the growing number of other tools supporting the Open Annotation standard) and enables us to apply the Perseids review and approval workflow to data the students collected using the TimeMapper tool. We plan on publishing the timelines and Timemaps as part of a nascent Perseus collection on ancient mythology which will offer information on myths as seen through the primary textual and artistic sources.

Finally, Prof. Beaulieu is also using Perseids in her intermediary Greek class to support treebanking. Treebanking consists in creating a full semantic and grammatical annotation of a sentence by organizing the words according to their dependency relationships. In the process, the annotator also provides morphology data for each word. In Prof. Beaulieu's class, students have been treebanking the text of Plato's and Xenophon's Apologies of Socrates through the Arethusa framework, 11 and then submitting their work for review through the Perseids platform. Students express enthusiasm for treebanking, as the method allows them to gain full grammatical control while examining questions such as style, comparing Plato's and Xenophon's very different renderings of the same speech. We plan on publishing the students' work online as annotations to the texts at the end of the semester.

Other Project Activities

In order to support our efforts, Christopher Barbour has overseen the digitization of two manuscripts held in the Tisch Library Special Collections, namely the Commission of Doge Andrea Gritti to Lorenzo Diedo as Podesta of Montefalcone (Venice, 1533) and the Historia Regum Angliae (England, 1693). Christopher Barbour also oversaw the acquisition of an early printed Latin version of one of Galen's treatises, titled "Quos, quibus, et quando purgare oporteat" (Lyon, 1553). Students will start working on these new materials in upcoming classes.

Two graduate students, Matthew Kelley and Timothy Buckingham, worked for the Perseids project in 2013 and 2014. Matthew Kelley was in charge of final preparations for the epigraphy project implemented in Prof. Beaulieu's intermediary Greek class: he collated bibliographical references for each inscription and documented the history of the successive editions and textual criticism. Timothy Buckingham was in charge of preparing the manuscripts to be edited and translated by the students in Prof. Beaulieu's Medieval Latin class scheduled for January 2014, focusing on the fourteenth-century compendium of English law preserved at Tisch Library.
In collaboration with Christopher Barbour and Alexander May (Tisch 5 http://timemapper.okfnlabs.org/ 6 https://okfn.org/ 7 http://www.w3.org/Style/XSL/ 8 https://developers.google.com/google-apps/spreadsheets/ 9 http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/ 10 http://www.openannotation.org/spec/core/ 11 http://sosol.perseids.org/tools/arethusa/app/#/ http://timemapper.okfnlabs.org/ https://okfn.org/ http://www.w3.org/Style/XSL/ https://developers.google.com/google-apps/spreadsheets/ http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/ http://www.openannotation.org/spec/core/ http://sosol.perseids.org/tools/arethusa/app/#/ 3 Library), Timothy started analyzing the compendium. He noticed that the book was divided at some point in its history and re-bound in order to take some materials out and to insert new texts, copied by a different hand. This process led to the creation of two different tables of contents which were inserted at the beginning of the book. In order to facilitate the students’ work, Timothy prepared an outline of the manuscript and started documenting the two scribal hands present in it. Timothy also worked on identifying the manuscript sections in the two tables of contents in order to eventually produce a display which will lead directly from the tables of contents to the corresponding sections. 12 In January 2013, we extended Timothy Buckingham’s hire as a research assistant for the project between December 2013 and May 2014. During this period, Timothy built on the work he did on preparing the fourteenth century compendium of English law. He continued to enhance the outline of the manuscript he had prepared and also continued to document the scribal hands present in the manuscript. Most importantly, Timothy offered support to students taking Professor Beaulieu’s Medieval Latin class and editing the manuscript. Timothy was in charge of directing and grading student markup of the manuscript and also oversaw their translations. We are currently preparing to publish the student’s work, and their findings are already available on Perseids. We held a hackathon at Tufts University with the Duke Collaboratory for Classics Computing (DC3) team, represented by Ryan Baumann, on December 4-5 2013. The first day was dedicated to workshops held in Prof. Beaulieu’s Mythology class and in her intermediary Greek class. Students were introduced to the editing features available in Perseids and had the occasion to edit texts as a group and to offer feedback to the team of developers. An open workshop to which all Classics students and faculty were invited also took place over the lunch hour. Overall, this first day was a success, and students reported enjoying their work in Perseids and offered valuable comments and feedback to the technical team. The second day of the meeting was dedicated to development work on the part of the technical team, during which we reviewed the Tufts and DC3 teams’ current development plans for the SoSOL application and made plans for merging our separate branches of the code back together. Technical Development and Deliverables Our first technical objective in this project was to put the online editing platform in place, which eventually became Perseids. Senior Software Developer Bridget Almas worked on integrating the SoSOL 13 collaborative text-editing environment with the CITE services 14 and the Image Citation Tool 15 created by the Homer Multitext Project. 
She was able to reuse large parts of the XML validation and display code from the papyri publication support on SoSOL while focusing on the addition of support for the CTS identifiers. The CITE architecture, of which CTS is a part, provides us with a set of protocols and services around identifying, organizing and linking canonically cited texts and related objects. CITE and CTS make use of Uniform Resource Names (URNs), which are intended to serve as persistent, location- independent, resource identifiers. CTS URNs provide a system of persistent, technology-independent identifiers for texts and passages of texts. CITE URNs provide a companion system of identifiers for objects related to texts. The first deliverable was to create a prototype implementation that re-used the existing SoSOL code for Epidoc transcriptions almost in its entirety by sub-classing it and changing only the structure of the document identifiers to correspond more closely to the CTS URN syntax. She also substituted a CTS text inventory for the Papyri.info catalog. Coding the prototype provided a means to 12 To see all images of this manuscript currently available, visit: http://perseids.org/sites/epifacs/lawimgs.html. 13 https://wiki.digitalclassicist.org/SoSOL 14 http://www.homermultitext.org/hmt-doc/cite/ 15 http://www.homermultitext.org/hmt-doc/guides/ict.html http://perseids.org/sites/epifacs/lawimgs.html http://perseids.org/sites/epifacs/lawimgs.html https://wiki.digitalclassicist.org/SoSOL http://www.homermultitext.org/hmt-doc/cite/ http://www.homermultitext.org/hmt-doc/guides/ict.html 4 explore the design of the SoSOL platform code and assess its viability for reuse. However, as CTS is a read-only API, there was a need to develop a set of parallel write/update/delete functionality that could be used to update and create new editions of CTS-compatible texts. To experiment with this, Bridget Almas augmented her XQuery based implementation of the CTS APIs from the Alpheios project. She also coded prototypes of additional extensions to the SoSOL code to work with texts and passages that use the TEI-A XML schema rather than Epidoc, and to present a passage selection interface. Completing these two deliverables gave us confidence that the integration was in fact viable, and the NEH start-up funding enabled us to move the work beyond the prototype stage to actual implementation. Development of the platform has continued beyond the NEH funding through the support of a new grant from the Andrew W. Mellon Foundation (2013-2015). 16 We have added support for Shibboleth/SAML2 authentication, so the students are now able to login with their educational institution accounts as well as through social identity providers. We enabled this for selected institutions starting with Tufts and University of Leipzig, but the functionality can be configured for any institution that supports Shibboleth. We have also made some improvements to the image mapping functionality, to support multiple images per text/inscription, and to generalize it to work with any SPARQL endpoint to retrieve images related to the text. Our original proposal indicated that we intended to work with ImageJ and Tile to produce mappings between text and images. However, these pieces of software proved unreliable and the Image Citation Tool developed by the Homer Multitext Project was selected instead. In turn, we have now replaced the Image Citation Tool with the Imgspect Image editor plugin 17 , which was developed through funding from the Mellon Foundation. 
Imgspect is a jQuery plugin 18 which allows us to embed the image citation functionality directly into any page, rather than bringing it in as a web page in a frame, allowing for a more seamless integration with the editing workflow. Outreach Work on Perseids has afforded numerous opportunities for conference presentations. We had the opportunity to present this project to the scholarly community and the public at large in many different venues so far (see appendices for a full listing). Throughout this project, we have also maintained the Perseids blog, 19 which we use to document workflows available on the Perseids platform, news and updates concerning our work, as well as recent presentations and papers related to the Perseids project. Further documentation, especially regarding development work, is available on our Github repository. 20 We also repost our news, updates, and announcements on the Perseus blog 21 as well as on Facebook and Twitter. In this way, we are keeping the Digital Humanities community and the public informed about our progress and accomplishments. 16 “Developing Perseids: Enhancements to a Collaborative Editing Platform for Source Documents in Classics” (21300665). 17 https://github.com/perseusdl/imgspect 18 http://jquery.com/ 19 http://sites.tufts.edu/perseids/ 20 https://github.com/PerseusDL/perseids_docs 21 http://sites.tufts.edu/perseusupdates/ https://github.com/perseusdl/imgspect http://sites.tufts.edu/perseids/ https://github.com/PerseusDL/perseids_docs http://sites.tufts.edu/perseusupdates/ 5 B. Accomplishments We are pleased to report that the objectives outlined in our original proposal have been met. The Perseids platform has been continuously available on the perseids.org domain since fall 2013 and new functionality is regularly being added as we continue our work beyond the start-up phase of the project. Student work has been published online 22 and continues to evolve as classes and other projects unfold. Perseids is now firmly implanted in the Digital Humanities community and beyond, as other teams in other universities in Europe and America are using the platform for their own projects. C. Audiences Over the years, Perseids has developed a broad and diverse audience in the USA and abroad. As described above, Perseids is being used at Tufts University for collaborative editing, translations, and treebanking. Currently, over 500 users are registered on the platform. In the USA, Perseids is currently being used at the University of Nebraska, where Professors Robert and Vanessa Gorman are integrating it into their teaching. Robert Gorman is making use of the treebanking functionality available in Arethusa and the review workflow in Perseids to teach introductory Greek and Latin. Vanessa Gorman is using the same features to edit Athenaeus’ Deipnosophistae with her students. A team at the University of Buffalo under Professor Neil Coffee is planning on using Perseids in January 2015 for classroom work on intertextuality in Statius’ Achilleid and Flavian epic. Another team of researchers and students at the University of Georgia, under the supervision of Frances Van Keuren and Elena Bianchielli, are planning on collecting all the ancient texts and artwork mentioned in Timothy Gantz’s Early Greek Myth in the form of an online dynamic syllabus which will now accompany this important reference work. 
In Europe, a Croatian team of scholars and students led by Professor Neven Jovanovic at the University of Zagreb is editing Latin poems written by Croatian authors. The team is using the alignment functionality available in Alpheios to align the Latin texts to the Croatian translations and the review functionality in Perseids as part of their workflow. At the University of Leipzig, Perseids is being used by Professor Monica Berti’s team in the Fragmentary Texts Project. 23 The team is collecting fragments of lost authors (quotations, paraphrases, etc) in preserved Greek and Latin literature and annotating them through Perseids. The Perseids team has established a collaboration agreement with the EAGLE consortium (Electronic Archive of Greek and Latin Epigraphy). 24 The EAGLE consortium aims to federate various epigraphical databases already present on the Web and pool their data in a decentralized way. The result is a set of locally hosted and managed databases that are interoperable and searchable as a group. EAGLE is also providing an interface, via the MediaWiki 25 platform, by which users can contribute new translations of inscriptions. In its collaboration with EAGLE, Perseids is particularly involved in managing the user workflow around translations. As MediaWiki does not support a controlled workflow that allows for review of submissions before they are posted, EAGLE approached Perseids to provide an 22 See for instance our demo for the publication of student work on Greek epigraphy: http://perseids.org/sites/epifacs/. 23 http://www.fragmentarytexts.org/tag/monica-berti/ 24 http://www.eagle-eagle.it/ 25 https://www.mediawiki.org/wiki/MediaWiki http://perseids.org/sites/epifacs/ http://www.fragmentarytexts.org/tag/monica-berti/ http://www.eagle-eagle.it/ https://www.mediawiki.org/wiki/MediaWiki 6 alternate workflow for new translation submissions, which allows them to be reviewed and voted on by members of a board on Perseids before publication on the EAGLE site. We succeeded in deploying this integration in September 2014. Over the past two years, Perseids has developed a beneficial partnership with our colleague Michèle Brunet (CNRS-Université Lyon II). Professor Brunet is in charge of editing and publishing the Greek inscriptions preserved in the Louvre Museum in Paris as a collection. To ensure maximum visibility and continued relevance, she chose to prepare a digital publication of the inscriptions through Perseids. The project, entitled E-PIGRAMME, is currently ongoing and we are in regular communication with the team. 26 Also in collaboration with Professor Brunet, Tufts University and Perseids are participating in the Visible Worlds Project. This three-year project, funded under the Partner University Fund Program, aims to promote the practice of digital epigraphy by providing training to graduate students and professors of Classics. The project involves student and scholarly exchanges between the partner institutions, namely Brown University, Tufts University, Université Lyon II, and the network of the French Schools abroad. Field training sessions will take place in Greece in May 2015, in Cambodia in 2016, and in Egypt in 2017. The Perseids team visited Lyon in September 2014 to train the French professors and students in the use of the editing platform. 27 Another training and planning meeting will take place in Leipzig in February 2015, where the focus will be on the overlap between the Visible Worlds project and the Sunoikisis Europe program, which is based in Leipzig. 
Sunoikisis Europe will emulate the work started by Sunoikisis USA, 28 which is run by the Center for Hellenic Studies. Sunoikisis aims to pool the Classics resources available across institutions in order for students and faculty at participating institutions to benefit from opportunities normally available only at large research institutions. It has been decided that some of the 2015 Visible Worlds sessions in Greece would be part of the Sunoikisis program, in which Perseids will be used as a publishing platform for prosopography and social network visualization. 29 Perseids is not only used in Classics, but has made important inroads in other disciplines. Our Tufts colleague Ioannis Evrigenis (Political Science) is using Perseids to produce a digital edition of Jean Bodin’s Six Livres de la République. Although Jean Bodin (1530-1596) is not widely known today, his work on the notion of sovereignty influenced major Western thinkers such as Montesquieu and Hobbes. Professor Evrigenis’ digital edition will be the first to take into account the three available versions of Bodin’s work, namely the French, Latin, and English versions. The three versions do not fully correspond to one another and reveal important changes and evolution in Bodin’s thought. 30 D. Evaluation Throughout the project, we have been in constant communication with our user base through email as well as face to face meetings such as hackathons and formal presentations at conferences. The feedback we receive ranges from general user comments to questions about specific functionality and desiderata. One comment that has been persistently made concerns the availability of the Leiden+ 26 http://www.hisoma.mom.fr/mb/IG_LOUVRE/E-PIGRAMME-FR.html 27 http://sites.tufts.edu/perseids/news-and-updates/perseids-used-in-lyon/ 28 http://wp.chs.harvard.edu/sunoikisis/ 29 https://sites.tufts.edu/perseids/news-and-updates/perseids-participates-in-sunoikisis-europe/ 30 http://sites.tufts.edu/dynamicvariorum/ http://www.hisoma.mom.fr/mb/IG_LOUVRE/E-PIGRAMME-FR.html http://sites.tufts.edu/perseids/news-and-updates/perseids-used-in-lyon/ http://wp.chs.harvard.edu/sunoikisis/ https://sites.tufts.edu/perseids/news-and-updates/perseids-participates-in-sunoikisis-europe/ http://sites.tufts.edu/dynamicvariorum/ 7 encoding system for the transcription of epigraphic texts and manuscripts as an alternative for marking the text up directly in XML using the EpiDoc standards. In response, we will enable this function in 2015. Student feedback has generally concerned the workflow, and we have been working to integrate the different components of Perseids more and more seamlessly into one another. For instance, we aim to provide a way for students to move directly to treebanking once they have entered a transcription of an epigraphical text into Perseids. Currently, the text has to be entered manually into a different module in order to treebank an object that is not otherwise available in Perseus. We have also received requests for additional customization and enhancements to the review workflow, for example ensuring that a publication, once submitted, always goes to the same reviewer after requested corrections are made. All feedback is entered in the project’s github issue tracker. 31 E. Long Term Impact and Continuation of the Project Perseids has received an enthusiastic response in the community and new collaborations keep arising. 
As mentioned above, Perseids will continue to be used in Tufts classes for editing, treebanking, translation alignments, timelines, and many other use cases. Starting in January 2015, Perseids will be used in Professor Beaulieu’s Journey of the Hero class to work on prosopographical data in the Greek mythological corpus and to publish social network visualizations. The same features will be put to use in May 2015 with the Perseids’ team participation in the Visible Worlds Project and Sunoikisis Europe. Perseids also continues to expand beyond Digital Classics, as colleagues in other disciplines such as Syriac, 32 and early English literature 33 have been contacting us to explore possibilities for collaboration. Tufts University is strongly committed to Perseids, as the platform keeps taking a more and more important place in our courses and supports our position of leadership in Digital Humanities. Perseids will be central to the new Master’s program in Digital Humanities and Premodern Studies currently being planned at Tufts. The program frames the use of historical languages such as Classical Greek, Latin, and Arabic in an intellectual context that includes but extends beyond antiquity to encompass all disciplines and time periods of the premodern world. Furthermore, the Premodern Studies program is designed to capitalize upon, and to incorporate within its curriculum, emerging digital technologies that have given humanists powerful tools for analyzing texts, objects, and physical spaces. The curriculum integrates learning and research from an early stage, so students are expected to produce new knowledge in the form of digital editions, datasets, and analytical research. Thus, the program will showcase a new model of training in the humanities that transcends the traditional departmental and curricular boundaries: it is a convergent, collaborative effort to use abstract skills and training to add to the sum of human knowledge. Perseids, with its versatile design and its emphasis on collaborative work, offers numerous ways to support and enhance this effort. Continuing development of the Perseids platform is currently ensured by new grants. After a successful start-up period funded by the NEH, we went on to receive a two-year grant from the Andrew W. Mellon Foundation, “Developing Perseids: Enhancements to a Collaborative Editing Platform for Source Documents in Classics”. 31 https://github.com/PerseusDL/perseids_docs/issues 32 http://syriaca.org/ 33 http://www.sas.ac.uk/videos-and-podcasts/culture-language-literature/shakespeare-his-contemporaries- exploring-early-moder https://github.com/PerseusDL/perseids_docs/issues http://syriaca.org/ http://www.sas.ac.uk/videos-and-podcasts/culture-language-literature/shakespeare-his-contemporaries-exploring-early-moder http://www.sas.ac.uk/videos-and-podcasts/culture-language-literature/shakespeare-his-contemporaries-exploring-early-moder 8 Perseids attracted further non-federal funding in the form of a Digital Resources grant from the Samuel H. Kress Foundation 34 titled: “The Digital Milliet: Greek and Roman Painting in the 21 st Century”. 35 The project aims to collect and annotate Greek and Latin texts concerning ancient painting in an online collection which updates the now obsolete Recueil Milliet published by Salomon Reinach in 1921. The Digital Milliet will offer a fully integrated digital edition of the ancient texts that will include translations, commentaries, and an iconographical database. 
The project will serve as a model for further work of this nature, utilizing the resources developed on Perseids for the dynamic syllabus and annotation modules. F. Grant Products The Perseids platform itself is the central product to come out of this project. The platform is available online 36 and new users can create accounts using their institutional credentials, a social identity provider such as Gmail, Yahoo, and AOL, or through OpenID. As stated above, documentation and updates about Perseids are available on our blog 37 and from our Github repository. 38 Grant products are also available in the form of student publications. As stated above, commentaries produced by students in Marie-Claire Beaulieu’s Fall 2013 Classical Mythology class have been published on Perseus as student annotations. 39 A demo of the epigraphy work performed in Marie- Claire Beaulieu’s 2013 Intermediary Greek class is also available online. 40 We are currently working on publishing student work produced in 2014 in Marie-Claire Beaulieu’s Medieval Latin, Intermediary Greek, and Classical Mythology. Further user publications are available on Perseids as well as on the EAGLE wiki, which now uses the Perseids review workflow. 41 Finally, the team and collaborators have produced articles, posters, papers, and talks, which are listed in Appendix 1. 34 http://www.kressfoundation.org/ 35 See project announcement: https://sites.tufts.edu/perseids/news-and-updates/the-digital-milliet-greek-and- roman-painting-in-the-21st-century/ Working demo available: http://perseids.org/tools/digmill/#callout2 36 http://sosol.perseids.org/sosol/signin 37 http://sites.tufts.edu/perseids/ 38 https://github.com/PerseusDL/perseids_docs 39 http://sites.tufts.edu/perseusupdates/2014/05/29/student-commentaries-published-in-perseus/ 40 http://perseids.org/sites/epifacs/ 41 http://www.eagle-eagle.it/Italiano/index_it.htm http://www.kressfoundation.org/ https://sites.tufts.edu/perseids/news-and-updates/the-digital-milliet-greek-and-roman-painting-in-the-21st-century/ https://sites.tufts.edu/perseids/news-and-updates/the-digital-milliet-greek-and-roman-painting-in-the-21st-century/ http://perseids.org/tools/digmill/#callout2 http://sosol.perseids.org/sosol/signin http://sites.tufts.edu/perseids/ https://github.com/PerseusDL/perseids_docs http://sites.tufts.edu/perseusupdates/2014/05/29/student-commentaries-published-in-perseus/ http://perseids.org/sites/epifacs/ http://www.eagle-eagle.it/Italiano/index_it.htm 9 Appendix 1. Papers and Presentations Presentations ( Links to some of these presentations are available on the Perseids blog at http://sites.tufts.edu/perseids/.) Christopher Barbour and Alexander May “Hidden Treasures of the Middle Ages”, Osher Institute, September 30, 2011. Alexander May and Alicia Morris, “The Miscellany Collection: How a Small Digital Collection Caught the Imagination of the Scholarly Community at Tufts and Beyond.” Presentation at the New England Library Association Annual Conference, Burlington, Vermont, October 2, 2011. Marie-Claire Beaulieu, Francesco Mambrini and J. Matthew Harrington, “Toward a Digital Editio Princeps: Using Digital Technologies to Create a More Complete Scholarly Edition in the Classics”, From Ancient Manuscripts to the Digital Era. Readings and Literacies, 23-25 August 2011, Lausanne, Switzerland. Marie-Claire Beaulieu, Francesco Mambrini and J. 
Matthew Harrington, “Treebanking and Digital Scholarly Editions in the Classics”, Interedition Symposium: Scholarly Digital Editions, Tools and Infrastructure, March 19-20, 2012, The Hague, Netherlands. Alicia Morris, “Rethinking Tech Services: How we used the Tisch Miscellany to reshape Technical Services.” Presentation at the NETSL Annual Conference, Worcester, MA, May 3, 2012. Marie-Claire Beaulieu and Bridget Almas, “Digital Humanities in the Classroom: Introducing a New Editing Platform for Source Documents in Classics”, Digital Humanities 2012, 16-22 July 2012, Hamburg, Germany. Marie-Claire Beaulieu, “The Perseids Platform”, Institute for Advanced Topics in Digital Humanities, “Working with Text in a Digital Age”, Tufts University, Aug. 6, 2012. http://sites.tufts.edu/digitalagetext/ Marie-Claire Beaulieu, “Une nouvelle plate-forme éditoriale pour les sources primaires en études classiques”, Epigraphy Seminar of the French School in Athens, Nov. 6, 2012, Epigraphical Museum, Athens, Greece. Marie-Claire Beaulieu and Bridget Almas, “Open Philology Workshop”, August 9, 2013, “Teaching with the Perseids Platform”, University of Leipzig, Germany. Bridget Almas, “The Perseids Platform”, Digital Classicist London Seminar, March 22, 2013. http://sites.tufts.edu/perseids/ http://sites.tufts.edu/perseids/ http://sites.tufts.edu/perseids/ http://sites.tufts.edu/digitalagetext/ 10 Monica Berti, “Fragmenta Historica 2.0. Quotations and Text Re-uses in the Semantic Web”, Word, Space, Time: Digital Perspectives on the Classical World, April 5-6, 2013, University at Buffalo, SUNY. Marie-Claire Beaulieu, “Teaching with the Perseids Platform: Tools and Methods”, Digital Classicist London Seminar, July 26, 2013. Monica Berti and Bridget Almas, “The Perseids Collaborative Platform for Annotating Text Re-Uses of Fragmentary Authors”, DH-Case Workshop, September 10, 2013, Florence, Italy. Bridget Almas, “The Perseids Platform”, Research Data Alliance, October 24, 2013. Rensselaer Polytechnic Institute. Monica Berti and Bridget Almas, “The Linked Fragment: TEI and the encoding of text re-uses of lost authors”, The Linked TEI: Text Encoding in the Web, TEI Conference and Members Meeting 2013, Università di Roma Sapienza, October 2-5, 2013. Bridget Almas, David Dubin, Sayeed Choudhury, “Combining Complementary Provenance Data Models in Humanities Research”, Research Data Alliance, Plenary 3, Dublin, March 27, 2014. Accepted : Marie-Claire Beaulieu and J. Matthew Harrington, “Beyond Rhetoric: the Correlation of Data, Syntax, and Sense in Literary Analysis”, Digital Classics Association, Society for Classical Studies, Annual Meeting, New Orleans, LA, January 8-11, 2015. Published Papers Marie-Claire Beaulieu, Francesco Mambrini and J. Matthew Harrington, “Towards a Digital Editio Princeps: Using Digital Technologies to Create a More Complete Scholarly Edition in the Classics”, Lire Demain/Reading Tomorrow, Papers of the International Conference “From Ancient Manuscripts to the Digital Era. Readings and Literacies, Lausanne, 23-25 August 2011, Clivaz, C. et al. eds, Presses polytechniques et universitaires romandes, 2012 (ebook), p. 393-414. Marie-Claire Beaulieu and Bridget Almas, “Digital Humanities in the Classroom”, Literary and Linguistic Computing, (2013) 28 (4): 493-503. Forthcoming: Marie-Claire Beaulieu and Bridget Almas, “Scholarship for all!”, Classics Outside the Echo- Chamber: Teaching,collaboration, outreach and public engagement, Gabriel Bodard & Matteo Romanello eds. 
(Publisher TBD) 11 Appendix 2. Examples of Manuscripts Digitized with Grant Funds See the Tisch Library Special Collections Flickr site for full record of images: https://www.flickr.com/photos/tischlibraryspecialcollections/sets/ Figures 1 and 2. Photograph of the 14 th century compendium of English forest law before digitization vs. high resolution flattened image produced by Boston Photo Imaging. Figures 3 and 4. Photograph of the Commission of Doge Andrea Gritti to Lorenzo Diedo as Podesta of Montefalcone (Venice, 1533) vs. digitized image. https://www.flickr.com/photos/tischlibraryspecialcollections/sets/ 12 Appendix 3. Course Syllabi Marie-Claire Beaulieu, Classical Mythology, Tufts University, Fall 2013. Dynamic syllabus Susan Dunning, Introduction to Classical Mythology, University of Toronto, Summer 2014 Dynamic syllabus: http://sosol.perseids.org/syllabi2/html/torcla204h1f.html http://sosol.perseids.org/syllabi2/html/torcla204h1f.html 13 Marie-Claire Beaulieu, Intermediary Greek, Tufts University, Fall 2013 Greek 7: Plato’s Apology of Socrates Tufts University Fall 2013 Professor: Marie-Claire Beaulieu, PhD Meets: Mon-Wed 3h00-4h15, Eaton 333 Email: Marie-Claire.Beaulieu@tufts.edu Office: Eaton 327 Office hours: Mon-Wed 1h00-3h00 Objectives This course will familiarize the students with Greek prose and further their knowledge of Greek grammar. The students will develop their skills at reading continuous passages in Greek and will become familiar with Plato’s style. Grading Participation and Preparation, weekly verb quizzes: 10% 3 quizzes: 20% Paper: 15% 2 exams: 20% each (=40%) Special project (inscriptions): 15% Paper -Choose a work of Plato or Xenophon not discussed in class and introduce it -Choose a particularly significant passage in this work and analyze it in detail Special Project This semester, we will edit and translate Greek funerary inscriptions which we will then publish on the Perseus website. Attendance policy and Making-Up Work Class attendance is required. Absences for religious holidays, family emergencies, and properly documented medical reasons will be excused. Missed quizzes and exams can be completed upon presentation of proper documentation. Religious Holidays Students can make up work missed for religious holidays if they notify the instructor in advance. Let me know as early as possible in the semester so that we can make arrangements. Students with disabilities All necessary accommodations will be made for students with documented disabilities. mailto:Marie-Claire.Beaulieu@tufts.edu 14 Marie-Claire Beaulieu, Medieval Latin, Tufts University, Spring 2014 LAT 0030/0130: Medieval Latin Spring 2014 Meets: Mon-Wed 4h30-5h45 Prof. Marie-Claire Beaulieu Marie-Claire.Beaulieu@Tufts.edu Office: Eaton 327 Office hours: Mon-Wed. 1h00-3h00 or by appointment Teaching Assistant Timothy Buckingham Timothy.Buckingham@tufts.edu Office hours: 1-2 Wed-Thurs. Course Description An introduction to Medieval Latin that covers a variety of European authors over a period of 800 years. The course will be organized around the theme of travel and map making in the Middle Ages. Texts we will read include Friar Odoric's thirteenth-century account of his travels to India and crusader narratives. We will also read sections of Isidore's Etymologiae, in which the author describes the world, and we will pay close attention to medieval maps such as the Hereford mappa mundi. 
Occasionally, we will read excerpts from other contemporary travel accounts not written in Latin such as John Mandeville and Marco Polo. Term projects for the class will be conducted in collaboration with the Tufts Special Collections. Students will transcribe, translate, and publish manuscripts held in the special collections. Grading 3 quizzes: 30 % (10% each) Final Exam: 20% Term Project: Initial transcription and markup: 10% Final transcription and markup: 15% Initial translation: 10% Translation: 15% Attendance policy and Making-Up Work Class attendance is required. Absences for religious holidays, family emergencies, and properly documented medical reasons will be excused. Missed quizzes and exams can be completed upon presentation of proper documentation. Students with disabilities All necessary accommodations will be made for students with disabilities. mailto:Marie-Claire.Beaulieu@Tufts.edu mailto:Timothy.Buckingham@tufts.edu 15 Marie-Claire Beaulieu, Classical Mythology, Tufts University, Fall 2014 Greek and Roman Mythology Meets: Mon-Wed 10h30-11h45, Eaton 201 Instructor: Dr. Marie-Claire Beaulieu Office: Eaton 327 Office Hours: Mon-Wed 1h00-3h00 email: Marie-Claire.Beaulieu@tufts.edu Teaching Assistants Elizabeth Andrews: Elizabeth.Andrews@tufts.edu John Moore: John.Moore@tufts.edu Course description This course offers a survey of Greek and Roman mythology. In addition to learning the names and stories of mythical figures, we will explore different interpretations of the myths and their religious significance for the ancients. We will also pay attention to recurring mythical patterns and their significance in the larger context of Indo-European myth. Required Textbooks: Morford, M., Lenardon, R., Sham, M. Classical Mythology, 10 th ed. Oxford, 2011. Hesiod, Theogony and Works and Days (tr. M.L. West) Sophocles, Antigone, Oedipus the King, Electra (tr. H.D.F. Kitto) Ovid, Metamorphoses (tr. A.D. Melville) Grading Museum visit: 5% 5 pop-quizzes: 15% (lowest score dropped) Bibliography: 15% Outline: 10% Timeline Assignment: 10% Map Assignment: 10% First draft of research paper: 15% Final draft of research paper: 20% Attendance policy and Making-Up Work Class attendance is required. Absences for religious holidays, family emergencies, and properly documented medical reasons will be excused. Missed exams can be completed upon presentation of proper documentation. However, pop-quizzes cannot be made-up if missed. mailto:Marie-Claire.Beaulieu@tufts.edu mailto:Elizabeth.Andrews@tufts.edu mailto:John.Moore@tufts.edu work_becbdobjdjamjgxuo64hj77n7i ---- CIPA 2017 BODY AS ECHOES: CYBER ARCHIVING OF DAZU ROCK CARVINGS Chen Wu-Wei IMA Program, Shanghai NYU, 1555 Century Ave, Pudong Xinqu, Shanghai Shi, China, 200122 - wc54@nyu.edu Commission II Keywords: Digital Heritage, Digital Sculpting, STEM Education, Interactive Info-Motion Design, Dazu Rock Carvings ABSTRACT: “Body As Echoes: Cyber Archiving of Dazu Rock Carvings (BAE project in short)” strives to explore the tangible/intangible aspects of digital heritage conservation. Aiming at Dazu Rock Carvings - World Heritage Site of Sichuan Province, BAE project utilizes photogrammetry and digital sculpting technique to investigate digital narrative of cultural heritage conservation. It further provides collaborative opportunities to conduct the high-resolution site survey for scholars and institutions at local authorities. 
For preserving and making sustainable of the tangible cultural heritage at Dazu Rock Carvings, BAE project cyber-archives the selected niches and the caves at Dazu, and transform them into high-resolution, three-dimensional models. For extending the established results and making the digital resources available to broader audiences, BAE project will further develop interactive info-motion interface and apply the knowledge of digital heritage from BAE project to STEM education. BAE project expects to bridge the platform for archeology, computer graphics, and interactive info-motion design. Digital sculpting, projection mapping, interactive info-motion and VR will be the core techniques to explore the narrative of digital heritage conservation. For further protecting, educating and consolidating “building dwelling thinking” through digital heritage preservation, BAE project helps to preserve the digital humanity, and reach out to museum staffs and academia. By the joint effort of global institutions and local authorities, BAE project will also help to foster and enhance the mutual understanding through intercultural collaborations. 1. HISTORY OF WORK ON THE PROJECT TO DATE During the six years (2011-2016) of professor position, The author delivers cultural heritage related topics in the studio classes such as digital sculpting and visual programming in the U.S. university. The author shares insights of the 3-D Mandala deployed in the Lecture Hall of To-Ji Temple in Kyoto and helps students from scratch to portray the deities with complex forms by digital sculpting tools. [Figure 1] Besides the duty of teaching, the author also collaborates with cultural organizations in Hong Kong to cyber-archive the cultural objects from Peshawar (e.g., narrative relief and deities of Bodhisattva, Maitreya, and Shakyamuni), and study the iconography of religious deities from different geolocations in Asia. [Figure 2/2.1] Figure 1. Deities at the temples of Kyoto and Nara sculpted by digital tools. Figure 2/2.1: Photogrammetry documentation of Gandhāra- style carved grey schist Shakyamuni, Bodhisattva (gilt), Standing Maitreya and narrative relief. Earlier in 2016, the author conducted the field research by personal efforts at the World Heritage Site of Sichuan Province - Dazu Rock Carvings. Selected Esoteric deities (Bodhisattva, Ksitigarbha, and Peacock Radiant Wisdom King) [Figure 3] are digitally documented and transformed into the 3-D models. To further develop the interactive contents, digital museum, STEM education, and info-motion design for Dazu Rock Carvings, the author collaborates with Dazu Rock Carving Institute in Sichuan Province and works together with collaborative partners from Southwest University of Nationalities. The full supports from local authorities enables the author to contribute to the academic network of Dazu School. Further collaborations will continue to facilitate the conservation efforts by innovative technologies. 2. METAMORPHOSIS OF THE SUTRA AND THE DIGITAL NARRATIVE In Buddhism, Buddha is free from reincarnation and karma. Bodhisattva, on the other hand, stays with the sentient beings and leads them to the Pure Land. Hence the diverse expressions and gestures are depicted on the Buddha and Bodhisattva. 
[Figure 4] The Mikaeri Amida (Amitabha Looking Back) deity at Zenrin-Ji in Kyoto for example, “looks back to the sentient beings with mercy, and interprets the attitudes of thinking back The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W5, 2017 26th International CIPA Symposium 2017, 28 August–01 September 2017, Ottawa, Canada This contribution has been peer-reviewed. doi:10.5194/isprs-archives-XLII-2-W5-121-2017 | © Authors 2017. CC BY 4.0 License. 121 mailto:idowman@ge.ucl.ac.uk on his own position, waiting for the people behind and leading them to the way of salvation together.”[ ] [Figure 5] 1 Figure 3: Digital documentations of the Esoteric deities (Bodhisattva, Ksitigarbha, and Peacock Radiant Wisdom King) at Dazu Rock Carving in Sichuan Province, China. Figure 4: Shakyamuni deity carved grey schist of the ancient region of Gandhara circa 2nd century. Cyber-archive by Photogrammetry. Similar scenarios can be found on the deity of Peacock Radiant Wisdom King in Dazu Rock Carvings. This deity, which is akin to the one in the cave No. 155 at Beishan area, “sits in padmasana cross-legged lotus posture on the lotus throne placed on the back of a peacock.”[ ] The tail of the peacock extends all 2 the way up to the ceiling as the halo, and works similarly as the central pillar to hold the structure of the cave. Most of the heavenly kings are portrayed with wrathful expressions as the alternative representations of the Bodhisattva to defeat the demons. The faces of the Peacock Radiant Wisdom Kings at Beishan (Cave No.115) and Shimenshan (Cave No.8) [Figure 6] are rather gentle and merciful as Bodhisattva. Figure 5: The Mikaeri Amida (Amitabha Looking Back) deity at Zenrin-Ji in Kyoto Figure 6: Point cloud view (L) and low-polygonal view of Peacock Radiant Wisdom King at Cave No.8 in Shemenshan, Dazu Rock Carving. For Bodhisattva, initially, it is depicted as the male figure in India and earlier deities, such as Gandhara-style Bodhisattva in 2nd Century. [Figure 7] In China, the female figures to show the compassion become famous after Tang Dynasty. At Dazu Rock Carvings, the elegance and exquisiteness are frequently witnessed among the majority of the Bodhisattva Deities. One particular deity of Song Dynasty (960-1127 A.D.) with the unique expression - the Counting Beads Avalokiteśvara (aka Charming Avalokiteśvara) - becomes the highlight of Beishan area at the Niche No. 125 [Figure 8]: Introduction of Mikaei Amida at Eikando.1 P.36, Dazu Rock Carvings.2 The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W5, 2017 26th International CIPA Symposium 2017, 28 August–01 September 2017, Ottawa, Canada This contribution has been peer-reviewed. doi:10.5194/isprs-archives-XLII-2-W5-121-2017 | © Authors 2017. CC BY 4.0 License. 122 http://creativecommons.org/licenses/by/3.0/ Figure 7: Gilt grey schist Bodhisattva from the ancient region of Gandhara circa 2nd - 3rd Century, H 15.5 cm. Cyber Archive by photogrammetry. Figure 8: Expressions, gestures and full view of the niche No. 125 “Charming Avalokteśvara” deity. 92 cm in height. Point cloud visualization. “This Avalokiteśvara is counting the beads with the right hand that is gently held by the left hand at the wrist. The belts of her dress are waving in the wind just as those beautifully painted by the famous painter Wu Daozi. 
Like a graceful and shy maiden, she turns her head a little to the left, and watches down with a smile.”[ ]Regarding the origin of Charming Avalokiteśvara, the 3 author believes it is Sri-Mahadevi compared to the iconography portrayed in Hinduism of India and Esoteric Buddhism in Japan. Even though in the official publication from Dazu Rock Carvings, the deity is categorized as the Avalokiteśvara. The s hy, innocent expres s ion and ges ture of Charming Avalokiteśvara, different from the gentle, merciful looks of the other Count Beads Avalokiteśvara deities at Dazu Rock Carvings, convey the young and refreshing energy to the viewers and further enhances the belief in looking for prosperity, auspiciousness, and mindfulness. Figure 9: Niche No.253 “Bodhisattva and Ksitigarbha” deities. Point cloud visualization. 3. Digital humanity in cultural heritage conservation Softwares nowadays provide easy access to digitally sculpting or documenting the physical objects. Digital sculpting depends on the artist's sense, techniques, and experiences towards the anatomic precision, form, balance, and motion. Sculptors' movements further extend to performance, then transforms into motion sculptures...etc., like never-ending echoes. Compared to digital sculpting, digital documentation rationally and precisely conserves the digital data, analyze the complex forms or damaged parts, and rebuild the real subjects in the virtual world, s u c h a s p h o t o g r a m m e t r y o r l a s e r s c a n n i n g . T h e s e documentation methods are widely adopted and with easier solutions simply by mobile phone cameras and software. Back in the days when the Virtual World Heritage Laboratory led the Digital Sculpture Project (2009-2013), it focused on the complexity of famous sculptures and utilizes 3d laser scanning to restore the point cloud data from the sculptures. The project’s attention to the "neglected area of the digital humanities” and cultural significance, shows the uniqueness of the Digital Sculpture Project. The selected sculptures, ranging from Alexander to Laocoön, inherit the richness of form and context. Digital documentation of sculptures, through the earlier efforts, integrates into heritage preservation and consistently applies to heritage information projects. Technically speaking, BAE project integrates the computer graphics and interactive info-motion design, virtual reality and CAVE as the forms to explore the meaning and metamorphosis of the Sutra. Selected niches and caves at Dazu Rock Carvings (e.g., Bodhisattva, Ksitigarbha, Peacock Radiant Wisdom King) are documented by hybrid scanning (photogrammetry/laser scanning/UAV data). Raw data is analyzed and re-established digitally by reality capture software. Point cloud data (.asc/.xyz/.ply) as the initial results can be further interpreted by modeling, texturing and rendering. Compared with the existing projects of cave art and cultural heritage, BAE project utilizes hybrid techniques in the cyber-archiving process for balancing the accuracy and visibility. P.51, Dazu Rock Carvings.3 The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W5, 2017 26th International CIPA Symposium 2017, 28 August–01 September 2017, Ottawa, Canada This contribution has been peer-reviewed. doi:10.5194/isprs-archives-XLII-2-W5-121-2017 | © Authors 2017. CC BY 4.0 License. 123 Digital humanity is the first priority in BAE project facilitated by the dialogues between the contexts and science. 
BAE project selects digital heritage as the form to study Dazu Rock Carvings to explore the tangible and intangible values of digital humanity. The intangible aspect of the project will be investigated, including the cultural meaning, interconnections between deities depicted, analysis of the iconography, accentuation of the dynamism of time and change, and narration in the new digital space afforded now. Buddhism and its path from India, Dun Huang to Japan symbolize the dissemination of intercultural activities along the Silk Road. Dazu District and Sichuan, on the other hand, connects with Dun Huang and Japan by the cave art, religion, and culture. The existence of Dazu Rock Carvings and its preservation symbolizes the echoes of both the digital and social humanities. 4. PROJECT PLAN AND LONG-TERM SUSTAINABILITY BAE project will focus on the abundant and exquisite rock carving sites which is widely spread at Dazu District and surrounding areas (more than 500 niches/caves and over 10,000 rock-carvings) in Sichuan province ever since Tang dynasty in China. BAE project values the most of the conservation of cultural heritage, and proposes developing digital heritage contents to preserve the physical objects and sites. Digital data documented and archived from the cultural objects and heritages sites can be further disseminated through publishing, exhibitions, symposium presentations, websites and social media. The experience and knowledge obtained from BAE project can also be simplified and transformed into contents for primary / secondary school education, such as STEM education materials. As the Vajracchedika-Prajna-Paramita Sutra (Diamond Sutra) frequently addresses in the text: “if a Bodhisattva (still) clings to the false notion (laksana) of an ego, a personality, a being and a life, he is not ( a true) Bodhisattva.” The various looks of those deities in the caves and niches inevitably get corrosion and surface weathering even human destructions throughout the years. As the sentient beings in the world, our empathies on all the happenings and wishing to preserve the cultural heritage are universal. Through the digitization process of cyber-archiving, hopefully, the echoes of the ancient teachings to be heard and pass on. ACKNOWLEDGMENTS Sincere gratitude to Dazu Rock Carvings Institute and Professor Chen Ching Xiang of Chinese Culture University. REFERENCES Bringing the Ancient Theater of the Silk Road to Los Angeles. East West Bank. https://www.eastwestbank.com/ReachFurther/ News/Article/Bringing-the-Ancient-Theater-of-the-Silk-Road- to-Los-Angeles Cultural Heritage Conservation in Pakistan: Conversation with Dr Richard A. Engelhardt. https://www.youtube.com/watch? v=BWYUO0nciss Dazu Rock Carving Institute. Dazu Rock Carving. Chongqing Publishing Group. Chongqing, China, 2010, 36, 51. Introduction of Mikaei Amida. http://www.eikando.or.jp/ English/mikaeri_amida_e.html Jansen, Michael. “Virtual reality for a physical reconstruction? The Bamiyan Buddhas in Afghanistan”. Archaeologising Heritage. Panel 3: The Virtualisation of Archaeological Heritage. International Workshop on Angkor / Cambodia. Heidelberg, Germany, 2010. Kibi Conservation Studio for Cultural Objects. http:// www.kibibunn.info/ L i a n g Y o n g , W u . L O O K I N G F O R W A R D T O ARCHITECTURE OF THE NEW MILLENNIUM. http:// n e w u r b a n q u e s t i o n . i f o u . 
o r g / p r o c e e d i n g s / 1 % 2 0 T h e % 2 0 N e w % 2 0 U r b a n % 2 0 Q u e s t i o n / Wu%20Liangyong.pdf Opening ceremony of "The Cave Temples of Dunhuang, Buddhist Art on China’s Silk Road. http://www.yucolab.com/ news/108-opening-ceremony-of-the-at-the-j-paul-getty- museum-la.html#.V_R4u2V6yGk Gandhara - Das buddhistische Erbe Pakistans. https:// www.youtube.com/watch?v=Su2zCRIMFv8 The Digital Sculpture Project. Virtual World Heritage Laboratory. http://www.digitalsculpture.org The International Dunhuang Project. http://idp.bl.uk/idp.a4d The Vajracchedika-prajna-paramita Sutra. http://big5.xuefo.net/ nr/article1/7446.html 重慶⼤⾜⽯刻藝術博物館. ⼤⾜⽯刻研究⽂集第四輯. 中國 ⽂聯出版社. 北京, 中國:2002. 牧野隆夫. 仏像再興 仏像修復をめぐる⽇々. 山と渓⾕社出 版. Yama-kei Publishers co., Ltd. Tokyo, Japan: 2016. 籔内佐⽃司. 壊れた仏像の声を聴く- ⽂化財の保存と修復. 角川学芸出版. KADOKAWA Publishing. Tokyo, Japan: 2015. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W5, 2017 26th International CIPA Symposium 2017, 28 August–01 September 2017, Ottawa, Canada This contribution has been peer-reviewed. doi:10.5194/isprs-archives-XLII-2-W5-121-2017 | © Authors 2017. CC BY 4.0 License. 124 https://www.eastwestbank.com/ReachFurther/News/Article/Bringing-the-Ancient-Theater-of-the-Silk-Road-to-Los-Angeles https://www.eastwestbank.com/ReachFurther/News/Article/Bringing-the-Ancient-Theater-of-the-Silk-Road-to-Los-Angeles https://www.eastwestbank.com/ReachFurther/News/Article/Bringing-the-Ancient-Theater-of-the-Silk-Road-to-Los-Angeles https://www.youtube.com/watch?v=BWYUO0nciss https://www.youtube.com/watch?v=BWYUO0nciss http://www.eikando.or.jp/English/mikaeri_amida_e.html http://www.eikando.or.jp/English/mikaeri_amida_e.html http://www.kibibunn.info/ http://www.kibibunn.info/ http://newurbanquestion.ifou.org/proceedings/1%20The%20New%20Urban%20Question/Wu%20Liangyong.pdf http://newurbanquestion.ifou.org/proceedings/1%20The%20New%20Urban%20Question/Wu%20Liangyong.pdf http://newurbanquestion.ifou.org/proceedings/1%20The%20New%20Urban%20Question/Wu%20Liangyong.pdf http://www.yucolab.com/news/108-opening-ceremony-of-the-at-the-j-paul-getty-museum-la.html#.V_R4u2V6yGk http://www.yucolab.com/news/108-opening-ceremony-of-the-at-the-j-paul-getty-museum-la.html#.V_R4u2V6yGk http://www.yucolab.com/news/108-opening-ceremony-of-the-at-the-j-paul-getty-museum-la.html#.V_R4u2V6yGk http://www.youtube.com/watch?v=Su2zCRIMFv8 http://www.digitalsculpture.org http://idp.bl.uk/idp.a4d http://big5.xuefo.net/nr/article1/7446.html http://big5.xuefo.net/nr/article1/7446.html work_bhz3tr2jvfesniohrdugc7veda ---- HSR Suppl. 31_2018_Tomasi_Modeling in the Digital Humanities.docx www.ssoar.info Modelling in the Digital Humanities: Conceptual Data Models and Knowledge Organization in the Cultural Heritage Domain Tomasi, Francesca Veröffentlichungsversion / Published Version Zeitschriftenartikel / journal article Zur Verfügung gestellt in Kooperation mit / provided in cooperation with: GESIS - Leibniz-Institut für Sozialwissenschaften Empfohlene Zitierung / Suggested Citation: Tomasi, F. (2018). Modelling in the Digital Humanities: Conceptual Data Models and Knowledge Organization in the Cultural Heritage Domain. Historical Social Research, Supplement, 31, 170-179. https://doi.org/10.12759/ hsr.suppl.31.2018.170-179 Nutzungsbedingungen: Dieser Text wird unter einer CC BY Lizenz (Namensnennung) zur Verfügung gestellt. 
Historical Social Research Supplement 31 (2018), 170-179 │ published by GESIS
DOI: 10.12759/hsr.suppl.31.2018.170-179

Modelling in the Digital Humanities: Conceptual Data Models and Knowledge Organization in the Cultural Heritage Domain

Francesca Tomasi ∗

Abstract: »Modellieren in den Digitalen Geisteswissenschaften: Konzeptuelle Datenmodelle und Wissensorganisation für das kulturelle Erbe«. This paper explores the role of model and modelling in the field of Digital Humanities, paying special attention to the cultural heritage domain. In detail, the approach described here adopts a bi-dimensional vision: considering the model as both a process of abstraction, an interpretation from a certain point of view, and a formal language to implement this abstraction in order to create something processable by a machine. The role of conceptual models – to be converted into ontologies – as a semantic deepening of controlled vocabularies is the translation of this vision. Ontologies are the models used in domain communities in order to share classes and predicates for conceptual interoperability. Thinking of data models as a knowledge organization system is the core of this reflection on the Digital Humanities domain.

Keywords: Ontologies, knowledge, interpretation, data structures, controlled vocabularies.

∗ Francesca Tomasi, Department of Classical Philology and Italian Studies, University of Bologna, via Zamboni 32, 40126 Bologna, Italy; francesca.tomasi@unibo.it.

1. Introduction

Model and modelling in the domain of Digital Humanities (DH) is a huge and challenging topic. It is not trivial to find a common and shareable definition, because the concept of model/modelling is related to multiple facets, integrating the humanistic point of view with the computer scientists' approach. DH also have their own notion of models and modelling (see in particular Orlandi 1999; Buzzetti 2002; McCarty 2004); concepts that also reflect a core method in DH in general, and in my research on domain ontologies – or better, on conceptual data modelling in the cultural heritage domain – in particular.

But let us start from the beginning, from the attempt to find an appropriate definition. We could say that the activity of modelling consists of choosing the features of the observed reality (e.g. an object in a domain) to be formally represented (the abstract model). This formalization requires the adoption of a data structure related to a language useful for the description of the abstraction. Thus, a model refers to the declaration of the selected properties of an object, e.g. a plain text, to be translated into a machine-readable form by using a descriptive language as a representational method. Following this definition, a model is firstly a matter of extracting properties of an object as the result of an interpretation. And an interpretation is, naturally, the expression of a point of view.
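A minimal sketch may make this concrete (the element names below are invented for illustration and do not belong to any of the standards discussed later): the same plain-text sentence can be translated into two different machine-readable models, depending on which properties the interpreter selects.

```xml
<!-- Two alternative models of the same plain-text sentence.
     The vocabulary is invented for illustration only. -->

<!-- Model A selects linguistic properties of the text -->
<sentence xml:lang="en">
  <subject>The abbot</subject>
  <verb>signs</verb>
  <object>the charter</object>
</sentence>

<!-- Model B selects documentary properties of the same text -->
<page n="12r">
  <line n="1" hand="scribe-1">The abbot signs the charter</line>
</page>
```

Neither model is more "true" than the other: each declares a different set of selected properties, i.e. a different point of view on the same object.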
Each point of view is only one of many ways to interpret the observed reality. The more viewpoints on the same object we have, the more models might be collected. So each abstraction is a possible, individual representation of an object in a domain, which is able to replicate the original object: "to an observer B, an object A* is a model of an object A to the extent that B can use A* to answer questions that interest him about A" (Minsky 1995).

But models have to be able to aggregate viewpoints. In fact, modelling also means identifying common features of a collection, or extracting those patterns that can be recognized in similar resources. The similarity is a matter of sharing, i.e. sharing a genre, a type, a computational objective, a scope or a function. In this sense, modelling reveals a crowdsourced idea: sharing something within a community that has decided to advocate a common idea.

By using this approach, we recognize two interrelated levels of modelling: on the one hand, the model as an abstraction, as the interpretation of the object through possibly shared and potentially multiple "lenses" (Peroni et al. 2014); on the other, the model as the choice of a language useful to implement this abstraction by creating something that is processable by a machine.

2. Controlled Vocabularies as an Abstraction

The representation of common features of the observed reality is then a matter of communicating a specific vision of the domain.

For example, a digital scholarly edition is a model because it represents the choices of the editor in creating the digital objects at each level of the representation: the transcription, the annotation or markup, the para-, meta-, intra- and inter-textual elements, eventually the textual tradition, but also the interface, the criteria for browsing data and documents, etc. (Tomasi 2013). When the editors choose how to transcribe a document (e.g. in a diplomatic, interpretative or critical manner), or define which features they want to be managed by the machine, they thereby define a model of the text, which they want to reproduce in a digital dimension. Each step of this process involves computational consequences.

In general, modelling, as the result of an interpretation, has to be in dialogue with a shared vision of the observed domain. This is the reason why each cultural heritage domain (from libraries to archives, from museums to galleries) endeavors to define strategies for a semantic dialogue within and between cultures that use different standard reference models.

The choice of the content model, i.e. a metadata vocabulary for describing a collection, is a matter of sharing. And sharing a model is what it takes to guarantee a basic semantic interoperability. Dublin Core (DC)¹ is, for example, a content model chosen for describing the reality through an abstraction: 15 categories able to collect all the necessary features of the observed domain (a cultural heritage collection, a web page, an institutional repository, the papers of a journal, etc.). The Text Encoding Initiative (TEI)² is a model expressed in embedded markup, i.e. a controlled vocabulary, a grammar or, better, a Schema (a set of elements, attributes, rules and constraints), for describing objects related to a domain, namely, the humanities. So, TEI is a common model for representing the observed reality (i.e. texts and documents in the humanities domain), but it also leaves the interpreter free to define his or her own model of the text(s)/document(s) by choosing the features to be in focus in the computational representation.
The Encoded Archival Description (EAD)³ and the Encoded Archival Context – Corporate Bodies, Persons and Families (EAC-CPF)⁴ are, again, firstly Document Type Definitions (DTDs) and Schemata created to describe archival finding aids and authority records. They are models in the archival domain, reference systems for the community. And they are both the result of the attempt to formalize the two related methodological standards, namely ISAD (International Standard for Archival Description)⁵ for the archival description, and ISAAR-CPF (International Standard Archival Authority Records for Corporate Bodies, Persons and Families)⁶ for the description of authority records.

Despite the different implementations, DC, EAD, EAC-CPF, and TEI are all examples among others of metadata element sets used in cultural heritage to resolve ambiguities by sharing a domain vocabulary. They want to present themselves as models, through elements and attributes, conventions, and declarations.

1 Dublin Core Metadata Element Set, Version 1.1 (Accessed April 20, 2017).
2 TEI P5 Guidelines, latest version 3.1.0, 2016 (Accessed April 20, 2017).
3 EAD (Accessed April 20, 2017).
4 EAC-CPF (Accessed April 20, 2017).
5 ISAD (2nd edition), 2011 (Accessed April 20, 2017).
6 ISAAR-CPF (2nd edition), 2011 (Accessed April 20, 2017).

Metadata models (and controlled vocabularies) take up the need to define a common conceptual architecture for a domain. It is worth noting that in the literature "metadata modelling" refers to a type of metamodelling used in software engineering and systems engineering for the analysis and construction of models applicable to, and useful for, some predefined class of problems. The activity of metadata modelling is reflected in a concept diagram. Unified Modeling Language (UML)⁷ is the language used in the object-oriented paradigm to represent a model as a diagram. Concept, generalization, association, multiplicity and aggregation are all keywords for creating the model. So, a diagram is a model of the reality, able to represent objects in a context. We move from controlled vocabularies to diagrams; and from diagrams to languages.

3. Languages, Data Structures and Data Types

It has indeed been said that modelling is also a matter of language. And a formal language, from a computational point of view, is a question of data structures and abstract data types, i.e. graph (the network), tree (a hierarchy), table (a relation), sequence (a list). These are, in fact, models. For all data to be organized in a digital environment, one of these models is chosen to represent the observed reality. Some examples will help make this point clear:
- in declarative markup languages, e.g. in XML, the model of the document is a tree-like structure. So the content (actually the structure) of the document is represented as a series of features hierarchically organized and nested;
- in database systems (DBMS), the most common model adopted is the relational one, namely the table.
Objects are records (and thus data), and each value is related to one of the attributes (properties) that describe the reality of the objects;
- the network is the structure – the model – of the Web; but the network is also the hypertextual representation of documents (in the Web 1.0 environment), and now of interconnected data (from a Linked Open Data [LOD]⁸ perspective); from the sequence (a list of documents) to the graph (a network of data). Hypertext is then a model organizing data objects through their relationships.

We have to keep this aspect in mind. The formal language is another way to conceive the concept of model. The choice of the language, and of the related structures, depends on the observed domain: e.g. documents (i.e. non-structured objects) or data (i.e. structured objects). With the markup, e.g. with XML as a formal meta-language, we model documents as semi-structured objects. And the aim is to reduce the narrative, in order to model content (or, better, structure) as a collection of atomic interconnected pieces to be managed as data. We move from the property-value pairs through the tree (a declarative markup language such as XML) to the graph (a model such as the Resource Description Framework [RDF] for creating LOD).

7 UML (Accessed April 20, 2017).
8 W3C official page on Linked Data (Accessed April 20, 2017).

As has been said: in computer science, the concept of model is related to a data structure, i.e. a possible representation of a digital content or a particular way of organizing data. In this sense, the choice of the logical model (e.g. a relational database instead of a markup language) determines the computational results or, better, the computational activities and operations on the data as based on the chosen model. Hence, models play an important role in moving from theory (the abstract model) to practice, understood as the actions that can be performed (the formal language).

3.1 A Conceptual-Oriented Position

In the document community, the markup is the model, i.e. the language to represent the structure of the reality. In the data community, the model, i.e. the traditional way to represent the content of the domain, is the database. In data modelling theory, used especially in database design (although it holds true also for other contexts), we recognize three possible models, also described as three levels of abstraction of a DBMS:
- a conceptual data model
- a logical data model
- a physical data model (or better a Schema).

We begin with the latter. At the physical level, we deal with the physical means by which data are stored, which is not our level of interest.

At the logical level, we deal with structures of models again: hierarchical, network, relational and object-oriented. The importance of this level lies in the fact that each chosen data structure affects the possible computational activities: even if the model is theoretical, it involves the kind of operations that we could perform with data based on one of these abstract structures (the tree, the graph, the table, the class). So the model in this case is the content, not just the structure.

Now, let us move to the first level: the conceptual data model, i.e. the abstract conceptual representation of data. On this level, data are defined from a conceptual point of view. The meaning of data depends on the context of the interrelationships with other data.
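As an illustration of how the choice of data structure is itself a modelling act, the following sketch (not from the original article; the record, the element names and the URI are invented for illustration) expresses one and the same catalogue record as a tree, a table and a graph in Python, using the standard xml.etree module and the rdflib library:

```python
# One record, three data structures: tree (XML), table (relation), graph (RDF).
import xml.etree.ElementTree as ET
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DC

# Tree: the document model of declarative markup (features nested hierarchically).
letter = ET.Element("letter")
ET.SubElement(letter, "author").text = "Vespasiano da Bisticci"
ET.SubElement(letter, "date").text = "1450"

# Table: the relational model (a record whose values fill named attributes).
columns = ("id", "author", "date")
row = ("letter-1", "Vespasiano da Bisticci", "1450")

# Graph: the LOD model (atomic statements interconnected as triples).
g = Graph()
subject = URIRef("http://example.org/letter-1")
g.add((subject, DC.creator, Literal("Vespasiano da Bisticci")))
g.add((subject, DC.date, Literal("1450")))

print(ET.tostring(letter, encoding="unicode"))
print(dict(zip(columns, row)))
print(g.serialize(format="turtle"))
```

The content is identical in all three cases; what changes is the set of operations each structure affords (traversing a hierarchy, joining relations, following links), which is exactly the sense in which the logical model determines the computation.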
There are several notations for data modelling. The most common model is the "Entity relationship model" (E/R), because it depicts data in terms of the entities and relationships described in the data. The E/R notation yields a model, because its aim is to represent the reality as an abstraction: "this model incorporates some of the important semantic information about the real world" (Chen 1976). The conceptual model then represents concepts (entities) and connections (relationships) between them. The notation itself is an abstraction.

3.2 Ontologies and Knowledge

The same approach is adopted by ontologies, i.e. conceptual data models translated through a formal language. Again, we range between database theory and markup languages: the data-centric approach of the DBMS, the formal declarative language (XML) and the assertion (the triple) as a graph (RDF). We could say that we are dealing with the Semantic Web approach and the LOD perspective (Bizer, Heath, and Berners-Lee 2009).

In ontology design, the model is the conceptual framework. The ontology is the conceptualization of an abstraction by identifying those features, in the form of classes and predicates, which enable us to describe a domain observed from a specific point of view. And the aim is to move from data to information in order to extract knowledge, i.e. to reveal the latent, the yet unknown. Revealing knowledge through the analysis of, for example, the context is necessary in order to enable inferences (Daquino and Tomasi 2015). Modelling, for instance, persons, dates, places or events is an attempt to standardize a conceptual approach through relationships (Gonano et al. 2014).

EDM⁹, CIDOC CRM¹⁰ and FRBRoo¹¹, and SKOS¹² – just to give some heterogeneous examples (see, for example, Doerr 2009) – are nothing but points of view on reality. We could assert that ontologies are the shared ideas concerning a domain, expressed with classes and properties, relationships between concepts, rules and constraints. A domain ontology is a formal, abstract representation, useful in order to semantically describe, i.e. to model, a collection of resources and to reason on data, with an inferential aim and a problem-solving approach. Another attempt to model the reality is the translation of an XML Schema, e.g. TEI, into an ontology (see, for example, Eide 2014; Ciotti and Tomasi 2015).

9 Europeana Data Model (Accessed April 20, 2017).
10 CIDOC Conceptual Reference Model (Accessed April 20, 2017).
11 FRBRoo (Accessed April 20, 2017).
12 SKOS (Accessed April 20, 2017).

So, ontologies are models, and I think that conceptualization is the core of modelling, with reference to the issue of knowledge organization.¹³ In fact, ontologies are both a way to express the semantics of a domain and a method to organize knowledge through concepts. I personally believe that in the DH domain, ontology engineering is the most effective and persuasive modelling strategy: it is a method enabling us to reproduce the brain's reasoning, i.e., the humanistic approach to interpretation.

The act of moving from controlled vocabularies to ontologies reflects the need to express the semantics that remain hidden in the absence of a content model. The creation of an interconnection through typed links is the key to solving relationships between entities in order to reveal real knowledge.
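To make the idea of typed links concrete, here is a minimal, hypothetical sketch in Python with rdflib: the DCTERMS and FOAF vocabularies are real, but the resources and URIs are invented for illustration. The point is that the creator of the letter is linked as an entity, not recorded as a bare string, so the connection can be traversed and reasoned upon:

```python
# A minimal sketch of typed links between entities; invented example URIs,
# real vocabulary terms (DCTERMS, FOAF).
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import DCTERMS, FOAF, RDF

EX = Namespace("http://example.org/")
g = Graph()

letter = EX["letter-1"]
author = EX["vespasiano"]

g.add((author, RDF.type, FOAF.Person))
g.add((author, FOAF.name, Literal("Vespasiano da Bisticci")))
g.add((letter, DCTERMS.creator, author))          # a typed link, not a bare string
g.add((letter, DCTERMS.spatial, EX["florence"]))  # context: the place as an entity

# Because the link is typed, a query (or a reasoner) can traverse it:
for person in g.objects(letter, DCTERMS.creator):
    print(g.value(person, FOAF.name))
```

A controlled vocabulary alone would stop at the string "Vespasiano da Bisticci"; the ontology-based graph allows the same name to connect the letter to every other statement made about the person.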
4. Final Remarks

Another definition, from the Linked Open Data perspective, is the concept of model as a conversion method:

Linked Data modelling involves data going from one model to another. For example, modelling may involve converting a tabular representation of the data to a graph-based representation. Often extracts from relational databases are modeled and converted to Linked Data to more rapidly integrate datasets from different authorities or with other open source datasets.¹⁴

So, the act of converting data into a different format, or using another data structure, is again a practice. The model gives the theoretical basis for a practical activity.

Finally, a model is also a question of interface. The template for a web page, for instance, is a model. The design of a page in a Content Management System (CMS) is a model. The architecture of information, understood as the position of logical components of a page, is a model. The iconic symbols are models of the reality. So, when we model a web resource, we choose a way to represent information in the visual interface: we define spaces for components and we use icons as an abstraction of an idea, we adopt glyphs as a representation of graphemes.

In conclusion, models are a guideline, models are shared by a community, models are the representation of a domain, models refer to languages and data structures, models are a visual and iconic abstraction. Ontologies are models. Modelling is my favorite job.

13 An interesting event related to these themes is the "three-day workshop held at Brown University on data modelling in the humanities, sponsored by the NEH and the DFG, and co-organized by Fotis Jannidis and Julia Flanders". Knowledge Organization and Data Modeling in the Humanities: An ongoing conversation, 2012 (Accessed April 20, 2017).
14 Best Practices for Publishing Linked Data. W3C Working Group Note 09 January 2014 (Accessed April 20, 2017).

5. Discussion

Günther Görz's questions (Q) and my answers (A)

Q1. I plead for a more restricted and terminological use of the term "model". As Nelson Goodman already wrote in "Languages of Art": "Few terms are used in popular and scientific discourse more promiscuously than 'model'" (171).

A1. The scope of this paper is to reflect on the concept of model by using multiple perspectives. So, yes, the term is used here in order to refer to different levels, but this is exactly what I would like to get across: the ambiguity, the multiplicity and the polysemy of the word "model".

Q2. It is true that modelling in DH is a challenging topic, but I can't see that DH already has its own notion of models and modelling compared with other interdisciplinary enterprises with computer science such as the social sciences, (cognitive) psychology, or economics.

A2. The literature in DH regarding the concept of model and modelling is so vast that I could assert that DH is elaborating its own definition.

Q3. For the formal language, the distinction between abstract data types, (concrete) data structures and their implementation should be noticed. Nevertheless, e.g. in the mentioned case of digital scholarly editions, we should distinguish between a model (the concepts, properties, constraints, structures, rules, etc.) and a particular result.

A3. In digital scholarly editing, the concept of model refers to the choice of the features to be formalized at each level of the scholarly activity. In this sense, the edition is a model: it represents the interpretative act of the editor.
Q4. I see a similar problem in calling TEI a model. In my view, TEI is first of all a formal language with an informal semantics. This view imposes several severe constraints, e.g. a fundamental tree structure due to its commitment to XML. So, I still see a deficit on the theoretical side; for me, TEI is yet more a representational framework than a model.

A4. From the formal point of view, TEI is not a model. It lacks semantics. But, from the point of view of models as a shared definition of elements and attributes related to the classification of hermeneutic aspects of a domain, TEI is a model.

Q5. Another issue is the depth of semantic modelling. In this respect, EDM, CIDOC CRM + FRBR and SKOS are not on the same level. I think we are in substantial agreement on what is said about formal ontologies: the question of semantics is tightly connected to a well-defined inference relation. Taking up TEI again, marking up named entities such as place names and representing places in a formal ontology such as CIDOC CRM are on reasonably different abstraction levels. The anything but simple question is then how the relationship between TEI elements and CRM concepts can be formally recorded and mapped into a (partial) semantic and interoperable representation in terms of CRM, expressible in RDF/LOD. In the actually used formal systems, the most advanced of which are Description Logics (cf. OWL), we can deal with underspecification, but not with vagueness. This is one of the very big challenges of the humanities and science.

A5. EDM, CIDOC CRM, FRBR, and SKOS are not on the same level, I agree. The semantic depth is surely different. But they are all models, i.e. points of view: how to integrate metadata vocabularies (EDM), how to use an event-centric approach in the cultural heritage (CIDOC CRM), how to document the stratification of object descriptions (FRBR), how to express structured subjects in a domain (SKOS). So, again, they are not all models from the viewpoint of formal languages to describe concepts, i.e. ontology, but because of their attempt to define a shared conceptualization. Translating the TEI Schema into an ontology (e.g. an OWL representation), or thinking of TEI as a CRM, is a challenging issue (see, for example, Eide 2014; Ciotti and Tomasi 2015).

Q6. Finally, reasoning with formal ontologies is, up to now, deductive reasoning. But for reasoning in the humanities and in science other forms are also needed, something that Leibniz called "ars inventoria".

A6. Yes, formal reasoning is the final aim. And the role of ontologies is to enable inferences through the Description Logic formalism. But this is just one of the various ways to interpret the concept of model.

References

Bizer, Christian, Tom Heath, and Tim Berners-Lee. 2009. Linked Data – The Story So Far. International Journal on Semantic Web and Information Systems 5 (3): 1-22.
Buzzetti, Dino. 2002. Digital Representation and the Text Model. New Literary History 33 (1): 61-88.
Chen, Peter. 1976. The Entity-Relationship Model – Toward a Unified View of Data. ACM Transactions on Database Systems 1 (1): 9-36.
Ciotti, Fabio, and Francesca Tomasi. 2016. Formal Ontologies, Linked Data and TEI Semantics. Journal of the Text Encoding Initiative 9 (Accessed April 21, 2016).
Daquino, Marilena, and Francesca Tomasi. 2015. Historical Context Ontology (HiCO): A Conceptual Model for Describing Context Information of Cultural Heritage Objects. In Metadata and Semantics Research, 424-36. Berlin: Springer.
Doerr, Martin. 2009. Ontologies for Cultural Heritage. In Handbook on Ontologies, 463-86. Berlin: Springer.
Eide, Øyvind. 2014. Ontologies, Data Modeling, and TEI. Journal of the Text Encoding Initiative 8 (Accessed April 21, 2016).
Gonano, Ciro M., Francesca Mambelli, Silvio Peroni, Francesca Tomasi, and Fabio Vitali. 2014. Zeri e LODE. Extracting the Zeri Photo Archive to Linked Open Data: Formalizing the Conceptual Model. In Digital Libraries (JCDL), 289-98. London: IEEE.
McCarty, Willard. 2004. Modeling: A Study in Words and Meanings. In Companion to Digital Humanities. Oxford: Blackwell (Accessed April 21, 2016).
Minsky, Marvin L. 1995. Matter, Mind and Models (Accessed April 21, 2016). (Rev. version of the essay in Semantic Information Processing, ed. Marvin Minsky. Cambridge, MA: MIT Press, 1968.)
Orlandi, Tito. 1999. Linguistica, sistemi, e modelli. In Il ruolo del modello nella scienza e nel sapere (Roma, 27-28 ottobre 1998) (= Contributi del Centro Linceo Interdisciplinare, n. 100), 73-90. Roma: Accademia dei Lincei.
Peroni, Silvio, Francesca Tomasi, Fabio Vitali, and Jacopo Zingoni. 2014. Semantic Lenses as Exploration Method for Scholarly Articles. In Bridging between Cultural Heritage Institutions, 118-29. Berlin: Springer.
Tomasi, Francesca. 2013. Vespasiano da Bisticci, Lettere. A Digital Edition. Bologna: University of Bologna (Accessed April 21, 2016).
work_bi44b5urgrcc7bl5s7za7xg5wm ----

Digital human modelling over four decades
Keith Case, Russell Marshall, Steve Summerskill
Journal contribution posted on 21.03.2016 (Marshall_caseFinalVersion.pdf, 1.33 MB).

This paper aims to provide a retrospective of the use of a digital human modelling tool (SAMMIE) that was perhaps the first usable tool and is still active today. Relationships between digital human modelling and inclusive design, engineering design and ergonomics practice are discussed using examples from design studies using SAMMIE and government-funded research. Important issues such as accuracy of representation and handling multivariate rather than univariate evaluations are discussed together with methods of use in terms of defining end product users and tasks. Consideration is given to the use of the digital human modelling approach by non-ergonomists, particularly with respect to understanding of the impact of human variability, jurisdiction and communication issues.

Categories: Mechanical Engineering not elsewhere classified
Keywords: Digital human modelling, SAMMIE, Inclusive design
Funding: The funding by the Science Research Council and its successor the Engineering and Physical Sciences Research Council is gratefully acknowledged.
School: Mechanical, Electrical and Manufacturing Engineering
Published in: International Journal of the Digital Human
Citation: CASE, K., MARSHALL, R. and SUMMERSKILL, S., 2016. Digital human modelling over four decades. International Journal of the Digital Human, 1 (2), pp. 112-131.
Publisher: © InterScience. Version: AM (Accepted Manuscript). Acceptance date: 07/01/2016. Publication date: 2016-06-27. ISSN: 2046-3375; eISSN: 2046-3383.
Publisher statement: This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence. Full details of this licence are available at: https://creativecommons.org/licenses/by-nc-nd/4.0/
This paper was accepted for publication in the journal International Journal of the Digital Human and the definitive published version is available at http://dx.doi.org/10.1504/IJDH.2016.077408.

work_bma2bp7brfhebklg5fbvhutz7m ----

GeoJournal: Spatially Integrated Social Sciences and Humanities
Editor-in-Chief: B. Warf. 6 issues/year.
▶ An interdisciplinary journal devoted to all branches of spatially integrated social sciences and humanities
▶ Covers human geography, human-environment interactions, geographical information science, medical and health geography, and geographic education
▶ Presents research notes, commentaries, reports, and reviews
▶ Contributors include scholars from around the world
▶ 100% of authors who answered a survey reported that they would definitely publish or probably publish in the journal again

GeoJournal is an international journal devoted to all branches of spatially integrated social sciences and humanities. This long-standing journal is committed to publishing cutting-edge, innovative, original and timely research from around the world and across the whole spectrum of social sciences and humanities that have an explicit geographical/spatial component, in particular in GeoJournal's six major areas:
- Economic and Development Geography
- Social and Political Geography
- Cultural and Historical Geography
- Health and Medical Geography
- Environmental Geography and Sustainable Development
- Legal/Ethical Geography and Policy

In addition to research papers GeoJournal publishes reviews as well as shorter articles in the form of research notes, commentaries, and reports. Submissions should demonstrate original and substantive contributions to social science and humanities from a geographical perspective. Submissions on emerging new fields such as GeoEthics, Neogeography, Digital Humanities and other emerging topics are also welcome.
work_bodh2qxngbct3pgyhmg464vjwm ----

ORIGINAL PAPER

Multi-Embodiment of Digital Humans in Virtual Reality for Assisting Human-Centered Ergonomics Design

Kevin Fan¹ • Akihiko Murai¹ • Natsuki Miyata¹ • Yuta Sugiura² • Mitsunori Tada¹

¹ Digital Human Research Group, Human Informatics Research Institute, National Institute of Advanced Industrial Science and Technology, 2-3-26, Aomi, Koto, Tokyo 135-0064, Japan (kevin.fan@aist.go.jp; a.murai@aist.go.jp; n.miyata@aist.go.jp; m.tada@aist.go.jp)
² Department of Information and Computer Science, Faculty of Science and Technology, Keio University, 3-14-1, Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan (sugiura@keio.jp)

Received: 12 June 2017 / Accepted: 8 September 2017 / Published online: 6 October 2017
© Springer Nature Singapore Pte Ltd. 2017
Augment Hum Res (2017) 2:7, https://doi.org/10.1007/s41133-017-0010-6

Abstract: We present a multi-embodiment interface aimed at assisting human-centered ergonomics design, where traditionally the design process is hindered by the need to recruit diverse users or by the utilization of disembodied simulations to address designing for most groups of the population. The multi-embodiment solution is to actively embody the user in the design and evaluation process in virtual reality, while simultaneously superimposing additional simulated virtual bodies on the user's own body. This superimposed body acts as the target and enables simultaneous anthropometrical ergonomics evaluation for both the user's self and the target. Both virtual bodies of self and target are generated using digital human modeling from statistical data, and the animation of the self-body is motion-captured while the target body is moved using a weighted inverse kinematics approach with end effectors on the hands and feet. We conducted user studies to evaluate human ergonomics design in five scenarios in virtual reality, comparing multi-embodiment with single embodiment. Similar evaluations were conducted again in the physical environment after the virtual reality evaluations to explore the post-VR influence of the different virtual experiences.

Keywords: Multi-embodiment, Embodied interaction, Ergonomics evaluation, Digital human

Introduction

Our human body is the interface between ourselves and the world, with which we take in perceptual information, make cognitive decisions, and perform actions on the basis of our understanding of our own body [37]. However, as each individual human is uniquely gifted with a different body, it may become a barrier for us to comprehend the body capabilities of a different individual [39].

This barrier complicates matters in a situation closely related to our everyday lives, which is the product and environment design of our surroundings [24]. It is often desired to have designers and engineers create products that can accommodate most groups of the population, as we could see benefits such as increased efficiency, comfort, and safety within the environment when proper ergonomics considerations are taken [33]. For example, we have seen evidence of how different users' anthropometry influences product design in furniture [28]. The challenge for designing is therefore accounting for the diverse population with physical body deviation.
With the emergence of computing technology, we have seen approaches to assisting the ergonomics design process ranging from completely simulated digital human modeling (DHM) [3] to fully interactive virtual reality (VR), where researchers explored the potential of utilizing interactive embodiment of users in the virtual environment (VE) for ergonomics design [2, 13].
We are therefore intrigued to explore whether the aug- mentation approach of multi-embodiment could 1) assist ergonomics design to the same extend or exceed perfor- mance as alteration approaches and 2) generalize well in post-VR exposure where the user could make ergonomics judgment in the physical environment for another person’s absent body. In this paper, we present our contributions as follows: – A multi-embodiment body interface that takes the approach of augmentation for assisting ergonomics design. – Address reachability and accessibility as illustrative application, which are issues present in ergonomically different people. – A user study to compare between our approach with conventional alteration approach for ergonomics design in VR. – Explore augmented perception in post-VR influence of using our unaltered body as body of reference for ergonomics evaluation of other bodies in the physical environment. Related Work Our work is related to the following research areas: 1) using body for affordance judgment, 2) perception of affordance in VR, 3) virtual assisted ergonomics design, and 4) augmented body image for training. Body for Affordance Judgment Our human body not only is accountable how we physi- cally interact with our world, but also constitutes the basis of how we define our world. In recent years, this concep- tion of the relationship between the body and the world has been formulated as embodied cognition [42]. Based on this perception of the world and our understanding of our body, Fig. 1 Multi-embodiment (left), user view in HMD (top right), user wears HMD and motion capture (bottom right) 7 Page 2 of 14 Augment Hum Res (2017) 2:7 123 we construct our judgment of affordance [12], the prop- erties of the world that affords to be acted upon, of our surrounding environment. It is discussed from an embodied perception conception that our understanding of our body morphology influences our perceived affordance [38]. Early studies have shown that humans judge differently the affordance of a climbable stair based on their height and leg length [46], passing through apertures from body sizes [47], or affordable to sit from leg length [25]. Changing the morphological param- eters of our body therefore have effects on our perceived size of the world [16, 44] and therefore affordance, such as walking under barriers with altered body height [45]. These findings show promising evidence that our body plays a major role in our perception of affordance of the environment. Moreover, it is suggested that we could deduct the affordance of an observed different body than us [49]. Our research builds on this conception that body is a factor for affordance and creates an interface to augment ourselves with multi-embodiment for assisting affordance judgment in VR. Affordance Perception in VR VR provides an appropriate pipeline for studying the change in affordance perception due to the changing environment or body perceptual information as VR allows us to manipulate our perceptual cues [20]. In VR, it is relatively easier than physical environment to alter our body morphology, such as hands [21], feet [17], body size [35], and body height [20], so that we perceive the affor- dance of graspable object, crossable gap or aperture, or action decision on whether to duck or step over a pole. However, a particular concern has been that the spatial perception in VR is found to be compressed compare to the real world [50]. 
This underestimation could be due to the measurement method, technical factors of the head-mounted display (HMD), or the compositional factor of the degree of replication of the virtual to the real world [41]. An approach to reduce this margin of error is introducing an embodied body avatar in the VE [27, 34]. Furthermore, by involving embodied action, the affordance judgment is further improved [20, 27].

From these researches, we can see the potential of using VR as a platform for affordance judgment, with the benefit of agile prototyping of changing body morphology and the VE. VR can therefore be an ideal approach for affordance-based design [24] for an ergonomically efficient environment, which we address through our research in a multi-embodied system.

Virtual Assisted Ergonomics Design

Historically, ergonomics design consisted of physical fitting trials involving ergonomics experts, and a diversity of test users was often employed [10], but this could be time-consuming. There has thus been an increase in extending ergonomics design to computer-aided design (CAD) and DHM methods due to the ease of virtual prototyping [52] and virtual fitting trials [26]. However, this disembodied approach could make it difficult to provide an accurate human simulation as the movements are programmed [6], and there could be concern of detachment between the designer and the users represented by the virtual agents [26], where the emotional detachment could hinder accurate design [7].

Embodied, interactive VR has become an emerging platform in ergonomics design. Pontonnier et al. [36] investigated the difference between ergonomics evaluation in the physical environment and VR, with results suggesting that although VR is slightly inferior to the physical environment, the difference is insignificant and the potential of VR is greater. VR is widely utilized in manufacturing [5] and industrial workstation [29] design and usability evaluation. It is also getting attention for universal design, for evaluation against a target user group [18, 48]. For a universal-prone design goal of evaluating against different ergonomic bodies, these researches usually employed diverse users of the population or simulated the perceptual information, so that a general user embodies the body of the target population. In this research, we take a different approach, where our perceptual information is not altered to embody the target. Rather, the target's body is augmented upon our body, so we employ a multi-embodiment interface.

Augmented Body Image

With VR, it is relatively easy for us to alter and augment our body image and still feel body ownership (e.g., [31]), which may also influence our cognitive and behavioral processes. In particular, closely related to our research is the augmentation of the body image with extra bodies or limbs. Augmenting our visual sensation with extra body images in either a displaced or co-located location has been utilized for action learning. YouMove is an AR mirror which superimposes an abstract stick body to assist in dance training [1]. Han et al. [15] developed AR-Arm, superimposing extra hands as indicators to train the user's correct hand movement for learning Tai-chi. Yan et al. [51] took an out-of-body approach to show both the instructor's and the user's body images.

While augmenting the body image with extra bodies has been focused on action learning, in our research the focus is on spatial perception with the augmented body image. This could be plausible, as we have discussed that the body is a reference for affordance and therefore for ergonomics design.
This Augment Hum Res (2017) 2:7 Page 3 of 14 7 123 could be plausible as we have discussed that body is a reference for affordance and therefore ergonomics design. Furthermore, our embodied approach of the augmented extra body could possibly strength the connection between the self-body and the augmented body. Multi-Embodiment Interface We developed a multi-embodiment interface that super- imposes extra virtual bodies to the user’s own body in VR, so that the user embodies more than one body, with the goal of assisting ergonomics evaluation and design in a VE. We do so by generating the extra body with DHM from population statistical data. The extra body’s movement in VR is calculated from weighted inverse kinematics by specifying the end effectors on the user’s two hands and feet. In addition, human bone joint constraints can be specified to limit the movement capacity of the extra body, e.g., imposing joint constraints on the lower limbs to simulate a wheelchair occupant’s body. Design Understanding another person’s bodily information, e.g., anthropometric dimensions and muscle strength, is essen- tial in understanding how to design and develop products for that person. With VR, recently we have seen various approaches of stimulating the users for them to feel as though embodying a different body. We could then make ergonomics judgment using this different virtual body in VR. However, this approach of transitioning ourselves into a different body, through which completely altered our perceptual information, could be problematic as well when we consider further about the post-VR, everyday life application of this approach in the physical environment. In the physical environment, our perceptual information is not altered and is embodied with our body that accom- panied us for many years; therefore, even after experi- encing and understanding another person’s body in VR, it is a possible concern that we may ‘‘overwrite’’ the altered experience in the VR as we gradually revert back to our original perceptual information in the physical environ- ment. It is therefore a barrier of expanding the VR ergo- nomics evaluation into our original everyday life. Our approach therefore is to augment our perceptional information, rather than completely altering to a different perceptual information from a different body. This aug- mentation is therefore the multi-embodiment interface, where the user maintains his original perceptual informa- tion and body, but is augmented with extra bodies that move and interact with the environment along with the user. The system automatically handles the movement simulation of the extra body in relation to the user so the user can interact in VR naturally with their original body. This way, the user possesses a common reference point between the physical environment and the virtual envi- ronment, i.e., the user’s own body. We envision that through ‘‘using our body as the reference,’’ the augmen- tation experienced in VR could be persisted in the physical environment so that we may ‘‘remember’’ the different body’s ergonomics information, e.g., reachability, in the physical environment even without any augmentation. System Overview The system (Fig. 2) consists of a head-mounted display (Oculus Rift CV1), a remote controller (Oculus Remote), 8 motion capture cameras (OptiTrack), and a motion capture suit with 37 optical markers. 
The HMD displays the VE, where the user’s motion-captured body is visualized through a DHM along with other virtual models of the environment to be ergonomically assessed. The remote controller is used to enable the user to scale the target virtual models for ergonomics design in the VE in real time. The motion capture system continuously captures the user at 120Hz, and our software calculates the user’s DHM movement as well as the multi-embodied DHM’s weighted inverse kinematics movement. The following sections will go into detail about the structure of the system (Fig. 3). Embodying Digital Humans It is widely known that presenting a virtual avatar for the users to embody into has multi-dimensional benefits for the overall experience [43]. In most current VR experience, the focus has been more on enabling the agency of the avatar, while the avatar may not be a close representation of the users in dimension. This approach is suitable for most situations, as the sense of agency can induce a stronger sense of ownership even for relatively abstract avatars [22]. However, for the application of VR to ergo- nomics design, the proper anthropometric representation would be crucial. As aforementioned, each individual person has different anthropometric factors such as size and shape, which influences our affordance and ergo- nomics judgment. Therefore, in our system, users are embodied into digital humans that are a closer represen- tation of their own anthropometric factors. Generating Self-Digital Human The user’s digital human avatar is generated from the implementation of ‘‘Dhaiba’’ [11]. Dhaiba is capable of generating very detailed and customized human model based on each individual’s measurement and the 7 Page 4 of 14 Augment Hum Res (2017) 2:7 123 anthropometric dimensions database, accounting for the generalized user population and agile prototyping. In our system, we specify the height and weight scale of the user. Dhaiba is then able to construct a generalized DHM from the anthropometric dimension database. Visuomotor Agency A static DHM is only part of the embodiment, which we need to allow user agency to strengthen the sense of embodiment. To achieve the embodied visuomotor corre- lation of the DHM in the VE and the user’s actual body movement, we employ the method of full body motion capturing the user’s movement. The user wears a motion capture suit, and the captured marker position is streamed into our software to animate the DHM. The DHM is divided into two modules: the skin surface mesh generated from the anthropometric data as discussed in the previous section, and the armature link module, which defines the skeletal joint of the digital human. The inverse kinematics computation from the captured markers updates the rotation of each digital human joint in syn- chrony with the user. The skin surface mesh is then com- puted as the linearly weighted sum of the joint movement according to the Skeletal Subspace Deformation algo- rithm [23], enabling visuomotor correlation as the user moves. Superimposed Multi-Embodiment In the multi-embodiment interface, in addition to embodying the self-DHM as described in the previous sections, an extra DHM is also superimposed on the user. The generation of the extra DHM is using the same method as the self-DHM, but is specified with different anthropo- metric data to simulate the multi-embodiment of two Fig. 2 Equipment Fig. 3 System structure Augment Hum Res (2017) 2:7 Page 5 of 14 7 123 different bodies. 
In addition, the movement of the extra DHM is calculated in a different manner than self. Inverse Kinematics of the Superimposed DHM While the user’s embodied self-DHM should move in accordance with the user’s physical body via motion cap- ture, the extra superimposed DHM should take a different approach due to the deviation in anthropometry. Our sys- tem aims at assessing the different interaction affordance of humans brought about by the difference in anthropometry. For example, finding the difference in reachability of (and how they reach) a cup on the table between an adult and a child. Therefore, the superimposed DHM should attempt to reach the same point as the user’s self with the limbs, but the posture should be simulated according to the DHM anthropometry. To calculate for the posture of the superimposed DHM, we use a weighted inverse kinematics (IK) method by defining the reaching end effectors on the superimposed DHM’s hands and feet. The end effectors are moved together along with the user’s motion-captured hands and feet (three on each hand and feet). The feet end effectors have larger weights to keep the DHM from floating (Fig. 4 left, the taller DHM is user self). Additional joint con- straints of the DHM can be applied so as to constrain the IK calculation if desired. For example, in the wheelchair DHM discussed later in the user study, the joints of the lower limb are constrained (Fig. 4 right). Real-Time Modifying Objects in VR As a requirement for the user study to be detailed in the next section, we added functionality of the VR multi-em- bodiment interface to allow users to real-time scale and move the objects in VR, as a method of fast prototyping the ergonomic optimal size of the furniture. We opted for a embodied scaling and translation approach, where the user’s embodied movement of their arms could modify the width and height of the objects (Fig. 5) or translate them. This is triggered through the Oculus remote controller and tracking the user’s left-hand position with the OptiTrack cameras. The scaling factor S is computed as S ¼ c þ ðY � aÞ � d � c b � a ð1Þ where Y is the captured hand position in real-world units of millimeter, and a, b, c, d are functional parameters specific to each object’s original height and position so that the scaled height is always on the same level as the user’s hand. The translation factor is simply a linear relation to Y. During the user study, the controls were set as either scaling or translation depending on the scenario, but not both, as not to complicate the user’s tasks. User Study We conducted a user study in two phases to validate the multi-embodiment interface. Our main goal was to study 1) how does the multi-embodiment (ME) interface compare to single embodiment (SE) in assisting ergonomics design in VR and 2) can exposure to a VR multi-embodiment have effect on a post-VR evaluation in the physical environment without augmentation. We observed the accuracy of the ergonomic judgment and completion time in comparison between ME and SE. Participants We recruited 8 male participants. They are aged between 22 to 31 years old (AVG: 24.75, SD: 3.06). Their height was between 163 to 184 centimeters (AVG: 173.1, SD: Fig. 4 Weighted IK for animating the superimposed DHMs while self is motion captured (left), joint constraint on the lower limb (right, transparent for visualization) 7 Page 6 of 14 Augment Hum Res (2017) 2:7 123 7.0). 
User Study

We conducted a user study in two phases to validate the multi-embodiment interface. Our main goal was to study 1) how the multi-embodiment (ME) interface compares to single embodiment (SE) in assisting ergonomics design in VR, and 2) whether exposure to VR multi-embodiment can have an effect on a post-VR evaluation in the physical environment without augmentation. We observed the accuracy of the ergonomics judgment and the completion time in comparing ME with SE.

Participants

We recruited 8 male participants, aged between 22 and 31 years old (AVG: 24.75, SD: 3.06). Their height was between 163 and 184 centimeters (AVG: 173.1, SD: 7.0). All participants participated in both phases of the study, where they first evaluated user study 1 followed by user study 2.

Environmental Setup

The user study was conducted in a modified and furnished living-room-style living laboratory (Fig. 6). The room was surrounded by eight OptiTrack motion capture cameras. The furniture in the room was removed during the VR user study and placed back in the post-VR user study. The participants wore the motion-captured HMD and suit.

Experimental Design

Our main evaluation criterion in the current studies is a clear comparison of the proposed ME interface with the conventional SE approach, in the area of ergonomics design, with the aim of designing one set of products to be used simultaneously by anthropometrically different populations. Therefore, both user studies one (VR) and two (post-VR) were conducted in both conditions of augmentation ME and alteration SE. The eight participants participated in both conditions in a within-subjects design, and their order was counterbalanced (four evaluated ME first followed by SE, and four vice versa). The trials of the two conditions were conducted on two different days (e.g., ME on day one and SE on day two).

For the current studies, our focus in ergonomics design is on reachability and accessibility, which are greatly influenced by human anthropometry. To this end, to provide a common evaluation target for all participants, we defined two targets (Fig. 7). The first is a five-year-old, 110-cm-tall (population statistical mean) kid. The second is a 160-cm-tall wheelchair person, with a seated height of 135.6 cm and an eye level of 124.7 cm (within one SD [32]); the width of the wheelchair is 84.9 cm.

The kid was evaluated for reachability, while the wheelchair person was evaluated for accessibility. The participants' tasks were to evaluate and scale or move several pieces of furniture in real time, with the goal of designing one dimension for each piece of furniture that is usable when both the participant's self and the target are taken into consideration, e.g., designing one refrigerator that is reachable by the kid but not so low as to strain the back of an adult user.

Concretely, we arbitrarily defined four pieces of furniture as the reachability evaluation objects: a refrigerator, a kitchen sink, a door handle, and a wheelchair handle. In the post-VR experiment, they were represented as the actual furniture in the living laboratory, and in the VR experiment they were represented as similar CG models. For simplicity, the participants evaluated the reachability in height only, as one that is appropriate for both self and the target kid. For the accessibility evaluation, the scenario was a corridor interval between a closet and a table, for which the participants' task was to ensure that it allows the wheelchair occupant to pass through. One note is that, generally speaking, in terms of spatial accessibility, the accessible region of a wheelchair occupant is accessible to a normal person, and the more spacious the better (rather than, as in the case of reachability, compromising between two people's heights). Therefore, in this evaluation the participants were instructed to evaluate for the minimal passable interval.

Fig. 5 Participants could scale the objects in VR to evaluate for optimal design
Fig. 6 User study environment
User Study

We conducted a user study in two phases to validate the multi-embodiment interface. Our main goals were to study (1) how the multi-embodiment (ME) interface compares to single embodiment (SE) in assisting ergonomics design in VR, and (2) whether exposure to multi-embodiment in VR can have an effect on a post-VR evaluation in the physical environment without augmentation. We observed the accuracy of the ergonomic judgments and the completion time in comparison between ME and SE.

Participants

We recruited 8 male participants, aged between 22 and 31 years old (AVG: 24.75, SD: 3.06). Their heights were between 163 and 184 centimeters (AVG: 173.1, SD: 7.0). All participants took part in both phases of the study, completing user study 1 first, followed by user study 2.

Environmental Setup

The user study was conducted in a living laboratory: a modified and furnished living-room-style space (Fig. 6). The room was surrounded by eight OptiTrack motion capture cameras. The furniture in the room was removed during the VR user study and placed back for the post-VR user study. The participants wore a motion-captured HMD and suit.

Fig. 6 User study environment

Experimental Design

Our main aim in the current studies is a clear comparison of the proposed ME interface against the conventional SE approach in ergonomics design, with the goal of designing one set of products to be used simultaneously by anthropometrically different populations. Therefore, both user study one (VR) and user study two (post-VR) were conducted under both conditions: augmentation (ME) and alteration (SE). The eight participants took part in both conditions in a within-subjects design, with counterbalanced order (four evaluated ME first followed by SE, and four vice versa). The trials of each condition were conducted on two different days (e.g., ME on day one and SE on day two).

For the current studies, our focus within ergonomics design is on reachability and accessibility, which are greatly influenced by human anthropometry. To provide a common evaluation target for all participants, we defined two targets (Fig. 7). The first is a five-year-old, 110-cm-tall (population statistical mean) kid. The second is a 160-cm-tall wheelchair occupant, with a seated height of 135.6 cm and an eye level of 124.7 cm (within one SD [32]); the width of the wheelchair is 84.9 cm. The kid was evaluated for reachability, while the wheelchair occupant was evaluated for accessibility. The participants' task was to evaluate and, in real time, scale or move several pieces of furniture, with the goal of designing one dimension for each piece that is usable when both the participant's self and the target are taken into consideration, e.g., designing one refrigerator that is reachable by the kid but not so low as to strain the back of an adult user.

Fig. 7 Target kid and wheelchair occupant in the user study (with a 170-cm-tall avatar as a reference)

Concretely, we arbitrarily defined four pieces of furniture as the reachability evaluation objects: a refrigerator, a kitchen sink, a door handle, and a wheelchair handle. In the post-VR experiment they were represented by the actual furniture in the living laboratory, and in the VR experiment by similar CG models. For simplicity, the participants evaluated reachability in height only, seeking a height appropriate for both self and the target kid. For the accessibility evaluation, the scenario was a corridor interval between a closet and a table, and the participants' task was to ensure that it allows the wheelchair occupant to pass through. Note that, generally speaking, in terms of spatial accessibility, a region accessible to a wheelchair occupant is also accessible to a walking person, and the more spacious the better (rather than, as in the reachability case, a compromise between two people's heights). Therefore, in this evaluation the participants were instructed to evaluate the minimal passable interval. The scenario setup in both study one and study two is summarized in Fig. 8.

Fig. 8 Five scenarios (four reachability + one accessibility) in both VR and real environment

Study 1: VR Evaluation

Our first study was conducted in a complete virtual environment with motion-captured participants. Each participant completed 10 trials: 5 target scenarios (4 kid + 1 wheelchair) × 2 conditions (ME/SE). Since the same five scenarios were used for both conditions in a within-subjects design, we conducted the two conditions on two different days to counter some of the carryover effects.

In the ME condition, the user's self-DHM is motion-captured, and the target DHM (kid or wheelchair occupant) is superimposed with postures estimated from weighted IK. For the wheelchair occupant DHM, joint constraints were put on the DHM's lower limbs to exclude them from the IK simulation, and the simulated wheelchair moves in accordance with the participant's captured mass center (pelvis), i.e., as the participant walks in the VE, the wheelchair moves in synchronization. In the SE condition, rather than multi-embodiment, the participant alternates between the self-DHM and the target kid DHM using the Oculus remote controller, and the participant's motion capture directly controls the currently embodied DHM. For the SE wheelchair scenario, the participants were required to sit and move around in a wheelchair to simulate the embodiment of the virtual wheelchair DHM. However, due to safety concerns raised by alternating between sitting as a wheelchair occupant and standing as self while wearing an HMD, and because the minimal interval for both self and wheelchair is the maximum of the two, the participants were embodied as the wheelchair occupant the whole time for the accessibility evaluation, instead of alternating.

For each trial, the participants were instructed to evaluate and design the height of the target furniture, or the interval for accessibility, by scaling or moving the objects with embodied movement, so that it can be easily reached or accessed by both self and the target body. Each trial is completed when the participant is satisfied with the design and signals the experimenter, and we measure the completion time. To validate the accuracy of each design, the deviation of its height from an optimal design (detailed in a following section) is compared.

Study 2: Post-VR Physical Environment Evaluation

A potential drawback of SE in VR is that, because SE embodies the user in a different body in VR, when the user returns to the real environment the difference in the body in relation to the environment may override the experience gained in VR. Concretely, as an example, while a person may successfully make an ergonomics evaluation for a kid in VR using SE (in the embodied kid body, he has become shorter), he may have difficulty making an ergonomic evaluation for the kid in the physical environment, where he does not embody a kid's body. In comparison, one envisioned benefit of ME is that it maintains our original body across VR and the physical environment, and augments it with an additional body that is referenced to the body of the user; we can then use this to our benefit to perceive the relation between our body and the target body. If this relation can be learned, then it may be possible that, as we move our body in our physical environment, we can imagine the superimposed body as learned in VR and therefore make reachability and accessibility judgments for others using just our own body (without any augmentation). This post-VR evaluation in the physical environment is therefore the aim of the second user study.

Immediately after the VR user study (on each day), the participants conducted the evaluations for the same scenarios, but this time in the physical environment. As the furniture could not be altered easily in the physical environment, the participants were instructed to use their hands to signal their estimated optimal design, and the hands' 3D position was recorded as the participant's preferred modification height or interval. The same objective measurements of completion time and design deviation were recorded.
Result

A total of 20 trials was conducted per participant: (4 reachability + 1 accessibility) × 2 (VR/post-VR) × 2 (ME/SE). We collected measurements of the completion time in each trial and of the final altered dimension that the participant deemed appropriate for both self and the target. Four VR trial data sets (2 ME, 2 SE) were unretrievable due to corrupted data files, leaving a total of 76 VR trials and 80 post-VR trials.

Completion Time

To gauge how ME might increase the time efficiency of conducting ergonomics evaluations for multiple people simultaneously, we measured the completion time of each trial.

VR Completion Time

We observed that the completion time in ME trials was significantly lower than in SE trials (Fig. 9). An ANOVA revealed a significant main effect of the test condition (ME/SE) on the completion time (F(1, 78) = 8.79, p < 0.01). During our observation of the participants, we noticed that ME trials were completed noticeably faster because participants could see both their own reachability and the target's reachability at the same time, which may have helped them reach their decided dimension faster. On the other hand, during SE trials we observed that many participants seemed unable to make up their minds on the compromised optimal dimension between self and the target. For example, participants repeatedly scaled the target furniture up and down, back and forth, as they transitioned between self and the kid. During the interview after both ME/SE experiments were conducted, three participants noted that they definitely felt ME was effective in helping them reach a decision faster, although one participant noted that while ME might be faster, SE might yield better accuracy (which we examine in a following section).

Fig. 9 Task completion time in each scenario: VR (left), physical environment (right)
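For readers who want to run this style of test on their own logs, a one-way ANOVA over two condition groups can be computed with SciPy as sketched below; the arrays are synthetic placeholders, not the study's measurements.

```python
# Minimal sketch of the one-way ANOVA used for the completion-time
# comparison. The times below are synthetic stand-ins, NOT the study's data;
# the study reports F(1, 78) = 8.79 on its own trials.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
me_times = rng.normal(loc=45.0, scale=15.0, size=38)   # ME trial times (s)
se_times = rng.normal(loc=65.0, scale=20.0, size=38)   # SE trial times (s)

f_stat, p_value = stats.f_oneway(me_times, se_times)
df_within = len(me_times) + len(se_times) - 2
print(f"F(1, {df_within}) = {f_stat:.2f}, p = {p_value:.4f}")
```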
Physical Environment Completion Time

To explore whether there is a difference in real-world application after exposure to the different VR training methods, we conducted physical environment trials after the VR trials and recorded the completion time. The real-environment trials were noticeably faster than the VR trials, with an average trial completion time of 19.7 s compared to 56 s in VR. This is foreseeable from factors such as the VR experience being foreign to participants, or the fact that in VR they could actually alter the dimensions, which takes more time than signaling with the hands. However, although we observed that the evaluation approaches of participants differed (e.g., crouching down in the real environment to simulate the kid's eye level after exposure to SE, versus remaining standing after exposure to ME), an ANOVA on physical environment completion time did not show a significant effect of condition (Fig. 9).

Designed Dimension

Although we found that ME was significantly faster than SE in the VR trials, we had concerns that it might come with less accuracy relative to the optimal dimension, due to shorter decision times and the lack of actually taking the target's perspective. We therefore measured the participant-designed dimension in each trial and compared it to the optimal dimension.

Optimal Dimension

First, we describe how we arrived at the definition of optimal dimension used in the user study. For reachability, we define the optimal compromised dimension as the reaching height that requires the least combined whole-body joint torque of the participant's self and the target kid. The reason is that minimizing joint torque in turn minimizes musculo-skeletal biomechanical stress, which is ideal for the comfort of the body [14]. To determine the joint torque of the participants at different reaching heights, we formulated an estimation from simulations of the participants' DHMs at an upright posture using the OpenSim software. The simulation is simplified in that the reaction force is set to the body weight at the center of mass (i.e., perpendicular to the floor). To verify that this simplified OpenSim joint torque conforms to the actual joint torque, we conducted an experiment measuring and calculating the overall whole-body joint torque of a male adult (height: 168 cm, weight: 51 kg) at different reaching heights. The test subject's posture was captured with Vicon motion capture cameras, and the reaction force from the ground was measured with force plates. The captured postures and reaction forces at the different reaching heights were pipelined to OpenSim to calculate the whole-body joint torque. Comparing the torque calculated from the test subject's captured measurements with the simplified simulation using the upright-posture DHM, we did not observe a significant difference, and we therefore conclude that simulating with only the upright posture and weight is feasible. Accordingly, the eight subjects' and the target kid's body joint torques could be calculated from their DHMs, and we found a combined minimal torque at a reaching height of 120 cm for all participants (Fig. 10).

For accessibility, we define the optimal spatial width between the shelf and the table as the recommended pass-through width for wheelchairs [4]. As our wheelchair is wider than the one in the guideline, our scaled optimal width is 94.9 cm.

Fig. 10 Simulation of combined torque between 8 participants and the target kid reveals minimum at 120 cm
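Once per-person torque curves are available, the optimal-height definition reduces to a one-dimensional search. A minimal sketch, assuming illustrative quadratic torque curves in place of real OpenSim output:

```python
# Sketch of the optimal-height search: given per-person whole-body torque
# curves over candidate reaching heights, the compromise height minimizes
# the combined torque. The torque curves are invented placeholders, not
# simulation output.
import numpy as np

heights_cm = np.arange(90, 181, 5)                    # candidate reaching heights
adult_torque = 0.02 * (heights_cm - 150.0) ** 2 + 30  # adult: comfortable ~150 cm
kid_torque = 0.05 * (heights_cm - 95.0) ** 2 + 20     # kid: comfortable ~95 cm

combined = adult_torque + kid_torque
optimal = heights_cm[np.argmin(combined)]
print(f"combined-torque minimum at {optimal} cm")
```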
VR Dimension

The designed dimension in each VR scenario was measured, and a comparison between the ME and SE conditions did not reveal a significant main effect with respect to the deviation from the optimum (Fig. 11). This could be an indication that ME can perform to the same standard as SE in anthropometrical evaluations. Also of note is that all VR reachability evaluations were lower than the optimum, i.e., the participants appeared to favor the optimum for the kid more than for themselves (although the kid's optimum was precisely at 120 cm).

Physical Environment Dimension

To measure whether the ME/SE conditions might influence post-VR evaluation in the physical environment, where in our everyday lives we have no augmentation, the physical environment dimension evaluations were conducted after the VR experiments for each condition. We again observed that the majority of participant reachability evaluations were lower than optimal, regardless of condition (Fig. 11). However, similar to the VR study, we did not observe a significant main effect of either ME or SE carrying over to the real-world evaluation.

Fig. 11 Designed dimension for each scenario (reachability: height, accessibility: width): VR (left), physical environment (right)

VR Physical Deviation

As the real-environment evaluation was conducted after the VR evaluation, and we observed that both were lower than the optimum, we were interested in exploring whether there was any correlation between the deviation of the VR and real judgments with respect to the conditions. An ANOVA revealed a significant effect for reachability (the first four scenarios) (F(1, 59) = 4.40, p < 0.05), but not for accessibility (Fig. 12). This observation on reachability may stem from the fact that during the VR and post-VR trials of ME, most participants stayed standing upright. During SE, on the other hand, the participants were standing upright in the VR trials but tried to crouch to mimic the kid in the real trials, standing upright again for the self-evaluation, and this inconsistency may have resulted in the greater deviation.

Fig. 12 Deviation between designed dimension in VR and physical environment

Discussion

Feedback

From the general feedback of the users, three expressed a definite inclination toward multi-embodiment. Positive feedback included preferring to see both self and the target simultaneously without needing to transition; one of these participants mentioned that he felt he forgot the dimension for the other body every time he transitioned. In particular, the majority of participants preferred ME in the accessibility scenario, where they could simply walk around while the embodied occupant followed. On the other hand, three participants expressed a clear preference for SE, indicating that they enjoyed being transitioned into the kid's body with a different eye level, or that sitting in a wheelchair made them more confident in making judgments on accessibility. One participant suggested a hybrid between ME and SE, as sometimes during ME he felt the target body was occluding him from seeing his own body.

Physical Environment Evaluation Observations

One of the most intriguing observations we made during the two user studies is how the condition in VR, either ME or SE, influenced how participants evaluated the subsequent physical environment. Three out of the four participants who started with condition SE, when asked to make the evaluations again in the physical environment for the kid target they had experienced in VR, immediately asked the experimenter for the height of the kid (to which we were unable to reply due to the nature of the experiment). All of these participants then began to crouch down, trying to simulate different heights and test for reachability. When asked about their strategy during the interview after the experiment, they noted that they were trying to recreate seeing what the kid would see.
On the other hand, none of the four participants who started with condition ME asked the experimenter for the height of the target kid. Three of these participants did not adjust their height levels at all during the real-world evaluation. One participant crouched down, but noted that he was testing for lower reachability rather than simulating the kid, as he had also crouched during the VR trials. The participants described making their judgments by trying to "imagine" where and how the kid would reach with respect to their own arm movements.

Body as Reference in Physical Environment

Our observation that the participants exhibited different strategies in the physical environment depending on their VR experience (ME/SE) (Fig. 13) was promising, in that they attempted to evaluate for another body based on their own. Here, we discuss two potential benefits of ME, which allows for a constant reference of the body across the virtual and the physical, over SE for ergonomics evaluation.

Fig. 13 Participants exhibited different strategies for physical environment evaluation: maintaining their normal posture after ME (left), crouched to mimic the kid after SE (right)

First, SE requires altering our perceptual information and action capabilities to simulate those of the target. However, this can be difficult for certain targets, especially when there is a discrepancy between the user's and the target's action capabilities, e.g., embodying a physically disabled or slow target. ME is unhindered by this limitation, as the user can simply move normally while the targets are simulated with accuracy by the computer.

Furthermore, we speculate from an action-perception affordance view [40] that the perceived affordance change in SE could be temporary, lasting only during the alternation; that is, when the user returns to the original body, the original action capability overrides what was experienced in SE, and therefore the SE approach may not carry over well into the real environment. As the action capabilities of the users are consistent in ME throughout the VE and the physical environment, users can use them as a reference to evaluate other targets' action capabilities.

Although we did not observe a significant difference in designed dimension between ME and SE in our experiment, we speculate this could be due to the brief exposure time (ME: 3 min 52 s, SE: 5 min 28 s), or because the within-subjects experimental procedure could have biased the physical environment evaluations, as the experimental scenarios (furniture) were the same. Nevertheless, our other finding was that there was a significant reduction in the deviation between VR and real judgments in ME compared to SE. Since we also found that ME can perform as well as SE in VR (no significant difference), this means that through increased exposure and accuracy training with ME in VR, we may begin to see a greater difference between ME and SE in ergonomics evaluation in the physical environment.

Limitation and Future Work

Our system introduced a multi-embodiment interface with the goal of assisting simultaneous evaluations of different bodies and enhancing post-VR awareness.
A noticeable limitation of the current implementation is that the IK pulls the target body forward when the user tries to reach far, and since the target bodies used in the user studies were shorter than the participants, occlusion by the target body is prone to happen. Possible solutions include constraining the joint angles of the torso, or the center of mass, within a margin of the user's torso to reduce occlusion. Also, although IK can be sufficient as an initial exploration of simulating the superimposed body, a more ergonomically grounded approach, such as taking the range of motion or joint torques into consideration when simulating the body's motion, has potential as a next step for the multi-embodiment interface.

One interesting potential yet to be explored in the current study is superimposing various bodies that can be simulated to precision with the help of computing systems, which is one clear merit of the multi-embodiment interface. Imagine that we want to evaluate for the muscle strength, or muscle reaction time, of different individuals; this would be difficult for single embodiment unless we utilized external exoskeletons to hinder our original muscles. Although our focus has been on anthropometry, multi-embodiment could also show the correct simulation of different muscle strengths, so that we could use our body's biomechanical factors as the reference for others.

Moreover, while the current study examined multi-embodiment in VR and extended it to post-VR, an AR approach of multi-embodying in the real environment could also be beneficial. The benefit of the AR approach is that users would not have to "remember" the VR experience for the real-world evaluation and instead could vividly see the different body's affordances.

Conclusion

The multi-embodiment interface is a system that superimposes extra DHMs on the person, where the superimposed DHM is animated by inverse kinematics calculation. The DHMs are generated from statistical population data to give a more accurate anthropometrical representation (which was critical given the focus of the current study) for anthropometrical ergonomic evaluations. The benefits of ME include (1) providing more consistent perceptual information between the virtual and the real, and therefore a better tendency to elicit awareness in the post-VR real world, and (2) utilizing computer simulations so that we can more easily embody targets that are otherwise difficult to embody. We conducted user studies to investigate whether the multi-embodiment interface enables users to conduct reachability and accessibility evaluations in VR, as well as extending to the physical environment, in comparison with the more conventional method of single embodiment. We observed that multi-embodiment was significantly faster than single embodiment, with no significant difference in the resulting evaluations. However, we observed no significant difference between multi-embodiment and single embodiment in the post-VR evaluation. Nonetheless, we observed a significant main effect in the correlation between the VR evaluation and the post-VR evaluation in multi-embodiment, which shows its potential over single embodiment through using our body as the reference in both VR and the physical environment. We also observed an interesting phenomenon of how participants changed their methods of physical environment evaluation based on their VR experience.
Future directions of this research include multi-embodying diversified DHMs, as well as investigating its potential in AR scenarios.

Compliance with Ethical Standard

Conflict of interest: On behalf of all authors, the corresponding author states that there is no conflict of interest.

References

1. Anderson F, Grossman T, Matejka J, Fitzmaurice G (2013) YouMove: enhancing movement training with an augmented reality mirror. In: Proceedings of the 26th annual ACM symposium on user interface software and technology, pp 311-320. ACM
2. Aromaa S, Väänänen K (2016) Suitability of virtual prototypes to support human factors/ergonomics evaluation during the design. Appl Ergon 56:11-18
3. Baek SY, Lee K (2012) Parametric human body shape modeling framework for human-centered product design. Comput-Aided Des 44(1):56-67
4. Brewerton J, Darton D, Foster L (1997) Designing lifetime homes. Joseph Rowntree Foundation, York
5. Choi S, Jung K, Do Noh S (2015) Virtual reality applications in manufacturing industries: past research, present findings, and future directions. Concur Eng 23(1):40-63
6. Chryssolouris G, Mavrikios D, Fragos D, Karabatsou V (2000) A virtual reality-based experimentation environment for the verification of human-related factors in assembly processes. Robotics Comput-Integr Manuf 16(4):267-276
7. Clarkson PJ, Coleman R, Keates S, Lebbon C (2013) Inclusive design: design for the whole population. Springer Science & Business Media, Berlin
8. Delangle M, Petiot JF, Poirson E (2016) Using motion capture to study human standing accessibility: comparison between physical experiment, static model and virtual ergonomic evaluations. Int J Interact Des Manuf (IJIDeM) 11(3):515-524
9. Di Gironimo G, Matrone G, Tarallo A, Trotta M, Lanzotti A (2013) A virtual reality approach for usability assessment: case study on a wheelchair-mounted robot manipulator. Eng Comput 29(3):359-373
10. Drury C, Coury B (1982) A methodology for chair evaluation. Appl Ergon 13(3):195-202
11. Endo Y, Tada M, Mochimaru M (2014) Dhaiba: development of virtual ergonomic assessment system with human models. In: Proceedings of the 3rd international digital human modeling symposium, Paper 58
12. Gibson JJ (2014) The ecological approach to visual perception: classic edition. Psychology Press, Hove
13. Grajewski D, Górski F, Zawadzki P, Hamrol A (2013) Application of virtual reality techniques in design of ergonomic manufacturing workplaces. Procedia Comput Sci 25:289-301
14. Hamaoui A, Hassaïne M, Watier B, Zanone PG (2016) Effect of seat and table top slope on the biomechanical stress sustained by the musculo-skeletal system. Gait Posture 43:48-53
15. Han PH, Chen KW, Hsieh CH, Huang YJ, Hung YP (2016) AR-Arm: augmented visualization for guiding arm movement in the first-person perspective. In: Proceedings of the 7th augmented human international conference 2016, p 31. ACM
16. van der Hoort B, Guterstam A, Ehrsson HH (2011) Being Barbie: the size of one's own body determines the perceived size of the world. PLoS One 6(5):e20195
17. Jun E, Stefanucci JK, Creem-Regehr SH, Geuss MN, Thompson WB (2015) Big foot: using the size of a virtual foot to scale gap width. ACM Trans Appl Percept (TAP) 12(4):16
18. Li K, Duffy VG, Zheng L (2006) Universal accessibility assessments through virtual interactive design. Int J Human Factors Model Simul 1(1):52-68
19. Lin Q, Rieser J, Bodenheimer B (2012) Stepping over and ducking under: the influence of an avatar on locomotion in an HMD-based immersive virtual environment. In: Proceedings of the ACM symposium on applied perception, pp 7-10. ACM
20. Lin Q, Rieser J, Bodenheimer B (2015) Affordance judgments in HMD-based virtual environments: stepping over a pole and stepping off a ledge. ACM Trans Appl Percept (TAP) 12(2):6
21. Linkenauger SA, Leyrer M, Bülthoff HH, Mohler BJ (2013) Welcome to wonderland: the influence of the size and shape of a virtual hand on the perceived size and shape of virtual objects. PLoS One 8(7):e68594
22. Ma K, Hommel B (2015) The role of agency for perceived ownership in the virtual hand illusion. Conscious Cognit 36:277-288
23. Magnenat-Thalmann N, Thalmann D (1990) Human body deformations using joint-dependent local operators and finite-element theory. Making them move. Morgan Kaufmann, San Mateo, pp 243-262
24. Maier JR, Fadel GM (2009) Affordance based design: a relational theory for design. Res Eng Des 20(1):13-27
25. Mark LS, Vogele D (1987) A biodynamic basis for perceived categories of action: a study of sitting and stair climbing. J Mot Behav 19(3):367-384
26. Marshall R, Case K, Porter J, Sims R, Gyi DE (2004) Using HADRIAN for eliciting virtual user feedback in design for all. Proc Inst Mech Eng, Part B: J Eng Manuf 218(9):1203-1210
27. Mohler BJ, Creem-Regehr SH, Thompson WB, Bülthoff HH (2010) The effect of viewing a self-avatar on distance judgments in an HMD-based virtual environment. Presence: Teleoperators Virtual Environ 19(3):230-242
28. Mokdad M, Al-Ansari M (2009) Anthropometrics for the design of Bahraini school furniture. Int J Ind Ergon 39(5):728-735
29. Nguyen H, Pontonnier C, Hilt S, Duval T, Dumont G (2016) VR-based operating modes and metaphors for collaborative ergonomic design of industrial workstations. J Multimodal User Interfaces 11(1):97-111
30. Nishida J, Takatori H, Sato K, Suzuki K (2015) CHILDHOOD: wearable suit for augmented child experience. In: Proceedings of the 2015 virtual reality international conference, p 22. ACM
31. Ogawa N, Ban Y, Sakurai S, Narumi T, Tanikawa T, Hirose M (2016) Metamorphosis hand: dynamically transforming hands. In: Proceedings of the 7th augmented human international conference 2016, p 51. ACM
32. Paquet V, Feathers D (2004) An anthropometric study of manual and powered wheelchair users. Int J Ind Ergon 33(3):191-204
33. Pheasant S, Haslegrave CM (2016) Bodyspace: anthropometry, ergonomics and the design of work. CRC Press, Boca Raton
34. Phillips L, Ries B, Kaeding M, Interrante V (2010) Avatar self-embodiment enhances distance perception accuracy in non-photorealistic immersive virtual environments. In: Virtual reality conference (VR), 2010 IEEE, pp 115-118
35. Piryankova IV, Wong HY, Linkenauger SA, Stinson C, Longo MR, Bülthoff HH, Mohler BJ (2014) Owning an overweight or underweight body: distinguishing the physical, experienced and virtual body. PLoS One 9(8):e103428
36. Pontonnier C, Dumont G, Samani A, Madeleine P, Badawi M (2014) Designing and evaluating a workstation in real and virtual environment: toward virtual reality based ergonomic design sessions. J Multimodal User Interfaces 8(2):199-208
37. Proffitt DR (2006) Embodied perception and the economy of action. Perspect Psychol Sci 1(2):110-122
38. Proffitt DR, Linkenauger SA (2013) Perception viewed as a phenotypic expression. In: Prinz W, Beisert M, Herwig A (eds) Action science: foundations of an emerging discipline. MIT Press, Cambridge, pp 171-198
39. Ramenzoni VC, Davis TJ, Riley MA, Shockley K (2010) Perceiving action boundaries: learning effects in perceiving maximum jumping-reach affordances. Atten Percept Psychophys 72(4):1110-1119
40. Ramenzoni VC, Riley MA, Shockley K, Davis T (2008) Carrying the height of the world on your ankles: encumbering observers reduces estimates of how high an actor can jump. Q J Exp Psychol 61(10):1487-1495
41. Renner RS, Velichkovsky BM, Helmert JR (2013) The perception of egocentric distances in virtual environments - a review. ACM Comput Surv (CSUR) 46(2):23
42. Shapiro L (2010) Embodied cognition. Routledge, London
43. Steed A, Pan Y, Zisch F, Steptoe W (2016) The impact of a self-avatar on cognitive load in immersive virtual reality. In: Virtual reality (VR), 2016 IEEE, pp 67-76
44. Stefanucci JK, Geuss MN (2009) Big people, little world: the body influences size perception. Perception 38(12):1782-1795
45. Stefanucci JK, Geuss MN (2010) Duck! Scaling the height of a horizontal barrier to body height. Atten Percept Psychophys 72(5):1338-1349
46. Warren WH (1984) Perceiving affordances: visual guidance of stair climbing. J Exp Psychol: Human Percept Perform 10(5):683
47. Warren WH Jr, Whang S (1987) Visual guidance of walking through apertures: body-scaled information for affordances. J Exp Psychol: Human Percept Perform 13(3):371
48. Watanuki K (2010) Development of virtual reality-based universal design review system. J Mech Sci Technol 24(1):257-262
49. Weast JA, Shockley K, Riley MA (2011) The influence of athletic experience and kinematic information on skill-relevant affordance perception. Q J Exp Psychol 64(4):689-706
50. Willemsen P, Colton MB, Creem-Regehr SH, Thompson WB (2009) The effects of head-mounted display mechanical properties and field of view on distance judgments in virtual environments. ACM Trans Appl Percept (TAP) 6(2):8
51. Yan S, Ding G, Guan Z, Sun N, Li H, Zhang L (2015) OutsideMe: augmenting dancer's external self-image by using a mixed reality system. In: Proceedings of the 33rd annual ACM conference extended abstracts on human factors in computing systems, pp 965-970
52. Zorriassatine F, Wykes C, Parkin R, Gindy N (2003) A survey of virtual prototyping techniques for mechanical product development. Proc Inst Mech Eng, Part B: J Eng Manuf 217(4):513-530

work_bohtu2wrvzf3dg6g7k4mpjdh6m ----
The data in the scientific model is meant to be uniform; by doing the same experiment twice, if the research is sound and the experiment has been set up properly, the data should come out to be similar, if not exactly the same. Data that lends itself to measurability, like numbers, computerized data, or facts, is valued by the sciences, and this conception of visible and tangible data is what has shaped our modern understanding of numbers, charts, sets, and tables as more related to laboratory experimentation than humanistic study.

But what of humanities data? Unlike scientific studies, which seek to repeat answers to confirm their truth, humanistic inquiry takes an assumption and answers it in several different ways. A simple question can have multiple answers, and the value of a good research question is that it can produce a variety of responses. Compare this to the value of repeatable scientific data. How do you manage data that comes out of humanistic inquiry when it is not as mathematically measurable and regular as scientific data? How do humanists view and manage their own research output, and do they conceive of it as manageable data? To truly tend to humanist data management needs, it is important to understand these questions and look for answers within the community.

One method of understanding the variety of data available to humanists is to recognize the different kinds of data humanities research can produce. For example, Research Data Canada refers to the "Knowledge Map of Information Studies" study, which, among other things, collected 130 definitions of data formulated by forty-five scholars (Zins 2007). Within it, all data, regardless of format or medium, are recognized. Research Data Canada's broad definition of research data reads:

Facts, measurements, recordings, records, or observations about the world collected by scientists and others, with a minimum of contextual interpretation. Data may be any format or medium taking the form of writings, notes, numbers, symbols, text, images, films, video, sound recordings, pictorial reproductions, drawings, designs or other graphical representations, procedural manuals, forms, diagrams, work flow charts, equipment descriptions, data files, data processing algorithms, or statistical records. (Research Data Canada 2017)

Humanities researchers produce most, if not all, of these types of data. The multimedia aspect of humanities research is only part of the complex puzzle of how to organize data management. One must understand the theoretical underpinnings of humanities research and the data it produces in order to appreciate the often much smaller and more nuanced data sets of humanist scholars and the unique nature of humanist inquiry. Taken from a professor of Digital Medieval Studies, the following excerpt explores this phenomenon:

Humanities' data has depth in small universes. Our material has the capacity to unfold inwards, as it were, to disclose layer upon layer of insights and connections, within a comparatively tiny amount of data--almost an inverse matryoshka, as it were, where each inner doll is bigger and more complex than the one encasing it. (Bolintineanu 2016)
This is why traditional understandings of data seem foreign or unfit for use in a humanities context. Humanities data requires a level of inference and analysis that is divergent from scientific inquiry. It is changeable, shaped by everything from the tools used to analyse or present it to the scholars who attempt to interpret it. Perhaps Posner put it best in stating, "When you call something data, you imply that it exists in discrete, fungible units; that it is computationally tractable; that its meaningful qualities can be enumerated in a finite list; that someone else performing the same operations on the same data will come up with the same results. This is not how humanists think of the material they work with" (Posner 2015).

In our case, whether digital or traditional humanities research is concerned, the data produced often poses challenges to the information professional. Simply applying scientific understanding and practices to the field of humanities data management ignores the theoretical underpinnings of humanities research. Even when tools or analytical techniques from the sciences can be fit into a humanities-esque mold, disagreement exists about their appropriateness:

[DH visualization tools borrowed from the sciences] carry with them assumptions of knowledge as observer-independent and certain, rather than observer co-dependent and interpretative. [...] To begin, the concept of data as a given has to be rethought through a humanistic lens and characterized as capta, taken and constructed. (Drucker 2011)

This does not imply, however, a unified understanding of what constitutes data within the realm of scientific research and beyond (Funari 2014 or 2015?). Definitions abound, with their own inclusions and focus, even among scholars of the same university department (Whitmire, Boock, and Shutton 2015). It has been shown that academic institutions, federal funding agencies, and regulatory bodies all define 'data' uniquely (Joshi and Krag 2010). For example, the Tri-Council Agencies of Canada, made up of the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Social Sciences and Humanities Research Council (SSHRC), provide a definition for data in their policies for all grant-funded projects. The agencies note that research data "include observations about the world that are used as primary sources to support scientific and technical inquiry, scholarship and research-creation, and as evidence in the research process" (Tri-Agency Statement of Digital Principles on Digital Data Management 2016). A more agnostic definition, from ISO/IEC 2382:2015, defines data as "a reinterpretable representation of information in a formalized manner, suitable for communication, interpretation, or processing."

But even these definitions of data, rooted in scientific modes of understanding research, cloud how humanities scholars interpret their own research. They do little to bring the humanities or social sciences, which tend not to think of their findings as tractable, finite, or identically reproducible, into the realm of research data. In an effort to be more succinct, and to align ourselves more with humanistic data theory, we wish to present one more definition: data is "units of information observed, collected, or created in the course of research" (Erway et al. 2013). Importantly, Erway's definition presumes no scientific inquiry, quantitative analysis, or identically reproducible results. From here, we are better placed to understand the data management needs of digital humanist scholars.
Research data management

As with all projects, it is imperative to invest in a data management strategy in the digital humanities. As early as 1968, researchers were concerned that "librarians are less than ever before keepers of books; they are coming to be managers of data" (Hays, 1968, 5). More recently, literary scholars have become concerned with the 'computational turn', or the increasing reliance on computer science techniques to perform humanities research. This is necessarily different from the concept of the digital humanities, but it is responsible for what Manovich has termed the 'cultural analytics paradigm', whereby one assumes that the "big data" created by twenty-first century cultural production is vast and therefore unknowable (Hall 2013). Research data management, however, covers all aspects of creating, housing, maintaining, and retiring data (O'Reilly et al. 2012) and therefore makes these vast amounts of data knowable, sortable, and manageable.

The data lifecycle, although originally conceived for science data, is also applicable to humanities data management and can provide helpful guidelines for structuring a data management plan. The California Digital Library defines the data lifecycle as having eight steps: plan, collect, assure, describe, preserve, discover, integrate, and analyze (Strasser et al. 2012). By managing these steps, standardized and usable data is created; housed in a way that is stable, searchable, and findable; maintained through changes of file format, permutation, and manipulation; and retired to an archive in a sustainable fashion. Although all of the different permutations of Manovich's big data are unknowable, research data management makes them navigable and searchable.

Managing data created during the course of (digital) humanities research requires that the data manager pay attention to the special landscape in which they navigate to create, conceptualize, and analyze their data. Humanities research data management is, as Awre et al. (2015) point out, an example of Rittel and Webber's (1973) 'wicked problem', that is, a problem that is seen differently by different stakeholders. As opposed to a 'tame problem', where there exists one answer to each problem (for example, "How do I execute a search strategy on the library catalogue?"), a wicked problem has multiple solutions that are neither true nor false, just good solutions or bad solutions. As Awre et al. point out, the first step in reckoning with managing any amount of research data is to recognize the complexity of the problem. Keeping this necessary complexity in mind, it becomes obvious that individual projects require an individualized plan, and, to that end, we have used the experience of one particular humanities research data problem as a lens through which to view the subject.
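To make the lifecycle actionable, it can help to track each stage explicitly. A minimal sketch follows, with stage names from the California Digital Library model cited above and hypothetical example tasks:

```python
# Sketch: tracking a project's progress through the eight data lifecycle
# stages. Stage names follow the California Digital Library model; the
# tasks attached to each stage are invented for illustration.
STAGES = ["plan", "collect", "assure", "describe",
          "preserve", "discover", "integrate", "analyze"]

project = {
    "plan": ["agree on terminology and note categories with researchers"],
    "collect": ["photograph quires", "record collation statements"],
    "assure": ["check file names against the naming convention"],
    "describe": ["attach shelfmark, folio, and date metadata"],
    "preserve": ["deposit master copies in the institutional repository"],
    "discover": [], "integrate": [], "analyze": [],
}

for stage in STAGES:
    tasks = project.get(stage, [])
    status = "scoped" if tasks else "not yet scoped"
    print(f"{stage:>9}: {status} ({len(tasks)} task(s))")
```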
Method

Rimmer et al. point out that when designing digital resources for humanities scholars, "we need to better understand their research experiences and practices" (2008, 1378). The same principle extends to designing digital humanities data management strategies, and the research experiences and practices of scholars heavily informed the work of this project.

The case study arose out of collaborative work on the Digital Tools for Manuscript Study project, based jointly out of the University of Toronto Libraries and the Centre for Medieval Studies, to create modular, interoperable tools for scholars using digital medieval manuscripts. The project pairs a set of development outcomes with a scholarly counterpart to demonstrate the capabilities of the tools. One tool we wish to extend and improve upon in particular is called VisColl (Porter 2013). VisColl is designed to generate digital visualizations of the binding structure and physical makeup of a medieval manuscript. These digital visualizations are known to scholars as 'collation diagrams' and are of immense importance to scholars interested in the method, context, and afterlife of the creation of medieval codices. Traditionally, collation diagrams are produced by hand: the scholar carefully analyzes the binding of each section of pages in a manuscript (known as a 'quire'), producing diagrams of the quire's structure and developing what is known as a collation statement. VisColl is intended to make this process easier and more robust.

Scholars want to use VisColl to produce multiple visualizations and statements of extant Canterbury Tales manuscripts. Data collected by researchers will need to interact with the VisColl tool, which, in turn, will need to interpret and represent the data. As such, from the outset we recognized the need for a research data management strategy to streamline collection processes. We not only felt that this was essential to the success of the overall project, but we also saw an opportunity for progress in the world of digital humanities data management.

Two researchers (referred to as Researcher A and Researcher B) were sent overseas to visit multiple archives and libraries to examine several manuscripts. Instruction came from the lead scholar only; no prior input was given to the researchers by an information professional. From speaking with medieval scholars across several institutions prior to this research trip, it became very apparent that, even among specialties, there is no standard data collection practice shared by scholars. As the digital humanities continue to grow and develop in current and new fields, practices most likely will not be standardized across or among disciplines.

Upon their return, the researchers were interviewed separately about their experiences. At the same time, we examined the data files, both analog and digital, and developed basic organizational spreadsheets into which the researchers were to insert their data. The spreadsheets were created in order to gain a good understanding of the raw data we were dealing with, while creating a preliminary organizational scheme and preparing for data transfer to our collation tool. Throughout the post-collection process we kept in close contact with the researchers to ensure that our assumptions and ideas were valid and representative of their experiences. Our findings from this experience will be discussed in the following section.
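Machine-readable collation data sits at the heart of this workflow. The sketch below uses a simplified, hypothetical model (not VisColl's actual input schema) to show how quire structures recorded in the archive can become structured data from which a compact statement is derived:

```python
# Sketch of a simplified, hypothetical collation model: a manuscript as a
# list of quires, each with a leaf count and optional added/missing leaves.
# This is NOT VisColl's real input format, only an illustration of turning
# collection notes into structured, tool-readable data.
from dataclasses import dataclass, field

@dataclass
class Quire:
    number: int
    leaves: int                                   # regular leaves in the quire
    added: list = field(default_factory=list)     # positions of tipped-in leaves
    missing: list = field(default_factory=list)   # positions of lost leaves

def statement(quires):
    """Render a compact collation statement from the structured model."""
    parts = []
    for q in quires:
        s = f"{q.number}^{q.leaves}"
        if q.added:
            s += f"(+{len(q.added)})"
        if q.missing:
            s += f"(-{len(q.missing)})"
        parts.append(s)
    return " ".join(parts)

# Seven regular quires of 8 leaves, plus an eighth quire with one added leaf.
ms = [Quire(n, 8) for n in range(1, 8)] + [Quire(8, 8, added=[9])]
print(statement(ms))  # -> 1^8 2^8 ... 7^8 8^8(+1)
```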
Discussion

How can we as library professionals best aid humanities scholars in the area of data management? We operated under the assumption that the data collected by researchers would be input into a collation tool and used to develop a scholarly argument. By analyzing the data produced by the researchers and speaking with them about their process, we recognized four key findings that characterize a researcher's approach to manuscript study and provide a roadmap for information professionals: the influence of time, the universality of pre-data collection practices, the reliance on mixed-media data collection, and personalized information management.

i. Time: Scholars have very limited time to work with physical manuscripts. Any implemented data management processes must be cognizant of this.

Time was by far the most influential factor for researchers during the data collection process. One researcher's ideal data collection process was described simply as "More time." During the research visit, most of the items had not been digitized, meaning that if information was missed or questions remained, the researcher could not easily refer to the manuscript once back home. In addition, researchers must operate within the fixed hours of the library or archive they visit, leaving them an average of between six and eight hours per manuscript per day. Given the size and complexity of many manuscripts, certain texts required more time to analyze than others. This in turn affected research processes, data collection, and data management. Researcher B, for example, stated that these timeframes were "not really enough time to study a manuscript. It's just enough for collation and notes on interesting things". The more time given to researchers, the more information and detail can be collected.

Both researchers stated that they spent twice as much time post-processing their data as they spent with a manuscript. This is significant because it frames the way in which the researchers think of their work in repositories. Researcher B had even less time than normal when looking at certain select manuscripts, which affected the type and quality of data they were able to collect. Both researchers described their time as being dominated by taking notes about what they felt were the most important aspects of a manuscript as quickly as possible. Researcher B stated, "If I know I'm running out of time, I take as many pictures as I can and hope they are sufficient later on". It seems, in this instance, that work done in a repository often entails collecting information that is interesting, or has the potential to be interesting in the future, and relying on later information processing to make sense of the data that was gathered. The development of scholarly connections and arguments often happens far away from the material in question.

Ideally, any data management approach we develop for these scholars must not require excessive time. For this reason, any alteration to their research process must be minimal, or we risk non-adoption or misuse. It should be noted that, through speaking with other manuscript scholars, there are instances where time may be less of a challenge (e.g., when a scholar is interested in one specific manuscript, or in a few which are all housed at the same repository), but, for the most part, time is of the essence. Researchers want to spend their time examining a manuscript and opt for whichever collection method they feel is the fastest. In a broader context, all scholars operate under similar constraints and preferences.
Digital tools and their associated workflows need to feel natural and fit easily into the current research process, because if they do not, they are a waste of valuable time (Antonijevic 2015).

ii. Pre-Data Collection Preparation: Researchers conduct basic to very in-depth research about their objects of interest prior to a repository visit. This should be the stage of intervention for information professionals, at which clarity of research purpose has been reached and time is not a stressor.

Both researchers engaged in pre-visit preparation for this and other projects. Other researchers with whom we have spoken over the last few months indicate that they follow the same practice. Actions range from checking bibliographic cataloguing records to reading previous scholarship about the manuscripts. The researchers seek out an understanding of the research that has already been completed on the object, note items of interest, and identify areas where research may be lacking. These preparatory practices are closely related to time limitations. As one researcher pointed out, "I prep in advance, try to figure out how much time each manuscript will take me, especially with a limited amount of time in an archive". If there is time, or the research goal is very well articulated, researchers tend to think about organization, even in an abstract way, prior to their visit. For example, Researcher B cobbled together checklists they came across over years of study. Researcher A found information against which to compare findings in the scholarly canon. Every trip teaches them something new about their data collection process, and they recognize holes in their preparation that affect results.

What is interesting, however, is that as they reflected back on their collection processes, they consistently identified tactics well known to information professionals. For example, without using the precise information science terms, researchers recognized controlled vocabularies, pre-defined categories, improved workflows, tracked tags, and systematic file-naming as beneficial to their research. One researcher stated, "I wish I had thought about my categories prior to visits so my notes would have been organized and efficient".

Ultimately, this discussion was prompted not by the potential to reduce post-collection work on the part of the information professional, but by the potential for the researcher to save time in the archive. In the researchers' minds, better quality data does not mean less post-processing; the goal is to decrease the inconsistency of data collection. Researchers lamented notes that became less clear depending on situational factors. Information deemed nonessential is often left out, only to be missed later. It is their belief that with a more structured process, the frequency of these occurrences will decrease.

The often serendipitous nature of manuscript work is a concern for information managers and researchers alike. Researchers truly never know exactly what they will see when looking at a manuscript: their intention to study one aspect may be completely pushed aside upon the discovery of something unexpected. As with most research, what is fascinating to one researcher may not be worth a second glance from another. One simply cannot control for all of the possible variabilities in manuscripts and the whims of human nature. Any data management plans constructed prior to archival visits must reflect this potentially unstructured path of inquiry. Any attempt at imposing an immovably rigid system risks serving only a few users and ensures non-adoption by many others who do not trust the system or are not able to adapt their research practices around it.
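One lightweight way to deliver those pre-defined categories without a rigid system is a record template fixed during planning, with a free-text field left open for serendipitous findings; the field names below are hypothetical, not a standard:

```python
# Sketch of a pre-defined note-taking record for repository visits: fixed,
# sortable fields agreed before the trip, plus an open 'observations' field
# so unexpected discoveries still have a home. Field names are hypothetical.
from dataclasses import dataclass, asdict

@dataclass
class ManuscriptNote:
    shelfmark: str           # repository shelfmark, e.g. "CCC MS 144"
    folio: str               # folio or quire being described
    category: str            # one of the agreed categories (binding, script, ...)
    description: str         # the structured note itself
    image_files: tuple = ()  # file names of related photographs
    observations: str = ""   # free text for the unexpected

note = ManuscriptNote(
    shelfmark="CCC MS 144", folio="quire 3", category="binding",
    description="sewing stations irregular",
    image_files=("ccc-ms-144_quire-3_sewing_01.jpg",),
    observations="unusual repair thread; ask colleagues")
print(asdict(note))
```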
iii. Mixed Media: Manuscript researchers tend to produce a multitude of both digital and analog files during their visits.

This is not a trait solely of manuscript scholars; humanists of all disciplines subscribe to a "fusion of digital and 'pen and paper' practices" (Antonijevic 2015). Manuscript scholars rely heavily on do-it-yourself images, regardless of available digital surrogates. Based on responses from our two researchers, photos are often taken of details that were not caught in the digitization process, when something is too difficult to describe quickly, when something is an example of a particular phenomenon, when a feature looks interesting, or, as a last resort, to gather as much information as possible before running out of time. Researcher A even took a video of a part of a binding structure that was so different from the standard that they wanted to consult colleagues about it later. Due to their volume, and the difficulty of tracking, organizing, and storing them, photos are a particular problem. Researchers often spend a lot of time naming image files and linking them to their notes in some way. These files are often kept in greater disarray than others, with non-descriptive file names and non-standardized tags.

Alongside image data, researchers create textual notes about the manuscript they are examining. One researcher took notes entirely in analog form, while the other started analog but switched to digital upon finding it inefficient. Other researchers we talked to also report a mix of analog and digital notes depending on the individual scholar's preference, subject matter, and experience. Often, certain items are interesting but not easily expressed digitally. For example, a collation statement, such as the one for Cambridge, Corpus Christi College MS 144, which is notated I⁸-VII⁸ VIII⁸ (+1), is easier to write down manually than to enter into a text document because of the superscript notation. The preferred method for collecting digital notes is Microsoft Excel or Word, whereas analog notes tended to have a loose organizational structure of charts, columns, and sub-headings unique to the researcher.

Finally, researchers often create drawings of manuscript structures, either manually or digitally. These collation diagrams are essential to the researcher: the structure of a binding will often reveal oddities of book production or call into question the textual content of a manuscript. The diagrams are referred to countless times throughout research and used in publications. They are most commonly made with pencil and paper, but digital collation tools are becoming more usable. One researcher was able to visualize a binding pattern by creating a digital collation in Excel while keeping the data neat and organized.
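Because do-it-yourself photographs dominate the returned files, a small post-visit renaming pass can impose the systematic file names researchers themselves wished for; the shelfmark_folio_detail pattern here is one hypothetical convention, not a community standard:

```python
# Sketch: batch-renaming camera files (e.g. IMG_0412.JPG) into descriptive,
# sortable names using a shelfmark_folio_detail pattern. The mapping and the
# pattern are illustrative; a real project should agree on a convention first.
import os
import re

def safe(part: str) -> str:
    """Lowercase a name component and replace anything non-alphanumeric."""
    return re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")

def new_name(shelfmark: str, folio: str, detail: str, seq: int, ext: str) -> str:
    return f"{safe(shelfmark)}_{safe(folio)}_{safe(detail)}_{seq:02d}{ext}"

# mapping gathered while photographing (camera file -> what it shows)
log = {"IMG_0412.JPG": ("CCC MS 144", "f. 12r", "initial"),
       "IMG_0413.JPG": ("CCC MS 144", "quire 3", "sewing")}

for seq, (old, (shelf, folio, detail)) in enumerate(sorted(log.items()), start=1):
    ext = os.path.splitext(old)[1].lower()
    print(old, "->", new_name(shelf, folio, detail, seq, ext))
    # os.rename would perform the actual rename once the names are verified
```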
Both the interviews and analysis of the raw data collected by researchers made it very apparent that each researcher develops their own idiosyncratic data management system. There was a lack of standardized vocabulary, researchers disagreed on what labels to put on their data, and their organization grew organically as their data were produced. This presents a series of problems.

The creation of a standardized vocabulary is quite difficult within the field. "Things like how to record a manuscript's quire formulas are pretty standard, but the words we use are all over the place," said researcher A. For example, describing the cover of a manuscript can take many forms; one scholar might refer to the "boards", whereas another might call it a "cover", and another might lump it in with the general description "binding". As researcher B points out, "This is why pictures and diagrams are very useful as they can transcend the vagaries of language."

More difficult is the phenomenon of the organic development of a data management style. Researcher A commented, "Because I was collecting a whole pile of data and I wasn't sure what I would find I put everything into tiny categories; I started to refine a better system as I went through. By that point I had missed earlier data." Because of the restrictions of different repositories, it is difficult to return and retrieve the missing data. However, when asked if there was a particular feature of their data management system that they did not like, the researcher responded, "No, because if there was, I would change it. I wouldn't know [I didn't like a feature] until I found the magic work around difference." This individualization of research processes makes it extremely difficult for the information professional to create a pre-defined research procedure. Since each researcher has arrived at a method by testing different strategies to find what works for them and what does not, they will often be resistant to strategies deemed appropriate for the group but which they have personally found ineffective.

Problems for the information professional

For the information professional, then, creating a data management strategy can be difficult. For those who want data that is sortable and easily malleable, creating Microsoft Excel tables or asking for checklists to be completed might clash with a researcher's desire to take more photographs that cannot be sorted, or to take notes by hand with a more organic information structure. Time is always a factor in these decisions, as it puts further constraints on a data management plan. At some point in the research process, data collected on these trips will need to take on a digital form. Whether for analysis, preservation, sharing, or publication, all data will go through transformations to facilitate use. Given this inevitable outcome, information professionals need to work with scholars to identify a suitable point of intervention while communicating the benefit of such actions.

Our desire for order, through standardization, structure, and schemas, often runs counter to the more nuanced, organic, and personalized work of individual humanists. Terminology, itself sometimes a subject of scholarly argument, changes depending on the era of study or the background of the researcher. Because humanities research is often given to individual study, it leads to individual practices and vocabularies.
As such, dreams of standardized workflows, or even of a taxonomy of vocabulary terms, are fairly unrealistic in this climate. A compounding factor is the uniqueness of the material of study itself. No two manuscripts are exactly the same, nor are the scholars who look at them. Attempting to predict every scenario, oddity, or change of interest is impossible.

Best Practices

A result of this study has been the development of general best practices that will serve the manuscript scholarly community and the greater digital humanities community simultaneously. In the near future, we plan to test our ideas in the field with the same subjects to determine whether the approach holds value. As Antonijevic states, "although generic tools have better potential to meet research needs of a broader set of humanists, there is also space for a smaller-scale and more experimental tool building" (97). Our hope is that by creating best practices that work within the context of our manuscript-based research project, these smaller-scale tools will have broader application to the wider digital humanities environment.

The first practice is to work with scholars during the planning phase of the data life cycle. Information professionals should promote early planning as both beneficial to the overall research process and compliant with university and funding agency requirements. Our researchers favored preparation, with one noting, "I think the main thing is the more prep-work beforehand to be honest." Scholars can lay out expectations, create resources that are mutually agreeable to both the scholar and the information professional, and address any concerns before reaching the repository. Information managers can and should create basic tables or checklists at this time to ensure that data is standardized, sortable, and searchable (a sketch of one possible checklist template follows this list of practices).

The second practice, and perhaps the most important, is to follow a community approach to data management solutions. Information professionals should incorporate scholars during planning and use their insights to develop solutions. Providing them with a taxonomy or rigid, generalized rules does little to encourage scholars to make use of them, regardless of benefit. By working in a more interdisciplinary way, information managers can borrow from different research communities of practice that fit researchers' needs. For example, a field like archaeology, with its marriage of scientific and artistic practices, could be used as a reference point for humanities data management practices. "In archaeology," writes Antonijevic, "there is no real distinction between digital and non-digital tools" (49).

Finally, the third practice is to develop an approach that aligns with scholarly practice as closely as possible. In her ethnographic study, Antonijevic recognizes that "humanities scholars envision tools that would enable seamless and multidimensional flow of research activities from one phase to another and back, across multi-sided and multimedia corpora" (95). Indeed, our study participants imagined a futuristic world in which the collection of data in a library could be immediately organized, tagged, and connected to related information with little intervention. The first step in this direction would be careful consideration of the data and the processes that surround it.
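As an illustration of the first practice, the sketch below shows one form such a pre-visit checklist template might take, encoded in XML only for the sake of a concrete, machine-sortable example. It is hypothetical: the element names, categories, and controlled-vocabulary values are our own inventions and would in practice be negotiated with each researcher.

  <!-- Hypothetical checklist template; all names and values are invented
       for illustration and would be agreed upon with the researcher. -->
  <checklist manuscript="Cambridge, Corpus Christi College MS 144">
    <item category="binding"   vocabulary="boards|cover|binding" value="" notes=""/>
    <item category="collation" vocabulary="quire-formula"        value="" notes=""/>
    <item category="images"    vocabulary="photo|video|diagram"  value="" notes=""/>
    <item category="condition" vocabulary="good|fair|poor"       value="" notes=""/>
  </checklist>

Because the fields are pre-defined, notes recorded against such a template remain standardized, sortable, and searchable after the visit, while the free-text notes attribute preserves room for the unexpected.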
The easier it is to incorporate protocols into research, the more likely scholars are to make use of them, and the greater the potential for data sharing, long-term preservation, and reuse.

Conclusion

Based on our findings, we are beginning to develop an approach for the next stage of our research. Still in the preliminary planning stage, our hope is for the beginnings of an ontology that allows flexible changes to its collection and structure, a formalized checklist outlining the essential data that need to be collected, and a template, in both analog and digital form, that will add structure to researchers' notes and facilitate the use of tools later in the research cycle. All of this will be developed and vetted in close consultation with researchers to ensure their cooperation and our mutual success. The data will then be usable throughout our wider digital humanities project, and the structures and workflows that we develop for data collection and curation can be used for future digital humanities projects. It will serve to validate the tools we create for digital manuscript scholars and also to test our framework against the wider field of digital humanities.

As the digital humanities grow and adapt to new environments and applications, research data practices will come under necessary review. Although humanities scholars have always 'managed' their data, in that they track their research and use their own organizational systems, incorporating digital tools changes the way this process unfolds. In short, digital humanities research necessitates an approach perhaps more in line with the standardized scientific approach than with the traditionally individualized nature of humanist inquiry. As information professionals, we need to understand these differences and reconcile them with current research data management practices. We must challenge our traditional notions of research data management by placing ourselves within the context of different fields and theories. Information professionals are well suited for this role since we understand both the potential and the limitations afforded by different data sets and practices. In short, we must understand and accommodate both the digital and the humanities in our own work. Future efforts in the realm of DH data management will only be successful if we stake out a path in which both sides of the digital humanities coin are recognized and considered.

References

Abbas, June. 2010. Structures for organizing knowledge: exploring taxonomies, ontologies, and other schemas. New York, NY: Neal-Schuman Publishers.
Antonijevic, Smiljana. 2015. Amongst Digital Humanists: An Ethnographic Study of Digital Knowledge Production. New York, NY: Palgrave Macmillan.
Awre, Chris, et al. 2015. "Research Data Management as a 'Wicked Problem'." Library Review: 356-371.
Baofu, Peter. 2008. The future of information architecture: conceiving a better way to understand taxonomy, network and intelligence. Michigan: Chandos.
Bolintineanu, Alexandra. 2017. "DH History and Data". Lecture at Woodsworth College, CCR199H1S, Introduction to Spatial Digital Humanities, January.
Briney, Kristin. 2015. Data management for researchers: organize, maintain and share your data for research success. Exeter, UK: Pelagic Publishing Ltd.
Crompton, C., Lane, R. J., and Siemens, R. G. 2016. Doing digital humanities: Practice, training, research. New York, NY: Routledge.
Drucker, Johanna. 2011. "Humanities Approaches to Graphical Display". Digital Humanities Quarterly 5(1). Retrieved from http://www.digitalhumanities.org/dhq/vol/5/1/000091/000091.html
Erway, R., et al. 2013. "Starting the Conversation: University-wide Research Data Management Policy". Retrieved from http://www.oclc.org/content/dam/research/publications/library/2013/2013-08.pdf
Funari, Maura. 2015. "Research data and humanities: a European context". Italian Journal of Library and Information Science 5(1): 209-236.
Goven, Abigail and Raszewski, Rebecca. 2016. "The data life cycle applied to our own data". Journal of the Medical Library Association 103(1): 40-44.
Hall, Gary. 2013. "Toward a Postdigital Humanities: Cultural Analytics and the Computational Turn to Data-Driven Scholarship." American Literature 85(4): 781-809.
Hays, David G. 1968. "Data management in the humanities". Library, Information Science & Technology Abstracts, EBSCOhost (accessed October 4, 2016). http://www.dtic.mil/dtic/tr/fulltext/u2/668752.pdf
Heuser, Ryan and Le-Khac, Long. 2011. "Learning to Read Data: Bringing out the Humanistic in the Digital Humanities". Victorian Studies 54(1): 79-86.
ISO/IEC 2382:2015. 2015. "Information Technology: Vocabulary". Retrieved from https://www.iso.org/obp/ui/#iso:std:iso-iec:2382:ed-1:v1:en
Joshi, Margi and Krag, Sharon S. 2010. "Issues in Data Management". Science and Engineering Ethics 16: 743-748.
Kanare, Howard M. 1985. Writing the laboratory notebook. Washington, D.C.: American Chemical Society.
Krier, Laura and Strasser, Carly A. 2014. Data Management for Libraries. Library and Information Technology Association. Chicago: Neal-Schuman Publishers.
O'Reilly, Kelley, et al. 2012. "Improving University Research Value: A Case Study". SAGE Open 2(3). https://doi.org/10.1177/2158244012452576
Porter, Dorothy. 2013. "VisColl: Visualizing physical manuscript collation". Retrieved from https://github.com/leoba/VisColl
Posner, Miriam. 2015, June 25. "Humanities Data: A Necessary Contradiction". Retrieved from http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/
Posner, Miriam. 2016, April 19. "Data Trouble: Why Humanists Have Problems with Datavis, and Why Anyone Should Care". Retrieved from https://www.youtube.com/watch?v=sW0u1pNQNxc&t=209s
Research Data Canada. 2017. "Original RDC Glossary". Retrieved from https://www.rdc-drc.ca/glossary/original-rdc-glossary/
Richardson, Julie and Hoffman-Kim, Diane. 2010. "The Importance of Defining 'Data' in Data Management Policies". Science and Engineering Ethics 16: 749-751.
Rittel, Horst W. J. and Melvin M. Webber. 1973. "Dilemmas in a General Theory of Planning". Policy Sciences 4: 155-169.
Rimmer, J., C. Warwick, A. Blandford, J. Gow and G. Buchanan. 2008. "An examination of the physical and digital qualities of humanities research". Information Processing and Management 44: 1374-1392.
Strasser, Carly; Cook, Robert; Michener, William; and Budden, Amber. 2012. Primer on Data Management: What you always wanted to know. UC Office of the President: California Digital Library. Retrieved from https://escholarship.org/uc/item/7tf5q7n3
Tri-Agency Statement of Principles on Digital Data Management. 2016. Retrieved from http://www.science.gc.ca/eic/site/063.nsf/eng/h_83F7624E.html
Whitmire, A. L., M. Boock, and S. C. Sutton. 2015. "Variability in academic research data management practices". Program 49(4): 382-407.
Zins, C. 2007. "Conceptual approaches for defining data, information, and knowledge". Journal of the Association for Information Science and Technology 58(4): 479-493. doi:10.1002/asi.20508

work_bp5h7fwmhra6lei67xdw27i6by ---- Web-based Multimedia Mapping for Spatial Analysis and Visualization in the Digital Humanities: a Case Study of Language Documentation in Nepal

Shunfu Hu¹ & Brajesh Karna² & Kristine Hildebrandt³

Journal of Geovisualization and Spatial Analysis (2018) 2:3. https://doi.org/10.1007/s41651-017-0012-4
Published online: 17 January 2018
© Springer International Publishing AG, part of Springer Nature 2018

Abstract There has been a growing interest in utilizing geographic information systems (GIS) in the digital humanities and social sciences (DH). GIS-based DH projects usually emphasize spatial analysis and cartographic capability (e.g., displaying the locations of people, events, or movements); however, GIS alone cannot easily integrate the multimedia components (e.g., descriptive text, photographs, digital audio, and video) of DH projects. Multimedia mapping provides a unique approach to integrating geospatial information in digital map format with multimedia information, which is useful for incorporating DH material into spatial analysis and visualization. As cartographic mapping and GIS evolve from the traditional desktop platform to the World Wide Web, it is of significance to design and develop a Web-based multimedia mapping approach that can carry out spatial analysis and incorporate multimedia components, which is greatly beneficial to DH applications. The objectives of our language documentation research project in Nepal were to (1) use geo-tagging equipment to collect audio and visual recordings of three types of socio-linguistic data (language attitudes and practices interviews, free-form narratives, and elicited vocabulary and grammatical paradigm sets) from representative speakers of the four endangered languages in twenty-six Manang villages; and (2) design and develop a Web-based, interactive multimedia atlas that displays data points corresponding to the speakers, links to the three types of data gathered in multimedia format, and provides a friendly user interface for the manipulation and spatial analysis of all the data.
It is anticipated that the Web-based, interactive, and multimedia language atlas can bring all local and international stakeholders, such as the speech communities, linguists, local government agencies, and the public, together to raise awareness of language structures, language practices, language endangerment, and opportunities for preservation, all through an easy-to-use means that enhances the geo-spatial representation in engaging visual and sensory (multimedia) formats. The Google Maps API and JavaScript are employed to develop this online, interactive, and multimedia language atlas.

Keywords: Multimedia mapping · Spatial analysis · Visualization · Digital humanities · Language documentation · WWW

* Shunfu Hu, shu@siue.edu
1 Department of Geography, Southern Illinois University Edwardsville, Edwardsville, IL, USA
2 Department of Computer Sciences, Southern Illinois University Edwardsville, Edwardsville, IL, USA
3 Department of English Language and Literature, Southern Illinois University Edwardsville, Edwardsville, IL, USA

Introduction

There has been a recent and noticeable increase in connections between humanities and geography (including geographic information systems (GIS)), particularly in visualizations and in projects that spatially represent historical, narrative, and textual descriptions. This movement has been termed by Harris et al. (2011) the "Geohumanities." By "humanities," we mean spatial considerations of disciplines concerned with the human condition and involving largely qualitative, introspective, and speculative methods of inquiry (e.g., literature, anthropology, philosophy, history, communication studies, and languages/linguistics). The linkages between the social sciences and GIS have been substantive and productive, particularly in the rapidly expanding realm of the digital humanities (DH) (cf. Goodchild and Janelle 2004). In such projects, mapping is used in the spatial visualization of multimedia information, including digital still images, sound, and video. One discipline that is poised to greatly benefit from a deeper collaboration with GIS is that of linguistics, particularly applied dimensions such as typology, sociolinguistics, historical linguistics, and language documentation and description. Examples of such notable interdisciplinary collaboration include the AUTOTYP linguistic typology project (http://www.spw.uzh.ch/autotyp/), Der sprechende Sprachatlas "The Speaking Language Atlas" (http://sprachatlas.bayerische-landesbibliothek-online.de), and the Linguistic Atlas of the Middle and Atlantic States (http://us.english.uga.edu/lamsas/) (Kirk and Kretzschmar 1992). Recent geolinguistics publications also reflect this shift in collaborative momentum (cf. Auer and Schmidt (2009), Lameli et al. (2011), and Gawne and Ring (2016)).
At the same time, in a recent paper on sampling in dialectology research, Buchstaller and Alvanides lament that until recently, "The majority of sociolinguistic work [could] be described as spatially naïve, using geographical space merely as a canvas…on to which the results of linguistic analysis [could] be mapped" (2013: 96). This need for the inclusion and testing of different types of spatial factors alongside social ones is increasingly being addressed in regions like the USA and Great Britain (Trudgill 1974; Auer and Schmidt 2009; Lameli et al. 2010; Buchstaller and Corrigan 2011; Cheshire et al. 1989, 1993; Labov et al. 2006; Kretzschmar 1996; Kretzschmar et al. 2014; Britain 2009; see also the rise of the "geohumanities": Dear et al. 2011), but it is still in its infancy in other regions of the world (but cf. Stanford 2009 for a report on adjusted spatial factors on language practices and tonal patterns of Sui communities in China).

Compounding this general gap, precious few quantitative studies have investigated language attitudes and practices in linguistically diverse areas. This gap is unfortunate because it is often these attitudes and reports of language practices that can shed light on shifting ideologies as precursors to endangerment in areas where languages compete among each other and with prestige varieties (Giles et al. 1977; Coupland et al. 2006). In the Nepal scenario, two of the languages in our larger documentation project (Manange and Gurung) are threatened but viable, while the other two (Nar-Phu and Gyalsumdo) are highly endangered (technically moribund), with very few active younger speakers. Our overall aim, therefore, is to build an online, interactive atlas that contributes towards what Britain (2009: 142) terms "socially rich spatiality," taking into account speaker practices and networks as they intersect with geo-physical location.

The potential for mutual benefit in the GIS-linguistics and GIS-language documentation collaborative contexts cannot be overstated and is the focus of this paper. Increasingly, language documentation (particularly of vulnerable or threatened speech communities) relies on an awareness and understanding of the spatial-temporal interplay of language practices, structural variation, and contact dynamics, all working together to form a more comprehensive profile of the contributing variables to the survival and threat scenarios of these languages. Spatial visualization of documentation, through maps and atlases, for example, also benefits grammatical description itself as a product or output, as grammars vary widely in their coding and conceptualization of space-time continua (e.g., Slobin 1996; Bickel 1997; Harrison 2008). This variation can be more deeply appreciated in tandem with GIS applications as relevant to the language communities.

GIS is defined as a computer program for the capture, storage, manipulation, visualization, and spatial analysis of geospatial features (e.g., points, lines, or polygons) and their attributes (Chang 2015). The attribute data of the geospatial features in GIS are commonly alphanumeric values stored in an attribute table, and are thus termed structured data. Therefore, spatial analysis in GIS is often conducted using a structured query language (e.g., "Country_Name" = "Nepal"). As a result, GIS has traditionally lacked the ability to integrate non-structured data, such as digital photographs, digital audio, and digital video clips (i.e., multimedia components).
In the past two decades, a new trend of developing multimedia mapping systems has been seen in the literature. Multimedia mapping refers to the integration of computer-assisted mapping systems and multimedia technologies that allow one to incorporate not only geospatial information in digital map format but also multimedia information. The development of multimedia mapping techniques has gone through several stages, including the emergence of interactive maps and electronic atlases (Openshaw and Mounsey 1987; Rhind et al. 1988; Shepherd 1991), the development of "hypermaps" in the 1990s (Wallin 1990; Laurini and Milleret-Raffort 1990; Cotton and Oliver 1994; Cartwright 1999), and the integration of hypermedia systems (which feature hypertext, hyperlinks, and multimedia) and GIS in the late 1990s and early 2000s (Shiffer 1998; Bill 1998; Hu 1999; Soomro et al. 1999; Chong 1999; Hu et al. 2003; Yagoub 2003; Goryachko and Chernyshev 2004; Belsis et al. 2004).

After comparing the various multimedia mapping techniques, ranging from desktop-based multimedia mapping to Web-based hypermedia GIS, Hu (2012) pointed out that the former relies heavily on computer programming languages (e.g., Visual Basic) and digital mapping software (e.g., ArcView, ArcGIS, or MapObjects), with local data storage, local access, and a single user. The media formats are often Microsoft Windows formats with large file sizes, such as .tiff for images, .avi for digital video, and .wav for digital sound. The latter relies on both computer programming languages (e.g., Visual Basic) and markup languages (e.g., HTML), and on an Internet map server (IMS) (e.g., MapObjects IMS, ArcGIS IMS), with remote data storage, network access, and Internet users. The media formats are often Web-based with small file sizes, such as .jpeg for images, .mov for digital video, and .wav for digital sound. In both cases, the multimedia map application developers must invest in dedicated computer hardware (e.g., a Web server and data server) and computer software (e.g., a map server), plus IT personnel. The developer often faces a steep learning curve to become knowledgeable about coding in the native language of the map server. Now, as cartographic mapping systems and GIS evolve from the traditional desktop platform to Web-based online platforms, there is a need, and an opportunity, to develop a Web-based multimedia mapping approach that integrates geospatial information in digital map formats with multimedia information, which is of significance for DH-centered visualization and spatial analysis.

The objectives of our language documentation research in Nepal were to (1) use geo-tagging equipment to collect audio and visual recordings of three types of socio-linguistic data (language attitudes and practices interviews, free-form narratives, and elicited vocabulary and grammatical paradigm sets) from representative speakers of the four endangered languages in 26 Manang villages; and (2) design and develop a Web-based, interactive, and multimedia atlas that displays data points corresponding to the speakers, links to the three types of data gathered in multimedia format, and provides a friendly user interface for the manipulation and spatial analysis of all the data.
It is anticipated that the online atlas can bring all local and international stakeholders, such as the speech communities, linguists, local government agencies, and the public, together to raise awareness of language structures, language practices, language endangerment, and opportunities for preservation, all through an easy-to-use means that enhances the geo-spatial representation in engaging visual and sensory (multimedia) formats. The following sections describe our methods and the atlas design and functionality.

Methodology

Study Area

As alluded to, Nepal has a high degree of linguistic diversity, with approximately 100 languages attested (CBS 2012; Kansakar 2006). Most are "tribal" languages (indigenous, locally bounded, and strongly connected to community cultural identification and practices), concentrated in a couple of villages over a small area. As an example, the Manang District is both culturally and linguistically heterogeneous and can be divided into four ethno-linguistic areas across 26 villages: Manang Gurung and Gyalsumdo to the south (where Manang Gurung and Gyalsumdo speakers live), the Nar valley to the north (where Nar-Phu speakers live), and the upper Manang valley in the west (where Nyeshangte speakers live) (Fig. 1). All languages in this area are Tibeto-Burman.

The Manang District is appropriate for a case study of Web-based multimedia mapping, as it intersects with geo-linguistics and language documentation and has undergone rapid environmental, economic, and infrastructure development and changes over the past 15 years, including the ongoing construction of its first motorable road and the population shifts associated with this (see Laurance 2014 for commentary on road-building impacts). Some Manang communities have also witnessed population movements associated with both the rise of boarding schools in the capital Kathmandu and the rise of migrant worker opportunities, where young adults relocate to Gulf States like Saudi Arabia, Bahrain, and the United Arab Emirates for long-term employment (Hildebrandt et al. 2015). These changes have mixed impacts. On the one hand, they can benefit rural communities by connecting them to business and other opportunities available only in more centrally located marketplaces. On the other hand, these changes can trigger language shift as local residents (particularly younger ones) emigrate away from their areas of traditional language practice for education and job opportunities. These changes introduce new, complex variables behind language contact and language endangerment beyond social variables alone, and they further motivate a spatial perspective on language practices and patterns in this area.

Data Source and Data Set

The primary data source was the sociolinguistic interviews conducted in the 26 villages of the Manang District across the four language groups.¹ A total of 87 interviews were conducted between 2012 and 2014. Each interviewee was asked a total of 61 questions (see Appendix for the full set of questions). The data sets for this project included the descriptive text from the interviews gathered from locally originating and residing speakers of the four languages, the geographic coordinates in longitude (x) and latitude (y) of the residences where each interview took place, digital photographs, and digital video clips. Digital photographs were taken using a Canon SLR 40-D digital camera and saved in JPEG format.
Digital video clips were acquired using a Sony Handycam HDR-XR550 digital video camcorder and stored in MPEG format. As part of the project agreement with the funding agency, the multimedia content from this project was archived with the SHANTI (Sciences, Humanities, and Arts Network of Technological Initiatives) Collection (http://shanti.virginia.edu/wordpress/?page_id=414). SHANTI is housed at the University of Virginia as a publisher of websites and other digital content on languages and cultures of the Tibetan Plateau and greater Himalayan region. In addition, Google Maps was employed as the base map for the integration or "mashup" of the multimedia information collected during the sociolinguistic interviews.

¹ All interviews began with an oral consent process (originally composed in English and given in Nepali, the regional contact language), which was based on a script approved by SIUE's Institutional Review Board (IRB) for informed consent in research involving human subjects. This consent process included respondent awareness that his/her information would be made available for public access, through audio-visual and through still (photograph) images.

The next sections describe the process of data preparation ("Use of XML to Store the Data" section), data loading and display on Google Maps ("Use of jQuery JavaScript Library to Load the XML File onto the Google Maps" and "Use of Google Maps JavaScript API V3 to Display the Data" sections, respectively), and how the user interface for spatial analysis and visualization was developed ("Use of JavaScript, XHTML, and CSS to Design the User Interface and Functions for Spatial Analysis and Visualization" section).

Use of XML to Store the Data

There are various ways to store and prepare the spatial data for display on Google Maps. Among them, XML is the simplest and easiest method to use, because it is free and open source. XML (Extensible Markup Language, stored with the extension .xml in plain text format) is similar to HTML but does not have any predefined tags and is platform independent. The authors defined the tags based upon project-specific requirements. An example of the XML file (e.g., Languages_pts_2016.xml) containing the information for one sociolinguistic interview is sketched at the end of this subsection; each record includes the speaker ID, the language name, the village name, the coded interview questions and responses (e.g., Aindex, Bindex), the interviewee age group, longitude (x), latitude (y), a picture ID, a video link, and other information. For each interview, all relevant information was placed in a pair of <pt> and </pt> tags, each representing the point location of the interview. There are currently 87 pairs of such tags in the entire XML file.

Notepad++, a free source code editor which supports several programming languages running under the Microsoft Windows environment (http://notepad-plus-plus.org/), was used to prepare the XML data file. It was also used to edit all of the JavaScript code for the project. For those using different platforms, such as macOS or Linux, there are numerous open-source editors, such as Sublime Text and Atom.
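The example record itself did not survive the PDF-to-text conversion. The following is a minimal, hedged reconstruction of what one <pt> record might look like, based on the fields enumerated above; the root element, the attribute names, and all values are assumptions.

  <!-- Sketch only: one reconstructed <pt> record. Attribute names and
       values are assumptions based on the fields listed in the text. -->
  <markers>
    <pt speaker_id="S01" language_name="Nyeshangte" village_name="Manang"
        Aindex="1" Bindex="3" age_group="40-59"
        x="84.016" y="28.666"
        picid="S01.jpg"
        video="http://shanti.virginia.edu/..." />
    <!-- ...86 further <pt> records, one per interview... -->
  </markers>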
Fig. 1 Locations of 26 villages in the Manang District of Nepal

Use of jQuery JavaScript Library to Load the XML File onto the Google Maps

jQuery is a cross-browser JavaScript library designed to simplify the client-side scripting of HTML. It is free, open-source software designed to create dynamic Web pages; therefore, jQuery (version 2.1.4) was selected for our application. There are two ways to integrate the jQuery library into the application. One way is to download the jQuery library and place it in the same location as the main Web page (e.g., index.html); the other is simply to point to the URL where the jQuery library is hosted. Both options are illustrated in the consolidated sketch at the end of this subsection, which also shows how the XML content (i.e., Languages_pts_2016.xml) is uploaded to the Google Maps using the JavaScript jQuery.get function ($.get for short) at the initialization of the Google Maps (i.e., in function initMap()); jQuery(data).find("pt").each(function(){ }); is used to retrieve the information for each data point (notice the <pt> and </pt> tags in the XML file mentioned above). A variable, xmldoc, is declared to hold the information (e.g., language_name, village_name, picid, and video) for each point. Another variable, latlng, is declared to hold the locational information (i.e., longitude (x) and latitude (y)).

Fig. 2 Conceptual framework for the integration of the Google Maps API, XML, and HTML for Web-based multimedia mapping

Use of Google Maps JavaScript API V3 to Display the Data

The launch of Google Maps in 2005 revolutionized Web mapping service applications on the Internet. Based on Asynchronous JavaScript and XML (AJAX), a new type of client/server interaction was introduced in Google Maps to maintain a continuous connection between the client and the server for immediate downloading of additional map information (Peterson 2008). In addition to implementing a better client/server interaction, Google also provides programmers free access to its code in the form of an Application Programming Interface (API). In other words, the API consists of a set of routines or functions that can be called by a programmer using JavaScript. Linking the Web page with the Google Maps API is straightforward in version 3, using a single HTML script element (also included in the sketch below). This is a standard HTML directive to include an external JavaScript file, served by maps.google.com; the element is added to the <head> section of the Web page where the map is loaded.
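None of the listings referred to in the last two subsections survived extraction. The sketch below consolidates them as we read the descriptions: the two ways of including jQuery, the Maps API script element, and the $.get call that walks the <pt> records. The attribute names mirror the XML sketch above, and the API key placeholder, file paths, and variable handling are assumptions.

  <!-- Sketch only: reconstructed from the descriptions in the text. -->
  <head>
    <!-- Option 1: a local copy of jQuery placed next to index.html -->
    <!-- <script src="jquery-2.1.4.min.js"></script> -->
    <!-- Option 2: point to the URL where the jQuery library is hosted -->
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js"></script>
    <!-- The single script element linking the page to the Maps API;
         YOUR_KEY is a placeholder -->
    <script src="https://maps.google.com/maps/api/js?key=YOUR_KEY"></script>
    <script>
      var points = [];  // one entry per <pt> record in the XML

      // Called at map initialization (see initMap() in the next
      // subsection): fetch the XML and pull out each <pt> record.
      function loadPoints(done) {
        $.get('Languages_pts_2016.xml', function (data) {
          jQuery(data).find('pt').each(function () {
            var xmldoc = jQuery(this);             // the current <pt>
            var latlng = new google.maps.LatLng(   // y = latitude, x = longitude
              parseFloat(xmldoc.attr('y')),
              parseFloat(xmldoc.attr('x')));
            points.push({
              speaker_id: xmldoc.attr('speaker_id'),
              language_name: xmldoc.attr('language_name'),
              village_name: xmldoc.attr('village_name'),
              picid: xmldoc.attr('picid'),
              video: xmldoc.attr('video'),
              latlng: latlng
            });
          });
          done();  // plot the markers once all records are parsed
        });
      }
    </script>
  </head>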
The next key step is to initialize the API and load the map onto the Web page. First, we created an initMap() function to launch the Google Map. In the initMap() function, we defined the center of the map to be displayed on the Web page at 28.576671° N, 84.257245° E. Another function, pushMapData(), was developed to load all data points from the XML onto the map. A third function, AddMapLegend(), was developed to load the map legend. (A reconstructed skeleton of these functions appears at the end of this section.)

Fig. 3 Initial launch of the Web-based multimedia atlas, which displays the three tabs, the Google Maps, the locations of all 87 sociolinguistic interviews across four languages in Manang, Nepal, and the map legend

Fig. 4 Search result from the Languages by Village tab: in this case, the user selected Chame village. There were six sociolinguistic interviews conducted in Chame village. The marker icons on the map also indicate Gyalsumdo and Gurung as the mother tongue languages reported in the interviews

Use of JavaScript, XHTML, and CSS to Design the User Interface and Functions for Spatial Analysis and Visualization

In the design of the Web-based multimedia digital atlas, a three-row layout was adopted. The first row contains three tabs, the second row is the map container, and the third row contains the map legend. The three tabs serve as a user interface for spatial analysis and visualization.

The first tab is named Languages by Village. With this tab, the user can search for the language identified by interviewees as their mother tongue. To do so, the user first selects a village from a dropdown list of all the villages where linguistic interviews were conducted and clicks the Search button. The Google map is zoomed into the selected village, and customized marker icons representing the locations where sociolinguistic interviews were conducted in that village are shown on the map. In this case, there are four customized marker icons for the four languages: the green balloon for "Nyeshangte," the yellow balloon for "Manang Gurung," the purple balloon for "Nar-Phu," and the orange balloon for "Gyalsumdo." In addition, tooltips (e.g., the language name) are provided for the markers. These marker icons are also clickable: when the user clicks on an icon, it launches a standard Google Maps API Info-window in which the speaker ID, age, mother tongue, residential picture, and a link to the video clip of the associated recordings are displayed.

The second tab is named Villages by Language. With this tab, the user can search for all the villages in Manang District where a selected language is spoken. To do so, the user first selects a language from the dropdown list of the four languages and clicks the Search button. The search result is displayed on the map as customized marker icons; each marker icon is likewise clickable.

The third and last tab is Question & Response. With this tab, the user can select an interview question from the question dropdown list (see Appendix for the entire list of the sociolinguistic survey questions) and click the Search button. The search results are displayed on the map as different marker icons, each representing an answer to that question. This function provides the user a tool to further examine the spatial distribution of language practices and attitudes in the Manang languages.
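The initialization code itself was likewise lost in extraction. Below is a hedged skeleton of initMap(), pushMapData(), and AddMapLegend() consistent with the description above: the function names and the map center come from the text, while the icon paths, Info-window layout, legend element id, and the filter helper are assumptions.

  // Sketch only: function names and map center come from the text;
  // everything else is an assumption. Builds on the loadPoints()
  // sketch in the previous subsection.
  var map;
  var markers = [];

  function initMap() {
    map = new google.maps.Map(document.getElementById('map'), {
      center: new google.maps.LatLng(28.576671, 84.257245), // Manang District
      zoom: 11
    });
    loadPoints(function () {
      pushMapData();   // plot all 87 interview points
      AddMapLegend();  // fill the legend row
    });
  }
  // e.g., invoked from <body onload="initMap()">

  function pushMapData() {
    points.forEach(function (pt) {
      var marker = new google.maps.Marker({
        position: pt.latlng,
        map: map,
        title: pt.language_name,                   // tooltip text
        icon: 'icons/' + pt.language_name + '.png' // assumed balloon icons
      });
      var info = new google.maps.InfoWindow({
        content: 'Speaker: ' + pt.speaker_id +
                 '<br>Mother tongue: ' + pt.language_name +
                 '<br><img src="photos/' + pt.picid + '" width="200">' +
                 '<br><a href="' + pt.video + '">Interview video</a>'
      });
      marker.addListener('click', function () { info.open(map, marker); });
      markers.push({ marker: marker, data: pt });
    });
  }

  function AddMapLegend() {
    // One icon and label per language; the 'legend' element id is assumed.
    document.getElementById('legend').innerHTML =
      ['Nyeshangte', 'Manang Gurung', 'Nar-Phu', 'Gyalsumdo']
        .map(function (l) { return '<img src="icons/' + l + '.png"> ' + l; })
        .join('&nbsp;&nbsp;');
  }

  // Villages by Language-style search: show only one language's markers.
  function filterByLanguage(name) {
    markers.forEach(function (m) {
      m.marker.setMap(m.data.language_name === name ? map : null);
    });
  }

Each marker keeps a reference to its parsed record, so tab searches can toggle marker visibility without re-fetching the XML.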
Fig. 5 Search result from the Villages by Language tab: in this case, the user selected the Nyeshangte language, and all locations of interviews with this language are displayed on the map

Inside the second row is the map container where the Google Map is displayed. A few standard Google Maps controls, such as the Pan and Zoom controls, the Map Scale control, and the Map Type control (Roadmap and Satellite), are available for the user to interact with the map or satellite imagery. The last row is where the map legend is placed. If the user selects the first or second tab, the map legend contains four customized marker icons representing the four languages; if the user selects the third tab, the map legend contains customized marker icons representing the different responses to the interview questions.

Finally, all the pieces were assembled. JavaScript is the native language of Google Maps. In addition, Google Maps is built from HTML and formatted with CSS (Cascading Style Sheets). Therefore, JavaScript, HTML, and CSS were used to develop the functionalities. These include the creation of a user interface in the form of tabs using Bootstrap and Ajax, uploading the XML data file via jQuery, displaying the point locations of the sociolinguistic interviews with customized marker icons via the Google Maps API, and providing the spatial analysis functions via the Google Geocoder. Figure 2 illustrates a conceptual framework for the integration of the Google Maps API and other JavaScript libraries in the World Wide Web environment.

In Fig. 2, the workflow begins with a user-initiated request for the URL of our project website: https://mananglanguages.isg.siue.edu/atlas/. With the request, the server finds the page at the requested location and loads the default page, in this case index.html. The server then sends the response back to the user with the content of the project page. We are using JavaScript, which is client-side in its scripting, so all of the code execution takes place in the client browser and reduces the load on the server. This makes our server fast enough to process additional incoming requests. On the client side, the browser loads the jQuery library, including the Google Maps API. It then loads a custom-written JavaScript, which sends a request to the server to retrieve and load the data from our XML in the background. This makes the script dynamic and more responsive to the user. When all processes have executed completely, the user sees the map, with the various marker icons plotted on the map with their respective legends.

Fig. 6 Search result from the Question & Response tab. Here, the user selected the question "Do you think there will be child learners of your mother tongue in future?" The different marker icons on the map indicate the different answers from the interviewees

Results

The use of the Google Maps API V3 provides an efficient and familiar mechanism to deliver digital cartographic information to Internet users with user-friendly interaction. With the standard Google Maps Map Type control, the user is able to choose one of the two map types: Roadmap and Satellite imagery. Figure 3 shows the online map service for our language documentation project in Google Chrome. At the initial launch of the Web page, the locations of the sociolinguistic interviews are displayed within the map container.
Notice also that in Fig. 3 the three tabs are visible: Languages by Village, Villages by Language, and Questions & Responses. Figure 4 demonstrates the result when the user selects a village, for instance Chame village, from the drop-down list and then clicks the Search button to show all the sociolinguistic interviews conducted in that village, six in this example. The different icons on the map also indicate the Tibeto-Burman languages, Gyalsumdo and Gurung, identified as mother tongues by the interviewees in Chame. Figure 5 demonstrates the result when the user searches for all the villages where one, two, three, or four selected languages are spoken. Figure 6 demonstrates the result when the user chooses the Questions & Responses tab, selects a question from the drop-down list of all the questions, and clicks the Search button. Figure 7 illustrates the Info-window launched when the user clicks on a marker icon on the Google Map. The Info-window displays multimedia information related to the interview conducted at the marker location, including the speaker ID, age, mother tongue, village name, a link to download the transcript of the questions and responses, and a hyperlink to the video clip of the interview. Figure 8 illustrates a sample video clip that may be played, and which is housed in the SHANTI Collection at the University of Virginia.

Fig. 7 When the user clicks on a marker icon on the Google Map, an Info-window is launched to display all relevant information, including the speaker ID, age, mother tongue, village name, a link to download the transcript of the questions and responses, and a hyperlink to the video clip of the interview (see Fig. 8)

Discussion

The "Methodology" and "Results" sections detailed the workflow and elements of the "Documenting the Languages of Manang, Nepal" atlas. However, this multimedia atlas is not the only example of ongoing efforts to use geospatial technology in the digital humanities, particularly in geolinguistics and online, interactive language mapping. We summarize here examples of three projects with parallels to the Manang Languages Atlas, their similarities, and their differences.

First, Saint Mary's University uses online multimedia mapping for the Mi'kmaw Place Names Digital Atlas (URL: http://sparc.smu.ca/mpnmap/). This project makes use of ArcGIS maps rather than Google Maps. Additionally, Adobe Flash is required to load the content of the map in the browser. ArcGIS is not open source and can be expensive for some programs; similarly, Flash is well known for causing spikes in CPU usage and comes with security holes. Our project uses JavaScript and HTML, and the data are encoded in XML, an open format that uses fewer computer resources.

Fig. 8 Top: a video clip of a Gurung speaker, linked from the Info-window in the atlas (Fig. 7), in which the speaker describes his apple orchard in the Gurung language. Bottom: the language transcript. The first line is the local language (Gurung), transcribed in the IPA (International Phonetic Alphabet). The second line is the Nepali free translation. The third line is the free translation in English.
The audio was transcribed in ELAN (available as a free download at https://tla.mpi.nl/tools/tla-tools/elan/), and the language transcript is synchronized with the video, which was done by the THL (Tibetan Himalayan Library) using a Drupal platform.

Second, Language Landscape is a nonprofit British organization that uses maps to plot languages as they are used around the world (URL: http://languagelandscape.org). The project directors encourage people to get involved in the project by adding their own audio clips, which are then displayed visually and interactively. This project uses the Google Maps API for the map, JavaScript, and HTML. The map includes pins which display information in audio format that can be played in real time without page redirects. Our project similarly displays map pins related to different languages and, on click, provides information such as speaker name, age, and village. Like Language Landscape, our project also includes links to other resources.

Third, the Atlas of Pidgin and Creole Language Structures project spatially represents pidgin and creole languages, and their structures, from around the world (URL: http://apics-online.info/contributions#2/30.3/10.0). The map is developed using JavaScript and the Leaflet JavaScript library for mobile-friendly interactive maps. More information about the Leaflet JavaScript library can be found at http://leafletjs.com/index.html. The library is open source and lightweight in terms of CPU usage. Pins are plotted on the map in various colors and, on click, display an Info-window with hyperlinks for additional information on a given language. This map lacks legends, which results in the need to click on each pin to learn more about language types.

Conclusion

This paper has demonstrated a new and relatively easy approach, called Web-based multimedia mapping, that can integrate not only geospatial data but also multimedia data in the form of digital photographs, digital sound, and digital video clips, which is very useful for the digital humanities and social sciences. The use of the Google Maps API and JavaScript libraries allowed us to employ the digital maps and satellite imagery from Google Maps with only a few lines of JavaScript code. The apparent benefit of using existing Maps APIs may be of value to those who do not have the resources to invest in dedicated computers for map servers and data servers. We used only a Web host account provided by the home institution to upload the atlas Web page (index.html, 48 kilobytes) and the XML file (languages_pts.xml, 30 kilobytes). The latter stores only the "pointers" to the digital multimedia files related to the sociolinguistic interviews and narratives collected for the project (archived in SHANTI). SHANTI offers state-of-the-art technology to deliver the multimedia content at the highest quality and speed (e.g., video streaming) and is available free of charge to this project. It is worth mentioning that, for those who do not have their own video streaming service available, free professional video streaming services are provided by third parties, such as YouTube. In addition to the Google Maps API, the Yahoo! Maps and Bing Maps APIs can also be used for the same purposes.
It is also beneficial for college students to start off with multimedia mapping technology, and for organizations, such as the university in this case study, to see the benefits of the new technology.

The Web-based, interactive, and multimedia language atlas can be accessed on the Internet; therefore, it can bring all local and international stakeholders, such as the speech communities, linguists, local government agencies, and the public, together to raise awareness of language structures, language practices, language endangerment, and opportunities for preservation, all through this easy-to-use means that enhances the geo-spatial representation in engaging visual and sensory (multimedia) formats. One example of this potential contribution may be found in Hildebrandt and Hu (2017), which, through quantitative analysis of the spatial distributions of respondent answer types, demonstrates that non-structural (language attitude and use) variables reveal different degrees of vitality vs. endangerment in Nepal.

The Web-based multimedia mapping approach offers a unique tool for the spatial analysis and visualization of variations in self-reported attitudes and practices across the four Manang languages, with adjusted spatial factors (e.g., the location of communities relative to a newly built motor road, to the district headquarters, and within closely clustered communities) alongside traditional social factors (e.g., gender, age, education, occupation, and so on). As a result, this research project contributes to new understandings of the relationship between space and language practices in Nepal.

Acknowledgements This work is supported by the National Science Foundation's Division of Behavioral and Cognitive Sciences - Documenting Endangered Languages (funding no. 1149639): "Documenting the Languages of Manang", and by an equipment support grant from SIU Edwardsville. We are grateful to members of the Gurung, Gyalsumdo, Manange, and Nar-Phu-speaking communities in Manang, Nepal, for their help in gathering these data. We are grateful to Dubi Nanda Dhakal, Oliver Bond, Sangdo Lama, and Ritar Lhakpa Lama for assistance with interviews. We are grateful to Saita Gurung and Manisha Chaudhary for assistance with atlas construction and development. All errors are the responsibility of the authors.

Funding This research was supported by the US National Science Foundation's Division of Behavioral and Cognitive Sciences - Documenting Endangered Languages (funding no. 1149639): "Documenting the Languages of Manang", and by an equipment support grant from SIU Edwardsville.

Compliance with ethical standards The authors of this paper agree to, accept, and comply with all the ethical standards set by the journal.

Conflict of Interest The authors declare that they have no conflict of interest.

Ethical Approval The field data collection (i.e., field interviews) was approved by SIUE's Institutional Review Board. The authors used an approved oral informed consent script for data collection.

Informed Consent The authors have given informed consent to publish this article in the Journal of Geovisualization and Spatial Analysis if accepted.
References

Auer P, Schmidt J (eds) (2009) Language and space: an international handbook of linguistic variation. Mouton de Gruyter, Berlin. https://doi.org/10.1515/9783110220278
Belsis P, Gritzalis S, Malatras A, Skourlas C, Chalaris I (2004) Enhancing knowledge management through the use of GIS and multimedia. Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science), vol 3336, pp 319–329
Bickel B (1997) Spatial operations in deixis, cognition, and culture: where to orient oneself in Belhare. In: Nuyts J, Pederson E (eds) Language and conceptualization. Cambridge University Press, Cambridge, pp 46–83. https://doi.org/10.1017/CBO9781139086677.003
Bill R (1998) Multimedia-GIS concepts and applications. Geo-Inf Syst 11:21–24
Britain D (2009) Language and space: the variationist approach. In: Auer P, Schmidt JE (eds) Language and space: an international handbook of linguistic variation, vol 1. Mouton de Gruyter, Berlin, pp 142–162
Buchstaller I, Corrigan KP (2011) How to make intuitions succeed: testing methods for analyzing syntactic microvariation. In: Maguire W, McMahon A (eds) Analyzing variation in English: what we know, what we don't, and why it matters. Cambridge University Press, Cambridge, pp 30–48
Buchstaller I, Alvanides S (2013) Employing geographical principles for sampling in state of the art dialectological projects. J Linguist Geogr 1:96–114
Cartwright W (1999) Development of multimedia. In: Peterson MP, Gartner G (eds) Multimedia cartography. Springer, New York, pp 11–30. https://doi.org/10.1007/978-3-662-03784-3_2
Central Bureau of Statistics (2012) National population and housing census 2011. Government of Nepal National Planning Commission Secretariat
Chang KT (ed) (2015) Introduction to geographic information systems. McGraw-Hill Education, New York
Cheshire J, Viv E, Henrik M, Bert W (eds) (1989) Dialect and education. Multilingual Matters, Clevedon/Philadelphia
Cheshire J, Viv E, Pamela W (1993) Non-standard English and dialect levelling. In: Milroy J, Milroy L (eds) The grammar of English dialects in the British Isles. Longman, London, pp 53–96
Chong AK (1999) Orthoimage mapping bases for hybrid, multimedia, and virtual environment GIS. Cartography 28(1):33–41. https://doi.org/10.1080/00690805.1999.9714298
Cotton B, Oliver R (1994) The Cyberspace Lexicon: an illustrated dictionary of terms from multimedia to virtual reality. Phaidon Press Ltd., London
Coupland N, Hywel B, Betsy E, Garrett P (2006) Imagining Wales and the Welsh language. J Lang Soc Psychol 25(4):351–376. https://doi.org/10.1177/0261927X06292803
Dear M, Ketchum J, Luria S, Richardson D (eds) (2011) Geohumanities: art, history, text at the edge of place. Routledge, London
Gawne L, Ring H (2016) Mapmaking for language documentation and description. Lang Doc Conserv 10:188–242. http://nflrc.hawaii.edu/ldc/. Last accessed 28 Nov 2016
Giles H, Taylor D, Bourhis R (1977) Dimensions of Welsh identity. Eur J Soc Psychol 7:29–39
Goodchild MF, Janelle DG (eds) (2004) Spatially integrated social science. Oxford University Press, New York
Goryachko VV, Chernyshev AV (2004) Multimedia and GIS-technologies in atlas mapping. Vestnik Moskovskogo Universiteta 5(2):16–20
Harris TM, Bergeron S, Rouse LJ (2011) Humanities GIS: place, spatial storytelling and immersive visualization in the humanities. In: Dear M, Ketchum J, Luria S, Richardson D (eds) Geohumanities: art, history, text at the edge of place. Routledge, London, pp 226–240
Routledge, London, pp 226–240 Harrison KD (2008) When languages die: the extinction of the world’s languages and the erosion of human knowledge. Oxford University Press, New York Hildebrandt K, Dhakal DN, Bond O, Vallejo M & Fyffe A (2015) A sociolinguistic survey of the languages of Manang, Nepal: co- existence and endangerment. NFDIN J 14.6:104–122. http://www. nfdin.gov.np/securi/. Last Accessed 28 Nov 2016 Hildebrandt KA, Hu S (2017) Areal analysis of language attitudes and practices: A case study from Nepal. Language Documentation and Conservation. Special Publication 13: 152-179, http://scholarspace. manoa.hawaii.edu/handle/10125/24753 Hu S (1999) Integrated multimedia approach to the utilization of an Everglades vegetation database. Photogramm Eng Remote Sens 65(2):193–198 Hu S (2012) Multimedia Mapping on the Internet Using Commercial APIs. In Peterson MP (ed.) Online maps with APIs and Mapservices, 61–71, Springer Hu S, Gabriel AO, Bodensteiner LR (2003) Inventory and characteristics of wetland habitat on the Winnebago Upper Pool Lakes, Wisconsin, USA: an integrated multimedia-GIS approach. Wetland 23(1):82– 94. https://doi.org/10.1672/0277-5212(2003)023%5B0082: IACOWH%5D2.0.CO;2 Kansakar TR (2006) Research on the typology of Nepal’s languages. Nepalese linguist 22:106–128. http://www.digitalhimalaya.com/ collections/journals/nepling/. Last Accessed 28 Nov 2016 Kirk JM, Kretzschmar WA (1992) Interactive linguistic mapping of dia- lect features. Lit Linguist Comput 7(3):168–175. https://doi.org/10. 1093/llc/7.3.168 Kretzschmar WA (1996) Quantitative areal analysis of dialect features. Lang Var Chang 8(01):13–39. https://doi.org/10.1017/S09543945 00001058 Kretzschmar W, Juuso I, Bailey CT (2014) Computer simulation of dia- lect feature diffusion. J Linguist Geogr 2(01):41–57. https://doi.org/ 10.1017/jlg.2014.2 Labov W, Ash S, Boberg C (2006) Atlas of North American English: phonology and phonetics. Mouton de Gruyter, Berlin. https://doi. org/10.1515/9783110167467 Lameli A, Kehrein R, Rabanus S (eds) (2010) The handbook of language mapping. Mouton de Gruyter, Berlin Laurance WF (2014) Roads benefit people but can have massive envi- ronmental costs. National Geographic (online). https://blog. nationalgeographic.org/2014/10/19/roads-benefit-people-but-can- have-massive-environmental-costs/. Accessed 5 January 2018 Laurini R, Milleret-Raffort F (1990) Principles of geomatic hypermaps. In: Proceedings of the 4th International Symposium on Spatial Data Handling, Zurich, Switzerland, 2, 642–655 Openshaw S, Mounsey H (1987) Geographic information systems and the BBC’s Domesday interactive videodisk. Int J Geogr Inf Syst 1(2):173–179. https://doi.org/10.1080/02693798708927802 Peterson MP (2008). International Perspectives on Maps and the Internet: An Introduction. In Peterson MP (ed.), International Perspectives on Maps and the Internet (pp. 3–10), Springer Rhind DP, Armstrong P, Openshaw S (1988) The Domesday machine: a nationwide geographical information system. Geogr J 154(1):56– 58. 
https://doi.org/10.2307/633476 J geovis spat anal (2018) 2: 3 Page 13 of 14 3 https://doi.org/10.1515/9783110220278 https://doi.org/10.1017/CBO9781139086677.003 https://doi.org/10.1017/CBO9781139086677.003 https://doi.org/10.1007/978-3-662-03784-3_2 https://doi.org/10.1080/00690805.1999.9714298 https://doi.org/10.1080/00690805.1999.9714298 https://doi.org/10.1177/0261927X06292803 https://doi.org/10.1177/0261927X06292803 http://nflrc.hawaii.edu/ldc http://nflrc.hawaii.edu/ldc http://www.nfdin.gov.np/securi http://www.nfdin.gov.np/securi http://scholarspace.manoa.hawaii.edu/handle/10125/24753 http://scholarspace.manoa.hawaii.edu/handle/10125/24753 https://doi.org/10.1672/0277-5212(2003)023%5B0082:IACOWH%5D2.0.CO;2 https://doi.org/10.1672/0277-5212(2003)023%5B0082:IACOWH%5D2.0.CO;2 http://www.digitalhimalaya.com/collections/journals/nepling http://www.digitalhimalaya.com/collections/journals/nepling https://doi.org/10.1093/llc/7.3.168 https://doi.org/10.1093/llc/7.3.168 https://doi.org/10.1017/S0954394500001058 https://doi.org/10.1017/S0954394500001058 https://doi.org/10.1017/jlg.2014.2 https://doi.org/10.1017/jlg.2014.2 https://doi.org/10.1515/9783110167467 https://doi.org/10.1515/9783110167467 https://blog.nationalgeographic.org/2014/10/19/roads-benefit-people-but-can-have-massive-environmental-costs/ https://blog.nationalgeographic.org/2014/10/19/roads-benefit-people-but-can-have-massive-environmental-costs/ https://blog.nationalgeographic.org/2014/10/19/roads-benefit-people-but-can-have-massive-environmental-costs/ https://doi.org/10.1080/02693798708927802 https://doi.org/10.2307/633476 Shepherd ID (1991) Information integration and GIS. In: Maguire DJ, Goodchild MF, Rhind DW (eds) Geographical information systems: principles and applications, vol 1. Longman Scientific and Technical Publications, Essex, pp 337–357 Shiffer MJ (1998) Multimedia GIS for planning support and public dis- course. Cartogr Geogr Inf Syst 25(2):89–94. https://doi.org/10. 1559/152304098782594562 Slobin D (1996) Two ways to travel: verbs of motion in English and Spanish. In: Shibatani M, Thompson SA (eds) Grammatical con- structions: their form and meaning. Oxford University Press, Oxford, pp 195–217 Soomro TR, Zheng K, Turay S, Pan Y (1999) Capabilities of multimedia GIS. Chin Geogr Sci 9(2):159–165 Stanford JN (2009) One size fits all? Dialectometry in a small clan-based indigenous society. Lang Var Chang 24:247–278 Trudgill P (1974) Linguistic change and diffusion: description and expla- nation in sociolinguistic dialect geography. Lang Soc 3(02):215– 246. https://doi.org/10.1017/S0047404500004358 Wallin E (1990) The map as hypertext—on knowledge support systems for the territorial concern. In: Proceedings of the first European conference on geographical information system, EGIS’ 90I. EGIS Foundation, Munich, p 1125–1134 Yagoub MM (2003) Building an historical remote sensing atlas and mul- timedia GIS for Al Ain. GEO: Connexion 2. 7:54–55 3 Page 14 of 14 J geovis spat anal (2018) 2: 3 https://doi.org/10.1559/152304098782594562 https://doi.org/10.1559/152304098782594562 https://doi.org/10.1017/S0047404500004358 Web-based... 
work_bsxvshzuhzg7xiuvucujnkig6m ---- White Paper Report
Report ID: 100782
Application Number: HD-51084-10
Project Director: Julia Flanders (j.flanders@neu.edu)
Institution: Northeastern University
Reporting Period: 1/1/2011-12/31/2014
Report Due: 3/31/2015
Date Submitted: 7/16/2015

Final Performance Report HD-51084-10
A Journal-Driven Bibliography of Digital Humanities
Project Director: Julia Flanders
Northeastern University
May 30, 2015

Overview
This project began with a simple premise. Digital Humanities Quarterly is an online, open-access journal whose founding coincided with the founding of the Alliance of Digital Humanities Organizations (ADHO) in 2005, and whose topical scope covers all areas of the field we now know as "digital humanities." The bibliographies of DHQ articles thus reflect the intellectual watershed of this field, and also its formation over the life of the journal itself. Under this grant we sought to aggregate these bibliographies into a central bibliographic database, with two goals. First, at a practical level we wanted to simplify the journal's production workflow and eliminate the duplication of data resulting from storing bibliographic data in the articles themselves. With a centralized database, we could store authoritative bibliographic data in one place and reference it from the articles, taking advantage of the fact that many DHQ articles draw on a common pool of material for their citations. Second, from a research perspective this data clearly constituted a potential public good and a fascinating data set in its own right. With a centralized database, we would be able to study patterns of co-citation, learn about the evolution of the field, and study the citation practices of different subcommunities. Bibliographic data could also potentially serve as a way for readers to find articles of interest, or clusters of related articles.

We framed the effort as an 18-month process, with the project originally scheduled for completion in July 2012. Although this workplan was not unrealistic, retrospective analysis reveals its vulnerabilities: above all, because of the small size of the grant, we relied on commitments of donated effort for significant parts of the technical development work, notably the original data capture system and the integration of the new bibliographic data into the DHQ interface. As described in more detail below, one of the initial obstacles we faced was a set of problems with the data capture system which could not be addressed because the anticipated expertise was no longer available to us. Another more significant vulnerability was the fact that the data capture itself required fairly significant attention to issues of bibliographic genre and hence required a level of training and dedication that was somewhat out of proportion to the overall interestingness of the work, making it difficult to hire and retain students. As a result, there were periods of inactivity and delay while we searched for new research assistants.
The third and most significant disruption could not have been predicted: in July 2013, the principal investigator changed jobs and moved from Brown University to Northeastern University, and DHQ moved its editorial operations to Northeastern at the same time. During the period of transition, work on this project was more or less suspended, and was not resumed until we hired a new research assistant in January 2014 who was able to bring the data capture and error correction to completion in December 2014 after three no-cost extensions. This prolonged and constantly changing work process could look from some perspectives like a narrative of failure, and certainly there have been important lessons learned. However, this project also illustrates an important principle that informs the design of the DH Startup grant program, namely the fact that some kinds of work are especially unpredictable. Small-scale projects are more vulnerable to disruption because they tend to have fewer resources to fall back on, and because they are operating on small enough quantities of effort that even a small reduction makes a significant 3 difference. Because small-scale projects in academic settings often rely on student labor, they have the additional vulnerability that comes from unpredictable turnover. The ultimate successful outcome of this project owes a great deal to the flexibility afforded us by NEH, for which we are extremely grateful. Project Activities Main activities Data Capture The initial capture of bibliographic data for this project was undertaken using a web- based bibliographic data capture and management system developed at the Brown University Center for Digital Scholarship for use in its digital humanities projects. The system offered a form-based data entry interface, with the data being saved as MODS. Configuration files permitted different projects to define different bibliographic genres and the required and permitted fields associated with each one, allowing a high degree of control which we felt was desirable for DHQ’s purposes. Using this system, we established a set of bibliographic genres representing the requirements of DHQ’s existing citations, and hired a group of undergraduate students to undertake the data capture. Our original goal as defined in the grant proposal was to capture bibliographic items not only from DHQ’s own article bibliographies, but also items from the other major digital humanities journals (including Computers and the Humanities and Literary and Linguistic Computing), and we made significant progress on those two journals. However, changes to personnel and local support at Brown University interrupted that work process and we did not complete the capture of CHum and LLC data. We encountered two chief obstacles at this stage. First, the data capture system was engineered in a way that caused its performance to suffer dramatically under large quantities of data, and second, changes in personnel at Brown University reduced the levels of technical support available to us, so we were not able to address the problems with the data capture system, or add the features for de-duplication and error checking that we had anticipated. However, under this system we were able to capture a significant number of records (approximately 3000 in all). After the move to Northeastern, we hired a graduate research assistant to complete the data capture, and we also faced the fact that we needed to adopt a different data capture tool and process. 
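For illustration, a genre definition of the general shape those configuration files supported might look as follows; the format and names here are a hypothetical sketch, not the Brown system's actual syntax:

<!-- Hypothetical genre configuration: a journal article must carry
     author, title, journal, and date; volume, pages, and URL are
     permitted but optional. Illustrative only. -->
<genre name="journalArticle">
  <field name="author"  required="yes" repeatable="yes"/>
  <field name="title"   required="yes"/>
  <field name="journal" required="yes"/>
  <field name="date"    required="yes"/>
  <field name="volume"  required="no"/>
  <field name="pages"   required="no"/>
  <field name="url"     required="no"/>
</genre>

A definition of this kind is what allowed the capture interface to require different fields for, say, a book chapter than for a blog post, and to test captured records for missing components.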
Although the web data entry interface of the Brown tool had significant advantages of ease of use, our new graduate assistant had greater familiarity with XML and we anticipated that once the data capture was complete our general DHQ workflow would rely on DHQ's managing editors (also comfortable with XML), so a form-based system would not be necessary. In addition, the remaining data capture was focused on the bibliographies of existing DHQ articles, which were already expressed as lightly encoded XML, so we could benefit from using XML tools to convert them into our target format. At this stage we developed a schema (described in more detail below) that reflected the genres of bibliographic record we had already established (including their requirements for the presence and order of fields) and set up a work flow to convert these bibliographies. The first step in the process involved an XSLT stylesheet that converted the existing TEI elements into the corresponding bibliographic elements of our schema, wrapped in a generic wrapper element. The second step involved hand editing these records to change the wrapper element to a more specific one reflecting the genre of the item (e.g. the elements for books, journal articles, blog posts, and so on) and to add further detailed markup of the individual components of the entry that were not available in the original DHQ encoding. (Because that encoding was driven by display needs rather than by goals of bibliographic completeness, only titles and URLs were typically explicit in that markup.)

Following the completion of the data capture, there was some further work involved in cleaning up the data:
• Some de-duplication was necessary, since the initial data capture had been done in a system that did not make it easy to check for the existence of a given record before entering it.
• We had to ensure that record IDs were unique. IDs for bibliographic items in the system were based on author and date rather than on randomly assigned identifiers, to make it easier to spot errors of citation in the encoding of DHQ articles, but the author-date system requires disambiguation for common surnames and for authors who publish multiple items in a single year.

As part of the cleanup process we also had to consider and document our policies concerning the level of bibliographic management we were prepared to exercise. For example, in cases where different DHQ articles cited different versions of the same published item (for instance, hardcover and paperback editions, published in different years), we decided to treat these as separate items rather than develop a mechanism for coordinating them; at a later stage we may institute a formal mechanism for representing these connections in the data to improve analysis. Similarly, we do not track connections between versions of published items (such as a blog post that is republished in a journal and then anthologized in a book). We also determined that some kinds of cited items did not belong in the centralized bibliography at all, the primary example being items that had only local relevance within the context of a specific DHQ article, such as personal communications ("Private email to the author, 31 May 2010" and the like). These items would remain in the separate DHQ articles and would not be aggregated centrally.

Bibliographic Identifiers in DHQ articles
Once the bulk of the data capture was complete, the next step was to establish the linkage between DHQ articles and the bibliographic items they cite.
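The markup of the examples below was lost in text extraction. Reconstructed from the prose description that follows, and with attribute placement partly conjectural, the linkage takes roughly this form:

<!-- Inline reference in the running text; @target points to the
     entry in the article's own bibliography. -->
<ptr target="#mcgann2004"/>

<!-- Local bibliography entry; @xml:id is the link target, @label
     supplies the display form, and @key (added under this project)
     points to the matching record in the centralized bibliography. -->
<bibl xml:id="mcgann2004" key="mcgann2004" label="McGann 2004">McGann, J. J. Radiant Textuality: Literature After the World Wide Web. New York: Palgrave, 2004.</bibl>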
All published DHQ articles include full bibliographies, and in our earlier practice any citations in the text pointed to entries in those bibliographies, as in the following example: Inline reference in the body of the article: Bibliography entry: 5 McGann, J. J. Radiant Textuality: Literature After the World Wide Web. New York: Palgrave, 2004. The @target attribute of the element is a local URL that points to the @xml:id attribute of the element, establishing a link between them. When the article is published, an XSLT stylesheet finds each element, follows the link and takes the value of the @label attribute to be used in the display as a link to the appropriate bibliography entry. The entry itself is transformed by the stylesheet as well to display according to the journal’s standard format: McGann 2004. McGann, J. J. Radiant Textuality: Literature After the World Wide Web. New York: Palgrave, 2004. In establishing the new system, we needed to consider both the desired endpoint of the process (a working publication in which all bibliographic data would be centralized) and also the intermediate steps, which included the need to verify the accuracy of links to the centralized bibliography, and also the need to provide a fallback in case of broken or missing data. We did not want to throw away the bibliographic data we already had in place until the very end of the process (if then). The process we followed was: 1. Create a second attribute for that would carry a pointer to the centralized bibliography, and populate it with provisional values, using the existing values of @xml:id. Since these values were based on the author and date of the item, we reasoned that those would often correctly identify the intended item in the centralized bibliography. We created a new @key attribute and globally propagated the existing value of @xml:id to @key. The existing internal pointers that link the inline references to the article’s bibliography are left in place unchanged. 2. Check for non-existent records (that is, cases where the @key value does not match any existing record in the centralized bibliography) and for incorrect links (that is, cases where the value of @key points to the wrong entry in the centralized bibliography). For this purpose we created an XSLT stylesheet that took each article’s bibliography, and for each item used its @key value to identify and pull in the matching record (if any) from the centralized bibliography. The stylesheet displayed this information in tabular form with the original entry and the matching entry side by side for comparison. It also performed a comparison of the title fields in the two entries to determine whether they were likely to represent the same bibliographic item, and it looked as well for other entries with similar titles which might be alternative matches (or possible duplicate records). Finally, it generated a color-coded border identifying probable errors: red for cases where no matching entry was found, yellow for cases where the title match was questionable, and green for entries that matched both the @key and the title 6 similarity test. Using this display, we reviewed all of the published DHQ articles, added missing entries, fixed errors, and resolved ambiguities. For purely local references (the “private email to author” case given above), we added a @key=”[unlisted]” on the to signal that no link to the centralized bibliography was needed. 3. 
Provide authors with a similar side-by-side view of the bibliography for their article, so that they have an opportunity to verify the accuracy of the data. This precaution serves as a fallback in case of oversight during what were necessarily quite repetitive and large-scale tasks (and hence prone to occasional slips). This process was not completed under the grant, but is now being undertaken by DHQ in summer 2015. 4. Update the DHQ display stylesheets so that instead of using the local bibliography for each article, they draw data from the centralized bibliography. As part of this process, we also had to develop new display logic to use the fully encoded data from the centralized database (which does not include literal punctuation such as periods, commas, quotation marks, etc. to delimit the individual fields). These updates have been completed and are awaiting the completion of the author check before we switch over to using the centralized data. We anticipate that we will be using the new system starting in fall 2015. 5. Discard the original bibliographic data? In theory, once we have been using the centralized bibliography for long enough to be comfortable that it is complete and accurate, we will have no further need for the locally encoded bibliographic data. Because the entire system is maintained under version control, we can delete this information without truly losing it, in case we need to check it or retrieve it at some future point. The final encoding looks like this: Inline reference in the body of the article: Local bibliography entry in the article: McGann, J. J. Radiant Textuality: Literature After the World Wide Web. New York: Palgrave, 2004. Remote entry in the centralized bibliography: Jerome McGann 7 Radiant Textuality: Literature After the World Wide Web New York Palgrave Macmillan 2001 Note that the internal linking between and , and the generation of a display label, is left untouched and is purely local to the article; the disambiguation of entries required in the centralized resource (e.g. “mcgann2004a”, “mcgann2004b”, etc.) is not necessary or visible within the article itself unless the article itself references more than one 2004 item for McGann. This separation of local and external ecologies had the added benefit of avoiding the necessity of updating the @target and @xml:id values, which would have added significant work and opportunities for error. Design of publication system The bibliographic data resource developed under this grant represents a new level of complexity for the DHQ publication, since it exists as a separate data set referenced from the DHQ articles, and the publication process needs to follow the bibliographic pointers from the articles to retrieve the relevant bibliographic records and incorporate them appropriately into the article’s display. Additionally, the existence of the bibliography as a distinct resource opens up possibilities for analysis of this resource in its own right. Both of these things can be accomplished using our existing architecture: XSLT stylesheets for the transformation of data from TEI into HTML, and the Apache Cocoon pipelining system to provide the overall user interaction logic, navigation, and site organization. However, the more natural tool to use as DHQ gains in complexity is an XML database through which the data could be indexed, searched, and processed more efficiently. 
We are currently exploring the use of eXist (an open-source XML database) as a next step for this project, but this carries some overhead of development and maintenance that lies outside the immediate scope of this project. Visualization experiments The final component of this project was the analysis and visualization of the bibliographic data, which was done in partnership with two groups at Indiana University. Our original plan included a collaboration Katy Börner’s research team at the Center for Network Science, and at intervals during the project we provided preliminary data sets for experimentation. Based on early discussions with the visualization team we developed a specification for exporting the combined DHQ article and bibliographic data in a spreadsheet format that supported the types of analysis we were most interested in: comparisons of DHQ articles based on co-citation, with DHQ article metadata (chiefly author affiliations and abstract) as additional facets of analysis. Later in the process, once the data capture and cleanup were close to complete, we provided a fuller data set to Scott Weingart (a member of Börner’s research group) who performed some initial analysis. Following the conclusion of the grant, we will continue to work with Weingart to take the analysis further. Because of the challenges 8 encountered earlier in the project, we did not get as far with the visualization work as we had initially hoped, but we did accomplish all of the parts that required active funding support; the foundation we have established under this grant will enable us to proceed with DHQ’s own resources. Fortuitously, we were also able to undertake a second collaboration on visualization of bibliographic data which though not formally part of this grant project is very closely tied to it. Immediately following the conclusion of the grant, in the spring semester 2015, DHQ participated as a client project in the Information Visualization MOOC offered at Indiana University, making our data available to a team of student researchers as the basis for a research project in visualization. The students developed a set of visualizations and a detailed analysis of citation patterns, and provided an extensive final report. Members of the DHQ editorial team will be collaborating with the student team to produce a co-authored article based on this report, to be published in DHQ later this year, together with the resulting visualizations. Samples are included in the appendix to this report. Reasons for changes and omissions As noted in the introduction to this report, this project deviated significantly from its original work plan. There were some modifications to the timing and duration of activities that resulted from institutional changes over which DHQ had no control: changes to staffing and level of technical support at Brown University, and the 2013 move of DHQ’s editorial operations to Northeastern as a result of Julia Flanders’ institutional move. There were also some modifications to the overall scope of the project. In our original work plan we had planned to work with arts-humanities.net (which at that time was managing a bibliographic tool as well) on shared management of bibliographic records, but arts-humanities.net ceased operations shortly after the start of this project and that collaboration was not possible. 
At a future time it may prove possible to host a contributory interface for DH bibliography, perhaps hosted through the Alliance of Digital Humanities Organizations, but that would need to be a community decision supported by community funding. In our original proposal we had also planned to include complete coverage of materials published in other DH journals (including Vectors, LLC, Digital Studies/Le Champ Numérique, and Text Technology) but the process of data capture proved more labor-intensive than we had expected and the data capture system itself did not mature technologically as we had planned (lacking anticipated support from Brown), so that processes like de-duplication were not as efficiently accommodated. At a future time we hope to have opportunities to ingest and integrate these other bibliographies, particularly if there turns out to be community support for a comprehensive bibliography of DH. Changes in Methods Involving Technology As noted in an earlier report, our original data capture system proved to have significant weaknesses. It was good at profiling data in an appropriately detailed manner, but it proved too slow for efficient use. As part of this grant, we did an extensive data profiling exercise and developed a schema that matches the MODS profile used internally within the original data capture system, but provides better constraint based 9 on specific bibliographic genres. MODS was appropriate within a web-based data capture environment, since all of the relevant constraint in that case was provided by the web form itself. However, in our new capture environment (using the Oxygen XML editor and relying on the schema to provide constraints), we needed a schema that would, for instance, stipulate that “book” items required a publisher field, whereas “blog post” items would not. The MODS schema is too permissive to provide such constraints, and it also provides very little precision in the semantics of specific elements. (For instance, a journal title is represented using a element within a element.) The data capture schema we developed provides a much simpler and more direct set of constraints for specific bibliographic genres such as books, book chapters, journal articles, conference papers, art works, blog posts, web pages, white papers, and other common forms of publication. For each genre, we identified the bibliographic elements that would be required and permitted, enabling us to establish consistency and test for missing required components. It is worth noting that this schema is intended for internal purposes, and is not intended as a quixotic attempt to create yet another perfect bibliographic data format. Our goals in modeling this data are: • To provide the constraint necessary to ensure consistency of data • To provide enough semantic explicitness to permit mapping the data onto other bibliographic formats (such as TEI, MODS, etc.) • To provide enough granularity to support the necessary display logic so that individual entries could be punctuated and formatted appropriately within the context of the DHQ publication interface In other words, we do not expect other projects to use this schema, but we do expect that we will be able to map bibliographic data in other formats onto this one when we want to ingest data from other sources, and we also expect to be able to export data from this format into other formats as needed. For the new data capture, we are using the Oxygen XML editor. 
We set up a “project” in Oxygen that permits validation, uniqueness checking, and XSLT transformations across the entire data set (which is broken up into multiple files to reduce lag). As new items are added, the system automatically runs a comparison across the data set to check for items with similar authors and titles (so as to flag potential duplicates). It also checks the uniqueness of the author-title identifier that serves as the unique key for individual entries within the system. Finally, using XSLT and CSS we can provide a basic visual display of the data when needed, e.g. for proofreading. Efforts to publicize We have publicized our goals and progress for this project at several points. An export of our journal and bibliographic data was shared with the Information Visualization MOOC held at Indiana University in 2014-15, and served as a client project for a student working group in that course. An article reporting on their analysis will be published in DHQ later in 2015. Regular reports on progress have been included in DHQ’s annual reports to the Alliance of Digital Humanities Organizations. A presentation on the 10 project was made at the DH2015 conference in Sydney, Australia in July 2015. Once we complete the final integration of the bibliographic data into DHQ’s publication interface, we will announce the completion of the project and its outcomes in a posting to the Humanist listserv, as well as via DHQ’s regular dissemination mechanisms (including Twitter and the DHQ web site). Accomplishments The accomplishments resulting from this project are as follows: 1. We digitized over 6000 bibliographic items covering all items referenced by DHQ articles, plus incomplete but substantial coverage of bibliographies from articles published in Computers and the Humanities and Literary and Linguistic Computing. Our original goal was to capture all bibliographies from CHum and LLC, plus conference proceedings from the DH conferences, but we were unable to get this data in a form we could easily convert and import, and it was not practical to capture it or convert it by hand. 2. We developed a schema for DHQ’s bibliographic data, which is fine-grained enough to support export into other bibliographic formats (such as MODS or TEI). 3. We developed a set of additional tests and quality assurance mechanisms using Schematron and XSLT that support de-duplication and data integrity checking as part of DHQ’s regular publication work flow. 4. We developed display stylesheets to support the integration of centralized bibliographic data into the DHQ publication interface. 5. In partnership with researchers at Indiana University (both within the Center for Network Science and through the Information Visualization MOOC), we developed visualizations that exploit the DHQ article metadata and bibliographic data. Following the completion of the grant, we plan the following additional work: 1. Continue to expand the centralized bibliography as new DHQ articles are published; resources permitting, expand the bibliography by ingesting or capturing additional records (e.g. from the DH Conference Abstracts database, or from other journals). 2. Develop further visualizations as we expand our metadata. For instance, we are now working on adding topical keywords to DHQ articles, and these would support visualizations showing the citation patterns of articles on specific topics. 3. Integrate a dynamic bibliographic visualization into the DHQ web site. 
This will require that we serve the bibliographic data dynamically from an XML database, so that users can interact with it. 4. Make the bibliographic data available for public download so that others can experiment with it; eventually, we plan to develop an API to the bibliographic data to facilitate experimentation. 11 5. Develop an interface to the bibliography itself, so that readers can search, sort, and view items and learn more about citation and publication practices in digital humanities. As the field continues to develop, this bibliography will become an important instrument for studying the history of the field through its publications. 6. Implement authority control for the major informational components of these records (such as author names, publishers, and locations) to enhance consistency and ease data entry. Audiences One primary audience for this work is DHQ’s existing readership, who will receive the bibliographic data seamlessly integrated into the DHQ interface. These readers will benefit from greater consistency in formatting and presentation of the data, and also from greater accuracy in the citations (since authors often omit or misstate specific pieces of bibliographic information and these errors are not always caught prior to publication). Another related audience is the members of the DH community who are interested in learning about the DH field through its patterns of citation and publication practices. This audience will be able to get a more detailed view of the field through the ability to query and analyze the bibliography. As the bibliography continues to grow, this audience will have an increasingly rich resource to work with. Providing the data for download and via an API will serve the smaller sector of this audience who are interested in doing their own data analysis. Finally, an important “audience” for this work is DHQ’s own internal community, especially including our production team. One primary motivation for this project was to eliminate duplication of data and to implement a more streamlined, data-driven approach to the bibliographic aspects of our publication. While this new system will not hugely reduce the overall work involved, it will shift the emphasis of that work from tasks that are annoying and demoralizing (i.e. copyediting of bibliographic minutiae) to tasks that contribute to the growth of knowledge in the field (i.e. enhancing the bibliographic data itself). Evaluation As the introductory section of this report illustrates, the design and planning of this project contained several significant weaknesses, most notably an over-reliance on a tool for which we could not take technical responsibility. It also suffered from a lack of strong project management as a result of the fact that the principal investigator was overseeing several other grant-funded initiatives and other projects. These are both classic difficulties for digital humanities projects, but knowing about these risks in advance would not necessarily have enabled us to avoid them; the reason we chose to use Brown’s bibliographic tool was its ease of use, proximity, and fitness for purpose; alternatives we considered all would have been either more expensive (i.e. out of scope for the project) or much less well adapted for the work. And at the time that we submitted the application, the other grants that competed for the principal investigator’s 12 attention had not been awarded. On balance, we made the best decisions we could at the time. 
One of the project’s most significant strengths has been its ability to draw on deep expertise from the DHQ editorial team, which in turn derives partly from the fact that the focus of the project was on intelligent data modeling rather than on simple data capture. All of the editors approached the project as being in part an investigation of DHQ’s citation universe, an unknown terrain to us and one in which we have an intense interest. The opportunity to inventory and model the range of cited materials— including everything from journal articles and book chapters to white papers, official reports, legal cases, private communications, tweets, blog posts, works of electronic literature, computer code, games, conference abstracts, works of fiction, manuscripts, newspaper articles, and dictionary entries—provided remarkable insight into the emergence of DH as a field and also into our own thinking about the mechanisms and purposes of scholarly citation. The editors also have a shared interest in data manipulation and data-driven work flows, so the practical challenges of the project (such as mechanisms for intelligent de-duplication) were framed as opportunities for the exercise of ingenuity. These motivations and interests continue to sustain this project after the conclusion of the grant funding. We also anticipate that the strong modeling of this data will make it more useful to third-party researchers. Grant Products, Continuing Work, and Impact The most significant product arising from this grant is the bibliography itself, which is integrated into the DHQ interface but whose data can also be downloaded from the DHQ site. A secondary product is the set of supporting tools and systems (schemas, XSLT stylesheets, work flow) that enable DHQ to maintain and further develop this bibliography and its functions within the DHQ ecosystem. Another secondary product is the visualizations (and the analytic logic underlying them) that reveal patterns within the DHQ citations. This project has a strong future trajectory for DHQ. One outcome of this project is a working system for bibliographic management in DHQ, and DHQ will now continue to use this system as part of our regular production work flow; hence we will naturally continue to expand the bibliography and groom it for quality. In addition, because DHQ is strongly committed to exploiting the journal’s XML data and demonstrating the value of this data-driven approach to journal publishing, we will be seeking opportunities for further enhancements to both the data and the systems by which we expose it. As noted above, we plan a number of ongoing activities to bring this phase of development to completion. In addition, there are some longer-term projects that may arise from this work. In particular, we plan to solicit proposals for ways to exploit and analyze DHQ’s data (including bibliographic data), possibly through microgrants in partnership with ADHO, and also through curricular opportunities such as the IVMOOC program mentioned above. The long-term impact of this project on DHQ itself is likely to be very significant. As noted above, our previous system of bibliographic information was labor-intensive (since it required our encoding staff to copyedit and correct not only the content of each 13 citation, but also its punctuation and formatting which frequently diverged from DHQ’s requested format) and duplicative (since many DHQ articles cite the same sources). 
Centralizing the bibliography not only does away with the most onerous parts of this work but also eliminates the duplication of information and the informational embarrassment of having the same work cited in different ways (since even conscientious authors may make different decisions concerning the inclusion of specific information, particularly in the case of less familiar genres such as white papers or conference proceedings). The satisfaction of maintaining a growing bibliography makes the labor of adding new entries much more tolerable. In addition, this data constitutes an important information resource that has great potential to enhance the DHQ interface. For example, we can enable readers of a given article to choose an item from its bibliography and discover all other DHQ articles that also cite the item, or to discover affinities between groups of DHQ articles based on their citation networks. Moreover, when we are able to expose this data to the public via an API, third party researchers may find additional ways to exploit the data (perhaps combining it or comparing it with other discipline-specific bibliographies). Through its impact on the DHQ interface and its potential to provide a valuable data resource to the public, this project raises DHQ’s visibility in the digital humanities community and in related fields such as network science. Finally and perhaps most importantly, this project accomplished a task which can only be accomplished with funded labor, but which (once completed) lays the foundation for additional work that is interesting and lightweight enough to be done by volunteers or with small-scale funding such as microgrants. It thus served as a kind of gateway or enabling step which provides impetus for a much larger set of long-term effects Appendices The appendices include the following items: 1. An XML code sample showing representative bibliographic entries encoded using the DHQ bibliographic markup. 2. A sample DHQ article encoded for publication, with a full bibliography showing the use of @key to point to the central bibliography (including handling of unlisted entries). 3. A screen shot of the side-by-side comparison view used to identify mismatched bibliographic entries during the deduplication and error correction phase of the project. 4. Internal documentation for the extraction and encoding of bibliographies from DHQ articles. 5. A final report by members of the IVMOOC working group describing their analysis of the DHQ bibliographic data. 6. The text and slides for a paper on DHQ (mentioning but not focused primarily on the bibliographic project) presented at DH2015 in Australia: “Challenges of an XML-based Open-Access Journal: Digital Humanities Quarterly,” Julia Flanders, John Walsh, Wendell Piez, Melissa Terras. The text of this paper has been revised based on commentary and discussion in the conference session. Appendix 1: XML Code Sample This appendix contains an XML code sample showing representative bibliographic entries encoded using the DHQ bibliographic markup. The first set represent genres in common usage. The second set represent genres for which we are still considering the requirements and definitions. Kate Armstrong Grafik Dynamo 2005 http://www.turbulence.org/Works/dynamo/ Humanities Blast Digital Humanities Manifesto 2.0 2009 http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf Grant Morrison J.G. 
Jones Marvel Boy #4 Marvel Boy Marvel Comics November 2000 Sharon Macdonald Introduction Sharon Macdonald Poetics of Display London Routledge 1998 Catherine C. Marshall Toward an ecology of hypertext annotation HyperText 98 1998 ACM 40 49 Moira MacDonald Data Storage Policy Can't Be Enforced University Affairs 4 June 2007 http://www.universityaffairs.ca/data-storage-policy-cant-be-enforced.aspx V. Martiradonna La codifica elettonica dei testi. Un caso di studio Tesi di laurea in Lettere, Facoltà di Scienze Umanistiche, Università di Roma La Sapienza 2003-2004 Relatore: D. Fiormonte. Nick Montfort Ad Verbum 2000 http://www.wurb.com/if/game/912 Jimmy Maher Review of Rune Berg’s The Isle of the Cult 2005 http://www.sparkynet.com/spag/i.html#isle The Castle of Perseverance Map The Castle of Perserverance MS. Folger Shakespeare Library, Washington. Shelfmark V.a.354. 191v. Image ID 1207-42. Baker v. Selden 1879 101 U.S. 99 U.S. Constitution, Article 1, Section 8 Institute for Advanced Technologies in the Humanities (IATH) NEH Proposal SNAC: The Social Networks and Archival Context Project. http://socialarchive.iath.virginia.edu/NEH_proposal_narrative.pdf Accessed April 15, 2012 Melissa Terras he Researching e-Science Analysis of Census Holdings Project: Final Report to AHRC 2006 www.ucl.ac.uk/reach/ AHRC e-Science Workshop scheme Appendix 2: Sample DHQ Article This appendix contains a sample DHQ article encoded for publication, with a full bibliography showing the use of @key to point to the central bibliography (including handling of unlisted entries using @key=”[unlisted]”). The Technical Evolution of Vannevar Bush’s Memex Belinda Barnet Belinda Barnet Swinburne University of Technology, Melbourne belinda.barnet at gmail.com
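The XML tagging of the samples above did not survive text extraction. For orientation, the first entry would originally have had roughly this shape (element names conjectural, following the genre fields described earlier):

<artwork xml:id="armstrong2005">
  <author>Kate Armstrong</author>
  <title>Grafik Dynamo</title>
  <date>2005</date>
  <url>http://www.turbulence.org/Works/dynamo/</url>
</artwork>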

Belinda Barnet is Lecturer in Media and Communications at Swinburne University, Melbourne. Prior to her appointment at Swinburne she worked at Ericsson Australia, where she managed the development of 3G mobile content services and developed an obsession with technical evolution. Belinda did her PhD on the history of hypertext at the University of New South Wales, and has research interests in digital media, digital art, convergent journalism and the mobile internet. She has published widely on new media theory and culture.

[TEI header of the encoded sample: publication metadata (DHQ article 000015, volume 1, issue 2, 21 June 2008, genre: article); encoding note "Authored for DHQ; migrated from original DHQauthor format"; keyword notes (DHQ classification scheme; author-supplied keywords, no controlled vocabulary); and the file's revision history.]

This article describes the evolution of the design of Vannevar Bush's Memex, tracing its roots in Bush's earlier work with analog computing machines, and his understanding of the technique of associative memory. It argues that Memex was the product of a particular engineering culture, and that the machines that preceded Memex — the Differential Analyzer and the Selector in particular — helped engender this culture, and the discourse of analogue computing itself.

Can we say that technical machines have their own genealogies, their own evolutionary dynamic?

Introduction: Technical Evolution
The key difference [between material cultural evolution and biological evolution] is that biological systems predominantly have vertical transmission of genetically ensconced information, meaning parents to offspring… Not so in material cultural systems, where horizontal transfer is rife — and arguably the more important dynamic.
(Paleontologist Dr. Niles Eldredge, interview with the author)

Since the early days of Darwinism, analogies have been drawn between biological evolution and the evolution of technical objects and systems. It is obvious that technologies change over time; we can see this in the fact that technologies come in generations; they adapt and adopt characteristics over time, one suppressing the other as it becomes obsolete. The technical artefact constitutes a series of objects, a lineage or a line. From the middle of the nineteenth century on, writers have been remarking on this basic analogy – and on the alarming rate at which technological change is accelerating. But as Eldredge points out, the analogy can only go so far; technological systems are not like biological systems in a number of important ways, most obviously the fact that they are the products of conscious design. Unlike biological organisms, technical objects are invented.

Inventors learn by experience and experiment, and they learn by watching other machines work in the form of technical prototypes. They also copy and transfer ideas and techniques between machines, co-opting innovations at a whim. Technological innovation thus has Lamarckian features, which are forbidden in biology. Inventors can borrow ideas from contemporary technologies, or even from the past. There is no extinction in technological evolution: ideas, designs and innovations can be co-opted and transferred both retroactively and laterally. This retroactive and lateral transfer of innovations is what distinguishes technical evolution from biological evolution, which is characterised by vertical transfer (parents to offspring). As the American paleontologist Niles Eldredge observed in an interview with the author,

Makers copy each other, patents affording only fleeting protection. Thus, instead of the neatly bifurcating trees [you see in biological evolution], you find what is best described as "networks", consisting of an historical signal of what came before what, obscured often to the point of undetectability by this lateral transfer of subsequent ideas.
(Niles Eldredge, interview with the author)

Can we say that technical machines have their own genealogies, their own evolutionary dynamic? It is my contention that we can, and I have argued elsewhere that in order to tell the story of a machine, one must trace the path of these transferrals, paying particular attention to technical prototypes and also to techniques, or ways of doing things. A good working prototype can send shockwaves throughout an engineering community, and often inspires a host of new machines in quick succession. Similarly, an effective technique (for example, storing and retrieving information associatively) can spread between innovations rapidly.

In this article I will be telling the story of a particular technical machine – Vannevar Bush's Memex. Memex was an electro-mechanical device designed in the 1930s to provide easy access to information stored associatively on microfilm. It is often hailed as the precursor to hypertext and the web. Linda C. Smith undertook a comprehensive citation context analysis of literary and scientific articles produced after the 1945 publication of Bush's article on the device, "As We May Think", in the Atlantic Monthly. She found that there is a conviction, without dissent, that modern hypertext is traceable to this article. In each decade since the Memex design was published, commentators have not only lauded it as a vision, but also asserted that technology [has] finally caught up with this vision. For all the excitement, it is important to remember that Memex was never actually built; it exists entirely on paper. Because the design was first published in the summer of 1945, at the end of a war effort and with the birth of computers, theorists have often associated it with the post-War information boom. In fact, Bush had been writing about it since the early 1930s, and the Memex paper went through several different versions.

The social and cultural influence of Bush's inventions is well known, and his political role in the development of the atomic bomb is also well known. What is not so well known is the way the Memex came about as a result of both Bush's earlier work with analog computing machines, and his understanding of the mechanism or technique of associative memory. I would like to show that Memex was the product of a particular engineering culture, and that the machines that preceded Memex — the Differential Analyzer and the Selector in particular — helped engender this culture, and the discourse of analogue computing, in the first place. The artefacts of engineering, particularly in the context of a school such as MIT, are themselves productive of new techniques and new engineering paradigms. Prototype technologies create cultures of use around themselves; they create new techniques and new methods that were unthinkable prior to the technology. This was especially so for the Analyzer.

In the context of the early 20th-century engineering school, the analyzers were not only tools but paradigms, and they taught mathematics and method and modeled the character of engineering.

Bush transferred technologies directly from the Analyzer and also the Selector into the design of Memex. I will trace this transfer in the first section. He also transferred an electro-mechanical model of human associative memory from the nascent science of cybernetics, which he was exposed to at MIT, into Memex. We will explore this in the second section. In both cases, we will be paying particular attention to the structure and architecture of the technologies concerned.

The idea that technical artefacts evolve in this way, by the transfer of both technical innovations (for example, microfilm) and techniques (for example, association as a storage technique), was popularised by French technology historian Bertrand Gille. I will be mobilising Gille’s theories here as I trace the evolution of the Memex design. We will begin with Bush’s first analogue computer, the Differential Analyzer.

The Analyzer and the Selector

The Differential Analyzer was a giant, electromechanical gear and shaft machine which was put to work during the war calculating artillery ranging tables and the profiles of radar antennas. In the late 1930s and early 1940s, it was the most important computer in existence in the US. Before this time, the word computer had meant a large group of mostly female humans performing equations by hand or on limited mechanical calculators. The Analyzer evaluated and solved these equations by mechanical integration. It created a small revolution at MIT. Many of the people who worked on the machine (e.g. Harold Hazen, Gordon Brown, Claude Shannon) later made contributions to feedback control, information theory, and computing. The machine was a huge success which brought prestige and a flood of federal money to MIT and Bush.

However, by the spring of 1950, the Analyzer was gathering dust in a storeroom — the project had died. Why did it fail? Why did the world’s most important analogue computer end up in a back room within five years? This story will itself be related to why Memex was never built; research into analogue computing technology in the interwar years, the Analyzer in particular, contributed to the rise of digital computing. It demonstrated that machines could automate the calculus, that machines could automate human cognitive techniques.

The decade between the Great War and the Depression was a bull market for engineering. Enrolment in the MIT Electrical Engineering Department almost doubled in this period, and the decade witnessed the rapid expansion of graduate programs. The interwar years found corporate and philanthropic donors more willing to fund research and development within engineering departments, and there were serious problems to be worked on generated by communications failures during the Great War. In particular, engineers were trying to predict the operating characteristics of power-transmission lines, long-distance telephone lines, commercial radio and other communications technologies (Beniger calls this the early period of the Control Revolution). MIT's Engineering Department undertook a major assault on the mathematical study of long-distance lines.

Of particular interest to the engineers was the Carson equation for transmission lines. This was a simple equation, but it required intensive mathematical integration to solve.

Early in 1925 Bush suggested to his graduate student Herbert Stewart that he devise a machine to facilitate the recording of the areas needed for the Carson equation … [and a colleague] suggested that Stewart interpret the equation electrically rather than mechanically.

So the equation was transferred to an electro-mechanical device: the Product Integraph. Many of the early analogue computers that followed Bush's machines were designed to automate the solution of existing mathematical equations. This particular machine physically mirrored the equation itself. It incorporated a mechanical integrator to record the areas under the curves (and thus the integrals), which was

… in essence a variable-speed gear, and took the form of a rotating horizontal disk on which a small knife-edged wheel rested. The wheel was driven by friction, and the gear ratio was altered by varying the distance of the wheel from the axis of rotation of the disk.
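In modern notation (a reconstruction, not a formula from Bush's papers), the integrator's behaviour can be stated compactly: if the disc turns through an angle θ while the wheel, of radius a, sits at a distance x from the disc's axis, the wheel's accumulated rotation y is

\[
y(\theta) = \frac{1}{a}\int_{0}^{\theta} x(\tau)\,\mathrm{d}\tau,
\]

so that driving x with the height of a curve while the disc tracks the horizontal variable makes y the running area under that curve.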

A second version of this machine incorporated two wheel-and-disc integrators, and it was a great success. Bush observed the success of the machine, and particularly the later incorporation of the two wheel-and-disc integrators, and decided to make a larger one, with more integrators and a more general application than the Carson equation. By the fall of 1928, Bush had secured funds from MIT to build a new machine. He called it the Differential Analyzer, after an earlier device proposed by Lord Kelvin which might externalise the calculus and mechanically integrate its solution.

As Bertrand Gille observes, a large part of technical invention occurs by transfer, whereby the functioning of a structure is analogically transposed onto another structure, or the same structure is generalised outwards. This is what happened with the Analyzer — Bush saw the outline of such a machine in the Product Integraph. The Differential Analyzer was rapidly assembled in 1930, and part of the reason it came together so quickly was that it incorporated a number of existing engineering developments, particularly a device called a torque amplifier, designed by Niemann. But the disc integrator, a technology borrowed from the Product Integraph, was the heart of the Analyzer and the means by which it performed its calculations. When combined with the torque amplifier, the Analyzer was essentially an elegant, dynamical, mechanical model of the differential equation. Although Lord Kelvin had suggested such a machine previously, Bush was the first to build it on such a large scale, and it happened at a time when there was a general and urgent need for such precision.
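The principle scales: interconnect several integrators and feed outputs back into inputs, and the assembly becomes a dynamical model of a differential equation. A minimal software sketch of this method (an illustration in modern terms, not a simulation of the actual machine) chains two integrators to solve y'' = -y:

def integrate(acc, y0, v0, dt=0.001, steps=10000):
    # Two integrators in series, Euler-style: y'' -> y' -> y. The output y
    # is fed back into the input through acc, mirroring the Analyzer's
    # interconnected shafts.
    y, v = y0, v0
    trace = []
    for _ in range(steps):
        a = acc(y)        # the shaft carrying y'' (here y'' = -y)
        v += a * dt       # first integrator: y'' accumulates into y'
        y += v * dt       # second integrator: y' accumulates into y
        trace.append(y)
    return trace

trace = integrate(lambda y: -y, y0=1.0, v0=0.0)
# y(t) traces out cos(t): after one full period (t = 2*pi, step ~6283)
# the value returns close to 1.0.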

In engineering science, there is an emphasis on working prototypes or deliverables. As Professor of Computer Science Andries van Dam put it in an interview with the author, when engineers talk about work, they mean work in the sense of machines, software, algorithms, things that are concrete. This emphasis on concrete work was the same in Bush's time. Bush had delivered something which had previously only been dreamed about; this meant that others could come to the laboratory and learn by observing the machine, by watching it integrate, by imagining other applications. A working prototype is different to a dream or white paper — it actually creates its own milieu; it teaches those who use it about the possibilities it contains and its material technical limits. Bush himself recognised this, and believed that those who used the machine acquired what he called a mechanical calculus, an internalised knowledge of the machine. When the army wanted to build their own machine at the Aberdeen Proving Ground, he sent them a mechanic who had helped construct the Analyzer. The army wanted to pay the man machinist's wages; Bush insisted he be hired as a consultant. "I never consciously taught this man any part of the subject of differential equations; but in building that machine, managing it, he learned what differential equations were himself … [it] was interesting to discuss the subject with him because he had learned the calculus in mechanical terms — a strange approach, and yet he understood it. That is, he did not understand it in any formal sense, he understood the fundamentals; he had it under his skin" (Bush 1970, 262, cited in Owens 1991, 24).

Watching the Analyzer work did more than just teach people about the calculus. It also taught people about what might be possible for mechanical calculation — for analogue computers. Several laboratories asked for plans, and duplicates were set up at the US Army's Ballistic Research Laboratory in Maryland and at the Moore School of Electrical Engineering at the University of Pennsylvania. The machine assembled at the Moore School was much larger than the MIT machine, and its engineers had the advantage of being able to learn from the mistakes and limits of the MIT machine. Bush also created several more Analyzers, and in 1936 the Rockefeller Foundation awarded MIT $85,000 to build the Rockefeller Differential Analyzer. This provided more opportunities for graduate research, and brought further prestige and funding to MIT.

But what is interesting about the Rockefeller Differential Analyzer is what remained the same. Electrically or not, automatically or not, the newest edition of Bush’s analyzer still interpreted mathematics in terms of mechanical rotations, still depended on expertly machined wheel-and-disc integrators, and still drew its answers as curves.

Its technical processes remained the same. It was an analogue device, and it literally turned around a central analogy: the rotation of the wheel shall be the area under the graph (and thus the integral). The Analyzer directly mirrored the task at hand; there was a mathematical transparency to it which at once held observers captive and promoted, in its very workings, the language of early 20th-century engineering. Visitors to the lab, among them military and corporate representatives, would watch the machine turn through its motions. It seemed the adumbration of future technology. Harold Hazen, the head of the Electrical Engineering Department in 1940, predicted the Analyzer would mark the start of a new era in mechanized calculus (Hazen 1940, 101, cited in Owens 1991, 4). Analogue technology held much promise, especially for military computation — and the Analyzer had created a new era. The entire direction and culture of the MIT lab changed around this machine, which was used to woo sponsors. In the late 1930s the department became the Center of Analysis for Calculating Machines.

Many of the Analyzers built in the 1930s were built using military funds. The creation of the first Analyzer, and Bush's promotion of it as a calculation device for ballistic analysis, had created a link between the military and engineering science at MIT which was to endure for over thirty years. Manuel De Landa (1994) puts great emphasis in his work on this connection, particularly as it was further developed during WWII. As he puts it, Bush created a bridge between the engineers and the military; he connected scientists to the blueprints of generals and admirals, and this relationship would grow immeasurably stronger during WWII. Institutions that had previously occupied exclusive ground, such as physics and military intelligence, had begun communicating in the late 1930s; these were communities often suspicious of one another: the inventors and scientists on one side and the warriors on the other.

This paper has been arguing that the Analyzer qua technical artefact accomplished something equally important: as a prototype, it demonstrated the potential of analogue computing technology for analysis, and engendered an engineering culture around itself that took the machine to be a teacher. This is why, even after its obsolescence, the Analyzer was kept around at MIT for its educational value. It demonstrated that machines could automate the calculus, and that machines could mirror human tasks in an elegant fashion: something which required proof in steel and brass. The aura generated by the Analyzer as prototype was not lost on the military.

In 1935, the Navy came to Bush for advice on machines to crack coding devices like the new Japanese cipher machines. They wanted a long-term project that would give the United States the most technically advanced cryptanalytic capabilities in the world: a super-fast machine to count the coincidences of letters in two messages or copies of a single message. Bush assembled a research team for this project that included Claude Shannon, one of the early information theorists and a significant part of the emerging cybernetics community.

There were three new technologies emerging at the time which handled information: photoelectricity, microfilm and digital electronics.

All three were just emerging, but, unlike the fragile magnetic recording his students were exploring, they appeared to be ready to use in calculation machines. Microfilm would provide ultra-fast input and inexpensive mass-memory, photoelectricity would allow high-speed sensing and reproduction, and digital electronics would allow astonishingly fast and inexpensive control and calculation.

Bush transferred these three technologies to the new design. This decision was not pure genius on his part; they were perfect analogues for a popular conception of how the brain worked at the time. The scientific community at MIT was developing a pronounced interest in man-machine analogues, and although Claude Shannon had not yet published his information theory, it was already being formulated, and there was much discussion around MIT about how the brain might process information in the manner of an analogue machine. Bush thought and designed in terms of analogies between brain and machine, electricity and information. This was also the central research agenda of Norbert Wiener and Warren McCulloch, both at MIT, who were at the time working on parallels they saw between neural structure and process and computation. To Bush and Shannon, microfilm and photoelectricity seemed perfect analogues to the electrical relay circuits and neural substrates of the human brain and their capacities for managing information.

Bush called this machine the Comparator — it was to do the hard work of comparing text and letters for the humble human mind. Like the analytic machines before it and all other technical machines being built at the time, this was an analogue device; it directly mirrored the task at hand on a mechanical level. In this case, it directly mirrored the operations of searching and associating on a mechanical level, and, Bush believed, it mirrored the operations of the human mind and memory. Bush began the project in mid-1937, while he was working on the Rockefeller Analyzer, and agreed to deliver a code-cracking device based on these technologies by the next summer.

But immediately, there were problems in its development. Technical objects often depart from their fabricating intention: sometimes because they are used differently from how they were invented to be used, and sometimes because the technology itself breaks down. Microfilm did not behave the way Bush wanted it to. As a material it was very fragile, sensitive to light and heat, and tore easily; it had too many bugs. It was decided to use paper tape with minute holes, although paper was only one-twentieth as effective as microfilm. There were subsequent problems with this technology as well — paper itself is flimsy, and it refused to run intact for long periods. There were also problems shifting the optical reader between the two message tapes. Bush was working on the Analyzer at the time, and didn't have the resources to fix these components effectively. By the time the Comparator was turned over to the Navy, it was very unreliable, and didn't even start up when it was unpacked in Washington. The Comparator prototype ended up gathering dust in a Navy storeroom, but much of its architecture was transferred to subsequent designs.

By this time, Bush had also started work on the Memex design. He transferred much of the architecture from the Comparator, including photoelectrical components, an optical reader and microfilm. In tune with the times, Bush had developed a fascination for microfilm in particular as an information storage technology, and although it had failed to work properly in the Comparator, he wanted to try it again. It would appear as the central technology in the Rapid Selector and also in the Memex design.

In the 1930s, many believed that microfilm would make information universally accessible and thus spark an intellectual revolution. Like many others, Bush had been enthusiastically exploring its potential in his writing as well as in the Comparator; the Encyclopaedia Britannica could be reduced to the volume of a matchbox, he wrote, and a library of a million volumes could be compressed into one end of a desk. In 1938, H.G. Wells even wrote about a Permanent World Encyclopaedia or Planetary Memory that would carry all the world's knowledge. It was based on microfilm.

By means of microfilm, the rarest and most intricate documents and articles can be studied now at first hand, simultaneously in a score of projection rooms. There is no practical obstacle whatever now to the creation of an efficient index to all human knowledge, ideas, achievements, to the creation, that is, of a complete planetary memory for all mankind. (Wells 1938)

Microfilm promised faithful reproduction as well as miniaturisation. It was state-of-the-art technology, and not only did it seem the perfect analogy for material stored in the neural substrate of the human brain, it seemed to have a certain permanence the brain lacked. Bush put together a proposal for a new microfilm selection device, based on the architecture of the Comparator, in 1937. Its stated research agenda and intention was

Construction of experimental equipment to test the feasibility of a device which would search reels of coded microfilm at high speed and which would copy selected frames on the fly, for printout and use. Investigation of the practical utility of such equipment by experimental use in a library. Further development aimed at exploration of the possibilities for introducing such equipment into libraries generally. (Bagg and Stevens 1961, cited in Nyce and Kahn 1991, 41)

Corporate funding was secured for the Selector by pitching it as a microfilm machine to modernise the library. Abstracts of documents were to be captured by this new technology and reduced in size by a factor of 25. As with the Comparator, long rolls of this film were to be spun past a photoelectric sensing station. If a match occurred between the code submitted by a researcher and the abstract codes attached to this film, the researcher was presented with the article itself and any articles previously associated with it. The Selector was to be used in a public library, and, unlike his nascent idea concerning Memex, Bush wanted to tailor it to commercial and government record-keeping markets.

Bush considered the Selector a step towards the mechanised control of scientific information, which was of immediate concern to him as a scientist. According to him, the fate of the nation depended on the effective management of these ideas lest they be lost in a brewing data storm. Progress in information management was not only inevitable, it was essential "if the nation is to be strong". This was his fabricating intention. He had been looking for support for a Memex-like device for years, but after the failure of the Comparator, finding funds for this library of the future was very hard. Then in 1938, Bush received funding from the National Cash Register Company and the Eastman Kodak Company for the development of an apparatus for rapid selection, and he began to transfer the architecture from the Comparator across to the new design.

But as Burke writes, the technology of microfilm and the tape-scanners began to impose their technical limitations: "[a]lmost as soon as it was begun, the Selector project drifted away from its original purpose and began to show some telling weaknesses". Bush planned to spin long rolls of 35mm film containing the codes and abstracts past a photoelectric sensing station so fast, at speeds of six feet per second, that 60,000 items could be tested in one minute. This was at least one hundred and fifty times faster than the mechanical tabulator.
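A rough check of these figures (my arithmetic, not Burke's): six feet per second is about 1.83 metres of film per second, and 60,000 items per minute is 1,000 items per second, so

\[
\frac{1.83\ \text{m/s}}{1000\ \text{items/s}} \approx 1.8\ \text{mm per item},
\]

leaving each coded abstract less than two millimetres of film in which to register a match; some measure of the optical and timing precision the design demanded.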

The Selector's scanning station was similar to that used in the Comparator. But in the Selector, the card containing the code of interest to the researcher would be stationary. Bush and others associated with the project were so entranced with the speed of microfilm tape that little attention was paid to coding schemes, and when Bush handed the project over to three of his researchers, John Howard, Lawrence Steinhardt and John Coombs, it was floundering. After three more years of intensive research and experimentation with microfilm, Howard had to inform the Navy that the machine would not work. Microfilm, claimed Howard, would deform at such speeds and could not be aligned so that coincidences could be identified. Microfilm warps under heat, and it cannot take great strain or tension without distorting.

Solutions were suggested (among them slowing down the machine, and checking abstracts before they were used), but none of these were particularly effective, and a working machine wasn't ready until the fall of 1943. At one stage, because of an emergency problem with Japanese codes, it was rushed to Washington — but because it was so unreliable, it went straight back into storage. So many parts were pulled out that the machine was never again operable. In 1998, the Selector made Bruce Sterling's Dead Media List, consigned forever to a lineage of failed technologies. Microfilm did not behave the way Bush and his team wanted it to. It had its own material limits, and these didn't support speed of access.

In the evolution of any machine, there will be internal limits generated by the behaviour of the technology itself; Gille calls these endogenous limits. Endogenous limits are encountered only in practice — they affect the actual implementation of an idea. In engineering practice, these failures can also teach inventors about the material potentials of the technology. The Memex design altered significantly through the 1950s; Bush had learned from the technical failures he was encountering. Most noticeable of all, Bush stopped talking about microfilm and about hardware.

By the 1960s the project and machine failures associated with the Selector, it seems, made it difficult for Bush to think about Memex in concrete terms.

The Analyzer, meanwhile, was being used extensively during WWII for ballistic analysis and calculation. Wartime security prevented its public announcement until 1945, when it was hailed by the press as a great electromechanical brain ready to advance science by freeing it from the pick-and-shovel work of mathematics (Life magazine, cited in Owens 1991, 3). It had created an entire culture around itself. But by the mid-1940s, the enthusiasm had died down; the machine seemed to pale beside the new generation of digital machines. The war had also released an unprecedented sum of money into MIT and spawned numerous other new laboratories. It ushered in a variety of new computation tasks, in the fields of large-volume data analysis and real-time operation, which were beyond the capacity of the Rockefeller instrument. By 1950, the Analyzer had become an antique, consigned to back-room storage.

What happened? The reasons the Analyzer fell into disuse were quite different from those behind the Selector's failure; its limits were exogenous to the technical machine itself. They were related to a fundamental paradigm shift within computing, from analogue to digital. According to Gille, the birth of a new technical system is rapid and unforeseeable; new technical systems are born within the limits of the old, and the period of change is brutal, fast and discontinuous. In 1950, Warren Weaver and Samuel Caldwell met to discuss the Analyzer and the analogue computing program it had inspired at MIT, a large program which had become out of date more swiftly than anyone could have imagined. They noted that in 1936, no one could have expected that within ten years the whole field of computer science would so quickly overtake Bush's project (Weaver and Caldwell). Bush, and the department at MIT which had formed itself around the Analyzer and analogue computing, had been left behind.

I do not have the space here to trace the evolution of digital computing at this time in the US and the UK — excellent accounts have already been written elsewhere. All we need to realise at this point is that the period between 1945 and 1967, the years between the publication of the first and final versions of the Memex essays respectively, had witnessed enormous change. The period saw not only the rise of digital computing, beginning with the construction of a few machines in the post-war period and developing into widespread mainframe processing for American business; it also saw the explosive growth of commercial television and the beginnings of satellite broadcasting. As Beniger sees it, the world had discovered information as a means of control.

It is important to understand, however, that Bush was not a part of this revolution. He had not been trained in digital computation or information theory, and knew little about the emerging field of digital computing. He was immersed in a different technical system: analogue machines interpreted mathematics in terms of mechanical rotations, treated storage and memory as a physical holding of information, and drew their answers as curves. They directly mirrored the operations of the calculus. Warren Weaver expressed his regret over the passing of analogue machines and the Analyzer in a letter to the director of MIT's Center of Analysis: "It seems rather a pity not to have around such a place as MIT a really impressive Analogue computer; for there is a vividness and directness of meaning of the electrical and mechanical processes involved ... which can hardly fail, I would think, to have a very considerable educational value" (Weaver, cited in Owens 1991, 5).

The passing away of analogue computing was the passing away of an ethos: machines as mirrors of mathematical tasks. But Bush and Memex remained in the analogue era; in all versions of the Memex essay, his goal remained the same: he sought to develop a machine that mirrored and recorded the patterns of the human brain, even when this era of direct reflection and analogy in mechanical workings had passed.

Technological evolution moves faster than our ability to adjust to its changes. More precisely, it moves faster than the techniques that it engenders and the culture it forms around itself. Bush expressed some regret over this speed of passage near the end of his life, or, perhaps, sadness over the obsolescence of his own engineering techniques.

The trend had turned in the direction of digital machines, a whole new generation had taken hold. If I mixed with it, I could not possibly catch up with new techniques, and I did not intend to look foolish.
Human Associative Memory and Biological-Mechanical Analogues

There is another revolution under way, and it is far more important and significant than [the industrial revolution]. It might be called the mental revolution.

We now turn to Bush's fascination with, and exposure to, new models of human associative memory gaining currency in his time. Bush thought and designed his machines in terms of biological-mechanical analogues; he sought a symbiosis between natural human thought and his thinking machines.

As Nyce and Kahn observe, in all versions of the Memex essay (1939, 1945, 1967), Bush begins his thesis by explaining the dire problem we face in confronting the great mass of the human record, criticising the way information was then organised. He then goes on to explain why this form of organisation doesn't work: it is artificial. Information should be organised by association — this is how the mind works. If we fashion our information systems after this mechanism, they will be truly revolutionary.

Our ineptitude at getting at the record is largely caused by the artificiality of systems of indexing. When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. It can only be found in one place, unless duplicates are used; one has to have rules as to which path will locate it, and the rules are cumbersome. Having found one item, moreover, one has to emerge from the system and re-enter on a new path.

The human mind does not work that way. It operates by association. With one item in grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain.

(Bush 1939, 1945, 1967)

These paragraphs were important enough that they appeared verbatim in all versions of the Memex essay — 1939, 1945 and 1967. No other block of text remained unchanged over time; the technologies used to implement the mechanism changed, Memex grew intelligent, and the other machines (the Cyclops Camera, the Vocoder) disappeared. These paragraphs, however, remained a constant. Given this fact, Nelson's assertion that the major concern of the essay was to point out the artificiality of systems of indexing, and to propose the associative mechanism as a solution, seems reasonable. Nelson also maintains that these central precepts of the design have been ignored by commentators. I would contend that they have not been ignored; fragments of these paragraphs are often cited, particularly relating to association. What is ignored is the relationship between these two paragraphs — the central contrast Bush makes between conventional methods of indexing and the mental associations Memex was to support. Association was more natural than other forms of indexing — more human. This is why it was revolutionary.

This is interesting, because Bush's model of mental association was itself technological: the mind snapped between allied items, an unconscious movement directed by the trails themselves, trails of brain or of machine. Association was a technique that worked independently of its substrate, and there was no spirit attached to this machine: "my brain runs rapidly — so rapidly I do not fully recognize that the process is going on". The speed of action in the retrieval process from neuron to neuron resulted from a mechanical switching (a term omitted from the Life reprint of Memex II; Bush 1970, 100), and the items that this mechanical process resurrected were also stored in the manner of magnetic or drum memory: the brain is like a substrate for memories, sheets of data.

Bush's model of human associative memory was an electro-mechanical one — a model that was being keenly developed by Claude Shannon, Warren McCulloch and Walter Pitts at MIT, and one which would result in the McCulloch-Pitts neuron. The MIT model of the human neuronal circuit constructed the human in terms of the machine, and was later articulated more thoroughly in terms of computer switching. In a 1944 letter to Weeks, for example, Bush argued that a great deal of our brain cell activity is closely parallel to the operation of relay circuits, and that one can explore this parallelism … almost indefinitely (November 6, 1944, cited in Nyce and Kahn 1991, 62).
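The threshold logic at the heart of that model is easily stated. The following sketch (an illustration of the McCulloch-Pitts unit in modern terms; the function names are mine, not the original notation) shows the all-or-none, relay-like behaviour Bush found so suggestive:

def mcp_neuron(inputs, weights, threshold):
    # All-or-none threshold unit: fires (1) when the weighted sum of its
    # binary inputs reaches the threshold, like a relay closing.
    return int(sum(i * w for i, w in zip(inputs, weights)) >= threshold)

# Relay-style logic, the parallel Bush drew between brain and circuit:
def or_gate(x, y):
    return mcp_neuron([x, y], [1, 1], threshold=1)

def and_gate(x, y):
    return mcp_neuron([x, y], [1, 1], threshold=2)

assert or_gate(0, 1) == 1 and and_gate(0, 1) == 0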

In the 1930s and 1940s, the popular scientific conception of mind and memory was a mechanical one. An object or experience was perceived, transferred to the memory-library's receiving station, and then installed in the memory-library for all future reference. It had been known since the early 1900s that the brain comprised a tangle of neuronal groups that were interconnected in the manner of a network, and recent research had shown that these communicated and stored information across the neural substrate, in some instances creating further connections, via minute electrical vibrations. According to Bush, memories that were not accessed regularly suffered from this neglect by the conscious mind and were prone to fade. The pathways of the brain, its indexing system, needed constant electrical stimulation to remain strong. This was the problem with the neural network: items are not fully permanent, memory is transitory. The major technical problem with human memory was its tendency toward decay.

According to Manuel De Landa, there was also a widespread faith at the time in biological-mechanical analogues as models to boost human functions. The military had been attempting for many years to develop technologies which mimicked and subsequently replaced human faculties, and this effort was especially heightened in the years before, during and immediately following the war. At MIT in particular, there was a tendency to take the image of the machine as the basis for the understanding of man and vice versa, writes Harold Hatt in his book on cybernetics. The idea that man and his environment are mechanical systems which can be studied, improved, mimicked and controlled was growing, and later gave way to disciplines such as cognitive science and artificial intelligence. Wiener and McCulloch looked for and worked from parallels they saw between neural structure and process and computation, a model which changed with the onset of digital computing to include on/off states. The motor should first of all model itself on man, and eventually augment or replace him.

Bush explicitly worked with such methodologies — in fact, he not only thought with and in these terms, he built technological projects with them. The first step was understanding the mechanical process or nature of thought itself; the second step was transferring this process to a machine. So there is a double movement within Bush's work: the location of a natural human process within thought, a process which is already machine-like, and the subsequent refinement and modelling of a particular technology on that process. Technology should depart from nature; it should depart from an extant human process: this saves us so much work. If this is done properly, "[it] should be possible to beat the mind decisively in the permanence and clarity of the items resurrected from storage".

So Memex was first and foremost an extension of human memory and the associative movements that the mind makes through information: a mechanical analogue to an already mechanical model of memory. Bush transferred this idea into information management; Memex was distinct from traditional forms of indexing not so much in its mechanism or content, but in the way it organised information based on association. The design did not spring from the ether, however; the first Memex design incorporates the technical architecture of the Rapid Selector and the methodology of the Analyzer — the machines Bush was assembling at the time.

The Design of Memex

Bush's autobiography, Pieces of the Action, and his essay Memex Revisited tell us that he started work on the design in the early 1930s. Nyce and Kahn also note that he sent a letter to Warren Weaver describing a Memex-like device in 1937. The first extensive description of it in print, however, is found in the 1939 essay Mechanization and the Record. The description in this essay employs the same methodology Bush had used to design the Analyzer: combine existing lower-level technologies into a single machine with a higher function that automates the pick-and-shovel work of the human mind.

Nyce and Kahn maintain that Bush took this methodology from the Rapid Selector; this paper has argued that it was first deployed in the Analyzer. The Analyzer was the first working analogue computer at MIT, and it was also the first large-scale engineering project to combine lower-level, extant technologies and automate what was previously a human cognitive technique: the integral calculus. As we have explored, it incorporated two lower-level analogue technologies to accomplish this task: the wheel-and-disc integrator and the torque amplifier. Surrounded by computers and personal organisers, we find the idea of automating intellectual processes obvious now — but in the early 1930s the idea of automating what was essentially a function within thought was radical. Bush needed to convince people that it was worthwhile. In 1939, he wrote:

The future means of implementing thought are … fully as worthy of attention by one who wonders what comes next as are new ways of extracting natural resources, or of killing men.

The idea of creating a machine to aid the mind did not belong to Bush, nor did the technique of integral calculus (or association, for that matter); he was, however, arguably the first person to externalise this technology on a grand scale. The success of the Analyzer qua technical artefact had proven the method. Design on the first microfilm selection device, the Comparator, started in 1935. This, too, was a machine to aid the mind: it was essentially a counting machine, built to tally the coincidence of letters in two messages or copies of a single message. It externalised the drudge work of cryptography, and Bush rightly saw it as the first electronic data-processing machine. The Rapid Selector which followed it incorporated much of the same architecture, as we have explored — and this architecture was in turn transferred to Memex.

The Memex-like machine proposed in Bush’s 1937 memo to Weaver shows just how much [the Selector] and the Memex have in common. In the rapid selector, low-level mechanisms for transporting 35mm film, photo-sensors to detect dot patterns, and precise timing mechanisms combined to support the high-order task of information selection. In Memex, photo-optic selection devices, keyboard controls, and dry photography would be combined … to support the process of the human mind.

The difference, of course, was that Bush's proposed Memex would access information stored on microfilm by association, not numerical indexing. He had incorporated another technique (a technique which was itself quite popular among the nascent cybernetics community at MIT, and which already articulated mind and machine together). By describing an imaginary machine, Bush had selected from the existing technologies of the time and made a case for how they should develop in the future. But this forecasting did not come from some genetically inherited genius — it was an acquired skill: Bush was close to the machine.

As Professor of Engineering at MIT (and, after 1939, President of the Carnegie Institute in Washington), Bush was in a unique position — he had access to a pool of ideas, techniques and technologies unavailable to the general public and to engineers at smaller schools. Bush had a more global view of the combinatory possibilities and the technological lineage. He himself admitted this; in fact, he believed that engineers and scientists were the only people who could or should predict the future of technology — anyone else had no idea. In The Inscrutable Thirties, an essay he published in 1933, he tells us that politicians and the general public simply can't understand technology: they have so little true discrimination and are wont to visualize scientific triumphs as faits accomplis before they are even ready, even as they are being hatched in the laboratory. Bush believed that the prediction and control of the future of technology should be left to engineers; only they can distinguish the possible from the virtually impossible, only they can read the future from technical objects.

Memex was a future technology. It was originally proposed as a desk at which the user could sit, equipped with two slanting translucent screens upon which material would be projected for convenient reading. There was a keyboard to the right of these screens, and a set of buttons and levers which the user could depress to search the information using an electrically-powered optical recognition system. If the user wished to consult a certain piece of information, he [tapped] its code on the keyboard, and the title page of the book promptly appear[ed]. The images were stored on microfilm inside the desk, and the matter of bulk [was] well taken care of by this technology — only a small part of the interior is devoted to storage, the rest to mechanism. It looked like an ordinary desk, except that it had screens and a keyboard attached to it. To add new information to the microfilm file, a photographic copying plate was also provided on the desk, but most of the Memex contents would be purchased on microfilm ready for insertion. The user could classify material as it came in front of him with a teleautograph stylus, and register links between pieces of information with the same stylus. This was a piece of furniture from the future, to live in the home of a scientist or an engineer, to be used for research and information management.

The 1945 Memex design also introduced the concept of trails, derived from contemporary work on neuronal storage-retrieval networks: a method of connecting information by linking units together in a networked manner, similar to hypertext paths. The process of making trails was called trailblazing, and was based on a mechanical provision whereby any item may be caused at will to select immediately and automatically another, just as though these items were being gathered together from widely separated sources and bound together to form a new book. Electro-optical devices borrowed from the Rapid Selector used spinning rolls of microfilm, abstract codes and a mechanical selection-head inside the desk to find and create these links between documents. This is the essential feature of the Memex: the process of tying two items together is the important thing. Bush went so far as to suggest that in the future there would be professional trailblazers, who took pleasure in creating useful paths through the common record in such a fashion.
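In modern terms, the trail mechanism amounts to a small data structure: items carry codes, and a named trail links one item to the next. The following sketch is an illustrative reconstruction (the class and function names are assumptions, not Bush's terminology):

class Item:
    # A document frame on the microfilm store, identified by its code.
    def __init__(self, code, content):
        self.code = code
        self.content = content
        self.links = {}   # trail name -> the next Item tied on that trail

def tie(trail, a, b):
    # "The process of tying two items together is the important thing."
    a.links[trail] = b

def run_along(trail, item):
    # Follow a trail from a starting item, yielding each item in turn,
    # as though they were bound together to form a new book.
    while item is not None:
        yield item
        item = item.links.get(trail)

# Usage: a two-item trail.
a = Item("f-0001", "article on the bow and arrow")
b = Item("f-0002", "article on the properties of elastic materials")
tie("bows", a, b)
print([i.code for i in run_along("bows", a)])   # ['f-0001', 'f-0002']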

The Memex described in As We May Think was to have permanent trails; public encyclopaedias, colleagues' trails and other information could all be joined and then permanently archived for later use. Unlike the trails of memory, they would never fade. In Memex Revisited, however, an adaptive theme emerged whereby the trails were mutable and open to growth and change by Memex itself, as it observed the owner's habits of association and extended upon these. After a period of observation, Memex would be given instructions to search and build a new trail of thought, which it could do later, even when the owner was not there. This technique was in turn derived from Claude Shannon's experiments with feedback and machine learning, embodied in his mechanical mouse: "A striking form of self adaptable machine is Shannon's mechanical mouse. Placed in a maze it runs along, butts its head into a wall, turns and tries again, and eventually muddles its way through. But, placed again at the entrance, it proceeds through without error making all the right turns."
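A minimal sketch of the learning principle behind Shannon's mouse (a software illustration, not the relay circuitry Shannon actually used; the maze and names are mine): the mouse wanders at random, remembers the last successful exit it took from each cell, and on a second run follows that memory without error:

import random

MAZE = ["#######",
        "#S...##",
        "##.#..#",
        "#..#.G#",
        "#######"]

MOVES = {'N': (-1, 0), 'S': (1, 0), 'E': (0, 1), 'W': (0, -1)}

def find(ch):
    for r, row in enumerate(MAZE):
        c = row.find(ch)
        if c >= 0:
            return (r, c)

def wander(start, goal):
    # First trial: random exploration. Only the last exit taken from each
    # cell survives in memory, so loops erase themselves.
    memory, pos = {}, start
    while pos != goal:
        d = random.choice(list(MOVES))
        nxt = (pos[0] + MOVES[d][0], pos[1] + MOVES[d][1])
        if MAZE[nxt[0]][nxt[1]] != '#':   # a wall-bump means: try again
            memory[pos] = d
            pos = nxt
    return memory

def replay(memory, start, goal):
    # Second trial: follow the remembered exits, making all the right turns.
    path, pos = [start], start
    while pos != goal:
        d = MOVES[memory[pos]]
        pos = (pos[0] + d[0], pos[1] + d[1])
        path.append(pos)
    return path

start, goal = find('S'), find('G')
print(replay(wander(start, goal), start, goal))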

In modern terminology, such a machine is called an intelligent agent, a concept we shall discuss later in this work. Technology has not yet reached Bush's vision for adaptive associative indexing, although intelligent systems, whose parameters change in accordance with the user's experiences, come close. This is called machine learning. Andries van Dam also believes this to be the natural future of hypertext and associative retrieval systems.

In Memex II, however, Bush not only proposed that the machine might learn from the human via what was effectively a cybernetic feedback loop — he proposed that the human might learn from the machine. As the human mind moulds the machine, so too the machine remoulds the human mind: it remoulds the trails of the user's brain, as one lives and works in close interconnection with a machine. For the trails of the machine become duplicated in the brain of the user, vaguely as all human memory is vague, but with a concomitant emphasis by repetition, creation and discard … as the cells of the brain become realigned and reconnected, better to utilize the massive explicit memory which is its servant.

This was in line with Bush's conception of technical machines as mechanical teachers in their own right. It was a proposal of an active symbiosis between machine and human memory which has been surprisingly ignored in contemporary readings of the design. Nyce and Kahn pay it a full page of attention, as does Nelson, who has always read Bush rather closely. But aside from these, the full development of this concept from Bush's work has been left to Doug Engelbart.

In our interview, Engelbart claimed it was Bush's concept of a co-evolution between humans and machines, and also his conception of our human augmentation system, which inspired him. Both Bush and Engelbart believe that our social structures, our discourses and even our language can and should adapt to mechanization; all of these things are inherited, they are learned. This process is not only unavoidable, it is desirable. Bush also believed machines to have their own logic, their own language, which can touch those subtle processes of mind, its logical and rational processes, and alter them. And the logical and rational processes which the machine connected with were our own memories — a prosthesis of the inside. This vision of actual human neurons changing to become more like the machine, however, would not find its way into the 1967 essay.

Paradoxically, Bush also retreats on this close alignment of memory and machine. In the later essays, he felt the need to demarcate a purely human realm of thought, a realm uncontaminated by technics. One of the major themes in Memex II is defining exactly what it is that machines can and cannot do.

Two mental processes the machine can do well: first, memory storage and recollection, and this is the primary function of the Memex; and second, logical reasoning, which is the function of the computing and analytical machines.

Machines can remember better than human beings can — their trails do not fade, their logic is never flawed. Both of the mental processes Bush locates above take place within human thought; they are forms of internal repetitive thought — perfectly suited to being externalised and improved upon by technics. But exactly what is it that machines can't do? Is there anything inside thought which is purely human? Bush demarcates creativity as the realm of thought that exists beyond technology.

How far can the machine accompany and aid its master along this path? Certainly to the point at which the master becomes an artist, reaching into the unknown with beauty and versatility, erecting on the mundane thought processes a thing of beauty … this region will always be barred to the machine.

Bush had always been obsessed with memory and technics, as we have explored. But near the end of his career, when Memex II and Memex Revisited were written, he became obsessed with the boundary between them, between what is personal and belongs to the human alone, and what can be or already is automated within thought.

In all versions of the Memex essay, the machine was to serve as a personal memory support. It was not a public database in the sense of the modern Internet: it was first and foremost a private device. It provided for each person to add their own marginal notes and comments, recording reactions to and trails from others' texts, and adding selected information and the trails of others by dropping them into their archive via an electro-optical scanning device. In the later adaptive Memex, these trails fade out if not used, and if much in use, the trails become emphasized as the web adjusts its shape mechanically to the thoughts of the individual who uses it.
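The adaptive behaviour described here is, in modern terms, a use-weighted graph: traversal reinforces a link, neglect decays it. A minimal sketch, with an assumed decay rule (Bush specified no constants or update formulae):

class AdaptiveTrails:
    # Use-weighted links: repetition emphasizes a trail, neglect fades it.
    def __init__(self, decay=0.9, floor=0.01):
        self.decay = decay            # assumed per-interval retention
        self.floor = floor            # weights below this are discarded
        self.weights = {}             # (from_code, to_code) -> strength

    def traverse(self, link):
        # Each use emphasizes the trail.
        self.weights[link] = self.weights.get(link, 0.0) + 1.0

    def tick(self):
        # With each passing interval, unused trails fade; fully faded
        # trails are discarded, as memories are in Bush's account of
        # the brain.
        for link in list(self.weights):
            self.weights[link] *= self.decay
            if self.weights[link] < self.floor:
                del self.weights[link]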

Current hypertext technologies are not quite so private; due to the need for mass production, distribution and compatibility, they tend to emphasise systems which are public rather than personal in nature, and the static record over adaptivity. The idea of a personal machine to amplify the mind also flew in the face of the emerging paradigm of human-computer interaction that reached its peak in the late 1950s and early 1960s, which held computers to be rarefied calculating machines used only by qualified technicians in white lab coats, in air-conditioned rooms, at many degrees of separation from the user. "After the summer of 1946," writes Ceruzzi, "computing's path, in theory at least, was clear." Computers were, for the moment, impersonal, institutionally aligned and out of the reach of the ignorant masses who did not understand their workings. They lived only in university computer labs, wealthy corporations and government departments. Memex II was published at a time when the dominant paradigm of human-computer interaction was sanctified and imposed by corporations like IBM, and it was so entrenched that the very idea of a free interaction between users and machines as envisioned by Bush was viewed with hostility by the academic community.

In all versions of the essay, Memex remained profoundly uninfluenced by the paradigm of digital computing. As we have explored, Bush transferred the concept of machine learning from Shannon — but not information theory. He transferred neural and memory models from the cybernetic community — but not digital computation. The analogue computing discourse Bush and Memex created never mixed with digital computing. In 1945, Memex was a direct analogy to Bush's conception of human memory; in 1967, after digital computing had swept engineering departments across the country into its paradigm, Memex was still a direct analogy to human memory. It mirrored the technique of association in its mechanical workings. While the pioneers of digital computing understood that machines would soon accelerate human capabilities by doing massive calculations, Bush continued to be occupied with extending, through replication, human mental experience.

Consequently, the Memex redesigns responded to the advances of the day quite differently to how others were responding at the time. By 1967, for example, great advances had been made in digital memory techniques. As far back as 1951, the Eckert-Mauchly division of Remington Rand had turned over the UNIVAC, the first commercially produced stored-program computer in the US, to the US Census Bureau. Delay lines stored 1,000 words as acoustic pulses in tubes of mercury, and reels of magnetic tape which stored invisible bits were used for bulk memory. This was electronic digital technology, and it did not mirror or seek to mirror natural processes in any way. It steadily replaced the most popular form of electro-mechanical memory from the late 1940s and early 1950s: drum memory, a large metal cylinder which rotated rapidly beneath a head that wrote information magnetically across its surface. In 1957, disk memory was produced for the IBM 305 RAMAC, and rapid advances were being made by IBM and DEC.

Bush, however, remained enamoured of physical recording and inscription. His 1959 essay proposes using organic crystals to record data by means of phase changes in molecular alignment: "[I]n Memex II, when a code on one item points to a second, the first part of the code will pick out a crystal, the next part the level in this, and the remainder the individual item". This was new technology at the time, but certainly not the direction commercial computing was taking via DEC or IBM. Bush was fundamentally uncomfortable with digital electronics as a means to store material. The brain does not operate by reducing everything to indices and computation, he wrote. Bush was aware of how out of touch he was with emerging digital computing techniques, and this essay bears no trace of engineering details, details which were steadily disappearing from all his published work. He devoted the latter part of his career to frank prophecy, reading from the technologies he saw around him and taking a long look ahead. Of particular concern to him was promoting Memex as the technology of the future, and encouraging the public that the time had come to try it again.

Memex, Inheritance and Transmission

No memex could have been built when that article appeared. In the quarter-century since then, the idea has been with me almost constantly, and I have watched new developments in electronics, physics, chemistry and logic to see how they might help bring it to reality.

Memex became an image of potentiality for Bush near the end of his life. In the later essays, he writes in a different tone entirely: Memex was an image he would bequeath to the future, a gift to the human race. For most of his professional life, he had been concerned with augmenting human memory, and with preserving information that might otherwise be lost. He had occasionally written about this project as a larger idea which would boost the entire process by which man profits by his inheritance of acquired knowledge. But in Memex II, this project became grander and more urgent, the idea itself far more important than the technical details. He was nearing the end of his life, and Memex was still unbuilt. Would someone eventually build this machine? He hoped so, and he urged the public that it would soon be possible to do so, or at least that the day had come far closer: in the interval since that paper [As We May Think] was published, there have been many developments … steps that were merely dreams are coming into the realm of practicality. Could this image be externalised now, and live beyond him? It would not only carry the wealth of his own knowledge beyond his death; it would be a gift to all mankind. In fact, Memex would be the centrepiece of mankind's true revolution — transcending death.

Can a son inherit the memex of his father, refined and polished over the years, and go on from there? In this way can we avoid some of the loss which comes when oxygen is no longer furnished to the brain of the great thinker, when all the patterns of neurons so painstakingly refined become merely a mass of protein and nucleic acid? Can the race thus develop leaders, of such power and intellect, and such forces of conviction, that the world can be saved from its follies? This is an objective of far greater importance than the conquest of disease, even than the conquest of mental aberrations.

Near the end of his life, Bush thought of Memex as more than just an individual's machine; the ultimate [machine] is far more subtle than this. Memex would be the centrepiece of a structure of inheritance and transmission, a structure that would accumulate with each successive generation. In Science Pauses, Bush entitled one of the sections "Immortality in a machine": it contained a description of Memex, but this time with an emphasis on its longevity beyond the individual human mind. This is the crux of the matter: the trails in Memex would not grow old; they would be a gift from father to son, from one generation to the next.

Bush died on June 30, 1974. The image of Memex has been passed on beyond his death, and it continues to inspire a host of new machines and technical instrumentalities. But Memex itself has never been built; it exists only on paper, in technical interpretation and in memory. All we have of Memex are the words that Bush assembled around it in his lifetime, the drawings created by the artists from Life, its erotic simulacrum, its ideals, its ideas. Had Bush attempted to assemble this machine in his own lifetime, it would undoubtedly have changed in its technical workings; the material limits of microfilm, of photoelectric components and, later, of crystalline memory storage would have imposed themselves; the use function of the machine would itself have changed as it demonstrated its own potentials. If Memex had been built, the object would have invented itself independently of the outlines Bush cast on paper. This never happened — it has entered into the intellectual capital of new media as an image of potentiality.

References

Bagg, T. C., and Stevens, M. E. Information Selection Systems Retrieving Replica Copies: A State-of-the-Art Report. National Bureau of Standards Technical Note 157. Washington, D.C.: Government Printing Office, 1961.
Beniger, James R. The Control Revolution: Technological and Economic Origins of the Information Society. Cambridge, MA: Harvard University Press, 1986.
Burke, Collin. A Practical View of the Memex: The Career of the Rapid Selector. In Nyce and Kahn 1991.
Bush, Vannevar. Mechanical Solutions of Engineering Problems. Tech Engineering News, Vol. 9, 1928.
Bush, Vannevar. The Inscrutable 'Thirties'. Reprinted in Nyce and Kahn 1991, 67–80.
Bush, Vannevar. Mechanization and the Record. Vannevar Bush Papers, Library of Congress, Box 138, Speech Article Book File.
Bush, Vannevar. As We May Think. Reprinted in Nyce and Kahn 1991, 85–112.
Bush, Vannevar. Memex II. Reprinted in Nyce and Kahn 1991, 165–184.
Bush, Vannevar. Man's Thinking Machines. Vannevar Bush Papers, MIT Archives, MC78, Box 21.
Bush, Vannevar. Science Pauses. Reprinted in Nyce and Kahn 1991, 185–196.
Bush, Vannevar. Memex Revisited. Reprinted in Nyce and Kahn 1991, 197–216.
Bush, Vannevar. Pieces of the Action. New York: William Morrow, 1970.
Ceruzzi, Paul E. A History of Modern Computing. Cambridge, MA: MIT Press, 1998.
De Landa, Manuel. War in the Age of Intelligent Machines. New York: Zone Books, 1994.
Dennett, Daniel C. Consciousness Explained. London: Penguin Books, 1993.
Edwards, Paul N. The Closed World: Computers and the Politics of Discourse in Cold War America. Cambridge, MA: MIT Press, 1997.
Eldredge, Niles. Email interview with Belinda Barnet. March 2004. http://journal.fibreculture.org/issue3/issue3_barnet.html.
Engelbart, Douglas. Interview with Belinda Barnet. November 10, 1999.
Farkas-Conn, I. S. From Documentation to Information Science: The Beginnings and Early Development of the American Documentation Institute—American Society for Information Science. New York: Greenwood Press, 1990.
Gille, Bertrand. History of Techniques. New York: Gordon and Breach Science Publishers, 1986.
Guattari, Félix. Chaosmosis: An Ethico-Aesthetic Paradigm. Trans. Paul Bains and Julian Pefanis. Sydney: Power Publications, 1995.
Hartree, Douglas. Differential Analyzer. http://cs.union.edu/~hemmendd/Encyc/Articles/Difanal/difanal.html.
Hatt, Harold. Cybernetics and the Image of Man. Nashville: Abingdon Press, 1968.
Hayles, N. Katherine. Virtual Bodies and Flickering Signifiers. October, no. 66 (Fall 1993): 69–91.
Hayles, N. Katherine. How We Became Posthuman. Chicago: University of Chicago Press, 1999.
Hazen, Harold. MIT President's Report, 1940.
Meyrowitz, Norman. Hypertext: Does It Reduce Cholesterol, Too? In Nyce and Kahn 1991, 287–318.
Mindell, David A. MIT Differential Analyzer. http://web.mit.edu/mindell/www/analyzer.htm.
Nelson, Theodor H. As We Will Think. In Nyce and Kahn 1991, 245–260.
Nelson, Theodor H. Interview with the author.
Nyce, James, and Kahn, Paul, eds. From Memex to Hypertext: Vannevar Bush and the Mind's Machine. London: Academic Press, 1991.
Oren, Tim. Memex: Getting Back on the Trail. In Nyce and Kahn 1991, 319–338.
Owens, Larry. Vannevar Bush and the Differential Analyzer: The Text and Context of an Early Computer. In Nyce and Kahn 1991, 3–38.
Shurkin, Joel. Engines of the Mind: The Evolution of the Computer from Mainframes to Microprocessors. New York: W.W. Norton and Company, 1996.
Smith, Linda C. Memex as an Image of Potentiality Revisited. In Nyce and Kahn 1991.
Spar, Debora L. Ruling the Waves: Cycles of Discovery, Chaos, and Wealth from Compass to the Internet. New York: Harcourt, 2001.
Stiegler, Bernard. Technics and Time, 1: The Fault of Epimetheus. Stanford: Stanford University Press, 1998.
Van Dam, Andries. Interview with the author.
Weaver, Warren. Project diaries. March 17, 1950.
Weaver, Warren. Letter to Samuel Caldwell. Correspondence held in the Rockefeller Archive Center, RF1.1/224/2/26.
Wells, H.G. World Brain. London: Methuen & Co. Limited, 1938.
Ziman, John. Technological Innovation as an Evolutionary Process. Cambridge: Cambridge University Press, 2003.
Appendix 3: Side-by-side Comparison Layout

This appendix contains a screen shot of the side-by-side comparison view used to identify mismatched bibliographic entries during the deduplication and error correction phase of the project. This view takes data from the bibliography for an individual DHQ article; for each entry in that bibliography, the XSLT stylesheet seeks a match (based on the value of the @key attribute) in the centralized bibliography. If a match is found, that entry is displayed beneath the original entry. The stylesheet also performs a comparison between the content of the two entries (based on author name, title, and facts of publication); if the similarity falls below a certain threshold, the entry is flagged in red so that the two can be compared and the match confirmed. In the examples shown here, the first flagged entry (Borovoy) is in fact a match, but there are discrepancies between the titles; the entry from the central bibliography contains better information. In the second flagged entry (Marino), the two records represent different items, and the @key will need to be fixed to point to the correct entry in the central bibliography. A sketch of the comparison logic follows the screen shot below.

[Screen shot: the comparison view for article 000157, "Code as Ritualized Poetry: The Tactics of the Transborder Immigrant Tool", comparing against 6239 entries in Biblio. Entries shown: //bibl[@key='bakhtin1982'], //bibl[@key='borovoy2011'], //bibl[@key='marino'], and //bibl[@key='camnitzer2007']. The Borovoy entry is flagged "Matching ID, Similarity 0.231 (6/26)".]
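The comparison itself is performed in XSLT; the following Python rendering is a guess at the shape of that logic, not a transcription of the stylesheet. A score such as "Similarity 0.231 (6/26)" reads naturally as 6 shared tokens out of 26 in the combined entries:

def similarity(entry_a, entry_b):
    # Token-overlap score between two bibliographic strings: the size of
    # the shared vocabulary over the size of the combined vocabulary.
    a, b = set(entry_a.lower().split()), set(entry_b.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 1.0

THRESHOLD = 0.5   # an assumed cut-off; entries scoring below it are flagged

def flagged(entry_a, entry_b):
    # Flag the pair in red for manual comparison when similarity is low.
    return similarity(entry_a, entry_b) < THRESHOLD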
4. If there are bibl records, extract them from the article; these records (after you de-duplicate and groom them) will become part of the Biblio list:
- Configure Transformation Scenario (wrench icon next to the red "Apply Transformation Scenario" arrow).
- Click the check-box next to "Extract biblio listings," then "Apply Associated (1)."
- A new file, titled "numberoffile-biblioscratch.xml," should be created.
5. IMPORTANT: Run a Find/Replace on the biblioscratch.xml file to convert all references to "dhqID" (an old referent) to "ID":
- Go to the Find menu and choose Find/Replace; in "Text to Find" type "dhqID", and in "Replace With" type "ID."
- Click "Replace All"; the number of matches should equal the number of biblio records (for example, "88 records matched").
6. Next you're going to check for duplicate records: i.e., records that have already been entered by Jim / DHQ into the current repository of bibliographic records (visible in the "current" sub-folder in the "data" folder in DHQ). This is done by running a Schematron check which compares the contents of your scratch file to the existing contents of Biblio. The goal here is to eliminate from your scratch file any records that are already in Biblio. You do NOT have to clean up any records that are already present in "current", and you can delete them from your scratch file without worrying that they will be disconnected from the article (which is why we're doing this in a "scratch" file).
- Go to the "Validate" check-box at the top of Oxygen, open the drop-down menu by clicking the arrow next to it, and choose "Validate With".
- If you do not see options visible here, find the dhqBiblio schema file in your working copy (dhq/trunk/biblio/DHQ-Biblio-v2/schema/dhqBiblio-checkup.sch), then click "OK." (Make sure you're using the checkup file here!)
- You should then receive a number of error messages in the "Errors" section of Oxygen.
- Check the red exclamation points first; they provide the most accurate information re: bibliographic information that already resides in "current." Then check the yellow exclamation points; they represent possible duplicates based on matching titles (but since titles are often the same, e.g. "Introduction", this isn't always indicative of a duplicate).

Red Error Messages

When checking these exclamation points:
- Go to the Biblio record noted in the error message (for example: dhqID 'aarseth1997' is already assigned to another entry; see Biblio-A.xml (aarseth1997)). You can find these alphabetic files in the "current" folder.
- Check to ensure that both entries are the same. You should also verify that the information in "current" is the most comprehensive: for example, if you notice that the author's full name is not in "Biblio-A," then please update that in "current."
- If the entry is the same, you can delete the entry in your scratch file.
- In some cases you'll find that while an ID has already been assigned, the entry in your article is different. After double-checking to ensure that the information on both citations is accurate, you may need to assign the citation tied to your article a new entry. For example, if your 'aarseth1997' is different from the 'aarseth1997' in the "current" folder's A file, you should rename your entry 'aarseth1997a' (or b, if an "aarseth1997a" already exists, etc.). This issue pops up with the particularly prolific writers cited by DHQ's authors (McGann, Hayles, Flanders).
- Check every red exclamation point in your error messages until you are satisfied that records are duplicates / resolved.
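The duplicate check itself is done by the Schematron file named above; purely as an illustration of the logic it encodes, the following Python sketch shows the ID-collision test and the suffixing convention (aarseth1997, aarseth1997a, aarseth1997b, ...). The function name and the string-based comparison of entries are hypothetical simplifications, not part of the DHQ toolchain.

```python
# Illustrative sketch (not DHQ's Schematron): flag scratch-file entries whose
# ID already exists in the central bibliography, and propose a new suffixed ID
# when the two records are genuinely different items.

import string

def resolve_id(scratch_id, scratch_entry, current_entries):
    """current_entries maps ID -> normalized citation string from 'current'."""
    if scratch_id not in current_entries:
        return scratch_id, "new entry"
    if scratch_entry == current_entries[scratch_id]:
        return None, "duplicate: delete from scratch file"
    # Same ID, different item: append 'a', 'b', ... until an unused ID is found.
    for suffix in string.ascii_lowercase:
        candidate = scratch_id + suffix
        if candidate not in current_entries:
            return candidate, "renamed to avoid collision"
    raise ValueError("no free suffix left for " + scratch_id)

current = {"aarseth1997": "Aarseth, E. Cybertext. Johns Hopkins UP, 1997."}
print(resolve_id("aarseth1997", "Aarseth, E. Cybertext. JHU Press, 1997.", current))
```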
(Source: http://www.digitalhumanities.org/confluence/display/DHQ/Biblio+Workflow+Instructions)

Yellow Error Messages

- These error messages generally refer to titles that are similar to entries in the "current" folder. Compare these messages to the specified bibliographic files and determine if you're dealing with a duplicate or a new entry.
- In some cases, these error messages contain information that you've hopefully already resolved while going through the red exclamation points. However, there will inevitably be occasions when a duplicate title is present in an entry that we want to add to our records: different / revised editions of publications, generic titles that happen to overlap (like "Digital Media"), generic titles like "Wikipedia."
- In some cases you might find that the title listed in your scratch file could be revised (expanded to contain more information, or changed because it is incorrect). Feel free to do so, but if you've otherwise established that you're dealing with the correct title and a new entry, then don't worry about the error message if it persists.

8. Update and encode the bibliographic records remaining in your scratch file to create a valid file in accordance with the Biblio schema (adding elements and attributes as needed to represent the various components of the bibliographic record). See the information about Bibliographic Elements on this page to determine what element to use for each item. These are my (Jim's) suggestions for how to quickly complete this work; feel free to do what is best for you, so long as the end result is the same (a clean file that we can add to DHQ's records):
- Put Boilerplate content in the scratch file.
- Change every record's BiblioItem element to an appropriate genre (e.g. JournalArticle) and clean up information about authors and editors.
- Update additional information for each record by BiblioItem (start with books, then journal articles, then websites, etc.). You can use the find tool to jump from item to item and work more quickly through the file this way. I tend to start with JournalArticle records, since they involve adding the most information.
- Clean up the entire file until it is valid.
- Any items that don't conform to an existing Biblio genre should be added to the Problem Genres file.

Boilerplate: I tend to dump the following text into the top and/or bottom of my scratch file, since I know I'll end up using them a lot and I'll want to paste this content into many records:
- For all records with authors (i.e. most of them):
- For Books:
- For Journal Articles:

Issuances

Information about issuance accompanies information about each BiblioItem; this information designates whether an item is "monographic" or "continuing".
- monographic: Book, BookInSeries, ConferencePaper, JournalArticle, Thesis, VideoGame
- continuing: BlogEntry, book (when part of BookInSeries information), journal, WebSite

Tips for Author information

- Whenever possible, use full names instead of initials for givenName information.
- Use CorporateName for corporate authors (institutional entities, companies). CorporateName is most frequently used for WebSites where authors are unspecified.
- If no author name is present and a CorporateName cannot be determined, use the FullName field and write "Author Unknown."

9. Make sure the entire file is clean and valid and that your work has been updated via Subversion (i.e. COMMIT your changes).
10. Notify Julia, and we'll have Wendell propagate the resulting Biblio records into the Biblio data.
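A note on the similarity flagging used by the comparison view of Appendix 3: the Borovoy pair there scores 0.231, i.e. 6 of 26 tokens shared. The actual comparison is performed by an XSLT stylesheet; the sketch below reproduces a plausible token-overlap measure in Python. The tokenization rule and the 0.3 flagging threshold are assumptions for illustration only.

```python
# Illustrative token-overlap similarity in the spirit of the Appendix 3 view.
import re

def tokens(citation):
    # Lowercase alphanumeric tokens; a crude stand-in for the real comparison.
    return re.findall(r"[a-z0-9]+", citation.lower())

def overlap_similarity(a, b):
    ta, tb = set(tokens(a)), set(tokens(b))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

article = 'Borovoy, Rick et al. Folk Computing. ACM Press, 2001. 466-473.'
biblio = 'Borovoy, Rick, et al. "Folk Computing". Presented at (2001).'
score = overlap_similarity(article, biblio)
if score < 0.3:  # hypothetical threshold for flagging in red
    print(f"flag for review: similarity {score:.3f}")
```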
Appendix 5: Final Report on IVMOOC Project

This appendix contains the final report by members of the IVMOOC working group describing their analysis of the DHQ bibliographic data and presenting the resulting visualizations.

Mapping Cultures in the Big Tent: Multidisciplinary Networks in the Digital Humanities Quarterly
Dulce Maria de la Cruz, Jake Kaupp, Max Kemman, Kristin Lewis, Teh-Hen Yu

Abstract—Digital Humanities Quarterly (DHQ) is a young journal that covers the intersection of digital media and traditional humanities. In this paper, we explore the publication patterns in DHQ through visualizations of co-authorship and bibliographic coupling networks in order to understand the cultures the journal represents. We find that DHQ consists largely of sole-authored papers (66%) and the authorship is dominated (75%) by authors publishing from North American institutions. Through the backbone of DHQ's bibliographic coupling network, we identify several communities of articles published in DHQ, and we analyze their collective abstracts using term frequency-inverse document frequency (TF-IDF) analysis. The extracted terms show that DHQ has wide coverage across the digital humanities, and that sub-areas of DHQ can be identified through their citation behavior.

Index Terms—Digital Humanities, Information Visualization, Co-author network, Bibliographic Coupling, big tent

INTRODUCTION

Digital Humanities (DH) is a field of research difficult to define due to its heterogeneity (see e.g. http://whatisdigitalhumanities.com for a wide variety of definitions from different scholars). With its inclusionary ambitions, DH is regularly referred to as a 'big tent' [1] encompassing scholars from a wide variety of disciplines such as history, literature, and linguistics, but also disciplines such as human-computer interaction and computer science. This collaborative, multidisciplinary approach to digital media makes DH an interesting field, but also difficult to grasp. A question is to what extent the big tent of DH represents a single culture, or actually a variety of cultures [1, 2]. The Digital Humanities Quarterly (DHQ) journal is arguably one of the largest journals aimed specifically at DH research, and covers all aspects of digital media in the humanities, representing a meeting point between digital humanities research and the wider humanities community [3]. Articles published in DHQ involve authors from multiple countries, institutions and disciplines who work on several subjects and areas related to digital media research. Under a recent grant from the NEH (National Endowment for the Humanities), DHQ has developed a centralized bibliography which supports the bibliographic referencing for the journal.

To gain an understanding of the diversity of culture(s) in DH, we are interested in how unique disciplinary cultures are represented in DHQ. Considering that cultures are self-referential systems, we might expect that scholars from a certain culture are more likely to cite scholars from their own culture rather than from others [2]. As such, we expect citation behaviour to reflect disciplinary cultural norms. Therefore, visualizing and analysing the bibliographic data of DHQ not only gives insights into the specific bibliographies from DHQ, it might also give insight into the way the different epistemic cultures in the DH big tent interact with one another, and how this interaction and collaboration impacts the networks over time.
This paper reports on a project undertaken in the Information Visualization MOOC from Indiana University (http://ivmooc.cns.iu.edu/). We have analysed the DHQ bibliographic data and created visualizations in order to discuss the following questions provided by the DHQ editors:
1. How citations reflect differences in academic culture at the institutional and geographic level.
2. The changes to that culture over time.
3. Correlations between article topics (reflected in keywords) and citation patterns.

Author information: Dulce Maria de la Cruz is a freelance data analyst (Dulce.Maria.delaCruz@gmail.com). Jake Kaupp is an engineering education researcher at Queen's University, Canada (jkaupp@gmail.com). Max Kemman is a PhD candidate at the University of Luxembourg, Luxembourg (maxkemman@gmail.com). Kristin Lewis is a Science & Technology Policy Fellow at AAAS (kristin.l.m.lewis@gmail.com). Teh-Hen Yu is an IT professional (tehhenyu@hotmail.com).

1 METHOD

1.1 Data

Two tables were extracted from the Client dataset:
1. dhq_articles (178 records)
2. works_cited_in_dhq (3823 records)

The attributes for both tables are: article id, authors, year, title, journal/conference/collection, abstract, cited references, and isDHQ. The raw dataset posed several problems, including:
• missing articles,
• duplicate authors,
• double affiliations and inconsistencies,
• duplicated articles and citation self-loops,
• special characters, and
• incomplete information (lack of information regarding affiliation and country for each DHQ paper, and disciplines for authors).

The DHQ website (http://www.digitalhumanities.org/dhq/) was therefore scraped using the tool Import.io (https://www.import.io/) to find missing articles and to obtain information about affiliations for each author. Once that information was known, it was used to obtain the country associated with each institution by searching the web. Custom programs in the R language were then used to create paper IDs (cite me as) similar to those used for the references and to calculate the number of times each DHQ paper has been cited (times cited) and the number of references cited by each DHQ paper (count cited references). Furthermore, we assigned a discipline to each paper based on the first author's departmental affiliation as described in [4]. In order to produce a more detailed picture of disciplinary culture, departmental affiliation was manually mapped to Web of Science subject areas. This information was eventually not used for the final visualizations, but left in the dataset for further exploration by others.

After validation, data mining/scraping, data processing with custom programs, and a lot of manual work, we came up with a master dataset with additional information added (cite me as, times cited, affiliation, country, count cited references, geocode, discipline, affiliations including department information, and community, plus the keywords provided by the editors of DHQ). To provide sufficient resolution, and categorical variables, for the visualizations, an author look-up table was created which contained the additional information outlined above for each separate author for each article ID. Both the master datafile and the author look-up table are our primary sources of data to load for visualization and analysis. The source code, final datasets, and resulting visualizations are available through GitHub (https://jkaupp.github.io/DHQ; please cite as: Kaupp, J., De la Cruz, D.M., Kemman, M., Lewis, K., Yu, T.-H. (2015). Mapping Cultures in the Big Tent: Multidisciplinary Networks in the Digital Humanities Quarterly. GitHub). The final dataset provides the statistics shown in Table 1.
Table 1. DHQ dataset statistics

  Attribute               Count   Note
  DHQ articles            195
  Unique cited articles   4718
  Unique DHQ authors      276
  Affiliations            148     Including all institutions + independent scholars
  WOS subject areas       29
  Countries               17
  Publication years       8       2007-2014

Figure 1 provides an overview of the number of DHQ publications and the number of co-authored papers per year.

Fig. 1. DHQ (co-authored) publications per year.

1.2 Co-author network

People are the key inputs in determining and understanding cultural differences. Therefore, in order to better understand the cultures within DHQ, we explored the authors who published within DHQ. Using Sci2 [5], we created yearly cumulative time slices of the master dataset and extracted co-author networks for each time slice. Columns for author country were added, and each time slice was imported into Gephi to create a dynamic co-author network [6]. The network was laid out using the Force Atlas 2 algorithm [7], with nodes colorized by country. Each time slice was visualized, and compiled into comprehensive visualizations using Adobe Illustrator and Adobe Photoshop.

In addition to a co-author network, we explored a bibliographic coupling network of authors, in which nodes (authors) would be linked based on the number of cited articles they have in common. This analysis however introduced a strong bias towards co-authors who cite large numbers of articles. In order to derive useful insights from this type of visualization, a de-biasing operation must be identified and applied. Without an established method for this, we chose to focus on the geographic information in the co-authorship network and to analyse bibliographic coupling of articles.

1.3 Bibliographic coupling & Backbone identification

In order to investigate the bibliographies of DHQ articles, we analysed the data using Sci2 by extracting the paper-citation network, followed by extracting the reference co-occurrence network, also known as "bibliographic coupling" [8]. By doing so, we create a network of DHQ articles with co-occurring references. To simplify the visualization, we created a minimum spanning tree using the MST Pathfinder algorithm, whereby articles are connected to the network only by their strongest relation [9]; this is also called backbone identification. As such, the network becomes a tree that is easier to read. Finally, all articles with zero references were removed from the network in order to remove non-DHQ articles, as well as DHQ articles that could not be analysed due to a lack of references. This network was then analyzed using the SLM community detection algorithm with undirected and weighted edges [10]. The network with community attributes was then imported into Gephi and laid out using the Force Atlas 2 algorithm [7], after which we colorized the nodes by their identified community.

1.4 Word clouds

In order to investigate the correlations between article topics (reflected in keywords) and the citation patterns, word clouds of keywords were obtained for each of the communities identified via SLM detection in the bibliographic coupling network. For this purpose, community-based abstracts were obtained by combining the abstracts associated with the DHQ papers belonging to each community.
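(An aside on section 1.3, before the abstract processing continues below.) The coupling-and-backbone extraction was performed in Sci2; a rough Python/networkx sketch of the same steps follows. A maximum spanning tree stands in for the MST Pathfinder algorithm and greedy modularity for SLM, so this approximates the pipeline rather than reimplementing it, and the reference data is a toy example.

```python
# Rough illustration of section 1.3: bibliographic coupling, backbone
# extraction, and community detection (approximating the Sci2 workflow).
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# refs: DHQ article id -> set of cited-work ids (toy data)
refs = {
    "a1": {"r1", "r2", "r3"},
    "a2": {"r2", "r3", "r4"},
    "a3": {"r3", "r5"},
    "a4": set(),  # zero references: removed, as in the report
}

G = nx.Graph()
ids = [a for a, r in refs.items() if r]
for i, a in enumerate(ids):
    for b in ids[i + 1:]:
        shared = len(refs[a] & refs[b])  # bibliographic coupling strength
        if shared:
            G.add_edge(a, b, weight=shared)

# Keep only the strongest relations (maximum spanning tree as the backbone).
backbone = nx.maximum_spanning_tree(G, weight="weight")
communities = list(greedy_modularity_communities(backbone, weight="weight"))
print(sorted(backbone.edges(data="weight")), communities)
```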
These community-wide abstracts were normalized to lower case and tokenized, and stop words were removed. Words were not stemmed, in order to differentiate between words like digital and digitized. Unique keywords were extracted from the community-based abstracts with custom R programs (using the R packages stringr, http://cran.r-project.org/web/packages/stringr/index.html, and tm, http://cran.r-project.org/web/packages/tm/index.html). The most significant keywords for each community were then identified through the Term Frequency - Inverse Document Frequency (TF-IDF) method [11]. Terms with high TF-IDF values imply a strong relationship with the document in which they appear. In this specific case, the terms are the unique keywords and the corpus of documents is the set of community-based abstracts. Therefore, the higher the TF-IDF value of a keyword in a community, the more representative the keyword is of that community. The ten top-scoring words from each community were put into a word cloud and the words were sized by TF-IDF score. The word clouds were manually adjusted to unify the appearance of terms (plural vs. singular, infinitive vs. gerund, etc.) and were added to the bibliographic coupling network visualization.

2 RESULTS

(Larger versions of all visualizations are available in the GitHub repository.)

2.1 Co-author Network

Figures 2 and 3 represent the co-author network for DHQ, both comprehensively (Figure 2) and through cumulative time slices (Figure 3). Nodes are sized by the number of works published in DHQ, and in Figure 2, authors with at least 4 DHQ publications are labeled with the author's last name. Nodes are colorized by the country of the author. The edges are weighted by the number of times each pair co-authored a DHQ publication together.

Fig. 2. Co-author network, 2007-2014.

The maximum number of authored works (articles) for a single author is 7: Julianne Nyhan from the UK. The maximum number of co-authored articles for two authors is 6, by Anne Welsh and Julianne Nyhan from the UK. The most active year is 2009, as also shown in Figure 1, with several authors publishing multiple papers in this year.

2.2 Bibliographic coupling network with word clouds

Figure 4 shows the backbone bibliographic coupling network for DHQ, representing the strongest connections in the larger bibliographic coupling network (not shown). Nodes are colored by community, as identified through SLM detection, and sized by the number of articles cited in each article. Edges are weighted by the number of cited articles in common. Alongside each community is a word cloud of keywords in the same color, extracted from the abstracts of each article in the community.

Fig. 3. Co-author network by year.

Figure 5 shows key papers in the backbone bibliographic coupling network, that is, the papers that link each of the communities in the giant component. The labels are shown in the same colors as the communities in Figure 4. After we removed articles with zero references, the network contained 170 articles (out of 195), of which 23 are without a connection to others (i.e. they remained isolated). These 23 are not shown in the final visualization, leaving 147 articles and 145 connections. The bibliographic coupling network contains twelve communities, of which one consists of two articles not otherwise connected to the major component (see dark green at the upper right).

Fig. 4. Backbone bibliographic coupling network for DHQ.
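As an illustration of the keyword scoring behind these word clouds (section 1.4), here is a minimal TF-IDF sketch in Python, standing in for the authors' custom R programs. The tokenizer and stop-word list are toy assumptions, and each "document" is the concatenated abstracts of one community.

```python
# Minimal TF-IDF over community-wide abstracts; top-scoring terms would feed
# the word clouds. Python stand-in for the authors' R (stringr/tm) pipeline.
import math
import re
from collections import Counter

STOP = {"the", "a", "of", "and", "in", "to", "is", "for"}  # toy stop words

def tokenize(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP]

def top_keywords(community_abstracts, top_n=10):
    docs = [Counter(tokenize(t)) for t in community_abstracts]
    n = len(docs)
    results = []
    for counts in docs:
        total = sum(counts.values())
        scores = {}
        for term, freq in counts.items():
            df = sum(1 for d in docs if term in d)  # document frequency
            scores[term] = (freq / total) * math.log(n / df)
        results.append(sorted(scores, key=scores.get, reverse=True)[:top_n])
    return results

print(top_keywords(["digital tools for text curation and research",
                    "poetry games ekphrasis and fiction"]))
```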
The other eleven communities are all connected in the large component and shown with their respective word clouds.

Fig. 5. Key papers in the backbone bibliographic coupling network.

There are a total of 4880 documents, including the 195 articles from DHQ itself. Together all the DHQ articles contain 5330 references. The highest-cited document is Matthew Kirschenbaum's "Mechanisms: New Media and the Forensic Imagination" (2008), cited 15 times. The DHQ article with the most references is Christine Borgman's "The Digital Future is Now: A Call to Action for the Humanities" (2009), with 130 references.

3 DISCUSSION

3.1 Co-author Network

The co-author network suggests that DHQ publications follow the patterns of the humanities community, with many single-authored papers (128 out of 195, 65.6%). Moreover, its origins are in North America, and three quarters of the authors are from either the US (58%) or Canada (17%). A distant third is the UK (9%), further demonstrating the Anglo-Saxon nature of DHQ. The largest co-author network component consists of 43 authors, which is about 16% of all authors (276 in all) who contributed to DHQ during this period. The second largest co-author network component consists of 18 authors. Canadian authors show the most collaborative behavior: the article with the most co-authors, "Visualizing Theatrical Text: From Watching the Script to the Simulated Environment for Theatre (SET)", has 14 co-authors. The most collaborative author in this period from Canada is Stan Ruecker; he co-authored 4 articles with 25 others. There does not seem to be a growth of co-authorship after 2008. Overall, articles have on average a little under two authors per paper, and in 2012 a bit above two on average (2.18). When we remove all the single-authored papers, the average number of authors per article is above three, but there is no trend that this is growing with the years.

3.2 Bibliographic coupling network with word clouds

From the word clouds we see that several communities explicitly discuss terms such as digital and humanities as well as tool, which is unsurprising. At the centre of the large component, the communities (magenta, yellow, purple) of articles are related to (textual) tools and to discussing DH itself, with terms such as curation, e-Science, project, and research. The communities further to the left (light blue & dark blue) are related to textual analysis and tools, with terms such as classification, author, write, annotation, interface, and literary. The communities to the right, however (dark purple, dark red, moss-green), suggest articles related to artistic subjects, with terms such as poetry, ekphrasis, games, and fiction.

4 CONCLUSION

We return to the questions provided by the DHQ editors:
1. How citations reflect differences in academic culture at the institutional and geographic level.
2. The changes to that culture over time.
3. Correlations between article topics (reflected in keywords) and citation patterns.

With respect to the first question, we focus on the geographic level of academic culture. The co-author network shows that despite DH being a collaborative culture, over half of all publications are single-authored, something demonstrated earlier for other journals (see http://blogs.lse.ac.uk/impactofsocialsciences/2014/09/10/joint-authorship-digital-humanities-collaboration). Moreover, DH as represented by DHQ is largely an Anglo-Saxon North American undertaking. With respect to the second question: there is no visible trend regarding co-authorship between 2007-2014.
However, authors from non-Anglo-Saxon countries are emerging, showing DH is slowly becoming a more global phenomenon, as also evidenced by the DH conferences (see http://www.scottbot.net/HIAL/?p=41064). With respect to the third question, we find that the references present in the DHQ articles lead to a large number of communities. The boundaries are however diffuse, making it difficult to describe clear-cut communities. Nevertheless, from the word clouds we do see at least three different patterns emerge: 1) articles related to tools and DH itself, 2) articles related to textual analysis with tools, and 3) articles related to artistic subjects.

While we have provided an exploration of the articles and authors within DHQ, additional insights may be learned from further analysis. In particular, interactive visualizations would provide the user with a more comprehensive understanding of the data. These may allow the user to explore communities via institution or discipline as well as country. In addition, we believe a properly de-biased authorial bibliographic coupling network may provide further insight into the academic cultures within DHQ. Lastly, our analysis focused on DHQ articles alone. Further analysis may allow us to explore the non-DHQ articles cited by DHQ papers.

In sum, we see DHQ fairly represents the heterogeneity of DH, critically examining DH itself and discussing computational analyses of research questions from different backgrounds. On the other hand, however, we see DHQ representing a somewhat homogeneous view of DH, with strong representation from Anglo-Saxon scholars and those from North America in particular. Here, DHQ can be challenged to provide a better representation of scholars from other backgrounds, as well as of the 'big tent' of DH in general.

ACKNOWLEDGMENTS

The authors wish to thank Professor Julia Flanders, Professor Katy Börner, Dr. Andrea Scharnhorst, and the participants of Indiana University's Information Visualization MOOC for providing us valuable feedback during the project work.

REFERENCES

[1] Svensson, P. (2012). Beyond the big tent. In Debates in the Digital Humanities, 36-49.
[2] Knorr Cetina, K. (2007). Culture in global knowledge societies: knowledge cultures and epistemic cultures. The Blackwell Companion to the Sociology of Culture, 32(4), 361-375. doi:10.1002/9780470996744.ch5
[3] Digital Humanities Quarterly (n.d.). About DHQ. Retrieved from http://www.digitalhumanities.org/dhq/about/about.html
[4] Ortega, L., & Antell, K. (2006). Tracking cross-disciplinary information use by author affiliation: demonstration of a method. College & Research Libraries, 67(5), 446-462. Retrieved from http://crl.acrl.org/content/67/5/446
[5] Sci2 Team. (2009). Science of Science (Sci2) Tool. Indiana University and SciTech Strategies. https://sci2.cns.iu.edu
[6] Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: an open source software for exploring and manipulating networks. ICWSM 8, 361-362.
[7] Jacomy, M., et al. (2011). ForceAtlas2, a continuous graph layout algorithm for handy network visualization. Medialab center of research, 560.
[8] Kessler, M. M. (1963). Bibliographic coupling between scientific papers. American Documentation, 14(1), 10-25.
[9] Schvaneveldt, R. W., Dearholt, D. W., & Durso, F. T. (1988). Graph theoretic foundations of pathfinder networks. Computers & Mathematics with Applications, 15(4), 337-345.
[10] Waltman, L., & van Eck, N. J. (2013). A smart local moving algorithm for large-scale modularity-based community detection. The European Physical Journal B, 86(11), 1-14.
[11] Blázquez, M. (n.d.). Frecuencias y pesos de los términos en un documento [Frequencies and weights of terms in a document]. Retrieved from http://ccdoc-tecnicasrecuperacioninformacion.blogspot.com.es/2012/11/frecuencias-y-pesos-de-los-terminos-de.html

Appendix 6: Text and Slides for DH2015 Paper

This appendix contains the text and slides for a paper on DHQ (mentioning but not focused primarily on the bibliographic project) presented at DH2015 in Australia: "Challenges of an XML-based Open-Access Journal: Digital Humanities Quarterly," Julia Flanders, John Walsh, Wendell Piez, Melissa Terras. The text of this paper has been revised based on commentary and discussion in the conference session.

Challenges of an XML-based Open-Access Journal: Digital Humanities Quarterly
Julia Flanders (Northeastern University)
John Walsh (Indiana University)
Wendell Piez (Piez Technologies)
Melissa Terras (University College London)

0. Introduction

Digital Humanities Quarterly was founded in 2005 as ADHO's first online open-access journal and published its first issue in 2007.
• In the ensuing ten years, the journal has been conducted as an ongoing experiment in standards-based journal publishing.
• In this paper we'd like to reflect on the results of that experiment to date, with emphasis on a few areas of particular challenge and research interest.

During that period, other open-access journals in DH have also emerged, and if we look at them as a group we can see some differences of approach which reflect differences of goals and philosophy, and also the kinds of personnel and other resources they have available:
• Approach to the data: is the article data itself of interest as a potential future research asset? Does the community have a predilection towards a particular data format (e.g. TEI)?
• Approach to publication architecture: a content management system (emphasizing configurability by novice administrators and design-oriented control over format) or a data-driven approach (emphasizing consistent exploitation of the data with no design intervention except at the systemic level)?
• Where does the mission reside? In the content or in the information system?

DHQ is perhaps an extreme example of a data-driven journal with an overwhelming interest in its own information systems, and this orientation arises in great part from the specific people to whom the journal's initial design and launch was entrusted: people having a strong research interest in XML, in data curation, and in future exploitation of the journal as a data source.
This paper isn't intended as an exercise in evangelism or self-praise, but rather as an exploration of what happens when we choose that set of parameters and follow their logic. The results thus far may help others working on developing open-access journals to situate their efforts within this same set of constraints.

1. Background and technical infrastructure

A few words about DHQ's fiscal and organizational arrangements may be useful here because they determine many of the strategic choices I'll be talking about. [slide]
• Funded jointly by ACH (which is the formal owner of the journal) and ADHO, each of which contributes $6000 per year.
• As of 2014, also receives funding from Northeastern University for the managing editor positions: two graduate research assistants at 10 hours per week each during the academic year; Indiana University has also contributed staff time and services.
• Uses grant funding to support special projects (currently completing two small grant-funded projects which I'll describe a bit later).
• The journal is led by three general editors and a technical editor, together with an editorial team that has more specialized responsibilities.
• The editor in chief oversees two managing editors and the overall workflow of submission, review, and production; and the technical editor oversees a technical assistant and the maintenance and development of the journal's technical systems (version control, servers, publication apparatus).

DHQ's technical design was constrained by a set of higher-level goals and needs.
• As an early open-access journal of digital humanities, an opportunity to participate in the curation of an important segment of the scholarly record in the field.
• Hence it was more than usually important that the article data be stored and curated in a manner that would maximize the potential for future reuse.
• In addition to mandating the use of open standards, this aim also strongly indicated that the data should be represented in a semantically rich format.
• We also anticipated a need for flexibility and the ability to experiment with both the underlying data and the publication interface, throughout the life of the journal, without constraint from the publication system.

All of these considerations moved the journal in the direction of XML (and eventually to TEI), which would give us the ability to represent any semantic features of the journal articles we might find necessary for either formatting or subsequent research. It would also permit us to design a journal publication system, using open-source components, that could be closely adapted to the DHQ data and that could evolve (at our own pace and based on our own agenda) to match any changes in requirements for the data. At the journal's founding, several alternative publishing platforms were proposed (including the Open Journal System), but none were XML-based and none offered the opportunity for open-ended experimentation that we needed.
DHQ's technical infrastructure is a standard XML publishing pipeline [slide] built using components that are familiar in the digital humanities:
• Cocoon: a pipelining tool that manages user interactions.
• XSLT to transform the XML.
• CSS and a little JavaScript for formatting and behavior.
• Eventually, an XML database to handle queries to bibliographic data.

The workflow also uses generally available tools: [slide]
• Submissions are received and managed through OJS through the copyediting stage.
• Final versions of articles are converted to basic TEI using OxGarage (http://www.tei-c.org/oxgarage/).
• Further encoding and metadata are added by hand.
• Items from the articles' bibliographies are entered into a centralized bibliographic system that is also XML-based.
• All journal content is maintained under version control using Subversion.

The journal's organizational information concerning volumes, issues, and tables of contents is represented in XML using a locally defined schema [slide].
• The journal uses Cocoon, an XML/XSLT pipelining tool, to process the XML components and generate the user interface.

Consider DHQ in relation to two other journals which are more or less in the same quadrant, Digital Medievalist (first issue in 2005) and jTEI (first issue in 2011), which have some similarities of approach to DHQ:
• A desire to keep data in semantically rich formats such as TEI.
• Using open-source tools.
• DM and jTEI both have developed publishing workflows based on their TEI data.
• Neither journal is the sole proprietor of its own publishing system, so the evolution of their publishing platform is to some extent constrained by the goals of those platforms (driven by the entire community of users, not just that journal).
• Hence these journals benefit from advances by those communities but can't easily anticipate them or exercise a determining influence.
• Whereas DHQ has the reverse problem: we are responsible for our own interface, so we are free to change it as much as we like, but we have to find the resources to do it ourselves.

2. DHQ's Evolving Data and Interface

As noted above, DHQ's approach to the representation of its article data has from the start been shaped by an emphasis on long-term data curation and a desire to accommodate experimentation, and our specific encoding practices have evolved significantly during the journal's lifetime.
• The first schema developed for the journal was deliberately homegrown, and was designed based on an initial informal survey of article submissions and articles published in other venues.
• Following this initial period of experimentation and bottom-up schema development, once the schema had settled into a somewhat stable form we expressed it as a TEI customization and did retrospective conversion on the existing data to bring it into conformance with the new schema.
• At several subsequent points significant new features have been added to the journal's encoding: for example, explicit representation of revision sites within articles (for authorial changes that go beyond simple correction of typographical errors), enhancements to the display of images through a gallery feature, and adaptation of the encoding of bibliographic data to a centralized bibliographic management system.
• At the beginning of our schema design process, we noted that at some point we might want to create a "crayon-box" schema whose elements would be deliberately designed to support author-specified semantics (slide), with the author also providing the display and behavioral logic, but we have not yet had a call for this approach and have not yet explored it in any practical detail.

These changes to the data have typically been driven by emerging functional requirements, such as the need to show where an article has been revised or the requirements of the special issue on comics as scholarship. However, they also respond to a broader set of requirements:
• That this data should represent the intellectual contours of scholarship rather than simply interface.
• For example, the encoding of revision notes retains the text of the original version, identifies the site of the revision, and supports an explanatory note by the author describing the reason for the revision. Although DHQ's current display uses this data in a simple manner to permit the reader to read the original or revised version, the data would support more advanced study of revision across the journal.
• Similarly, although our current display uses the encoding of quoted material and accompanying citations in very straightforward ways, the same data could readily be used to generate a visualization showing most commonly quoted passages, quotations that commonly occur in the same articles, and similar analyses of the research discourse.

The underlying data and architecture lend themselves to incremental expansion.

3. Experimentation; Design vs. Data-driven approach

DHQ's data-driven approach is rooted in caution and in motives of security, which are in a sense fundamentally conservative. Supporting the long-term preservability and intelligibility of our articles-as-data becomes much easier if that data is strongly convergent. Similarly, our task of publication is much easier and cheaper if our mechanisms of display are strongly determined by the data. However, one principle we articulated at the journal's launch was the idea that we wanted to support experimentation not just by ourselves but by authors, and we established a rationale for this experimentation that expressed its costs and risks and allocation of responsibility in terms of conceptual "zones": [slide]
• Zone 1 is DHQ proper, using standard DHQ markup and display logic. Within Zone 1 we seek to provide an expanding set of functions that keep up with the most typical needs of DHQ authors. DHQ takes full and perpetual responsibility for maintaining Zone 1 articles in working order.
• Zone 2 is a space of collaborative experimentation between DHQ and the author, in which we can accommodate author-generated data and code under specified terms:
  - It must meet certain standards of curatability: using open standards and formats, and using tools and languages that make sense for DHQ to maintain expertise in.
  - It must conform to good practice (documentation, commented code) so that the code itself can be considered a publication, not just an instrument of getting something done.
  - It must include an XML fall-back description so that if the experimental version breaks, readers can still find an intelligible account of it, and also to provide some kind of basic operation and discoverability within DHQ's standard search mechanisms.
• DHQ takes a more cautious form of responsibility for articles in Zone 2: we'll curate the data and we'll do our best to keep the code working, but we can't guarantee that we'll support all of its dependencies in the future, since we can't be sure our resources will support that level of effort.
• Zone 3 is a space of authorial autonomy, with many fewer constraints on the author and greatly diminished responsibility on DHQ's part:
  - The code needs to be something that can actually run on DHQ servers without risk, or else the author can host it on his/her own server.
  - The code needs to conform to good practice (documentation and commenting).
  - There needs to be an XML fall-back description, which is even more important in this case because the likelihood of fragility is so much greater.

So it's interesting to consider at this point what forms that experimentation might take: how do authors actually want to experiment, and how far are we actually prepared to go to support them? At a very simple level:
• We can observe that authors do want control over formatting, and this gives us a window into what "authoring" in the digital medium entails.
• The most common kinds of requests or push-back we get from authors have to do with layout: the formatting of tables, the placement and sizing of images, the fine-tuning of epigraphs and code samples.
• Note that these are all components with a strong visual component to their rhetoric; unlike paragraphs and notes and block quotations and citations, in which the strength of the semantic signal is so strong that we receive their full informational payload regardless of how they are formatted, these visual features have the potential to mean differently or less successfully if they look different.
• These are also all features for which it would be comparatively easy for DHQ to provide finer mechanisms of control simply by making our own stylesheets more elaborate (asking them to handle more article-specific renditional information, and taking the trouble to work out the potential collisions and tricky cases): so the chief limiter here is cost.

At a more advanced level, authors might experiment by proposing new semantic features.
The actual examples so far have been features that are recognizable but that we just hadn't anticipated and hadn't developed any specific encoding for:
• Timelines
• Annotated bibliography
• Survey data
• Oral history interviews

We have the choice here of representing these as if they were more generic features we already support (an oral history interview is a dramatic dialogue; a timeline is a kind of list), or of treating them as semantically distinct. The most compelling motivation for the latter approach would be the possibility of strengthening our support for the study of discourse, which would entail having a larger set of instances: so here, the role of the initial experiment is to bring a given feature to our notice, but the work of actually supporting it is only warranted if it's a feature other people want as well.

We have also had a few examples of genuinely experimental writing in which the author was deliberately departing from the genre of the scholarly article. (Slides: Trettien)
• The question we have to ask here is: are these experiments in semantics or in design? We've seen that a journal like DHQ can in principle accommodate authorial control over display (at a cost), and as we noted earlier, we have at least theoretically entertained the idea of allowing authorially specified semantics through a specialized schema. The question is, which are these experimental authors asking for?

If we examine these cases more closely, a few points are worth noting:
• The experimental cases so far have been expressed as JavaScript and HTML, and their rhetorical innovation takes the form of textual behaviors: responsiveness to reader actions (mouseover, clicking) in the form of navigation and motion, the text moving or changing form.
• In other words, they emphasize effects which are significant precisely because they depart from display norms; the Trettien piece plays on our expectations of textual fixity and accuracy, and the Bianco piece thwarts our expectations about reading one thing at a time.
• However, they don't seem to introduce a new semantics, a new rhetorical feature that they could usefully declare through their encoding: the innovation lies in what they do rather than in what they are; it lies precisely in how the reader will experience the surface of the text rather than in what the reader might do if he/she could get at the underlying data and work directly with that. Giving the reader access to "the data" would give the reader nothing at all of what is actually going on in these pieces.

So far, we have not had any proposed experiments that work in the other direction. What would they look like?
• An article that does exactly what Trettien did, but using XML rather than HTML as the source data.
• An article that is mostly structured data (e.g. data from a survey) with XSLT that presents it to the reader for inspection and manipulation (sorting, filtering).
• A special issue that uses a TEI customization and for which the guest editors have developed XSLT and CSS that exploits the articles' markup.

The best way for us to pursue this kind of experimentation would be to invite proposals, perhaps structured around a grant proposal to provide some support for stylesheet development. (Consider this an informal invitation!)

4.
Next Steps

DHQ has several developmental projects under way:
• With generous support from a grant organized by Marco Büchler from the University of Leipzig, we are implementing an OAI-PMH server for DHQ through which we can better expose the journal's metadata.
• [slide] We have just completed an NEH DH start-up grant which funded the development of a centralized bibliography for DHQ: an important improvement for DHQ's production processes, but one that also opens up some exciting potential for citation analysis and data visualization; we'll be publishing an article about this in the coming months.
• We are also in the planning stages of a project to explore internationalization of the journal through a series of special issues dedicated to individual languages. This will involve some further work on the schema and interface, and also changes to the workflow to accommodate a multilingual review process. We will be working within our existing constraints of finances and personnel so we'll need to proceed deliberately, but we're excited to be undertaking this step.

[Slides accompanying the paper:]

Slide: Challenges of an XML-based Open-Access Journal: Digital Humanities Quarterly. Julia Flanders, Northeastern University; John Walsh, Indiana University; Wendell Piez, Piez Consulting; Melissa Terras, University College London.

Slide: Quadrant chart plotting journals by "Experimentation with Data" against "Experimentation with Interface": DHQ, Archive, Vectors, jTEI, Digital Medievalist, DHNow/JDH, Scholarly Editing, DS/CN, Digital Commons.

Slide: Background on DHQ.
• Founded in 2005, first issue in 2007
• Jointly funded by ACH and ADHO
• Hosted and supported at Northeastern University and Indiana University
• Grant-funded special projects

Slide: Staff and organization.
• General Editors: Julia Flanders, Wendell Piez, Melissa Terras
• Technical Editor: John Walsh
• Managing editors: Elizabeth Hopwood, Duyen Nguyen, Jonathan Fitzgerald
• Technical assistant (currently vacant)
• Editorial team: Stéfan Sinclair, Adriaan van der Weel, Alex Gil, Michelle Dalmau, Jessica Pressman, Geoffrey Rockwell, Sarah Buchanan
• Special teams players: Jeremy Boggs
• Abundant excellent peer reviewers

Slide: Publication architecture diagram: Subversion Repository; DHQ Server Space (digitalhumanities.org); Browser; TEI/XML articles; XSLT; Cocoon; DHQ Bibliographic Data; OAI Server; OAI Harvesters.

Slide: Workflow diagram (2015): Submission (Word, TEI, HTML, plain text) to Open Journal System (review, feedback, revision tracking), conversion to TEI via OxGarage, DHQ Subversion (encoding, author review), Publication.
Slide: An Experiment in XML. Experimental text block with behaviors controlled by stylesheets, and the possibility of inline elements whose formatting and behavior are also controlled by stylesheets. Namespaces could also be used to include user-defined elements (or elements from other established XML languages) with specified semantics.
Slide: Example revision note: "This article has been revised since its original publication. A response solicited by the author from Matthew Kirschenbaum has been added as a footnote."

Slide: Zones of experimentation.

  Zone    Features                                                         Curation
  Zone 1  DHQ markup and stylesheets                                       DHQ in perpetuity
  Zone 2  Author-supplied code, constrained by DHQ support capabilities;   DHQ good faith curation
          Zone 1 fallback required
  Zone 3  Author-supplied code, constrained by good practice guidelines;   No DHQ responsibility
          Zone 1 fallback required

Slide: "Mapping Cultures in the Big Tent: Multidisciplinary Networks in the Digital Humanities Quarterly," Dulce Maria de la Cruz, Jake Kaupp, Max Kemman, Kristin Lewis, and Teh-Hen Yu. Final project submitted for Information Visualization MOOC, Indiana University, May 2015.

Slide: Thank you! Julia Flanders (@julia_flanders), John Walsh, Wendell Piez, Melissa Terras (@melissaterras).

work_bvl2s7smknapjip66quntvdmo4 ---- Experimental assessment of the quality of ergonomic indicators for dynamic systems computed using a digital human model

HAL Id: hal-01383415 (https://hal.archives-ouvertes.fr/hal-01383415), submitted on 18 Oct 2016. HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

To cite this version: Pauline Maurice, Vincent Padois, Yvan Measson, Philippe Bidaud. Experimental assessment of the quality of ergonomic indicators for dynamic systems computed using a digital human model. International Journal of Human Factors Modelling and Simulation, Inderscience, 2016, 5 (3), pp. 190-209. DOI: 10.1504/IJHFMS.2016.10000531. hal-01383415.
Moreover, they exclude dynamic phenomena because their measurements require heavy instrumentation. However, collaborative robots are not static objects, but dynamic systems which motion influences and is influenced by the physical interaction with the worker. Plus, the worker him/herself is also a dynamic system, on which dynamic phenomena have ergonomic consequences, even without the presence of a collaborative robot. In order to perform more thorough assessments of the ergonomic performances of dynamic systems, it is proposed to use a dynamic digital human model (DHM) for the evaluation, associated with a dedicated ergonomic metric. This paper presents preliminary results on three ergonomic indicators formulated to meet the requirements of ergonomic evaluations of dynamic systems. They evaluate respectively the position of the worker, his physical effort and the energy spent during the task. The same manual task is performed by seven human subjects under different time, load and geometric constraints. Each performance is recorded and replayed with a dynamic DHM in a dynamic simulation framework, in order to calculate the values of the indicators. All three indicators are strongly affected by the geometric parameters in a way that is consistent with ergonomic guidelines. Besides, a linear correlation between the values of the indicators and the strenuousness perceived by the subjects is observed. Moreover, the results show that the relevance of an indicator is strongly affected by the task features, especially its duration. Future work will be directed towards automatic selection of relevant indicators for a given task. Keywords: Ergonomics, Digital Human Model, Dynamic Motion Simulation, Motion Capture and Replay. 1. Introduction Though working conditions have improved in de- veloped countries, work-related musculoskeletal dis- orders (MSDs) remain a major health problem. In 2005, MSDs represented 59% of the occupational diseases and affected over 35% of industrial workers in Europe (Schneider and Irastorza, 2010). In the US, the total cost of MSD has been estimated around $45 to 54 billion per year (National Research Council and Institute of Medicine, 2001). Hence decreasing MSD is a high-stakes socioeconomic issue. The causes of MSDs are often multi-factorial and include different kinds of factors: personal, organi- zational, psychosocial and biomechanical (Schneider and Irastorza, 2010). However, the major risk factors are often biomechanical: most MSDs at least partly result from strenuous biomechanical demands caused by physical work (Luttmann et al., 2003). Replacing men by robots to accomplish hard tasks might then be considered an option to decrease the prevalence of MSDs. But despite the growing robotization in industry, many tasks cannot be fully automatized because of their unpredictability or their technicality. A solution is to assist the worker with a collabora- tive robot, rather than replacing him. A collaborative robot enables the joint manipulation of objects with the worker and thereby provides a variety of benefits, such as strength amplification, inertia masking and guidance via virtual surfaces and path (Colgate et al., 2003). To ensure that the use of these devices do de- crease the risk of MSDs, an ergonomic assessment of the robot-worker system must be performed through- out the design process. Standard ergonomic methods are based on the observation of a worker performing *Corresponding author. Email: maurice(at)isir.upmc.fr 1 P. 
Given that this assessment aims at guiding the design of the robot, a new prototype is needed every time a mechanical parameter of the robot is changed, which is a significant limitation in terms of cost and time. Besides, these evaluations usually exclude dynamic phenomena that nevertheless affect the risk of MSDs, because measuring them requires heavy instrumentation of the worker. An alternative is to carry out the assessment within a digital world, where modifications are simpler and many physical quantities can be accessed at lower cost. Several tools offer the possibility to perform ergonomic evaluations of a workplace in a virtual environment by simulating the worker with a digital human model (DHM): e.g. Delmia (www.3ds.com/fr/products/delmia), Jack (Raschke, 2004), Ramsis (Seidl, 2004), Sammie (Porter et al., 2004). The manikin is animated through motion capture data, direct or inverse kinematics, or predefined postures and behaviors. Various ergonomic assessment methods are included in these software products. The first class of methods estimates the level of risk depending on the exposure to the main MSD factors. The most widely known are RULA (Rapid Upper Limb Assessment), REBA (Rapid Entire Body Assessment), OWAS (Ovako Working posture Analysis System), the OCRA index (Occupational Repetitive Action), and the OSHA checklist (Li and Buckle, 1999; David, 2005). The second class of methods consists of equations or tables that give psycho-physiological limits not to be exceeded in order to minimize the MSD risk during manual handling operations. The most famous are the NIOSH equation (Waters et al., 1993) and the Snook and Ciriello tables (Snook and Ciriello, 1991), which determine a maximum acceptable load weight depending on the task features.

Though a wide variety of methods are available, they are not suitable for the design of collaborative robots. Such robots must be optimized considering the whole activity and the whole human body. But the tasks which may be addressed by these robots are varied and often complex, whereas the existing assessment methods are specific either to a type of activity and/or to a body part. So the evaluation of the entire activity will very likely require the use of several methods, the results of which are mostly not homogeneous and therefore cannot be compared. Moreover, what might be the main drawback of these observational methods is that they are static, meaning that dynamic phenomena are not taken into account. Yet it has been established that fast motions increase the risk of MSDs - even when there is no interaction with a robot - because of the efforts they generate in biological tissues. In collaborative robotics, evaluating the dynamic stages of the activity is even more important because, though designed to be so, the robot is never perfectly backdrivable. Some phenomena can be hard to compensate, even with a dedicated control law. In this case manipulating the robot requires extra effort from the worker. For instance, collaborative robots providing strength amplification are usually powerful and thus heavy: they are highly inertial, so leaving dynamic stages out of the assessment can lead to an underestimation of the risk.

Beyond these methods associated with macroscopic human body modelling, some DHM tools provide very accurate biomechanical models including muscles, tendons, and bones, e.g.
AnyBody (Damsgaard et al., 2006) and OpenSim (Delp et al., 2007). They can calculate quantities such as muscle force or tendon length, which are closely linked to MSDs (Luttmann et al., 2003), and sometimes even include dynamic effects. But such models usually require tuning biomechanical parameters, which cannot be properly done without subject-specific knowledge of the human body. Besides, these tools provide a measurement for each muscle, tendon, etc. In order to represent the whole-body situation, these local scores have to be combined in a way that is left to the user to determine. This last criticism also applies to simpler models which provide local measurements such as forces in joints.

The work presented in this paper aims at developing a DHM-based ergonomic assessment method fitted for collaborative robot design. This requires the development of both a dedicated ergonomic metric (what to measure) and a measuring tool (how to measure) which are suitable for evaluating the ergonomic performances of dynamic systems. Note that though this work targets collaborative robots, its scope is broader and actually addresses the more general issue of assessing ergonomic performances in dynamic situations. This paper focuses on the formulation of ergonomic indicators and their use with a dynamic DHM. In section 2 three indicators are defined in order to meet the requirements of collaborative robotics. An experimental validation is conducted to ensure that they are ergonomically consistent: the influence of various work conditions on the indicator values is studied. The protocol is described in section 3. The results are presented in section 4 and discussed in section 5. Section 6 concludes on the relevance of these indicators and the associated DHM and proposes some perspectives about their use within a global assessment method.

2. Definition of indicators

Ergonomic indicators should account for the main MSD risk factors, which are strong postural demands, high-intensity forces, long exposure duration and highly repetitive exertions. Repetitiveness, as well as the effect of static work (i.e. maintaining a posture without moving), are omitted in this work. Indeed, though repetitiveness and postural change can easily be extracted from the simulation, their biomechanical impacts on the human body are hard to quantify precisely. It requires understanding how these time-frequency factors affect human physical capacities, which is closely related to the open problem of fatigue modeling and is out of scope here. It should nevertheless be noted that the purpose here is not the assessment of the absolute level of risk for the worker, but the comparison of assistive devices, which are not expected to dramatically affect the work rate.

The instantaneous postural risk includes two phenomena: the proximity to joint limits and the effort needed to maintain the posture. In reality muscular effort is not due solely to gravity, but also to the dynamic forces associated with the motion, and to the external force caused by the interaction with an object. The former are hardly ever taken into account in existing methods, while the accuracy with which the latter is considered varies much from one method to another.
In order to accurately evaluate the effect of an external force on the musculoskeletal system, the repartition of the effort among the whole musculoskeletal system - which depends on the posture - must be computed. In this work a DHM is used to simulate the worker, so unlike with a real human, the actuation forces (joint torques or muscle forces, depending on the level of detail of the model) can easily be accessed without requiring heavy instrumentation. A simple rigid-body model with hinge-joint actuation is chosen (because, as stated previously, very detailed models are quite difficult to use), so these forces correspond to joint torques. Note that this rigid-body model necessarily leaves aside the effects of additional MSD factors such as temperature and vibrations. Since the DHM is animated within a dynamic simulation, the joint torques result from the inverse dynamical model of the manikin. They include all three effects: gravity, dynamics, and external force. Despite their various origins, these three phenomena all have the same consequence on the musculoskeletal system, so they are considered together in the risk assessment. On the contrary, the effect of the proximity to joint limits is of a different kind. Though the combination of several MSD factors increases the risk, the way they interact is not well-established. So it is preferred here to evaluate them separately rather than trying to mix them together.

Since disorders may appear as soon as the demands exceed the worker's capacities, a way to estimate the risk is to compare each demand with its limit value. Since DHM ergonomic assessments - like most ergonomic studies - are at a population level and not at a personal level, average capacities for joint range of motion and maximal joint torques are used (Holzbaur et al., 2005; Chaffin et al., 2006). The influence of joint angles and velocities on maximal joint torques is currently omitted, though models of this phenomenon can be found in the literature (Chaffin et al., 2006). However, the influence of force-induced fatigue is included. Instead of being constant throughout the task, the torque capacity of joint i (i representing successively each hinge joint of the human body model) is affected by the force exertion according to the following evolution law (Ma et al., 2009):

\tau_i^{max}(t) = \tau_i^{max}(0)\, e^{-k \int_0^t \frac{\tau_i(u)}{\tau_i^{max}(0)}\, du}   (1)

where k is a fatigue rate set to 1 min^{-1}, \tau_i^{max}(0) is the nominal torque capacity of joint i (before any effort), and \tau_i^{max}(t) and \tau_i(t) are respectively the torque capacity and the torque exerted by joint i at time t.

For both the joint angles and torques, the resulting normalized demands on every joint are added to form a score representing the whole-body situation. This instantaneous score is time-integrated to provide a score representing the whole activity, taking into account the duration factor. The resulting indicators are I_q for the joint positions and I_\tau for the joint torques:

I_q = \frac{1}{N} \sum_{i=1}^{N} \int_0^T \left( \frac{q_i(t) - q_i^{neutral}}{q_i^{max} - q_i^{neutral}} \right)^2 dt   (2)

I_\tau = \frac{1}{N} \sum_{i=1}^{N} \int_0^T \left( \frac{\tau_i(t)}{\tau_i^{max}(t)} \right)^2 dt   (3)

where N is the total number of joints in the body model, T is the duration of the task, q_i(t) and \tau_i(t) are the angle and the torque of joint i at time t, q_i^{max} is the joint angle capacity (joint limit), q_i^{neutral} is the neutral position of the joint, and \tau_i^{max}(t) is the joint torque capacity at time t defined in equation (1). The joint neutral positions q_i^{neutral} are defined according to the REBA comfort zones, by taking the joint angles associated with a minimum score in the REBA evaluation.
The resulting posture is standing upright, arms along the torso, elbows flexed at 80°. This so-called (in this work) "neutral ergonomic posture" is defined by considering only the stress due to the proximity to joint limits: the effort needed to maintain the posture is not taken into account, since such effort is accounted for in the torque indicator.

In the literature, fatigue caused by physical work is often determined through metabolic energy expenditure (Garg et al., 1978). Metabolic energy expenditure computation is included in some DHM software (e.g. Jack, EMA (Fritzsche et al., 2011)), but it is restricted to specific tasks for which tables are available (or it requires a very detailed biomechanical model of the human body). Here, the torque indicator I_\tau (Eq. 3) already indirectly represents energy consumption, in particular in static postures. In order to directly take into account the energy consumption during motion, another indicator based on joint power is added:

I_p = \frac{1}{N} \sum_{i=1}^{N} \int_0^T | \dot{q}_i(t)\, \tau_i(t) |\, dt   (4)

where \dot{q}_i(t) is the velocity of joint i at time t. Though it does not strictly correspond to metabolic energy expenditure, the association of I_\tau and I_p gives an idea of the macroscopic energy consumption.
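To make equations (1)-(4) concrete, the sketch below computes the three indicators from sampled joint trajectories. It is a minimal illustration, not the authors' implementation: the sampled inputs, the array layout and the trapezoidal integration are assumptions.

```python
import numpy as np

def indicators(t, q, dq, tau, q_max, q_neutral, tau_max0, k=1.0 / 60.0):
    """Compute Iq, Itau, Ip from sampled joint trajectories (sketch).

    t        : (T,) time stamps in seconds
    q, dq    : (T, N) joint angles (rad) and velocities (rad/s)
    tau      : (T, N) joint torques (N.m)
    q_max    : (N,) joint limits; q_neutral : (N,) neutral angles
    tau_max0 : (N,) nominal torque capacities
    k        : fatigue rate (1 min^-1 in the paper, expressed here in s^-1)
    """
    N = q.shape[1]
    # Eq. (1): fatigue-attenuated torque capacity, using a running
    # rectangle-rule integral of the normalized torque.
    cum = np.concatenate([np.zeros((1, N)),
                          np.cumsum((tau[:-1] / tau_max0) *
                                    np.diff(t)[:, None], axis=0)])
    tau_max_t = tau_max0 * np.exp(-k * cum)
    # Eq. (2): time-integrated squared deviation from the neutral posture.
    Iq = np.trapz(((q - q_neutral) / (q_max - q_neutral)) ** 2, t, axis=0).mean()
    # Eq. (3): time-integrated squared torque, against the fatigued capacity.
    Itau = np.trapz((tau / tau_max_t) ** 2, t, axis=0).mean()
    # Eq. (4): time-integrated absolute joint power.
    Ip = np.trapz(np.abs(dq * tau), t, axis=0).mean()
    return Iq, Itau, Ip
```

The final .mean() over joints realizes the 1/N averaging of the definitions; the fatigue capacity is reset between tasks, as stated in the protocol below.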
3. Validation of indicators

An experimental validation is carried out to ensure that the above-defined indicators correctly account for the relative exposure level to MSD risks in dynamic situations (i.e. in tasks including motion). Human subjects perform a manual task in various conditions while their movements and external forces are recorded. Each case is replayed with a dynamic DHM, in order to compute the corresponding indicator values. Their variations are qualitatively investigated to highlight their dependence on the task conditions.

3.1. Experimental protocol

a) Task description: A generic manual task is performed. (It should be noted that the present experiment does not include interaction with a robot or other dynamic systems. However, as mentioned in section 1, the proposed method addresses any situation including dynamic phenomena, starting with tasks requiring motion of the worker.) A seated subject moves a tool along a displayed path while pushing on the work surface with it. The tool is a 200 g, 15 cm long handle held with the whole right hand. The path is a 50 cm square. Two sides are replaced respectively with a sinusoidal line and a sawtooth line, to accentuate the joint dynamics (see Fig. 2). Its size is chosen so that the task demands wide joint clearance yet remains feasible by a seated subject. Performing the task means following the entire path once. The subject is instructed to use neither his left arm nor his legs.

b) Parameters: Four parameters vary throughout the experiment: the orientation of the work surface, the position of the seat relative to the work area, the allotted time and the magnitude of the force to be applied.

Table 1: Values of the parameters describing the position of the seat. H stands for horizontal and V for vertical: they refer to the orientation of the work plane.

  Height          Distance           Orientation
  low: 38 cm      (H) close: 20 cm   45° right
  medium: 52 cm   (H) far: 45 cm     45° left
  high: 66 cm     (V) close: 45 cm   0° (face on)
                  (V) far: 75 cm

The work surface is either horizontal or vertical. The various positions of the worker's seat are described in Fig. 1 and Table 1. The close and medium values are chosen to match ergonomic guidelines for seated work (Chaffin et al., 2006). All combinations are tested except horizontal-close-high, because the legs do not fit under or in front of the table, and 45° right is only done for close-medium, for reachability reasons.

Figure 1: Definition of the parameters describing the position of the worker's seat for the horizontal (top) and vertical (bottom) work planes (views from the right, from above, from the left and from behind, showing the work plane, the path, and the distance, height and orientation parameters). The distance parameter is measured from the center of the subject's seat to the border of the path closest to the subject.

The allotted time and the magnitude of the force define three varieties of the original task, described in Table 2 as neutral, force and velocity. The force magnitude in the "force" task is slightly lower than the maximal force capacity, calculated for this particular movement according to (AFNOR, 2008). The subject is provided with audio feedback on the exerted force: low-pitched, high-pitched or no sound when the force is respectively too weak, too strong or within the imposed range. The allotted time is displayed through a progress bar on a screen, and the subjects are instructed to move the tool as regularly as possible along the path.
2) which can be animated through several customizable ways. The model consists of 20 joints and 45 degrees of freedom. Each DoF is a hinge joint controlled by a sole actuator. This hinge joint representation is a simplified model, therefore the joint torques of the model do not strictly correspond to the efforts in real human joints (for instance, the dynamics of muscles activation is not rendered). However, it should be noted that the proposed indicators are not dependent on the human body model used for the simulation: they can equally be used with a more detailed model if available. The human model is automatically scaled according to the stature and mass of the subject. Each body segment is further manually modified to match the subject morphology. Figure 2: Left: A human subject performs the task while his motion is recorded. Right: The motion is replayed with a virtual manikin within a dynamic simulation framework. b) Manikin control: The motion is replayed by solving an optimization problem to determine the actuation variables (joint accelerations, joint torques and ground contact forces) which allow to follow the markers trajectories at best, while respecting physical and biomechanical constraints. The LQP controller framework developed by Salini (Salini et al., 2011) is used. Mathematical formulation of the problem is given in equation 5. argmin X ∑ i ωiTi(X) s.t.   M(q)ν̇ + C(q, ν) + g(q) = S τ − ∑ j J T cj (q)wcj GX ⪯ h (5) where τ is the joint torques, wc the contact forces, q the generalized coordinates of the system (i.e. vector of joint positions), ν the generalized velocity con- catenating the floating-base twist and the joint ve- locities q̇, and X = (τ T , wcT , ν̇T )T . The equality constraint is the equation of motion: M is the inertia matrix of the system, C the vector of centrifugal and Coriolis forces, g the vector of gravity forces, S the actuation selection matrix, and JTc the Jacobian of contacts. The inequality constraint includes the bounds on the joint positions, velocities, and torques (all formulated with the problem variables τ and q̈), and the contact existence conditions for each contact point, according to the Coulomb friction model: Ccj wcj ≤ 0 ∀j Jcj (q)ν̇ + J̇cj (ν, q)ν = 0 ∀j (6) 4www.codamotion.com 5www.kalisteo.fr/lsi/en/aucune/a-propos-de-xde 5 P. MAURICE, Ergonomic indicators for collaborative robotics where cj is the jth contact point, Ccj the corre- sponding linearized friction cone, and wcj the contact wrench. Note that the values of the contact forces in- suring the balance of the system (here the interaction between the seat and the DHM’s thighs) do not need to be known beforehand: they are automatically com- puted in the optimization, in order to be compatible with the system dynamics and the effort exerted by the hand on the tool (which needs to be given as an input of the optimization) . The objective function is a weighted sum of tasks Ti - defined as functions of the optimization vari- ables - representing the squared error between a de- sired acceleration or wrench and the system acceler- ation/wrench (ωi are the weighting coefficients). The solution is then a compromise between the different tasks, based on their relative importance. 
The follow- ing tasks are defined (tasks can be defined both in joint and in operational spaces): • Operational space acceleration task ∥Ẍi − Ẍ∗i ∥ = ∥Jiν̇ + J̇iν − Ẍ ∗ i ∥ 2 • Joint space acceleration task ∥q̈ − q̈∗∥2 • Operational space wrench task ∥wi − w∗i ∥ 2 • Joint torque task ∥τ − τ ∗∥2 where Ẍi is the Cartesian acceleration of body i, and wi the wrench associated with body i. The superscript ∗ refers to the desired acceleration/force, which are defined by a proportional derivative control. For in- stance, the desired operational acceleration is: Ẍ∗i = Ẍ goal i + K Xi v (Ẋ goal i − Ẋi) + K Xi p (X goal i − Xi) (7) where KXip and K Xi v are the proportional and deriva- tive gains for the considered task (they are parameters set by the user). The superscript goal refers to the tar- get value for the body or joint. Though the tasks need to be described in terms of the optimization variables (joint accelerations, joint torques and contact forces) for the problem to be solved, position or velocity can also be controlled with the proposed task model. For instance, an operational space position task (put body i at a given Cartesian position, with null velocity and acceleration) is defined by setting Ẍgoal and Ẋgoal to zero. Similarly, the desired joint acceleration is: q̈∗ = q̈goal + Kqv(q̇ goal − q̇) + Kqp(q goal − X) (8) where Kpp and K p v are the proportional and derivative gains for the considered task. In this work, the operational space acceleration tasks are defined from the markers trajectories. The weights are chosen accordingly to the technique by Demircan (Demircan et al., 2010), though here weighted instead of hierarchical control is used. The markers associated with limbs extremities and the pelvis are given the biggest weight, then the weight decreases when the body is further away from the extremities. Contrarily to inverse dynamics methods, the contact forces with the seat are not imposed here, but result from the optimization problem. So the only Cartesian force task is the contact force with the tool. The desired value is given by the force sensor mea- surement. Low weight joint position tasks are added for the body parts that are not controlled through the markers positions, so that there is no unwanted motion. Finally there is a joint force task which aims at minimizing the joint torques to prevent useless effort. Its weight is very small since it must not hinder the other tasks. 4. Results The following results depict the variations of the indicators depending on the task features. Values are averaged on all subjects since the indicators are not meant to be subject specific. For the sake of clarity, the values in each figure are normalized by the min- imum and maximum values of the addressed case. Note that unless explicitly stated, the duration of the task is not normalized for the computation of the indicators. 4.1. Position Indicator A linear correlation is observed between the posi- tion indicator values and the strenuousness perceived by the subjects when considering tasks of similar duration. The Pearson’s correlation coefficients are respectively 0.86 (p=0.015), 0.89 (p<0.01) and 0.87 (p<0.01) for the neutral, force and velocity tasks con- sidered separately, and 0.84 (p<0.01) for the neutral and the force tasks considered together. However this coefficient drops to 0.54 (p<0.01) when the velocity task, which is approximately 6 times shorter than the others, is added. 
This suggests that the proposed position indicator is only relevant to compare tasks of similar duration. Comparison within a same task: • Seat distance and orientation: The indicator is higher (t-test, p=0.003) when the subject sits further away from the work area (see Fig. 3), because he has to deviate much from the ”neutral ergonomic posture” to reach the path. What actually matters is the distance from the path to the right hand, which handles the tool. This explains why the left orienta- tion seems better than the face one (see Fig. 1), and why the right orientation, though associated with a close position, is roughly equivalent to the far cases. • Seatheight: In close position, thebest seatheight according to the indicator is the medium one when the work plane is horizontal, and the high one when it is vertical. These results are ergonomically consistent: in the horizontal case, the medium height was cho- 6 P. MAURICE, Ergonomic indicators for collaborative robotics sen in accordance with ergonomic guidelines; in the vertical case, the high height requires less work with the arm raised, a position discouraged by ergonomic guidelines. • Work plane orientation: For a same position of the seat, the indicator values are significantly higher (t-test, p<0.01) in the vertical case than in the hori- zontal one (see Fig. 3). The center of the path is set higher in the vertical case, so it requires the subject to work with the arm raised. Besides the imposed tool orientation (axis normal to the work plane) and whole hand grasp lead to unusual arm angles when the work plane is vertical (elbow higher than shoulder). Seat distance and orientation Seat height Work plane orientation Fr - Fc Cl - Fc Cl - Rg Cl - Lf Fr - Lf Lw Md Hg Lw Md Hg VerticalHorizontal Lf : Left Fc : Face Rg : Right Fr : Far Cl : Close Lw : Low Md : Medium Hg : High Min Max 2.3 5 4.8 2.5 3.5 3.5 4 1 1.8 4.4 4.8 7 5.3 7.3 5.7 3.4 3.4 6.3 6 6.7 3.4 7.3 6.3 4.3 Strenuousness Figure 3: Variations of Iq depending on the position of the subject’s seat and the work plane orientation (neutral task). The numbers correspond to the strenuousness perceived (between 0 and 10) by the subjects. Fr - Fc Cl - Fc Cl - Rg Cl - Lf Fr - Lf Lw Md Hg Lw Md Hg Min Max Lw Md Hg Artificial Velocity Neutral Force Seat distance and orientation Seat height Task 7 5.3 7.3 5.7 3.4 3.4 6.3 6 6.7 3.4 7.3 6.3 4.3 9.5 6.7 10 7.7 5.3 5.7 8.3 8 8.7 4.3 9.7 8 6 Lf : Left Fc : Face Rg : Right Fr : Far Cl : Close Lw : Low Md : Medium Hg : High Strenuousness Figure 4: Variations of Iq depending on the position of the subject’s seat and the kind of task: neutral, force or artificial velocity (vertical work plane). The numbers cor- respond to the perceived strenuousness. The strenuousness is not displayed for the artificial velocity task since this task has not been performed by human subjects, therefore its strenuousness has not been evaluated (and normalizing the perceived strenuousness would be meaningless). Comparison between different tasks: As stated be- fore, the position indicator does not seem suitable to compare tasks which duration differ significantly. Therefore, in this section, the durations of the tasks are artificially equalled so that the results of the three tasks can be compared. To this purpose, an artificial velocity task is created by replaying the whole gesture with the DHM six times consecutively (the real veloc- ity tasks is six times shorter than the neutral and force tasks). 
Note that this artificial velocity tasks is an approximation since the simulated gesture is identical the six times, whereas a real subject would probably show variations in his/her gesture. The artificial ve- locity task results in the smallest values of the position indicator (see Fig. 4). Actually, the allotted time for one loop on the path is so short that the path has to be smoothed, thus requiring less extreme joints angles. On the other hand the difference between the neutral and force tasks is not statistically significant. Despite the force exertion, the subjects do not modify their posture much, either because it is already strongly constrained by the imposed hand trajectory and seat position, or because the demanded external force is small enough not to require any change in the posture. 4.2. Torque Indicator A good correlation between the torque indicator values and the perceived strenuousness is observed within a same task (Pearson’s coefficient equals re- spectively 0.81 (p<0.01), 0.84 (p<0.01), and 0.85 (p<0.01) for the neutral, force, and velocity tasks) or when the neutral and force tasks are considered together (Pearson’s coefficient equals 0.81 (p<0.01)). But the correlation coefficient drops to 0.59 (p<0.01) when all three tasks are considered together. Simi- larly to the position indicator, the proposed torque indicator is not suitable to compare tasks of different durations. Comparison within a same task: The torque indica- tor is highly affected by the position of the subject rel- ative to the work area, because of the effect of gravity on his body segments (see Fig. 5). The further away the seat is from the work plane, the more the subject must deviate from an upright position, needing higher joint torques to maintain this posture. Min Max Lf : Left Fc : Face Rg : Right Fr : Far Cl : Close Lw : Low Md : Medium Hg : High Horizontal work plane Seat height Seat distance and orientation Lw Md Hg Fr - Fc Cl - Fc Cl - Rg Cl - Lf Fr - Lf Force task Neutral task 4 6.5 5.5 4 5.3 5 4.8 2.8 3.5 4.8 6.3 2.3 5 4.8 2.5 4 3.5 3.5 1 1.8 4.5 4.8 Lw Md Seat distance and orientation Hg Fr - Fc Cl - Fc Cl - Rg Cl - Lf Fr - Lf Seat height Force task Vertical work plane 7 5.3 7.3 5.7 3.4 3.4 6.3 6 6.7 3.4 7.3 6.3 4.3 9.5 6.7 10 7.7 5.3 5.7 8.3 8 8.7 4.3 9.7 8 6 Neutral task Figure 5: Variations of Iτ depending on the external force and the seat position. Left: horizontal work plane. Right: vertical work plane. The numbers correspond to the per- ceived strenuousness. Comparison between different tasks: In this sec- tion, the artificial velocity task (where the motion is replayed six times consecutively with the DHM) is considered instead of the real velocity task, in order to compare tasks of similar durations. Indeed, as men- 7 P. MAURICE, Ergonomic indicators for collaborative robotics tioned above, the torque indicator seems suitable only to compare tasks of similar durations. • External force: When the work plane is vertical the torque indicator of the force task is significantly higher (p=0.002) than the one of the neutral task, whereas they are not significantly different (p=0.28) in the horizontal case. This can be explained by the fact that in the horizontal case, gravity helps pushing downwards on the workplane. In the neutral task subjects need to exert an upward torque to counter the effect of gravity and maintain their arm, whereas in the force task, the arm weight is useful to ease the downward pushing effort and therefore does not need to be compensated in the same way. 
This phe- nomenon does not exist for the vertical workplane, since the direction of gravity and of the pushing force are orthogonal. Fr - Fc Cl - Fc Cl - Rg Cl - Lf Fr - Lf Lw Md Hg Lw Md Hg Min Max Lw Md Hg Artificial Velocity Neutral Force Seat distance and orientation Seat height Task 7 5.3 7.3 5.7 3.4 3.4 6.3 6 6.7 3.4 7.3 6.3 4.3 9.5 6.7 10 7.7 5.3 5.7 8.3 8 8.7 4.3 9.7 8 6 Lf : Left Fc : Face Rg : Right Fr : Far Cl : Close Lw : Low Md : Medium Hg : High Strenuousness Figure 6: Variations of Iτ depending on the seat position for all three tasks velocity, neutral and force (vertical work plane). The numbers correspond to the perceived strenuous- ness. The strenuousness is not displayed for the artificial velocity task since this task has not been performed by human subjects, therefore its strenuousness has not been evaluated (and normalizing the perceived strenuousness would be meaningless). • Speed of motion: The torque indicator of the ar- tificial velocity task is significantly higher (p=0.019) than the one of the neutral task, because the faster dy- namics of the movement induces higher joint torques (see Fig. 6). However, according to the torque in- dicator, this increase in the joint torques is not as important as the one due to the external load in the force task. 4.3. Power Indicator Contrarily to the two previous indicators, the correla- tion between the power indicator and the strenuous- ness is fairly good when all three tasks are considered together (Pearson’s coefficient equals 0.75 (p=0.04)), and does not improve when each task is considered separately (Pearson’s coefficients equal respectively 0.71 (p=0.04), 0.86 (p=0.02) and 0.70 (p=0.03) for the neutral, force and velocity tasks). This suggests that the power indicator is suitable to compare tasks of different duration. Comparison between different tasks: In this sec- tion, the real velocity task (where the motion is re- played only once) is considered, since there is no need to equal the tasks durations with the power indicator. • Speed of motion: Though the velocity task lasts much less than the two others, its power indicator is only slightly lower (see Fig. 7). This is explained by the fact that the joint velocities are much higher in the velocity task,resulting in a much higher instantaneous joint power compared to the neutral and force tasks. The kinetic energy spent during the whole task is therefore about the same in all three tasks, but in the velocity task it results from a high power during a short time, whereas in the neutral and force tasks, it results from a lower power during a longer time. • External force: Contrarily to the torque indica- tor (see Fig. 5 left), the power indicator of the force task is often lower than the one of the neutral task, especially when the seat is far. This result is quite unexpected because a same allotted time and a very similar posture (see section 4.1.) should lead to same joint velocities for both tasks, and therefore Iτ and Ip should have similar variations. This difference is probably due to the fact that the allotted time is not strictly respected (note that the task duration is not normalized in the indicators com- putation). Because the time constraint is not displayed on the path itself, the subject tends to move slightly slower in the force task to better control the force magnitude (especially when his/her posture makes it hard to control). 
The joint velocities are then slightly smaller, and so is the joint power, given that the joint torques are not very different in the neutral and force tasks for the horizontal plane (see section 4.2.). Fr - Fc Cl - Fc Cl - Rg Cl - Lf Fr - Lf Lw Md Hg Lw Md Hg Min Max Lw Md Hg Velocity Neutral Force Seat distance and orientation Seat height Task Lf : Left Fc : Face Rg : Right Fr : Far Cl : Close Lw : Low Md : Medium Hg : High Strenuousness 2.3 5.3 4.5 2.3 4 4 2.5 1 1.8 4 5 4 6.5 5.5 4 5.3 5 4.8 2.8 3.5 4.8 6.3 2.3 5 4.8 2.5 4 3.5 3.5 1 1.8 4.5 4.8 Figure 7: Variations of Ip depending on the seat position for all three tasks velocity, neutral and force (horizontal work plane only). The numbers correspond to the perceived strenuousness. 5. Discussion According to the previous results, the proposed indi- cators account quite correctly for the way a task is performed. Their main variations are ergonomically, or at least physically, consistent, and the few unex- pected results seem to come from ill-adapted choices in the task definition (external force magnitude and direction, display of the time constraint) rather than 8 P. MAURICE, Ergonomic indicators for collaborative robotics from the indicators themselves. However, all the indicators are not equivalent de- pending on the task features (i.e. on what is com- pared). According to the correlation with the stren- uousness, the position and torque indicators do not seem suitable to compare tasks of different durations. On the contrary, this remark does not apply to the power indicator. On the other hand, when consid- ering tasks of similar duration, the position and the torque indicators generally account more accurately for the strenuousness perceived by the worker than the power indicator. Therefore, previously to carrying out a comparison, it is necessary to select the relevant, i.e. the most discriminating, indicators for the given conditions. In most cases there may be several relevant indica- tors. When addressing the position of the seat, the variations of the position and the torque indicators are mainly similar (the closer, the better) and they both show a good correlation with the strenuousness, so one could be tempted to keep only one of them for their study. However these indicators are not redun- dant and sometimes bring antagonistic conclusions: for the best seat distance (close - left), the best seat height is the high one according to the position indi- cator whereas it is the low one according to the torque indicator (see Fig. 3 and 5 right). This opposition may explain the disagreement between subjects’ prefer- ence - i.e. perceived strenuousness - (low seat) and the position indicator recommendations (high seat) in the close cases. Indeed, the strenuousness sum- marizes different kinds of demands (posture, static effort, dynamic effort...) in one value and is therefore an ”aggregated” indicator. Whereas the ergonomic indicators proposed in this work consider different kinds of demands separately. More generally, the design of a workstation - or a collaborative robot - usually results from trade-offs. So this work does not mix several kinds of demands within a sole indica- tor, because considering antagonistic effects within a same task is easier this way. Several indicators can be used in a multi-criteria optimization in order to design a robot which is as good as possible regarding every MSD risk factors. 
Finally, it should be noted that the indicators pro- posed in this work leave out some important phenom- ena related to MSD. In particular the co-contraction of antagonistic muscles, which occurs mainly in tasks requiring high precision (Gribble et al., 2003), is not modelled. Consequences of this omission can be observed in the linear relation between the strenuous- ness and the torque indicator: the y-intercept is bigger in the force task (2.8) than in the neutral task (1.8). The increase in the joint torques during the force task is underestimated in the simulation because it only takes into account the external load (the manikin is not preoccupied with precision), whereas the human subjects must accurately control the force they apply on the work plane, which requires an additional effort due to co-contraction. The omission of the co-contraction phenomenon is not due to the indicator formula, but to the repre- sentation of the human body, in which each joint is controlled by a unique actuator. However this phe- nomenon could be modelled without changing the body model, by using a variable impedance in the manikin control (i.e. adapting the gains Kp and Kd in equations 7 and 8). A higher stiffness allows a more accurate gesture and corresponds to a higher effort. But this has not been implemented since it requires a control law performing trade-offs between the precision and the exertion, which is out of scope here. Nevertheless, the indicators proposed in this work are not intended for medical purpose (e.g. real exposure level to MSD risk factors) but for guiding the design of assistive devices, so this evaluation, though incomplete, is still a first step in the right direction. 6. Conclusion Three ergonomic indicators adapted to the needs of collaborative robotics have been proposed. They con- sider the position and the effort of the worker, and the energy he spends performing a task. An experimental validation has been carried out on seven subjects, in order to study the influence of several task features (geometric, force and time constraints) on the indi- cators values. The subjects’ movements have been recorded with a motion capture system, and replayed with a dynamic DHM to compute the indicators. The indicators show a linear correlation with the strenu- ousnessperceived by the subjects, and their variations are consistent with ergonomic guidelines and physi- cal considerations. Those results suggest that the proposed indicators could be used to compare collaborative robots in the design process. However, each indicator provides different information, so their relevance is highly dependant on the task considered. Further work will be directed towards the development of a method for selecting the relevant set of indicators depending on the task features, in order to perform a multi-objective optimization. References AFNOR , 2008. NF EN 1005 Safety of machinery - Human physical performance. Association francaise de normalisation. Chaffin DB, Andersson GBJ, and Martin BJ, 2006. Occupational biomechanics. Wiley, 4th edition. Colgate JE, Peshkin M, and Klostermeyer SH, 2003. Intelligent assist devices in industrial applications: a review. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2516- 2521. 9 P. MAURICE, Ergonomic indicators for collaborative robotics Damsgaard M, Rasmussen J, Christensen ST, Surma E, and de Zee M, 2006. Analysis of musculoskeletal systems in the AnyBody Modeling System. 
Simulation Modelling Practice and Theory, 14(8), 1100-1111.
David GC, 2005. Ergonomic methods for assessing exposure to risk factors for work-related musculoskeletal disorders. Occupational Medicine, 55(3), 190-199.
Delp SL, Anderson FC, Arnold AS, Loan P, Habib A, John CT, Guendelman E, and Thelen DG, 2007. OpenSim: open-source software to create and analyze dynamic simulations of movement. IEEE Transactions on Biomedical Engineering, 54(11), 1940-1950.
Demircan E, Besier T, Menon S, and Khatib O, 2010. Human motion reconstruction and synthesis of human skills. In: Advances in Robot Kinematics: Motion in Man and Machine, 283-292. Springer.
Fritzsche L, Jendrusch R, Leidholdt W, Bauer S, Jäckel T, and Pirger A, 2011. Introducing ema (editor for manual work activities) - a new tool for enhancing accuracy and efficiency of human simulations in digital production planning. In: Digital Human Modeling, 272-281. Springer.
Garg A, Chaffin DB, and Herrin GD, 1978. Prediction of metabolic rates for manual materials handling jobs. The American Industrial Hygiene Association Journal, 39(8), 661-674.
Gribble PL, Mullin LI, Cothros N, and Mattar A, 2003. Role of cocontraction in arm movement accuracy. Journal of Neurophysiology, 89(5), 2396-2405.
Holzbaur KRS, Murray WM, and Delp SL, 2005. A model of the upper extremity for simulating musculoskeletal surgery and analyzing neuromuscular control. Annals of Biomedical Engineering, 33(6), 829-840.
Li G and Buckle P, 1999. Current techniques for assessing physical exposure to work-related musculoskeletal risks, with emphasis on posture-based methods. Ergonomics, 42(5), 674-695.
Luttmann A, Jäger M, Griefahn B, Caffier G, Liebers F, and Steinberg U, 2003. Preventing Musculoskeletal Disorders in the Workplace. World Health Organization. Protecting Workers' Health Series, 5.
Ma L, Zhang W, Chablat D, Bennis F, and Guillaume F, 2009. Multi-objective optimisation method for posture prediction and analysis with consideration of fatigue effect and its application case. Computers & Industrial Engineering, 57(4), 1235-1246.
National Research Council and Institute of Medicine, 2001. Musculoskeletal Disorders and the Workplace: Low Back and Upper Extremities. National Academy Press.
Porter JM, Case K, Marshall R, and Freer M, 2004. Sammie: A computer-aided ergonomics design tool. In: Working postures and movements - Tools for evaluation and engineering, 431-437. CRC Press.
Raschke U, 2004. The Jack human simulation tool. In: Working postures and movements - Tools for evaluation and engineering, 431-437. CRC Press.
Salini J, Padois V, and Bidaud P, 2011. Synthesis of complex humanoid whole-body behavior: a focus on sequencing and tasks transitions. In: Proceedings of the IEEE International Conference on Robotics and Automation, 1283-1290.
Schneider E and Irastorza X, 2010. OSH in figures: Work-related musculoskeletal disorders in the EU - Facts and figures. European Agency for Safety and Health at Work.
Seidl A, 2004. The Ramsis and Anthropos human simulation tools. In: Working postures and movements - Tools for evaluation and engineering, 454-462. CRC Press.
Snook SH and Ciriello VM, 1991. The design of manual handling tasks: revised tables of maximum acceptable weights and forces. Ergonomics, 34(9), 1197-1213.
Waters TR, Putz-Anderson V, Garg A, and Fine LJ, 1993. Revised NIOSH equation for the design and evaluation of manual lifting tasks. Ergonomics, 36(7), 749-776.

work_c5efogsiqvfwdjdkqde277cqvi ---- J. Anat.
(2004) 204, pp. 165-173 © Anatomical Society of Great Britain and Ireland 2004

The Chinese Visible Human (CVH) datasets incorporate technical and imaging advances on earlier digital humans

Shao-Xiang Zhang,(1) Pheng-Ann Heng,(2) Zheng-Jin Liu,(1) Li-Wen Tan,(1) Ming-Guo Qiu,(1) Qi-Yu Li,(1) Rong-Xia Liao,(1) Kai Li,(1) Gao-Yu Cui,(1) Yan-Li Guo,(1) Xiao-Ping Yang,(1) Guang-Jiu Liu,(1) Jing-Lu Shan,(1) Ji-Jun Liu,(1) Wei-Guo Zhang,(3) Xian-Hong Chen,(3) Jin-Hua Chen,(3) Jian Wang,(4) Wei Chen,(4) Ming Lu,(4) Jian You,(4) Xue-Li Pang,(5) Hong Xiao,(5) Yong-Ming Xie(2) and Jack Chun-Yiu Cheng(6)

1 Department of Anatomy, College of Medicine, Third Military Medical University, Chongqing 400038, China
2 Department of Computer Science and Engineering, and 6 Department of Orthopaedics and Traumatology, Chinese University of Hong Kong, China
3 Department of Imaging Diagnosis, Research Institute of Field Surgery, Daping Hospital, Third Military Medical University, Chongqing 400042, China
4 Department of Radiology, and 5 Department of Radiation Oncology, Southwest Hospital, Third Military Medical University, Chongqing 400038, China

Abstract

We report the availability of a digitized Chinese male and a digitized Chinese female typical of the population and with no obvious abnormalities. The embalming and milling procedures incorporate three technical improvements over earlier digitized cadavers. Vascular perfusion with coloured gelatin was performed to facilitate blood vessel identification. Embalmed cadavers were embedded in gelatin and cryosectioned whole, so as to avoid the section loss resulting from cutting the body into smaller pieces. Milling performed at -25 °C prevented small structures (e.g. teeth, concha nasalis and articular cartilage) from falling off the milling surface. The male image set (.tiff images, each of 36 Mb) has a section resolution of 3072 × 2048 pixels (~170 µm; the accompanying magnetic resonance imaging and computed tomography data have a resolution of 512 × 512, i.e. ~440 µm). The Chinese Visible Human male and female datasets are available at http://www.chinesevisiblehuman.com (the male is 90.65 Gb and the female 131.04 Gb). MPEG videos of direct records of real-time volume rendering are at www.cse.cuhk.edu.hk/~crc.

Key words: 3D reconstruction; Chinese visible human; digital human; cryoembalming; cryosectioning; human digital images; imaging.

Introduction

The Visible Human Project (VHP) initiated Visible Human Research (VHR) by creating the Visible Human Male (VHM) and Female (VHF), publishing the datasets on the Internet in 1994 and 1995, respectively (Spitzer et al. 1996; Spitzer & Whitlock, 1998; Ackerman, 1999; http://www.nlm.nih.gov/research/visible/visible_human.html). Although the US VHP dataset has been widely used, it has limitations (e.g. it suffers from data loss at the three junctions caused by cadaver segmentation, and it is only typical of the Caucasian population). Digitized humans are clearly required that are representative of other populations, and the 'Visible Korean Human (VKH)' project was initiated in 2000, with the first VKH dataset being of a 65-year-old patient who had died of cerebroma (Chung & Kim, 2000; Chung & Park, 2003). This therefore cannot represent a complete normal adult human. In November 2001, the Chinese Visible Human (CVH) project was set up to produce a dataset of a complete normal adult male and female from an Asian population.
The CVH project was planned to include digital images derived from computerized tomography (CT) together with magnetic resonance imaging (MRI), and photographic images from cadaver cryosectioning. The CVH project set out to produce high-quality data and maintain image integrity through two improvements to the sectioning procedures. First, the milling machine table was made large enough to mount a whole embedded human body, so as to avoid data loss caused by fragmenting the body before cryomacrotoming. Second, milling was performed in a laboratory where the temperature was maintained at or below -25 °C, to prevent small structures (including teeth, concha nasalis and articular cartilage) from falling off the milling surface. The CVH male and female were completed in October 2002 and in February 2003, respectively.

Correspondence: Dr Shao-Xiang Zhang, Department of Anatomy, College of Medicine, Third Military Medical University, Chongqing 400038, China. E: Zhangsx@mail.tmmu.com.cn. Accepted for publication 12 January 2004.

Materials and methods

Acquisition of the dataset

Specimen preparation

Ten cadavers of each sex were gifted by the citizens of Chongqing. A key aspect of the CVH project was to use cadavers that were from relatively young adults (20-40 years), and of typical height (160-190 cm) and weight (e.g. no evidence of obesity or emaciation). Those meeting these criteria were screened macroscopically to exclude those with any superficial evidence of organic pathology. The remaining cadavers then underwent preliminary CT and MRI examinations to exclude those with internal lesions or pathology. The final cadavers were then transported to the Third Military Medical University Imaging Center to capture CT and MR images, which were kept to compare with later anatomical images; any cadavers with internal lesions or pathology were further excluded.

The man whose body was used as the CVH male was 35 years old at the time of death. He was 170 cm tall and weighed 65 kg. He had died of carbon monoxide poisoning, and the body was received 2 h after his death. CT and MR images were obtained of each region of the body in the laboratory of the hospital. At 8 h after death, the cadaver underwent measurement of height, weight and configuration. The skin and subcutaneous tissue were then cut below the mid-point of the right inguinal (Poupart's) ligament and the right femoral artery was opened longitudinally. Two tubes were inserted, one cranially and the other caudally, and the cadaver was perfused with 125 000 units of heparin in 200 mL of physiological saline; 11.5 litres of 5% formalin was then injected into the artery (10 litres of formalin was injected into the vessel cranially, and 1.5 litres caudally). The right femoral vein was opened and approximately 2 litres of venous blood was drained. Two hours later, the femoral artery was perfused with a 20% gelatin solution, which was coloured red with food dye: 1300 mL of the gelatin solution was perfused into the artery cranially, and 200 mL caudally. The femoral cut was then sutured layer by layer. At 10 h after death, the cadaver was transferred in the anatomical position to a specially constructed freezer, where the temperature was maintained at -70 °C.

The CVH female was 22 years old at the time of death. She was 162 cm tall and weighed 54 kg. She had died of food poisoning and the body was received 3 h after her death.
The CVH female was prepared in essentially the same way as the male. At 9 h after death, after undergoing CT and MRI examinations, the cadaver was perfused with 5% formalin into the right femoral artery: 8 litres was injected into the vessel cranially, and 1 litre caudally. Two hours later, the femoral artery was perfused with 20% red gelatin solution: 1000 mL of the gelatin solution was perfused into the artery cranially, and 150 mL caudally. At 12 h after death, the body was transferred in the anatomical position to the low-temperature freezer.

CT and MR imaging

CT transverse images were collected every millimetre (in all, there were 1696 CT images for the male and 1618 CT images for the female; see Fig. 1a). A 1.0-tesla superconducting magnetic resonance imager (Siemens Medical Systems, Germany) was used for MR imaging. Spin-echo T1-weighted images were obtained in the axial plane with the following parameters: repetition time (TR) 580-600 ms, echo time (TE) 15 ms, field of view (230-380) × 256 mm, slice thickness 1.5 mm for the head and cervical region and 3.0 mm for the other regions, pixel matrix 256 × 256, number of acquisitions 2. In total, 683 MR images were acquired for the male and 656 MR images for the female (Fig. 1b).

Fig. 1 CT and MR images of the CVH male (a) and female (b) head.

Cadaver embedding

The specimen needed to be embedded in a suitable medium so that it could be cut in the milling machine. For this, a box was made of corrosion-resistant material (inner dimensions: 450 mm × 500 mm × 1800 mm). Four plastic tubes that had been positioned longitudinally served as fiducial rods for three-dimensional (3-D) reconstruction (Fig. 2). The specimen was then transferred from the ultra freezer and placed in the box in the anatomical position, and the box was then filled with 5% gelatin solution coloured blue with food dye. The box was placed in a freezer (-30 °C) for 1 week so that the body was frozen in an ice block 450 mm wide × 500 mm long × 1800 mm deep.

Sectioning of the cadaver

To keep the ice block hard enough to keep the cutting surface slick and to avoid the ejection of small segments of tissue from the block, we constructed a low-temperature chamber (5.0 m long × 5.0 m wide × 2.2 m high) that could be maintained at or below -25 °C. The milling machine was placed in the chamber, but its electronic control system was kept outside. For sectioning, the ice block was transferred to the chamber and mounted on the cryomacrotome table of the milling machine. Sectioning of the intact block was performed using an improved TK-6350 numerical control milling machine with a milling accuracy of 0.001 mm (the numerical control system was made in Japan and the mill made in France; engineers from our team and the Hanzhong Machine Tool Factory, China, designed and implemented the necessary modifications). Slices of each body were then milled layer by layer, from head to toes, at -25 °C in the low-temperature chamber. The serial cross-sections were photographed with a Canon high-resolution digital camera and scanned into an animation computer (Fig. 2). Images from the structural dataset are shown in Figs 3-5.

The cutting process required three operators: two mill operators, one in the low-temperature chamber (appropriately clothed) and the other outside, together with a computer operator. Communication was either visual, through a glass window in the chamber wall, or via a microphone.
The process was as follows: the mill operator outside the room set the starting location of the block and recorded the x, y and z positions using the keyboard on the control table. After completing a newly cut surface, the operator in the room rotated the worktable through 90° so that the camera was in the proper position to capture the image. The surface was then cleared with compressed air and sprayed with absolute ethyl alcohol, and a ruler and a chromatogram label were then placed on the surface. The mill operator told the computer operator that preparation of the block surface was complete and optimal for photography. This was performed by the computer operator outside the low-temperature chamber, who then viewed the image (another image was taken if the first was not satisfactory), confirmed the slice number and saved the image on the hard drive of the computer. The mill operator in the chamber then rotated the worktable of the mill back to the original position while the mill operator outside the room set new x and z positions for the block for the next cycle.

Cutting of the CVH male block began on 2 March 2002 and finished on 8 August 2002. Cutting of the CVH female block began on 1 October 2002 and finished on 8 February 2003. The axial anatomical images of the CVH male were obtained at 0.5-mm intervals for the head and neck regions, 0.1-mm intervals for the skull base, and 1.0-mm intervals elsewhere. There were 2518 serial images (tagged image file format, tiff), each of 36 Mb, and the complete data files occupy 90.65 Gb. The axial anatomical images of the CVH female were obtained at 0.25-mm intervals for the head and 0.5-mm intervals for the other regions. In all, there were 3640 serial slices, with each tiff file occupying 36 Mb (3072 × 2048 pixels; the approximate pixel size was 170 µm). The complete dataset occupies 131.04 Gb.
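As a quick consistency check, the reported dataset sizes follow directly from the slice counts and the per-slice file size (a back-of-the-envelope sketch, treating the text's "Mb"/"Gb" as megabytes/gigabytes with 1 Gb = 1000 Mb):

```python
slice_mb = 36                       # reported size of one .tiff section
male_slices, female_slices = 2518, 3640

print(male_slices * slice_mb / 1000)    # 90.648  -> the reported 90.65 Gb
print(female_slices * slice_mb / 1000)  # 131.04  -> the reported 131.04 Gb
print(3072 * 2048)                      # 6291456 pixels per section image
```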
Image capture and photography

Images were captured by an external computer using a high-resolution Canon EOS-D60 digital camera (resolution 3072 × 2048 [6 291 456] pixels) protected in a wooden box that had been placed in the chamber before the temperature in the laboratory was lowered. To protect the camera, the temperature was lowered slowly over 2 weeks from 25 °C to −25 °C. Once a newly milled surface was complete, cleaned, positioned and labelled (see above), the photography light was turned on and the picture captured and transferred to the animation computer. The computer operator then viewed the image to ensure that it was of the required standard, assigned it the proper slice number and saved it on the computer's hard drive. The mill operator outside the room also recorded the x, y and z position from the screen of the controlling table of the milling machine, confirmed that the slice number was correct and made a note of any features of interest.

Three-dimensional reconstruction and visualization of the CVH dataset

After data acquisition of the CVH male and female was completed, 3-D reconstruction was achieved by surface rendering and volume rendering.

Surface rendering reconstruction

Triangular meshing of the boundary surface of the CVH male and female was achieved using the Marching Cubes algorithm (a minimal code sketch of this step follows at the end of this section). The mesh was then rendered using OpenGL (i.e. the same method that others have used for the original visible human dataset).

Volume rendering reconstruction

Volume rendering reconstruction was achieved through graphics hardware acceleration supported by OpenGL 1.4. Our algorithm consists of the following steps: (1) polygonization of the complete volume data through layer-by-layer processing, generating the corresponding image texture; (2) carrying out all essential transformations through vertex processor operations; (3) dividing polygonal slices into smaller fragments, recording the corresponding depth and texture coordinates; and (4) in fragment processing, deploying the vertex shader programming technique to enhance the rendering of fragments. According to the red–green–blue (RGB) components, the image data of the volume cross-sectional plane can be processed and displayed in both greyscale and colour mode. Using voxels as the basic modelling unit, we can render the body directly without performing any segmentation. In this way, we can visualize the whole human body with great flexibility. Figure 6 shows two cut views of the CVH male and female.

Traditionally, the size limitation of texture memory has made real-time rendering of large-volume datasets difficult, as existing hardware-accelerated volume rendering cannot render datasets exceeding the specified size limit. We have developed a programmable graphics accelerator and appropriate visualization techniques to enable real-time visualization of the CVH dataset in a 3-D virtual environment. The 3-D reconstruction of visible human slices can also be stereoscopically viewed in real time. Using our modified volume-rendering pipeline, we can interactively rotate the 3-D images around any spatial axis and/or section them in any orientation. Our visualization system, which is based on the initial transverse images, can display sagittal, coronal and arbitrarily orientated sections by 3-D reconstruction.
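The surface-rendering step just described (Marching Cubes meshing followed by OpenGL rendering) can be illustrated with a minimal modern sketch. This is not the authors' implementation: the scikit-image call, the synthetic volume and the spacing values (1.0-mm slice interval, 0.17-mm in-plane pixel size, echoing the figures quoted earlier) stand in for their pipeline.

```python
# Minimal sketch of Marching Cubes surface extraction from a CT-like volume.
# It only illustrates turning a stack of cross-sectional images into a
# triangular boundary mesh; rendering (OpenGL) is out of scope here.
import numpy as np
from skimage import measure

# Synthetic volume: a sphere of "tissue" inside an empty block, standing in
# for the real stack of grayscale slices loaded elsewhere.
z, y, x = np.mgrid[-32:32, -32:32, -32:32]
volume = (np.sqrt(x**2 + y**2 + z**2) < 20).astype(np.float32)

# Extract the triangular mesh of the iso-surface at threshold 0.5.
# `spacing` encodes physical voxel size: slice interval vs. in-plane pixels.
verts, faces, normals, values = measure.marching_cubes(
    volume, level=0.5, spacing=(1.0, 0.17, 0.17))

print(f"{len(verts)} vertices, {len(faces)} triangles")
```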
The system can also display organs separately, or as part of the whole, by defining approximate multiple oblique clipping planes to single out the organ/region of interest. Real-time stereoscopic visualization of the 512 × 512 × 512 dataset can be achieved on a PC with the following configuration: Pentium 4, 1.5 GHz, 2 Gb RAM, 1000 Gb hard disk, equipped with a display card that supports OpenGL 1.4 and 128 Mb of texture memory.

Dataset access

The CVH male and female datasets are held at http://www.chinesevisiblehuman.com. The complete dataset of the CVH male is 90.65 Gb in size and that of the CVH female 131.04 Gb, and they can be distributed via FTP or DVDs. The full datasets are available from the Third Military Medical University (TMMU) under a licence agreement (enquiries should be addressed to The Chinese Visible Human Project at the TMMU). In addition, several MPEG videos of direct records of real-time volume rendering of the Chinese Visible Human datasets can be accessed at www.cse.cuhk.edu.hk/~crc.

Results and discussion

Acquisition of the 2518 anatomical cross-sectional images for the CVH male took 6 months. The dataset also includes CT, MR and radiographic images. Axial and coronal MR images of the head and neck, and axial sections through the rest of the body, were obtained at 3.0-mm intervals and in matrices of 512 × 512 pixels (256 grey levels, approximate pixel size 440 µm). CT data consisted of axial scans through the body at 1.0-mm intervals. CT images are 512 × 512 pixels, in which each 256-bit pixel value is related to the electron density of the specimen at that point. The MR and CT axial images have been aligned with the anatomical cross-sections. All digital images of milled surfaces had 6 291 456 (3072 × 2048) pixels. The data file of each section occupies 36 Mb. The complete data files occupy 90.65 Gb.

Acquisition of the 3640 anatomical cross-sectional images for the CVH female took 4 months. The visible female dataset has the same characteristics as the male, with the following exceptions. The serial sections were sampled at 0.25-mm intervals for the head and 0.50-mm intervals for other regions. CT data consisted of axial scans through the entire body at 1.0-mm intervals. The data file of each section occupies 36 Mb. The complete data files occupy 131.04 Gb.

The Korean VKH project, initiated in March 2000, is still in progress, but the first VKH dataset has the limitation that the 65-year-old male patient had died of cerebroma, and the dataset thus includes a pathological lesion. The CVH images are at a higher resolution and are more complete than those produced by the 'Visible Human Project', in which the intervals between adjacent sections of the visible male and female are 1.0 mm and 0.33 mm, respectively, with the image for each section being 2048 × 1216 pixels. Each uncompressed data file was 7.9 Mb, with the complete male and female data files occupying 15 Gb and about 40 Gb, respectively. In addition, the VHP dataset is incomplete: according to the US VHP, the physical limitations of the cutting system required that the cadaver be cut into four segments before sectioning, causing a loss of 1.5 mm between blocks. Furthermore, because of the ejection of small segments of tissue from the block, some images of the VHP were imperfect.

Fig. 6 Sagittal and coronal sections of CVH, 3-D reconstructed. (a) Sagittal section of CVH male: 1, cerebrum; 2, cerebellum; 3, left lung; 4, heart; 5, liver; 6, spleen; 7, left kidney.
(b) Coronal section of CVH female: 1, cerebrum; 2, left lung; 3, stomach; 4, liver; 5, uterus; 6, urinary bladder.

Our methodology was designed to avoid these problems. We improved the milling machine to make its table large enough for a complete body and, by keeping the body in a frozen matrix at −25 °C, kept in place all tissues that otherwise risked being lost (the last few millimetres of the condyles of the femur, some bones of the hand, foot, teeth, the concha nasalis, the articular cartilage, the temporal lobe of the brain, the cerebellum, etc.). The CVH male and female reported here as additions to the visible human dataset are more complete, representative and accurate than those hitherto published.

Acknowledgements

This work was supported by the National Natural Science Fund of China (NSFC) for Distinguished Young Scholars (No. 39925022), the National Science Fund of China (No. 30270698) and the Research Grants Council of the Hong Kong Special Administrative Region (No. CUHK 1/00C). We thank investigators at the Virtual Reality, Visualization and Imaging Research Center and the Chinese University of Hong Kong for dataset visualization. We thank computer experts at the Department of Computer Science and Technology, Tsinghua University, for their support. We also thank our domestic colleagues for their dedication to Chinese Visible Human research, which made this project successful. We especially acknowledge those citizens who donated their bodies to medical research.

References

Ackerman MJ (1999) The Visible Human Project: a resource for education. Acad. Med. 74, 667–670.
Chung MS, Kim SY (2000) Three-dimensional image and virtual dissection program of the brain made of Korean cadaver. Yonsei Med. J. 41, 299–303.
Chung MS, Park HS (2003) Another trial for making serially sectioned images (Visible Korean Human). In International Workshop on Visible Human, pp. 2–9. Chongqing, China: Third Military Medical University.
Spitzer VM, Ackerman MJ, Scherzinger AL, Whitlock D (1996) The visible human male: a technical report. J. Am. Med. Inform. Assoc. 3, 118–130.
Spitzer VM, Whitlock DG (1998) The Visible Human dataset: the anatomical platform for human simulation. Anat. Rec. 253, 49–57.

work_c77a4gghk5b2jly5kuxb4ixufm ---- [PDF] Devil in The Digital: Ambivalent Results in an Object-Based Teaching Course | Semantic Scholar
DOI:10.1111/MUAN.12088 Corpus ID: 51960747
@article{Turin2015DevilIT, title={Devil in The Digital: Ambivalent Results in an Object-Based Teaching Course}, author={M. Turin}, journal={Museum Anthropology}, year={2015}, volume={38}, pages={123-132}}
M. Turin. Published 2015. Sociology. Museum Anthropology.
In 2013, I piloted a course in which students used Web-based tools to explore underdocumented collections of Himalayan materials at Yale University.
Through class-based research and contextualization, I set students the goal of augmenting existing metadata and designing media-rich, virtual tours of the collections that could be incorporated into the sparse catalogue holdings held within the library system. The process was experimental and had mixed results, as this article documents. The class…

work_c7o5k3xulncj5fmwptihyw54ca ---- Theadok - Theaterdokumentation
Collecting metadata of performances
Klaus Illmayer, IFTR 2018 – Belgrade
Slides are licensed – if not stated otherwise – as CC BY 4.0 (Creator: Klaus Illmayer), https://creativecommons.org/licenses/by/4.0/

What is Theadok?
● Find out at: https://theadok.at
● Aim of Theadok
○ Collecting metadata of performances
○ Current focus: performances of dance and theatre in Austria between 1945 and 2001
○ Gathering new data
● History of Theadok
○ Collecting material related to performances (esp. theatre reviews) since the establishment of the Department of Theatre Studies at Vienna University in 1943
○ In the 1970s, first attempts to create a database out of this material (and also to connect it with collections at other departments)
○ Around 2000: OpenTheadok, a first web version of Theadok (based on the CD "50 years of theatre in Austria")
○ Since 2015: re-design of data model and website

Screenshots (all taken on July 11, 2018):
Frontpage of Theadok: https://theadok.at
Frontpage of Theadok (scrolling down): https://theadok.at
Example of a search result: searching for "radovic": https://theadok.at/search_thd?search_api_fulltext=radovic
Dataset on a person found by the search: https://theadok.at/person/82719 (see identifier and references to authority files)
Dataset on a person (scrolling down): https://theadok.at/person/82719 (additional information coming from the authority file GND, http://www.dnb.de/EN/Standardisierung/GND/gnd_node.html; also showing connections to works that are present in Theadok)
Dataset on a work where the person was involved: https://theadok.at/work/33378
Dataset on a work (scrolling down): https://theadok.at/work/33378 (the person is author of the work; see also the relation to a performance of the work that is registered in Theadok)
Dataset on a performance related to the work: https://theadok.at/performance/4722 (see the fields in the group "Relations")
Dataset on a performance (scrolling down): https://theadok.at/performance/4722

Data model (simplified)

Comparable projects
There are quite a few performance-focused databases; here some lists:
● Nic Leonhardt: Digital Humanities and the Performing Arts: Building Communities, Creating Knowledge, 2014. https://f-origin.hypotheses.org/wp-content/blogs.dir/1944/files/2014/09/Nic-Leonhardt_DH-and-the-Performing-Arts_June-2014.pdf
● Vincent Baptist: Inventory of European Performing Arts Data Projects, 2017. https://public.tableau.com/profile/v.baptist#!/vizhome/InventoryofEuropeanPerformingArtsDataProjects_0/InventoryofEuropeanPerformingArtsDataProjects
● Klaus Illmayer et al.: Zotero group "Digital Humanities in Theatre, Film, and Media Studies", 2016-ongoing. https://www.zotero.org/groups/494335/digital_humanities_in_theatre_film_and_media_studies?
Collections of performance metadata have existed on paper for a long time already (a fruitful next step: get them into databases).

Why another performance database?
It is based on an older database → this implies a specific data model
Lack of tools/databases that can be shared/used easily → maybe better to have different instances and domains of databases
Paradigm change: connecting between databases is more important than using the same database
Connecting data, e.g. between Theadok and the archive at the Department of Theatre, Film, and Media Studies (tfm) at Vienna University:
● Different domains
● Theadok: performance oriented
● tfm Archive: material oriented
● Example of such a connection: Anatol ("has material in archive"), https://theadok.at/performance/15893
Example of a connection between databases: https://theadok.at/performance/15893 (look at Materials, "has material in archive")
Connection between databases: material related to a performance in the tfm archive: https://archiv-tfm.univie.ac.at/record-set/680
Connection between databases (scrolling down): https://archiv-tfm.univie.ac.at/record-set/680 (see "is associated with" in the "Concept/Thing relations" group)

What to collect in a performance database?
Different possibilities: material on performances / metadata on performances / etc. Theadok collects metadata because:
- it has done so before
- it focuses on structured data, connecting entities
- it specializes in metadata, but references other sources via links

Metadata on performances as research data
Difficulties: data quality, establishing references, connecting via identifiers; tools for (semi-)automatic connection are necessary on both sides

Performance metadata as research data
● Theatre researchers need to understand (meta)data on performances as research data
● Such data should be converted into digital structured data... and it should be shared with others
● Theadok as a platform that enables researchers to put their data into digital collections and (let others) re-use it
● Enrich this data with additional information, research results, data from other sources
● Combine theatre research methods with digital methods

Research data life cycle
● How to gather research data and how to work further on this data, e.g. collections in Theadok
● Example of a research data life cycle: the UK Data Archive life cycle model (copyright of graphic and related text: University of Essex, University of Manchester and Jisc, https://www.ukdataservice.ac.uk/manage-data/lifecycle)

FAIR data principles
● Work in progress: applying the FAIR data principles to Theadok, https://www.go-fair.org/fair-principles/
(FAIR principles graphic by SangyaPundir, CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0), via Wikimedia Commons: https://commons.wikimedia.org/wiki/File%3AFAIR_data_principles.jpg)

Ongoing effort to enable interoperability, e.g. how to connect data of IbsenStage with Theadok?
Example: search for the Austrian "Burgtheater" in IbsenStage: https://ibsenstage.hf.uio.no/search?searchwords=burgtheater&type=venues&restrictyear=&submit=Search
Enabling interoperability: search for "Burgtheater" in Theadok: https://theadok.at/search_thd?search_api_fulltext=burgtheater (the two systems hold different information; it would be interesting to combine them)
Enabling interoperability: performances of works by Ibsen at the Burgtheater (IbsenStage): https://ibsenstage.hf.uio.no/pages/venue/14147
Enabling interoperability: performances of Gespenster by Ibsen at the Burgtheater (Theadok): https://theadok.at/stage/50296 (choosing Operator "Contains" and Title "Gespenster")
Enabling interoperability: details on a performance of Ibsen's Gespenster at the Burgtheater (IbsenStage): https://ibsenstage.hf.uio.no/pages/event/88351
Enabling interoperability: details on a performance of Ibsen's Gespenster at the Burgtheater (Theadok): https://theadok.at/performance/11649 (see "same as" in the group "Relations"; note that the two records also hold different details on the performance). Linking datasets is a first step; the next step would be to share data.

Theadok as part of a digital infrastructure
How to introduce such an infrastructure for theatre research projects?
● We need to establish better communication between databases!
● Use of APIs (Application Programming Interfaces): for getting data in a structured, machine-readable format, and for doing (semi-)automatic analysis and data exchange (a minimal access sketch follows after this list)
● Use of vocabularies (see: https://vocabs.acdh.oeaw.ac.at/en/): agreements on the terms, concepts and methods used; they need not be the same, but we need to formalize similarities (see also: Jonathan Bollen: Data Models for Theatre Research: People, Places, and Performance. In: Theatre Journal 68, 2016, 615-632, DOI: https://doi.org/10.1353/tj.2016.0109)
● We need identifier services for performance data (like DOI for documents), to find and connect similar data sets (see also: Miguel Escobar Varela and Nala H. Lee: Language documentation: a reference point for theatre and performance archives? In: International Journal of Performance Arts and Digital Media, 2018, DOI: https://doi.org/10.1080/14794713.2018.145342)
● Documentation and sharing of the data model of databases is crucial. Abstractions of data models as ontologies are helpful: they can help to map data between domains, e.g. the Swiss Performing Arts Data Model, https://old.datahub.io/dataset/spa-data
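The API bullet above can be made concrete with a minimal sketch of machine-readable access. The slides only state that Theadok offers an API with JSON output; the route and field names below are assumptions for illustration (the performance id 4722 is taken from the screenshot URLs above).

```python
# Minimal sketch of machine-to-machine access to a performance database.
# The /api/... route and the JSON field names are hypothetical -- only the
# existence of a JSON API is stated in the slides.
import requests

BASE = "https://theadok.at"  # real site; the API route below is an assumption

def fetch_performance(performance_id: int) -> dict:
    """Fetch one performance record as JSON (route is illustrative)."""
    resp = requests.get(
        f"{BASE}/api/performance/{performance_id}",
        headers={"Accept": "application/json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

record = fetch_performance(4722)  # id taken from the screenshots above
print(record.get("title"), record.get("premiere_date"))  # field names assumed
```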
PARTHENOS will achieve this objective through the definition and support of common standards, the coordination of joint activities, the harmonization of policy definition and implementation, and the development of pooled services and of shared solutions to the same problems." Gives support for research communities but also needs sharing of data from research communities. http://www.parthenos-project.eu/ https://www.parthenos-project.eu/ What to do next? ● Infrastructure for an identifier service/authority file for entities that are specific to theater research ● Establish and discuss vocabularies ● Build up mappings to the databases, that allow such connections ● Recommendations for data models, technical solutions ● Best practices, especially methods and use cases that a database should be able to handle ● Support for connecting datasets between databases ● Linked open data endpoints for complex queries Folie 1 What is Theadok? Folie 3 Folie 4 Folie 5 Folie 6 Folie 7 Folie 8 Folie 9 Folie 10 Folie 11 Folie 12 Folie 13 Data model (simplified) Comparable projects Folie 16 Folie 17 Folie 18 Folie 19 Folie 20 What to collect? Performance metadata as research data Folie 23 FAIR Folie 25 Folie 26 Folie 27 Folie 28 Folie 29 Folie 30 Thedok as part of a digital Infrastructure Folie 32 Folie 33 Folie 34 Research data infrastructures How to create a network in the discipline and outside? work_cc6xdhl3jvafncpxfjdw5mjp7m ---- BACKGROUND REPORT Quaderni Digilab. Vol. 2 (ottobre 2012). DOI: 10.7357/DigiLab-26 Web semantico, linked data e studi letterari: verso una nuova convergenza Fabio Ciotti La storia finora: luci e ombre delle Digital Humanities Uno dei libri più influenti nella storia del rapporto tra informatica e scienze umane è stato senza dubbio il notissimo Hypertext di George Landow1. In quel volume lo studioso statunitense sosteneva che l'allora incipiente tecnologia dell'ipertesto avrebbe favorito una feconda convergenza tra la tradizione degli studi letterari (almeno per come questi si erano venuti evolvendo negli ultimi decenni del secolo scorso) e il dominio delle discipline computazionali e dei nuovi media digitali. Dalla pubblicazione della prima edizione di quel libro sono passati ormai venti anni, e possiamo dire che l'era dell'ipertesto è ormai alle spalle: da una parte l'introduzione e l'evoluzione del Web ha per molti versi banalizzato l'ipertesto, rendendolo una forma/tecnologia di organizzazione delle informazioni di uso 1 G. P. Landow, Hypertext: the convergence of contemporary critical theory and technology, Baltimore, Johns Hopkins University Press, 1992. F. Ciotti, Web semantico, linked data e studi letterari - 244 - comune, al contempo attenuandone molte delle caratteristiche per così dire rivoluzionarie; dall'altra ha spostato l'attenzione sulla condivisione sociale del sapere, sulla creatività diffusa e sulla cooperazione tra gli individui mediata dalle tecnologie di comunicazione digitale. Naturalmente in questo contesto di profonda trasformazione delle modalità di creazione disseminazione e conservazione della conoscenza anche i saperi umanistici, i loro attori e i loro oggetti, hanno giocato un ruolo importante. 
In this way Humanities Computing (Informatica umanistica, in the Italian formulation) has progressively freed itself of the stigma of being a niche discipline, managing at the same time to obtain a significant presence in the teaching offered by humanities faculties (this also in Italy, although delays, cultural reluctance and the crisis of the university in general have undoubtedly represented, and still represent, obstacles); to achieve important results in research; and to promote and consolidate infrastructures and organizations for scientific cooperation at national and international level, which gather and coordinate a by now very large number of scholars worldwide, organize mammoth conferences and publish authoritative monographs and journals.[2]

[2] Suffice it to recall the annual "Digital Humanities" conference, attended by hundreds of researchers; the historic journal «Literary and Linguistic Computing», more recently joined by «Digital Humanities Quarterly», «Text Technology» and «Digital Studies / Le champ numérique»; and the two weighty miscellaneous volumes published by Blackwell: S. Schreibman, R. G. Siemens, J. Unsworth, A companion to digital humanities, Malden, Mass., Blackwell Pub., 2004, and R. G. Siemens, S. Schreibman, A companion to digital literary studies, Malden, MA, Blackwell Pub., 2007.

The recent and rapid spread of the term Digital Humanities[3] marks, on the linguistic level, the success of this process of consolidation and generalization, which has also attracted the attention of the cultural desks of the mainstream press, as witnessed by the series of articles written by the New York Times journalist Patricia Cohen, in the first of which we read:

Members of a new generation of digitally savvy humanists argue it is time to stop looking for inspiration in the next political or philosophical "ism" and start exploring how technology is changing our understanding of the liberal arts. This latest frontier is about method, they say, using powerful technologies and vast stores of digitized materials that previous humanities scholars did not have.[4]

A simple survey of the most notable results achieved in this field would require far too much space to be attempted here. But in this panorama, if we observe it without the aid of a telescope, which are the peaks that stand out most prominently? What are the most important results that the now decades-long research activity in the domain of the Digital Humanities has produced? In our opinion they can be summarized in the following points:

[3] A term for which it is indeed difficult to find a satisfactory Italian translation. We therefore prefer to keep the English formulation. [4] P. Cohen, "Digital Keys for Unlocking the Humanities' Riches", New York Times, November 16, 2010. http://www.nytimes.com/2010/11/17/arts/17digital.html?_r=2&emc=eta1&.

1) The theoretical and methodological awareness, widely shared by the most advanced and propulsive centres and individual scholars of the community, that sees in the relationship with computational methodologies an epistemologically and theoretically relevant element, and not a merely instrumental factor.
2) The concept of modelling as the intellectual activity that characterizes computer-based enquiry into cultural objects and phenomena, mediating between the level of theory and that of observation. The level at which computing operates in humanities research is therefore properly that of method: a method that is relatively theory-independent but that demands of theory the quality of formal rigour, the explicitation of the theoretical entities (and of the relations between such entities) that it presupposes, and the indication of procedures for linking those entities to observational data (ultimately, to the linguistic-textual material and to the contextual documentary or factual material).

3) The provision of shared languages and standards for the modelling, representation and dissemination of high-quality digital resources, an activity tied to strong cooperation with the archival and library science community.

4) The broad digitization campaigns of primary and secondary sources in textual and/or facsimile image format, and the creation of vast on-line repositories that by now make freely available, with rather high levels of linguistic reliability, an important part of the Western textual tradition.

5) The development of important software frameworks and infrastructures for performing information retrieval, textual analysis and on-line publication of such textual resources, generally freely available as open source products or web services.

Against these results, all far-reaching but located on the plane of theoretical and methodological foundations and of general research infrastructures, stands the objective limitedness of specific results (a few worthy counterexamples aside) within the individual disciplinary fields. In short, the Digital Humanities movement has produced a remarkable mass of digital resources and tools and has acquired a high theoretical self-awareness (and more than a little institutional recognition), but it has rarely managed to step outside the circle of its own practitioners and to establish a scientific relationship with the mainstream of the humanities research community; as J. Unsworth has written:

We need (we still need) to demonstrate the usefulness of all the stuff we have digitized over the last decade and more – and usefulness not just in the form of increased access, but specifically, in what we can do with the stuff once we get it: what new questions we could ask, what old ones we could answer.[5]

More recently, narrowing the horizon of reflection to literary studies, Willard McCarty too has highlighted the question of the critical relevance of literary computing:

[…] literary computing is confined to providing evidence for or against what we already know or suspect. It is strongly inhibited in its capacity to surprise. Providing evidence seems justification enough, but evidence becomes increasingly problematic as the volume of data exceeds the norm for critical practices formed prior to the exponential growth of online resources. As this volume increases, so does the probability of arbitrary choice, and so the ease with which any statement may be connected to any other. Good critics may do better scholarship by finding more of what they need; bad critics may be swiftly becoming worse ones more easily. The point, however, is that literary computing has thereby served only as mutely obedient handmaiden, and so done nothing much to rescue itself from its position of weakness, from which it can hardly deliver the benefits claimed for it by the faithful. It has done little to educate scholars methodologically.[6]

[5] J. Unsworth, "Tool-Time, or 'Haven't We Been Here Already?': Ten Years in Humanities Computing", presented at Transforming Disciplines: The Humanities and Computer Science, Washington, D.C., January 18, 2003. http://www.iath.virginia.edu/~jmu2m/carnegie-ninch.03.html.
What are the reasons for this on the whole unsatisfactory picture? Is it possible to identify where humanities computing in general, and literary computing in particular, have failed to hit the mark? Certainly, the last decades have been characterized by cultural vogues too distant from formal rigour and from the idea of the text as a linguistic object: Theory, the one without adjectives, as Culler would put it,[7] does not easily lend itself to interacting with the painstaking formalism of data structures and computer languages (only to then happily wed the "deconstructed" hypertext). Not to mention the vast and multiform arena of cultural studies, which often concerns itself with everything but the text (and yet many of these studies would benefit considerably from the contribution of some of the recent innovative tendencies that have emerged within the Digital Humanities, and which we will discuss further on).

[6] W. McCarty, "Literary enquiry and experimental method: What has happened? What might?". In L. Dibattista, Storia della Scienza e Linguistica Computazionale: Sconfinamenti Possibili, Milano, Franco Angeli, 2009, pp. 40–41. http://www.mccarty.org.uk/essays/McCarty,%20Literary%20enquiry%20and%20experimental%20method.pdf. [7] J. D. Culler, Teoria della letteratura: una breve introduzione, Roma, Armando, 1999.
Cioè, nonostante si sia consapevoli del problema teorico, la predisposizione degli strumenti di rappresentazione e analisi concreti ha finora fatto assai poco i conti con le specificità e la complessità degli oggetti e delle procedure di analisi tipiche della ricerca letteraria. Questo è dovuto anche e soprattutto al fatto che, nonostante le ripetute affermazioni teoriche, assai sporadico e di nicchia è stato l’investimento degli stessi cultori delle Digital Humanities nella definizione di nuovi modelli e linguaggi per la rappresentazione ed elaborazione formale dei complessi oggetti culturali cui si applicano. Più comunemente si sono ereditati e applicati modelli e linguaggi elaborati dall’informatica per finalità e domini diversi. F. Ciotti, Web semantico, linked data e studi letterari - 250 - Paradigmatico il caso del linguaggio XML. Esso ha assunto un ruolo centrale nella costruzione di linguaggi standard per la rappresentazione di dati e metadati, divenendo una sorta di esperanto digitale; in virtù della sua flessibilità, robustezza e delle sue caratteristiche sintattiche è stato ampiamente adottato per la rappresentazione dei dati in ambito umanistico. Il problema è che XML da una parte impone l'adozione di un modello di dati ad albero che non sempre si adatta alla natura strutturale degli oggetti da rappresentare, dall'altra non è in grado di rappresentare adeguatamente i numerosi e complessi livelli semantici che caratterizzano un testo letterario. Anzi, in generale possiamo dire che XML non fornisce alcuna semantica ai dati in modo computazionalmente trattabile. Il comune fraintendimento per cui si parla di "markup semantico" deriva dal fatto che i marcatori sono leggibili e che, di norma, il vocabolario dei linguaggi XML usa termini delle lingue naturali. Ma la semantica "naturale" di tale vocabolario è del tutto inaccessibile a un elaboratore XML8. 8 F. Ciotti, "La rappresentazione digitale del testo: il paradigma del markup e i suoi sviluppi". In L. Perilli, D. Fiormonte (a cura di), La macchina nel tempo : studi di informatica umanistica in onore di Tito Orlandi, Firenze, Le Lettere, 2011 Dall'Informatica umanistica alle culture digitali - 251 - Nuove frontiere per le Digital Humanities Se questo è il quadro, quali sono le prospettive che si possono aprire per lo sviluppo dell’informatica umanistica e di quella letteraria in particolare? Senza dubbio, consolidare e se possibile estendere i risultati acquisiti è una missione validissima e anzi irrinunciabile. Gli archivi testuali vanno preservati ed implementati, la trascrizioni ed edizioni digitali basate sui formalismi attualmente disponibili moltiplicate, gli standard mantenuti, applicati e diffusi. Ma è giunto il momento ormai di individuare nuove linee di ricerca, di esplorare le tendenze innovative che potrebbero fornire un ulteriore salto di qualità e una più ampia giustificazione scientifica (ma anche istituzionale) all’incontro tra informatica e studi umanistici. Difficile dire a priori quali direzioni saranno le più proficue: il tempo lo dirà. Tra i numerosi campi di indagine aperti, ne segnaleremmo almeno due: 1) Big Data: lo sviluppo e l'applicazione di strumenti per l’analisi automatica delle ingenti masse di risorse testuali/documentali e di dati disponibili sulla rete e non solo, attraverso la sperimentazione di metodologie e tecnologie di text mining e knowledge extraction. 
2) Web 3.0: la sperimentazione dei nuovi linguaggi e modelli di dati per la rappresentazione dei livelli semantici nelle risorse informative, delle tecnologie e delle architetture che vanno rubricate sotto le etichette di Web Semantico e Linked Data, ovviamente adattandole alle specificità degli oggetti culturali. F. Ciotti, Web semantico, linked data e studi letterari - 252 - Per quanto riguarda la prima linea di ricerca, la cui analisi esula dagli obiettivi di questo lavoro, ci limitiamo a ricordare che si tratta di un ambito di ricerca complesso ma promettente che consiste nella applicazione di tecniche e strategie di data mining, ovvero di metodi computazionali, su base statistico/probabilistica, per la ricerca di regolarità e schemi ricorrenti impliciti e non osservabili a priori all'interno di grandi moli di dati strutturati e non strutturati: Data mining is the process of discovering meaningful new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathe- matical techniques9. La ricerca di tali pattern e regolarità si basa su complessi algoritmi probabilistici, i più noti dei quali sono fondati sull’analisi probabilistica bayesiana, che studia la probabilità di eventi non quantificabili a priori (ad esempio la probabilità che in un insieme di soggetti prevalga una certa aspettativa, o che un testo sia categorizzabile in base a un dato tema prevalente). Quando questi algoritmi sono applicati a dati testuali si parla più specificamente di text mining. In questa direzione si sono indirizzati alcuni importanti progetti di ricerca nell'ambito delle Digital Humanities, tra cui ricordiamo un importante progetto internazionale diretto da John Unsworth, il Monk Project10, e le ricerche condotte presso lo Stanford 9 Gartner Group, "Data Mining Definition | Gartner", 2012, http://www.gartner.com/ it-glossary/data-mining. 10 J. Unsworth, M. Mueller, The MONK Project Final Report, Settembre 2, 2009. http://monkproject.org/MONKProjectFinalReport.pdf. http://www.gartner.com/it-glossary/data-mining http://www.gartner.com/it-glossary/data-mining http://monkproject.org/MONKProjectFinalReport.pdf Dall'Informatica umanistica alle culture digitali - 253 - Literary Lab diretto da Franco Moretti11. Lo stesso Moretti ha teorizzato come queste tecnologie di ricerca possano essere il fondamento di un vero e proprio nuovo metodo di studio dei fenomeni letterari, che ha definito distant reading (giocando sulla opposizione con il close reading introdotto nella critica letteraria dal New criticism)12. Il Web Semantico, le ontologie e i Linked Data In questa sede intendiamo piuttosto approfondire il discorso sulle potenzialità delle tecnologie del Web Semantico (WS) nelle Digital Humanities in generale e negli studi letterari in particolare. Il termine e la visione a cui esso allude sono state proposti da Tim Berners-Lee, l'inventore del Web, nel 199813. L’idea consiste nell’associare alle risorse informative sul Web una descrizione formalizzata del loro significato intensionale mediante la sovrap- 11 R. Heuser, L. Le-Khac, Stanford Literary Lab, A Quantitative literary history of 2,958 nineteenth-century British novels : the Semantic cohort method, 2012. http://litlab.stanford. edu/LiteraryLabPamphlet4.pdf. 12 Si vedano: F. Moretti, Graphs, maps, trees : abstract models for a literary history, London; New York, Verso, 2005; M. G. 
Kirschenbaum, "The remaking of reading: Data mining and the digital humanities". In The National Science Foundation Symposium on Next Generation of Data Mining and Cyber-Enabled Discovery for Innovation, Baltimore, MD, 2007. http://www.cs.umbc.edu/~hillol/NGDM07/abstracts/talks/MKirschenbaum. pdf 13 T. Berners-Lee, J. Hendler, O. Lassila, "The Semantic Web". In Scientific American, vol. 284, fasc. 5, maggio 2001, pp. 34–43.; G. Antoniou, F. Van Harmelen, A semantic Web primer, Cambridge, Mass., MIT Press, 2008; E. Della Valle, I. Celino, D. Cerizza, Semantic web : modellare e condividere per innovare, Milano, Pearson Addison Wesley, 2008. http://litlab.stanford.edu/LiteraryLabPamphlet4.pdf http://litlab.stanford.edu/LiteraryLabPamphlet4.pdf http://www.cs.umbc.edu/~hillol/NGDM07/abstracts/talks/MKirschenbaum.pdf http://www.cs.umbc.edu/~hillol/NGDM07/abstracts/talks/MKirschenbaum.pdf F. Ciotti, Web semantico, linked data e studi letterari - 254 - posizione di uno o più livelli di metadati semantici. Tali insiemi di metadati semantici sono espressi in formalismi che si collocano nella famiglia dei sistemi di rappresentazione della conoscenza a suo tempo sviluppati nell'ambito dell'intelligenza artificiale, e dunque possono essere elaborati automaticamente a diversi livelli di complessità: si va dalla semplice visualizzazione o consultazione per scorrimento di indici strutturati; alla interrogazione e ricerca supportata da motori inferenziali; alla classificazione e collegamento automatico; fino alla derivazione di nuove conoscenze implicite mediante inferenza logica e alla valutazione di attendibilità mediante il computo logico di asserti di fiducia. Figura 1. Dall'Informatica umanistica alle culture digitali - 255 - L'architettura generale del Web Semantico è comunemente raffigurata mediante un diagramma a pila che ne specifica le componenti a diversi livelli di astrazione, a partire dagli oggetti informativi a cui si applica, tecnicamente denominate risorse14. Il significato di questo termine è assai ampio: una risorsa informativa può essere un oggetto informativo accessibile sul Web – dal singolo documento, a sue parti, a collezioni di documenti – un oggetto reale o un oggetto astratto. Il primo problema che si pone è quello della 'identificazione' delle risorse in modo non ambiguo e indipendente dall'universo del discorso. Le URI (Uniform Resource Identifiers) sono i formalismi che svolgono tale ruolo: degli identificativi univoci e persistenti che permettono che una data risorsa possa essere menzionata e individuata nello spazio informativo del Web15. Se una risorsa è identificata in modo univoco è possibile esprimere su di essa asserti che ne descrivono il contenuto sotto un qualche rispetto, esprimono ciò che un utente pensa su tale contenuto, ne specificano proprietà e relazioni. Questi asserti sono i metadati semantici. Affinché i metadati semantici siano utilizzabili dai 14 In effetti al momento sono stati definiti modelli architetture e linguaggi solo fino al livello delle ontologie e dei sistemi a regole. Gli strati più 'alti' del WS sono ancora avvolti da una aura quasi mistica, e molti esperti nutrono forti dubbi che potranno mai essere tradotti in qualcosa di funzionante, almeno sulla scala totalizzante prevista dal disegno di Berners-Lee. Ci sono ad esempi dei seri problemi formali e matematici che impediscono di applicare algoritmi di dimostrazione automatica a questo livello. 
15 La forma più comune di URI son gli indirizzi delle pagine Web (URL) per cui esiste un consolidato protocollo di dereferenziazione, ma non sono le uniche (e peraltro la loro funzione è spuria in quanto svolgono sia il ruolo di identificatori sia quello di localizzatori). F. Ciotti, Web semantico, linked data e studi letterari - 256 - computer, è necessario che vengano espressi in un linguaggio che sia computazionalmente trattabile. È questo il fine del Resource Description Framework (RDF) sviluppato presso World Wide Web Consortium170F16. RDF è un metalinguaggio dichiarativo per formalizzare asserti che esprimono proprietà e relazioni tra risorse, il cui modello di dati è basato su tre elementi: • Risorse. • Proprietà. • Asserti. Le risorse come abbiamo visto sono tutto ciò che può essere soggetto di descrizione: pagine Web, documenti, persone, istituzioni, concetti. Le proprietà sono coppie attributo-valore associate alla risorsa. Ogni proprietà ha un significato specifico, una serie di valori leciti ed è associabile a uno o più tipi di risorsa. Proprietà e valori possono essere espressi da URI (e dunque da altre risorse) o da valori letterali (valori diretti). Gli asserti (statement) sono la struttura predicativa che esprime l’associazione di una proprietà a una risorsa. Ogni asserto ha una struttura soggetto – predicato – oggetto. Un asserto specifica una 16 Il Web Consortium (W3C, http://www.w3c.org) è una organizzazione no profit che promuove e coordina lo sviluppo delle tecnologie di base per il Web in modo indipendente dai singoli attori privati a esso interessati, rilasciando standard e linee guida in regime aperto. Le specifiche di RDF sono in Resource Description Framework (RDF): Concepts and Abstract Syntax, K. G., Carroll J. (Ed.), W3C Recommendation, 10 February 2004. http://www.w3.org/TR/rdf-concepts, che definisce il modello dei dati e la sintassi per esprimere asserti RDF. http://www.w3c.org/ http://www.w3.org/TR/rdf-concepts Dall'Informatica umanistica alle culture digitali - 257 - relazione predicativa tra soggetto e oggetto (in RDF sono consentite solo relazioni binarie). Gli asserti sono anche noti come triple e gli insiemi di asserti si possono rappresentare come grafi etichettati orientati aciclici, come si vede nella figura seguente. Figura 2. RDF in quanto tale non fornisce un vocabolario predefinito e a priori di proprietà e di relazioni sotto cui sussumere e organizzare le risorse. Si tratta di un modello di dati semplice e rigoroso per specificare proprietà di risorse, qualsivoglia esse siano. In un contesto ampio ed eterogeneo come il Web possono esistere numerosi schemi e vocabolari semantici, basati su diverse concettua- lizzazioni di particolari domini, su diverse terminologie e lingue. In linea generale si può assumere che esistano anche concettua- lizzazioni mutuamente contraddittorie e/o mutevoli nel tempo. Al fine di rendere utilizzabili queste concettualizzazioni in modo F. Ciotti, Web semantico, linked data e studi letterari - 258 - computazionale (almeno in parte) è necessario conseguire un ulteriore livello di formalizzazione: quello delle ontologie formali. La definizione classica di questo concetto è stata fornita da Gruber17: "An ontology is an explicit specification of a conceptualization". 
Il termine ontologia, ereditato dalla metafisica classica dove, sin dalla sistemazione aristotelica, denotava la teoria dell'essere e delle sue categorie, è oggi adottato a denotare una ampia e diversificata classe di oggetti che vanno dai vocabolari controllati, ai thesauri fino alle ontologie formali vere e proprie. Queste, oltre a fissare una terminologia strutturata per gli enti di un dato dominio, ne fissano anche la semantica condivisa da una data comunità, in termini logico-formali: In the context of computer and information sciences, an ontology defines a set of representational primitives with which to model a domain of knowledge or discourse. The representational primitives are typically classes (or sets), attributes (or properties), and relationships (or relations among class members). The definitions of the representational primitives include information about their meaning and constraints on their logically consistent application18. Esistono numerosi linguaggi formali per specificare ontologie formali. A un primo e più semplice livello di complessità e capacità espressiva si pone RDF Schema (RDFS), che permette di definire formalmente: 17 T. R. Gruber, "A translation approach to portable ontology specifications". In Knowledge Acquisition, vol. 5, fasc. 2, 1993, p. 199. 18 T. R. Gruber, "Ontology", Encyclopedia of Database Systems, Springer-Verlag, 2009. Dall'Informatica umanistica alle culture digitali - 259 - • Classi (o tipi) di risorse • Classi di proprietà • Relazioni tra classi di risorse e proprietà (Ex.: classe -> sottoclasse) • Domini e range di proprietà I vincoli espressi da RDFS, tuttavia, non sono sufficienti per esprimere interamente i vincoli ontologici necessari agli obiettivi del WS. Occorre un sistema per specificare le relazioni logico-semantiche (equivalenza, specificazione, generalizzazione, istanziazione, cardi- nalità, simmetria etc.) tra oggetti e proprietà di un medesimo schema e di schemi diversi. Ad esempio, la relazione di "autorialità" potrebbe essere indicata dalla proprietà "essere autore" dove l'autore sta in funzione di soggetto e il cui oggetto è un dato documento. In uno schema differente, al contrario, potremmo avere che il soggetto è il documento di cui si predica la proprietà "essere scritto da" che ha come oggetto un esponente della classe degli autori. Evidentemente si sta parlando dello stesso insieme di individui e relazioni (dominio), ma in modo simmetrico. Nel contesto del WS il linguaggio deputato a conseguire questo secondo livello di formalizzazione è Web Ontology Language (OWL) . OWL, ora giunto alla versione 2.0, può essere espresso in diverse notazioni equivalenti e ha due possibili interpretazioni semantiche: una (OWL 2 DL) basata sulla semantica modellistica di una variante F. Ciotti, Web semantico, linked data e studi letterari - 260 - di description logic completa e computabile19; e una più potente (OWL 2 Full) basata su un generalizzazione del modello a grafo di RDF/S, ma non decidibile. Nelle logiche descrittive la modellizzazione ontologica è distinta in due parti: la cosiddetta Tbox (terminological box) descrive le classi e i concetti generali, specifica le loro relazioni e ne definisce proprietà; la Abox (assertion box) contiene gli asserti fattuali, che elencano gli individui del dominio, ne individuano i ruoli e indicano a quali classi e concetti definiti nella Tbox essi appartengano. 
Questa struttura semplifica la modellazione concettuale di domini fortemente popolati, poiché gli individui possono essere descritti con una relativa autonomia dalla formalizzazione concettuale di alto livello. Le ontologie formali basate su description logic hanno anche il vantaggio di potere essere utilizzate da sistemi di ragionamento automatico abbastanza efficienti (quali il motore inferenziale Racer o il più antico e consolidato linguaggio Prolog), mediante i quali si possono eseguire numerosi processi di elabora- zione e gestione delle basi di conoscenza: Logic reasoning is one possible application for ontologies. It is probably helpful (i) to check consistency during ontology development, (ii) to enable semi-automatic merging of (domain) ontologies as well as (iii) to deduce hidden information contained in the ontology. These three tasks can be applied to all elements of ontologies, classes as well as instances 19 F. Baader, The description logic handbook : theory, implementation, and applications, Cambridge, UK; New York, Cambridge University Press, 2003. Dall'Informatica umanistica alle culture digitali - 261 - [...] logic reasoning can fulfill different purposes in the phase of creating an ontology and in the phase of using it [...], e.g. investigating the structure of categories/ concepts, or testing if every object is used in the intended and not contradictory way. In different situations of a work process, logic reasoning can be used to avoid or to solve problems: If several persons build together an ontology, new included elements can be checked for inconsistency or redundant information can be detected20. Il progetto del WS (o Web 3.0, formulazione adottata dopo e per certi versi in alternativa al diffondersi della moda culturale/ tecnologica del Web 2.0) nella sua generalità richiede numerose e rilevanti innovazioni dal punto di vista tecnico, da quello delle competenze richieste e soprattutto da quello dei comportamenti sociali e culturali degli utenti del Web. Molti esperti e studiosi nutrono forti dubbi sul fatto che tale progetto nella sua versione più ambiziosa e universale potrà mai realizzarsi. Come accennato esistono numerosi e validi ostacoli tecnici e teorici: inconsistenza tra ontologie; incompletezza dei sistemi deduttivi per le versioni più espressive di OWL e RDFs; complessità computazionale degli algoritmi inferenziali applicati a un numero di asserti poten- zialmente enorme; criticità della Assunzione di Mondo Aperto, secondo la quale è falso solo ciò che si può dimostrare esplicitamente tale, alla base delle logiche descrittive. Ma forse più rilevanti sono i dubbi circa la sua reale necessità, almeno per gli scopi e gli obiettivi 20 A. Zöllner-Weber, "Ontologies and Logic Reasoning as Tools in Humanities?". In Digital Humanities Quarterly, vol. 3, fasc. 4, 2009. http://www.digitalhumanities.org/ dhq/vol/3/4/000068/000068.html. http://www.digitalhumanities.org/dhq/vol/3/4/000068/000068.html http://www.digitalhumanities.org/dhq/vol/3/4/000068/000068.html F. Ciotti, Web semantico, linked data e studi letterari - 262 - per cui è stato ideato e progettato, che sarebbero assai meglio conseguiti dalle tecnologie e dai sistemi di cooperazione decentrata e sociale introdotti dal cosiddetto Web 2.0: si pensi ad esempio al meccanismo delle folksonomie (contrapposte alle tassonomie e ai thesauri) e al social filtering21. Diverso il discorso relativo all’applicazione di tecnologie del WS a domini specifici e in contesti controllati e locali. 
In questi contesti vengono meno molte delle problematiche tecniche e vengono valorizzate le capacità di organizzazione delle conoscenze da parte di esperti pur potendo usufruire di strumenti assai più flessibili e dinamici rispetto ai tradizionali strumenti di controllo semantico dell'informazione. In questa direzione si muovono anche le recenti sperimentazioni che vanno sotto l'etichetta di Linked Data22. Con questo termine ci si riferisce a un insieme di soluzioni per la pubblicazione e l'interconnessione di dati strutturati sul Web mediante tecnologie del WS. L'idea è stata introdotta ancora da Tim Berners-Lee al fine dare concretezza al progetto del WS rendendo disponibile nei suoi formalismi e protocolli la sterminata quantità di 21 C. Shirky, Ontology is Overrated: Categories, Links, and Tags, 2005. http://www.shirky.com/writings/ontology_overrated.html; K. H. Veltman, "Towards a Semantic Web for Culture". In Journal of Digital Information, vol. 4, fasc. 4, 2004. http://jodi.tamu.edu/Articles/v04/i04/Veltman. 22 T Berners-Lee, "Linked Data - Design Issues", 2006. http://www.w3.org/ DesignIssues/LinkedData.html; C. Bizer, T. Heath, T. Berners-Lee, "Linked Data - The Story So Far", International Journal on Semantic Web and Information Systems, vol. 5, fasc. 3, 33 2009, pp. 1–22.; T. Heath, C. Bizer, "Linked Data: Evolving the Web into a Global Data Space", Synthesis Lectures on the Semantic Web: Theory and Technology, vol. 1, fasc. 1, febbraio 2011, pp. 1–136. http://www.shirky.com/writings/ontology_overrated.html http://jodi.tamu.edu/Articles/v04/i04/Veltman http://www.w3.org/DesignIssues/LinkedData.html http://www.w3.org/DesignIssues/LinkedData.html Dall'Informatica umanistica alle culture digitali - 263 - dati contenuti nei database dei sistemi informativi già ora presenti sul Web. Questo deve avvenire seguendo alcuni semplici principi di base: The term Linked Data refers to a set of best practices for publishing and interlinking structured data on the Web. These best practices were introduced by Tim Berners-Lee in his Web architecture note Linked Data and have become known as the Linked Data principles. These principles are the following: 1. Use URIs as names for things. 2. Use HTTP URIs, so that people can look up those names. 3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL). 4. Include links to other URIs, so that they can discover more things. The basic idea of Linked Data is to apply the general architecture of the World Wide Web to the task of sharing structured data on global scale. In order to understand these Linked Data principles, it is important to understand the architecture of the classic document Web23. La creazione di Linked Data, insomma, costituisce un percorso graduale dall'attuale Web come rete di documenti il cui contenuto di conoscenza è prodotto interamente dall'interpretazione umana a un Web come rete di dati che veicolano in modo formalizzato frammenti di semantica processabili da un elaboratore. I Linked Data 23 T. Heath, C. Bizer, op.cit. F. Ciotti, Web semantico, linked data e studi letterari - 264 - forniscono inoltre una soluzione al problema della identificazione condivisa dei concetti (intesi nel senso più generale possibile), grazie all'adozione condivisa di URI. 
An exemplary and well-known application of this architecture is GeoNames (www.geonames.org), an ontology (the term is used here in the broad sense of a formalized vocabulary) that describes roughly 10 million toponyms, organized by geographic/territorial attributes and relations. Another very well-known Linked Data repository is DBpedia (www.dbpedia.org), which constitutes a formalized reformulation of part of the content of the well-known collaborative encyclopedia Wikipedia.24

24 C. Bizer et al., "DBpedia - A crystallization point for the Web of Data", Web Semantics: Science, Services and Agents on the World Wide Web, vol. 7, no. 3, September 2009, pp. 154–165.

Literary studies and semantic technologies: towards the Literary Semantic Web

Given the theoretical and technical context outlined above, what convergence can there be between Semantic Web technologies and the humanities, and literary studies in particular? Does it make sense to speak of a Literary Semantic Web? A positive answer to this question is implicitly provided by the already numerous experiments and research projects under way, or in preparation, in this area. We will not dwell here on an analytical description of each of them. We intend rather to provide a large-scale view, tracing tendencies for future explorations and experiments.

A first point of convergence consists in the creation and publication of Linked Data sets in the humanities and literary field. At a very general level this activity could obviously draw on interdisciplinary convergences even within the humanities themselves. Consider the possibility of preparing terminological ontologies (in the style of GeoNames) for:
• places and geographic spaces of different epochs and cultures;
• historical persons and events;
• authors and works;
• imaginary and functional places and geographic spaces;
• fictional entities and characters;
• literary themes and motifs;
• iconological themes and motifs;
• rhetorical figures;
• literary and artistic genres and styles.

This list could be extended at will, but it seems evident that the creation of these repertoires in the form of Linked Data, organized by an underlying level of formal ontologies (themselves capable of interconnecting with other general ontologies such as the CIDOC CRM - http://www.cidoc-crm.org), would make it possible to establish relations, interconnections and mash-ups between concepts and notions of different domains; their exploration by means of automated reasoning and deduction systems would make it possible to identify and study systematically phenomena that are otherwise invisible or merely intuitable. Moreover, the very construction of an ontology of, say, literary themes and motifs would require an effort of conceptual analysis that would in itself make an important contribution to thematic criticism. Nevertheless, preparing repertoires, even if structured in the guise of ontologies and Linked Data, would amount to only a minor revolution compared with the availability of the corresponding paper repertoires.25 Just as building repositories and digital libraries of entire literary traditions in XML/TEI encoding has so far not led to great advances in critical-literary knowledge.
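Such repositories can already be queried programmatically over the Web. The following sketch, assuming the SPARQLWrapper library and relying on DBpedia's public endpoint and its documented dbo:author property, asks for works attributed to Dante:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# DBpedia's public SPARQL endpoint.
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)

# Retrieve up to ten works whose author is Dante Alighieri.
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?work WHERE { ?work dbo:author dbr:Dante_Alighieri . } LIMIT 10
""")

for binding in sparql.query().convert()["results"]["bindings"]:
    print(binding["work"]["value"])
```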
A genuine "gestalt shift" can instead be obtained by connecting ontologies/Linked Data sets with digital collections of texts. Ontologies can in fact provide a computable semantics for digital documents. This integration would open the way to decidedly innovative large-scale investigations. Think of the possibility of cross-referencing, exhaustively and within a given textual corpus, the interactions between particular metrical forms and figures or tropes; or of analyzing systematically how a theme manifests itself and migrates across texts of various epochs; or again of studying the intertextual evolution of a character and its relation to genres and themes.

The realization of such a macro-architecture is obviously very complex technically and just as costly in terms of time and resources (and yet, if twenty years ago someone had said that today we would have at our disposal entire literary traditions in digital format, suitably encoded in TEI, would the same skepticism not have been expressed?).

As for the technical problems, one solution consists in adopting a strategy of multilayer semantic annotation of digital documents based on the stand-off markup paradigm, instead of the traditional and notoriously problematic inline markup approach. Segmenting the text in order to encode the linguistic-textual fragments that realize the multiple semantic layers would inevitably run into the problem of the "overlapping hierarchies" that characterizes XML.26 In stand-off markup the markers are partly or wholly external to the linear sequence of characters, eliminating at the root the question of syntactic overlaps. Naturally, the problem arises of formally expressing the link between the external semantic metadata (normally RDF triples or OWL assertions) and the passages of text to which they apply, while preserving the portability and the (theoretical) human readability of the resulting set of digital documents. An elegant solution entirely based on standard W3C technologies has been proposed by Di Iorio and Vitali in the EARMARK ontological formalism:27

25 Consider works such as: R. Ceserani, M. Domenichelli, P. Fasano, Dizionario dei temi letterari, Torino, UTET, 2007; A. Ferrari, Dizionario dei luoghi letterari immaginari, Torino, UTET, 2007.
26 On this we refer to F. Ciotti, op. cit. and the bibliography therein.
27 A. Di Iorio, S. Peroni, F. Vitali, "A Semantic Web approach to everyday overlapping markup". In Journal of the American Society for Information Science and Technology, vol. 62, no. 9, 2011, pp. 1696–1716; A. Di Iorio, S. Peroni, F. Vitali, "Handling Markup Overlaps Using OWL". In P. Cimiano, H. Pinto (eds.), Knowledge Engineering and Management by the Masses. «Lecture Notes in Computer Science», vol. 6317, Springer Berlin Heidelberg, 2010, pp. 391–400. http://dx.doi.org/10.1007/978-3-642-16438-5_29.

The basic idea is to model EARMARK documents as collections of addressable text fragments, and to associate such text content with OWL assertions that describe structural features as well as semantic properties of (parts of) that content. As a result EARMARK allows not only documents with single hierarchies (as with XML) but also multiple overlapping hierarchies where the textual content within the markup items belongs to some hierarchies but not to others. Moreover EARMARK makes it possible to add semantic annotations to the content through assertions that may overlap with existing ones.28
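To give a concrete flavor of the stand-off idea, here is a deliberately simplified toy model of our own (not the actual EARMARK vocabulary): the base text remains a plain character sequence, and two overlapping annotations are expressed as external RDF statements about character ranges:

```python
from rdflib import BNode, Graph, Literal, Namespace

# Hypothetical annotation vocabulary; EARMARK defines a richer one of its own.
ANN = Namespace("http://example.org/annotation#")

text = "In the middle of the journey of our life"
g = Graph()

def annotate(graph, label, start, end):
    """Attach a typed annotation to a character range of the base text."""
    node = BNode()
    graph.add((node, ANN.type, Literal(label)))
    graph.add((node, ANN.start, Literal(start)))
    graph.add((node, ANN.end, Literal(end)))
    return node

# Two ranges that overlap (7-26 and 21-40): impossible to nest as XML
# elements, unproblematic as external, stand-off statements.
annotate(g, "metrical-unit", 7, 26)
annotate(g, "theme:journey", 21, 40)

print(g.serialize(format="turtle"))
```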
Text fragments are identified as ranges of characters or by means of XPath and XPointer pointers. The three authors have also developed several tools for applying the formal model, which can in any case be used with any query and reasoning system compatible with OWL and SPARQL.29

28 A. Di Iorio, S. Peroni, F. Vitali, op. cit., p. 1704.
29 SPARQL (SPARQL Protocol and RDF Query Language) is a query language, developed by the W3C, for semantic databases expressed in RDF.

A further application of Semantic Web technologies to the study of textual and literary phenomena consists in the ontological modeling of narrative structures. As is well known, this is one of the few sectors of the sciences of the text that was already subject in the past, independently of computing, to numerous attempts at more or less complete formalization, ever since the origins of narratology in the 1960s. The models elaborated at the time by Bremond, by Todorov and then by Greimas30 sought to provide a genuine grammar and formal semantics of narration, on the model of the generative grammars of language proposed by Chomsky. Subsequently, various authors took up and extended the first formalizations, inserting them into the more general context of text linguistics (we recall in particular J. Petöfi, T. van Dijk, W. Dressler, R. de Beaugrande31) or of the studies on narrative possible worlds and on the logical theory of action, together with their applications in artificial intelligence.32 A special mention, given the context, must be made of the work of Giuseppe Gigliozzi, who in the 1980s and 1990s experimented with and realized several computational models of narrative phenomena, such as the creation of story grammars for fairy tales and novellas and the formal description of characters and narrative roles, also building several programs based on the Lisp language, such as SEB and SEBNET.33 This rich tradition represents a solid theoretical basis onto which the more recent and more flexible (as well as better supported in terms of software implementation) Semantic Web technologies can be grafted.

30 C. Brémond, Logica del racconto, Milano, Bompiani, 1977; A. J. Greimas, La semantica strutturale: ricerca di metodo, Milano, Rizzoli Editore, 1968.
31 T. A. van Dijk, Text and Context, 1977; W. U. Dressler, Current Trends in Textlinguistics, Berlin; New York, W. de Gruyter, 1978.
32 M.-L. Ryan, Possible Worlds, Artificial Intelligence, and Narrative Theory, Bloomington, Indiana University Press, 1991; J. C. Meister, Computing Action: A Narratological Approach, Berlin, Walter de Gruyter, 2003.
33 See G. Gigliozzi, Studi di codifica e trattamento automatico di testi, Roma, Bulzoni, 1987; G. Gigliozzi, Saggi di informatica umanistica, Milano, UNICOPLI, 2008.
Moving in this direction is the OWL ontology for the formal description of literary characters proposed by Zöllner-Weber; worth recalling in this area is also a research project recently launched at the Language Technology Lab of the DFKI (Deutsches Forschungszentrum für Künstliche Intelligenz), whose objective is the identification and recognition of the most relevant characters in folktales through the combined use of ontologies and natural language processing systems. Obviously, such ontologies of intratextual aspects and phenomena can be integrated with the intertextual ones discussed above, and in turn connected directly with the texts, so as to form a Literary Semantic Web that opens new perspectives for the advancement of theoretical and critical knowledge of the literary heritage.

One last but not marginal aspect remains to be considered: who, and with what resources of money and time, could proceed to the construction of this visionary Literary Semantic Web of ours? The answer is: the entire community of literary studies. The history and evolution of the Web has demonstrated not only that it is possible to build systems, even of enormous complexity, through an incremental and cooperative public process, but also that such a strategy proves far more efficient and effective than private, monolithic and centralized ones. A great deal of the work of building the macro-architecture we propose here could be carried out using systems of so-called guided crowdsourcing, provided that the appropriate enabling infrastructures exist. The model of social tagging, suitably corrected by ontology-based systems that orient and control its application, would make it possible to involve expert scholars as well as young researchers and enthusiasts in building and populating the ontologies. Such an intellectual and technological effort could only be conducted in this way. And the product of such an enterprise can only be a common good, an open content available to the whole community of studies.

References

Antoniou, Grigoris, Frank van Harmelen, A Semantic Web Primer, Cambridge, Mass., MIT Press, 2008.
Baader, Franz, The Description Logic Handbook: Theory, Implementation, and Applications, Cambridge, UK; New York, Cambridge University Press, 2003.
Barnard, David T., Lou Burnard, Jean-Pierre Gaspart, Lynne A. Price, Michael Sperberg-McQueen, Giovanni Battista Varile, "Hierarchical Encoding of Text: Technical Problems and SGML Solutions". In Computers and the Humanities, vol. 29, no. 3, 1995, pp. 211–231.
Berners-Lee, Tim, "Linked Data - Design Issues", 2006. http://www.w3.org/DesignIssues/LinkedData.html.
Berners-Lee, Tim, James Hendler, Ora Lassila, "The Semantic Web". In Scientific American, vol. 284, no. 5, May 2001, pp. 34–43.
Bizer, Christian, Tom Heath, Tim Berners-Lee, "Linked Data - The Story So Far". In International Journal on Semantic Web and Information Systems, vol. 5, no. 3, 2009, pp. 1–22.
Bizer, Christian, Jens Lehmann, Georgi Kobilarov, Sören Auer, Christian Becker, Richard Cyganiak, Sebastian Hellmann, "DBpedia - A crystallization point for the Web of Data". In Web Semantics: Science, Services and Agents on the World Wide Web, vol. 7, no. 3, September 2009, pp. 154–165.
Brémond, Claude, Logica del racconto, Milano, Bompiani, 1977.
Buzzetti, Dino, "Digital Representation and the Text Model". In New Literary History, vol. 33, no. 1, n.d., pp. 61–88.
Ceserani, Remo, Mario Domenichelli, Pino Fasano, Dizionario dei temi letterari, Torino, UTET, 2007.
Ciotti, Fabio, "La rappresentazione digitale del testo: il paradigma del markup e i suoi sviluppi", in Lorenzo Perilli, Domenico Fiormonte (eds.), La macchina nel tempo: studi di informatica umanistica in onore di Tito Orlandi, Firenze, Le Lettere, 2011.
Cohen, Patricia, "Digital Keys for Unlocking the Humanities' Riches", New York Times, November 16, 2010. http://www.nytimes.com/2010/11/17/arts/17digital.html?_r=2&emc=eta1&.
Culler, Jonathan D., Francesco Muzzioli, Gian Paolo Castelli, Teoria della letteratura: una breve introduzione, Roma, Armando, 1999.
Della Valle, Emanuele, Irene Celino, Dario Cerizza, Semantic Web: modellare e condividere per innovare, Milano, Pearson Addison Wesley, 2008.
Dijk, Teun A. van, Text and Context, 1977.
Dressler, Wolfgang U., Current Trends in Textlinguistics, Berlin; New York, W. de Gruyter, 1978.
Ferrari, Anna, Dizionario dei luoghi letterari immaginari, Torino, UTET, 2007.
Gartner Group, "Data Mining Definition | Gartner", 2012. http://www.gartner.com/it-glossary/data-mining.
Gigliozzi, Giuseppe, Saggi di informatica umanistica, Milano, UNICOPLI, 2008.
———, Studi di codifica e trattamento automatico di testi, Roma, Bulzoni, 1987.
Greimas, Algirdas Julien, La semantica strutturale: ricerca di metodo, Milano, Rizzoli Editore, 1968.
Gruber, Thomas R., "A translation approach to portable ontology specifications", Knowledge Acquisition, vol. 5, no. 2, 1993, pp. 199–220.
———, "Ontology", Encyclopedia of Database Systems, Springer-Verlag, 2009.
Heath, Tom, Christian Bizer, "Linked Data: Evolving the Web into a Global Data Space". In Synthesis Lectures on the Semantic Web: Theory and Technology, vol. 1, no. 1, February 2011, pp. 1–136.
Heuser, Ryan, Long Le-Khac, Stanford Literary Lab, A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method, 2012. http://litlab.stanford.edu/LiteraryLabPamphlet4.pdf.
Di Iorio, Angelo, Silvio Peroni, Fabio Vitali, "A Semantic Web approach to everyday overlapping markup". In Journal of the American Society for Information Science and Technology, vol. 62, no. 9, 2011, pp. 1696–1716.
———, "Handling Markup Overlaps Using OWL". In P. Cimiano, H. Pinto (eds.), Knowledge Engineering and Management by the Masses. «Lecture Notes in Computer Science», vol. 6317, Springer Berlin Heidelberg, 2010, pp. 391–400. http://dx.doi.org/10.1007/978-3-642-16438-5_29.
Kirschenbaum, M. G., "The remaking of reading: Data mining and the digital humanities". In The National Science Foundation Symposium on Next Generation of Data Mining and Cyber-Enabled Discovery for Innovation, Baltimore, MD, 2007.
http://www.cs.umbc.edu/~hillol/NGDM07/abstracts/talks/MKirschenbaum.pdf.
Koleva, Nikolina, Thierry Declerck, Hans-Ulrich Krieger, "An Ontology-Based Iterative Text Processing Strategy for Detecting and Recognizing Characters in Folktales", in Jan Christoph Meister (ed.), Digital Humanities 2012 Conference Abstracts, 467–470, Hamburg, Hamburg University Press, 2012.
Landow, George P., Hypertext: The Convergence of Contemporary Critical Theory and Technology, Baltimore, Johns Hopkins University Press, 1992.
McCarty, Willard, Humanities Computing, Basingstoke [England]; New York, Palgrave Macmillan, 2005.
———, "Literary enquiry and experimental method: What has happened? What might?", in Liborio Dibattista (ed.), Storia della Scienza e Linguistica Computazionale: Sconfinamenti Possibili, 32–54, Milano, Franco Angeli, 2009. http://www.mccarty.org.uk/essays/McCarty,%20Literary%20enquiry%20and%20experimental%20method.pdf.
McGann, Jerome, "Marking Texts of Many Dimensions", in Susan Schreibman, Ray Siemens, John Unsworth (eds.), A Companion to Digital Humanities, 198–217, Oxford, Blackwell, 2004. http://www.digitalhumanities.org/companion.
Meister, Jan Christoph, Computing Action: A Narratological Approach, Berlin, Walter de Gruyter, 2003.
Moretti, Franco, Graphs, Maps, Trees: Abstract Models for a Literary History, London; New York, Verso, 2005.
Ryan, Marie-Laure, Possible Worlds, Artificial Intelligence, and Narrative Theory, Bloomington, Indiana University Press, 1991.
Schreibman, Susan, Raymond George Siemens, John Unsworth, A Companion to Digital Humanities, Malden, Mass., Blackwell Pub., 2004.
Shirky, Clay, Ontology is Overrated: Categories, Links, and Tags, 2005.

work_cckcc7ya6vg4hgd2g6fej3lr6y ----

Education 2017, 7(1): 5-9
DOI: 10.5923/j.edu.20170701.02

Reading Science: Digital Humanities and General Chemistry

Jennifer M. Vance
Natural Sciences, LaGuardia Community College, Long Island City, United States
Corresponding author: JVance@lagcc.cuny.edu (Jennifer M. Vance)

Abstract Scientific papers often present challenges to undergraduate readers. This paper reports on research to explore whether Voyant, a digital humanities text analysis tool, might help students become more proficient and independent readers of scientific articles. Students taking Honors General Chemistry 2 were introduced to Voyant. For the study, they read, analyzed, and summarized a scientific paper without the use of Voyant to establish a baseline measure of their skills. They then read, analyzed, and summarized a second scientific paper with the aid of Voyant, and a third one without Voyant again. For the first article, the students earned an average of 7.6 points out of 10. For the second article, they gained a point, reaching an average of 8.7. For the third article, students maintained the gain with an average of 8.6 points.
In addition, thematic coding of answers to open-ended survey questions posed after the second article confirmed reports by eleven out of fourteen students that Voyant had helped them; however, for the third article, only four missed the assistance of Voyant. In conclusion, Voyant was found to be a helpful temporary aid for reading scientific papers.

Keywords: Voyant, Digital Humanities, General Chemistry, Scientific Papers, Undergraduate

1. Introduction

Scientific articles present a gateway to fascinating STEM (science, technology, engineering, and mathematics) fields and allow students to gain current information about research. An emphasis on encouraging students to engage in research outside the classroom during their undergraduate education has been reported as a path to greater student persistence and retention [1]. In addition, researchers report that students who do such research have greater success in graduate school than their less experienced classmates [2]. By reading scientific articles, students engage with the background of their future fields and current projects. In addition, in the classroom, students frequently need to read some scientific articles when writing their research papers. However, reading scientific literature can be daunting to an undergraduate student, because there is usually a gap in reading level between the classroom textbook and scientific journal articles [3]. In addition, extensive scientific background and vocabulary are referenced and assumed. Finally, there is a level of uncertainty in reading current research that results from not understanding the entire article, because scientific articles frequently report on complex techniques and equipment [3]. In order to assess the grade levels of the articles that I asked my students to read for this study, I employed the Flesch-Kincaid Readability Grade Level test through Microsoft Word. This method considers the average sentence length and the average number of syllables per word in a calculation [4]; the standard grade-level formula is 0.39 × (average words per sentence) + 11.8 × (average syllables per word) − 15.59, which yields a U.S. school grade level.

In an attempt to make the process of reading scientific articles faster for students, I decided to apply Voyant, a tool of the digital humanities, toward the reading of scientific articles in the classroom. In 2013–2014, I had been a participant in the professional development seminar initiated by Provost Paul Arcario, the Provost's Learning Space, which that year focused on the digital humanities. Voyant software can be used for any text that is in digital form and, therefore, can be used across the disciplines. In the fall of 2014, I introduced the tool to my classes with the goal of promoting transferrable skills such as finding the main idea, defining vocabulary, and being comfortable with possible uncertainty. My students had a very positive response to the use of Voyant. The purpose of this article is to determine whether Voyant, a free online digital humanities tool, can serve as a sort of "training wheels" to spur students into becoming effective and independent readers of scientific articles.

2. Literature Review

2.1. Reading Scientific Articles in the Science Classroom

Science educators have reported including scientific journal articles in the curriculum for a variety of reasons: guiding students in summarizing; teaching scientific writing
and enhanced problem solving; and increasing the interest level of the class. Several papers have been written about using scientific journal articles to teach writing [5-8]. Some papers offer help in reading and summarizing journal articles [9-11]. For instance, students taking a third-year Introduction to Chemical Research course at Appalachian State University in Boone, North Carolina were given excerpts from scientific articles and asked to pick a key sentence that summarized each paragraph. They then created a PowerPoint slide with a key sentence as the title. The supporting sentences were used to write bullet points. Students were surveyed and they said that this technique helped them in "finding keywords and concepts, understanding the author's point, and determining how to organize and evaluate information for a presentation" [9]. This is a creative approach to reading papers in science, although the students were not given an entire paper and the papers were chosen so that the students did not have to deal with technical jargon [9]. Another type of summarizing method was introduced in the literature as KENSHU, the Japanese word for "research understanding" [10]. This method was adapted from a top Japanese national university and involved translation of articles, summarizing, and presenting. The students worked in pairs on science articles with an experimental procedure [10]. Alternatively, students in an Analytical Chemistry class were given prescreened articles and were asked questions about them. The author specifically chose analytical science papers with experimental data. The students reported that these papers helped them with exams and gave them more exposure to scientific literature [11]. Lastly, some articles report the process and benefits of incorporating journal reading into the curriculum to increase interest in the course [12, 13].

2.2. Reading in Other Disciplines' Classrooms

Summarization itself is a reading strategy for increasing comprehension of texts [14]. Friend presents this strategy as having "four defining features: (a) it is short, (b) it tells what is most important to the author, (c) it is written 'in your own words,' and (d) it states the information 'you need to study'" [15]. Spörer, Brunstein, and Kieschke (2008) taught readers four strategies for increased comprehension: "summarizing, questioning, clarifying, and predicting" [16]. They also reflected on the positive effects of asking students to teach each other. McNamara (2009) expands on these strategies to include: "1) comprehension monitoring, 2) paraphrasing, 3) elaboration, 4) logic or common sense, 5) predictions, and bridging [inference]" [17]. Finally, Liu, Chen, and Chang (2010) reported the use of computer-assisted concept mapping as a technique for increasing reading comprehension with English as a Foreign Language (EFL) students [18].

3. Voyant Software

This paper differs from the literature reviewed above in that it reports on the use of a computer program that generates in minutes a word analysis of an assigned article for students to refer to while reading the article. Voyant software, available free online, analyzes the scientific article or articles and generates a word cloud, a word frequency list, a graph of frequent words, and a presentation of keywords in sentences. Students can quickly see themes and difficult words in context.
For students who speak English as a second language, seeing the words in context can be particularly helpful. Using Voyant in this way has not been reported in the literature, but it has been used to analyze medical survey responses [19]. Voyant, which is found at http://www.voyant-tools.org, is a text analysis tool used in the digital humanities. The digital humanities is a new and thriving field which looks for patterns in texts by way of what is called "distant reading." Literary scholar Franco Moretti's view of distant reading is described as "understanding literature not by studying particular texts, but by aggregating and analyzing massive amounts of data" [20]. Voyant is a distant reading tool. There is some controversy in the Humanities with regard to this type of study of large amounts of data made available by the digitization of vast numbers of books [21, 22]. Since participants in this study also had to read the paper, the controversy is avoided. An example of work done with distant reading is Ana Mitric's (2007) essay on "Jane Austen and Civility: A Distant Reading" [23]. In addition to reading scientific papers for research outside the classroom, students must read scientific papers as part of the general chemistry curriculum because they need to use journal articles to write their own research papers. As professional scientists, students will need to read scientific papers for a living. The present study explores whether the Voyant tool will help students become more proficient with reading and summarizing scientific papers.

4. Methods

Voyant analyzes an article cut and pasted from a PDF or HTML document, generating a word cloud, a word frequency list, the printed article, a graph of word frequencies, and the words in their context sentences. The word cloud simply displays words in sizes that represent their relative frequencies within the text of the article. The graph of the word frequencies provides a picture of where the chosen words appear in the article. Finally, the words in their context sentences allow students to see how important words are used in a sentence in the article. These features potentially help students interpret the major themes more quickly based on word frequency. In a study by Dooling and Lachman, they found that students who received the theme before reading a passage had better recall and comprehension of the material [24]. Students can also look up difficult words and see how they are used in various sentences within the article, in order to gain context for the words. However, the program does not change the language of the sentences to make it easier to interpret. In order for the program to be most useful, it is very important to click on the gear-shaped icon to filter out repetitive words such as "the," "a," and "and." Click on the box for stopwords in English and on the box to apply a stopword list globally. I booked a computer classroom for my students when I introduced Voyant and made sure that all the students were able to get the Voyant analysis to work. In my experience with General Chemistry I and II students at LaGuardia Community College, I have found that there is a gap between reading the textbook and diving into the literature. For this exploration, fourteen students in the Honors General Chemistry II course in spring 2016 read an article without Voyant, wrote a summary, and answered some survey questions. Next, the students read an article with Voyant, wrote a summary, and answered survey questions.
Finally, students read another paper without Voyant, wrote a summary, and answered survey questions. The articles were checked in Microsoft Word for grade level to make sure that they were comparable; the three articles had a Flesch-Kincaid readability grade level of 13.3, 13.4, and 13.5 respectively. Students received a rubric of expectations for each article summary assignment. The surveys were analyzed with thematic coding, that is, searching for common themes in the survey responses.

5. Results and Discussion

5.1. First Article

The first article, summary, and survey were designed to get a baseline estimate of the students' abilities in summarizing articles. The first article was titled "Use of Human Urine Fertilizer in Cultivation of Cabbage (Brassica oleracea): Impacts on Chemical, Microbial, and Flavor Quality" [25]. This article had a reading level of grade 13.3. Of all the articles, it was probably the easiest because it had fewer unfamiliar scientific terms than the other two articles. I chose an article about cabbage because the other articles are related to cabbage. In particular, red cabbage contains anthocyanins, which are natural dyes that we discussed throughout the course in our research projects. In a survey after the first article summary assignment, I asked the students about their process of crafting the summary. I asked them if creating the summary was difficult, and why or why not. My Honors students achieved a fairly high baseline score of 7.6 points out of 10 for the first summary. Six of the students reported using highlighting as a technique for drawing out the main ideas. Two read the paper and used the Internet to help them with difficult terms. Three mentioned outlining the article. As for the question of difficulty, nine students said the article was not very difficult. One student commented, "It was not that difficult. The article was really interesting to me and so that allowed me to engage it well. Overall thought it was a good fair article." Five students said that the article was difficult. One student compared it to SAT questions: "Yes, because the article was almost like the passages that are offered in the English section of the SATs and those long passages requires a lot of analysis in order to decipher it into one's own words and understanding. Especially since this article felt more longer." One student used an interesting term—"filtered out"—to describe his process of summarizing. He reported, "It wasn't that very difficult. There was a lot of technical details and the important parts had to be filtered out."

5.2. Second Article

For the second article, which they read with Voyant, the students achieved an average of 8.7 out of 10, which reflected a gain of one point over their average score of 7.6 for the summaries they had written without Voyant. The second article was titled "Anthocyanins Contents, Profiles, and Color Characteristics of Red Cabbage Extracts from Different Cultivars and Maturity Stages" [26]. This article had a reading level of 13.4. In their work with the second article, eight students improved, three students stayed the same, one student did worse, and two students did not hand in the second summary. The students were asked about their process of crafting the summary, whether the process was difficult, whether Voyant had helped in any way and, if yes, in what ways. Eleven students reported that Voyant had helped them write the summary.
In general, students suggested that they could find the keywords and focus of the article more quickly: "Voyant helped me get to details faster and easier." The majority of the students found Voyant helpful for the second article, but four students felt that it had not helped them. Some of them preferred their highlighting method over using the software. Some of the students misunderstood and thought I was asking them to use Voyant as a substitute for reading the article: "I did not like not being able to physically read the article. What usually helps me is reading and manually highlighting an article, while also being able to write and scribble notes in the margins. Voyant did help in finding sections quicker but I would not use it alone." None of the students reported that they could write the summary without reading the article in detail. Voyant was not viewed as an effective substitute for reading the article.

5.3. Third Article

Finally, for their summaries of the third article, read without Voyant, the students achieved an average of 8.6 points out of 10. Students gained a point with the use of Voyant, and kept that gain without Voyant for the third article. The third article was titled "Influence of Steviol Glycosides on the Stability of Vitamin C and Anthocyanins" [27]. This article had a grade level of 13.5. For the third article, three students improved, four stayed the same, five did worse, and two did not hand in the summary. The most extensive number of improving students was seen after the second article, but this result could have been due partially to the students becoming more comfortable with the assignment. Since this was an Honors class, the students were relatively strong readers to start with, having averaged a baseline 7.6 out of 10. Some of them had techniques for reading articles that they already felt comfortable with. Regarding the third article, students were asked if they missed Voyant, and four said yes, and eight said no. It was interesting that many of the same students who said that Voyant helped after the second article were convinced they did not need it for the third article. One student said, "No, I did not [miss Voyant]. Although it may have been helpful, I can do just as good without it." One student thought there were too many keywords to sift through: "Voyant was not [used] during crafting the summary because there were too many keywords and it was necessary to read the whole text and understand." Some students did not want to bother with Voyant, if it meant they still had to read the whole article. One student used Voyant for the third article despite my instructions, and said, "Yes, I used Voyant because it gave clear idea of terms mostly used and also separates the main points." Although there was not the same jump in improvement and actually five students did worse with the third article, the students maintained nearly the same average as the second article. Based on these results, we can conclude that Voyant helped some students with their summaries but was not necessary for the third article. Students made gains with Voyant and kept their gains without Voyant for the third article; by then, the majority felt comfortable without the aid of Voyant. I think that the major benefit of Voyant is that it saves time by distilling the article into keywords and placing those keywords into their context sentences.
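For readers curious about what that distillation involves computationally, the following minimal Python sketch (our illustration, not Voyant's actual implementation) performs the two operations just described: counting word frequencies after removing common stopwords, and printing keyword-in-context snippets:

```python
import re
from collections import Counter

# A tiny stopword list for illustration; Voyant applies a much larger one.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "that", "with"}

def keywords(text, n=5):
    """Return the n most frequent non-stopword tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)

def contexts(text, keyword, window=30):
    """Yield keyword-in-context snippets around each occurrence."""
    for match in re.finditer(re.escape(keyword), text, re.IGNORECASE):
        start = max(match.start() - window, 0)
        yield text[start:match.end() + window]

article = "Anthocyanins are natural dyes. Red cabbage extracts contain anthocyanins."
for word, count in keywords(article):
    print(word, count)
    for snippet in contexts(article, word):
        print("   ...", snippet, "...")
```

Even this toy version suggests why students reported getting "to details faster": frequency plus context gives a first map of an article before close reading begins.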
Some students who are less experienced readers might not have the persistence to wade through the article to distill those keywords on their own. Less experienced readers might see greater gains than my Honors students. This study also revealed that some students had methods, such as highlighting the article, that they felt more comfortable with and preferred.

6. Conclusions

This paper explores whether utilizing Voyant can help students become more independent and proficient scientific readers. Using Voyant to read scientific papers was evaluated by compiling point totals for summaries and analyzing answers to survey questions with thematic coding. A majority of students said that Voyant was helpful for reading the second article, but a majority of students also said they did not need Voyant for the third article. In reading and summarizing the third article, students retained the gains made in reading the first and second articles. Students who are weaker readers might see greater gains than my Honors students. Whether this is so is an important question that I want to explore in future research. In conclusion, student reports found Voyant to be a helpful temporary aid for summarizing research papers.

ACKNOWLEDGEMENTS

Many thanks to Paul Arcario and Richard Dragan for organizing and presenting the digital humanities-themed Provost's Learning Space in 2013–2014.

REFERENCES

[1] Graham, M. J., Frederick, J., Byars-Winston, A., Hunter, A. B., Handelsman, J., 2013, Increasing persistence of college students in STEM., Science, 341(6153), 1455–56.
[2] Gilmore, J., Vieyra, M., Timmerman, B., Feldon, D., Maher, M., 2015, The relationship between undergraduate research participation and subsequent research performance of early career STEM graduate students., Journal of Higher Education, 86(6), 834–63.
[3] Mallow, J. V., 1991, Reading science., Journal of Reading, 34(5), 324–38.
[4] Test Your Document's Readability on Microsoft. [Online]. Available: https://support.office.com/en-us/article/test-your-document-s-readability-85b4969e-e80a-4777-8dd3-f7fc3c8b3fd2.
[5] Paulson, D. R., 2001, Writing for chemists: satisfying the CSU upper-division writing requirement., Journal of Chemical Education, 78(8), 1047–49.
[6] Tilstra, L., 2001, Using journal articles to teach writing skills for laboratory reports in general chemistry., Journal of Chemical Education, 78(6), 762–64.
[7] Carlisle, E. F., and Kinsinger, J. B., 1977, Scientific writing: a humanistic and scientific course for science undergraduates., Journal of Chemical Education, 54(10), 632–34.
[8] Whelan, R. J., and Zare, R. N., 2003, Teaching effective communication in a writing-intensive analytical chemistry course., Journal of Chemical Education, 80(8), 904–6.
[9] Bennett, N. S., and Taubman, B. F., 2013, Reading journal articles for comprehension using key sentences: an exercise for the novice research student., Journal of Chemical Education, 90(6), 741–44.
[10] Drake, B. D., Acosta, G. M., Smith, R. L., 1997, An effective technique for reading research articles: the Japanese KENSHU Method., Journal of Chemical Education, 74(2), 186–88.
[11] Roecker, L., 2007, Introducing students to the scientific literature: an integrative exercise in quantitative analysis., Journal of Chemical Education, 84(8), 1380–84.
[12] Floutz, V. W., 1936, An advanced course in general chemistry based on scientific journals., Journal of Chemical Education, 13(8), 374–75.
[13] Duncan, B.
L., 1973, A literature program in general chemistry., Journal of Chemical Education, 50(11), 735.
[14] Thiede, K. W., and Anderson, M. C. M., 2003, Summarizing can improve metacomprehension accuracy., Contemporary Educational Psychology, 28(2), 129–60.
[15] Friend, R., 2000/2001, Teaching summarization as a content area reading strategy., Journal of Adolescent & Adult Literacy, 44(14), 320–29.
[16] Spörer, N., Brunstein, J. C., Kieschke, U., 2008, Improving students' reading comprehension skills: effects of strategy instruction and reciprocal teaching., Learning and Instruction, 19(3), 272–86.
[17] McNamara, D. S., 2009, The importance of teaching reading strategies., Perspectives on Language and Literacy, 35(2), 34–40.
[18] Liu, P. L., Chen, C. J., Chang, Y. J., 2010, Effects of a computer-assisted concept mapping learning strategy on EFL college students' English reading comprehension., Computers and Education, 54(2), 436–45.
[19] Maramba, I. D., Davey, A., Elliott, M. N., Roberts, M., Roland, M., Brown, F., Burt, J., Boiko, O., Campbell, J., 2015, Web-based textual analysis of free-text patient experience comments from a survey in primary care., JMIR Medical Informatics, 3(2), 1–12.
[20] Schulz, K., 2011, "What is distant reading?" Mechanic Muse, New York Times Sunday Book Review, 24 June. http://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse-what-is-distant-reading.html.
[21] Gooding, P., Terras, M., Warwick, C., 2013, The myth of the new: mass digitization, distant reading, and the future of the book., Literary and Linguistic Computing, 28(4), 629–39.
[22] Serlen, R., 2010, The distant future? Reading Franco Moretti., Literature Compass, 7(3), 214–25.
[23] Mitric, A., 2007, Jane Austen and civility: a distant reading., Persuasions: The Jane Austen Journal, 29, 194–207.
[24] Dooling, J. D., Lachman, R., 1971, Effects of comprehension on retention of prose., Journal of Experimental Psychology, 88(2), 216–222.
[25] Pradhan, S. K., Nerg, A. M., Sjöblom, A., Holopainen, J. K., Heinonen-Tanski, H., 2007, Use of human urine fertilizer in cultivation of cabbage (brassica oleracea): impacts on chemical, microbial, and flavor quality., Journal of Agricultural and Food Chemistry, 55(21), 8657–63.
[26] Ahmadiani, N., Robbins, R. J., Collins, T. M., Giusti, M. M., 2014, Anthocyanins contents, profiles, and color characteristics of red cabbage extracts from different cultivars and maturity stages., Journal of Agricultural and Food Chemistry, 62(30), 7524–31.
[27] Woźniak, K., Marszalek, K., Skąpska, S., 2014, Influence of steviol glycosides on the stability of vitamin C and anthocyanins., Journal of Agricultural and Food Chemistry, 62(46), 11264–69.

work_cdtp5ymcijchdod7rq66hueqvq ----

Prototyping Across the Disciplines

Research

How to Cite: El Khatib, Randa, et al. 2019. "Prototyping Across the Disciplines." Digital Studies/Le champ numérique 8(1): 10, pp. 1–20. DOI: https://doi.org/10.16995/dscn.282

Published: 03 January 2019

Peer Review: This is a peer-reviewed article in Digital Studies/Le champ numérique, a journal published by the Open Library of Humanities.

Copyright: © 2019 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.
Open Access: Digital Studies/Le champ numérique is a peer-reviewed open access journal.

Digital Preservation: The Open Library of Humanities and all its journals are digitally preserved in the CLOCKSS scholarly archive service.

RESEARCH

Prototyping Across the Disciplines

Randa El Khatib (1), David Joseph Wrisley (2), Shady Elbassuoni (3), Mohamad Jaber (3) and Julia El Zini (3)
(1) University of Victoria, CA; (2) New York University Abu Dhabi, AE; (3) American University of Beirut, LB
Corresponding author: David Joseph Wrisley (djw12@nyu.edu)

This article pursues the idea that within interdisciplinary teams in which researchers might find themselves participating, there are very different notions of research outcomes, as well as languages in which they are expressed. We explore the notion of the software prototype within the discussion of making and building in digital humanities. The backdrop for our discussion is a collaboration between project team members from computer science and literature that resulted in a tool named TopoText that was built to geocode locations within an unstructured text and to perform some basic Natural Language Processing (NLP) tasks about the context of those locations. In the interest of collaborating more effectively with increasingly larger and more multidisciplinary research communities, we move outward from that specific collaboration to explore one of the ways that such research is characterized in the domain of software engineering—the ISO/IEC 25010:2011 standard. Although not a perfect fit with discourses of value in the humanities, it provides a possible starting point for forging shared vocabularies within the research collaboratory. In particular, we focus on a subset of characteristics outlined by the standard and attempt to translate them into terms generative of further discussion in the digital humanities community.

Keywords: software prototyping; interdisciplinary collaboration; standards; geocoding; spatial humanities; shared research vocabularies

In various global contexts, researchers are coming together to imagine the environments that can host, sustain, and facilitate new forms of academic research and inquiry. Indeed, research infrastructures in the academy have evolved to include new spaces of exchange, data management, and computation. At their most virtual, such spaces for digital humanists include software, middleware, cloud computing, labs, makerspaces and the like. Indeed, we have entered an era of what has been called "Software Intensive Humanities (SIH)," where digital humanists not only use packaged software bundled in their institutional infrastructures, but they also embark on innovative tool creation as a form of generative, critical practice (Smithies 2017). In this article, we explore the idea of proof of concept software prototyping, stemming from a collaboration between researchers in the humanities and computer science, and we examine the issue of the value of collaboration across the disciplines.
We have been attempting to model a process that could very well be popularized in coming years, even embedded within basic computational infrastructures for humanists the way that platforms such as Voyant Tools have democratized text analytics. That process is creating a map from a text.

Humanities and Computer Science Collaborations: Towards a Product or Prototype?

Digital humanities are a deeply social endeavor, one in which project results are shaped by the various actors involved, as well as by the mutual value drawn by them from the process. Those projects are not always carried out in the context of a shared lab. Cross-disciplinary collaborations also take place virtually, instead of within the confines of a single room, department, university, or even region. Collaboration, it can be argued, is a form of a third place, drawing on the theoretical interests and practical expertise of different kinds of disciplinary actors, taking place in and between their traditional working spaces, and importantly, growing out of the development of shared vocabularies for collaboration and an appreciation of the stakes of the research for others in our team (Bracken and Oughton 2006). Our experience stems from a collaboration initiated informally between a small group of researchers and students in departments of English and Computer Science, rather than within a formal research collaboratory, and at a moment when digital humanities had limited purchase within the home institution.

Whereas interdisciplinarity is an easily promoted ideal, building structures across the disciplines for successful, and sustainable, collaboration is more challenging to achieve (Bos, Zimmerman, Olson, et al. 2007). Collaboration is known to be difficult for a number of reasons: focus in some disciplines on individual work over research teams, lack of planning and project management skills, lack of infrastructure to facilitate the collaboration, new forms of accountability or communication required to carry projects to fruition, in addition to basic disciplinary difference (Siemens and Siemens 2012). In this article, we turn to another important challenge of collaboration unmentioned above: the ways we characterize the prototype resulting from interdisciplinary collaboration, or in oversimplified terms, the "finished product." Since the work of such software prototyping is iterative, experimental, and without a clear end in sight, the humanists on our team came to appreciate the process as passing through multiple stages of somewhat finished prototypes. A new version or prototype might improve performance or user experience compared with previous versions, but, in turn, eclipse another part of its previous functionality. We propose to examine in this article how the computational task of text mapping, that is, modelling and operationalizing a relationship between geographic entities and features of language, can be framed within a mutually beneficial language for a collaborative team. We do this by turning to some documentation from beyond the humanities—some might say far beyond the humanities—that, if reframed and generalized, might provide a starting point for forging common goals and vocabulary for interdisciplinary teams. This relies, however, on unpacking, and refining, the notion of the prototype for the specific case of software development within digital humanities.
Software Prototypes: Materializing Contemplative Knowledge

The word prototype appears in Renaissance English from a Latinized Greek word meaning a "first form," or a "primitive pattern." A software prototype, according to A Dictionary of Computer Science, can be defined as a "preliminary version of a software system in order to allow certain aspects of that system to be investigated … additionally (or alternatively) a prototype can be used to investigate particular problem areas or certain implications of alternative design or implementation decisions" (n.p.). Prototyping after the digital turn can also be seen as an assemblage of various modes of intellectual work: "theoria (or contemplation), poiesis (or making), and praxis (or practice/action)" (Saklofske 2016, n.p.). According to Saklofske, contemplation and action are related through the process of making, which can be seen as a materialization of contemplative knowledge both in, and through, engaged activity. Research on building in digital humanities has framed prototyping as an intertwined process of making and thinking, embodied together in the prototype "product;" examples of functional software prototypes, it has been argued, are a contribution to knowledge in themselves in as much as they suggest innovative methodologies (Galey and Ruecker 2010; Ruecker and Rockwell 2010). We assert that a prototype is best understood in a similar light, as intertwined thinking and making, a process of modeling embodied in step-wise software versions (El Khatib forthcoming). In this sense, in Saklofske's terms, the prototype, which embodies the process and product, serves both as an argument and theory. In our case, what kinds of thinking across the disciplines led to our prototype?

Data creation is central to many of our research projects in digital humanities, and it is common knowledge how it can be very time-consuming and expensive. One common research task at the intersection of textual and spatial analysis consists in extracting geographical information from unstructured text and visualizing such data on map interfaces. It is a rather time-consuming task, which has led researchers to want to automate the process. Another system that models the text mapping process is the "Edinburgh Geoparser" (Edinburgh Language Technology Group 2017 https://www.ltg.ed.ac.uk/software/geoparser). Practitioners in the spatial humanities have recourse to a growing body of code and critical literature, in addition to infrastructure in the form of gazetteers—digital lists of places against which entities extracted from texts can be matched. Convinced that the immediate linguistic context of geographic entities found in texts is illustrative of the ways that place is constructed by literature, the authors of this paper set out to operationalize this hypothesis by prototyping software named TopoText to carry out the task. Creating a software prototype involves different skill sets in code, interface design, implementation, and testing; in short, it is a social process. The various iterations of TopoText, from basic conceptualization to implementation, involved different groups of students and faculty, and this meant that we confronted many disciplinary assumptions that went unmentioned.
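As a rough illustration of the kind of pipeline at stake, the following hedged sketch of the general text-mapping task uses the spaCy and geopy libraries (it is not the authors' actual TopoText code): place names are extracted with named entity recognition, resolved against a gazetteer service, and kept together with their surrounding words:

```python
import spacy
from geopy.geocoders import Nominatim

# Assumes the small English model has been installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
geolocator = Nominatim(user_agent="text-mapping-sketch")

text = "From Beirut the travelers continued along the coast toward Byblos."
doc = nlp(text)

for ent in doc.ents:
    # GPE/LOC labels cover countries, cities, and other geographic entities.
    if ent.label_ in ("GPE", "LOC"):
        place = geolocator.geocode(ent.text)  # gazetteer lookup
        # Keep the immediate linguistic context around the place name.
        window = doc[max(ent.start - 5, 0):ent.end + 5].text
        if place is not None:
            print(ent.text, (place.latitude, place.longitude), "|", window)
```

The coordinates could then be written out for display on a map interface; the interpretive questions raised in the following sections begin where this mechanical pipeline ends.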
Furthermore, software prototypes, particularly of the kind one finds in digital humanities, are acts of open scholarship. They are placed in code sharing and versioning environments for others to use, adapt, and refine. Of course, working within a digital humanities lab or on funded projects, one solution to software or coding needs is to contract developers to carry out discrete tasks. If digital humanities enter the research collaboratory, however, where they are face to face with others in computer science (or other disciplines), different dynamics come into play, in which divergent notions of both process and product emerge. Our experience made us very aware of the fact that within the single academic unit of computer science, we also find different forms of reflection and action that map onto the abovementioned axis of theoria, poiesis, and praxis. In other words, "there are many different computings" (McCarty 2005, 158). McCarty qualifies the domain of software as "a locus of confounding" precisely because, he argues, "the more theoretical side of computer science meets the world through systems engineered to serve and interact with it" (McCarty 2005, 164). Our specific experience has made us see the urgency of thinking through the ways that research is validated from the perspective of different disciplines, as well as within the same disciplinary groupings of the academic unit. At stake here are the common language we might use to describe software prototyping within research teams and the ways we can take our results back to our different disciplinary homes.

Tensions of Reproducibility

Instead of relying on a service-based, developer-for-hire model of computing, what we call for here is an active discovery of how our disciplinary values and expectations as humanists converge or diverge with those working in different aspects of computing. Within the context of prototyping, we describe a hybrid mode of interdisciplinary academic collaboration set beyond the confines of a physical space (such as a laboratory) or grant-funded project; this collaboration was carried out in its beginning on the same campus, and then subsequently via virtual communication between humanists and computer scientists working from different sites. The project was experimental, not only regarding the concepts employed, but also in the structure of the collaboration. To our knowledge, no such project had been attempted before between the two academic departments at our institution. In this light, looking back on several years of working together on the TopoText team, we wonder how this type of collaboration was able to pursue its experimental nature while there was no formal support, and likewise, what incentives kept team members pursuing the research project despite the lack of structure. Although they were not clearly articulated in this initial stage of collaboration, there were reasons that the prototyping process appealed to all members of the team. If we are to generalize from the experience, what might be some ways of constructing the frame of collaboration in mutually beneficial ways? How do we balance experimentation and rigor in software prototyping that can bring us closer to "next generation tools" (Siemens, 2016, n.p.)? How can humanists understand what colleagues in computing think is a valuable result in a research project? Some common guidelines would be useful for aligning future collaboration.
Multidisciplinary collaboration models take into account disciplinary characteristics and differences. Major considerations in disciplinary difference include defining research problems and choosing critical vocabulary, designing methodology, asserting authorship, choosing publication venues, assigning rewards and recognition, as well as inter-researcher communication (Siemens, Liu, and Smith 2014, 54). Two models for collaboration in an academic setting are faculty-oriented research projects, where lead faculty members make decisions on behalf of the entire team and lead the intellectual direction of the project, and collaborations that approach team members as equals, where all members contribute intellectually to the project. The multidisciplinary humanities-computer science team discussed in this paper fits better into the latter, where the students continue to be as invested in its intellectual direction as the faculty members. It is also further from the service-based approach more commonly associated with faculty-oriented research projects; here, team members are invested in creating shared research foci, vocabularies, and methodologies.

As would be expected in a new collaboration, when we began building TopoText, the computer science team came to the project with another set of implicit values. Although we worked quite closely together, it is not fair to say that at the beginning of the collaboration we were completely aware of the other side's workflows or standards of success. One of the members of the team from computer science pointed to the systems and software development quality standard ISO/IEC 25010:2011, published by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) as part of the Systems and Software Quality Requirements and Evaluation (SQuaRE) series, as the standard that his field follows when developing software and that informs the pedagogy of software engineering. At first, such a profession-specific document seemed alienating, and yet it contained some interesting wisdom; had we not made a conscious effort to "exploit the benefits of diversity" of our project team, we would perhaps have missed this way that his research community articulates project goals (Siemens, Cunningham, Duff, and Warwick 2010, n.p.). Negotiating between the more open-ended, experimental nature of prototyping and ISO/IEC 25010:2011 indeed involved balancing novelty and conceptual innovation with functional suitability and accuracy. As a standard, it generally challenges the theoretical boundaries of the prototype while maintaining a level of robustness in the methodologies of the software product. One thing the humanities members of the team came to realize, as we continued to theorize the emerging prototype, was how easily our collaborators' labor and professional expertise can disappear behind the accuracy of the code. In other words, as the software prototype began to embody the qualities of a purposeful, running tool, it was easy to neglect the design decisions and testing that brought the tool into functional existence. It is important to note that another contributor from the computer science team asserted that he does not rigidly follow standards such as the ISO's, since the research projects he directs do not have the specific goal of an end-product software system.
The divide, if there can actually be said to be one, in our humanities-computer science collaboration was not so much across departmental lines, but rather between two competing goals: one of experimental modeling without any particular futurity for the prototype in mind, and another that sought to make strategic choices in step-wise implementation, planning for scalability and sustainability. We might formulate this insight as a question: for whom does a prototype have an afterlife? The software engineering part of the team described this method of developing code into software as a "spiral," incorporating feedback between developmental phases that allows for modification and improvement. That is to say, new phases are begun before their predecessor phases are complete. It would seem, therefore, that unbeknownst to us, the tension in the dual meaning of the notion of the "prototype," signifying both the singular, abstract "original" form and the mold or pattern from which subsequent copies can be developed, was playing itself out in the daily business of our collaboration. The analogy could be extended to the tension between a "pure" thought experiment and an experiment with the notion of reproducibility built into its execution. Reproducibility is key to non-proprietary, open software development these days, as well as to standards of reliability and transparency in certain circles of the digital humanities community, as the use of environments such as GitHub, R Markdown source documents, or Jupyter notebooks would seem to attest (Kluyver, Ragan-Kelley, Perez, et al. 2016). The principle of reproducibility also serves historiographic ends, as a way "of thinking-through the history and possibilities of computer-assisted text analysis" (Rockwell and Sinclair 2016). On our TopoText team, there were multiple perspectives on what needed to be done to bring about the software prototype as a public, shareable object: one that conceived of the prototype as a kind of essai, and another that conceptualized the prototype as a structure that could be expanded on later. Whereas in the digital humanities we might speak of operationalizing a concept, that is, translating a theoretical concept into a finite, computable experiment, others aim to move beyond experimental thinking with a computer to build a tool guided by best practices of software development so that it can be shared, distributed, and further elaborated. Again, we are reminded of the "rich plurality of concerns" included within computing (Edwards, Jackson, Chalmers, et al. 2013). Let us push forward to explore, then, the differences and commonalities between such social experiments in digital humanities software prototyping and the abovementioned ISO/IEC 25010:2011 standard.

Forging Vocabularies of Collaboration: From Standards to Guidelines

We are not suggesting that all digital humanities projects or collaboratories should aim for standardized implementation. Far from it. Instead, it is worth examining to what extent the details of the ISO/IEC 25010:2011 standard for software development might be translated into guidelines for software prototyping. Could delving into these standards help define a mutually comprehensible language for some of the guidelines of the research collaboratory? Are there aspects of them that all sides of a collaboratory can share?
Are there other aspects that are simply too commercially oriented to take root in the open source ethos espoused by the digital humanities? Are there ways that the ISO/IEC 25010:2011 standard might be used to draft some more general guidelines, or even be rethought to be useful to an informal, interdisciplinary encounter such as ours? Could such guidelines be scaled from the informal collaboration to the more formalized research collaboratory? We believe that the documentation contains some material for mutual understanding and deserves closer analysis.

Central to multidisciplinary collaboration is developing understanding between disciplines, which can be forged through an understanding of field-specific language and the contexts in which it is being used. L. J. Bracken and E. A. Oughton (2006) identify three main aspects of language that are involved in attaining such an understanding: dialectics, metaphor, and articulation. Dialectics refers to the difference between the everyday use of a word and its expert use, as well as the different meanings that are assigned to the same words by different disciplines. Metaphor, or 'heuristic metaphor,' refers to expressions that push a conceptual understanding by systematically extending an analogy (Klamer and Leonard 1994). A metaphor assumes that those involved in the conversation share the same context before making a conceptual correlation. The final aspect is articulation, which refers to the process of deconstructing one's disciplinary knowledge in conjunction with the disciplines of collaborators in an attempt to identify the building blocks and employ them towards developing a common understanding. According to Thierry Ramadier (2004), "articulation is what enables us to seek coherence within paradoxes, and not unity" (432). This idea of seeking coherence within paradoxes, rather than attempting to reconcile disciplinary differences, is what drives us to engage with the language of ISO/IEC 25010:2011. All three aforementioned aspects of language play a crucial role in developing a disciplinary understanding as articulated through these standards. Dialectics is crucial since some of the words used in the standards are familiar either in everyday life (such as "trust" or "comfort") or in a humanities context (such as "effectiveness"), but are actually used in a more specialized sense in software development. We employ a metaphor in our explanation of the "modularity" characteristic below by drawing an analogy to a wrapper in order to explain how the characteristic functions. Our approach to the language of ISO/IEC 25010:2011 focuses on developing an understanding between its characteristics and humanities concepts rather than attempting to reconcile the two.

The standards in question, which concern quality control in software development, were originally published in 2011, were reviewed and confirmed in 2017, and remain in force at the time of writing this article. Although software developers aspire to them, and they are taught as guidelines in computer science programs, like all other standards, more work needs to be done to assess to what extent they are effective or even observed. At first take, adapting guidelines for product-driven industry into the humanities may seem counterintuitive, or even meet with resistance; however, let us not forget that adopting (and reinterpreting) the languages of the different strands of computing runs deep in the digital humanities.
Practitioners have been both adopting and adapting standards since some of the field's earliest days. Take, for example, the Text Encoding Initiative (TEI), which initially adopted the Standard Generalized Markup Language (SGML) as an expression of its metadata, and which was then succeeded by the Extensible Markup Language (XML) that is still in use today. By adopting robust guidelines for the markup of structural and conceptual features of humanities data, the TEI community laid one of the foundations of digital scholarship today. Whereas the TEI community has created "guidelines" out of what are XML standards, would it not be possible to do the same with software prototyping out of the ISO/IEC 25010:2011 document?

As we mentioned above, our first foray into collaborative software prototyping, TopoText 1.0, was made in an undergraduate software engineering course offered at the American University of Beirut (Lebanon). It might be fair to say that the initial impulse for the collaboration, automating a process of mapping toponyms found in texts and conducting some basic textual analysis around them, was a case study through which undergraduates were exposed to industry-defined standards for high-quality software. This does not mean that such standards were scrupulously followed in the ensuing code, nor that TopoText went through the full series of tests for professional software development, but rather that they were the ideal to which the computing team referred in building the software core. After being exposed to this process model, the humanists on the project team were stirred to explore how such standards were not only product-centered aims, but also how they enriched the conceptualization of code-based work. This led us to ask the question: are standards a conceptual apparatus sitting at the human interface of digital humanists and developers without ever being acknowledged as such?

In the "waterfall model" of software development followed by our colleague in computing, an initial phase deals with the translation of concepts into processes and the articulation of specifications. This phase is followed by one focused on design, consisting of a modular decomposition of the steps of the core process. In these first two phases, the humanists worked closely with the computer scientists to articulate a common vision of the conceptual model. In the implementation phase that followed, this was less the case. In the ensuing testing and validation phases, the humanists stepped back in to confirm to what extent the desired processes had been successfully implemented. These phases reinforce the social element of software development, in which "tests of strength" of the project's functionality and usability are carried out. Indeed, the testing phase attempts to compare the "symbolic level of the literate programmer with the machinic requirement of compilation and execution of the software" (Berry 2011, 68).

One of the basic tensions inherent in our process-oriented collaborative model was the language used to describe the resultant system. The digital humanists on the team called the first version of the validated system a prototype, by which we meant an initial step that exposed some of the shortcomings of the text mapping process, whereas the software engineering approach characterized the system as a product.
For some on the research team, the partial operationalization of a concept was at stake, a full implementation of which may never be possible, whereas from the software development angle the work consisted of an entire "life cycle," from the taking of specifications to the delivery of a workable system. We can conclude from this difference in perspective that the cycles of labor in prototyping, or perhaps in any research where software development is heavily involved, from planning through implementation to validation, are conceived of very differently across disciplines. In retrospect, working together necessitated an understanding of our mutual notions of such phases of labor in research.

In the ISO/IEC 25010:2011 standards documentation, the reader is struck by the language of engineering, functionalism, and quality control, a far cry from what most humanists, even digital humanists, deal with every day. The 2011 document in question provides guidelines for what it calls characteristics and sub-characteristics of quality software development. The relevant sections of the document are contained in section 4, Terms and Definitions. Section 4.1 outlines "quality in use" characteristics and sub-characteristics, that is to say, traits of a piece of software that deal with the user experience. Section 4.2 outlines "product quality" characteristics and sub-characteristics, in other words, elements related to how the objectives set out in the design phase are met by the software prototype. Both of these domains, the role and experience of the non-expert user and the optimal performance of the tool, were issues of perpetual conversation and debate in our collaboration.

The principles set out in the ISO/IEC 25010:2011 document are not all applicable to the specific case of software development engaged in by the authors of this paper; for example, the principle of freedom from risk touches on forms of risk management that do not come into play with our text mapping tool. Likewise, the principles of physical comfort and security do not seem immediately relevant, since TopoText creates no particular physical stress and works with open source gazetteers and plain text source material. The risk of compromising the confidentiality or integrity of its users is very low to nil. The same might not be true of other software prototyping endeavors involving geo-locating devices or wearable computing that collect data about users or create other physical stress. These caveats notwithstanding, the characteristics and sub-characteristics of sections 4.1 and 4.2, both user-centered and function-centered, contain quite a few pertinent concepts worthy of both our attention and contextualization within current conversations in the digital humanities. It is with them that we believe bridges of dialogue could be built. Space does not allow us to cover every single one of the themes evoked by ISO/IEC 25010:2011; here, we will limit ourselves to a brief discussion of a few that are most relevant to our experience within the framework of designing TopoText. By linking various functionalities together and automating a process, some of the more rigid standards were satisfactorily met in the software prototype; without them, the prototype would not exhibit (in the terms of ISO/IEC 25010:2011) functional completeness, that is, the extent to which the software functions matched the outlined objectives.
For example, the first version of TopoText aimed to map locations from texts using the Google Maps API, and also to carry out what Bubenhofer has called "geo-collocation" [Geokollokationen], making spatial associations of features with natural language (Bubenhofer 2014, 45–59). This approach encountered problems with erroneous spatial data. In the case of nineteenth-century novels about London, the errors were most often mismatches with other places in the Anglophone world named after the geographies of London. Although this version met sufficient standards to carry out its functions, it left little space for effectiveness/accuracy. We did not know that we would discover something about the qualities of the data we were using, historical literary texts and a contemporary gazetteer, as well as about the processes we were attempting to model. There is a growing literature on "failure" in the digital humanities and on the possibility of a "failure that works," leaving open the possibility of "uncovering and correcting your mistakes to be an essential part of the creative process, rather than something reserved for hindsight" (Mlynaryk 2016). It is not immediately clear whether the software development world would adopt the "working failure" as part of its standard, but the notion does seem to lurk within section 4.2 of ISO/IEC 25010:2011 on product quality, in as much as the functional completeness of a software prototype may be satisfied while its functional accuracy or appropriateness is not.

Affinities Between Software Prototyping and Digital Humanities

Building on the first instance of collaboration, as well as on a functioning skeleton of the first prototype, in the second version of TopoText we sought to integrate human judgment into the geocoding process. We changed the reference gazetteer to GeoNames and implemented a basic interface through which a list of potential matching places is produced, a function similar to the Edinburgh Geoparser's capacity to disambiguate with respect to a gazetteer. TopoText adds the function of allowing the user to rank, in an act of close reading, the best match. We also added what the layperson might call functionalities to TopoText, aspects of which are also defined by ISO/IEC 25010:2011, a selection of the characteristics that the first iteration failed to achieve. These include modularity, reusability, and maintainability; compatibility, interoperability, and coexistence; functional correctness; and, from the "quality in use" section, trust.

Modularity insists that the implementation of the prototype should be well documented and should be based on wrappers to ensure the feasibility of future enhancement, such as replacing the technologies used or integrating different libraries. Moreover, the model should be separated from the view (i.e., from ways of displaying the model) in order to support different types of interfaces for data consumers. Essentially, this quality has to do with separating content from form. Being an open data generator, TopoText generates a comma-separated values (CSV) file of the geographic entities included in each text, matched with latitude-longitude coordinates, which can then be exported, allowing reusability in other environments. It also generates maps and word clouds of the words most frequently found in collocation with particular places, although in a separate browser window. Taken from this angle, the development process has aimed to keep the prototype's model separate from its form. Although this principle has not translated into a seamless, non-expert user experience with the tool, the notion of open data generation overlaps with the standard of modularity. The software engineering work on the tool adopted an "agile scrum" approach with respect to the various functions; new functions can be added, that is, specific theoretical interventions can be operationalized, subject to their modularity and their consistency with the overall process framework. More work can be done in the case of TopoText to document its own interwoven process of design and theory to ensure replicability. After all, a prototype must exhibit maintainability; that is, it should always contain the seed of its own improvement.

Reusability is one of the key motivating factors for version 2.0 of TopoText. The question of reusability finds its most immediate expression in the tool's support for importing and exporting its geocoded data in a plain CSV format; in other words, all data generated by TopoText are reusable in other GIS-based platforms. This sub-characteristic closely relates to compatibility, which houses the two subcategories of interoperability and coexistence. Both versions of TopoText were created through a deep remix of existing tools and libraries that are interoperable; in the theoria stages of the second iteration, the outside data sources that TopoText draws upon were revisited and replaced in order to situate it further within the realm of open data; we switched from the Google Maps Engine and Maps API to Leaflet (an open source JavaScript library for interactive maps) and GeoNames (an open gazetteer published with a liberal Creative Commons license). As we mentioned before, sometimes a new version of a prototype might improve specific functions at the risk of outpacing others. In fact, future versions of TopoText need to upgrade the visualization of its textual analysis to match the improved level of the geocoding. In sum, interoperability was taken into account from the beginning, and in a way that would allow us to shuffle the coexisting tools as the prototyping process continued. Nonetheless, a working prototype exhibiting modularity exists. It remains a work in progress, with its different parts changing incrementally.

Functional correctness refers to the degree to which the prototype "provides the correct results with the needed degree of precision" (see ISO/IEC 25010:2011, section 4.2.1.2, System and Software Quality Models, https://www.iso.org/standard/35733.html). As we mentioned above, the move away from automatic geocoding toward a semi-automatic, human-in-the-loop process of data disambiguation not only allowed for more accurate matching of place names with spatial coordinates, but also enacted one of the more interesting human-centered aspects of ISO/IEC 25010:2011, namely trust. Reincorporating human close reading, that is, human judgment about location, obviously slows down the data creation process, but it also serves as a way to peek into the "black box"; this semi-automated approach is meant to mediate between the advantages of automatic parsing, namely speed and scope, and the painstakingly slow process of manual geocoding.
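To ground the modularity, reusability, and trust characteristics just described, here is a minimal sketch, not TopoText's actual source, of a human-in-the-loop disambiguation step followed by an open CSV export. The hard-coded candidate list and the input() ranking prompt are illustrative assumptions standing in for a GeoNames query and the tool's interface.

```python
import csv

# Hypothetical candidate lists, such as a GeoNames query might return:
# each place name maps to several (description, lat, lon) candidates.
CANDIDATES = {
    "London": [
        ("London, England", 51.5074, -0.1278),
        ("London, Ontario, Canada", 42.9849, -81.2453),
    ],
}

def disambiguate(name, candidates):
    """Ask the reader to pick the best match: close reading in the loop."""
    options = candidates[name]
    for i, (label, lat, lon) in enumerate(options):
        print(f"[{i}] {label} ({lat}, {lon})")
    choice = int(input(f"Best match for '{name}': "))
    return options[choice]

def export_csv(rows, path="geocoded.csv"):
    """Write plain CSV so the data stay reusable in other GIS platforms."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["place", "label", "latitude", "longitude"])
        writer.writerows(rows)

if __name__ == "__main__":
    label, lat, lon = disambiguate("London", CANDIDATES)
    export_csv([("London", label, lat, lon)])
```

The design choice to pause for a human ranking trades speed for the accuracy and trust that the standard names, which is exactly the mediation described above.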
Conclusion

Much is made of the interdisciplinary encounters in the digital humanities lab, in particular the inclusion of other academic voices from outside the humanities, and yet much more needs to be theorized about the languages of collaborative work, especially if we imagine reaching far beyond the humanities for potential collaborators. Such collaborative work necessarily means venturing into disciplinary conventions and idioms that appear foreign and even alienating. Navigating such radically different discourses is tantamount to analyzing, and even deconstructing, the "boundary-work" of disciplinary construction (Klein 2006, 265–283). We might call it a form of digital humanities translanguaging: moving beyond established academic language systems in order to draw upon complex semiotic resources for enacting our transdisciplinary research. The example of the International Organization for Standardization (ISO) document has provided us with some starting points for a dialogue with other disciplines, in what might just be an opportunity to infuse lessons learned from critical digital humanities into a software development model.

Experimental prototypes such as TopoText implement any number of important design decisions that are based upon theoretical positions, for example, about the value of the human in semi-automated computational processes. Although notebooks have not been built for TopoText, they are, as Rockwell and Sinclair suggest, perhaps a valuable next step, as they would document for others how theoretical positions become instantiated in code and then developed towards software. For a third iteration, we plan to continue thinking through the terminology of the standards explored in this article, and about how to continue prototyping across disciplines in a meaningful way, seeking points of interest or overlap between what might appear to be divergent research goals. One of the foci will be on the usability and operability of the prototype, characteristics which refer to the attributes that make software easy to use and control, in particular for non-expert users. This effort will focus on existing functionalities but will also address interface design aesthetics, which, incidentally, are also covered in the standards.

Software is meant for something more than an end in itself. Software developers work on innovating their methods in order to fine-tune practical applications. Instead of viewing the standards as a technical and limiting framework, or as a strictly industry-based, product-driven set of rules alien to the type of work carried out in the digital humanities, let us continue to think of ways that the standards might be drawn upon as resources to shape critically informed guidelines that will enable next-generation software. The standards can, and should, be approached critically, conceived of as a core part of the prototyping process that allows for future flexibility, given changing project goals or project team members, rather than serving as an ideal to which all products must conform.

Competing Interests

The authors have no competing interests to declare.

References

Berry, David. 2011. The Philosophy of Software: Code and Mediation in the Digital Age. Houndmills: Palgrave Macmillan. DOI: https://doi.org/10.1057/9780230306479

Bos, Nathan, Ann Zimmerman, Judith Olson, Jude Yew, Jason Yerkie, Erik Dahl, and Gary Olson. 2007. "From Shared Databases to Communities of Practice: A Taxonomy of Collaboratories." Journal of Computer-Mediated Communication 12: 652–672. DOI: https://doi.org/10.1111/j.1083-6101.2007.00343.x

Bracken, L. J., and E. A. Oughton. 2006. "'What do you mean?' The Importance of Language in Developing Interdisciplinary Research." Transactions of the Institute of British Geographers New Series 31(3): 371–382. DOI: https://doi.org/10.1111/j.1475-5661.2006.00218.x

Bubenhofer, Noah. 2014. "Geokollokationen – Diskurse zu Orten: Visuelle Korpusanalyse." Mitteilungen des Deutschen Germanistenverbandes 61(1): 45–59. DOI: https://doi.org/10.14220/mdge.2014.61.1.45

Edinburgh Language Technology Group. 2017. "Edinburgh Geoparser." Accessed July 18, 2018. https://www.ltg.ed.ac.uk/software/geoparser.

Edwards, Paul N., Steven J. Jackson, Melissa K. Chalmers, Geoffrey C. Bowker, Christine L. Borgman, David Ribes, Matt Burton, and Scout Calvert. 2013. Knowledge Infrastructures: Intellectual Frameworks and Research Challenges. Ann Arbor: Deep Blue. http://hdl.handle.net/2027.42/97552.

El Khatib, Randa. Forthcoming. "Collocating Places and Words with TopoText." In Social Knowledge Creation in the Humanities 2, edited by Alyssa Arbuckle, Aaron Mauro, and Daniel Powell. New Technologies in Medieval and Renaissance Studies Series. Toronto: Iter Press.

Galey, Alan, and Stan Ruecker. 2010. "How a Prototype Argues." Literary and Linguistic Computing 25(4): 405–24. DOI: https://doi.org/10.1093/llc/fqq021

International Organization for Standardization. 2017. "ISO/IEC 25010:2011." Accessed Nov 20, 2018. https://www.iso.org/standard/35733.html.

Klamer, Arjo, and Thomas C. Leonard. 1994. "So What's an Economic Metaphor?" In Natural Images in Economic Thought: Markets Read in Tooth and Claw, edited by Philip Mirowski, 20–51. Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511572128.002

Klein, Julie Thompson. 2006. "The Rhetoric of Interdisciplinarity: Boundary Work in the Construction of New Knowledge." In The SAGE Handbook of Rhetorical Studies, edited by Andrea A. Lunsford, Kirt H. Wilson, and Rosa A. Eberly, 265–283. Thousand Oaks: Sage Publications.

Kluyver, Thomas, Benjamin Ragan-Kelley, Fernando Perez, Brian Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica Hamrick, Jason Grout, and Sylvain Corlay. 2016. "Jupyter Notebooks: a Publishing Format for Reproducible Computational Workflows." In Positioning and Power in Academic Publishing: Players, Agents and Agendas, edited by Fernando Loizides and Birgit Schmidt, 87–90. Amsterdam: IOS Press.

McCarty, Willard. 2005. Humanities Computing. Houndmills, Hampshire: Palgrave Macmillan. DOI: https://doi.org/10.1057/9780230504219

Mlynaryk, Jenna. 2016. "Working Failures in Traditional and Digital Humanities." HASTAC (blog). Accessed Nov 20, 2018. https://www.hastac.org/blogs/jennamly/2016/02/15/working-failures-traditional-and-digital-humanities.

Ramadier, Thierry. 2004. "Transdisciplinarity and its Challenges: The Case of Urban Studies." Futures 36(4): 423–39. DOI: https://doi.org/10.1016/j.futures.2003.10.009

Rockwell, Geoffrey, and Stéfan Sinclair. 2016. "Thinking-through the History of Computer-Assisted Text Analysis." In Doing Digital Humanities: Practice, Training, Research, edited by Constance Crompton, Richard J. Lane, and Ray Siemens, 9–21. London: Routledge.

Saklofske, Jon. 2016. "Digital Theoria, Poiesis, and Praxis: Activating Humanities Research and Communication Through Open Social Scholarship Platform Design." Scholarly and Research Communication 7(2): 0201252.

Siemens, Lynne, and Ray Siemens. 2012. "Notes from the Collaboratory: An Informal Study of an Academic DH Lab in Transition." Paper presented at the Digital Humanities 2012 Conference, Hamburg, Germany, July 18. Published on the Implementing New Knowledge Environments Blog.

Siemens, Lynne, Richard Cunningham, Wendy Duff, and Claire Warwick. 2010. "'More Minds are Brought to Bear on a Problem': Methods of Interaction and Collaboration within Digital Humanities Research Teams." Digital Studies/Le champ numérique 2(2). DOI: https://doi.org/10.16995/dscn.80

Siemens, Lynne, Yin Liu, and Jefferson Smith. 2014. "Mapping Disciplinary Differences and Equity of Academic Control to Create a Space for Collaboration." Canadian Journal of Higher Education 44(2): 49–67.

Siemens, Ray. 2016. "Communities of Practice, the Methodological Commons, and Digital Self-Determination in the Humanities." Digital Studies/Le champ numérique. DOI: https://doi.org/10.16995/dscn.31

Smithies, James. 2017. The Digital Humanities and the Digital Modern. London: Palgrave Macmillan. DOI: https://doi.org/10.1057/978-1-137-49944-8

How to cite this article: El Khatib, Randa, David Joseph Wrisley, Shady Elbassuoni, Mohamad Jaber and Julia El Zini. 2019. "Prototyping Across the Disciplines." Digital Studies/Le champ numérique 8(1): 10, pp. 1–20. DOI: https://doi.org/10.16995/dscn.282

Submitted: 24 July 2017; Accepted: 15 June 2018; Published: 03 January 2019

Copyright: © 2019 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0); see http://creativecommons.org/licenses/by/4.0/. Digital Studies/Le champ numérique is a peer-reviewed open access journal published by the Open Library of Humanities.
work_cdtzjkypkrhwvm65hjrapmg7ki ----

Fatigue evaluation in maintenance and assembly operations by digital human simulation

Liang Ma, Damien Chablat, Fouad Bennis, Wei Zhang, Bo Hu, and François Guillaume. Virtual Reality, Springer Verlag, 2010, 14(1), pp. 339–352. DOI: 10.1007/s10055-010-0156-8. HAL Id: hal-00495263, https://hal.archives-ouvertes.fr/hal-00495263, submitted on 25 Jun 2010.

Abstract Virtual human techniques have been used extensively in industrial design in order to consider human factors and ergonomics as early as possible. The physical status (the physical capacity of the virtual human) has mostly been treated as invariable in the currently available human simulation tools, whereas in reality the physical capacity varies over time in an operation, and the change of the physical capacity depends on the history of the work as well. Virtual Human Status is proposed in this paper in order to assess the difficulty of manual handling operations, especially from the physical perspective. The decrease of the physical capacity before and after an operation is used as an index to indicate the work difficulty. The reduction of physical strength is simulated in a theoretical approach on the basis of a fatigue model in which the fatigue resistances of different muscle groups were regressed from 24 existing maximum endurance time (MET) models. A framework based on digital human modeling techniques is established to realize the comparison of physical status. An assembly case from airplane assembly is simulated and analyzed under the framework. The endurance time and the decrease of the joint moment strengths are simulated.
The experimental result in simulated operations under laboratory conditions confirms the feasibility of the theoretical approach.

Keywords Virtual human simulation · muscle fatigue model · fatigue resistance · physical fatigue evaluation · human status

Liang Ma · Damien Chablat · Fouad Bennis: Institut de Recherche en Communications et Cybernétique de Nantes, UMR 6597 du CNRS, École Centrale de Nantes, IRCCyN, 1 rue de la Noë, BP 92 101, 44321 Nantes CEDEX 03, France. Tel.: +33-02 40 37 69 58; Fax: +33-02 40 37 69 30. E-mail: {liang.ma, damien.chablat, fouad.bennis}@irccyn.ec-nantes.fr

Wei Zhang · Bo Hu: Department of Industrial Engineering, Tsinghua University, 100084 Beijing, P.R. China. E-mail: zhangwei@tsinghua.edu.cn, b-hu05@mails.tsinghua.edu.cn

François Guillaume: EADS Innovation Works, 12 rue Pasteur, BP 76, 92152 Suresnes Cedex, France. E-mail: francois.guillaume@eads.net

1 Introduction

Although automation techniques have come to play a very important role in industry, many operations still require manual handling thanks to the flexibility and dexterity of humans. Some of these manual handling operations involve relatively heavy physical loads, which might result in physical fatigue in the muscles and joints and further generate potential risks for musculoskeletal disorders (MSDs) (Li and Buckle, 1999).

In order to improve work design, digital human modeling (DHM) techniques have been used more and more in industry, taking the human as the center of the work design system (Chaffin, 2002, 2007), since this benefits the validation of workspace design, the assessment of the accessibility of an assembly design, the reduction of production cost, and the reduction of physical risks as well.

Several commercially available DHM tools have already been developed and integrated into computer aided design (CAD) tools, such as Jack (Badler et al, 1993), 3DSSPP (Chaffin et al, 1999), RAMSIS (Bubb et al, 2006), AnyBody (Damsgaard et al, 2006), Santos™ (VSR Research Group, 2004), etc. In general, the virtual human in those tools is modeled with a large number of degrees of freedom (DOF) to represent the joint mobility, create the kinematic chain of the human, and complete the skeleton structure. Meanwhile, the graphical appearance of the virtual human is realized by bone, muscle, skin, and cloth models from the interior to the exterior, from simple stick models to complicated 3D mesh models. Normally, biomechanical and anthropometric databases are set up to determine the virtual human's dimensional and physical properties.

The main functions of virtual human simulation tools are posture analysis and posture prediction. These tools are capable of determining the workspace of the virtual human (Yang et al, 2008), assessing the visibility and accessibility of an operation (Chedmail et al, 2003), evaluating postures (Bubb et al, 2006), etc. Conventional motion time methods (MTM) and posture analysis techniques can be integrated into virtual human simulation systems to assess work efficiency (Hou et al, 2007). From the physical aspect, the moment load of each joint (e.g., 3DSSPP) and even the force of each individual muscle (e.g., AnyBody) can be determined, and the posture is predictable for reach operations (Yang et al, 2006) based on inverse kinematics and optimization methods. Overall, human motion can be simulated and analyzed based on workspace information, virtual human strength information, and other aspects.
However, there are still several limitations in the existing virtual human simulation tools.

There is no integration of a physical fatigue model in most of the human simulation tools; the physical capacity is often initialized as constant. For example, the joint strength is assigned as the maximum joint moment strength in 3DSSPP, and the strength of each muscle is set proportional to its physiological cross-sectional area (PCSA) in AnyBody. The physical capacity remains constant in the simulation, and the fatigue effect over time is not sufficiently considered. However, the change of physical status can be experienced every day by everyone, and different working procedures generate different fatigue effects. Furthermore, it has been reported that the motion strategy depends on the physical status, and that different strategies are taken under fatigue and non-fatigue conditions (Chen, 2000; Fuller et al, 2008). Therefore, it is necessary to create a virtual human model with a variable physical status for the simulation.

Some fatigue models have been incorporated into virtual human tools to predict the variable physical strength. For example, Wexler's fatigue model (Ding et al, 2000) has been integrated into Santos™ (Vignes, 2004), and Giat's fatigue model (Giat et al, 1993) has been integrated on the basis of Hill's muscle model (Hill, 1938) in the computer simulation by Komura et al (2000). However, in these previous studies either the muscle fatigue model has too many variables for ergonomic applications (e.g., Wexler's model), or there is no solid physiological principle behind the fatigue decay term (Xia and Frey Law, 2008). It is necessary to find a simple fatigue model that is interpretable in terms of muscle physiological mechanisms for ergonomics applications.

In addition, some assessments in those tools provide indexes generated by traditional evaluation methods (e.g., rapid upper limb assessment (RULA)). Due to the intermittent recording procedures of the conventional posture analysis methods, the evaluation result cannot analyze the fatigue effect in detail. In this case, a new fatigue evaluation tool should be developed and integrated into virtual human simulation.

In order to assess the variable human status, a prototype of a digital human modeling and simulation tool developed in OpenGL is presented in this paper. This human modeling tool operates under a virtual environment framework involving variable physical status on the basis of a fatigue model.

The structure of the paper is as follows. First, a virtual human model is introduced into the framework for posture analysis based on kinematic, dynamic, biomechanical, and graphical modeling. Second, the framework is presented with a new definition called Human Status. Third, the fatigue model and the fatigue resistances of different muscle groups are introduced. Finally, an application case from the European Aeronautic Defence & Space (EADS) company is assessed using this prototype tool under the framework, with experimental validation.

2 Digital human modeling

2.1 Kinematic modeling of virtual human

In this study, the human body is modeled kinematically as a series of revolute joints. The modified Denavit-Hartenberg (modified DH) notation system (Khalil and Dombre, 2002) is used to describe the movement flexibility of each joint. According to the joint function, one natural joint can be decomposed into 1 to 3 revolute joints.
Each revolute joint has its rotational joint coordinate, labeled q_i, with joint limits: the upper limit q_i^U and the lower limit q_i^L. A general coordinate q = [q_1, q_2, ..., q_n] is defined to represent the kinematic chain of the skeleton.

The human body is geometrically modeled by 28 revolute joints to represent the main movements of the human body in Fig. 1. The posture, velocity, and acceleration are expressed by the general coordinates q, q̇, and q̈. It is feasible to carry out the kinematic analysis of the virtual human based on this kinematic model. By implementing inverse kinematic algorithms, it is possible to predict the posture and trajectory of the human, particularly for the end effectors (e.g., the hands). All the parameters for modeling the virtual human are listed in Table 1. [X_r, Y_r, Z_r] are the Cartesian coordinates of the root point (the geometrical center of the pelvis) in the frame defined by X_0 Y_0 Z_0.

Fig. 1 Geometrical modeling of virtual human

The geometrical parameters of the limbs are required in order to accomplish the kinematic modeling. Such information can be obtained from anthropometry databases in the literature. The dimensional information can also be used for the dynamic model of the virtual human. The lengths of the different segments can be calculated as a proportion of the body stature H, as in Table 2.

Table 1 Geometric modeling parameters of the overall human body

 j   a(j)  u_j  σ_j  γ_j    b_j  α_j    d_j    q_j   r_j     q_ini
 1    0    1    0    0      Z_r  -π/2   X_r    θ1    Y_r     0
 2    1    0    0    0      0    π/2    0      θ2    0       π/2
 3    2    0    0    0      0    π/2    0      θ3    0       π/2
 4    3    0    0    0      0    π/2    0      θ4    R_lb    0
 5    4    0    0    0      0    -π/2   0      θ5    0       0
 6    5    0    0    0      0    π/2    0      θ6    R_ub    π/2
 7    6    0    0    0      0    π/2    0      θ7    0       π/2
 8    7    0    0    0      0    π/2    0      θ8    0       0
 9    5    1    0    -π/2   0    0      D_ub   θ9    -W_s/2  0
 10   9    0    0    0      0    -π/2   0      θ10   0       -π/2
 11   10   0    0    0      0    -π/2   0      θ11   -R_ua   -π/2
 12   10   0    0    0      0    -π/2   0      θ12   0       0
 13   11   0    0    0      0    π/2    0      θ13   0       0
 14   5    1    0    -π/2   0    0      D_ub   θ14   W_s/2   0
 15   14   0    0    0      0    -π/2   0      θ15   0       -π/2
 16   15   0    0    0      0    -π/2   0      θ16   -R_ua   -π/2
 17   16   0    0    0      0    -π/2   0      θ17   0       0
 18   17   0    0    0      0    π/2    0      θ18   0       0
 19   1    1    0    -π/2   0    -π/2   0      θ19   -W_w/2  -π/2
 20   19   0    0    0      0    -π/2   0      θ20   0       -π/2
 21   20   0    0    0      0    -π/2   0      θ21   -R_ul   -π/2
 22   21   0    0    0      0    -π/2   0      θ22   0       -π/2
 23   22   0    0    0      0    0      -D_ll  θ23   0       0
 24   1    1    0    -π/2   0    -π/2   0      θ24   W_w/2   -π/2
 25   24   0    0    0      0    -π/2   0      θ25   0       -π/2
 26   25   0    0    0      0    -π/2   0      θ26   -R_ul   -π/2
 27   26   0    0    0      0    -π/2   0      θ27   0       -π/2
 28   27   0    0    0      0    0      -D_ll  θ28   0       0

Table 2 Body segment lengths as a proportion of body stature (Chaffin et al, 1999; Tilley and Dreyfuss, 2002)

 Symbol      Segment               Length
 R_ua        Upper arm             0.186H
 R_la        Forearm               0.146H
 R_h         Hand                  0.108H
 R_ul        Thigh                 0.245H
 D_ll        Shank                 0.246H
 W_s         Shoulder width        0.204H
 W_w         Waist width           0.100H
 D_ub, L_ub  Torso length (L5-L1)  0.198H
 R_ub        Torso length (L1-T1)  0.090H
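As an illustration of how such a parameter table is consumed, the following is a minimal sketch, not the authors' OpenGL tool, of composing modified-DH transforms into a kinematic chain. For brevity, the sketch composes a serial chain, whereas Table 1 describes a tree (via the antecedent column a(j)); the two-row parameter list is an illustrative assumption, not rows from Table 1.

```python
import numpy as np

def mdh_transform(alpha, d, theta, r):
    """Modified DH (Khalil-Dombre): rotate about x by alpha, translate d
    along x, rotate about z by theta, translate r along z."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    ct, st = np.cos(theta), np.sin(theta)
    return np.array([
        [ct,      -st,      0.0,  d],
        [st * ca,  ct * ca, -sa, -r * sa],
        [st * sa,  ct * sa,  ca,  r * ca],
        [0.0,      0.0,      0.0,  1.0],
    ])

def forward_kinematics(params, q):
    """Compose joint frames along a serial chain; params[i] = (alpha, d, r)."""
    T = np.eye(4)
    frames = []
    for (alpha, d, r), theta in zip(params, q):
        T = T @ mdh_transform(alpha, d, theta, r)
        frames.append(T.copy())
    return frames

if __name__ == "__main__":
    # Toy two-joint chain (illustrative link offsets, in meters).
    params = [(-np.pi / 2, 0.0, 0.3), (np.pi / 2, 0.0, 0.25)]
    q = [0.1, -0.4]
    end_effector = forward_kinematics(params, q)[-1]
    print(end_effector[:3, 3])  # position of the last frame's origin
```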
2.2 Dynamic modeling of virtual human

The necessary dynamic parameters for each body segment include the center of gravity, the mass, the moment of inertia about the center of gravity, etc. According to the percentage distribution of total body weight across the different segments (Chaffin et al, 1999), the weights of the different segments can be calculated using Table 3.

Table 3 Percentage distribution of total body weight according to different segmentation plans (Chaffin et al, 1999)

 Grouped segments (% of total body weight)  Individual segments (% of grouped-segment weight)
 Head and neck = 8.4%                       Head = 73.8%, Neck = 26.2%
 Torso = 50%                                Thorax = 43.8%, Lumbar = 29.4%, Pelvis = 26.8%
 Total arm = 5.1%                           Upper arm = 54.9%, Forearm = 33.3%, Hand = 11.8%
 Total leg = 15.7%                          Thigh = 63.7%, Shank = 27.4%, Foot = 8.9%

It is feasible to calculate the other necessary dynamic information by simplifying the segment shapes: the limbs are simplified as cylinders, the head as a ball, and the torso as a cube. The moments of inertia can then be determined under the assumption of a uniform density distribution. For the virtual human system, once all the dynamic parameters are known, it is possible to calculate the torques and forces at each joint following the Newton-Euler method (Khalil and Dombre, 2002).

2.3 Biomechanical modeling of virtual human

The biomechanical properties of the musculoskeletal system should also be modeled for virtual human simulation. From the physical aspect, the skeleton structure, muscles, and joints are the main biomechanical components of a human. In our study, only the joint moment strengths and joint movement ranges are used for the fatigue evaluation.

As mentioned before, with correct kinematic and dynamic models, it is possible to calculate the torques and forces in the joints with acceptable precision. Although the biomechanical properties of muscles are accessible and different optimization methods have been developed in the literature, the determination of individual muscle forces is still very complex and not as precise as that of joint torques (Xia and Frey Law, 2008). Since several muscles are attached around each joint, force calculation at the muscle level is a mathematically underdetermined problem. In addition, each individual muscle has a different fiber composition, a different force lever, and furthermore a different muscle coordination mechanism, so the complexity of the problem increases dramatically at the muscle level. Therefore, in our system, only the joint moment strength is taken to demonstrate the fatigue model.

The joint torque capacity is the overall performance of the muscles attached around the joint, and it depends on the posture and the rotation speed of the joint (Anderson et al, 2007). When a heavy load is handled in a manual operation, the movement speed is relatively small, so the operation is almost equivalent to a static case. The influence of speed can be neglected, and only posture is considered. In this situation, the joint strength can be determined according to the strength models in Chaffin et al (1999). The joint strength is measured as a torque and modeled as a function of the joint flexion angles. An example of joint strength is given in Fig. 2: the shoulder flexion angle and the elbow flexion angle are used to determine the profile of the male adult elbow joint strength. The 3D mesh surfaces represent the elbow joint strengths covering 95% of the population; for the 50th percentile, the elbow joint strength varies from 45 to 75 Nm according to the joint positions.

Fig. 2 Elbow static strength depending on the elbow and shoulder joint flexion angles αs, αe [deg]; panels show the 2.5th, 16th, 50th, 84th, and 97.5th percentiles (x axis: shoulder flexion angle [deg]; y axis: elbow flexion angle [deg]; z axis: elbow joint strength [Nm])
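Before moving to the graphical model, the segment simplification of Section 2.2 can be made concrete. The following sketch derives a segment mass from Tables 2 and 3 and the inertia of a uniform cylinder; the 70 kg body mass, 1.75 m stature, and 4 cm radius are assumptions for the example, not values from the paper.

```python
import math

def segment_mass(body_mass, group_pct, segment_pct):
    """Mass from Table 3: share of total body weight times the
    segment's share of its grouped-segment weight."""
    return body_mass * group_pct * segment_pct

def cylinder_inertia(mass, length, radius):
    """Moments of inertia of a uniform solid cylinder about its center
    of gravity (transverse and longitudinal axes)."""
    transverse = mass * (3.0 * radius**2 + length**2) / 12.0
    longitudinal = 0.5 * mass * radius**2
    return transverse, longitudinal

if __name__ == "__main__":
    body_mass = 70.0          # kg, assumed subject
    stature = 1.75            # m, assumed subject
    # Forearm: total arm = 5.1% of body weight, forearm = 33.3% of the arm.
    m = segment_mass(body_mass, 0.051, 0.333)
    length = 0.146 * stature  # R_la from Table 2
    radius = 0.04             # m, illustrative assumption
    print(m, cylinder_inertia(m, length, radius))
```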
2.4 Graphical modeling of virtual human

The final step in modeling the virtual human is its graphical representation. The skeleton is divided into 11 segments: body (1), head and neck (1), upper arms (2), lower arms (2), upper legs (2), lower legs (2), and feet (2). Each segment is modeled as a 3ds file (3D Max, Autodesk Inc.) (Fig. 3(a)) and is connected via one or more revolute joints to another segment to assemble the virtual skeleton (Fig. 3(b)). For each segment, an origin point and two mutually perpendicular vectors are attached to represent its position and orientation in the simulation, respectively. The position and orientation can be calculated by the kinematic model of the virtual human.

Fig. 3 Virtual skeleton composed of 3DS models: (a) 3DS model, (b) virtual skeleton

3 Framework for evaluating manual handling operations

The center of the framework is the objective work evaluation system (OWES) in Fig. 4. The input module includes human motion, interaction information, and the virtual environment. Human motion is either captured by a motion capture system or simulated by virtual human simulation. The interaction information is either obtained via haptic interfaces or modeled in the simulation. The virtual environment is constructed to provide visual feedback to participants or workspace information to the simulation. The input information is processed in OWES, and with different evaluation criteria, different aspects of human work can be assessed as in the previous human simulation tools.

Fig. 4 Framework for the work evaluation (inputs: virtual environment, virtual human, and virtual interaction, fed by motion capture, haptic interfaces, virtual reality, and simulated human motion; OWES applies fatigue, posture, efficiency, and comfort criteria to drive fatigue, comfort, and posture analysis, a posture prediction algorithm, and the virtual human status update)

A new concept, human status, is proposed for this framework to generalize the discussion. Human Status: a state, or situation, in which the human possesses different capacities for an industrial operation. It can be further classified into mental status and physical status. Human status can be described as an aggregation of a set of human abilities, such as visibility, physical capacity (joint strength, muscle strength), and mental capacity. The virtual human status can be mathematically noted as HS = {V_1, V_2, ..., V_n}. Each V_i represents one specific aspect of human abilities, and this state vector can be further detailed by a vector V_i = {v_i1, v_i2, ..., v_im_i}. The change of the human status is defined as ∆HS = HS(t + δt) − HS(t) = {∆V_1, ∆V_2, ..., ∆V_n}. For example, one aspect of the physical status can be noted as HS = [S_1, S_2, ..., S_n], where S_i represents the physical joint strength of the ith joint of the virtual human.

In order to make the simulation as realistic as the real world, it is necessary to know how the human generates a movement. The bidirectional communication between the human and the real world in an operation decides the action taken to accomplish a physical task: the worker's mental and physical status can be influenced by the history of the operation, while the worker chooses his or her suitable movement according to his or her current mental and physical statuses. Hence the framework is designed to evaluate the change of human status before and after an operation, and furthermore to predict the human motion according to the changed human status.
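The set notation for human status translates naturally into a data structure. Below is a minimal sketch, not the authors' implementation, of a physical status restricted to joint strengths, together with the ∆HS comparison used later as the difficulty index; the joint names and values are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class PhysicalStatus:
    """HS restricted to joint strengths: one S_i per joint, in Nm."""
    joint_strength: Dict[str, float] = field(default_factory=dict)

    def delta(self, other: "PhysicalStatus") -> Dict[str, float]:
        """Delta-HS = HS(t + dt) - HS(t), computed per joint."""
        return {j: other.joint_strength[j] - s
                for j, s in self.joint_strength.items()}

if __name__ == "__main__":
    before = PhysicalStatus({"elbow": 60.0, "shoulder": 85.0})
    after = PhysicalStatus({"elbow": 48.5, "shoulder": 79.0})
    print(before.delta(after))  # {'elbow': -11.5, 'shoulder': -6.0}
```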
Hence the framework is designed to evaluate the change of human sta- tus before and after an operation, and furthermore to predict the human motion according to the changed human status. The human is often simplified for posture control as a sensory-motor system in which there are enormous external sensors covering the human body and internal sensors in the human body capturing different signals, and the central ner- vous system (CNS) transfer the signals into decision making system (Cerebrum and Vertebral disc); the decision mak- ing system generates output commands to generate forces in muscles and then drives the motion and posture responding to the external stimulus. Normally, most of the external in- put information is directly measurable, such as temperature, external load, moisture, etc. However, how to achieve all the information for such a great number of sensors all over the human body is a challenging task. In addition, the internal perception of human body, which plays also an important role in motor sensor coordination, is much more difficult to be quantified. The most difficult issue is to know how the brain handles all the input and output signals while perform- ing a manual operation. In previous simulation tools, the ex- ternal input information has been already provided and han- dled. Visual feedback, audio feedback, and haptic feedback are often employed as input channel for a virtual human sim- 6 ulation. One limitation of the existing methods is that the in- ternal sensation is not considered enough. Physical fatigue is going to be modeled and integrated into the framework to predict the perceived strength reduction and the reactions of the human body to the fatigue, which provides a close-loop for the human simulation (Fig. 5). Fig. 5 Human status in human simulation tools The special contribution in this framework is that the reduction of the physical strength can be evaluated in the framework based on a muscle fatigue model. And then the changed physical strength is taken as a feedback to the vir- tual human simulation to update the simulation result. The framework performs mainly two functions: posture analysis and posture prediction (human simulation). The func- tion of posture prediction is to simulate the human motion based on the current virtual human status. Posture analysis focuses on assessing the difficulty of the manual operation. The difficulty of the work is assessed by the change of hu- man status before and after the operation ∆ HS = ∆ HSphysical . Physical fatigue is one of the physical aspects, and this as- pect is evaluated by the decrease of the strength in joints. The posture analysis function of the framework is our focus in this paper. More precisely in this paper, the joint strengths models are used to determine the initial joint moment capacity, and then the fatigue in the joints can be further determined by the external load in the static operation and the fatigue model in Section 4, and then the change of the physical status can be assessed. 4 Fatigue modeling and Fatigue analysis 4.1 Fatigue modeling A new dynamic fatigue model based on muscle motor unit recruitment principle was proposed in (Ma et al, 2009). This model was able to integrate task parameters (external load) and temporal parameters for predicting the fatigue of static manual handling operations in industry. Equation 1 is the original form of the fatigue model to describe the reduction of the capacity. The descriptions of the parameters for Eq. 1 are listed in Table 4. 
4 Fatigue modeling and fatigue analysis

4.1 Fatigue modeling

A new dynamic fatigue model based on the muscle motor unit recruitment principle was proposed in Ma et al (2009). This model is able to integrate task parameters (the external load) and temporal parameters for predicting the fatigue of static manual handling operations in industry. Equation 1 is the original form of the fatigue model and describes the reduction of the capacity; its parameters are described in Table 4. The detailed explanation of this model can be found in Ma et al (2009).

$$\frac{dF_{cem}(t)}{dt} = -k \, \frac{F_{cem}(t)}{MVC} \, F_{load}(t) \quad (1)$$

Table 4 Parameters in the dynamic fatigue model
- MVC [N]: maximum voluntary contraction, the maximum capacity of the muscle
- $F_{cem}(t)$ [N]: current exertable maximum force, the current capacity of the muscle
- $F_{load}(t)$ [N]: external load of the muscle, the force which the muscle needs to generate
- $k$ [min−1]: constant value, the fatigue ratio
- %MVC: percentage of the maximum voluntary contraction
- $f_{MVC}$: %MVC/100, i.e. $F_{load}(t)/MVC$

Maximum endurance time (MET) models can be used to predict the endurance time of a static operation. In static cases, $F_{load}(t)$ is constant in the fatigue model, and MET is then the duration in which $F_{cem}$ falls to $F_{load}$. Thus, MET can be determined from Eq. (2) and (3).

$$F_{cem}(t) = MVC \, e^{\int_0^t -k \frac{F_{load}(u)}{MVC}\, du} = F_{load}(t) \quad (2)$$

$$t = MET = \frac{-\ln\left(\frac{F_{load}(t)}{MVC}\right)}{k \, \frac{F_{load}(t)}{MVC}} = \frac{-\ln(f_{MVC})}{k \, f_{MVC}} \quad (3)$$

This model was validated in comparison with 24 MET models summarized in El ahrache et al (2006). The previous MET models were used to predict the maximum endurance time for static exertions, and they were all described as functions with $f_{MVC}$ as the only variable. High Pearson correlations and intraclass correlations (ICC) between the MET model in Eq. 3 and the previous MET models confirmed the applicability of our model for static cases. Meanwhile, the comparison between our model and a dynamic motor unit recruitment based model (Liu et al, 2002) suggested that our model is also suitable for modeling muscle fatigue in dynamic cases.

In Ma et al (2009), the fatigue ratio $k$ was assigned the value 1 min−1. However, the literature reports substantial variability of fatigue resistance in the population, and this variability results from several factors, such as age, occupation, gender, muscle group, etc. The parameter $k$ can capture these effects on the fatigue resistance globally. Therefore, it is necessary to determine the fatigue resistances of the different muscle groups to complete the muscle fatigue model.

4.2 Fatigue resistance based on MET models

Thanks to the highly linear relationship between our MET model and the previous MET models, it is proposed that each static MET model $f(x)$ can be described mathematically by a linear equation (Eq. 4). In Eq. 4, $x$ is used to replace $f_{MVC}$ and $p(x)$ represents Eq. 3. $m$ and $n$ are constants describing the linear relationship between a static model and our model, and they need to be determined by regression. Here, $m = 1/k$ indicates the fatigue resistance of the static model, and $k$ is the fatigue ratio, or fatigability, of the different static models.

$$f(x) = m \, p(x) + n \quad (4)$$

Due to the asymptotic tendencies of MET models, when $x \to 1$ ($\%MVC \to 100$), $f(x) \to 0$ and $p(x) \to 0$ ($MET \to 0$), we assume $n = 0$. Since some MET models are not suitable for $\%MVC \le 15\%$, the regression is carried out from $x = 0.16$ to $x = 0.99$. With a step length of 0.01, $N = 84$ MET values are calculated to determine the parameter $m$ of each MET model by minimizing the function in Eq. 5.

$$M(x) = \sum_{i=1}^{N} \left( f(x_i) - m \, p(x_i) \right)^2 = a\, m^2 + b\, m + c \quad (5)$$

From Eq. 5, $m$ can be calculated by Eq. 6.

$$m = \frac{-b}{2a} = \frac{\sum_{i=1}^{N} p(x_i) f(x_i)}{\sum_{i=1}^{N} p(x_i)^2} > 0 \quad (6)$$

The regression result represents the fatigue resistance of the muscle group.
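Equations (2), (3) and the regression of Eq. (6) are simple enough to be checked numerically. The following Python sketch is ours, not the authors' code; the function names are invented, and the lambda in the last lines is a made-up static model used only to verify that the regression recovers a known slope.

```python
import math

def f_cem(t_min, f_mvc, k=1.0):
    """Remaining capacity as a fraction of MVC after t_min minutes of a
    static exertion at relative load f_mvc (Eq. (2) with constant load)."""
    return math.exp(-k * f_mvc * t_min)

def met(f_mvc, k=1.0):
    """Maximum endurance time in minutes (Eq. (3))."""
    return -math.log(f_mvc) / (k * f_mvc)

def fatigue_resistance(static_met_model):
    """Least-squares slope m of Eq. (6): regress an empirical static MET
    model f(x) on the dynamic prediction p(x) over x = 0.16 ... 0.99."""
    xs = [0.16 + 0.01 * i for i in range(84)]        # N = 84 sample points
    p = [met(x) for x in xs]
    f = [static_met_model(x) for x in xs]
    return sum(pi * fi for pi, fi in zip(p, f)) / sum(pi * pi for pi in p)

print(round(met(0.2), 2))        # ≈ 8.05 min endurance at 20 %MVC with k = 1
print(round(f_cem(3.0, 0.2), 3)) # capacity fraction left after 3 min
print(round(fatigue_resistance(lambda x: 0.8 * met(x)), 3))  # recovers 0.8
```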
In comparison with 6 general MET models, 6 elbow models, 5 shoulder models, and 6 hip/back models, the muscle fatigue resistances of the corresponding muscle groups were calculated; they are listed in Table 5.

Table 5 Fatigue resistance m̄ for different muscle groups
Segment | m̄ | σm
General | 0.8135 | 0.2320
Shoulder | 0.7562 | 0.4347
Elbow | 0.8609 | 0.4079
Hip | 1.9701 | 1.1476

The mean value m̄ and the standard deviation σm can be used to adjust our MET model so that it covers the different MET models, and they can be further used to predict the fatigue resistance of a muscle group for a given population. The prediction with the mean value and its deviation for the general MET models is shown in Fig. 6. It is observable that the bold solid curve and the two slim solid curves cover most of the area formed by the previous empirical MET models.

Fig. 6 Prediction of MET with the dynamic MET model (mean and ±σ curves) in comparison with the general static models (Rohmert, Huijgens, Sato, Manenica, Sjogaard, Rose), plotted as endurance time [min] over fMVC

It should be noted that the fatigue resistance of the different muscle groups is regressed only on the empirical data grouped in the literature, and the results for the shoulder and hip/back muscle groups (Table 6) did not conform to the normal distribution. For the shoulder joint, the subjects in these models were not only from different occupations but also from different gender mixtures. Therefore, the fatigue resistance result can only provide a reference in this study.

Table 6 Fatigue resistances of shoulder MET models
Model | Subjects | m
Sato et al (1984) | 5 male | 0.427
Rohmert et al (1986) | 6 male and 1 female students | 0.545
Mathiassen and Ahsberg (1999) | 20 male and 20 female municipal employees | 0.698
Garg et al (2002) | 12 female college subjects | 1.393

4.3 Workflow for fatigue analysis

The general process of the posture analysis has been discussed in Section 3; the flowchart in Fig. 7 depicts the details of the processing of all the input information.

Fig. 7 Workflow for the fatigue evaluation

First, the human motion obtained either from human simulation or from a motion capture system is further processed into displacements q, speeds q̇, and accelerations q̈ in generalized coordinates. The external forces and torques on the human body are either measured directly by force measurement instruments or estimated in the simulation. The external loads are transformed to Γi and Fi in the coordinates attached to qi in the modified Denavit-Hartenberg (DH) method.

Human motion and interaction (forces, torques) are mapped onto the digital human model, which is geometrically and dynamically modeled from an anthropometric database and a biomechanical database. Inverse dynamics is used to calculate the torque and force at each generalized joint. Going further, the effort of each individual muscle can be determined using optimization methods as well.

Once the loads of the joints are determined, the fatigue of each joint can be analyzed using the fatigue model. The reduction of the physical strength can be evaluated, and finally the difficulty of the operation can be estimated from the change of the physical strengths.

5 Analysis Results for EADS Application Cases - Drilling

5.1 Operation description

The application case is the assembly of two fuselage sections with rivets from the assembly line of an airplane at the European Aeronautic Defence & Space (EADS) Company. One part of the job consists of drilling holes all around the cross section. The task is to drill holes around the fuselage circumference.
The number of holes can be up to 2,000 on an orbital fuselage junction of an airplane. The drilling machine weighs around 5 kg, and even up to 7 kg in the worst condition when the weight of the pipe is taken into account. The drilling force applied to the drilling machine is around 49 N. In general, it takes 30 seconds to finish a hole. The drilling operation is illustrated in Fig. 8. Fatigue often occurs in the shoulder, the elbow, and the lower back because of the heavy load. Only the upper limb is taken into consideration in this demonstration case, to reduce the complexity of the analysis.

Fig. 8 Drilling case in CATIA

5.2 Endurance time prediction

The drilling machine with a weight of 5 kg is taken to calculate the maximum endurance time under a static posture with a shoulder flexion of 30° and an elbow flexion of 90° for maintaining the operation in a continuous way. The weight of the drilling machine is divided by two in order to simplify the load sharing problem. The endurance results are shown in Table 7 for the population falling within the 95% strength distribution. It is found that the limitation of the work is determined by the shoulder, since the endurance time of the shoulder joint is much shorter than that of the elbow joint.

The difference in endurance results has two origins. One is the external load relative to the joint strength; the second comes from the difference in fatigue resistance across the population. These differences are graphically presented in Fig. 9 to Fig. 12. Figure 9 and Figure 10 show the variable endurance caused by the joint strength distribution in the adult male population with the mean fatigue resistance: a larger strength results in a longer endurance time for the same external load. Figure 11 and Figure 12 present the endurance time for the population with the average joint strength but different fatigue resistances, and they show that a larger fatigue resistance leads to a longer endurance time. Combining the strength distribution and the fatigue resistance variance, the MET can be estimated for the whole population.

Table 7 Maximum endurance time of the shoulder and elbow joints for the drilling work
MET [sec] | S − 2σ | S − σ | S | S + σ | S + 2σ
Shoulder, m̄ − σm | 19.34 | 45.05 | 75.226 | 108.81 | 145.16
Shoulder, m̄ | 45.489 | 105.96 | 176.94 | 255.94 | 341.44
Shoulder, m̄ + σm | 71.639 | 166.87 | 278.65 | 403.07 | 537.71
Elbow, m̄ − σm | 230.61 | 424.27 | 640.47 | 873.52 | 1120.1
Elbow, m̄ | 438.27 | 806.3 | 1217.2 | 1660.1 | 2128.6
Elbow, m̄ + σm | 645.92 | 1188.3 | 1793.9 | 2446.6 | 3137.2

Fig. 9 Endurance time prediction for the shoulder with average fatigue resistance (geometric configuration αs = 30°, αe = 90°, mass of the drilling machine 2.5 kg; curves for joint strengths Sj − 2σj to Sj + 2σj against the load Γj)
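A Table-7-like MET grid can be reproduced, in structure if not in the exact numbers, by evaluating Eq. (3) at joint level over a grid of strength and fatigue-resistance percentiles. In the sketch below, the shoulder moment capacity and the static load torque are placeholder values we assume for illustration (the paper's anthropometric inputs are not restated here); only the fatigue resistances come from Table 5.

```python
import math

def met_seconds(torque_load, torque_capacity, m):
    """Joint-level MET [s] from Eq. (3), with fatigue resistance m = 1/k."""
    f = torque_load / torque_capacity
    return 60.0 * m * (-math.log(f)) / f

MEAN_S, SIGMA_S = 110.0, 25.0      # assumed shoulder moment capacity [N·m]
LOAD = 20.0                        # assumed static shoulder moment [N·m]
M_MEAN, M_SIGMA = 0.7562, 0.4347   # shoulder fatigue resistance (Table 5)

for n in (-2, -1, 0, 1, 2):        # strength percentiles S + n·σ
    capacity = MEAN_S + n * SIGMA_S
    row = [met_seconds(LOAD, capacity, m)
           for m in (M_MEAN - M_SIGMA, M_MEAN, M_MEAN + M_SIGMA)]
    print(f"S{n:+d}σ:", [round(v, 1) for v in row])
```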
5.3 Fatigue evaluation

The fatigue is evaluated by the change of the joint strength in a fatiguing operation. The working history influences the fatigue; therefore, the fatigue of drilling a hole is evaluated in a continuous working process of up to 6 holes. Only the population with the average strength and the average fatigue resistance is analyzed in the fatigue evaluation, in order to present the effect of the work history.

The reduced strength is normalized by dividing it by the maximum joint strength, as shown in Fig. 13. It takes 30 seconds to drill a hole, and the joint strength is calculated and normalized every 30 seconds until exhaustion of the shoulder joint.

Fig. 10 Endurance time prediction for the elbow with average fatigue resistance (αs = 30°, αe = 90°, mass of the drilling machine 2.5 kg)

Fig. 11 Endurance time for the population with average strength for the shoulder joint (fatigue resistances m̄ and m̄ ± σm against the load Γj)

In our current research, HS includes only the joint strength vector. The fatigue is evaluated as the change of the joint strength for drilling a hole. The result is shown in Table 8. Three measurements are given in this table: the normalized physical strength every 30 seconds, noted HSi/HSmax; the difference between the joint strength before and after finishing a hole, noted (HSi − HSi+1)/HSmax; and the difference between the joint strength and the maximum joint strength, noted (HSmax − HSi)/HSmax. In Table 8, only the reduction of the shoulder joint strength is presented, since the relative load in the elbow joint is much smaller.

Table 8 Normalized shoulder joint strength in the drilling operation (for m̄)
Time [s] | 0 | 30 | 60 | 90 | 120 | 150 | 180
HSi/HSmax | 100% | 82.2% | 67.2% | 54.9% | 44.8% | 36.6% | 30.1%
(HSi − HSi+1)/HSmax | 0% | 17.8% | 15.0% | 12.3% | 10.1% | 8.2% | 6.5%
(HSmax − HSi)/HSmax | 0% | 17.8% | 32.8% | 45.1% | 55.2% | 63.4% | 69.9%

Fig. 12 Endurance time for the population with average strength for the elbow joint (αs = 30°, αe = 90°, mass of the drilling machine 2.5 kg)

Fig. 13 Fatigue evaluation after drilling a hole in a continuous drilling process (normalized reduction of the shoulder flexion strength and normalized external load over time)

From Fig. 13 and Table 8, the joint strength keeps descending during the continuous work. The rate of the reduction gets smaller as the work progresses, due to the physiological change in the muscle fiber composition. More time spent working leads to more reduction in physical strength. The reduction relative to the maximum strength can be used to assess the difficulty of the operations.
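The per-hole bookkeeping of Table 8 follows directly from the exponential decay of Eq. (2). Below is a compact sketch; the product k·f_MVC is a value we chose so that the output roughly matches the tabulated percentages, since the exact joint load and capacity behind Table 8 are not restated here.

```python
import math

K_TIMES_F = 0.392          # k * f_MVC [1/min], fitted here to Table 8 (assumed)
HOLE_TIME_MIN = 0.5        # 30 s per hole

hs, prev = [], 1.0
for hole in range(7):      # status at 0, 30, ..., 180 s
    cur = math.exp(-K_TIMES_F * HOLE_TIME_MIN * hole)
    hs.append((round(cur * 100, 1),              # HS_i / HS_max
               round((prev - cur) * 100, 1),     # (HS_i - HS_i+1) / HS_max
               round((1 - cur) * 100, 1)))       # (HS_max - HS_i) / HS_max
    prev = cur
print(hs)  # starts at (100.0, 0.0, 0.0); per-hole losses then shrink steadily
```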
5.4 Experiment validation

Simulated drilling operations were tested under laboratory conditions at Tsinghua University. A total of 40 male industrial workers were asked to simulate the drilling work in a continuous operation for 180 seconds. Maximum output strengths were measured in the simulated operations at different periods of the operation. Fatigue was indexed by the reduction of the joint strength over time relative to the initial maximum joint strength. Three of the 40 subjects could not sustain the external load for the whole duration of 180 seconds, and 34 subjects had a shoulder joint fatigue resistance (mean = 1.32, SD = 0.62) greater than the average shoulder joint fatigue resistance in Table 5, which means that the sample population has a higher fatigue resistance than the population grouped in the regression.

The physical strength was measured as simulated-job static strengths, and the reduction over the operation varies from 32.0% to 71.1% (mean = 53.7%, SD = 9.1%). This reduction falls within the fatigue prediction of the theoretical method in Table 9 (mean = 51.7%, SD = 12.1%).

Table 9 Normalized torque strength reduction (HSmax − HS180)/HSmax for the population with higher fatigue resistance
Fatigue resistance | S − 2σ | S − σ | S | S + σ | S + 2σ
m̄ | - | - | 69.9% | 62.5% | 56.3%
m̄ + σm | - | 63.2% | 53.2% | 46.4% | 40.8%
m̄ + 2σm | 64.9% | 51.9% | 43.0% | 36.7% | 31.9%

5.5 Discussion

Under the proposed framework, the concept of the virtual human status is introduced and realized by a virtual human modeling and simulation tool. The virtual human is kinematically modeled based on the modeling methods of robotics, and inverse dynamics is used to determine the joint loads. With the integration of a general fatigue model, the physical fatigue in a manual handling operation at EADS is simulated and analyzed. The decrease in human joint strengths can be predicted by the theoretical approach, and it has been validated with experimental data.

Human status is introduced in this framework in order to generalize the discussion of human simulation. We concentrate only on the physical aspect of the virtual human, in particular on joint strengths. The physical status can be extended to other aspects, either measurable using instruments (e.g., heart rate, oxygen consumption, electromyography of muscles, etc.) or predictable using mathematical models (e.g., vision, strength, etc.). Similarly, the mental status of the human can be established in similar terms (e.g., mental capacity, mental workload, mental fatigue, etc.). Under the concept of human status, different aspects of the human can be aggregated to represent the virtual human completely. The change of human status caused by a physical or mental job can be measured or predicted to assess different aspects of the job. It should be noted that the definition of human status is still immature, and great effort is required to form, extend, and validate this concept.

The main difference between the fatigue analysis in our study and the previous methods for posture analysis is the following: in previous methods (Wood et al, 1997; Iridiastadi and Nussbaum, 2006; Roman-Liu et al, 2005), intermittent procedures were used to develop fatigue models with job-specific parameters; in contrast, all the related physical exposure factors are taken into consideration in a continuous approach in our model. In this way, the analysis of manual handling operations can be generalized without the limitations of job-specific parameters. Furthermore, the fatigue and recovery procedures can be decoupled to simplify the analysis in a continuous way.
Although only a specific application case is presented in this paper, the feasibility of the general concept has been verified by the introduction of human status and by the validation of the fatigue model.

It should be noted that the recovery of the physical strength has not been considered yet. Although there are several work-rest allowance models in the literature, substantial variability was found among their prediction results for industrial operations (El ahrache and Imbeau, 2009), and the development of a general recovery model is still ongoing.

6 Conclusions

In this study, human status is introduced into the work evaluation system, especially the physical status. It provides a global definition under which different aspects of human abilities can be integrated and assessed simultaneously. The effect of the work on the human status, either positive or negative, can be measured by the change of the human status before and after the operation. We concentrate our study on physical aspects, especially on joint moment strengths. The physical fatigue analysis of a drilling case under the work evaluation framework demonstrates the workflow and the functions of the virtual human simulation. The change of joint moment strength, a specific aspect of the human physical status, has been simulated based on a general fatigue model with fatigue resistances. The similarity between the analysis results and the experimental data suggests that the framework may be useful for assessing the physical status in continuous static operations.

The new concept of human status and the theoretical method for assessing the physical status may provide a new approach to generalize virtual human simulation and to evaluate the physical aspects of continuous static manual handling operations. This approach is useful for assessing the physical load to protect industrial workers from MSD risks, and it can also be used to assess mental load once the mental status is extended accordingly. However, it should be noted that great effort still has to be made to extend the different aspects of human status and make it more precise. Even for physical fatigue alone, it is still necessary to develop a recovery model to complete the fatigue prediction.

Acknowledgments This research was supported by EADS and by the Région des Pays de la Loire (France) in the context of the collaboration between the École Centrale de Nantes (Nantes, France) and Tsinghua University (Beijing, PR China).

References

Anderson D, Madigan M, Nussbaum M (2007) Maximum voluntary joint torque as a function of joint angle and angular velocity: Model development and application to the lower limb. Journal of Biomechanics 40(14):3105–3113
Badler NI, Phillips GB, Webber BL (1993) Simulating humans: computer graphics animation and control. Oxford University Press, USA
Bubb H, Engstler F, Fritzsche F, Mergl C, Sabbah O, Schaefer P, Zacher I (2006) The development of RAMSIS in past and future as an example for the cooperation between industry and university. International Journal of Human Factors Modelling and Simulation 1(1):140–157
Chaffin D (2002) On simulating human reach motions for ergonomics analyses. Human Factors and Ergonomics in Manufacturing 12(3):235–247
Chaffin D (2007) Human motion simulation for vehicle and workplace design. Human Factors and Ergonomics in Manufacturing 17(5):475
Chaffin DB, Andersson GBJ, Martin BJ (1999) Occupational biomechanics, 3rd edn. Wiley-Interscience
Chedmail P, Chablat D, Roy CL (2003) A distributed approach for access and visibility task with a manikin and a robot in a virtual reality environment. IEEE Transactions on Industrial Electronics 50(4):692–698
Chen Y (2000) Changes in lifting dynamics after localized arm fatigue. International Journal of Industrial Ergonomics 25(6):611–619
Damsgaard M, Rasmussen J, Christensen S, Surma E, de Zee M (2006) Analysis of musculoskeletal systems in the AnyBody Modeling System. Simulation Modelling Practice and Theory 14(8):1100–1111
Ding J, Wexler AS, Binder-Macleod SA (2000) A predictive model of fatigue in human skeletal muscles. Journal of Applied Physiology 89:1322–1332
El ahrache K, Imbeau D (2009) Comparison of rest allowance models for static muscular work. International Journal of Industrial Ergonomics 39(1):73–80
El ahrache K, Imbeau D, Farbos B (2006) Percentile values for determining maximum endurance times for static muscular work. International Journal of Industrial Ergonomics 36(2):99–108
Fuller J, Lomond K, Fung J, Côté J (2008) Posture-movement changes following repetitive motion-induced shoulder muscle fatigue. Journal of Electromyography and Kinesiology: official journal of the International Society of Electrophysiological Kinesiology
Garg A, Hegmann K, Schwoerer B, Kapellusch J (2002) The effect of maximum voluntary contraction on endurance times for the shoulder girdle. International Journal of Industrial Ergonomics 30(2):103–113
Giat Y, Mizrahi J, Levy M (1993) A musculotendon model of the fatigue profiles of paralyzed quadriceps muscle under FES. IEEE Transactions on Biomechanical Engineering 40(7):664–674
Hill A (1938) The heat of shortening and the dynamic constants of muscle. Proceedings of the Royal Society of London, Series B, Biological Sciences 126(843):136–195
Hou H, Sun S, Pan Y (2007) Research on virtual human in ergonomic simulation. Computers & Industrial Engineering 53(2):350–356
Iridiastadi H, Nussbaum M (2006) Muscle fatigue and endurance during repetitive intermittent static efforts: development of prediction models. Ergonomics 49(4):344–360
Khalil W, Dombre E (2002) Modelling, identification and control of robots. Hermes Science Publications
Komura T, Shinagawa Y, Kunii TL (2000) Creating and retargetting motion by the musculoskeletal human. The Visual Computer 16(5):254–270
Li G, Buckle P (1999) Current techniques for assessing physical exposure to work-related musculoskeletal risks, with emphasis on posture-based methods. Ergonomics 42(5):674–695
Liu J, Brown R, Yue G (2002) A dynamical model of muscle activation, fatigue, and recovery. Biophysical Journal 82(5):2344–2359
Ma L, Chablat D, Bennis F, Zhang W (2009) A new simple dynamic muscle fatigue model and its validation. International Journal of Industrial Ergonomics 39(1):211–220
Mathiassen S, Ahsberg E (1999) Prediction of shoulder flexion endurance from personal factor. International Journal of Industrial Ergonomics 24(3):315–329
Rohmert W, Wangenheim M, Mainzer J, Zipp P, Lesser W (1986) A study stressing the need for a static postural force model for work analysis. Ergonomics 29(10):1235–1249
Roman-Liu D, Tokarski T, Kowalewski R (2005) Decrease of force capabilities as an index of upper limb fatigue. Ergonomics 48(8):930–948
Sato H, Ohashi J, Iwanaga K, Yoshitake R, Shimada K (1984) Endurance time and fatigue in static contractions. Journal of Human Ergology 13(2):147–154
Tilley AR, Dreyfuss H (2002) The measure of man & woman: Human factors in design. Revised edition. New York: John Wiley & Sons, Inc.
Vignes RM (2004) Modeling muscle fatigue in digital humans. Master's thesis, Graduate College of The University of Iowa
VSR Research Group (2004) Technical report for project virtual soldier research. Tech. rep., Center for Computer-Aided Design, The University of Iowa
Wood D, Fisher D, Andres R (1997) Minimizing fatigue during repetitive jobs: optimal work-rest schedules. Human Factors: The Journal of the Human Factors and Ergonomics Society 39(1):83–101
Xia T, Frey Law L (2008) A theoretical approach for modeling peripheral muscle fatigue and recovery. Journal of Biomechanics 41(14):3046–3052
Yang J, Pitarch E, Kim J, Abdel-Malek K (2006) Posture prediction and force/torque analysis for human hands. In: Proc. of SAE Digital Human Modelling for Design and Engineering Conference, p 2326
Yang J, Sinokrot T, Abdel-Malek K (2008) A general analytic approach for Santos upper extremity workspace. Computers & Industrial Engineering 54(2):242–258

work_ci4peodnbbf4lgydwra4jxwwua ---- CLARIN-IT: State of Affairs, Challenges and Opportunities

Lionel Nicolas (Eurac Research, Bolzano, Italy) lionel.nicolas@eurac.edu
Alexander König (Eurac Research, Bolzano, Italy) alexander.koenig@eurac.edu
Monica Monachini (ILC "A. Zampolli" CNR, Pisa, Italy) monica.monachini@ilc.cnr.it
Riccardo Del Gratta (ILC "A. Zampolli" CNR, Pisa, Italy) riccardo.delgratta@ilc.cnr.it
Silvia Calamai (Università di Siena, Italy) silvia.calamai@unisi.it
Andrea Abel (Eurac Research, Bolzano, Italy) andrea.abel@eurac.edu
Alessandro Enea (ILC "A. Zampolli" CNR, Pisa, Italy) alessandro.enea@ilc.cnr.it
Francesca Biliotti (Università di Siena, Italy) francesca.biliotti@unisi.it
Valeria Quochi (ILC "A. Zampolli" CNR, Pisa, Italy) valeria.quochi@ilc.cnr.it
Francesco Vincenzo Stella (Università di Siena, Italy) francesco.stella@unisi.it

Abstract

This paper gives an overview of the Italian national CLARIN consortium as it currently stands, two years after its creation at the end of 2015. It discusses the current state of affairs of the consortium on several aspects, especially with regard to its members. It also discusses the events and initiatives that have been undertaken, as well as those planned for the near future. It finally outlines the conclusions of a user survey performed to understand the expectations of a targeted user population and provides indications regarding the next steps planned.

1 Introduction

Among the research fields of interest for the CLARIN initiative as a whole, several have a long history of research efforts performed in Italy over the past decades and have identifiable associations organizing recurrent Italian events. For example, for Computational Linguistics and Language Technology Applications - particularly for Italian - there is the Associazione Italiana di Linguistica Computazionale1 (AILC) that organizes, among other events, the yearly celebrated conference CLIC-IT2 and the periodically held evaluation campaign of Natural Language Processing - Evalita3; for Speech Sciences there is the Associazione Italiana di Scienze della Voce4 (AISV), that also organizes a yearly celebrated conference, together with the Franco Ferrero prize; and for Digital Humanities there is the Associazione per l'Informatica Umanistica e le Culture Digitali5 (AIUCD) that also organizes, among other events, a yearly celebrated AIUCD conference6.
This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/
1 http://www.ai-lc.it/
2 http://www.ai-lc.it/en/initiatives/clic-it/
3 http://www.ai-lc.it/en/initiatives/evalita
4 https://www.aisv.it/
5 http://www.aiucd.it

Accordingly, it is only natural that ever since CLARIN started in 2008 with a preparatory phase, it has always been of great interest for several Italian institutions7. When CLARIN ERIC was established in 2012 after the end of the preparatory phase in 2011, several efforts were made to create a national consortium. In October 2015, the Ministero dell'Istruzione, dell'Università e della Ricerca (MIUR) signed the Memorandum of Understanding to become a full member, and Italy finally joined CLARIN ERIC, with the Department of Social Sciences and Humanities (DSU) of the Consiglio Nazionale delle Ricerche appointed as Representing Entity, the Istituto di Linguistica Computazionale (ILC) nominated as leading Italian participant, and Monica Monachini nominated as National Coordinator.

This paper aims at providing a clear overview of the Italian national CLARIN consortium as it currently stands, two years after its creation. Section 2 discusses the current state of affairs of the consortium, be it in terms of members, funding, technical infrastructure or role within the CLARIN federation. Section 3 then provides information on the CLARIN-IT members, especially with regard to what they offer to CLARIN in terms of resources, services and expertise, and what CLARIN offers them to further their own research. Section 4 discusses the CLARIN-IT events organized in Italy so far and the events planned in the near future, especially with regard to the organization of the CLARIN 2018 conference. Finally, Section 5 outlines the conclusions of a user survey performed to understand the expectations of a targeted Italian user population, while Section 6 provides indications regarding the next steps planned for the consortium as a whole and for each member individually.

2 Current State of Affairs

2.1 Members

As it stands at the moment, the CLARIN-IT consortium includes four institutions as full members, with two other institutions in the process of formally joining it. The four current full members are:

1. the Istituto di Linguistica Computazionale "A. Zampolli" (ILC) of the Consiglio Nazionale delle Ricerche in Pisa,
2. the Institute for Applied Linguistics (IAL) of Eurac Research in Bolzano,
3. the Dipartimento di Scienze della Formazione, Scienze Umane e della Comunicazione Interculturale (DSFUCI) of the Università di Siena,
4. the Dipartimento di Filologia e critica delle Letterature antiche e moderne (DFCLAM) of the Università di Siena.

The two other institutions in the process of formally joining the consortium are the Dipartimento di Discipline Umanistiche of the Università di Parma and the Dipartimento di Studi Umanistici of the Università "Ca' Foscari" in Venezia. Aside from these six institutions, a noticeable number of other Italian institutions from a wide range of disciplines have expressed their interest in participating.
Among those, we can cite the Fondazione Bruno Kessler (Trento), the Università Cattolica del Sacro Cuore (Milano), the Università "Tor Vergata" (Roma), and the Università di Pisa, Dipartimento di Linguistica (Pisa).
6 http://www.aiucd.it/convegno-annuale/
7 One of them, the Institute for Computational Linguistics "Antonio Zampolli" (ILC) of the Italian National Research Council, was already a member of the consortium that carried out the preparatory phase under the FP7-INFRASTRUCTURES EC programme (GA:2007-212230).

2.2 Funding

One reason for the late arrival and the limited number of members (when compared to other CLARIN consortia) is that negotiations regarding an Italian national funding of the CLARIN-IT consortium with the Research Ministry are still ongoing. Consequently, while other institutions have put their membership on hold until a viable context for their participation can be arranged, the current members have either sought funding for personnel at the regional or local level, have committed some of their own internal resources, or are contributing on a purely voluntary basis.

2.3 CLARIN-IT committees

Following the best practices implemented within CLARIN ERIC and the CLARIN federation, CLARIN-IT has established the following committees: the technical committee, the metadata and standards committee, the legal issues committee, and the committee for the relations with users.

The technical committee coordinates all CLARIN-IT type C and B centres and ensures the smooth functioning of all the technical services. It will be responsible for ensuring conformance to the CLARIN ERIC technical requirements and for the prompt uptake of technological upgrades and new solutions developed and/or suggested by CLARIN ERIC. It also advises the National Coordinator on any critical issue regarding the quality of the services provided and on the possible measures to take.

The metadata and standards committee is responsible for the adoption of the metadata and data format standards supported by CLARIN ERIC. As such, it selects and disseminates the existing supported standards relevant for its user communities, helps adapting the standards to the specific needs of the users and members, and contributes to the definition of metadata and concepts in the CLARIN Concept Registry, when needed. It also gives advice to the National Coordinator in matters of standards.

The legal issues committee deals with IPR and privacy protection issues. Its main task is to revise and adapt the policies and licenses devised and recommended by CLARIN ERIC to the needs of CLARIN-IT. The committee also helps and advises members on IPR-critical matters about specific data resources, with the aim of maximising research data sharing within the CLARIN community.

The committee for the relations with users discusses and coordinates the national activities towards an active engagement of user communities. Its main responsibilities are to adapt and implement the guidelines and best practices promoted by CLARIN ERIC within CLARIN-IT, discuss new ideas for involving new users and research communities, receive feedback from the users, and disseminate information about services, resources, projects and all relevant CLARIN-like activities in Italy and beyond.
In addition to those, CLARIN-IT is also creating an advisory board that will provide strategic advice on various matters such as, among others, quality, new initiatives, or synergies with related international and national infrastructures and projects. The advisory board will be formed by high-profile scholars who are not directly involved in CLARIN-IT activities, but who have access to relevant networks in the Social Sciences and Humanities (SSH).

2.4 Technical infrastructure

As regards the CLARIN centres and resources made available, the ILC currently hosts ILC4CLARIN, a CLARIN type C centre, and is in the active process of achieving a CLARIN B certification8. ILC4CLARIN is the first CLARIN-IT technological node that links the Italian SSH community to the EU-wide CLARIN communities; it has set up a CLARIN DSpace repository which will soon offer deposit services to the Italian community. The IAL has successfully created its own CLARIN DSpace repository and is progressively making its resources available on it. It is also aiming at achieving CLARIN C status as soon as possible. The DSFUCI and DFCLAM are actively working on making their resources available via the ILC4CLARIN repository.

From a different technical perspective, the CLARIN-IT consortium closely cooperates with the Consortium GARR, the Italian University and Research Network, in particular with the IDEM-GARR9 office that supports federated authentication in CLARIN. Thus, any member or participant of the IDEM-GARR federation already has access to services hosted at any CLARIN centre in Europe via their institutional credentials. The CLARIN-IT consortium is also in contact with the CLOUD-GARR10 office so as to allow members to safely and securely deposit data in the cloud.
8 https://www.clarin.eu/content/centre-requirements-revised-version
9 https://www.idem.garr.it/en
10 https://cloud.garr.it
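To illustrate how such DSpace repositories plug into the wider CLARIN infrastructure, the sketch below harvests metadata records over OAI-PMH, the protocol through which CLARIN repositories typically expose their catalogues. The endpoint URL and the oai_dc metadata prefix are assumptions based on common DSpace defaults, not details stated in this paper; CLARIN DSpace installations commonly also expose a CMDI metadata format.

```python
# Minimal OAI-PMH harvesting sketch against a hypothetical repository URL.
import urllib.request
import xml.etree.ElementTree as ET

BASE = "https://clarin.eurac.edu/repository/oai/request"  # assumed endpoint
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

url = BASE + "?verb=ListRecords&metadataPrefix=oai_dc"
with urllib.request.urlopen(url) as response:
    tree = ET.parse(response)

# Print the title of every harvested record.
for record in tree.iter(OAI + "record"):
    title = record.find(".//" + DC + "title")
    if title is not None:
        print(title.text)
```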
Also, a common added value brought to all CLARIN-IT members comes from the opportunities in terms of sustainability, be it through the CLARIN-supported standards and tools or through the interaction with expert fellow stakeholders. Mo- re specifically, we can outline the following synergies between the overall CLARIN initiative and the CLARIN-IT centres. 3.1 Synergies between the ILC and CLARIN 3.1.1 The ILC in few words The Institute for Computational Linguistics ”A. Zampolli” is a reference centre in the field of Computa- tional Linguistics at both national and international levels. Its various research lines (Digital Humanities, Representation Standards, Distributed Research Infrastructures and Knowledge Management) makes the ILC a unique institution. The Institute is part of the Department of Social Science and Humanities, Cultu- ral Heritage (DSU) of the Consiglio Nazionale delle Ricerche (CNR). It was already an active participant in the CLARIN preparatory phase. 3.1.2 The ILC as an asset for CLARIN ILC has for many years been active in the field of language resources and technologies for natural lan- guage processing. The group of Language Resource and Infrastructures12 has been paying attention to the development of digital resources (corpora, computational lexicons) for Italian and English and is now creating new lexical resources for Greek and Latin according to the Linked Open Data (LOD) paradigm. ILC recognizes, indeed, that there is still a lack of lexical resources dealing with ’historical’ langua- ges, such as ancient Greek, Latin or Sanskrit, and this can be seen as a missed opportunity for the DH community. ILC is thus making available legacy, digitalized, print resources as LOD, as well as creating new resources by linking existing ones and distributing them with standard methods such as SPARQL end points and/or HTML browsing. ILC is an active member and covers leading roles within the ISO Committee TC/37 SC4, as well as in the W3C OntoLex working group, thus facilitating both the liaison and the coordination between CLARIN ERIC and the ISO Standard Committees. ILC is also involved in developing methods and digital technology for preservation of textual archives. Experts are dealing with text encoding and mark-up to provide the scientific community with digital data access, exchange and research on textual heritage of the literature held by ILC. ILC has set up a CLARIN C-Centre (ai- ming for type B certification in 2018), ILC4CLARIN13, along with a CLARIN DSpace repository, where the above-mentioned language resources are deposited and/or described according to the CMDI model 11http://oralhistory.eu/workshops 12http://lari.ilc.cnr.it 13https://ilc4clarin.ilc.cnr.it/en/ Selected papers from the CLARIN Annual Conference 2017, Budapest, 18–20 September 2017. Conference Proceedings published by Linköping University Electronic Press at www.ep.liu.se/ecp/contents.asp?issue=147. © The Author(s). 4 (Broeder et al., 2012), which make them also visible and retrievable in the CLARIN Virtual Language Observatory14 (VLO) (Van Uytvanck et al., 2010; Goosen and Eckart, 2014). Along with the digital resources made available through the repository, ILC4CLARIN provides a set of linguistic services15 such as systems for querying text corpora, natural language analysis and annotation tools, tools for extraction and acquisition of linguistic information, format converters and tools for lexicon creation or manipulation. 
Many of these tools are offered in the form of webservices; some of them are already available in Weblicht16 (Hinrichs et al., 2010) and the Language Resource Switchboard17 (Zinn, 2016) or are currently being integrated there (see Section 6 for the next steps). Through its repository, ILC4CLARIN also makes available both web applications and lexical resour- ces for Latin and Greek. The web application for lemmatizing short Latin texts18 offers also a REST web service19 which outputs the results of the lemmatization process in JSON; a search interface is available for browsing several wordnets in different languages including Italian, Ancient Greek, Latin, Croatian, and Arabic.20 Together with these web applications, a revised portion of the Ancient Greek WordNet is also available.21 3.1.3 CLARIN as an asset for the ILC Participating in CLARIN provides a number of opportunities in terms of sustainability, preservation, persistent identification, and visibility for the ILC’s research outputs. Sustainability is a key aspect for the ILC’s strategy as it kept on growing and conducting research over the years; preservation and persi- stent identification of research data and results is fundamental as well, since they provide to users and researchers the technologies to retrieve data and replicate experiments. CLARIN offers ILC frameworks and platforms where to promote and support the use of technology and text analysis tools. For example, Weblicht22 allows to combine web services so as to handle and exploit textual data. Finally, the VLO makes the resources produced and described in the ILC centre available to a wider audience in the DH community while the CMDI model ensures a high quality in terms of metadata. 3.2 Synergies between the IAL and CLARIN 3.2.1 The IAL in few words The Institute for Applied Linguistics23 (IAL) is part of Eurac Research, a private non-profit research centre located in Bolzano and composed of several research groups focussing their efforts on research subjects of particular importance for the South Tyrolean region where it is situated. The IAL in particu- lar aims at addressing current issues of language and education policy as well as economic and social questions at the local and international level. It is an international research environment where around 25 Junior and Senior Researchers with heterogeneous backgrounds are performing research on a wide range of language-related subjects. 3.2.2 The IAL as an asset for CLARIN With a majority of its workforce dedicated to linguistics-related or terminology-related research que- stions, the IAL is an active figure in several research fields and an active producer of manually crafted and curated high-quality datasets. 14https://vlo.clarin.eu 15https://ilc4clarin.ilc.cnr.it/services/ 16https://weblicht.sfs.uni-tuebingen.de/weblichtwiki/index.php/Main\_Page 17In particular, the Italian tokenizer whose REST APIs are described at http://ilc4clarin.ilc.cnr.it/ services/ltfw/readme, while the CMDI file used by Weblicht is available from http://hdl.handle.net/20. 500.11752/ILC-85@format=cmdi. 18http://hdl.handle.net/20.500.11752/ILC-59 19http://cophilab.ilc.cnr.it:8080/LatMorphWebApp/services/complete/ 20http://hdl.handle.net/20.500.11752/ILC-55 21http://hdl.handle.net/20.500.11752/ILC-56. Such format is compatible with the Global Wordnet initiative http://globalwordnet.org/wordnets-in-the-world/. 
22https://weblicht.sfs.uni-tuebingen.de 23http://www.eurac.edu/en/research/autonomies/commul/Pages/ Selected papers from the CLARIN Annual Conference 2017, Budapest, 18–20 September 2017. Conference Proceedings published by Linköping University Electronic Press at www.ep.liu.se/ecp/contents.asp?issue=147. © The Author(s). 5 As regards linguistics-related questions, the IAL is a known figure in the research fields of Learner Corpora, Didactics and E-lexicography. Among the initiatives undertaken for the field of Learner Corpo- ra, the IAL has created or participated in the creation of several Learner Corpora such as Kolipsi (Abel et al., 2012), KoKo (Abel et al., 2014), and Merlin (Wisniewski et al., 2013). It also has organized in 2017 the 4th Learner Corpus Research Conference24. As regards to Didactics, the IAL has both strong connec- tions with schools and policy makers in and outside South Tyrol and organizes a number of workshops25 and training courses for teachers and pupils (Engel and Colombo, 2018). It also organized the interna- tional conference on language competences ”Sprachkompetenzen erheben, beschreiben und fördern im Kontext von Schule und Mehrsprachigkeit”26 celebrated in Bolzano in 2017. Finally (with regards to linguistics-related questions), the IAL is an active member of the COST Action ”European Network for e-Lexicography” (ENeL), is currently leading the European Association for Lexicography (EURALEX) and has organized in 2014 the 16th edition of the EURALEX International Congress27. As regards terminology-related questions, the IAL is active in the field of Legal Terminology, for which it has produced and made available several terminological datasets such as the LexALP and Bistro Information Systems (Chiocchetti et al., 2013; Lyding et al., 2006; Streiter et al., 2004). The IAL is also part of the ISO Committee TC/37 for ”Terminology and other language and content resources”, is an active member of the RaDT28, is part of the beta-test group for the SDL Multiterm and Trados Studio29 and is acting on regular occasions and through several local projects as terminological consultant for the local South Tyrolean government. With the rest of its workforce providing assistance on automatic processing for their colleagues, the IAL has also become over time an active figure in the domain of Language Technologies, especially with regards to the automatic processing of the South Tyrolean German Dialect. Among the efforts undertaken for this field, the IAL has developed expertise for non-standard written communication such as computer- mediated communication (CMC), with a special focus on social media, and webcorpora. In that research context, it has released the CMC corpus Didi (Frey et al., 2015) and the Webcorpus Paisa (Lyding et al., 2014). It also has organized in 2017 the 5th Conference on CMC and Social Media Corpora for the Humanities30 and is an active contributor in a TEI Special Interest Group (TEI-CMC-SIG). Finally, as re- gards Language Technologies, the IAL also started a very CLARIN-alike local project named DI-ÖSS31 which aims at establishing a local digital infrastructure among the South Tyrolean language stakeholders allowing them to benefit from each others’ expertise and services. Except for specific cases, the IAL intends to integrate as many resources as possible into its CLARIN DSpace repository32. 
Because of its diversity in terms of research subjects and member profiles, the IAL relies on a varied set of workflows and can accordingly be an asset by providing a range of expertise of interest to a larger scope of stakeholders. Therefore, it also intends to be involved in several CLARIN initiatives and committees33. 3.2.3 CLARIN as an asset for the IAL The main added value from the IAL’s participation to CLARIN is the number of opportunities it offers in terms of sustainability, an aspect that became key in the IAL’s strategy as it kept on growing over the years. In that aspect, an initiative such as CLARIN DSpace greatly benefits the IAL which could not hope to develop such an advanced solution on its own. 24http://lcr2017.eurac.edu/ 25Up to now, more than 3000 pupils (aged 8 to 18) took part in the offered didactic activities. 26”Describe, nurture, and improve language competencies in the context of school and multilingualism”. 27http://euralex2014.eurac.edu 28”Rat für Deutschsprachige Terminologie” (an expert panel including large institutions such as the UNESCO). 29Leading professional solutions in language and content management services with a focus on terminology and translation. 30https://cmc-corpora2017.eurac.edu/ 31”Digitale Infrastruktur für das Ökosystem Südtiroler Sprachdaten und -dienste” (Digital infrastructure for the South Tyrolean ecosystem of language data and services). 32https://clarin.eurac.edu/ 33Members of IAL already participate actively in the CMDI taskforce and the CLARIN DSpace initiative. Selected papers from the CLARIN Annual Conference 2017, Budapest, 18–20 September 2017. Conference Proceedings published by Linköping University Electronic Press at www.ep.liu.se/ecp/contents.asp?issue=147. © The Author(s). 6 In a different but similar logic, as outlined earlier, the research profile of the IAL is rather varied and as such the IAL lacks often enough the tools (or uses suboptimal ones) to pursue some research oppor- tunities, as it cannot afford developing and maintaining new ones. However, CLARIN as a whole is even more varied in terms of research profiles and a number of CLARIN-related initiatives, targeted at first to the needs of other institutions, directly address needs of the IAL. A good example is the Language Re- source Switchboard which allows non-expert stakeholders to seamlessly use advanced natural language processing tools and can thus allow linguists and terminologists at the IAL to test and develop indepen- dently their own research ideas, while relying on their colleagues’ expertise in language technologies for the later stages (e.g. for the fine tuning of the automatic tools). In that perspective, such technologies, despite having been developed independently of the IAL, directly tackle one of its needs34. Finally, CLARIN represents a great asset for the IAL in terms of visibility and dissemination. Indeed, because the IAL is an active producer of high-quality datasets, being able to reference such datasets on international catalogues such as the VLO is particularly interesting. 3.3 Synergies between the DSFUCI and CLARIN 3.3.1 The DSFUCI in few words The Dipartimento delle Scienze della Formazione, Scienze umane e della Comunicazione interculturale (DSFUCI) is one of the 15 Departments of the Università di Siena and is located in Arezzo. 
The Arezzo campus brings together a community of scholars with a range of methodological approaches and research interests in various areas of education, languages, the humanities, and the social sciences. The Depart- ment coordinates and promotes theoretical and applied research projects aimed in particular at improving and changing life and work styles; strengthening cultural, linguistic and professional skills of adults and professionals; studying the role of languages, technologies and the media in today’s world and in the historical development of groups and social communities; providing services to public and private or- ganizations, administrations and professional associations in the realm of human resources development (educators, language experts, school teachers, cultural managers, trainers and middle management). 3.3.2 The DSFUCI as an asset for CLARIN The DSFUCI carried out together with the Scuola Normale Superiore di Pisa (Pier Marco Bertinetto, p.i.) the Grammo-foni (Gra.fo) project (Calamai and Frontini, 2016), a co-founded project35 devoted to the building of a digitization and cataloguing system with the aim of creating a regional network for the management of speech and oral archives of the past (Calamai et al., 2013). The preservation of analogue archives, that have so far remained unknown to the large public, entailed their detection as a first step, and then the digitisation (including restoration, when necessary) and cataloguing of the recordings contained in them. The oral documents preserved are disseminated via a web portal36 that allows registered users to access the audio files and the corresponding cataloguing records, together with the relative transcriptions and accompanying material (when available). A subsequent project, Voci da ascoltare37, was devoted to the dissemination of oral archives to high school students and also to the building of cultural trail via Mobile APPs (Pozzebon and Calamai, 2015; Pozzebon et al., 2016). Therefore, with respect to the speech and oral archives domain, the participation of DSFUCI and the Gra.fo archive in CLARIN would give several advantages. With over 3,000 hours of digitized recordings and the incredibly vast range of type of documents and topics covered, the Gra.fo archive is a unique and exemplary accomplishment in the Italian panorama. Having preserved such a significant collection of oral documents, Gra.fo not only constitutes a precious repository of Tuscan memory and provides a first-hand documentation of Tuscan language varieties from the early 1960s, but also represents a model for other research groups or institutions dealing with oral archives. Gra.fo covered the entire workflow with respect to the managing of oral archives: from digitization to long-term preservation, cataloguing 34The interest in being involved in several CLARIN initiatives and committees is also motivated by the possibility to detect, influence and contribute to other useful initiatives such as the Switchboard. 35Regione Toscana PAR FAS 2007-13 36https://grafo.sns.it 37Università di Siena and Unicoop Firenze, 2016-2017. Selected papers from the CLARIN Annual Conference 2017, Budapest, 18–20 September 2017. Conference Proceedings published by Linköping University Electronic Press at www.ep.liu.se/ecp/contents.asp?issue=147. © The Author(s). 7 and description, ethical and privacy issues managing, and dissemination, also in terms of public history and general public involvement (Calamai et al., 2016). 
Nevertheless, the DSFUCI’s commitment to speech and oral archives does not confine itself to the Gra.fo experience. We succeeded in discovering and locating the first oral archive related to an Italian psychiatric hospital – which was located in the same buildings as the department in Arezzo, also where the historical archive of the Arezzo psychiatric hospital is hosted. The oral archive of Anna Maria Bruz- zone, an analogue archive (made of 36 compact cassettes) contained the testimonies (life stories) of more than thirty former patients. It represents the documental basis of the book ”Ci chiamavano matti. Voci da un ospedale psichiatrico” (Bruzzone, 1979). The author wrote it after a two-month stay in Arezzo, when she spent almost every day in the hospital. The book testifies to the patients’ miserable lives inside and outside the hospital and sheds light on the atrocity of their everyday condition by letting them speak for themselves. Yet, what the book contains is not their actual voice: their voice is contained in the tapes that Bruzzone recorded during her research, when she witnessed the lives of the inpatients, in a continuous dialogue of which only a part is collected in the published interviews. The tapes were donated by the heirs and we are currently working on their digitisation and on metadata description. 3.3.3 CLARIN as an asset for the DSFUCI Being part of CLARIN would benefit the speech sciences and oral history research communities in at least three main aspects: (1) the possibility to use a shared and internationally consistent metadata standard (e.g., the OralHistory profile in the CLARIN component registry38); (2) the possibility to ensure the long-term preservation of the original speech data (both preservation and access copy) and of the metadata according to the FAIR principles (Wilkinson et al., 2016); (3) the possibility to offer a proper reuse of research data (license agreement, ethical and legal issues). As for (3), the inclusion of a member of DSFUCI in the CLARIN Legal Issues Committee39 may be considered as the first step towards a more conscious involvement of the Italian research communities in the ethical and legal issues associated to the web dissemination and re-use of speech and oral archives. At present, another crucial issue is represented by Automatic Speech Recognition tools. One of the aims of the Oral History research group inside CLARIN was to provide full Speech Recognition for different languages in order to perform one of the main ”steps” envisaged in the OH transcription chain40, enabling the researchers to go from a ”recorded interview” to a findable, accessible and viewable digital AV-document with relevant metadata on the Internet. Italian language is devoid of a web-based ASR, which would be of a great benefit for both communities of oral historians and linguists. 3.4 Synergies between the DFCLAM and CLARIN 3.4.1 The DFCLAM in few words The Dipartimento di Filologia e Critica delle Letterature Antiche e Moderne (DFCLAM) of the Univer- sità di Siena, ranked in 2018 as one of the national excellence departments by the Italian Government (MIUR), focusses on the philological, literary and anthropological competences that lie at the very heart of the study of literary texts, from the ancient world to modernity and for each literary genre. 
3.4 Synergies between the DFCLAM and CLARIN

3.4.1 The DFCLAM in a few words

The Dipartimento di Filologia e Critica delle Letterature Antiche e Moderne (DFCLAM) of the Università di Siena, ranked in 2018 as one of the national excellence departments by the Italian Government (MIUR), focusses on the philological, literary and anthropological competences that lie at the very heart of the study of literary texts, from the ancient world to modernity and across every literary genre. The interaction between philology and literature is central to the long-standing European humanistic tradition, and the history of the Department includes significant names of Italian literature (Antonio Tabucchi, Franco Fortini, Alessandro Fo, etc.) and some forerunners of the application of anthropological methods to literature (M. Bettini, S. Ronchey). In particular, its strongest points of engagement concern the anthropology of the ancient world (Centre AMA), contemporary Italian literature (Centro Fortini) and the study of medieval literatures (Latin and Romance) through digital methods and tools. The Department includes research centres such as the Centre for Comparative Studies "I Deug-Su", which is also strongly engaged in research on digital humanities, and three digital humanities laboratories funded by a development project newly approved by the MIUR.

3.4.2 The DFCLAM as an asset for CLARIN

The DFCLAM has committed itself to offering data and free online access to several digital archives of literary and historical texts. Among them is ALIM (the Archive of the Italian Latinity of the Middle Ages), the largest digital library of Latin texts and documents produced in Italy during the Middle Ages, encoded in XML-TEI from philologically checked sources or edited for the first time from manuscripts, and equipped with textual analysis tools and a medieval-Latin lemmatizer. Strategies for importing the metadata of ALIM into the CLARIN-ILC repository through a shared TEI header are under study, as are procedures for delivering dedicated tools for textual and linguistic analysis through the CLARIN channels. This would allow meta-queries and cross-queries on semantic items connecting Latin with the modern European languages derived from it, and would support the development of semantic trees and networks of lexical derivations at the very heart of the shared European lexicon.
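As a rough illustration of what sharing a TEI header with a repository might involve, the sketch below harvests a few descriptive fields from a TEI header. The XPath expressions follow the standard teiHeader layout; the sample text and the target field names are illustrative assumptions, not the actual ALIM-to-CLARIN mapping under study.

```python
# A rough sketch of harvesting descriptive metadata from a TEI header,
# as a first step towards mapping it onto a repository schema. The XPath
# expressions follow the standard teiHeader layout; the target field names
# are illustrative assumptions, not the actual ALIM-to-CLARIN mapping.
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>De rebus gestis (invented sample)</title></titleStmt>
      <publicationStmt><publisher>ALIM</publisher></publicationStmt>
      <sourceDesc><p>Edited from manuscript sources.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>...</p></body></text>
</TEI>"""

def harvest(tei_xml):
    root = ET.fromstring(tei_xml)
    grab = lambda path: root.findtext(path, default="", namespaces=TEI_NS)
    return {
        "title": grab(".//tei:titleStmt/tei:title"),
        "publisher": grab(".//tei:publicationStmt/tei:publisher"),
        "source": grab(".//tei:sourceDesc/tei:p"),
    }

print(harvest(SAMPLE))
```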
4 CLARIN-IT Events & promotion

4.1 Lectures

Four lectures were given to introduce CLARIN-IT to the next generation of collaborators.

A contribution called "Language Resources and Infrastructures for Digital Humanities" was presented at the Curso de Verano 2016 "New trends in quantitative and computational linguistics", organized by the Universidad de Castilla-La Mancha in Ciudad Real, Spain.

A keynote lecture on "Humanities: advantages, opportunities and benefits of the CLARIN Research Infrastructure and the CLARIN-IT national node for the Italian community" was delivered at the final ceremony of the Master in Digital Humanities (2016-2017), held in Venice at the Università Ca' Foscari. Subsequently, a lecture on "Digital Humanities and Research Infrastructures: CLARIN" was given during the course "Digital Humanities: Web Resources, Tools and Infrastructures" of the third edition (2017-2018) of the Master in Digital Humanities (CLARIN-IT, in collaboration with the AIUCD, endorses the third Master in Digital Humanities, academic year 2017-2018).

Finally, two lectures were given during the first and second parts of the workshop "Digital Humanities and Greek Philology: resources and research infrastructures applied to the study of ancient Greek", organized at the Università di Parma in November and December 2017. The first was entitled "New technologies and new investigations: CLARIN-IT and some examples of application to the study of ancient Greek", the second "Infrastructures of Research and Classical Studies. CLARIN-IT: opportunities and perspectives". This event was devoted to discussing the opportunities and research perspectives offered by collaboration between research infrastructures and the Digital Classics community. Different approaches to a traditional discipline are expected, in time, to foster new scholarly habits that build on the good practices inherited from the previous tradition while enabling new research methodologies and teaching practices.

4.2 Participation in Italian events

In order to raise awareness among the Italian research communities and extend the consortium, CLARIN-IT members have participated in four relevant Italian events.

A keynote, "CLARIN-IT, l'Infrastruttura di Ricerca per le Scienze Umane e Sociali" ("CLARIN-IT: a research infrastructure for the Social Sciences and Humanities"), was presented at the 5th Annual Conference of the Associazione per l'Informatica Umanistica e la Cultura Digitale (AIUCD), held in Venezia in September 2016 and attended by numerous Italian researchers in Digital Humanities. During the conference, a survey on CLARIN-IT aimed at raising awareness about CLARIN and collecting needs, requirements and expectations was launched (see Section 5).

CLARIN-IT was presented at the workshop "Utilizzo e diffusione di metodi, strumenti e tecnologie digitali per gli studi filologici: l'applicazione della filologia digitale al greco antico" ("Utilisation and dissemination of methods, instruments and digital technologies for the philologies: application of digital philology to ancient Greek") and at the seminar "Le risorse informatiche applicate alle discipline umanistiche: strumenti e metodi con esempi sull'utilizzo didattico nelle discipline classiche" ("Computational resources applied to the humanities: tools and methods with examples of their didactic use in the classical disciplines"), both held in October 2016. These events were organized by the Dipartimento di Discipline Umanistiche of the Università di Parma, which is about to become a member. The contribution was called "Infrastrutture di ricerca nel settore umanistico" ("Research infrastructures in the Humanities").

CLARIN-IT was also presented at the GARR 2016 Conference "The CreActive NEtwork: uno spazio per creare e condividere nuova conoscenza" ("The CreActive NEtwork: a space for creating and sharing new knowledge"), held in November 2016 and organized by the Gruppo per l'Armonizzazione delle Reti della Ricerca (GARR), the Italian research network that provides high-performance connectivity and develops innovative services for the daily activities of teachers, researchers and students, and with which CLARIN-IT is actively collaborating on technical questions. The presentation was entitled "Corpora digitali: dall'obsolescenza tecnologica, alla salvaguardia e alla condivisione" ("Digital corpora: from technological obsolescence towards preservation and sharing") (Sassolini et al., 2016).
4.3 Organization of CLARIN & CLARIN-IT events

A presentation of CLARIN-IT, the ILC and the ILC4CLARIN repository was held at the Consiglio Nazionale delle Ricerche in Pisa in March 2017. The presentation was an occasion to raise awareness among colleagues about the aims and functioning of CLARIN, its potential and its benefits.

A first result of CLARIN's interest in the Tuscan speech and oral archives can be found in the CLARIN Oral History workshop (Arezzo, 10-12 May 2017; Henk van den Heuvel, p.i.; http://oralhistory.eu/workshops/arezzo), whose aim was to finalize the setup of a transcription chain for OH interviews. An implementation plan for an OH transcription chain that can be integrated into the CLARIN infrastructure was drawn up during the Arezzo workshop. As for the Italian community, the meeting brought together the CLARIN-IT executive committee (ILC) and representatives of the Italian Speech Sciences Association (AISV) and the Italian Oral History Association (AISO). The workshop undertook the challenging task of bringing together different kinds of expertise, from Linguistics and Oral History to Language and Speech Technology, Infrastructure Analysis and Implementation.

In June 2017, an application to organize and host the 7th edition of the CLARIN conference was submitted and selected, thus acknowledging the efforts and capacities of the CLARIN-IT consortium and its contribution as a full member of the federation, as well as demonstrating interest in supporting its growth. The CLARIN Annual Conference is an important scientific event where the wider Humanities and Social Sciences communities can meet in order to exchange ideas and experiences with the CLARIN infrastructure. This includes the design, construction and operation of the CLARIN infrastructure; the data, tools and services that it contains or should contain; its actual use by researchers; its relation to other infrastructures and projects; and the CLARIN Knowledge Sharing Infrastructure. The Special Thematic Session for this edition will be in the areas of multimedia, multimodality and speech, including the collection, annotation, processing and study of audio, visual or multimedia data with language as an important part of the content. The conference will be held on October 8-10, 2018 in Pisa; it is expected to last three days and to receive around 200 participants.

On October 4th, the IAL organized a CLARIN User Involvement event titled "How to use TEI for the annotation of CMC and social media resources: a practical introduction". The event was held in conjunction with the 5th Conference on CMC and Social Media for the Humanities (cmccorpora17, https://cmc-corpora2017.eurac.edu/).

5 Survey

5.1 Motivation

In CLARIN, users are recognized as a central part of the infrastructure and of any service design process, but saying that our audience is Humanists is not enough: we have a wide range of scholars, working within the academy or in research institutions, who have different needs.
While there is soaring interest in the use of digital resources and related tools in the broader context of the Humanities, some specific scientific communities are still reluctant to adopt them. We performed a survey to ascertain the current interest in digital methods, the practice and the related needs within the community of classical philologists; in particular, it was administered to a restricted sample of Italian digital humanists whose main focus of interest is Ancient Greek philology. Other surveys were sporadically carried out during the last decade, aiming at collecting input from sectors that, although not strictly within the realm of Digital Classics, may have similar requirements and arrive at similar conclusions as far as resource design is concerned. A point worth remarking is that those preceding studies covered a wide spectrum of scientific interests within the Digital Humanities and involved mostly native English-speaking scholars; our study, instead, focused on the specific scientific community dealing with Digital Classics and aimed at evaluating the impact of digital techniques on its practice.

5.2 Context

The CLARIN-IT survey collects the points of view of a restricted sample of Italian digital humanists whose focus of interest is ancient Greek philology. This is a relatively small field, but one of great interest in Italy, where it also looks back on a great tradition; the community also includes university students and schoolteachers. Moreover, Italian scholars of Ancient Greek are an active part of a large international community (especially in Europe and North and South America). The perspective of the study was further broadened by the fact that its spectrum also involves Latin, Ancient History and Philosophy, and Classics in general. Finally, it is important to remember that Ancient Greek studies are an essential part of our Western cultural heritage, and it is crucial that as many people as possible come to know these texts and their contents. For all these reasons, the Italian Ancient Greek community is an excellent field in which to test new opportunities concerning knowledge of ancient texts and the quality of their transmission.

The questionnaire was sent to a selected group of Italian researchers whose main focus of study is the Ancient Greek language, although their interests span a broader area encompassing Greek and Latin literature. The sample was numerically consistent with the survey target (about 10% of a potential target population of about 130 people). The survey focused on the digital resources and tools needed to support an excellent and usable digital edition of an ancient text. For this reason, the respondents were first asked to evaluate the tools they use and know. They were then asked to indicate their expectations of the technologies and to rank a set of four functionalities in priority order. Finally, they were asked to rate, on a 1-5 scale, the set of functionalities considered crucial.
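As a toy illustration of the two elicitation formats just described, the sketch below aggregates priority rankings of four functionalities with a simple Borda count and averages 1-5 ratings. The functionality names and responses are invented for exposition; they are not the survey's actual instrument or data.

```python
# A toy illustration of aggregating survey responses: respondents rank
# four functionalities in priority order (combined with a Borda count)
# and rate each on a 1-5 scale (averaged). All names and responses here
# are invented, not the CLARIN-IT survey's data.
from statistics import mean

FUNCTIONALITIES = ["apparatus", "lexicon links", "metrical analysis", "search"]

# Each respondent lists the functionalities from most to least important.
rankings = [
    ["apparatus", "search", "lexicon links", "metrical analysis"],
    ["search", "apparatus", "metrical analysis", "lexicon links"],
]

borda = {f: 0 for f in FUNCTIONALITIES}
for ranking in rankings:
    for points, f in enumerate(reversed(ranking)):  # last place scores 0, first 3
        borda[f] += points

ratings = {"apparatus": [5, 4], "lexicon links": [3, 2],
           "metrical analysis": [2, 3], "search": [4, 5]}
averages = {f: mean(v) for f, v in ratings.items()}

print(sorted(borda.items(), key=lambda kv: -kv[1]))
print(averages)
```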
5.3 Main Outcomes and Action Plan

Consistent with the preceding surveys, the key outcome was that most of the available resources do not meet users' requirements (for an extensive analysis, see Monachini et al., 2018). Many respondents pointed out that important research needs in the field are models and software for authoring, editing, indexing and presenting a digital edition, for linking it to the available resources, and for improving them. All of them insisted on the need to develop new tools and/or make existing ones more reliable and usable, lamenting the absence of tools that integrate textual data with bibliography links, or hypertext links to other available texts or resources.

Based on the outcomes of this survey, CLARIN-IT could address a set of R&D priorities that may form the basis of a research and innovation action plan for Digital Classics. As it currently stands, the plan foresees a workbench in which to insert text in a simple and intuitive way and visualize its encoding as a TEI transcription; provide apparatus, literature and translation; link together primary sources and lexica; provide textual (and metrical) analysis and commentary; and offer search tools. We are developing a sample prototype to submit for evaluation by end users. At a larger scale, the work represents one of the first attempts undertaken within the context of CLARIN-IT to contribute to the wider impact of CLARIN on the specific Italian community of users interested in the application of Digital Humanities to the field of Classics and to ancient world studies.
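As an indication of the kind of output such a workbench might emit for one verse line with a variant reading, a minimal sketch follows. The <app>, <lem> and <rdg> elements are standard TEI apparatus markup; the witness sigla and the transliterated text are invented for illustration.

```python
# A minimal sketch of the TEI a Digital Classics workbench might emit
# for one verse line with a critical apparatus entry. <app>, <lem> and
# <rdg> are standard TEI elements; the witness sigla (#A, #B) and the
# transliterated text are invented for illustration.
import xml.etree.ElementTree as ET

def line_with_variant(n, lemma, reading, wit_lem, wit_rdg):
    l = ET.Element("l", n=str(n))          # one verse line
    app = ET.SubElement(l, "app")          # apparatus entry
    ET.SubElement(app, "lem", wit=wit_lem).text = lemma    # editor's reading
    ET.SubElement(app, "rdg", wit=wit_rdg).text = reading  # variant reading
    return l

line = line_with_variant(1, "menin aeide", "menin aoide", "#A", "#B")
print(ET.tostring(line, encoding="unicode"))
```

The workbench envisaged above would hide this markup behind a simple editing interface while still producing standards-compliant TEI that other CLARIN tools can index and search.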
6 Next steps

6.1 For the CLARIN-IT consortium as a whole

For the consortium as a whole, the next step is to include more members, which will depend on whether or not national funding for personnel can be secured. The plan is to agree with the Ministry on a national project aimed at strengthening the infrastructural activities in Italy and fostering the use of digital technologies in the Humanities, through the collection of the community's needs and requirements and the development of case studies. CLARIN-IT will enhance the use of language resources and technology through the Italian infrastructure and, at the same time, will encourage innovation in the research paradigms and methodologies of the sectors involved.

The consolidation of the consortium will respond to representativeness criteria. While a scientific criterion aims to cover the research sectors related to the study of language and to gather language resource producers, linguists, computational linguists and language engineers, a geographical criterion will also be considered to ensure territorial coverage. Participation in the national consortium of all the most important research centres in language technologies will make this coverage achievable, ensuring the long-term preservation of a great wealth of digital resources and easier access to them for the scientific community.

The CLARIN-IT consortium aims to attract the scientific communities of various fields: classical, modern and contemporary history, literary studies, political science, communication science, sociology, theology, philosophy, social anthropology and ethnography, linguistics and philology. Furthermore, CLARIN-IT also aims to attract disciplines that make use, albeit less massively, of text resources and technologies, such as law, education, archaeology, artistic disciplines and entertainment, design, architecture, music, demography, human geography, economics, social and political studies, and the history of science and medicine.

The CLARIN-IT consortium is also deeply involved in one of the aspects on which CLARIN ERIC insists, namely training, with the launch of master's or doctoral theses and university courses in line with the objectives of CLARIN. The Digital Classics survey, now publicly available on the CLARIN-IT channels (http://www.clarin-it.it/it/content/sondaggio-current-practice-digital-classics-tools), may further help the Italian consortium foster new, and sustain existing, knowledge in Digital Classics (DC). CLARIN-IT will play an important role in disseminating the results to the relevant academic, cultural and industrial communities and to the interested public. Furthermore, our plan is to extend the survey to other CLARIN consortia, thus helping to identify gaps and drive the development of new technologies for ancient studies at large. This will contribute to the general CLARIN mission of growing its infrastructure so as to better serve the international community of scholars from any discipline dealing with language, and of helping them boost their studies. Last but not least, since each consortium is unique but none is fully different from the others, the CLARIN federation constitutes an important source of inspiration for the next efforts and initiatives to undertake. We will therefore keep observing the past and ongoing initiatives undertaken by other national consortia and, whenever relevant and practically possible, undertake similar ones.

6.2 For each CLARIN centre

Regarding ILC4CLARIN, the next steps are to complete the set of linguistic resources freely accessible through its online portal and to achieve CLARIN B-centre certification in 2018, which is under way. A high-priority task for the near future is the integration of the web services developed within previously funded projects (and thus already available) into the CLARIN federated services Language Resource Switchboard and WebLicht. As mentioned in Del Gratta (2018), these are mainly basic NLP services that may serve various purposes and can thus be included in useful analysis chains for textual research, along the lines sketched below.
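The following toy sketch only illustrates the general shape of such an analysis chain: each web service consumes the output of its predecessor over HTTP. The endpoint URLs, parameter names and response format are invented placeholders, not the actual ILC4CLARIN, Switchboard or WebLicht interfaces.

```python
# A toy sketch of chaining basic NLP web services into an analysis
# pipeline over HTTP. The endpoint URLs, parameter names and response
# shape are invented placeholders, not the actual ILC4CLARIN,
# Switchboard or WebLicht interfaces.
import json
import urllib.request

PIPELINE = [
    "https://example.org/nlp/tokenize",  # hypothetical tokenizer endpoint
    "https://example.org/nlp/pos-tag",   # hypothetical PoS tagger endpoint
]

def run_chain(text):
    payload = {"text": text}
    for url in PIPELINE:
        req = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            payload = json.load(resp)  # each service consumes its predecessor's output
    return payload

# run_chain("Nel mezzo del cammin di nostra vita")  # requires live endpoints
```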
For the IAL, the next step is to be recognized as a CLARIN C-centre as soon as possible. During the course of 2018, the IAL will start integrating all its language resources into its recently established CLARIN DSpace repository, and will undertake the additional steps needed to achieve B-centre certification within 2018 or 2019. Finally, through its (CLARIN-like) local DI-ÖSS project, the IAL intends to organize a number of events with the South Tyrolean language stakeholders to raise awareness of digital infrastructures, the DI-ÖSS project itself, the CLARIN-IT consortium and the overall CLARIN initiative as a whole.

As for the DSFUCI, the next steps are to make the Gra.fo digital archives accessible via CLARIN DSpace (Calamai et al., 2017) and ensure their long-term preservation, to describe new digital archives according to CLARIN metadata profiles (e.g. the BAS-COALA service), and to update the Registry of Oral History Collections in Italy, which is made accessible and maintained by CLARIN ERIC. A further objective is to strengthen the collaboration between linguists and oral historians in the speech and oral archives domain.

Finally, regarding the DFCLAM, the next steps are to make the ALIM digital archive accessible via ILC4CLARIN and ensure its long-term preservation.

7 Conclusion

This paper presented the Italian CLARIN consortium and discussed its current state of affairs. It provided information on the current members, especially with regard to what they offer CLARIN in terms of resources, services and expertise, and what CLARIN offers them to further their own research, as well as on the institutions that are expected to join in the near future. The events and initiatives undertaken at the Italian level were also discussed, together with one planned for the near future, namely the 2018 edition of the CLARIN conference. The paper finally outlined the conclusions of a user survey performed to understand the expectations of a targeted user population and provided indications regarding the next steps planned. As one can observe from the efforts undertaken and the results achieved, CLARIN-IT has a lot to offer CLARIN, and vice versa. Despite limited means, CLARIN-IT is slowly but surely taking its place in the CLARIN landscape. The consortium has yet to grow larger and to address several open questions. Nonetheless, its steady growth and its widening participation in the CLARIN federation are positive signs with regard to the challenges to come.

References

Andrea Abel, Chiara Vettori, and Katrin Wisniewski. 2012. Gli studenti altoatesini e la seconda lingua: indagine linguistica e psicosociale / Die Südtiroler SchülerInnen und die Zweitsprache: eine linguistische und sozialpsychologische Untersuchung. Eurac Research.

Andrea Abel, Aivars Glaznieks, Lionel Nicolas, and Egon Stemle. 2014. KoKo: an L1 Learner Corpus for German. In Proceedings of the LREC Conference.

Daan Broeder, Menzo Windhouwer, Dieter Van Uytvanck, Twan Goosen, and Thorsten Trippel. 2012. CMDI: a Component Metadata Infrastructure. In Describing LRs with Metadata: Towards Flexibility and Interoperability in the Documentation of LR, workshop programme, volume 1.

Anna M. Bruzzone. 1979. Ci chiamavano matti. Voci da un ospedale psichiatrico. Einaudi.

Silvia Calamai and Francesca Frontini. 2016. Not quite your usual kind of resource. Gra.fo and the documentation of Oral Archives in CLARIN. In Proceedings of the 5th CLARIN Annual Conference (CAC).

Silvia Calamai, Pier Marco Bertinetto, Chiara Bertini, Francesca Biliotti, Irene Ricci, and Gianfranco Scuotri. 2013. Architecture, methods and purpose of the Gra.fo sound archive. In Digital Heritage International Congress (DigitalHeritage), volume 2, pages 439-439. IEEE.

Silvia Calamai, Veronique Ginouvès, and Pier Marco Bertinetto. 2016. Sound Archives Accessibility, pages 37-54. Springer International Publishing, Cham.

Silvia Calamai, Francesca Biliotti, and Aleksei Kelli. 2017. Authorship and ownership in the digital oral archives domain: The Gra.fo digital archive in the CLARIN-IT repository. In Proceedings of the 6th CLARIN Annual Conference (CAC).

Elena Chiocchetti, Barbara Heinisch-Obermoser, Georg Löckinger, Vesna Lušicky, Natascia Ralli, Isabella Stanizzi, and Tanja Wissik. 2013. Guidelines for collaborative legal/administrative terminology work. EURAC.

Riccardo Del Gratta. 2018.
(Re)Using OpeNER and PANACEA Web Services in the CLARIN Research Infrastructure. In Digital Infrastructures for Research 2017.

Dana Engel and Sabrina Colombo. 2018. Strategien in der Förderung von Multilingual Awareness im Rahmen der Südtiroler Wanderausstellung "Sprachenvielfalt – in der Welt und vor unserer Haustür". Sprachen lehren, Sprachen lernen.

Jennifer-Carmen Frey, Aivars Glaznieks, and Egon W. Stemle. 2015. The DiDi Corpus of South Tyrolean CMC Data. In Proceedings of the 2nd Workshop on Natural Language Processing for Computer-Mediated Communication/Social Media.

Twan Goosen and Thomas Eckart. 2014. Virtual Language Observatory 3.0: What's New. In CLARIN Annual Conference.

Erhard Hinrichs, Marie Hinrichs, and Thomas Zastrow. 2010. WebLicht: Web-based LRT Services for German. In Proceedings of the ACL 2010 System Demonstrations, pages 25-29. Association for Computational Linguistics.

Verena Lyding, Elena Chiocchetti, Gilles Sérasset, and Francis Brunet-Manquat. 2006. The LexALP information system: Term bank and corpus for multilingual legal terminology consolidated. In Proceedings of the Workshop on Multilingual Language Resources and Interoperability. Association for Computational Linguistics (ACL).

Verena Lyding, Egon Stemle, Claudia Borghetti, Marco Brunello, Sara Castagnoli, Felice Dell'Orletta, Henrik Dittmann, Alessandro Lenci, and Vito Pirrelli. 2014. The PAISÀ corpus of Italian web texts. In Proceedings of the 9th Web as Corpus Workshop (WaC-9).

Monica Monachini, Anika Nicolosi, and Alberto Stefanini. 2018. Digital Classics: A Survey on the Needs of Ancient Greek Scholars in Italy. In Proceedings of the CLARIN 2017 Conference. Linköping University Electronic Press.

Alessandro Pozzebon and Silvia Calamai. 2015. Smart devices for Intangible Cultural Heritage fruition. In Digital Heritage, 2015, volume 1, pages 333-336. IEEE.

Alessandro Pozzebon, Francesca Biliotti, and Silvia Calamai. 2016. Places Speaking with Their Own Voices. A Case Study from the Gra.fo Archives. In EuroMed 2016, pages 232-239. Springer.

Eva Sassolini, Sebastiana Cucurullo, and Alessandra Cinini. 2016. I corpora digitali: dall'obsolescenza tecnologica, alla salvaguardia e alla condivisione. In GARR Conference Proceedings.

Oliver Streiter, Natascia Ralli, Isabella Ties, and Leonhard Voltmer. 2004. BISTRO: the online platform for terminology management. Structuring terminology without entry structures. Linguistica Antverpiensia, New Series – Themes in Translation Studies, (3).

Dieter Van Uytvanck, Claus Zinn, Daan Broeder, Peter Wittenburg, and Mariano Gardelleni. 2010. Virtual Language Observatory: The portal to the language resources and technology universe. In Seventh Conference on International Language Resources and Evaluation (LREC 2010), pages 900-903. European Language Resources Association (ELRA).

Mark D. Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, Jan-Willem Boiten, Luiz Bonino da Silva Santos, Philip E. Bourne, et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3:160018.

Katrin Wisniewski, Karin Schöne, Lionel Nicolas, Chiara Vettori, Adriane Boyd, Detmar Meurers, Andrea Abel, and Jirka Hana. 2013. MERLIN: An online trilingual learner corpus empirically grounding the European reference levels in authentic learner data. In 6th Edition of the ICT for Language Learning Conference (ICT4LL), Florence, Italy.

Claus Zinn. 2016. The CLARIN Language Resource Switchboard.
In Proceedings of the 5th CLARIN Annual Conference (CAC).

work_clxcyxc3zjc5bgw5iwhil6yzzi ---- Deep Mapping and the Spatial Humanities

Bodenhamer, David J., John Corrigan, and Trevor M. Harris. "Deep Mapping and the Spatial Humanities," International Journal of Humanities and Arts Computing, 7.1-2 (2013), 170-75.

In 2012, the Virtual Center for Spatial Humanities (VCSH) held an advanced institute in Indianapolis, Indiana, on spatial narratives and deep maps. Sponsored by a major grant from the National Endowment for the Humanities, a U.S. government agency that funds humanities research, the institute invited twelve scholars—seven from the U.S. and five from Europe—whose work at the intersection of digital technologies and their disciplinary domains (history, religious studies, literary studies, geography and geographic information science, archaeology, and museum studies) promised to advance an institute aim of re-envisioning the theories and technologies of spatialization to serve the needs of humanities research more completely. Rather than asking humanists to adopt unabridged geospatial technologies, such as GIS, that are based on positivist epistemologies often ill-suited to the humanities, the institute focused on a range of available geospatial technologies including GIS, geo-visualization, the geospatial semantic web, wiki-maps and mash-ups, social media and mapping systems, spatialized tag clouds, and self-organizing maps. Powerful as maps are, the institute proposed to link and deepen scholarly understanding of complex humanities data and geospatial technologies through a focus on two innovative forms—spatial narratives and deep maps—that bend spatial and other digital technologies to the intellectual traditions of humanists, thereby constituting a bridge between diverse avenues of investigation. In doing so, it addressed two goals of the NEH call for proposals, namely, to bring together humanists and technologists to advance an innovative approach to the digital humanities and to assess the tools and methods available to support it.

Developments in Geographic Information Systems (GIS) over the past few decades have been nothing short of remarkable. So revolutionary have these advances been that the impact of GIS on many facets of government administration, industrial infrastructure, commerce, and academia has been likened to the discoveries brought about by the microscope, the telescope, and the printing press. But the dialogue between geographic information science (GISci) and the humanities has thus far been limited and largely revolves around the use of "off-the-shelf" GIS in historical mapping projects. This limited engagement is in stark contrast to the substantive inroads that GIS/GISci has made in the sciences and social sciences, as captured by the growing and valuable field of a social-theoretically informed Critical GIS.
Not surprisingly, the humanities present additional significant challenges to GISci because of the complexities involved in meshing a positivist science with humanist traditions and predominantly literary and spatial methods. And yet it is the potential dialogue and engagement between the humanities and GISci that promises reciprocal advances in both fields, as spatial science shapes humanist thought and is in turn reshaped by the multifaceted needs and approaches represented by humanist traditions. We use the term spatial humanities to capture this potentially rich interplay between Critical GIS, spatial science, spatial systems, and the panoply of highly nuanced humanist traditions.

The use of GIS in the humanities is not new. The National Endowment for the Humanities has funded a number of projects to explore how geospatial technologies might enhance research in a number of humanities disciplines, including but not limited to history, literary studies, and cultural studies. The National Science Foundation and National Institutes of Health also have supported projects related to spatial history, such as the Holocaust Historical GIS (NSF) and Population and Environment in the U.S. Great Plains (National Institute of Child Health and Human Development). In Europe the list of projects is equally impressive, with studies of nineteenth-century railroad development and urbanization, a spatial analysis of child mortality in industrializing Great Britain, and a detailed geography of the Irish famine among the more noteworthy accomplishments. Although successful on their own terms, these projects have revealed the limits of the technology for a wider range of humanities scholarship, which an increasing body of literature discusses in detail. Chief among the issues is a mismatch between the positivist epistemology of GIS, with its demand for precise, measurable data, and the reflexive and recursive approaches favored by humanists and some social scientists (e.g. practitioners of reflexive sociology) who wrestle continually with ambiguous, uncertain, and imprecise evidence and who seek multivalent answers to their questions. The problem, it seems, is both foundational and technological: we do not yet have a well-articulated theory for the spatial humanities, nor do we have tools sufficient to meet the needs of humanists. Addressing these deficits is at the heart of much current work in the spatial humanities, which focuses on four interrelated areas of research and development.

First, researchers are exploring the epistemological frameworks of the humanities and GISci for the purpose of locating common ground on which the two can cooperate. This step has invariably been overlooked in the rush to apply the new technology, but it is the essential point of departure for any effort to bridge them. This venture is not to be confused with a more sweeping foundational analysis of ingrained methodological conceits within the sciences and the humanities, and certainly should not be misunderstood as a query about the qualitative approach versus the quantitative approach. Rather, what is desired here is to expose humanities scholars to the breadth of geospatial technologies and subsequently for the technology itself to be interrogated as to its adaptability, in full understanding that the technology has, in its genesis, been epistemologically branded and yet still offers potential for the humanities.
GIS, for all of its demonstration of confidence in Euclidean space, quantification, disambiguation, and reduction, has proven its capability to represent uncertainty and variability in the visualization of geospatial data. In weather forecasting and ocean modeling, for example, uncertainty can be encoded with the data and visualizations fashioned that are multivariate and multidimensional. The technology, then, is more supple than its critics suggest. What is required is an appropriate intellectual grounding and arena in the humanities that will enable skilled humanities scholars to draw the technology further out of its positivistic homeland. In a similar way, the development of the spatial humanities requires both an understanding of the ontology and epistemology of GIS and a closer collaboration with its GIScience practitioners. The challenge is how to realize the promise of hybridity between humanistic critical discourses and the theoretical perspectives of Critical GIS. Humanists can give more thoughtful consideration to location and spatial relationality, and can take leads from visualizations of data such as self-organizing maps and Virtual GIS, which can capture complex data at the same time that they indicate relativity and ambiguity. The payoff for collaboration will be a humanities scholarship that integrates insights gleaned from spatial information science and spatial theory into scaled narratives about human lives and culture. Such rewards are glimpsed, for example, in Mei-Po Kwan's and Guoxiang Ding's analysis of "geo-narratives," assembled from oral history sources and a blend of other qualitative and quantitative data as a way to understand the lives of Muslim women in Columbus, Ohio after 9/11. [2]

Humanities scholars work largely with texts, and the majority of those texts take the form of language, alongside material artifacts, behavioral enactments, art, and the like. A key part of the challenge of thinking spatially and leveraging spatial technology is to design and frame narratives about individual and collective human experience that are spatially contextualized. At one level, the task is defined as the development of reciprocal transformations from text to map and map to text. More importantly, the humanities and social sciences must position themselves to exploit the Geospatial Semantic Web, which in its extraordinary complexity and massive volume offers researchers a rich data bed and functional platform to mine effectively, organizing the harvested data and contextualizing it within the spaces of culture. The agenda here is to advance textual analysis that understands the bi-locality of text in both metaphorical space and geographic space. Humanities scholars can benefit by learning to extract spatial relationships embedded in text and to devise narrative forms that join spatial story-telling to more traditional humanities semantics. Here the payoff is potentially rich: a significant extension of work already underway in literary and cultural studies (e.g. narrative topographies, the spatial imaginaire, and novel mappings). Not only is the vast bulk of human experience recorded as text rather than in quantitative form; words are the preferred medium of both ordinary and scholarly communication, regardless of topic or field.
Finding ways to make the interaction among words, location, and quantitative data more dynamic and intuitive will yield rich insights into complex socio-cultural, political, and economic problems, with enormous potential for areas far outside the traditional orbits of humanities research. In short, we should vigorously explore the means by which to advance translation from textual to visual communication, making the most of visual media and learning to create "fits" between the messages of text and numbers and the capabilities of visual forms to express spatial relationships.

An emphasis on absolute space based on Euclidean coordinate systems often frustrates the humanist's effort to understand how spaces change over time, and how spatial relativities emerge and develop. There is an urgent need for the development, within GIS specifically and spatial technologies more generally, of spatio-temporal tools that will enable humanities scholars, social scientists, geographers, and others to incorporate time into analyses that are spatially contextualized. The increasing utilization of GIS by historians suggests that the historical interests in cause and effect, the development and alteration of networks, and the temporal patterning of events are served at least to some extent by current technologies. Such historical studies, however, strain to translate a technology that treats time as categorical and discontinuous into a tool that can represent the richly contingent flow of culture, opting by default for a model that strings together spatio-temporal snapshots on the way to a story-as-collage. The importance of narrative within the humanities can stimulate the development of better spatial tools that incorporate time as well, just as spatial thinking and tools can encourage richer considerations of spatial relationships in narrative time. Central to the emergence of the spatial humanities is a trust that the contingent, unpredictable, and ironic in history and culture can be embodied within a narrative context that incorporates space alongside time.

For the humanities—and for social scientists who are influenced by the humanities—it is above all the thick weave of events, locations, behaviors, and motivations that makes human experience of space into place. Place is the product of "deep contingency" and of the human effort to render that experience meaningful in language, art, ritual, and in other ways. Place is constructed out of the imagination as much as through what is visible and tangible in experience. Humanists, social scientists, geographers, and all who are interested in seeing a spatial humanities mature should plan for a future state of affairs that will extend the frontiers of "deep mapping." That is, we should build increasingly complex maps (using the term broadly) of the personalities, emotions, values, and poetics, the visible and invisible aspects of a place. The spatial considerations remain the same, which is to say that geographic location, boundary, and landscape remain crucial, whether we are investigating a continental landmass or a lecture hall. What is added by these "deep maps" is a reflexivity that acknowledges how engaged human agents build spatially framed identities and aspirations out of imagination and memory, and how the multiple perspectives constitute a spatial narrative that complements the verbal narrative traditionally employed by humanists. Here is where the deep map becomes important.
An avant-garde technique first urged by the Situationist International in 1950s France, the approach "attempts to record and represent the grain and patina of place through juxtapositions and interpenetrations of the historical and the contemporary, the political and the poetic, the discursive and the sensual…." [3] Its best form results in a subtle and multilayered view of a small area of the earth. As a new creative space, deep maps have several qualities well-suited to a fresh conceptualization of GIS and other spatial technologies as they are applied to the humanities. They are meant to be visual, time-based, and structurally open. They are genuinely multi-media and multilayered. They do not seek authority or objectivity, but involve negotiation between insiders and outsiders, experts and contributors, over what is represented and how. Framed as a conversation and not a statement, deep maps are inherently unstable, continually unfolding and changing in response to new data, new perspectives, and new insights.

The analogue between a deep map and advanced spatial technologies seems evident. Geographic information systems operate as a series of layers, each representing a different theme and tied to a specific location on planet earth. These layers are transparent, although the user can make any layer or combination of layers opaque while leaving others visible. A deep map of heritage and culture, centered on memory and place, ideally would work in a similar fashion. The layers of a deep map need not be restricted to a known or discoverable documentary record but could be opened, wiki-like, to anyone with a memory or artifact to contribute. However structured, these layers would operate as do other layers within a GIS, viewed individually or collectively, as a whole or within groups, but all tied to the time and space that provide perspectives on the places that interest us. It is an open, visual, and experiential space, immersing users in a virtual world in which uncertainty, ambiguity, and contingency are ever-present, but all capable of being braided into a narrative that reveals the ways in which space and time influence and are influenced by social interaction. In narrative theory, this space is one in which both horizontal and vertical movement is possible, with horizontal movement providing the linear progression we associate with rational argument and vertical movement providing the depth, texture, tension, and resonance of experience. One possible data-structure reading of this layered conception is sketched below.
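The toy sketch that follows reads the layered deep map as a data structure: each layer carries a theme, a time span and a visibility flag, and layers can be contributed openly and composed for a given year. All class and field names are illustrative assumptions, not an existing GIS API or the institute's design.

```python
# A toy data-structure reading of the layered deep map described above:
# each layer carries a theme, a time span and a visibility flag, and can
# be toggled individually or composed for a given year. All class and
# field names are illustrative assumptions, not an existing GIS API.
from dataclasses import dataclass, field

@dataclass
class Layer:
    theme: str        # e.g. "oral histories", "census tracts"
    years: tuple      # (start, end) of the period the layer covers
    features: list    # spatially referenced items: points, memories, images
    visible: bool = True

@dataclass
class DeepMap:
    layers: list = field(default_factory=list)

    def add(self, layer):
        # wiki-like openness: anyone with a memory or artifact may contribute
        self.layers.append(layer)

    def view(self, year):
        # compose only the visible layers active in a given year
        return [l for l in self.layers
                if l.visible and l.years[0] <= year <= l.years[1]]

m = DeepMap()
m.add(Layer("congregations", (1990, 2002), ["points of worship"]))
m.add(Layer("interviews", (1996, 2002), ["oral history clip #12"]))
print([l.theme for l in m.view(1998)])
```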
The coalescence of digital technologies over the past decade, especially as seen in the toolkit of Web 2.0, makes it possible to envision how geospatial technologies might contribute to the formation of a deep map, just as the various theories about spatial narratives offer guidance on the structure they may take. However, work in both areas is still too scattered and too abstract to be useful to humanists. And it is here that the institute sought to make its greatest contribution: it worked collaboratively across disciplines and with experts in technology to develop structured approaches to deep mapping and spatial narratives that in turn could be tested as prototypes, with the aim of developing a robust platform in subsequent grants.

Divided into three teams, the participants developed three approaches to a deep map, using spatially enabled data provided by the institute directors. The data include: (1) the rich set of religious adherence and demographic data for each of the nation's 3,000+ counties included in the Digital Atlas of American Religion, a web-based GIS (www.religionatlas.org); (2) a large archive of digital, spatially referenced ethnographic, image, interview, and video data from the six-year Project on Religion and Urban Culture conducted by the Polis Center from 1996 to 2002, which examined the intersections between religion and community in 20th-century Indianapolis; (3) the SAVI Community Information System for Central Indiana, an interactive web-based GIS community information system developed by the Polis Center that contains an enormous amount of data on over 2,000 geographical units in the eleven-county Indianapolis MSA from 1990 to the present; and (4) the digital newspaper and print, image, and audio archives of the Indiana Historical Society and Indiana State Library. The problem supported by the data focused on an important issue in modern American history and culture, namely, social fragmentation and spatial change as evidenced in American religion at national, state, local, and neighborhood levels.

The essays that follow are reports of work-in-progress. We offer them because they represent different ways to think about the challenges and potential of deep mapping, but their importance is much larger than the schema presented here. They are in fact among the first efforts to move toward a more integrated, less GIS-dependent spatial framework for humanities research. As such, they are prototypes from which we can learn as we seek to learn what works in this new and exciting field.

work_cnna6vnve5gspbow2jzuisxyni ---- Digital Humanities 2010

A Tale of Two Cities: Implications of the Similarities and Differences in Collaborative Approaches within the Digital Libraries and Digital Humanities Communities

Siemens, Lynne (siemensl@uvic.ca), Faculty of Business/School of Public Administration, University of Victoria
Cunningham, Richard (richard.cunningham@acadiau.ca), Acadia Digital Culture Observatory, Acadia University
Duff, Wendy (wendy.duff@utoronto.ca), Faculty of Information, University of Toronto
Warwick, Claire (c.warwick@ucl.ac.uk), Department of Information Studies, University College London

Besides drawing on content experts, librarians, archivists, developers, programmers, managers, and others, many emerging digital projects also pull in disciplinary expertise from areas that do not typically work in team environments. To be effective, these teams must find processes – some of which are counter to natural individually-oriented work habits – that support the larger goals and group-oriented work of these digital projects. This paper will explore the similarities and differences in approaches within and between members of the Digital Libraries (DL) and Digital Humanities (DH) communities by formally documenting the nature of collaboration in these teams. The objective is to identify exemplary work patterns and larger models of research collaboration that have the potential to strengthen this positive aspect of these communities even further, while exploring the key differences between them which may limit digital project teams' efforts.
Our work is therefore designed to enable those who work in such teams to recognise factors that tend to predispose them to success, and perhaps more importantly, to avoid those that may lead to problematic interactions, and thus make the project less successful than it might otherwise have been.

1. Context

Traditionally, research contributions in the humanities field have been felt to be, and documented to be, predominantly solo efforts by academics involving little direct collaboration with others, a model reinforced through doctoral studies and beyond (see, for example, Cuneo 2003; Newell and Swan 2000). However, the DL and DH communities are exceptions to this. Given that the nature of digital projects involves computers and a variety of skills and expertise, collaborations in these fields involve individuals within their institutions and others nationally and internationally. Such collaboration typically must coordinate efforts between academics, undergraduate and graduate students, research assistants, computer programmers and developers, librarians, and other individuals, as well as financial and other resources. Further, as more digital projects explore issues of long-term sustainability, academics and librarians are likely to enter into more collaborations to ensure this objective (Kretzschmar Jr. and Potter 2009). Given this context, some research has been done on the DL and DH communities as separate entities (see, for example, Liu and Smith 2007; Ruecker and Radzikowska 2008; Siemens 2009; Johnson 2005; Johnson 2009; Liu, Tseng and Huang 2005), but little has been done on the interaction between these two communities when in collaboration. Tensions can exist in academic research teams when the members represent different disciplines and approaches to teamwork (Birnbaum 1979; Fennel and Sandefur 1983; Hara et al. 2003). Collaborations can be further complicated when some team members have more experience and training in collaboration than others, as may be the case with digital projects involving librarians and archivists, who tend to have more experience, and academics, who tend to have less. Ultimately, too little is known about how these teams involving DL and DH members collaborate and the types of support needed to ensure project success.

2. Methods

This paper is part of a larger project examining research teams within the DH and DL communities, led by a team based in Canada and England (for more details, see Siemens et al. 2009a; Siemens et al. 2009b). It draws upon results from interviews and two surveys of the communities exploring individuals' experiences in digital project teams. The findings include a description of the communities' work patterns and relationships and the identification of the supports and research preparation required to sustain research teams (as per Marshall and Rossman 1999; McCracken 1988). A total of seven individuals were interviewed and another 69 responded to the two surveys.

3. Preliminary Findings

At the time of writing this proposal, final data analysis of the surveys and interviews is being completed. However, some preliminary comparisons between the two communities can be reported. As a starting point, similarities exist among DL and DH projects. First, digital projects are being accomplished within teams, albeit relatively small ones, as defined by budget and number of individuals involved.
Both communities report that the scale and scope of digital projects require individuals with a variety of skills and expertise. Further, these collaborations tend to operate without formal documentation that outlines roles, responsibilities, decision-making methods, and conflict resolution mechanisms. The survey and interview respondents from both communities report similar benefits and challenges within their collaborations. Finally, these teams rely heavily on email and face-to-face interaction for their project communications.

Some interesting differences between DL- and DH-based teams exist and may influence a digital project team's effectiveness. First, the DL respondents seem to have a greater reliance on email as opposed to face-to-face communications and tend to rate the relative effectiveness of email higher than the DH respondents. Several explanations may be offered for this. According to the survey results, DL teams appear more likely to be located within the same institution, which means that casual interpersonal interaction may be more likely to occur between team members than within groups that are geographically dispersed, as many DH teams are. For dispersed teams, meetings need to be more deliberately planned, which may mean a higher consciousness of the importance of this kind of interaction and of the necessity to build it into project plans. Also, given that many of the DL teams are within the same organization, team members may be more familiar with each other in advance of a project start, meaning that more communication can be done by email. Less time may need to be spent in formal meetings developing work processes than is the case with teams whose members may not have worked together on previous projects.

Second, a greater percentage of respondents within the DH community (42%) than among the DL respondents (18%) indicated that they "enjoyed collaboration". Comprising more academics, the DH community tends to undertake more solitary work, and therefore collaboration may be seen as a welcome change and a deliberate choice. In contrast, teamwork is more the norm for librarians and archivists, who may thus feel it is an expected part of their jobs rather than a choice and a welcome activity. As a result, members of these two communities approach collaboration from two fundamentally different positions, which must be understood from the outset of a digital project in order to reduce challenges and ensure success. Further, differences in roles and perceived status may complicate collaboration. Often, tensions may exist between service departments, such as libraries and computer support, and the researcher, who is perceived to have higher status (Warwick 2004). These differences in perceived status can complicate work processes, as those with lower status may have difficulty directing those with perceived higher status (Hagstrom 1964; Ramsay 2008; Newell and Swan 2000).

The benefits to the DL and DH communities will be several. First, the study contributes an explicit description of these communities' work patterns and inter-relationships. Second, it is designed to enable those who work in such teams to recognise factors that tend to predispose them to success, and perhaps more importantly, to avoid those that may lead to problematic interactions, and thus make the project less successful than it might otherwise have been.

References

Birnbaum, Philip H. (1979).
'Research Team Composition and Performance'. Interdisciplinary Research Groups: Their Management and Organization. Richard T. Barth and Rudy Steck (eds.). Vancouver, British Columbia: International Research Group on Interdisciplinary Programs.

Cuneo, Carl (November 2003). 'Interdisciplinary Teams - Let's Make Them Work'. University Affairs. 18-21.

Fennel, Mary, and Gary D. Sandefur (1983). 'Structural Clarity of Interdisciplinary Teams: A Research Note'. The Journal of Applied Behavioral Science. 19.2: 193-202.

Hagstrom, Warren O. (1964). 'Traditional and Modern Forms of Scientific Teamwork'. Administrative Quarterly. 9: 241-63.

Hara, Noriko, et al. (2003). 'An Emerging View of Scientific Collaboration: Scientists' Perspectives on Collaboration and Factors That Impact Collaboration'. Journal of the American Society for Information Science and Technology. 54.10: 952-65.

Johnson, Ian M. (2005). '"In the Middle of Difficulty Lies Opportunity" - Using a Case Study to Identify Critical Success Factors Contributing to the Initiation of International Collaborative Projects'. Education for Information. 23.1/2: 9-42.

Johnson, Ian M. 'International Collaboration between Schools of Librarianship and Information Studies: Current Issues'. Asia-Pacific Conference on Library & Information Education & Practice.

Kretzschmar Jr., William A., and William G. Potter (2009). 'Library Collaboration with Large Digital Humanities Projects'. Digital Humanities.

Liu, Jyi-Shane, Mu-Hsi Tseng, and Tze-Kai Huang (2005). 'Building Digital Heritage with Teamwork Empowerment'. Information Technology & Libraries. 24.3: 130-40.

Liu, Yin, and Jeff Smith (2007). 'Aligning the Agendas of Humanities and Computer Science Research: A Risk/Reward Analysis'. SDH-SEMI.

Marshall, Catherine, and Gretchen B. Rossman (1999). Designing Qualitative Research. Thousand Oaks, CA: SAGE Publications, 3rd edition.

McCracken, Grant (1988). The Long Interview. Qualitative Research Methods. Newbury Park, CA: SAGE Publications. V. 13.

Newell, Sue, and Jacky Swan (2000). 'Trust and Inter-Organizational Networking'. Human Relations. 53.10: 1287-328.

Ramsay, Stephen (2008). 'Rules of the Order: The Sociology of Large, Multi-Institutional Software Development Projects'. Digital Humanities.

Ruecker, Stan, and Milena Radzikowska (2008). 'The Iterative Design of a Project Charter for Interdisciplinary Research'. DIS.

Siemens, Lynne (2009). 'It's a Team If You Use "Reply All": An Exploration of Research Teams in Digital Humanities Environments'. Literary & Linguistic Computing. 24.2: 225-33.

Siemens, Lynne, et al. (2009a). 'Able to Develop Much Larger and More Ambitious Projects: An Exploration of Digital Projects Teams'. DigCCurr 2009: Digital Curation: Practice, Promise and Prospects. Helen R. Tibbo, et al. (eds.). University of North Carolina at Chapel Hill.

Siemens, Lynne, et al. (2009b). 'Building Strong E-Book Project Teams: Processes to Maximize Success While Drawing on Essential Academic Disciplinary Expertise'. BooksOnline '09: 2nd Workshop on Research Advances in Large Digital Book Collections.

Warwick, Claire (2004). 'No Such Thing as Humanities Computing? An Analytical History of Digital Resource Creation and Computing in the Humanities'. Joint International Conference of the Association for Computers and the Humanities and the Association for Literary & Linguistic Computing.
----

The Language of Caring: Digital Human Modeling, Practice Patterns, and Performance Assessment

J.R. Hotchkiss (a,b,c,*), J.D. Paladino (d), C.W. Brackney (a,b), A.M. Kaynar (b), P.S. Crooke (e)
a Veterans Affairs Healthcare System, Pittsburgh, PA 15240, USA
b Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
c Veterans Engineering Resource Center, Pittsburgh, PA 15215, USA
d Department of Medicine, John A. Burns School of Medicine, Honolulu, HI 96813, USA
e Department of Mathematics, Vanderbilt University, Nashville, TN 37240, USA
* Corresponding author. E-mail address: john.r.hotchkiss@gmail.com

Procedia Manufacturing 3 (2015): 3788-3795. doi: 10.1016/j.promfg.2015.07.881
6th International Conference on Applied Human Factors and Ergonomics (AHFE 2015) and the Affiliated Conferences, AHFE 2015. © 2015 The Authors. Published by Elsevier B.V. Open access under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of AHFE Conference.

Abstract

Digital human modeling offers unique potential in educating providers to apply complex, titratable forms of medical care and assessing their cognitive competence in these domains. Mechanical ventilation uses a machine (a ventilator) to support patients who cannot breathe independently, and is a cornerstone of modern intensive and emergency medical care. This cognitively complex, titrated, and potentially harmful therapy saves hundreds of thousands of lives per year. Practical and ethical considerations limit the provision of extensive bedside training, and there are no current mechanisms for assessing operational competence. We constructed a comprehensive digital model of patients undergoing mechanical ventilation that was populated with "virtual patients," as well as specific guidelines regarding clinical goals for each patient. Individuals ranging from experienced clinicians to trainees were evaluated regarding their performance as they managed the virtual patient population. The training experience was well received and required less than 2 hours. Nonetheless, exposure to the simulator improved provider efficiency, and was accompanied by clear changes in patterns of practice. The ability to test on rigorously standardized cases (entirely unfeasible in the clinical setting) facilitated assessment of competence and more sophisticated quantification of performance.

Keywords: Mechanical ventilation; Virtual patients; Practice patterns; Performance assessment

1. Introduction

Mechanical ventilation (MV) is a cornerstone intervention in modern intensive and emergency care that is used to support hundreds of thousands of individuals each year while they cannot breathe independently.
Unfortunately, this lifesaving intervention can cause harm: injudicious ventilator settings can promote lung injury, compromise circulatory stability, produce patient distress, stimulate an inflammatory response, and prolong the period of support required. Such adverse consequences of mechanical ventilation add to the burden of patient suffering, increase healthcare resource utilization, and compromise outcomes. The physiologic and engineering foundations of mechanical ventilation are relatively well understood [1-12]. Regrettably, the teaching of mechanical ventilation remains primarily a "bedside" exercise more akin to an apprenticeship than a systematic approach to mastery. New learners cannot practice extensively on actual patients, for ethical and practical reasons, and physiologically realistic alternatives (large animals or physical simulators) are expensive and suffer from limited access. Moreover, the exposure of the practitioner to the full spectrum of possible mechanical or physiologic derangements cannot be guaranteed. Contemporary approaches to assessment of expertise in mechanical ventilation are ill-suited for defining clinician practice patterns or competence in the context of a potentially harmful intervention for which any patient problem may have many possible solutions, the prevailing physiology is highly dynamic, and the clinician is reasoning in the setting of uncertainty. However, there is recent evidence that model-based training can have an impact on the proficiency of clinicians [13-18].

We addressed these issues by refining and extending an existing simulation-based educational model of mechanical ventilation and coupling this software to a state-of-the-art approach to characterizing practice patterns, one based on symbolic dynamics [19, 20]. This ensemble included a computer-based micro-simulation training tool, together with software and algorithms for constructing a database and characterizing provider practice patterns. We explored the evolution of the practice patterns adopted by individual providers as they progressed through a training exercise in which they confronted 100 virtual patients having common clinical derangements of respiratory mechanics. An updated version of the simulation tool freeware can be downloaded from: http://www.math.vanderbilt.edu/~pscrooke/CANVENT/upload.html.

2. Methods

2.1. Simulation tool

The simulation tool comprises five distinct simulation-based elements.

Element one: mathematical models that faithfully emulate airspace mechanics during mechanical ventilation. The mathematical models that underlie the simulator are based on general models of non-passive (patient active to varying degrees) mechanical ventilation under pressure-controlled (PCV) or volume-controlled (VCV) ventilation. The primary model has been parameterized and tested in a large animal model of lung injury, and is based on a representation of the pulmonary pressure-volume curve (lung recruitment) originally proposed and validated in humans [21]. In this approach, lung compliance is represented as a trapezoidal (increasing, constant, and decreasing) function of lung volume [22, 23]. We studied this model in an oleic acid swine model, and found it to faithfully emulate the dynamic behaviors of this large animal model of very severe lung injury [24].
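The trapezoidal compliance representation lends itself to a compact piecewise implementation. The sketch below is a minimal illustration of that shape only, not the authors' code: the inflection volumes and compliance values are hypothetical placeholders, whereas the paper's model is parameterized against animal data as described above.

def trapezoidal_compliance(v, v_low=0.3, v_high=1.8, v_max=2.5,
                           c_min=0.02, c_max=0.06):
    """Piecewise trapezoidal compliance as a function of lung volume v (L):
    increasing below v_low, constant between v_low and v_high, and
    decreasing above v_high. All parameter values are illustrative
    placeholders, not the paper's fitted values."""
    if v <= v_low:
        # rising limb: compliance grows linearly toward the plateau value
        return c_min + (c_max - c_min) * (v / v_low)
    if v <= v_high:
        # plateau: constant compliance between the two inflection volumes
        return c_max
    # falling limb: compliance decays linearly above the upper inflection
    frac = min((v - v_high) / (v_max - v_high), 1.0)
    return c_max - (c_max - c_min) * frac

print(trapezoidal_compliance(1.0))   # plateau region -> 0.06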
Element two: a model emulating gas exchange during mechanical ventilation. We developed a simple "two-compartment" model of pulmonary gas exchange that captures relevant behaviors based on a "perfused, ventilated" compartment and a "perfused, unventilated" compartment. The "size" of the unventilated compartment is determined by the volumes predicted from the mechanics model above.

Element three: models emulating acid-base metabolism and the effects of elevated intrathoracic pressure. We developed a simple model of CO2 clearance and systemic pH that captures relevant behaviors based on CO2 production, minute ventilation, and anatomic deadspace. Similarly, we incorporated a simple model of interactions between elevations in intrathoracic pressure and decrements in cardiac output that is used to emulate mean arterial pressure responses to elevated intrathoracic pressure.

Element four: a population of "virtual patients" that faithfully emulate the behaviors of patients managed in everyday clinical practice. Manipulation of patient-specific impedance parameters (such as inflection points, gain values, oxygen consumption, etc.) was undertaken to construct a population of virtual patients having physiologic characteristics mimicking common clinical derangements:
• Chronic Obstructive Lung Disease (COLD)
• Severe Acute Asthma (SAA)
• Mean Airway Pressure Responsive Acute Lung Injury (rALI)
• Mean Airway Pressure Unresponsive Hypoxemia (UH)
• Restrictive Lung Disease (RLD)
Several iterations were performed in which experienced Critical Care clinicians confronted each simulated patient; those displaying grossly unrealistic behaviors, or that were deemed so easy as to be uninformative, were replaced by alternate candidates. We sought virtual patients that were both ultimately "solvable" and non-trivial in the manipulations required for solution. The simulator sequentially presented 100 patients, of which 80 (16 from each pathophysiologic class) were unique; an additional 5 (1 from each class) each appeared 4 times (at the beginning of the simulation, and after 31, 67, and 95 patients). These "recurring patients" allowed evaluation of user responses to identical patients at different points in the educational experience.

Element five: a user-friendly interface in which learners confront sequential patients, attempt to satisfy specified goals, and can terminate the simulation when they believe goals have been met. Following (potentially multiple) iterations of ventilator adjustments, mode selections, and fluid management decisions, the user commits to the solution or determines that goals cannot be met. Immediate feedback is provided.

2.2. Assessment tool

For each ventilator adjustment imposed on each patient, the simulator archives the patient type, impedance and other characteristics, the current values of each physiologic variable, and requests for ancillary data. Similarly, the software archives the exact values of the changes in ventilator settings made by the user. These data, collected for each learner, comprise the inputs of the assessment toolkit. This toolkit provides two broad classes of performance data, detailed below.

Analysis of provider solution speeds, success rates, and response patterns. Gross outcomes, such as the number of attempts to solve each patient, the number of successful solutions, and complication rates (unsatisfactory physiologic parameters within a trial), are calculated directly.
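Because the analyses that follow are all driven by this per-adjustment archive, it may help to picture the shape of one archived record. The sketch below is a hypothetical rendering with field names of my own choosing; the paper specifies only what is captured (patient type and impedance characteristics, current physiologic values, ancillary-data requests, and the exact setting changes), not how it is stored.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class AdjustmentRecord:
    """One archived ventilator adjustment; all field names are illustrative."""
    patient_id: str
    patient_class: str                 # e.g. "COLD", "SAA", "rALI", "UH", "RLD"
    physiology: Dict[str, float]       # current physiologic values at this step
    setting_changes: Dict[str, float]  # exact deltas imposed by the user
    ancillary_requests: List[str] = field(default_factory=list)

record = AdjustmentRecord(
    patient_id="virtual-017",
    patient_class="rALI",
    physiology={"pH": 7.28, "plateau_pressure": 34.0, "SpO2": 0.86, "MAP": 61.0},
    setting_changes={"peep": 2.0, "fio2": 0.10},
)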
In addition to these gross outcomes, the complexity of each intervention imposed by the learner is quantified in two ways: average complexity and weighted complexity. Average complexity is simply the number of ventilator settings that the practitioner changes at each attempt, divided by the number of relevant attempts. Weighted complexity "weights" each setting change by the number of outcome measures that parameter can affect—for example, changes in tidal volume can affect pH, plateau pressure, oxygenation, and blood pressure (weight = 4), whereas changes in inspired oxygen concentration typically only affect oxygenation (weight = 1). These metrics were constructed to capture the tendency of more experienced practitioners to respond with patterns of interventions, rather than unitary changes.

Quantitative analysis of provider practice patterns. For each point in each simulated management problem, the prevailing pattern of derangements of the patient (such as hypoxemia, low pH, and high plateau pressure combined with a low blood pressure: the failure pattern) can be assigned a unique numerical symbol. Similarly, the provider responses (such as decreasing tidal volume, increasing PEEP, increasing respiratory frequency, administering a fluid bolus, or combinations thereof) can also be assigned a unique numerical symbol. The simulation can thus be cast as a series of aligned symbols: "the provider saw this pattern of derangements, and responded with the following pattern of interventions." Table 1 shows an example sequence of aligned failure and provider symbols across four attempts.

Table 1. Attempt/failure patterns.
Attempt          |   1 |  2 |   3 |   4
Failure Pattern  |  14 | 18 |   9 |  17
Provider Pattern | 904 | 83 | 172 | 811

These data are used to construct provider- and population-specific frequency tables depicting the frequency with which a provider (or population of providers) responds to a specific derangement pattern with a specific pattern of interventions. Approaches promulgated by Tang and Daw [25, 26] can be used to construct difference matrices expressing the "distance" between practice patterns. Such difference matrices can provide quantitative measurements of the distance between a provider's practice patterns and those of other providers or those of a consensus panel of providers.
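One plausible realization of this symbol-alignment scheme is sketched below: each combination of deranged variables maps to one integer symbol, each combination of adjusted settings maps to another, counting aligned pairs yields a provider-specific frequency table, and a simple normalized difference between two such tables stands in for the Tang/Daw-style distance the paper cites [25, 26]. The variable vocabularies, encodings, and the specific distance are my assumptions, not the authors' implementation.

from collections import Counter
from itertools import chain

DERANGEMENTS = ["low_pH", "high_plateau", "hypoxemia", "low_MAP"]
INTERVENTIONS = ["tidal_volume", "frequency", "peep", "fio2", "fluid_bolus"]

def to_symbol(active, vocabulary):
    """Encode a set of active items as a bit-pattern integer symbol."""
    return sum(1 << i for i, item in enumerate(vocabulary) if item in active)

def frequency_table(dyads):
    """Count (derangement symbol, intervention symbol) pairs for one provider
    and normalize to relative frequencies."""
    counts = Counter(
        (to_symbol(seen, DERANGEMENTS), to_symbol(done, INTERVENTIONS))
        for seen, done in dyads
    )
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def pattern_distance(table_a, table_b):
    """Normalized L1 distance between two pattern distributions (0 = identical);
    an illustrative stand-in for the symbol-statistics measures of refs 25-26."""
    keys = set(chain(table_a, table_b))
    return 0.5 * sum(abs(table_a.get(k, 0) - table_b.get(k, 0)) for k in keys)

provider = frequency_table([({"hypoxemia"}, {"fio2"}),
                            ({"hypoxemia", "low_pH"}, {"peep", "frequency"})])
expert = frequency_table([({"hypoxemia"}, {"peep", "fio2"})])
print(pattern_distance(provider, expert))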
3. Results

We studied 29 subjects ranging from trainees to experienced faculty members. Each managed 100 virtual patients; completion of the training experience took each subject approximately 1.5 hours of interaction time (1.41 ± 0.59 h). User acceptance was good, with 97% of users agreeing that the virtual patients resembled those in their daily practice and 84% indicating that they would apply the knowledge gained in this experience to their daily practice. The resulting database contained 14,503 disorder:intervention dyads. Incomplete pairs resulting from keystroke errors accounted for 0.5% and were excluded, leaving 14,426 dyads for analysis. Two subjects had keystroke errors in the last of the standardized patient panels (96-100), rendering these standardized sets not evaluable. Accordingly, performance on the standardized sets was assessed using patients 1-5 and 68-72, and performance on previously unseen patients was assessed using patients 6-31 and 42-67. When the two individuals with incomplete terminal sets were excluded from the analyses and the remaining 27 individuals were evaluated using patients 1-5 and 96-100 (standard patient panels) and patients 6 through 27 and 73 through 95 (previously unseen patients), the results of the analyses were not substantively changed. All results are corrected for multiple comparisons within the study.

Simulation-based training increases practitioner efficiency. Following simulation-based training, practitioners solved a panel of standardized patients with fewer attempts on each patient. Rates of successful solution were similar (as planned: patients were "designed" to be solvable), suggesting increased efficiency (Figure 1). For the 27-provider subset comparing performance on standardized patients 1-5 and 96-100, the corresponding corrected p-value was 0.0003.

Fig. 1. Evolution of performance on standardized patient sets. Panel A: number of attempts required to solve the standardized set before and after training; Panel B: success rates before and after training.

Fig. 2. Evolution of performance on novel patient sets. Panel A: number of attempts required to solve individual patients before and after training; Panel B: success rates before and after training.

The increased efficiency seen on standardized patient panels generalized to new patients. Following simulation-based training, practitioners solved panels of more difficult patients they had not previously encountered with fewer attempts and similar or increased success, suggesting that the increased efficiency seen in the standard panel generalized to "novel" encounters (Figure 2). For the 27-provider subset comparing performance on patients 6 through 27 and 73 through 95, the corresponding corrected p-value is 0.003.

Providers adopt more sophisticated practice patterns following simulation-based training. Following simulation-based training, practitioners implemented significantly more complex patterns of adjustment at each change in ventilator settings. "Complexity" is simply the average number of parameters changed at each step. "Weighted complexity" is the sum of setting changes at each intervention, with each setting change weighted by the number of outcome parameters that are affected by that setting. For example, frequency can affect minute ventilation, plateau pressure, oxygenation, and mean arterial pressure; changes in FiO2 only affect oxygenation. (A short computational sketch of these two metrics follows Figure 3 below.) Practitioners qualitatively changed their patterns of practice (Figure 3). For the 27-provider subset comparing performance on standardized patients 1-5 and 96-100, the corresponding corrected p-values are 0.04 and 0.12.

Fig. 3. Average complexity of interventions imposed by each user before and after training. Panel A: Average complexity; Panel B: Average weighted complexity.
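As promised above, here is a minimal sketch of the two complexity metrics. The weights for tidal volume and FiO2 follow the paper's own examples; the remaining weights, and the treatment of unknown settings, are illustrative assumptions rather than the authors' exact tabulation.

# Outcome-count weights per setting, following the paper's examples:
# tidal volume affects pH, plateau pressure, oxygenation, and blood
# pressure (weight 4); FiO2 affects oxygenation only (weight 1).
# The remaining weights are illustrative assumptions.
WEIGHTS = {"tidal_volume": 4, "frequency": 4, "peep": 3, "fio2": 1}

def average_complexity(attempts):
    """Mean number of settings changed per attempt."""
    return sum(len(changed) for changed in attempts) / len(attempts)

def weighted_complexity(attempts):
    """Mean outcome-weighted count of settings changed per attempt.
    Unknown settings default to weight 1 (an assumption)."""
    return sum(sum(WEIGHTS.get(s, 1) for s in changed)
               for changed in attempts) / len(attempts)

attempts = [{"fio2"}, {"tidal_volume", "peep"}]
print(average_complexity(attempts))   # 1.5
print(weighted_complexity(attempts))  # 4.0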
Fig. 4. Similarity of individual practice patterns to the patterns adopted by high-performing individuals. Panel A: similarity on standardized patient sets; Panel B: similarity on novel patients.

Simulation-based training leads to providers adopting higher-performance practice languages. As previously demonstrated, practice patterns during management of mechanical ventilation display language-like characteristics [20]. Accordingly, we defined a consensus "practice language" based on the patterns of the 5 most effective subjects, and compared the remainder of the subjects to these patterns before and after simulation training. The "most effective" designation was given to the subjects having the lowest values of an overall efficiency metric. Following simulation-based training, the practice patterns of the remaining subjects converged toward those of highly efficient providers, whether on a standardized panel of patients or on panels of previously unseen patients, suggesting that practitioners were learning a more efficient "language" (Figure 4). For the 27-provider subset comparing practice patterns on standardized patients 1-5 vs 96-100 and 6-27 vs 73-95, the corresponding corrected p-values are 0.02 and 0.001.

Although not a primary outcome within the analysis, within the standard sets learners demonstrated a qualitative trend toward increased efficiency across each of the different patient classes. This finding was observed when both the full data and the 27-learner data were examined.

Table 2. Attempt/failure before and after training.
Class | Average Attempts (before training) | Average Attempts (after training) | p-value (corrected)
1     | 3.6                                | 2.2                               | 0.02
2     | 5.7                                | 2.9                               | 0.01
3     | 2.8                                | 1.4                               | 0.82
4     | 4.3                                | 2.2                               | 0.23
5     | 5.2                                | 3.6                               | 0.05

A similar analysis was not possible for the novel simulated patients, as the degree of difficulty of these patients was not constant (by design). Accordingly, patients could not be "matched" for inherent difficulty for comparison before and after training.

4. Discussion

Users of this micro-simulation based training tool for mechanical ventilation increased their solution efficiency, implemented more complex patterns of intervention, and converged toward a common "expert practice pattern" as they progressed through the simulations and were confronted with a rigorously standardized testing panel of simulated patients. These changes in performance and practice pattern were mirrored by similar changes as the users confronted "novel" (not-seen-before) virtual patients. Of note, learners were exposed to a large volume of cases spanning a wide clinical range in a very short period: on average, less than 2 hours.

This work highlights unique advantages of digital human models as training tools. First, the trainee can "practice" in an environment that poses no threat to patient safety. Second, an adequate, high-volume exposure to the entire range of clinical problems can be assured. Our results suggest that, in the setting of mechanical ventilation, such exposure can be accomplished in a short time frame. The individual cases can be rigorously standardized, facilitating intra- and inter-individual comparisons. In addition, such standardization, combined with complete data capture, facilitates the definition of practice patterns and allows the implementation of practice-pattern-based assessment tools. As real-world medical care requires the apprehension of multi-element patterns of patient derangement and multicomponent provider interventions, such pattern-based competency assessment is likely more appropriate than currently employed, univariate assessments.

Properly deployed digital human models are uniquely suited for training practitioners to manage patients with complex medical conditions requiring titrated care. In addition, such models allow characterization of "expert" practice patterns, facilitating assessment of competence along multiple simultaneous axes. More sophisticated training tools, such as those including user-tailored training (where the cases focus on the user's weak points), are also readily implemented. The future is bright for such approaches.

Acknowledgements

This research was supported by NIH R01 HL084113 and AHRQ R18 HS023453-01.
References
[1] Burke, W.C., et al., Comparison of mathematical and mechanical models of pressure-controlled ventilation. Journal of Applied Physiology, 1993. 74(2): p. 922-933.
[2] Crooke, P.S., J.D. Head, and J.J. Marini, A general two-compartment model for mechanical ventilation. Mathematical and Computer Modelling, 1996. 24(7): p. 1-18.
[3] Crooke, P.S., et al., Mathematical models of passive, pressure-controlled ventilation with different resistance assumptions. Mathematical and Computer Modelling, 2003. 38(5-6): p. 495-502.
[4] Crooke, P.S. and J.J. Marini, A nonlinear mathematical model of pressure preset ventilation - Description and limiting values for key outcome variables. Mathematical Models & Methods in Applied Sciences, 1993. 3(6): p. 839-859.
[5] Hardman, J.G., et al., A physiology simulator: validation of its respiratory components and its ability to predict the patient's response to changes in mechanical ventilation. British Journal of Anaesthesia, 1998. 81(3): p. 327-332.
[6] Knorzer, A., C. Schranz, and K. Moller, Evaluation of a Model based Optimization Algorithm for Pressure Controlled Ventilation. Biomedical Engineering-Biomedizinische Technik, 2013. 58: p. 2.
[7] Kulish, V. and V. Kulish, Human Respiration: Anatomy and Physiology, Mathematical Modeling, Numerical Simulation and Applications. 2006: WIT Press, Ashurst Lodge, Southampton SO40 7AA, Ashurst, UK.
[8] Morgenstern, U. and S. Kaiser, Mathematical modeling of ventilation mechanics. International Journal of Clinical Monitoring and Computing, 1995. 12(2): p. 105-112.
[9] Schranz, C., et al., Model-based setting of inspiratory pressure and respiratory rate in pressure-controlled ventilation. Physiological Measurement, 2014. 35(3): p. 383-397.
[10] Schranz, C., et al., Model-based ventilator settings in pressure controlled ventilation. Biomedical Engineering-Biomedizinische Technik, 2013. 58(Suppl. 1).
[11] Batzel, J.J., M. Bachar, and F. Kappel, Preface, in Mathematical Modeling and Validation in Physiology: Applications to the Cardiovascular and Respiratory Systems, J.J. Batzel, M. Bachar, and F. Kappel, Editors. 2013, Springer-Verlag Berlin: Berlin. p. V-IX.
[12] Hotchkiss, J.R., P.S. Crooke, and J.J. Marini, Theoretical interactions between ventilator settings and proximal deadspace ventilation during tracheal gas insufflation. Intensive Care Medicine, 1996. 22(10): p. 1112-1119.
[13] Barsuk, J.H., et al., Unexpected collateral effects of simulation-based medical education. Academic Medicine, 2011. 86(12): p. 1513-1517.
[14] McGaghie, W.C. and P.M. Fisichella, The science of learning and medical education. Medical Education, 2014. 48(2): p. 106-108.
[15] Motola, I., et al., Simulation in healthcare education: A best evidence practical guide. AMEE Guide No. 82. Medical Teacher, 2013. 35(10): p. E1511-E1530.
[16] Rosenthal, M.E., et al., Achieving housestaff competence in emergency airway management using scenario based simulation training - Comparison of attending vs housestaff trainers. Chest, 2006. 129(6): p. 1453-1458.
[17] Schroedl, C.J., et al., Use of simulation-based education to improve resident learning and patient care in the medical intensive care unit: A randomized trial. Journal of Critical Care, 2012. 27(2): p. 7.
[18] Singer, B.D., et al., First-year residents outperform third-year residents after simulation-based education in critical care medicine. Simulation in Healthcare: Journal of the Society for Simulation in Healthcare, 2013. 8(2): p. 67-71.
[19] Paladino, J., et al., The language of caring: quantitating medical practice patterns using symbolic dynamics. Mathematical Modelling of Natural Phenomena, 2010. 5(3): p. 165-172.
[20] Paladino, J.D., et al., Medical practices display power law behaviors similar to spoken languages. BMC Medical Informatics and Decision Making, 2013. 13: p. 9.
[21] Svantesson, C., B. Drefeldt, and B. Jonson, The static pressure-volume relationship of the respiratory system determined with a computer-controlled ventilator. Clinical Physiology, 1997. 17(4): p. 419-430.
[22] Kongkul, K., S. Rattanamongkonkul, and P.S. Crooke, A multi-segment mathematical model with variable compliance for pressure controlled ventilation. ScienceAsia, 2004. 30: p. 191-203.
[23] Crooke, P.S., J.J. Marini, and J.R. Hotchkiss, A new look at the stress index for lung injury. Journal of Biological Systems, 2005. 13(3): p. 261-272.
[24] Crooke, P.S., et al., Mathematical models for pressure controlled ventilation of oleic acid-injured pigs. Mathematical Medicine and Biology, 2005. 22(1): p. 99-112.
[25] Daw, C.S., et al., Observing and modeling nonlinear dynamics in an internal combustion engine. Physical Review E, 1998. 57(3): p. 2811-2819.
[26] Tang, X.Z., et al., Symbol sequence statistics in noisy chaotic signal recognition. Physical Review E, 1995. 51(5): p. 3871-3889.

----

White Paper Report
Report ID: 107433
Application Number: HT-50070-12
Project Director: Trevor Muñoz
Institution: University of Maryland, College Park
Reporting Period: 10/1/2012-9/30/2015
Report Due: 12/31/2015
Date Submitted: 8/31/2016

Project White Paper: Digital Humanities Data Curation (HT-50070-12)
Institutes for Advanced Topics in the Digital Humanities, National Endowment for the Humanities

Project director: Trevor Muñoz, Assistant Dean for Digital Humanities Research, University Libraries, and Associate Director, Maryland Institute for Technology in the Humanities, University of Maryland
Local project directors: Julia Flanders, Head, Digital Scholarship Group, and Professor of the Practice of English, Northeastern University; Megan Senseney, Project Coordinator for Research Services, Center for Informatics Research in Science and Scholarship, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign

The Digital Humanities Data Curation (DHDC) Institute was awarded an Institutes for Advanced Topics in the Digital Humanities grant by the National Endowment for the Humanities in the amount of $248,721. DHDC was a collaborative initiative led by the Maryland Institute for Technology in the Humanities (MITH) in cooperation with the Women Writers Project at Northeastern University and the Center for Informatics Research in Science and Scholarship at the University of Illinois Graduate School of Library and Information Science (GSLIS). The institute was designed to serve as an opportunity for humanities scholars with all levels of expertise—from beginners to the most advanced—to receive guidance in understanding the role of data curation in enriching humanities research projects.
Institute workshops were intended not only to further the educational efforts of the selected participants but also to allow for the adaptation of data curation curricula, from the several degree and advanced certificate programs in research data curation within existing library and information science graduate programs, to the specific needs of the digital humanities research community. Ultimately, the goal of DHDC was to create a community of practitioners invested in humanities data curation from a range of different disciplinary communities, including digital humanities, information science, and digital libraries.

Project Activities

Activities from October 1, 2012 to September 30, 2013

Technical development on the DH Curation Guide. The DH Curation Guide (http://guide.dhcuration.org) is a community-based curricular resource, providing an annotated, contextualized listing of key resources for understanding data curation in the humanities. The Guide was first developed at the University of Illinois as part of a project dedicated to extending data curation research and curricula from library and information science graduate programs to the humanities. This earlier project was funded by the Institute of Museum and Library Services (IMLS). For DHDC, the Guide served as a course reader base and was also considered as a mode of wider dissemination for the knowledge shared within the institute events (see discussion below). During this performance period, the project team shifted the Guide's hosting from Illinois to Maryland and also transitioned to a simpler platform that will reduce the overall time expended on programming and content management. A web page announcing the first DHDC institute was also added to the existing DH Curation site (see http://dhcuration.org/institute).

Curriculum Development and Workshop Planning. Due to Carole Palmer's planned sabbatical in Spring 2013, the project team began curricular development earlier than anticipated, in Fall 2012, in order to benefit from Palmer's data curation expertise and prior experience in developing and running the IMLS-funded Summer Institutes in Data Curation from 2008–2011. Trevor Muñoz, Julia Flanders, and Dorothea Salo took the lead in planning a three-day curriculum for DHDC. Each day of the workshop comprised lecture-style presentations by session leaders, case studies, hands-on activities, and group discussions. For the first institute, Dr. Ted Underwood, Associate Professor of English at the University of Illinois Urbana-Champaign, agreed to share a local case study presenting his experiences and personal challenges addressing data curation issues related to his research, which includes running a variety of algorithmic approaches against large-scale text corpora for literary history analysis. All key planning activities for the first workshop occurred as planned and on schedule, including making local arrangements for meeting space and housing, creating an institute page on the DH Curation website, issuing a call for applications, reviewing applications, and notifying participants of their acceptance. The project team had not, however, anticipated the overwhelmingly positive response to our call for applications. In total, we received 111 applications for our first institute, which was limited to 20 participants. In addition to institute applications, another 225 people signed up to receive more information about future workshops.
The first round of applicants' demographics ranged from local to international, with 18 faculty members, 45 library and information science professionals, 7 alternative academics, 36 graduate students, and 5 others. With an 18% acceptance rate, the DHDC team was able to be highly selective in the choice of participants.

Workshop #1. The first DHDC workshop was held at the Graduate School of Library and Information Science at the University of Illinois, from June 24–26, 2013. Palmer opened the first workshop with remarks about humanities data curation in the context of the data curation education program at GSLIS. The first day of the institute consisted of an introduction to humanities data curation, an interactive exercise in which participants introduced their data as well as themselves, a discussion of the social dimensions of data, and an activity introducing participants to different approaches to data management planning. Day two began with a discussion of the nature of data and digital objects, followed by a tour of data curation systems and platforms. In the afternoon, Underwood presented the case of his own research in the digital humanities for a discussion of real-world curation issues and an exercise in problem solving. Day two closed with a discussion of the role collections play in curation activities. The third and final day of the workshop opened with a curation activity using data from the New York Public Library's "What's on the Menu" project and Google Refine. The institute concluded with a set of sessions devoted to legal and policy aspects of data curation, risk assessment and mitigation, and sustainability. The three-workshop format of DHDC allowed the project team to better understand the needs of the community, assess how workshops might better meet those needs, and revise the curriculum accordingly. To that effect, the project team rigorously documented the first institute and analyzed materials generated from workshop activities, note-taking, social media interactions, and direct feedback via an evaluation survey (see below). The project team began making in-development resources available through the DHDC Workshop Wiki on GitHub, including copies of all presentation slides and workshop notes (https://github.com/digital-humanities-data-curation/dhdc-workshop/wiki). In an effort both to foster ongoing conversation among workshop participants and to create potential avenues of participation for applicants who were not selected, the team experimented with a closed pilot of an online discussion forum (see additional discussion below).

Curriculum Revisions. In response to participants' requests for more hands-on activities, the project team made several revisions to the DHDC curriculum after the first workshop. Revisions included an introduction to depositing and retrieving items in an Islandora repository; an exercise in exploring the many functional roles and intersections of professionals who participate in data curation activities; and a brainstorming activity in which participants shared examples of personal solutions to data curation problems that they had already implemented in their own work or at their home institution. Additional curriculum revisions included developing a lecture on metadata; tweaking minor aspects of pre-existing sessions; further developing the scaffolding for previously introduced exercises; and building in more flexible time for technical troubleshooting and professional networking.

Planning for Workshop #2.
Once again, the project team was overwhelmed by the enthusiastic response to the call for applications to the second DHDC institute at the University of Maryland. In total, DHDC received 136 applications for the second institute, which could only accommodate 20 participants. The total number of subscribers to the Digital Humanities Data Curation email list by summer 2013 was over 500 people. The selected cohort for the second institute included 2 faculty, 8 library and information science professionals, 4 alternative academics, 4 graduate students, and 2 others from a range of institutions across the United States.

Activities from October 1, 2013 to September 30, 2014

Workshop #2. The second workshop was held from October 16–18, 2013, at the University of Maryland, College Park. In addition to the curriculum revisions discussed above, the instructors included additional time for advanced topic sessions based on attendees' interests. The workshop participants identified two topics, "Open Data and Data Journalism" and "Linked Data", which were covered on the last day of the workshop. Kari Kraus, Associate Professor in the College of Information Studies and the Department of English at the University of Maryland, provided a case study from her recent research on the second day of the workshop.

Curriculum Revisions, Round Two. In response to participants' enthusiasm for the attendee-driven session on linked data at the second workshop, the topic was formalized and incorporated into the DHDC curriculum. The data personae and self-profiling exercises were replaced with a birds-of-a-feather breakout session based on topics that arose during participant introductions and a mind-mapping exercise to 1) determine what data curation encompasses and 2) explore the data curation landscape. The instructors also explored a new approach to the session on data management planning, which included peer review of real data management plans and a breakout session in which participants brainstormed ways to think beyond project-based plans to customize and improve the process of planning for data management across varying scales and scenarios.

Workshop #3. Once again, the project team was overwhelmed by the enthusiastic response to the call for applications to the third and final DHDC workshop at Northeastern University. In total, the project received 112 applications for the third institute, which could only accommodate 21 participants. This round of applicants' demographics ranged from local to international, with 21 faculty members, 47 library and information science professionals, 8 alternative academics, 25 graduate students, and 11 others. With a 19% acceptance rate, DHDC was able to be highly selective in the choice of participants. The final selected cohort included 5 faculty, 8 library and information science professionals, 4 alternative academics, 3 graduate students, and 1 other from a wide range of institutions. The third workshop was held from April 30–May 2, 2014, at Northeastern University. The other significant change to the conduct of the third workshop (aside from those described above) involved the handling of the digital humanities case study. For this institute's case study, the instructors recruited representatives from two very different DH projects, both located at Northeastern University.
The case study presenters were given more thorough context for the role of the session within the broader workshop, and the two-project structure facilitated a comparative approach to participant discussions in considering how different projects may pose discrete curatorial challenges and require different approaches. Jim McGrath, Ph.D. candidate in English at Northeastern University, presented Our Marathon: The Boston Bombing Digital Archive and WBUR Oral History Project, and Elizabeth Dillon, professor of English and co-director of the NULab for Texts, Maps, and Networks at Northeastern University, presented The Early Caribbean Digital Archive.

Activities from October 1, 2014 to September 30, 2015

Additional dissemination activities. In fall 2014, the National Endowment for the Humanities approved a one-year no-cost extension for the DHDC project through September 30, 2015. The project used the extension period to continue refining the curriculum, explore additional opportunities for dissemination of project results, and develop the project's final performance reports. The third cohort of workshop participants were the first group invited to test the pilot of the discussion forum initiated after Workshop #2. Participants used the space for introductions prior to the workshop as well as for giving and receiving advice about data-related problems and for sharing tools, techniques, and resources. Ultimately, however, uptake of the discussion forums was very limited. Many digital humanities practitioners already participate in numerous online fora, from Twitter to Digital Humanities Questions and Answers to email discussion lists. The project team determined that an additional forum devoted to humanities data curation was unlikely to be sustainable, and the pilot project was closed down in Spring 2015. After an internal review and discussions with potential hosts, the project team decided likewise not to expand the content of the DH Curation Guide or invest further resources in its development. The Guide will remain online as an open access resource. The introductory article on "Humanities Data Curation" by Flanders and Muñoz has been cited 10 times since 2013, in published literature in both the humanities and information science. From anecdotal reports, the DHDC team is aware of several courses in information science graduate programs using materials from the Guide. Moreover, given the limited number of resources specifically devoted to data curation for the humanities, the Guide site often ranks highly in Google search results. However, expanding the Guide to include new content would require a substantial investment of time for soliciting submissions, managing editorial workflows, and supporting digital publication. The resources to support this investment do not currently exist at any of the potential host institutions the project team considered. The Guide, which predates the DHDC project, served as a teaching resource and was considered as an outlet for dissemination of project results. Given the decision not to invest further in the Guide as an active publication, its usefulness to DHDC for dissemination is greatly decreased. One of the most successful outcomes of this project has been sustainably embedding DHDC curricular content within recurring, self-sustaining digital humanities training initiatives.
In summer 2014, Senseney organized a workshop entitled "Data Curation and Access for the Digital Humanities" at the Digital Humanities Oxford Summer School (DHOxSS) with colleagues from the HathiTrust Research Center, the Center for Informatics Research in Science and Scholarship, the Bodleian Library, and the Oxford eResearch Center. Her sessions drew upon DHDC lectures, activities, and exercises related to data management, using Open Refine, and discussion-based case studies, with the goal of extending the impact of the DHDC curriculum to additional audiences. Senseney led a revised version of the workshop, entitled "Humanities Data: Curation, Analysis, Access and Reuse", at the 2015 Digital Humanities Oxford Summer School with a colleague from the Oxford eResearch Center. As part of the Humanities Intensive Learning + Teaching (HILT) Institute in 2015, Muñoz collaborated with Katie Rawson from the University of Pennsylvania to offer an advanced course, "Humanities Data Curation Praxis," which experimented with more tool-intensive curricular materials and with guided opportunities for participants to workshop project-specific data curation strategies in a small-group setting. At several points during the no-cost extension period, team members shared insights from DHDC through invited talks and guest appearances in other training workshops. For example, Muñoz led mini-workshops on humanities data curation for the Center for Digital Humanities at Princeton University (April 2015), as part of the Early Modern Digital Agendas Institute (EMDA) held by the Folger Institute (June 2015), and as part of the "bootcamp" for postdoctoral fellows (July 2015) supported by the Council on Library and Information Resources (CLIR). These additional opportunities for dissemination introduced additional audiences to data curation principles, tools, and methods.

Accomplishments

The DHDC project succeeded in introducing over 60 humanities scholars, librarians and other information professionals, graduate students, and technologists to the basic concepts of data curation, as well as to several useful tools and methods for maintaining the value of humanities research over time, through three in-person workshops hosted at the University of Illinois, Urbana-Champaign, the University of Maryland, and Northeastern University. Furthermore, the large response to the calls for participation in these workshops—far beyond what could be accommodated—identified a clear need for additional training and activities focused on data curation for the humanities.

Audiences

One of the stated goals of the DHDC project was to address a distinct shortage of focused training opportunities for working professionals that address the discipline-specific data curation training needs of digital humanities scholars and the librarians or other information specialists who collaborate closely with digital humanists. The selected participants reflected this goal. Library and information professionals represented the largest group of participants, likely reflecting their higher awareness of data curation issues. Graduate students and young scholars represented the next largest cohort, followed by faculty members, whose time and participation are often harder to secure. The DHDC institute also served a number of unaffiliated researchers, representatives of federal government agencies and professional organizations, and one member of local government.
Evaluation

The DHDC project conducted an evaluation survey after each of the three institute workshops and used the results of these surveys to adjust and improve the curriculum. Eleven out of 20 participants (55%) responded to the post-institute evaluation survey for the first workshop. All respondents considered themselves as having beginner-level or moderate experience with data curation. Participants generally agreed or strongly agreed that they gained a greater understanding of data curation in the context of digital humanities research (90.9%) and a greater understanding of how to make data curation decisions related to creating, organizing, using, and preserving digital content (72.7%). However, only 45.5% of respondents agreed or strongly agreed that they had gained a better understanding of tools for comparison and analysis of humanities data, indicating an area for improvement when reviewing the curriculum for the second institute. Responses to open-ended questions expressed enthusiasm for the institute's instructors and overall framework ("Loved Trevor's framework and overall themes for humanities data."; "I love the presenters! They did an awesome job for a new subject that has very few hard and fast rules") while also recommending the addition of more hands-on activities ("I would focus less on data curation theory and incorporate more 'real life' examples into the workshop."; "I wish we could use 1-2 tools with our own data to see how they work with some guided practice.").

Eleven out of 20 participants (55%) responded to the post-institute evaluation survey for the second workshop. Of those who responded, 72% considered themselves as having beginner-level or moderate experience with data curation. This represented an increase in experienced or expert-level participants from the first to the second institute, a decision that the workshop organizers made while selecting workshop participants with the hope that such participants would help seed and enliven workshop discussions. When asked to evaluate individual sessions, participants identified the newly developed metadata lecture as most valuable, with 90% (n = 10) of respondents ranking the lecture as "Very valuable", the highest rank on a 5-point Likert scale. Other well-received sessions included an introductory lecture on conceptual frameworks for humanities data curation, a lecture on understanding the nature of digital objects, and the newly developed exercise on "sharing what works", with 80% (n = 10) of respondents ranking these sessions as "Very valuable". Participants were least enthusiastic about a data personae exercise, a self-profiling data exercise, and the case study presentation. Upon reflection, the workshop organizers agreed that the case study required more advance consultation with the presenter and more session scaffolding for the participants. The other two exercises were new to the DHDC workshops and were not used again. Overall, participants were pleased with the general balance of lectures, hands-on work with tools and technologies, and small-group exercises, with a general preference for slightly more emphasis on hands-on work. In open-ended responses to a question seeking additional comments about the workshop curriculum, two respondents highlighted a preference for a more in-depth and comprehensive session on linked data.
Topics recommended for future DHDC workshops included:
● More time spent on useful curation tools and more time discussing the common data types that are outcomes of DH projects;
● Add-ins and plug-ins that enhance the curatorial function of commonly used applications, mind map tools, and workflow systems and frameworks;
● Metadata schemas for specific fields and disciplines;
● Case studies more tightly aligned with representative DH projects; and
● A dedicated session on linked open data.

All 10 respondents indicated that they were either very satisfied (70%) or satisfied (30%) with the training received at the DHDC workshop, and 80% of respondents confirmed that they would definitely recommend DHDC to a colleague.

Eleven out of 21 participants (52%) responded to the post-institute evaluation survey for the third workshop—consistent with the rate from the previous two events. Of those who responded, 27% self-identified as experienced or expert in data curation, compared with 73% with beginner-level or moderate experience. This demographic spread is consistent with the second institute and informed by experiences from the first institute, after which the workshop organizers chose to include a larger selection of more experienced data curators with the goal of seeding and enlivening workshop discussions across the board. When asked to evaluate individual sessions, participants identified the lecture introducing data management plans as the most valuable, with 100% (n = 9) of respondents ranking the lecture as "Very valuable", the highest rank on a 5-point Likert scale. Other well-received sessions included a lecture on understanding the nature of digital objects, an exercise based on identifying the significant properties of the NYPL "What's on the Menu?" dataset, and a lecture on metadata and linked data, with 78% (n = 9) of respondents ranking these sessions as "Very valuable". Participants were least enthusiastic about the hands-on Islandora repository exercise, the two case studies, and a group exercise on customizing and improving plans for data management. Upon reflection, these responses (combined with lessons learned from previous institutes) underscore the challenges associated with case-based approaches to data curation instruction and the additional scaffolding required for successfully negotiating hands-on sessions with complex tools intended to be used as part of routine institutional (rather than project-based) practice. Also, the tension between data management as a core component of technical projects and the data management plan as a conceptual exercise and prerequisite for funding remained a challenging balance to get right in a workshop setting. Overall, participants were pleased with the general balance of lectures, hands-on work with tools and technologies, and small-group exercises, with a general preference for slightly more emphasis on hands-on work with tools and technologies over small-group exercises and discussions.
In open-ended responses to a question seeking additional comments about the workshop curriculum, one respondent requested "more explanation of how the different layers of information architecture work together." Other comments touched upon the community-building aspects of the program with varying degrees of enthusiasm, ranging from "Great training and teaching methods and formats for engagement, group ownership, learning and community building" to "it was difficult to find a balance in groups between it being broad enough to involve everyone but specific enough to be useful."

There were fewer recommendations for future topics from the third workshop cohort than from previous cohorts; however, two themes emerged: data management plans and collected best practices. Two respondents suggested an exercise in which participants draft new data management plans. Notably, an analogous session was an early component of the institute curriculum, but the organizers evolved in a different direction based on feedback from prior institutes. This divergent response from the third workshop might be a consequence of including more experienced practitioners (who felt ready to tackle writing data management plans). Another possible explanation could be increased awareness of the data management plan requirements from NEH and other funders. Three respondents requested more emphasis on best practices, suggesting possible future directions as we prepare online resources for the digital humanities community. Nearly all respondents indicated that they were either very satisfied (56%) or satisfied (33%) with the training received at the DHDC workshop, and 67% of respondents confirmed that they would definitely recommend DHDC to a colleague. Final comments from respondents included:
● "I had a great time and I feel I learned a lot. That metaphor for 'seeing' things differently and being able to apply what we have learned to our future research is a good lesson for all. I feel I will approach my datasets in a more nuanced and intelligent way in the future. Thank you! Great organization and material."
● "I really appreciated the broad mix of professions and disciplines represented. That was of enormous value to me in understanding the broader concepts and how they relate to one another."

Taken together, these evaluations suggest a number of conclusions about the state of data curation knowledge in the humanities and about the future needs of the field. One intervention the project team expected to make was to respond directly to the new data management plan requirements for projects funded by NEH and other funders. In most of the workshops, participants were grappling with data curation theory as well as with the large number of participants, institutions, and other actors involved in managing research information over time. Thus, it seemed as though participants struggled to feel sufficient mastery of these topics to effectively distill their curation strategies into the artefact (a two-page plan) required by funding applications. Iterations of the data management planning sessions that reduced the engagement with specific funder guidelines in favor of a wider exploration of the expectations and possibilities for data management planning seemed more successful, based on participant evaluations. Also, participants felt more confident workshopping existing data management plans rather than generating new plans—suggesting the value of model and sample plans.
Topics such as linked data were introduced in response to participant feedback. Other topics, such as risk assessment, were introduced earlier in the workshop schedule to provide context but remained relatively unchanged, while yet other topics, such as sustainability, were increasingly truncated (probably due to the need for more extensive treatment than time allowed). Guided discussion and small-group work grew over the series of workshops, while individual work and lectures were reduced. Ultimately, the evaluations trace the project team's experimentation with different potential elements of humanities data curation training—a process which would probably continue in any future extension to the DHDC project.

Continuation of the Project

The project team discussed two possible continuations of the DHDC project. First, as the number of people working in the area of humanities data curation grows, it might be valuable to convene a summit meeting—or even another full institute—focused on curriculum and pedagogy for humanities data curation. The explicit aim of this activity would be to develop and harmonize training opportunities that cross the boundaries of formal graduate training, either in the humanities or in information science. In other words, what might a coherent approach to teaching data curation look like, both as modules or units within larger curricula and as extracurricular or in-service training outside formal degree programs? Second, because of the humanities' particular engagement with the study of underrepresented communities, a data curation institute series which incorporated standards for managing traditional cultural knowledge (such as the practices developed by the Mukurtu project) or which incorporated a particular focus on disciplines such as Black Studies, Latinx Studies, or LGBTQ studies might be a valuable continuation of the DHDC work. DHDC project team members expect to remain active in promoting and teaching humanities data curation; however, the team has no current plans to pursue additional activities.

Long Term Impact

The DHDC project will have an impact beyond the period of performance in two ways: through the cohort of workshop participants and through the development of a pool of experienced teachers of humanities data curation. Informally, participants have reported using the concepts and tools learned in DHDC workshops in their digital humanities scholarship and teaching. These reports often mention two specific components. First, participants report relying on the definition of data developed through the workshops. This definition of data—as information serving in the role of evidence for knowledge claims—resonates well, they report, with other humanities scholars, even those who might be sceptical of digital tools and methods. Second, Open Refine, as a "power tool" for data inspection and reconciliation, remains popular with many workshop attendees. Through the incorporation of humanities data curation courses into ongoing training activities such as the Digital Humanities Oxford Summer School and HILT, DHDC team members have partnered with other instructors to continue offering training opportunities based on the institute curriculum. In this way, the pool of instructors and leaders for humanities data curation activities continues to grow.

Grant Products

The project maintains a website at http://www.dhcuration.org, which includes both the DH Curation Guide and documentation of DHDC workshops through schedules and slide decks.
The project team presented on preliminary findings from DHDC at the Digital Humanities 2014 conference: Senseney, M., Muñoz, T., Flanders, J., & Fenlon, A. (2014). Digital Humanities Data Curation Institutes: Challenges and preliminary findings. Poster presented at Digital Humanities, Lausanne, Switzerland, July 8-12, 2014.

work_cxgrjqjaergz3gbuaypp2wyhea ---- Slides: Monica Berti (Universität Leipzig), "Scienze dell'Antichità e Digital Humanities: buone pratiche, frontiere teoriche e prospettive di ricerca" (Ancient studies and digital humanities: good practices, theoretical frontiers, and research perspectives), Università degli Studi di Roma Tre, via the Teams platform, 10 March 2021. Theme: classical studies and Linked Ancient World Data. Outline: 1. Open Data of Ancient Greek and Latin Sources; 2. Catalogs and Citation Systems; 3. Data Entry and Analysis for Classical Philology; 4. Critical Editions and Annotations; 5. Linguistic Resources. [Slide deck condensed: only titles, key points, and links are retained.]

Section 1, Open Data of Ancient Greek and Latin Sources: the slides survey digital collections and corpora of Greek and Latin (the Perseus Scaife Viewer, Open Greek and Latin, the TLG, the Digital Latin Library, the PHI texts, papyri.info, Musisque Deoque, the Bibliotheca Teubneriana Latina, the digital Loeb Classical Library, Oxford Scholarly Editions, Brill, the Digital Corpus of Literary Papyri, EpiDoc, digilibLT, and Brepols), noting a wide variety of digital collections and corpora, different models of accessibility, and different business models, and asking "Open Access?". Open data is framed via the W3C Semantic Web standards (https://www.w3.org/standards/semanticweb/data, https://www.w3.org/wiki/LinkedData) and the Linked Open Data Cloud (https://lod-cloud.net/). For Linked Ancient World Data the slides cite Hugh A. Cayless, "Sustaining Linked Ancient World Data," in Digital Classical Philology (De Gruyter Saur, 2019, https://doi.org/10.1515/9783110599572-004) together with ISAW Papers 7 and 20 (http://dlib.nyu.edu/awdl/isaw/isaw-papers/7/, http://dlib.nyu.edu/awdl/isaw/isaw-papers/20/), and point to Pleiades (https://pleiades.stoa.org/), papyri.info, Trismegistos (https://www.trismegistos.org/), Open Context (https://opencontext.org/), and Nomisma (http://nomisma.org/). A Pleiades place record is cited as: {{cite web |url=https://pleiades.stoa.org/places/727070 |title=Places: 727070 (Alexandria) |author=Bernand, A. |accessdate=November 6, 2019 7:36 am |publisher=Pleiades}}
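The Pleiades URL in this citation follows a stable, predictable pattern, which is what makes such records easy to cite and link programmatically. As a small illustration (our own sketch, not from the slides; the helper names are invented), the following Python snippet builds the place URI and a citation string of the kind quoted above:

```python
# Minimal sketch: build a stable Pleiades place URI and a MediaWiki-style
# citation string following the pattern shown above. Illustrative only.

def pleiades_uri(place_id: int) -> str:
    """Stable URI of a Pleiades place record."""
    return f"https://pleiades.stoa.org/places/{place_id}"

def cite_web(place_id: int, title: str, author: str, accessdate: str) -> str:
    """Render a {{cite web}} template like the one quoted above."""
    return ("{{cite web |url=" + pleiades_uri(place_id) +
            f" |title={title} |author={author}"
            f" |accessdate={accessdate} |publisher=Pleiades}}}}")

print(cite_web(727070, "Places: 727070 (Alexandria)", "Bernand, A.",
               "November 6, 2019 7:36 am"))
```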
One papyrus is shown with its multiple linked identifiers: Trismegistos text 10775 (https://www.trismegistos.org/text/10775) corresponds to the papyri.info records http://papyri.info/hgv/10775/source and http://papyri.info/apis/columbia.apis.p387/source (with images at http://papyri.info/apis/columbia.apis.p387/images) and to the DDbDP text http://papyri.info/ddbdp/p.fay;;110. Further slides show an Open Context project, data record, and media item (e.g. https://opencontext.org/projects/3DE4CD9C-259E-4C14-9B03-8B10454BA66E) and the Nomisma identifier for Ephesus (http://nomisma.org/id/ephesus). A slide on digital libraries sketches the generations of digital corpora in classics: make texts available online; professional data entry and a consistent markup scheme; semantic markup in SGML/XML (TEI); image-front collections with page images; decentralized contributions from users (million book libraries); multiple editions of primary sources; annotations; and more (https://www.clir.org/pubs/reports/pub150/).

Section 2, Catalogs and Citation Systems, opens with the lapidarium of Tauromenium, fragments of a painted library catalog of the 3rd-2nd century BC (http://sicily.classics.ox.ac.uk/inscription/ISic0613), tying canons, catalogs, and citation systems to philology. It then turns to the TLG Canon of Greek Authors and Works (http://stephanus.tlg.uci.edu/canon.php), in which, for example, Xenophon is identified as {tlg0032}.
Within the TLG Canon, passages carry hierarchical identifiers such as {tlg0032.001.1.1.1}, further canon lookups are shown (e.g. Aeschylus, {tlg0085}), and the texts are morphologically analyzed with the Morpheus parser (https://dl.acm.org/doi/10.3115/992567.992595). The Perseus Catalog (http://catalog.perseus.org) expresses the same work as the CTS URN urn:cts:greekLit:tlg0032.tlg001, with MODS records maintained on GitHub (https://github.com/PerseusDL/catalog_data/blob/master/mods/greekLit/tlg0032/tlg001/opp-grc2/tlg0032.tlg001.opp-grc2.mods1.xml). CTS URNs can address a text down to the word level, e.g. urn:cts:greekLit:tlg0032.tlg001.perseus-grc2:1.1.1@Ἀγησανδρίδου[1], and such passages can be read in the Scaife Viewer (https://scaife.perseus.org/reader/urn:cts:greekLit:tlg0032.tlg001.perseus-grc2:1.1.1-1.1.5/).

Section 3, Data Entry and Analysis for Classical Philology, covers optical character recognition with Lace (http://heml.mta.ca/lace/index.html); character encoding, from Beta Code to Unicode (https://unicode.org/charts/); the Marmor Parium as a worked encoding example, whose acrophonic numerals (Χ = 1000, Η = 100, Δ = 10, 𐅃 = 5, Ι = 1) yield the year count 1268, i.e. 1531/30 BC (https://epigraphy.packhum.org/text/77668, http://digitalmarmorparium.org, http://www.dfhg-project.org); the Classical Language Toolkit (http://cltk.org/); text alignment in the Digital Athenaeus (http://www.digitalathenaeus.org); and text reuse detection in the Digital Fragmenta Historicorum Graecorum project (http://www.dfhg-project.org).

Section 4, Critical Editions and Annotations, presents the Homer Multitext project and its image service (http://www.homermultitext.org/ict2/?urn=urn:cite2:hmt:vaimg.2017a:VA012RN_0013) for Marcianus Graecus Z. 454 (= 822), the Venetus A manuscript of the 10th century, together with a collaborative environment and interface.
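To make the structure of the CTS URNs quoted above concrete (the Homer Multitext image link likewise uses a CITE2 URN), here is a small Python sketch, our own illustration rather than anything from the slides, that splits a CTS URN into its conventional components:

```python
# Minimal sketch: decompose a CTS URN into its parts.
# CTS URNs have the shape
#   urn:cts:<namespace>:<textgroup>.<work>.<version>:<passage>
# The dataclass below is illustrative, not an official library.

from dataclasses import dataclass

@dataclass
class CtsUrn:
    namespace: str   # e.g. "greekLit"
    textgroup: str   # e.g. "tlg0032" (Xenophon)
    work: str        # e.g. "tlg001" (Hellenica)
    version: str     # e.g. "perseus-grc2"
    passage: str     # e.g. "1.1.1"

def parse_cts_urn(urn: str) -> CtsUrn:
    parts = urn.split(":")
    if parts[:2] != ["urn", "cts"] or len(parts) < 4:
        raise ValueError(f"not a CTS URN: {urn}")
    namespace, work_part = parts[2], parts[3]
    passage = parts[4] if len(parts) > 4 else ""
    pieces = work_part.split(".")
    # pad so that textgroup-only or versionless URNs still parse
    pieces += [""] * (3 - len(pieces))
    return CtsUrn(namespace, pieces[0], pieces[1], pieces[2], passage)

print(parse_cts_urn("urn:cts:greekLit:tlg0032.tlg001.perseus-grc2:1.1.1"))
```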
Semantic annotation of the Digital Athenaeus is demonstrated in the INCEpTION platform (https://inception-project.github.io/use-cases/digital-athenaeus/).

Section 5, Linguistic Resources, covers morpho-syntactic annotation with the Ancient Greek and Latin Dependency Treebanks (https://perseusdl.github.io/treebank_data/; Greek guidelines at https://github.com/PerseusDL/treebank_data/blob/master/AGDT2/guidelines/Greek_guidelines.md; annotation interface at https://www.perseids.org/tools/arethusa/app/) and knowledge bases such as LiLa: Linking Latin (https://lila-erc.eu). The deck closes with the volume Digital Classical Philology (https://doi.org/10.1515/9783110599572) and contact details: Monica Berti, monica.berti@uni-leipzig.de.

work_cztsfgdu5bcbfdnfrhryslcu6a ---- Review: Ellas tienen nombre. Reviews in Digital Humanities. Reviewer: Sylvia Fernández Quintanilla, University of Kansas. Published on: Sep 08, 2020. Updated on: Sep 07, 2020. DOI: 10.21428/3e88f64f.bc2e340a. License: Creative Commons Attribution 4.0 International License (CC-BY 4.0), https://creativecommons.org/licenses/by/4.0/.
Project: Ellas tienen nombre: Cartografía digital de feminicidios. Project Director: Ivonne Ramírez Ramírez, Independent Scholar. Project URL: https://www.ellastienennombre.org/. Project Reviewer: Sylvia Fernández Quintanilla, University of Kansas. Translator: Sylvia Fernández Quintanilla, University of Kansas.

Project Overview (Ivonne Ramírez Ramírez)
Ellas Tienen Nombre is a feminist mapping project using information made available in March 2015, within the framework of International Women's Day. The objective is to focus on the victims of femi(ni)cide in Ciudad Juárez, Mexico, monitoring and georeferencing the places where the bodies of murdered girls and women have been found, as well as breaking down the specificities and data of each of the cases. My feminist perspective of the project has varied because it is a continuous learning process. That is why the map has gone through various iterations, visualizations, methodologies, and theoretical and even ethical dilemmas since then; although I have some formal and informal training in feminist studies, I do not specialize in digital humanities. For example, because Google shut down its Fusion Tables, we have yet to update the last months of 2019 and the year to date, so we are figuring out how to implement a new technical infrastructure. Due to its bibliographic sources and theoretical-methodological issues, the information in this digital mapping consists of two phases: The first phase corresponds to 1993-2014. For the data on femicides committed in these years, I relied on files from the Prosecutor's Office and other government agencies, files of non-governmental social organizations, the press, books dedicated to the subject, and information collected by women activists, feminists, and writers, all of them specifically from the region.
Without criticizing the work I base my research on, this previous information, written in published bibliographies, made the data on femicides difficult for many people to consult, or perhaps made the process of understanding the data more complex and somehow inaccessible for regular people. Ellas Tienen Nombre collects the data and information that was already there and makes them available by taking advantage of the possibilities and tools provided by the Internet. The second phase comprises 2015 to date. For this part, I monitor the digital and written press (mostly digital) daily, as well as social networks. I have also received emails where women share with me information about feminicides that I did not have registered on the map. Likewise, I am trying to use respectful photos of the victims when they were alive to identify each of the cases, or, if I do not obtain them, non-explicit photos of the victims or of the crime scene are included. Stressing that each of these cases has its specificities and that there are girls and women more vulnerable than others, this map tries to show that feminicides are extreme violence that affects all social classes, ages, and geographic spaces, since the map visually denounces in the first instance that, throughout the city, men have killed girls, women, and transgender women. The project tries to be a complementary instrument that communicates the gravity of violence against girls and women. It seems that the visual impact of the map and photos is different from sharing information through statistics with graphs. Through the visualization of feminicides on the map, we could also imagine, investigate, and delve into how the context, public policies, social situation, political regime, oppressive systems, culture, etc., impact people (in this case, girls and women) who live in particular geopolitical areas. Two people collaborate on this project: ● MA Ivonne Ramírez (Ciudad Juárez 1980- ). Cultural and feminist activist. Head of the project. ● Gustavo Ramírez (Ciudad de México 1979- ). Software engineer. He works on the code that allows me to have a fixed template that is used to fill out our database and for the final visualization (back end and front end). The project arises from the idea that this information should be available to all types of audiences interested in the subject and with internet access. Due to a political-ideological stance related to the historical context of Ciudad Juárez regarding feminicides, so far I have not sought and have not wanted to receive funds or financial support for this project. This may change at some point.

Project Review (Sylvia Fernández Quintanilla)
In the preface of Terrorizing Women, "Feminist Keys to Understand Feminicide," Marcela Lagarde de los Ríos defines feminicidal violence as "the extreme, the culmination of many forms of gender violence against women that represent an attack on their human rights and that lead them to various forms of violent death" (xxi). She defines feminicide as "one of the extreme forms of gender violence; it is constituted by the whole set of violent misogynist acts against women that involve a violation of their human rights, represent an attack on their safety, and endangered their lives" (xxiii). Ellas tienen nombre is a feminist digital mapping project that records feminicidal violence and feminicide from 1993 to the present in the border city of Cd. Juárez, Chihuahua. It is a project that imagines, investigates, and delves into feminicides through a geopolitical context, a border region governed by national, cross-border, bi-national, and transnational dynamics. It was founded by M.A. Ivonne Ramírez to counter the lack of publicly available and accessible information for the different publics interested in the subject. Ramírez began this project in March 2015 by collecting and visualizing data about the missing girls and women in the area of the border that she is from. The project consists of identifying specific data of each of the cases and georeferencing the places where the bodies have been found.
In addition to the digital map, the project includes case studies that show the relationships among some of the femicides committed, displaying images and specific information about girls and women. An example is the registry of lesbian or bisexual youth disappeared and murdered in 2019. This initiative is supported independently and autonomously, as a result of the political-ideological stance related to feminicide in this region. Ellas tienen nombre speaks to the concerns of Lagarde de los Ríos, who explains: "Authorities have failed to disclose any information on their inquiries or have done so only in a partial, incomplete, and confused manner…The authorities contradict themselves almost all the time. There is no certainty in many cases that the victims correspond to unidentified corpses" (xiv). Since the first official records of femicides committed in this border region, an array of authorities and political representatives have dealt with these cases in a negative way. They present reports with misogynistic and incongruous overtones, which then transfers to the ways of working with this data. From 1993 to 2014, the project's data consists of official information from government agencies, non-governmental organizations, civil associations, and newspapers, as well as information that activists, writers, women, and feminists previously collected. From 2015 to date, data has been monitored and recorded daily by Ramírez through the digital and analog press and on social networks, and through reports that have been sent to her directly via email. The technical aspects of the project have gone through various iterations as Ramírez evolves in her own learning of new methodologies, theoretical and ethical aspects, and new digital technologies. Currently, Ing. Gustavo Ramírez oversees the code for the database and creates the visualizations. The project is hosted in a portal created in HTML and is made up of 5 sections: "Projecto," a description in bilingual format (Spanish and English); "Mapa," the digital cartography created with Mapbox; "2019," a series of visualizations with images and information on some of the particular cases of murdered girls and women; "Contacto"; and "Enlaces Relacionados," a list of other digital resources on the Cd. Juárez cases and digital maps of femicides in other parts of Mexico and the world. Ellas tienen nombre contributes to feminist practices and social justice work via the community of mothers and activists who continue to search for their daughters. They participate by doing crawls in the desert searching for the remains of their children. The use of local cartography exposes the modus operandi of the border femicides. The map shows the complexity of the most serious cases of gender violence, allowing users to filter the data as a group, individually, and in geospatial and temporal ways. This approach reflects Ramírez's feminist ethic by highlighting the subjectivity of these women; it avoids presenting them as victims, objects, or disposable bodies. The limitations of this approach, though, are tied to its relationship to the physical and digital record, still controlled by hegemonic-patriarchal powers that intervene in what information is available. Likewise, the data absences/silences show how the structural-systematic problem prevails at the institutional and social level, limiting access or erasing the evidence.
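To make the underlying data model tangible, here is a minimal, hypothetical Python sketch, not taken from the project's actual code, that encodes one case as a GeoJSON point feature and filters a collection by year, in the spirit of the grouped, geospatial, and temporal filtering the review describes; all property names are invented for the example.

```python
import json

# Hypothetical GeoJSON encoding of one case; the property names are
# invented for illustration and are not the project's actual schema.
case = {
    "type": "Feature",
    "geometry": {"type": "Point",
                 "coordinates": [-106.487, 31.690]},  # lon, lat in Cd. Juárez
    "properties": {"name": "N.N.", "year": 2019, "source": "press"},
}

collection = {"type": "FeatureCollection", "features": [case]}

def filter_by_year(fc: dict, year: int) -> dict:
    """Keep only the features whose 'year' property matches."""
    feats = [f for f in fc["features"] if f["properties"].get("year") == year]
    return {"type": "FeatureCollection", "features": feats}

print(json.dumps(filter_by_year(collection, 2019), ensure_ascii=False, indent=2))
```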
In reviewing this project, it is important to remind readers of the strong influences and capitalist interventions coming from the United States and other parts of the world towards the Mexican side of the border, which are among the main causes of gender and feminicidal violence in this region. With this in mind, this project could include data on girls and women, victims of feminicides, that have occurred on the US side of the border, and indicate when feminicides are related to the United States or to an institution that is part of this bi-national/transnational system. This would allow a deeper understanding of the postcolonial reality of the border, where hegemonic powers of both countries and of third parties rule, and women must navigate these different systems and patriarchal cultures. Overall, this feminist project fulfills its mission by creating digital resources to make the data of girls and women, victims of femicides in Cd. Juárez, available and accessible. Likewise, it has generated greater awareness of the situation faced by border women, as well as the need to share this information with the general public, since it is part of the present and historical memory of the region. In this vein, artists, academics from various parts of the world, activists, and the media have used this project to create new artistic initiatives, testify to authorities, and advocate for justice and equity policies. It should be noted that the unpaid effort and time that women like Ramírez or María Salgado (author of the Digital Map of Feminicides in México, https://feminicidiosmx.crowdmap.com/) have given to these types of independent projects is crucial humanitarian work that deserves more recognition and support, to maintain digital material that works towards the representation of voices of protest, demands justice, and mobilizes solidarity.

Bibliography
Lagarde de los Ríos, Marcela. Preface. "Feminist Keys for Understanding Feminicide: Theoretical, Political, and Legal Construction." Terrorizing Women: Feminicide in the Americas, Rosa Linda Fregoso and Cynthia Bejarano, editors, Duke University Press, 2009, xi-xxvi.

work_d2dvfrm6pngktajsnvoi5nawmm ---- Umanistica Digitale - ISSN:2532-8816 - n.8, 2020. A. Hoenen, C. Koc, M. D. Rahn – A Manual for Web Corpus Crawling of Low Resource Languages. DOI: http://doi.org/10.6092/issn.2532-8816/9931

A Manual for Web Corpus Crawling of Low Resource Languages. Armin Hoenen, Cemre Koc, Marc Daniel Rahn, Goethe-Universität Frankfurt, Germany (hoenen@em.uni-frankfurt.de, cem_koc@icloud.com, marc.rahn@venturerebels.de)

Abstract
Since the seminal publication of "Web as Corpus", the potential of creating corpora from the web has been realized for the creation of both online and offline corpora: noisy vs. clean, balanced vs. convenient, annotated vs. raw, small vs. big are only some antonyms that can be used to describe the range of possible corpora that can be and have been created. In our case, in the wake of the project Under Resourced Language Content Finder (URLCoFi), we describe a systematic approach to the compilation of corpora for low (or under) resource(d) languages (LRL) from the web in connection with a free eLearning course funded by studiumdigitale at Goethe University, Frankfurt. Despite the ease of retrieval of documents from the web, some characteristics of the digital medium introduce certain difficulties.
For instance, if someone was to collect all documents on the web in a certain language, firstly, the collection could only be a snapshot, since the web constantly changes content, and secondly, there would be no way to ascertain completeness. In this paper, we show ways to deal with such difficulties in search scenarios for LRLs, presenting experiences springing from a course about this topic.

Pedagogical framework
The motto of the AIUCD conference in Udine 2019 was "Pedagogy, teaching, and research in the age of Digital Humanities", and we presented there an abstract describing the concept of an eLearning-based course given at Goethe University Frankfurt in summer 2019. The aim of this course was to enable students of linguistics, especially those studying smaller languages, to compile their own corpora from the web for use in essays or theses on LRLs (including Low Resource Genres). We believe this course trains an ability which should be taught more widely in universities in the age of Digital Humanities, namely as a key competence in all areas: Sophisticated Web Search. As such, the course specializes in LRLs, which have a peculiar situation where it may be challenging to find their content among masses of content in larger languages, in closely related and similar sister languages, and with restricted ranges of formats and topics. This paper gives a general guideline, with recourse to the experiences from many LRL search scenarios exercised within the course. The course itself is publicly available with eLearning lectures via https://lernbar.uni-frankfurt.de/ (Digital Humanities >> URLCoFi).

Introduction - LRL in place of a definition
LRLs are not unanimously defined ([15]) but rather characterized by the term. Few language resources are typically available for those languages. These resources are subject to debate and change over time. One basic component is (natural) written text in a language. For an attempt at defining which resources are basic, see both [15] and [14]. For our purposes, we understand LRLs primarily as such languages for which the compilation of text corpora from the internet is difficult because of a lack of such texts or a reduced accessibility.
We assume a certain correlation with the lack of other resources for those languages. A further subdivision, however, can be made by certain (non-exhaustive) characteristics of an LRL:
1. LRLs with large speaker numbers; these numbers determine a positive prospect for growth of the resources and also a high fluidity of contents.
2. LRLs with few speakers; typically, those languages are either endangered or threatened to become endangered; prospects for growth of resources are worse than above.
3. LRLs of largely oral or hunter-gatherer populations; often the typical usage or domains of internet use can be different from more literate communities, compare for instance [4], [8]; often these communities have few speakers and presumably some typical grammatical or lexical properties ([6]).
4. historical LRLs; obviously natural language content does not grow in most such cases.
For a search on the internet, other characteristics are likewise important. For instance, an LRL may have its own exclusive writing system (very rare but existent), like the Yi in China, in which case finding content is not rendered more difficult by masses of other documents in the same writing system. Before we look at such characteristics and their implications for searching and querying in more detail, we introduce the general LRL situation on the web and then conduct a first preparatory step for corpus compilation by using some characteristics to define which kinds of contents and languages are most likely to come up unintendedly together with, or on top of, a particular LRL in our focus. The scenario we thus focus on is one where we search as much content on the web for any LRL in our focus as possible.

General characteristics of the web and their implications for LRLs
Why is finding LRL content on the web hard at all? Or better, what makes it difficult? According to different statistics, more than 75% of the web is constituted by the 10 largest languages. Now, according to authoritative linguistic resources such as ethnologue.com, there are more than 7,000 languages in the world. By numbers of native speakers, these same 10 languages constitute only roughly 36% of the world's population. This leads to an imbalance where the internet is not reflecting the world's population's native language composition but exhibits a clear skew towards some larger languages (of course, adding to this imbalance are the numbers of acquired second languages). For a variety of reasons, this situation with the majority of content in so few languages can be a hindrance for page retrieval in LRLs. Consider, for instance, that nowadays webpages are often technically realized as instances of so-called Content Management Systems (CMS). Those offer an infrastructure where precomposed menu items exist which are not always customizable or, if they are, fewer (smaller) languages may be available. In consequence, much content of LRLs is forcedly mixed with menu items (and other marginal content) in one of the larger languages of the web, which, to a certain extent, prohibits the otherwise often effective use of operators to exclude certain terms of larger languages (the - operator on Google). Search engines ideally want to produce the most relevant results quickly. Since the largest languages are per se, by being large, statistically most relevant, this may lead to frequent irrelevant results when searching for LRL contents.
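As a concrete illustration of such exclusion queries (our own sketch, not from the paper; all term lists are invented placeholders), the following Python snippet assembles a query string combining quoted target-language terms with minus-operator exclusions. As the passage above notes, CMS boilerplate in larger languages limits how safely such exclusions can be applied, so the exclusion list should be kept short and chosen with care.

```python
# Minimal sketch: build a web-search query that pairs target-LRL terms
# with exclusion (minus) operators for frequent distractor-language words.
# The term lists are invented placeholders, not curated data.

def build_query(target_terms, distractor_terms, max_exclusions=5):
    include = " ".join(f'"{t}"' for t in target_terms)
    exclude = " ".join(f"-{t}" for t in distractor_terms[:max_exclusions])
    return f"{include} {exclude}".strip()

# e.g. Basque target words, excluding frequent Italian function words
# that overlap orthographically or appear in CMS boilerplate.
print(build_query(["bere", "etxea"], ["il", "di", "che", "per", "con"]))
# "bere" "etxea" -il -di -che -per -con
```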
There is also accidental orthographic overlap between LRLs and larger languages. Especially short words with frequent letters (and thus frequent words) tend to be good candidates for this, which is probably unfortunate, since search engines do not generate results on the basis of the current web page but on the basis of a so-called index, a database where they usually (a business secret) save only characteristic features of a web page, with frequent words likely to be considered. The word bere in Basque is the feminine possessive pronoun (her), while in Italian it is the verb to drink. Kata means words in Maori, floors in Turkish. The proportion of such words is typically below 10% but can contribute considerably to difficulties whenever there are larger languages using the same writing system: any one of the larger languages may overlap with a query term of the LRL one looks for. Linguistically, loanwords are more likely to come from a larger language into an LRL. In summary, there are a variety of reasons, linguistic and technical, why searching content of an LRL on the web may become difficult, the needle in the haystack being a suitable metaphor. On the other hand, in the age of Big Data, the benefit of retrieving so much data and compiling text corpora is larger for LRLs. Statistical training requires much data, and so mining the internet may be more important for LRLs than it is for larger languages. Consequently, prior to this paper, other projects on the topic have come into existence, where the aim was to provide corpora in LRLs, albeit all suffering from some kind of barrier, mostly copyright related. Also, the corpora thence compiled are of course snapshots of their time (meanwhile, the resource landscape has changed considerably for some LRLs). We would like to mention some of them, since they are an obvious first place to search when looking for a specific LRL:
1. An Crúbadán: Scannell ([18]) collects lists of URLs and n-grams (sequences of n characters or words which occur in sequence in texts, where n stands for a number) for roughly 2200 LRLs (15.10.2019) in order to provide resources for Natural Language Processing on crubadan.org. However, due to copyright reasons, the texts themselves are not provided, and some of the links may be outdated.
2. Leipzig Corpora Collection (LCC): Hosting contents in 252 languages (15.10.2019), the LCC ([10]) provides its web-crawled and processed texts via a web interface (https://corpora.uni-leipzig.de/).
3. DoBeS: This project concentrates on endangered languages and provides data for 60 of them on the web, in high-quality, linguistically curated datasets (https://dobes.mpi.nl/). It is connected to the Language Archive of the Max Planck Institute in Nijmegen, which has resources in more than 800 languages (15.10.2019).
Other general resources have grown to include considerable resources in many LRLs, such as the Wikipedia and related projects. Holy books and missionary effort around LRLs have likewise seen large translation and digitization endeavors. The obvious limitation is that the materials are generally of only one genre, and the language of many holy books is often somewhat archaic. Nevertheless, homepages such as the one of the Jehovah's Witnesses possess versions and materials in many LRLs. However, having become aware of some purely technical use of their texts, their copyright is very explicit and should be read. Ideally, the legal status (copyright, licensing) must be checked for each site from which texts are taken, especially for non-purely-scientifically used corpora.
Apart from the Wikipedia, which is a free resource, OPUS ([19]), a parallel corpora archive (http://opus.nlpl.eu/), offers free resources in many language pairs and provides a large list of free sources. There is another class of often free resources, namely legal governmental documents (and webpages) in LRLs. Lastly, resource lists such as the OLAC inventory (http://www.language-archives.org/) also provide links to online resources for many LRLs. Finally, there is a tool which has been used with larger languages, but which can also be used for the automatic compilation of corpora in all languages, naturally also LRLs.
Step - Defining Distractors Distractors are systematically occurring instances of another language or consistent paralinguistic type of strings (such as faulty OCR, written glossolalia, machine codes, encrypted text or other artifacts) in the results when querying our target language. This must be distinguished from noise, where we understand noise to be such documents or search results which surface for a specific or a very limited number of queries only. The border between distractors and noise is fuzzy, ([18]) intending only linguistic distractors used the term “polluting” languages. Whenever intending to manually find content in an LRL we advise a 83 http://info.cern.ch/ Umanistica Digitale - ISSN:2532-8816 - n.8, 2020 first step, where the main distractors are listed in order to develop strategies to exclude their content in the results and thus increase precision of the search results carefully without affecting recall. Related Linguistic Distractors Linguistic distractors can again be subdivided into several types but let us first look at a characteristic of our LRL. Firstly, there are languages, which are part of a larger language family - LRLf. Then, there are language isolates LRLi, which have no family, a true isolate doesn't even have closely related varieties. Finally, there are mixed languages LRLm known as Pidgins and Creoles, which often exhibit some grammatical features of their substrate while adhering largely to the lexicon of the superstrate. Depending on this characteristic, a language may or may not have closely related sister languages which can be some of the most distracting distractors. Naturally LRLms are confusable with other LRLms of the same or even a closely related superstrate. The border between language and dialect is fuzzy and a matter of definition. As the famous saying attributed to Weinreich reminds us A language is a dialect with an army and a navy. Thus, searching for an LRL, one should be able to clearly delimit the variety against others or include different standards then delimiting the varieties from others in the same continuum. If one has achieved that, language family trees (classificatory systems) can help us immediately identify candidate distractors. Likewise, a matter of decision is the upper node at which one considers a family a family (would we take Indo-European [for our purposes surely too large a unit], Indo-Iranian or Iranian as the root for the family of Balochi?). Generally, a closer root for the family with an uncontroversial branching point and consistent with general linguistic typology appears advisable. Online sources for classifications are sites such as Ethnologue, Glottologue, WALS, to name but a few. Here, one may search for the closest languages and then test how close they are. For them to be formidable distractors, they should have the same writing system (otherwise they can be skipped) and very few differences from our target LRL. A second factor is the size or status of the sister language, if it is a large language, one may want to put it on the distractor list even if a little less related than another closer LRL sister. 
There is no perfect strategy for defining a threshold for a distractor sister, but a simple heuristic may be useful: if the percentage of exact orthographic overlap (and/or overlap in trigrams) between two as large and clean as possible word lists (the distance may be weighted by frequency) considerably exceeds the average accidental overlap with the largest unrelated languages of the web, one should include it. One may additionally consider overlap in grammatical features (WALS) or in phoneme inventories and graphemic systems. The exact threshold may vary per scenario, but this is not tragic, since one always has the possibility to extend the distractor list later. Again, generally one may rather include too many than too few distractors. A side note is that one may want to consider an LRLm attached to the root node of the superstrate language's family, and likewise all other LRLms with the same superstrate. The test on the percentages of overlap may be run as described. Historical stages (earlier distinguished varieties) of our target LRL may always be a related linguistic distractor.

Unrelated Linguistic Distractors
After having listed the closest related distractors, one may proceed to unrelated linguistic distractors. Here, the linguistic processes of loaning and borrowing (the common uses of loanword and borrowing in English are at least partly synonymous, see the Oxford Advanced Learner's Dictionary, 22.10.2019) are the most important ones for the potential of becoming a distractor. Consequently, Sprachbund and regional proximity are good estimators for candidates, but also former colonial languages. Loanwords may constitute a considerable part of a language's vocabulary. Since the direction of loaning often reflects power, it implies a larger part of words from bigger languages. The opposite direction, however, is equally relevant: words for local plants, meals, industrial products etc. which the larger language has absorbed. What we are looking for with non-related distractors is languages which are either the (ultimate or intermediate) source of many (loan)words of our target LRL, or which have themselves borrowed a considerable number of terms from our target LRL, or various mixtures of both. Loans may be mediated (for instance, directly loaned from French, but originating in English). If those terms preserve their original orthography (at least in some of the terms or to large degrees), then they can lead to serious distraction in queries, especially if the contact is so intense that even some (frequent) function words such as discourse markers (like amma in some oriental languages) have been loaned. It might also matter whether one looks more for formal or informal language. All LRLs usually loan and borrow, albeit for different reasons and in different ways; LRLms maintain some words of their substrate, for instance. Again, the question of when to include a non-related LRL cannot be answered globally, but some thoughts may facilitate the decision. The larger the distractor is (especially English, which loans into many languages, likewise French, Spanish in South America, Russian in the East, Mandarin Chinese in the Far East), the more likely a distractor it will be, since these languages are typically associated with plenty of web content. On the other hand, the level of preservation of the original orthography (in the original alphabet) is also very important.
Unrelated Linguistic Distractors

After having listed the closest related distractors, one may proceed to unrelated linguistic distractors. Here, the linguistic process of loaning and borrowing (the common use of "loanword" and "borrowing" in English is at least partly synonymous, see Oxford Advanced Learner's Dictionary, 22.10.2019) is the most important one for the potential of becoming a distractor. Consequently, Sprachbund and regional proximity are good estimators for candidates, but so are former colonial languages. Loanwords may constitute a considerable part of a language's vocabulary. Since the direction of loaning often reflects power, it implies a larger part of words coming from bigger languages. The opposite direction, however, is equally relevant: words for local plants, meals, industrial products etc. which the larger language has absorbed. What we are looking for with non-related distractors is languages which are either the (ultimate or intermediate) source of many (loan) words of our target LRL, or which have themselves borrowed a considerable number of terms from our target LRL, or various mixtures of both. Loans may be mediated (for instance, directly loaned from French, but originating in English). If those terms preserve their original orthography (at least in some of the terms or to large degrees), then they can lead to serious distraction in queries, especially if the contact is so intense that even some (frequent) function words such as discourse markers (like amma in some oriental languages) have been loaned. It might also matter whether one looks more for formal or informal language. All LRLs usually loan and borrow, albeit for different reasons and in different ways; LRLms, for instance, maintain some words of their substrate. Again, the question of when to include a non-related LRL cannot be answered globally, but some thoughts may facilitate the decision. The larger the distractor is, the more likely a distractor it will be, since large languages are typically associated with plenty of web content (above all English, which loans into many languages; likewise French, Spanish in South America, Russian in the East, Mandarin Chinese in the Far East). On the other hand, the level of preservation of the original orthography (in the original alphabet) is also very important. While Japanese has many English loanwords, rendering "glass" as グラス hardly produces any English content (aided by the fact that in the Japanese rendering there is no distinction between r and l anymore, so the same transcription could also stand for the word "grass", which is however usually not used as an English loan). Here, one may thus measure how many words in a sufficiently large wordlist of the target LRL (if one has one) are English words (intersecting with a big English word list or an online corpus query API, along the lines of the overlap sketch above). In this case, it may also matter which words are such loans: are they all very low-frequency terms, so that there is no big danger of excluding target LRL content when excluding content containing them? Based on these characteristics and individual ones, one may decide to list a language as a distractor language.

Paralinguistic Distractors

Finally, there are paralinguistic distractors, as we learned during our course. It is hard to foresee which LRL will attract which paralinguistic distractors, but based on the spread of the Latin alphabet and the fact that most programming code, mark-up code etc. is written in the Latin alphabet, the likelihood of paralinguistic distractors is clearly much higher for languages written in the Latin alphabet. (Note that if one searches for transliterations too, or for multiple renderings if a language has more than one principal writing system, such as Serbian, the strategy is different. To increase the clarity of our method, we assume however that the search scenario involves only one main writing system. If your scenario involves more, you can break it down into one scenario per writing system.) Now, a paralinguistic distractor should be a consistent unit if one is to later find strategies to exclude content of this type. We found some possibilities:

faulty OCR: Some websites provide text derived from OCR which was neither post-corrected nor very accurate. In fact, there are tremendous examples where there seem to be many more wrong characters than correct ones, and the numbers of such documents seem to be on the way to becoming Big Data. OCR errors are often systematic, that is, the same letter sequence in the same font tends to be misread as the same wrong sequence each time it is encountered. This leads to a systematic distortion of the original language of the OCR (or, if the faulty parts make up more than 50% of what is being displayed, some gibberish-like language). If one now considers that language change likewise distorts an actual variety to form another one, it becomes intuitively clear that the distorted OCR of a sister language or even of an unrelated language can accidentally resemble another language (if in the process some very frequent function words and some fewer, longer content words are produced), in the unlucky case our target LRL. So, faulty OCR is always a bullet point on the list of distractors.

glossolalia and pseudo-X: Some comedians imitate languages by producing fake sequences incorporating many characteristic sounds and maybe some words of certain languages. To give a written paraphrase, what language would you guess the following sequence to be: Das Gehortung warrende Humpelkatz rimpelt in Ratzfatz? Does this look like German to you? Actually, as of the 10th of October 2019, Google Translate would also classify the sentence as German, and it probably should. Only two words and two morphemes are German: Das is the German article; Ratzfatz, better ratz-fatz, is an onomatopoeticon meaning immediately; the word Humpelkatz could be analyzed as limping cat. We hope this made clear that such imitation also works on a written basis.
If we take this to be a sequence uttered or written by a comedian to humorously emphasize a rough-sounding aspect of German, maybe based on its affricates and some other phonotactic properties, then, with some actual words produced by accident or intent, such a sequence would make a formidable distractor (as the language classification systems also witness). Besides comedians, psycholinguists may use pseudo-words (with a certain permissible orthographic and phonological structure) or even phrases for experiments, and some religious activities include speaking in tongues (or glossolalia), although this is seldom written down (but could be, as auto-generated and separated subtitles to a YouTube video for instance). Despite being much less abundant than faulty OCR, one would want to avoid such content in a serious corpus, which is why it is to be considered a possible, yet rare distractor.

encrypted text and orthographic plays: Since antiquity ([7]), people have used codes to transmit messages which should not be intercepted. Since then, an arms race between cryptography (the process of encoding messages) and cryptanalysis (the process of decoding) has taken place and seen the development of many different techniques. Some of these codes may produce text looking like our LRL or, worse but extremely unlikely, senseless text in our LRL (here, one may remember Chomsky's famous sentence "Colorless green ideas sleep furiously", which is asemantic but in principle not agrammatical). Or one could simply write a Latin-alphabet-based language in a Semitic style, omitting short vowels, which may then accidentally look like another language. Also, one could invent a new orthography (for instance a simpler one for French, which could then resemble a French-based creole while not being one). This distractor should also be rather rare, but there are cases where it could be a serious one.

machine code: Programs and humans produce all kinds of codes, for instance in transmission or machine-to-machine communication. Again, these could accidentally, at least in parts, look like sequences of words in the target LRL and thus become some kind of distractor.

abbreviations and acronyms: Heavily abbreviated text may also lead to a completely different linguistic appearance. Since there are languages where an abbreviation need not be marked as such (as it is with a dot in English), these cases could lead to another distractor. Especially in short message communication (where space is also money), innovative abbreviations have become a substandard (lol, 4u, 2b or not 2b, etc.).

Apart from these, there might be other paralinguistic distractors such as lists of names, but we were neither aware of others, nor did we witness them during the course. The extent to which such paralinguistic distractors play a role is largely dependent on the target language and on connected random factors such as the frequency of certain easily mis-OCRed ngrams such as ni (for m).
It may therefore seem advisable to check these paralinguistic distractors one by one. In a tiny experiment, we took an abstract of a paper in Indonesian, reduced the image size and quality by a Ghostscript command, took a screenshot of it and then let tesseract OCR with English settings extract text from the resulting png. On the text file we ran fastText (https://fasttext.cc/) and found that only about 40% of the lines had been identified as Indonesian, 33% as English, 15% as Malay and one line each as Finnish, Russian, Hungarian, Ukrainian and Catalan. So, in all, more than half of the lines were misclassified, while English (a contact language) and Malay (a close sister language) in particular had larger proportions. Badly OCRed sister languages may thus be especially prone to becoming a distractor.
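The classification step of this tiny experiment can be reproduced roughly along the following lines; this is a sketch assuming the pretrained language identification model lid.176.bin from fasttext.cc and an OCR output file named ocr_output.txt (both names are placeholders).

```python
# Tally fastText language identification per line of an OCRed file.
# Assumes the pretrained model lid.176.bin (https://fasttext.cc/) has
# been downloaded beforehand; file names are placeholders.
from collections import Counter
import fasttext

model = fasttext.load_model("lid.176.bin")
counts = Counter()

with open("ocr_output.txt", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        labels, _ = model.predict(line)   # e.g. ('__label__id',)
        counts[labels[0].replace("__label__", "")] += 1

total = sum(counts.values())
for lang, n in counts.most_common():
    print(f"{lang}: {n}/{total} lines ({n / total:.0%})")
```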
With a list of distractors at hand, only two things are missing before starting the search.

Legal Issues, Copyright

Before proceeding, a word of caution is in order. Generally, determining a legal policy on web texts is both difficult and not yet internationally uniformly resolved. Usually, the law of the country where the web server is located applies. For scientific use there are various rules, such as the so-called fair use doctrine, which ascertains the legality of using copyrighted material without requesting permission for educational purposes under certain conditions. A company may sue people using their texts if it suffers a loss of profit (and certainly will if this loss is large): imagine a web corpus making an annotated version of a complete Harry Potter novel available, which people could use for reading instead of analysis. Likewise, small language or religious communities may feel preyed upon when their materials get used (especially if to the end of a profit). A large project on web-crawled corpora ([2]) states that: "The copyright issue remains a thorny one: there is no easy way of determining whether the content of a particular page is copyrighted, nor is it feasible to ask millions of potential copyright holders for usage permission. However, our crawler does respect the download policies imposed by website administrators (i.e. the robots.txt file), and the WaCky website contains information on how to request the removal of specific documents from our corpora". They also argue that their corpora are processed, that is, annotated, which represents additional work on the texts; they are no longer the same as the raw material. A complete download of a corpus is what we term here legal level 1. The LCC makes its contents available only through queries via its web interface while a complete download is prohibited (legal level 2), a model which many websites with linguistic resources follow in order to avoid legal prosecution (the Harry Potter novel may provide examples in an analysis, but the complete text is not downloadable). The An Crúbadán website goes a step further (legal level 3) and does not make the texts themselves available but only word and ngram lists. Especially if one plans to publish entire collections of texts in a target LRL from the web, or to use explicit examples from such a collection (not only statistics and metadata), one should very carefully examine the legal status and options, offer the smaller language communities wherever possible a say in whether they want their texts to be used in this way or not, and, if profit is involved, a way to participate. If using the texts for scientific purposes only, one should at least make sure that this is covered, for instance by the fair use doctrine. As a reference, the work of the ELDAH (https://eldah.hypotheses.org/) should be considered, also as an address to turn to for specific questions.

Wordlists for query generation

Before starting to query a search engine, one needs some terms from the target language. Since we assume for our scenario that there is a need to compose a corpus, we must also assume that the researcher is not in possession of such a corpus beforehand. It follows that (s)he must obtain a wordlist in other ways. The properties the wordlist should have are determined mostly by statistics. Such a list should feature very frequent function words, words of intermediate frequency and some rare, specialized, longer content words. Generally, the larger the list, the better, and the more information on a word is available (its frequency or frequency class, for instance), the better for monitoring and evaluation purposes. A printed grammar may be a good starting point for manually extracting terms.

Ways to search the internet and other networks

The world wide web (WWW) is an open public network of computers where some (servers) can be accessed via a curated system of addresses from any computer connected to the network, for instance through a web browser. Since each of the servers can host varying numbers of pages and content, and furthermore since pages are constantly updated, removed or added (fluidity), nobody can really know how large the WWW is content-wise. There are estimates of the size of the WWW, but naturally they cannot be verified. [20] estimate only the size of the portion indexed by search engines and give roughly 6 billion pages (15.10.2019).

URL Guessing

Explained in a simplified way, to access a certain content, a user can type the address into a browser and thereby ask the server registered under this "name" for data, which it sends back. This is also the first mechanism by which to find a particular web page: direct address input. Now, if one does not know a page, one can guess names (analogously, one may also guess IPs, which browsers also understand; but for guessing which IPs link to content in a certain LRL, only non-linguistic clues are useful, for which we have no good description). LRL communities might have names such as language-name.country-code-top-level-domain, for instance for German deutsch.de (this is just a quickly retrieved example; we do not want to advertise or promote contents hosted on that site). This may or may not work; in case it doesn't, either content in another language is accessed there, a provider has reserved the name and advertises there, or the address cannot be reached. This way of obtaining LRL content is both cumbersome and has a low probability of success. This is because there are numerous possible combinations of subdomains, top-level domains and hostname letters to compose a URL (in various alphabets and ways, such as Punycode), and since LRL content may only be present in some files (thus subaddresses on a certain server), this again opens many possibilities. (We do not treat FTP and SMTP protocols here.)
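To make the mechanics concrete, a naive guessing loop might look as follows; the candidate name parts and top-level domains are invented examples, and, as said above, the success probability of this strategy is low.

```python
# Naive URL-guessing sketch: combine candidate language-name strings
# with a few top-level domains and test whether anything responds.
# The candidate names and TLDs below are invented examples.
import itertools
import requests

names = ["examplelanguage", "examplelang"]   # hypothetical guesses
tlds = [".org", ".info", ".ru", ".nz"]

for name, tld in itertools.product(names, tlds):
    url = f"http://{name}{tld}"
    try:
        r = requests.head(url, timeout=5, allow_redirects=True)
        print(url, r.status_code)
    except requests.RequestException:
        print(url, "unreachable")
```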
Link-Hopping, Surfing

The second possibility to search the internet is via so-called hyperlinks. With any starting point in the internet, one can search and follow the links on that page and go to others. For LRLs, this strategy is especially useful when pages are interlinked. There has been a lot of research on the topology of the internet when symbolized by a graph where pages are nodes and hyperlinks edges. One famous contribution ([3]) assumes a bow-tie model. Another famous topological property could be the small-world property ([16]), where few pages function as linking hubs which connect many groups of loosely interlinked pages (which rarely link to pages outside their groups). Finally, in search engine research, so-called hubs and authorities ([13]) are distinguished, where hubs by and large correspond topologically to those in small-world graphs, whilst authorities are pages which provide high-quality content. Should the webpages of our target LRL be located in disconnected components of the WWW, meaning groups of pages which are not interlinked with the main core of the internet, this could make them considerably harder to find. Firstly, because then there would be no way to find them through hyperlink hopping (surfing) from pages in the core. Secondly, search engines - the third way of searching the web - would be somewhat less likely to index these pages, and thus they may not be retrievable through them. However, we found during the course that most pages in LRLs (of various sizes) were usually somehow connected to the core (this refers also to pages which we knew and which we found neither by search engine queries nor by surfing). Furthermore, we realized that at least some proportion of the disconnected components could be fresh pages which have not yet been filled with content; thus at least some disconnected pages are irrelevant. Yet, we cannot exclude that some pages in LRLs are found in disconnected components and, as a consequence, are only accessible through URL guessing or informants.

Search Engines

The third and presumably most well-known way of searching content on the web is through search engines. A description of the process is problematic for various reasons, mainly because the exact functioning of search engines is their business secret and subject to change, at least in its parameters, every now and then (for otherwise people would manipulate webpages in absurd ways in order to come up on top of certain customer-loaded searches; actually, people try this through reverse engineering, and the corresponding field is Search Engine Optimization). In a nutshell, a search engine periodically sifts through the web (or the portions deemed relevant) and generates (or updates) a so-called index, that is, a database where addresses are stored along with some features. Because this is a secret, it is not clear what these features are, but certain words (or ngrams) and their frequencies, as well as the number and sources or targets of incoming and outgoing links, should almost certainly be involved. On the basis of these indices, the search engine queries which the user sends to the web interface of the search engine are answered. Thus, for each query a search engine receives, an algorithm produces those result pages which, according to the features of the index, are most relevant to that query. One does therefore usually not search "the internet" when using a search engine, but only those portions of it known and relevant to the search engine. Generally, large search engines, such as above all Google (and Bing), seem to have the largest indices. Further below, we describe how to retrieve LRL documents through search engines by composing linguistically informed queries.
A side-note is that semantic web technologies (and search engines) could play an ever more important role, considering projects such as Babelnet (https://babelnet.org/), which connect languages, some of them small, through semantic web technology. For instance, a tag for a small language's name could be connected in many meaningful ways to different types of content for it. Meta-linguistic ontologies such as OLiA (http://www.acoli.informatik.uni-frankfurt.de/resources/olia/) can already be used to obtain features and characteristics of a potential target language and are used also for smaller languages (such as Dzongkha, Yucatec Maya or Fon) and in interlinked lexical resources.

Dark Webs

There are alternative webs which some call darkwebs. The onion web is the largest and one of the earliest of those ([9]). It was intended as a place where political dissidents could voice their opinions if officially oppressed, and parts of this web are actually used for such purposes today. Facebook also has a presence there. However, since in these kinds of webs users who host or view pages are rather anonymous, as secured through the technical underpinnings of the system (which at the same time make it slower than the WWW), much criminal activity is also to be found there. Since LRLs can be oppressed, content may be located in a darkweb. The way to find such content is through darkweb search engines or Hidden Wiki lists with thematically ordered entry points, and through surfing. URL guessing is rather impossible, since darkweb URLs are usually long codes, not registered and intentionally chosen as in the WWW (or Surface Web, as darkweb surfers may call it), and seldom meaningful. Legally and ethically, the use of texts found on a darkweb may present a greater challenge than that of texts from the Surface Web. One has to think about their potentially threatening contents, the risk one may put their authors or oneself to by using them, the unclaimed or unclear copyright situation, and quotability (reproducibility), since contents appear and disappear at high rates in this segment of the web.

Social Media

One must mention Social Media such as Facebook, VKontakte and Twitter, which come with their own search engines and where accounts prompt users into areas which the bigger search engines are not supposed to index or offer publicly. For these reasons, some content may only be accessible while logged into an account in a social network, and consequently looking for content in LRLs may involve social media search. While, as of summer 2019, Facebook closed its semantic graph search, where one could for instance query all accounts of Hindi speakers living in Canada and working as teachers, the current search functionalities typically still support searches using individual characteristics such as the school users attended (provided they input this information). However, the search is no longer semantic; it now combines query terms and is no longer fundamentally different from conventional search engines.

Deep Web

A last point to mention is the so-called Deep Web, which refers to content which is generated dynamically from the contents of a database only at the moment the user submits a form on site or performs a related interactive activity.
Such content cannot be indexed by larger search engines unless they mimic the on-site query behavior, which would clearly be unfeasible: consider for instance all possible queries on a train company's website for the next trains at every point in time, as well as arrival and departure times at every station. It has been estimated that considerable portions of the WWW are potentially Deep Web contents, and they also comprise genuine LRL content.

Search engine queries for LRLs - step by step

While guessing URLs is a possible strategy, the probabilities of success are rather low, so that it seems more advisable not to start one's endeavor with this kind of search on the WWW, except perhaps for a few obvious candidate addresses. Likewise, without a good starting point (for instance one of the pages called hubs), crawling or surfing the web (hopping from page to page via hyperlinks) might be a rather difficult, not to mention cumbersome, start. The easiest and probably most successful way to start the web search for LRL corpora is thus the use of a search engine interface. In this section, we describe step by step how to look for content in the target LRL by using search engines.

0. Looking for resources on known sites and BootCatting

The first place to look for content may be one of the larger sites and projects which have focused on LRLs, such as the above-mentioned ones (DoBeS, An Crúbadán, LCC, etc.). However, their content should be thoroughly checked for noise. Secondly, using BootCat and feeding it with a limited number of terms may be a good start. The results can be URLs or even directly a corpus which BootCat draws from those. The result must then be checked manually and purified, which can in the ideal case provide a larger basis for manual queries or for a second and third BootCat round. For some LRLs, however, BootCat may not be able to retrieve valid content.

1. Single Term Querying

For a first approach to querying search engines, we can take single query terms and compose an evaluative spreadsheet where we note, for instance, a) how many of the first 10 pages returned for a certain query have been in the target language and b) how many results have been returned (note that the number-of-results estimate which some search engines provide is based on a quick precomputation and can deviate from the actual number of results). We can then annotate different statistical, semantic or other linguistic properties of the query terms. We query some very frequent function words (avoiding terms which, accidentally or through loaning, overlap with one of the big languages of the WWW or with our distractors), some intermediately frequent terms and some content words (even if our wordlist features no explicit frequency information, we can, by universality, be quite certain that a not too unusual and short function word is very frequent, whereas a longer content word which is not general, such as thing, animal, machine, should be infrequent). We can try words from different genres, different registers, regions etc. Afterwards, we can look at the list and try to draw inferences on the effect of the assumed characteristics on the evaluation (a and b, or a [weighted] product of both).
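Such a spreadsheet can of course be kept in any office program; a small sketch of how the bookkeeping might be automated is given here, with invented terms and counts standing in for real observations.

```python
# Sketch of the evaluative spreadsheet: (a) target-language hits among
# the first 10 results, (b) total results returned. All values are
# invented placeholders, to be entered after manual inspection.
import csv

observations = [
    # term, assumed property, results_total (b), hits_top10 (a)
    ("term_a", "function word, frequent", 1400, 2),
    ("term_b", "content word, specialized", 12, 7),
]

with open("single_term_eval.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["term", "property", "results_total", "hits_top10"])
    for term, prop, total, hits in observations:
        writer.writerow([term, prop, total, hits])
        print(f"{term}: a={hits}/10, b={total}, "
              f"weighted={total * hits / 10:.0f}")
```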
Some properties are unsurprising and can be hypothesized a priori, such as that function words return more results than content words, or that specialized terms return fewer results than more general ones, which at times will be used as anaphora for the former. However, it is the current composition of the content of our target LRL on the web that has to be characterized. Guiding questions such as "does a considerable proportion of the assumed content consist of pages with folklore content?" could be tested by using the corresponding vocabulary and seeing whether result numbers or precision increase. These questions may depend very individually on the current time, situation and other circumstances of the LRL community at hand.

2. Multiple Term Querying

After having evaluated single terms, of which some may have been found useful, the next proposed step is combining terms. Generally, the first term can be thought of as constituting a certain amount of results and each subsequent term as eliciting a subset of the previous results, so that in principle, on average, the number of results decreases with the number of query terms. A function word is then a good first query term, but so-called stop words should be avoided (likewise, function words which are not stop words appear unuseful towards the end of a query). Stop words are usually very frequent function words which big search engines simply ignore if they appear in queries. The reason is that those would simply elicit too many documents, as they appear in virtually all of them (think for instance of the English article). This touches upon another issue, the settings for search. For stop words to be relevant, a search engine must have a list of them and, in turn, a setting for searches in the target language. Other language-dependent settings may also exist and can crucially influence the results, in ways again hidden in the business secret. Often queries contain terms which have not been indexed (that is, they may be contained on the target website but not in the search engine index as a feature) or which are simply not present in any target site. Now, it is the secret search engine algorithm which decides how to prioritize imperfect results. For instance, if you query 5 terms and no site is found containing all 5 in the index, should a result page which has only 3 of the 5 terms but a very good hub score (or authority score) be prioritized over one which has 4 out of 5 terms but appears less connected and important? This is an additional process to be thought of when querying more than one term. Apart from this, one can make another table (spreadsheet) and start combining terms systematically by their characteristics. Now, the possibilities for combination are manifold: combining a function word with a content word, a general with a specialized word, etc. This allows hypotheses to become more flexible; likewise, interpretation becomes more complex. The benefit is, however, that we can get a better idea of which genres, registers etc. are especially fruitful in our scenario. What we found during the course was that if one combines content words which are too unrelated, since they do not occur together naturally, dictionaries or word lists may surface.

3. Queries with Operators

Operators are characters or character sequences which the algorithm of a search engine will treat in a predefined way when seeing them in a query.
Among the most well-known and used operators may be Google's quotation-mark and minus operators. The " operator forces the search to look for the occurrence of a sequence, which may contain spaces, as such ("an apple" searches pages which contain exactly this phrase, not pages which contain apple somewhere, an being an ignored stop word). The - operator excludes: if we formulate a query and add minus terms, we thereby exclude pages which contain the minus term. This can be used monolingually for word sense disambiguation in queries. For instance, searching for bank while being interested in the use of the term as riverbank, one may formulate bank river water -financial -money -business in order to exclude pages where the dominant sense of the word bank (which one naturally cannot exclude from this query) is the financial institution. In parallel, we can exclude other languages (big WWW languages and our distractors). Since minus terms only take effect if the result set contains any page at all which has them, one can theoretically add a large number of them to any query. Note, however, that search engines limit the maximal number of allowed search terms (either in tokens or as a certain bit encoding size).
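A small helper for composing such queries might look as follows; this is a sketch, the operator syntax shown is the widely documented Google one, and the example terms mirror the examples discussed in the text.

```python
# Sketch of a query composer using quotation-mark, minus and filetype
# operators; example terms are taken from the discussion in the text.
def build_query(include, exclude=(), phrase=None, filetype=None):
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')                 # exact phrase
    parts.extend(include)
    parts.extend(f"-{term}" for term in exclude)    # minus terms
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)

print(build_query(["bank", "river", "water"],
                  exclude=["financial", "money", "business"]))
# bank river water -financial -money -business

print(build_query(["ngā"], exclude=["the", "and", "with"],
                  filetype="pdf"))
# ngā -the -and -with filetype:pdf
```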
4. General Remarks

Compiling a corpus from internet sources is work-intensive, at least if one aims at clean corpora with very low amounts of noise or none at all. Some sections of the internet, such as member-only content, certain content within social media platforms, or sections of the internet not indexed by a search engine, are obviously not retrievable via that search engine. Thus, manual content search can always produce additional content for target LRLs as long as such content exists. In doing so, we found a profound knowledge of the technical and linguistic underpinnings of documents on the web useful. For instance, the search for certain document types (txt, pdf, etc.) partly benefitted from different query term selection and composition strategies (in contrast to html content, we must not expect menu items in plain text or pdfs, for instance). The personalization options of the search engine (and likewise the filter bubble) modified the results, partly crucially. Generally, we found search engines to be richer in content for languages spoken on territories close to their core language than for others (English for Google, Russian for Yandex, Mandarin Chinese for Baidu). Some content was blocked from certain regions.

2. Student Search Scenarios

During the course, the following two search scenarios were worked out; they shall be given as examples here.

Nogai (Cemre Koc)

My scenario was about the search for Nogai, a Turkic language spoken by approximately 70,000 speakers in southwestern European Russia (Caucasus). The first step of my search was downloading a Nogai wordlist from crubadan.org, which contained over 200 words and word bigrams in .txt format. Then, I formulated queries by gathering and combining random words out of the downloaded txt file. At first, I used multiple search engines but later changed to Google only, which had given me 7 results in the first run (apparently Nogai newspapers and poems), whereas the others had given none. However, I noticed that the retrieved newspapers and poems could easily be written in other Turkic languages of the same language branch or in neighboring languages like Kumyk, Karachay-Balkar or Bashkir. The main problem was to distinguish the newspaper articles and poems retrieved as PDFs between Kumyk newspapers (2), Karachay-Balkar and Nogai ones (3), since their writing systems and lexica significantly overlap. To tackle such a low-resource challenge in the face of the large similarity of Turkic languages within their language family branches ([11]: 81), it was necessary to collect unique linguistic characteristics, such as affixes or cognates in their particular forms, which made it easier to discern Nogai from the other languages (distractors) and to discard unwanted content (noise). The exclusion of words whose form also appeared in a distractor language (for example уьй (house)) led to higher numbers of results in the Nogai language (first page from 1/10 to 9/10 hits). Using combinations of unique Nogai words also provided an overall higher number of hits. While мылтык (rifle) has zero hits on the first page in Google, the combination with the words сары тамбыз (August) provides three hits, which include a novel by a Nogai writer (Isa Kapaev). Furthermore, it is important to mention that combining words of one category or topic was advisable, as the number of hits then increased. Moreover, the filetype operator used to search only for PDFs influenced the results positively. In summary, by using combinations of unique Nogai words and a filetype PDF operator, I found 30 newspapers and over 30 children's books in the Nogai language in PDF format, which is a considerable corpus for such a small language and, as such, to the best of our knowledge unprecedented for Nogai.

Maori (Marc D. Rahn)

My search scenario was restricted to pdfs and concerned with Maori, a Polynesian language spoken by roughly 160,000 people in New Zealand. To start, I looked for frequency-based Maori wordlists via Google and found one (https://tereomaori.tki.org.nz/content/download/2031/11466/file/1000 frequent words of Māori- in frequency order.doc). From this list, I selected a smaller subset for manual work based on the following criteria:

1. The words should be high-ranking, that is frequent, so as to elicit as much content as possible.

2. Since shorter words have a higher chance of accidental overlap with another language, I preferred slightly longer (slightly less frequent) words. The most frequent word in Maori, "te" ("the"), was not very suitable, as it could easily appear in other languages (which it does, for example in French: "you", dative/accusative).

3. I also preferred words with a peculiar arrangement of letters, as well as words with diacritic symbols, further reducing the chance of a random match with another language. For instance, the sixth most frequent word, "ngā", is a perfect candidate: it is very common (it is the plural definite article), yet not too short, could come up in any kind of text, and features a peculiar combination of letters and even a diacritic symbol.

The next step in my preparation was to think of distracting languages, especially English. The web is full of English content, especially for a country like New Zealand, where it is the main language. Additionally, I wanted to avoid mixed texts and teaching resources. Therefore, I picked the words "the, and, with" to use for the exclusion of English content (blacklist terms). Despite the fact that those terms are so-called stop words, that is, they are ignored by Google when searching, they are apparently not ignored when being excluded.
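A filter implementing criteria 1-3 plus the exclusion of big-language words could be sketched as follows; the rank window, minimum length and the preference logic are our assumptions, not the exact procedure used in the scenario.

```python
# Illustrative term selector: frequent but not too short, not shared
# with a big language, preferring words with diacritics. All thresholds
# are assumptions.
import unicodedata

def has_diacritic(word):
    # True if any character decomposes into a combining mark (e.g. ā).
    return any(unicodedata.combining(c)
               for c in unicodedata.normalize("NFD", word))

def pick_terms(ranked_words, big_lang_words, max_rank=300,
               min_len=3, n=5):
    candidates = [
        (not has_diacritic(w), rank, w)     # diacritic words sort first
        for rank, w in enumerate(ranked_words[:max_rank], start=1)
        if len(w) >= min_len and w.lower() not in big_lang_words
    ]
    return [w for _, _, w in sorted(candidates)[:n]]

# Toy usage with invented input:
print(pick_terms(["te", "ngā", "kua", "haere", "tētahi", "the"],
                 big_lang_words={"the", "and", "with"}))
# ['ngā', 'tētahi', 'kua', 'haere']
```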
Lastly, as I had discovered with a few similar queries for other languages beforehand, a lot of linguistic resources come up, both educational and scholarly. To exclude these, I additionally blacklisted the words "language" and "status". To then obtain my actual search results, I picked a number of terms on the Maori wordlist that fit my criteria (ngā, kua, rā, haere and tētahi) and started searching for them one by one, using the blacklist terms at the same time. I used Google for these queries, because it simply is the biggest search engine, and there is, to my knowledge, no comparable specialized search engine for the region of New Zealand or Polynesia. In addition, Google allows for a multitude of search settings: I changed the search language to English, the region to "New Zealand" and turned off personalized results. For example, one of my search queries would look like this:

ngā -the -and -with -language -status filetype:pdf

For a first impression, I chose to estimate the percentage of correct findings (PDFs entirely in Maori) on the first result page. A count of total results was performed manually by counting all result pages (which is more accurate than the results estimate). The query given above returned 54 results in total (last checked: 21.10.2019). Of the first 10 results, 9 were completely in Maori, one was a short list mainly in Maori but with English words appearing, and one was a broken link. The other queries and their results are given below in Table 1.

Table 1: Query documentation for Maori. All queries had an additional filetype:pdf restriction; the combined queries used the blacklisted terms. "top10(DE)" and "top10(NZ)" give the number of results in the target language among the first 10, with the search region set to Germany and to New Zealand, respectively.

Single terms       without blacklist              with blacklist
Query              total   top10(DE)  top10(NZ)   total   top10(DE)  top10(NZ)
ngā                193     4          4           54      8          9
rā                 241     0          1           102     8          10
kua                182     0          0           173     0          10
haere              163     0          1           106     5          10
tētahi             118     4          9           93      10         10

Combined queries (with blacklist)
Query              total   top10(DE)  top10(NZ)
ngā tētahi         103     9          10
rā ngā             119     5          5
kua rā             91      6          7
haere kua          163     7          7
tētahi haere       112     9          9
ngā haere          170     7          6
rā tētahi          105     7          8
haere rā           139     1          2
tētahi kua         105     8          8
ngā kua            120     8          8

It is relatively straightforward to see that a combination of the techniques described above yields the best results. When, however, the techniques are used one by one, the selection of search terms and blacklist terms becomes more significant. Especially when stop words cannot be used (as with larger languages), when the region cannot be filtered for, or when both apply, a combination of terms can still lead to good results. In contrast, a single-word query without blacklisted terms is likely to yield quite noisy results even if filtering for the right region. However, longer words with diacritics have led to cleaner results under all circumstances within this scenario. In both scenarios, pdfs played a crucial role, but this is of course not necessarily so. Filetypes, however, may play an important role not only for the text type one may find (with presentation formats barely promising connected text, for instance).
The way in which one chooses terms may also differ depending on the filetype or type of web page: informal language, for instance, may be associated with blog entries and comments more than with formal documents such as laws and constitutions. The two scenarios showed that the internet can provide texts even for small languages if one knows well how to find and distinguish them. They have provided many approaches to factors, both linguistic and paralinguistic, which play a role in manually querying content in LRLs. We also analyzed the lexical overlap between the languages involved and found that trees generated from the similarity matrices of lexical overlap in one case roughly reflected genealogy. Compare also [5], who found that language genealogy and language contact (often correlating well with geographical proximity) influence similarity (on various linguistic levels, with a hint towards the lexicon). Figure 1 shows a Neighbor Joining tree from lexical overlap in our corpus for Maori and distractors (mainly Wikipedias, filtered for the most frequent English noise); it coincides almost perfectly with linguistic genealogy. For Nogai, the findings differed (with more assumed noise placing Russian in the middle); compare also the classifier we published at https://github.com/ArminHoenen/URLCoFi. While Russian is unrelated to Nogai, Bahasa Indonesia is a distant relative of Maori, English not. Despite that, both were clearly distinguishable in the maximal overlap they displayed with any of the other languages, which suggests that it could be the case that only relatively close sister languages play a crucial role as immediate distractors (which the Nogai corpus data roughly supports), whereas distant cousins and contact languages can have a similar degree of lexical similarity, lower than that of the sister languages.

[Figure 1: Neighbor Joining Tree for lexical overlap.]

Experiment on Overlap

To investigate such issues further, we conduct a small experiment on the accidental overlap between an LRL and large languages of the internet. We embed this into a scenario for the decision of which distractor languages to choose. For our experiment, we take the Romance language Galician, which is spoken in the north-west of Spain. We obtained a wordlist from the Corga corpus (http://www.cirp.gal/corga/) and extracted only the 10,000 most frequent words, as we did for the largest languages on the Web (according to https://en.wikipedia.org/wiki/Languages_used_on_the_Internet, 14.10.2019), where we extracted the top 10,000 terms from Wiktionary, or from open subtitles if the former was too small, from the sources given on https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists; we did so for the top 20 languages in their first estimate plus Indonesian (so all languages in the second list are covered); the threshold of 10,000 represents a spontaneous trade-off between frequencies and noise ratio. Figure 2 shows the overlap per language, ordered by the largest overlap.

[Figure 2: Number of items in word lists of important and ubiquitous languages of the WWW overlapping with the Galician word list.]

Unsurprisingly, Spanish and Portuguese have the largest overlaps. They are not only two of the largest languages on the web but also clearly the most important distractors to Galician. They have 4 times more overlap than the next language, Italian, which overlaps in 733 terms.
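Such a ranking can be produced with a few lines of code once per-language frequency lists are at hand; in this sketch the file names are placeholders and the overlap is a raw count of shared items, as in Figure 2.

```python
# Figure-2-style overlap ranking (sketch; file names are placeholders).
def load(path):
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

target = load("galician_10k.txt")
languages = ["spanish", "portuguese", "italian", "french", "english",
             "dutch", "swedish", "vietnamese", "indonesian"]
lists = {lang: load(f"{lang}_10k.txt") for lang in languages}

for lang, words in sorted(lists.items(),
                          key=lambda kv: -len(target & kv[1])):
    print(f"{lang:12s} {len(target & words):5d} shared items")
```

From pairwise counts like these, a distance matrix (for instance 1 minus the normalized overlap) can be derived and fed to any neighbor-joining implementation to obtain trees like the one in Figure 1.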
Then, French, Dutch, Swedish and English come in a group where overlap should partly be due to origin (French) or to large amounts of unchanged borrowed Romance vocabulary. Vietnamese and Indonesian featured as much accidental overlap as the former group; this is mainly due to an elevated level of noise in those two wordlists. Portuguese and Spanish are clearly distinguishable by the number of overlapping terms; they should be considered distractors in this case. The distantly related Indo-European cousins which are also contact languages thus fell into one group. Compared to the Maori scenario, however, the role of Italian as an intermediary here points to more variety in scenarios, which might make it necessary to conduct such an overlap study for each target language case, yet with the caveat that there might be no target language data in the first place; so it seems rather advisable to include more than fewer distractors, also from a statistical point of view. Attempting to assess the question whether generally unrelated languages with smaller phoneme inventories and simpler syllable structures feature much more overlap, we considered all languages in the WALS which had a simple syllable structure annotated and a small consonant or vowel inventory, or both (if requiring all 3, the number of languages decreased to two, Pirahã and Tacana, for both of which we could not locate word lists longer than roughly 1,500 tokens). From these 38 languages, we found 4 to have a Wikipedia from which we extracted wordlists of the most frequent 10,000 tokens (using WikiExtractor.py, https://github.com/attardi/wikiextractor/blob/master/WikiExtractor.py, for text extraction from the Wikimedia dumps of 01.10.2019): Guaraní, Hawaiian, Maori and Yoruba. Hawaiian and Maori are relatively closely related. We then intersected the lists and excluded English, Portuguese, Spanish and French words (50k lists from open subtitles, https://invokeit.wordpress.com/frequency-word-lists/; we also intersected all 59 wordlists from 2016 from this project and found, through generation of a Neighbour Joining tree from a distance matrix, that the data by and large reflected genealogical relationships if the alphabets were the same, and found corroboration that Vietnamese and Indonesian had higher levels of noise) and punctuation. We then added Basque, which according to WALS has average inventories and a complex syllable structure, to see if there was less overlap of the former languages with Basque than with each other (apart from the related Hawaiian and Maori). We found hints for this, although not in larger magnitudes, but a much larger investigation has to be conducted to confirm or disprove such claims and to quantify such effects.
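The wordlist extraction step can be sketched as follows, assuming WikiExtractor.py has already written its plain-text output to a directory named extracted; the tokenization is deliberately crude, and the directory and file names are placeholders.

```python
# Sketch: derive a most-frequent-10,000 word list from WikiExtractor
# output (directory name, output file and tokenization are assumptions).
from collections import Counter
from pathlib import Path
import re

counts = Counter()
for path in Path("extracted").rglob("wiki_*"):
    text = path.read_text(encoding="utf-8", errors="ignore")
    # Keep runs of letters only; drop digits, punctuation, markup.
    counts.update(re.findall(r"[^\W\d_]+", text.lower()))

top = [w for w, _ in counts.most_common(10000)]
Path("wordlist_10k.txt").write_text("\n".join(top), encoding="utf-8")
```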
Conclusion

We have presented a guideline for searches for content in LRLs on the web, which sprang from the experiences made and resources gathered during a course in 2019, the concept of which we had presented as an abstract at the AIUCD 2019. The guideline included a wide variety of suggestions for dealing with manual searches for LRL content in the fluid medium of the internet and considered ways to search, tools, web and language statistics, well-known linguistic and metalinguistic sources, legal caveats and much more.

References

[1] Baroni, M., and S. Bernardini. 2004. "BootCaT: Bootstrapping Corpora and Terms from the Web." In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004), edited by M. T. Lino, M. F. Xavier, F. Ferreira, R. Costa, R. Silva, 1313–1316. Paris: European Language Resources Association (ELRA).

[2] Baroni, M., S. Bernardini, A. Ferraresi and E. Zanchetta. 2009. "The WaCky wide web: a collection of very large linguistically processed web-crawled corpora." Language Resources and Evaluation 43, no. 3: 209–226.

[3] Broder, A., et al. 2000. "Graph structure in the web." Computer Networks 33, no. 1–6: 309–320.

[4] Cocq, C., and K. P. Sullivan. 2019. Perspectives on Indigenous Writing and Literacies. Leiden: Brill.

[5] Cysouw, M. 2013. "Disentangling geography from genealogy." In Space in Language and Linguistics: Geographical, Interactional, and Cognitive Perspectives, edited by P. Auer, M. Hilpert, A. Stukenbrock, B. Szmrecsanyi, 21–37. Berlin: De Gruyter.

[6] Cysouw, M., and B. Comrie. 2013. "Some observations on typological features of hunter-gatherer languages." In Language Typology and Historical Contingency, edited by B. Bickel, L. A. Grenoble, D. A. Peterson and A. Timberlake, 383–394. Amsterdam: John Benjamins.

[7] Dooley, J. F. 2018. History of Cryptography and Cryptanalysis. Berlin: Springer.

[8] Evans, N. 2010. Dying Words: Endangered Languages and What They Have to Tell Us. Hoboken: John Wiley & Sons.

[9] Gehl, R. W. 2018. Weaving the Dark Web: Legitimacy on Freenet, Tor, and I2P. Cambridge, MA: MIT Press.

[10] Goldhahn, D., T. Eckart and U. Quasthoff. 2012. "Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages." In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), edited by N. Calzolari, K. Choukri, T. Declerck, M. Uğur Doğan, B. Maegaard, J. Mariani, A. Moreno, J. Odijk and S. Piperidis, 759–765. Paris: European Language Resources Association (ELRA).

[11] Johanson, L., and É. Á. Csató. 2015. The Turkic Languages. London: Routledge.

[12] Kilgarriff, A., and G. Grefenstette. 2001. "Web as corpus." In Proceedings of Corpus Linguistics 2001, edited by P. Rayson, A. Wilson, T. McEnery, A. Hardie and S. Khoja, 342–344. Lancaster: UCREL.

[13] Kleinberg, J. M. 1998. "Authoritative sources in a hyperlinked environment." In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 668–677. Philadelphia: Society for Industrial and Applied Mathematics.

[14] Kornai, A. 2013. "Digital Language Death." PLOS ONE 8, no. 10: 1–11. DOI: 10.1371/journal.pone.0077056.

[15] Krauwer, S. 2003. "The basic language resource kit (BLARK) as the first milestone for the language resources roadmap." In SPECOM'2003. Proceedings of the International Workshop (Moscow, Russia, 27–29 October 2003), edited by R. Potapova, 8–15. Moscow: URSS Publishing Group.

[16] Milgram, S. 1967. "The small world problem." Psychology Today 2, no. 1: 60–67.

[17] Ong, W. J. 2012. Orality and Literacy. London: Routledge.

[18] Scannell, K. P. 2007. "The Crúbadán Project: Corpus building for under-resourced languages." In Building and Exploring Web Corpora: Proceedings of the 3rd Web as Corpus Workshop, edited by C. Fairon, H. Naets, A. Kilgarriff, G.-M. De Schryver, 5–15. Louvain: Presses Universitaires de Louvain.
[19] Tiedemann, J. 2012. "Parallel Data, Tools and Interfaces in OPUS." In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), edited by N. Calzolari, K. Choukri, T. Declerck, M. Uğur Doğan, B. Maegaard, J. Mariani, A. Moreno, J. Odijk and S. Piperidis, 2214–2218. Paris: European Language Resources Association (ELRA).

[20] Van den Bosch, A., T. Bogers, and M. De Kunder. 2016. "Estimating search engine index size variability: a 9-year longitudinal study." Scientometrics 107, no. 2: 839–856.

Last access of URLs: 22 October 2019.
“Enhancing End User Access to Cul- tural Heritage Systems: Tailored Narratives and Human-Centered Computing.” In New Trends in Image Analysis and Processing: ICIAP 2013 International Workshops, Naples, Italy, September 2013. Eds. A. Petrosino, L. Maddalena, and P. Papa. 278-287. Berlin, Germany: Springer, 2013. Ahlberg, Kristin, William S. Bryans, Constance B. Schulz, Debbie Ann Doyle, Kathleen Franz, John R. Dichtl, Edward Countryman, Gregory E. Smoak, and Susan Ferenti- nos. Tenure, Promotion and the Publicly Engaged Historian. AHA/NCPH/OAH Working Group on Evaluating Public History Scholarship, 2010, updated 2017. http://ncph.org/cms/wp-content/uploads/Engaged-Historian.pdf. Aldenderfer, M. and H.D.G. Maschner. Anthropoloy, Space, and Geographic Information Systems. Oxford, UK: Oxford University Press, 1996. Alexander, Bryan, and Rebecca Frost Davis. “Should Liberal Arts Campuses Do Digital Hu- manities? Process and Products in the Small College World.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 368-389. Minneapolis, MN: University of Minnesota Press, 2012. Alexander, Marc. “Patchworks and Field-Boundaries: Visualizing the History of Eng- lish.” Digital Humanities 2012. https://www.academia.edu/1793281/Patch- works_and_Field-_Boundaries_Visualizing_the_History_of_English. Allen, K.M.S., S.W. Green, and E.B.W. Zubrow. Interpreting Space: GIS and Archaeology. London, UK: Taylor and Francis, 1990. Allington, Daniel, Sarah Brouillette, and David Golumbia. “Neoliberal Tools (and Ar- chives): A Political History of Digital Humanities.” Los Angeles Review of Books. May 1, 2016. Alston, Robin. “The Eighteenth Century Short Title Catalogue: A Personal History to 1989.” http://web.archive.org/web/20080908103158/http:/www.r-al- ston.co.uk/estc.htm. Alvarado, Rafael. “Are MOOCs Part of the Digital Humanities?” The Transducer. January 5, 2013. http://transducer.ontoligent.com/?p=992. Alvarado, Rafael. “The Digital Humanities Situation.” In Debates in the Digital Humani- ties. Ed. Matthew K. Gold. 50-55. Minneapolis, MN: University of Minnesota Press, 2012. 3 Alvarado, Rafael. “Start Calling it Digital Liberal Arts.” The Transducer, 19 (2013). Amelunxen, H, ed. Photography after Photography: Memory and Representation in the Digital Age. Munich, Germany: G+B Arts, 1995. American Council of Learned Societies. Computing and the Humanities: Summary of a Roundtable Meeting. Occasional Paper, No. 41. Chicago: ACLS., 1998. American Council of Learned Societies. Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Hu- manities and Social Sciences. New York: American Council of Learned Societies, 2006. American Historical Association. American Historical Association Statement on Policies Regarding the Embargoing of Completed History PhD Dissertations. https://www.histo- rians.org/publications-and-directories/perspectives-on-history/summer-2013/american- historical-association-statement-on-policies-regarding-the-embargoing-of-completed- history-phd-dissertations. Amsterdam Centre for Digital Humanities. “Modeling Crowdsourcing for Cultural Herit- age.” http://cdh.uva.nl/projects-2013-2014/m.o.c.c.a.html Anderson, Chris. Makers: The New Industrial Revolution. New York, NY: Crown, 2012. Anderson, Deborah Lines, ed. Digital Scholarship in the Tenure, Promotion, and Review Process. Armonk, NY: M.E. Sharpe, 2003. Anderson, Deborah Lines. “Introduction.” In Digital Scholarship in the Tenure, Promo- tion, and Review Process. Ed. 
Deborah Lines Andersen. 3-24. Armonk, NY: M.E. Sharpe, 2003. Anderson, Erin R., and Trisha N. Campbell. “Ethics in the Making.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 331-42. Minneapolis, Minnesota: University of Minnesota Press, 2017. Anderson, Richard. “Is a Rational Discussion of Open Access Possible?” (transcript url: http://discussingoa.wordpress.com/ ; video url: http://library.si.edu/webcasts/rick-an- derson-rational-discussion-open-access.) Anderson, Richard. “Print on the Margins.” In Library Journal, 136, no. 11 (2011): 38-39. url: http://lj.libraryjournal.com/2011/06/academic-libraries/print-on-the-margins-circu- lation-trends-in-major-research-libraries/. Anderson, Steve. “What are Research Infrastructures?” International Journal of Humani- ties and Arts Computing 7 (1-2) (2013): 4-23. 4 Anderson, Steve, and Tara McPherson. “Engaging Digital Scholarship: Thoughts on Eval- uating Multimedia Scholarship.” Profession (2011): 136–151. URL: http://www.mlajour- nals.org/doi/abs/10.1632/prof.2011.2011.1.136. Andrews, T.L. “The Third Way: Philology and Critical Edition in the Digital Age.” Variants 10 (2013): 61-76. Ankersmit, F.R. Historical Representation. Stanford, CA: Stanford University Press, 2001. Antoniou, G., and F. Van Harmelen. A Semantic Web Primer. Cambridge, MA: MIT Press, 2004. http://www.dcc.fc.up.pt/-zp/aulas/1415/pde/geral/bibliografia/MIT.Press.A.Se- mantic.Web.Primer.eBook-TLFeBOOK.pdf. Appleford, Simon, and Jennifer Guliano. Devdh: Development for the Digital Humanities. 2013. Applehans, W., A. Globe, and G. Laugero. Managing Knowledge: A Practical Web-based Approach. Reading, MA: Addison-Wesley, 1999. Arango, J. “Architectures.” Journal of Information Architecture 3,1 (2001): 41-47. Arazy, Ofer, Eleni Stroulia, Stan Ruecker, Cristina Arias, Carlos Fiorentino, Veselin Ganev, and Timothy Yau. “Recognizing Contributions in Wikis: Authorship Categories, Algo- rithms, and Visualizations.” Journal of the American Society for Information Science and Technology 61.6 (2010): 1166-1179. Archer, Dawn. “Digital Humanities 2006: When Two Became Many.” Literary and Lin- guistic Computing 23, no. 1 (April 1, 2008): 103 -108. Archer, Dawn, ed. What’s in a word-List? Investigating Word Frequency and Keyword Ex- traction. Farnham: Ashgate, UK, 2009. Arctur, David, and Michael Zeiler. Designing Geodatabases: Case Studies in GIS Data Mo- deling. Redlands, CA: ESRI Press, 2004. ARL/NSF Workshop on Long-Term Stewardship of Digital Data Collections. Association of Research Libraries, September 2006. URL: http://www.arl.org/pp/access/nsfwork- shop.shtml. Arms, W. and Larsen, R. “Building the Infrastructure for Cyberscholarship.” Report of a workshop held in Phoenix, Arizona, National Science Foundation, 2007. Arnold, M. Culture and Anarchy. Oxford, UK: Oxford University Press, 2009. 5 Arthur, P.L., and Katherine Bode, eds. Advancing Digital Humanities: Research, Methods, Theories. Basingstoke, UK: Palgrave Macmillan. arts-humanities.net: Guide to Digital Humanities and Arts. http://arts-humanities.net/ ARTStor Digital Library. www.artstor.org Ashton, K. “That ‘Internet of Things’ Thing.” Journal (2009). http://www.rfidjour- nal.com/articles/view?4986. Association of College and Research Libraries. “Changing Roles of Academic and Re- search Libraries.” Association of College and Research Libraries, November 2006.URL: http://www.ala.org/ala/mgrps/divs/acrl/issues/value/changingroles.cfm. 
Association for Literary and Linguistic Computing. www.allc.org Auerbach, Eric. Mimesis: The Representation of Reality in Western Literature. Translated by W. Trask. New York, NY: Doubleday Anchor, 1953. Aufderheide, Patricia, et al. Copyright, Permissions, and Fair Use among Visual Artists and the Academic and Museum Visual Arts Communities: An Issues Report. College Art Association, 2014. http://www.collegeart.org/pdf/FairUseIssuesReport.pdf (PDF) Avery, J.M. “The Democratization of Metadata: Collective Tagging, Folksonomies and Web 2.0.” Library Student Journal. Ayers, Edward L. “The Academic Culture and the IT Culture: Their Effect on Teaching and Scholarship.” Educause Review 39, no. 6 (2004): 48-62. http://www.educause.edu/EDUCAUSE+Review/EDUCAUSEReviewMagazineVolume39/TheAcademicCultureandtheITCult/157939. Ayers, Edward L. “Does Digital Scholarship Have a Future?” Educause Review 48, no. 4 (2013): 24. http://www.educause.edu/ero/article/does-digital-scholarship-have-a-future. Ayers, Edward L. History in Hypertext. Charlottesville, VA: University of Virginia Press, 1999. Ayers, Edward L. “The Past and Futures of Digital History.” Virginia Center for Digital History, 1999. http://www.vcdh.virginia.edu/PastsFutures.html. Bady, Aaron. “The MOOC Moment and the End of Reform.” The New Inquiry. May 15, 2013. http://thenewinquiry.com/blogs/zunguzungu/the-mooc-moment-and-the-end-of-reform/. Bailey, Moya Z. “All the Digital Humanists Are White, All the Nerds Are Men, but Some of Us Are Brave.” Journal of Digital Humanities 1, no. 1 (2011). http://journalofdigitalhumanities.org/1-1/all-the-digital-humanists-are-white-all-the-nerds-are-men-but-some-of-us-are-brave-by-moya-z-bailey/. Bailey, Moya, Anne Cong-Huyen, Alexis Lothian, and Amanda Phillips. “Reflections on a Movement: #transformDH, Growing Up.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 71-79. Minneapolis, MN: University of Minnesota Press, 2016. Bailey, Trevor C., and Anthony C. Gatrell. Interactive Spatial Data Analysis. Harlow: Longman, 1995. Bair, Sheila, and Sharon Carlson. “Where Keywords Fail: Using Metadata to Facilitate Digital Humanities Scholarship.” Journal of Library Metadata 8.3 (2008): 249-62. University Libraries Faculty and Staff Publications, paper 12. Western Michigan University, 1 January 2008. http://scholarworks.wmich.edu/cgi/viewcontent.cgi?article=1012&context=library_pubs. Baird, D. Thing Knowledge: A Philosophy of Scientific Instruments. Berkeley, CA: University of California Press, 2004. Baker, Christopher W. Scientific Visualization: The New Eyes of Science. Brookfield, CT: Millbrook Press, 2000. Baker, N. Double Fold: Libraries and the Assault on Paper. New York, NY: Random House, 2001. Baker, N. The Size of Thoughts: Essays and Other Lumber. New York, NY: Random House, 1996. Ball, A. Preserving Computer-Aided Design (CAD). DPC Technology Watch. Digital Preservation Coalition. Ball, Cheryl E. “Show, Not Tell: The Value of New Media Scholarship.” Computers and Composition 21, no. 4 (2004): 403-25. Ball, Cheryl E., and Douglas Eyman. “Digital Humanities Scholarship and Electronic Publication.” In Rhetoric and the Digital Humanities. Eds. William Hart-Davidson and Jim Ridolfo. 65-79. Chicago, IL: University of Chicago Press, 2015. Balmer, J. “Review: Digital Hadrian’s Villa Project.” Journal of the Society of Architectural Historians 73 (3) (2014): 444-445. Baltes, Elizabeth P.
“Dedication and Display of Portrait Statues in Hellenistic Greece: Spatial Practices and Identity Politics.” PhD dissertation, Duke University, 2016. Balsamo, Anne Marie. Designing Culture: The Technological Imagination at Work. Durham, NC: Duke University Press, 2011. Balsamo, Anne Marie. “Videos and Frameworks for ‘Tinkering’ in a Digital Age.” Spotlight on Digital Media and Learning. http://sptlight.macfound.org/blog/entry/anne-balsamo-tinkering-videos. Banz, David A. “The Values of the Humanities and the Values of Computing.” In Humanities and the Computer: New Directions. Ed. David S. Miall. 27-37. Oxford, UK: Clarendon Press, 1990. Barab, Sasha, and Kurt Squire. “Design-Based Research: Putting a Stake in the Ground.” The Journal of the Learning Sciences 13, no. 1 (2004): 1-14. Barab, Sasha, et al. “Making Learning Fun: Quest Atlantis, A Game Without Guns.” Educational Technology Research & Development 53 (1) (2005): 86-107. Barateiro, J., G. Antunes, F. Freitas, and J. Borbinha. “Designing Digital Preservation Solutions: A Risk Management-Based Approach.” International Journal of Digital Curation 5 (1) (2010): 4-17. 10.2218/ijdc.v5i1.140. Barbour, Kim. “Hiding in Plain Sight: Street Artists Online.” Journal of Media and Communication 5, no. 1 (2013): 86-96. Barnett, Fiona. “The Brave Side of Digital Humanities.” Differences 25, no. 1 (2014): 64-78. Barnett, Fiona, Zach Blas, Micha Cárdenas, Jacob Gaboury, Jessica Marie Johnson, and Margaret Rhee. “Queer OS: A User’s Manual.” In Debates in the Digital Humanities. Eds. Matthew Gold and Lauren Klein. 50-59. Minneapolis, MN: University of Minnesota Press, 2016. Barribeau, Susan. “Enhancing Digital Humanities at UW-Madison: A White Paper.” University of Wisconsin at Madison, 2009. http://dighum.wisc.edu/facultyseminar/index.html. Barthes, Roland. Camera Lucida: Reflections on Photography. Translated by Richard Howard. New York, NY: Farrar, Straus and Giroux, 1981. Barthes, Roland. “From Work to Text.” In Image, Music, Text. Trans. Stephen Heath. 155-164. New York, NY: Hill and Wang, 1977. Bartle, R.A. Designing Virtual Worlds. Indianapolis, IN: New Riders, 2004. Bartscherer, Thomas and Roderick Coover. Switching Codes: Thinking Through Digital Technology in the Humanities and the Arts. Chicago, IL: University of Chicago Press, 2011. Bates, David. “Peer Review and Evaluation of Digital Resources for the Arts and Humanities.” Institute of Historical Research – Digital Resources, n.d. http://www.history.ac.uk/projects/digital/peer-review. Batley, S. Information Architecture for Information Professionals. Oxford, UK: Chandos, 2007. Battles, Matthew, and Michael Maizels. “Collections and/of Data: Art History and the Art Museum in the DH Mode.” In Debates in the Digital Humanities. Eds. Matthew Gold and Lauren Klein. 325-344. Minneapolis, MN: University of Minnesota Press, 2016. Baym, Nancy K. Personal Connections in the Digital Age. Cambridge, UK: Polity Press, 2015. Baym, Nancy K., and Danah Boyd. “Socially Mediated Publicness: An Introduction.” Journal of Broadcasting and Electronic Media 56, no. 3 (September 2012): 320-29. Beagrie, N. “The Digital Curation Centre.” Learned Publishing 17 (2004): 7-9. Bearman, David and Jennifer Trant. “Authenticity of Digital Resources: Towards a Statement of Requirements in the Research Process.” D-Lib Magazine 4, no. 6 (June 1998). Beckett, C. Supermedia: Saving Journalism So It Can Save the World. London, UK: Wiley-Blackwell, 2008. Becker, Jonathan.
“Scholar 2.0: Public Intellectualism Meets the Open Web.” UCEA Review 52, no. 2 (June 16, 2011): 17-19. http://www.ucea.org/special_feature_52_2_pcp/2011/6/16/scholar-20-public-intellectualism-meets-the-open-web.html Belfiore, E. and A. Upchurch, eds. Humanities in the Twenty-First Century: Beyond Utility and Markets. New York, NY: Palgrave Macmillan, 2013. Benedict, B.M. Curiosity: A Cultural History of Early Modern Inquiry. Chicago, IL: University of Chicago Press, 2001. Benjamin, Walter. “Theses on the Philosophy of History.” In Illuminations. Trans. H. Zohn. 245-255. London, UK: Fontana, 1992. Benjamin, Walter. “The Work of Art in the Age of Mechanical Reproduction.” In Illuminations. Ed. Hannah Arendt. Trans. Harry Zohn. New York, NY: Schocken Books, 1969. Benkler, Yochai. The Wealth of Networks: How Social Production Transforms Markets and Freedom. New Haven, CT: Yale University Press, 2006. Bentkowska-Kafel, Anna, Hugh Denard, and Drew Baker, eds. Paradata and Transparency in Virtual Heritage. Digital Research in the Arts and Humanities. Burlington, VT: Ashgate, 2012. Bentkowska-Kafel, Anna, Trish Cashen, and Hazel Gardiner, eds. Digital Art History: A Subject in Transition. Bristol, UK: Intellect, 2005. Berens, Kathi Inman. “Interface.” In Digital Pedagogy in the Humanities: Concepts, Models, and Experiments. Eds. Rebecca Frost Davis, Matthew K. Gold, Katherine D. Harris, and Jentery Sayers. New York: Modern Language Association, 2015. https://digitalpedagogy.commons.mla.org/keywords/interface/. Berens, Kathi Inman. “Judy Malloy’s Seat at the (Database) Table: A Feminist Reception History of Early Hypertext Literature.” Literary & Linguistic Computing 29.3 (2014): 340-348. Berg, A.J. “A Gendered Socio-Technical Construction: The Smart House.” In The Social Shaping of Technology. Eds. D. MacKenzie and J. Wajcman. 301-313. Buckingham, UK: Open University Press, 1999. Berg, Maggie, and Barbara Seeber. The Slow Professor: Challenging the Culture of Speed in the Academy. Toronto, ON: University of Toronto Press, 2016. Berger, John. Ways of Seeing. New York, NY: Penguin, 1972. Berlin, Isaiah. “The Divorce between the Sciences and the Humanities.” In The Proper Study of Mankind. 326-58. New York, NY: Farrar, Straus and Giroux, 1997. Berman, Merrick Lex. “Boundaries or Networks in Historical GIS: Concepts of Measuring Space.” Historical Geography 33 (2005): 118-33. Bernardi, Joanne, and Nora Dimmock. “Creative Curating: The Digital Archive as Argument.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 187-97. Minneapolis, MN: University of Minnesota Press, 2017. Bernardou, A., P. Constantopoulos, C. Dallas, and D. Gavrilis. “Understanding the Information Requirements of Arts and Humanities Scholarship: Implications for Digital Curation.” International Journal of Digital Curation 5 (2010): 18-33. Berners-Lee, Tim, and Mark Fischetti. Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by its Inventor. San Francisco, CA: Harper, 1999. Berners-Lee, T., J. Hendler, and O. Lassila. “The Semantic Web.” Scientific American 284 (5) (2001): 28-37. Bernstein, M.C. “Hypertext and the Linearity of History.” In HypertextNow: Remarks on the State of Hypertext, 1996-1999. 1999. Berry, David M. “The Computational Turn: Thinking about the Digital Humanities.” Culture Machine 12 (2011): 1-22. http://www.culturemachine.net/index.php/cm/article/view/440/470/. Berry, David M.
“Critical Digital Humanities.” Author’s blog. http://stunlaw.blogspot.com/2013/01/critical-digital-humanities.html Berry, David M. Copy, Rip, Burn: The Politics of Copyleft and Open Source. London, UK: Pluto Press, 2008. Berry, David M. and Anders Fagerjord. Digital Humanities. Cambridge, UK: Polity Press, 2017. Berry, David M. The Philosophy of Software: Code and Mediation in the Digital Age. London, UK: Palgrave Macmillan, 2011. Berry, David M., ed. Understanding Digital Humanities. New York, NY: Palgrave Macmillan, 2012. Bescoby, D.J. “Detecting Roman Land Boundaries in Aerial Photographs Using Radon Transforms.” Journal of Archaeological Science 33 (2006): 735-43. Besser, Howard. The Past, Present, and Future of Digital Libraries. Oxford, UK: Blackwell, 2004. Best, Stephen, and Sharon Marcus. “Surface Reading: An Introduction.” Representations 108 (2009): 1-21. Bevan, Andrew, and James Conolly. “GIS, Archaeological Survey, and Landscape Archaeology on the Island of Kythera, Greece.” Journal of Field Archaeology 29, no. 1/2 (2002): 123-138. Bianco, Jamie “Skye.” “This Digital Humanities Which Is Not One.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 96-112. Minneapolis, MN: University of Minnesota Press, 2012. Biber, Douglas. “Representativeness in Corpus Design.” Literary and Linguistic Computing 8, no. 4 (1993): 243-257. Bijker, Wiebe E., Thomas P. Hughes, and Trevor Pinch, eds. The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology. Cambridge, MA: MIT Press, 1989. Billinghurst, Mark, Adrian Clark, and Gun Lee. A Survey of Augmented Reality. Hanover, MA: Now Publishers, 2015. Bimber, Oliver and Ramesh Raskar. Spatial Augmented Reality: Merging Real and Virtual Worlds. Wellesley, MA: A K Peters, 2005. Binder, Jeffrey M. “Alien Reading: Text Mining, Language Standardization, and the Humanities.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 201-217. Minneapolis, MN: University of Minnesota Press, 2016. Binkley, Robert C. “New Tools, New Recruits, for the Republic of Letters.” Robert C. Binkley, 1897-1940: Life, Works, Ideas. http://www.wallandbinkley.com/rcb/works/new-tools-new-recruits-for-the-republic-of-letters.html. Bird, Steven, Ewan Klein, and Edward Loper. Natural Language Processing with Python. Beijing, China: O’Reilly, 2009. Birkerts, Sven. The Gutenberg Elegies: The Fate of Reading in an Electronic Age. Boston, MA: Faber and Faber, 1994. Bissell, T. Extra Lives: Why Video Games Matter. New York, NY: Pantheon Books, 2010. Bjork, Olin. “Digital Humanities and the First Year Writing Course.” In Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. 97-119. Open Book Publishers, 2012. Blackwell, Christopher, and Thomas R. Martin. “Technology, Collaboration, and Undergraduate Research.” DHQ: Digital Humanities Quarterly 3, no. 1. http://digitalhumanities.org/dhq/vol/3/1/000024/000024.html. Blair, Ann. Too Much to Know: Managing Scholarly Information Before the Modern Age. New Haven, CT: Yale University Press, 2010. Blais, Joline, Jon Ippolito, and Owen Smith. New Criteria for New Media. New Media Department, University of Maine, January 2007. http://newmedia.umaine.edu/interarchive/new_criteria_for_new_media.html. Blaney, Jonathan.
“Citing Digital Resources.” SECT: Sustaining the EEBO-TCP. Bodleian Library. https://blogs.bodleian.ox.ac.uk/eebotcp/sect/. Blanke, Tobias. Digital Asset Ecosystems: Rethinking Clouds and Crowds. Oxford, UK: Chandos Publishing, 2014. Blanke, Tobias, and M. Hedges. “Scholarly Primitives: Building Institutional Infrastructure for Humanities e-Science.” Future Generation Computer Systems 29 (2) (2013): 654-661. Blei, David M. “Topic Modeling and Digital Humanities.” Journal of Digital Humanities 2, no. 1 (Winter 2012). http://journalofdigitalhumanities.org/2-1/topic-modeling-and-digital-humanities-by-david-m-blei. Blevins, Cameron. “Digital History’s Perpetual Future Tense.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 308-324. Minneapolis, MN: University of Minnesota Press, 2016. Blevins, Cameron. “Space, Nation, and the Triumph of Region: A View of the World from Houston.” Journal of American History 101, no. 1 (June 2014): 122-147. Block, Sharon. “Doing More with Digitization: An Introduction to Topic Modeling of Early American Sources.” Common-Place 6, no. 2 (January 2006). http://www.common-place.org/vol-06/no-02/tales/. Blum, Andrew. Tubes: A Journey to the Center of the Internet. New York, NY: Ecco/Harper Collins, 2012. Blustain, Harvey, and Donald Spicer. “Digital Humanities at the Crossroads: The University of Virginia.” ECAR Case Studies. Boulder, CO: Educause, 2005. net.educause.edu/ir/library/pdf/ers0605/cs/ecs0506.pdf. Boast, R., M. Bravo, and R. Srinivasan. “Return to Babel: Emergent Diversity, Digital Resources, and Local Knowledge.” Information Society 23, 5 (2007): 395-403. Bode, Katherine. Reading by Numbers: Recalibrating the Literary Field. London, UK: Anthem Press, 2012. Bode, Katherine. “Resourceful Reading: A New Empiricism in the Digital Age?” In Resourceful Reading: The New Empiricism, eResearch, and Australian Literary Culture. Eds. Katherine Bode and Robert Dixon. 1-27. Sydney, Australia: University of Sydney Press, 2009. Bodenhamer, D.J. “Narrating Space and Place.” In Spatial Narratives and Deep Maps. Eds. D.J. Bodenhamer, J. Corrigan and T.M. Harris. 7-27. Bloomington, IN: Indiana University Press, 2015. Bodenhamer, David J., J. Corrigan, and T.M. Harris, eds. Spatial Narratives and Deep Maps. Bloomington, IN: Indiana University Press, 2015. Bodenhamer, David J. “The Potential of Spatial Humanities.” In The Spatial Humanities: GIS and the Future of Humanities Scholarship. Eds. David J. Bodenhamer, John Corrigan, and Trevor M. Harris. 14-30. Bloomington, IN: Indiana University Press, 2010. Bodenhamer, David J., John Corrigan, and Trevor Harris, eds. The Spatial Humanities: GIS and the Future of Humanities Scholarship. Bloomington, IN: Indiana University Press, 2010. Brodersen, Lars. Geo-Communication and Information Design. Fredrikshavn, Denmark: Tankegang, 2008. Boellstorff, Tom, Bonnie Nardi, Celia Pearce, T.L. Taylor, and George E. Marcus. Ethnography and Virtual Worlds: A Handbook of Method. Princeton, NJ: Princeton University Press, 2012. Boeva, Yana, Devon Elliott, Edward Jones-Imhotep, Shean Muhammedi, and William J. Turkel. “Doing History by Reverse Engineering Electronic Devices.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 163-76. Minneapolis, MN: University of Minnesota Press, 2017. Boggs, Jeremy, Jennifer Reed, and J.K. Purdom Lindblad.
“Making it Matter.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 322-30. Minneapolis, MN: University of Minnesota Press, 2017. Boggs, Jeremy. “Participating in the Bazaar: Sharing Code in the Digital Humanities.” Clioweb. June 10, 2010. http://clioweb.org/2010/06/10/participating-in-the-bazaar-sharing-code-in-the-digital-humanities/. Bogost, Ian. “The Cathedral of Computation.” The Atlantic, January 15, 2015. http://www.theatlantic.com/technology/archive/2015/01/the-cathedral-of-computation/384300/. Bogost, Ian. “Gamification Is Bullshit.” The Atlantic, August 9, 2011. www.theatlantic.com/technology/archive/2011/08/gamification-is-bullshit/243338/. Bogost, Ian. Persuasive Games: The Expressive Power of Videogames. Cambridge, MA: MIT Press, 2007. Bogost, Ian, and Nick Montfort. “Platform Studies: Frequently Questioned Answers.” Digital Arts and Culture Conference Proceedings (12-15 December 2009): 12-15. Bogost, Ian. “The Turtlenecked Hairshirt.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 241-242. Minneapolis, MN: University of Minnesota Press, 2012. Bohon, Cory, Jennifer Guiliano, James Smith, George Williams, and Amanda Visconti. “‘Making the Digital Humanities More Open’: Modeling Digital Humanities for a Wider Audience.” Journal of Digital Humanities 3, no. 1 (Spring 2014). Bol, Peter K., and Jianxiong Ge. “China Historical GIS.” Historical Geography 33 (2005): 150-2. Bolter, J. David. “Critical Theory and the Challenge of New Media.” In Eloquent Images: Word and Image in the Age of New Media. Eds. Mary E. Hocks and Michelle R. Kendrick. 19-36. Cambridge, MA: MIT Press, 2003. Bolter, J. David. “Ekphrasis, Virtual Reality, and the Future of Writing.” In The Future of the Book. Ed. Geoffrey Nunberg. 253-72. Berkeley, CA: University of California Press, 1996. Bolter, J. David. Writing Space: The Computer, Hypertext, and the History of Writing. Boston, MA: Houghton Mifflin, 1991. Bolter, J. David. Writing Space: Computers, Hypertext, and the Remediation of Print. Taylor & Francis, 2001. Bolter, J. David, and Richard Grusin. Remediation: Understanding New Media. Cambridge, MA: MIT Press, 2000. Bonacchi, Chiara, ed. Archaeologists and the Digital: Towards Strategies of Engagement. London, UK: Archetype Publications, 2012. Bonds, E. Leigh. “Listening in on the Conversations: An Overview of Digital Humanities Pedagogy.” CEA Critic 76, no. 2 (July 2014). https://muse.jhu.edu/login?auth=0&type=summary&url=/journals/cea_critic/v076/76.2.bonds.pdf. Booch, Grady, James Rumbaugh, and Ivar Jacobson. The Unified Modeling Language User Guide. Upper Saddle River, NJ: Addison-Wesley, 2005. Borenstein, Greg. Making Things See: 3D Vision with Kinect, Processing, Arduino, and MakerBot. Sebastopol, CA: Media Maker, 2012. Borgman, Christine L. Big Data, Little Data, No Data: Scholarship in the Networked World. Cambridge, MA: MIT Press, 2015. Borgman, Christine L. “The Digital Future Is Now: A Call to Action for the Humanities.” Digital Humanities Quarterly 3, no. 4 (2009). http://works.bepress.com/borgman/233/. Borgmann, Albert. Holding on to Reality: The Nature of Information at the Turn of the Millennium. Chicago, IL: University of Chicago Press, 1999. Borgman, Christine L. Scholarship in the Digital Age. Cambridge, MA: MIT Press, 2007. Börner, K. The Atlas of Science: Visualizing What We Know. Cambridge, MA: MIT Press, 2010. Bornstein, George, and Ralph G. Williams, eds.
Palimpsest: Editorial Theory in the Humanities. Ann Arbor, MI: University of Michigan Press, 1993. Bornstein, George and Theresa Tinkle. The Iconic Page in Manuscript, Print, and Digital Culture. Ann Arbor, MI: University of Michigan Press, 1998. Bosak, Jon and Tim Bray. “XML and the Second-Generation Web.” Scientific American (6 May 1999). Bouchard, Matt, and Andy Keenan. “From Theory to Experience to Making to Breaking: Iterative Game Design for Digital Humanists.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 328-40. New York, NY: Routledge, 2016. Bowker, Geoffrey C., and Susan Leigh Star. Sorting Things Out: Classification and its Consequences. Cambridge, MA: MIT Press, 2000. Boyack, Kevin W., Brian N. Wylie, and George S. Davidson. “A Call to Researchers: Digital Libraries Need Collaboration Across Disciplines.” D-Lib Magazine 7, no. 10 (October 2001). http://www.dlib.org/dlib/october01/boyack/10boyack.html. Boyd, Danah, and Kate Crawford. “Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon.” Information, Communication & Society 15, no. 5 (2012): 662-679. Boyd, Danah, Scott Golder, and Gilad Lotan. “Tweet, Tweet, Retweet: Conversational Aspects of Retweeting on Twitter.” Hawaii International Conference on System Sciences, 2010, Kauai, Hawaii. Boyd, Jason, and Lynne Siemens. “Project Management.” DHSI@Congress 2014. 2014. Boyle, James. “A Closed Mind about an Open World.” Financial Times. August 7, 2006. http://www.ft.com/. Path: Search; “Boyle Closed Mind.” Boyle, James. The Public Domain: Enclosing the Commons of the Mind. New Haven, CT: Yale University Press, 2008. Brabham, D.C. Crowdsourcing. MIT Press Essential Knowledge Series. Cambridge, MA: MIT Press, 2013. Bradley, John. “No Job for Techies: Technical Contributions to Research in the Digital Humanities.” In Collaborative Research in the Digital Humanities. Eds. M. Deegan and W. McCarty. 11-26. Farnham, UK: Ashgate, 2012. Bradshaw, Jeffrey, ed. Software Agents. Cambridge, MA: MIT Press, 1997. Bradshaw, Roy, and Robert J. Abrahart. “Widening Participation in Historical GIS: The Case of Digital Derby 1841.” RGS-IBG Annual International Conference, London. September 1, 2005. Bradwell, P. The Edgeless University: Why Higher Education Must Embrace Technology. London, UK: Demos, 2009. Brennan, Sheila A. “Let the Grant Do the Talking.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/let-the-grant-do-the-talking-by-sheila-brennan/. Brennan, Sheila A. “Navigating DH for Cultural Heritage Professionals.” Lot 49. January 10, 2011. http://www.lotfortynine.org/2011/01/navigating-dh-for-cultural-heritage-professionals/. Brennan, Sheila A. “Public, First.” In Debates in the Digital Humanities. Eds. Matthew Gold and Lauren Klein. 384-389. Minneapolis, MN: University of Minnesota Press, 2016. Brett, Guy. “The Computers Take to Art.” The Times, 2 August (1968): 7. Brett, Megan R. “Topic Modeling: A Basic Introduction.” Journal of Digital Humanities 2, no. 1 (Winter 2012). http://journalofdigitalhumanities.org/2-1/topic-modeling-a-basic-introduction-by-megan-r-brett/ Brier, Stephen. “Where’s the Pedagogy? The Role of Teaching and Learning in the Digital Humanities.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 390-401. Minneapolis, MN: University of Minnesota Press, 2012. Britton, Lauren.
“Democratized Tools of Production: New Technologies Spurring the Maker Movement.” Technology & Social Change Group. Seattle, WA: University of Washington Information School, 2014. Britton, Lauren. “Examining the Maker Movement through Discourse Analysis: An Introduction.” Technology & Social Change Group. Seattle, WA: University of Washington Information School, 2014. Britton, Lauren. “Power, Access, Status: The Discourse of Race, Gender, and Class in the Maker Movement.” Technology & Social Change Group. Seattle, WA: University of Washington Information School, 2015. Britton, Lauren. “STEM, DASTEM, and STEAM in Making: Debating America’s Economic Future in the 21st Century.” Technology & Social Change Group. Seattle, WA: University of Washington Information School, 2014. Brosnan, Mark. Technophobia: The Psychological Impact of Information Technology. London, UK: Routledge, 1998. Brook, T. “Mapping Knowledge in the Sixteenth Century: The Gazetteer Cartography of Ye Chunji.” The [Princeton University, Gest] East Asian Library Journal 7:2 (1994): 5-32. Brooke, Collin. Lingua Fracta: Toward a Rhetoric of New Media. New York, NY: Hampton Press, 2009. Brown, Bill. “Thing Theory.” Critical Inquiry 28.1 (Autumn 2001): 1-22. Brown, James Jr. “Crossing State Lines: Rhetoric and Software Studies.” In Rhetoric and the Digital Humanities. Eds. Jim Ridolfo and William Hart-Davidson. 20-32. Chicago, IL: University of Chicago Press, 2015. Brown, John Seely and Douglas Thomas. A New Culture of Learning: Cultivating the Imagination for a World of Constant Change. CreateSpace Independent Publishing Platform, 2011. Brown, John Seely and Paul Duguid. The Social Life of Information. Cambridge, MA: Harvard Business School Press, 2000. Brown, John Seely and Paul Duguid. “Universities in the Digital Age.” Change 24.4 (1996): 10-19. Brown, Laura, Rebecca Griffiths, and Matthew Rascoff. University Publishing in a Digital Age. New York, NY: ITHAKA, 2007. Brown, Paul, Charlie Gere, Nicholas Lambert, and Catherine Mason, eds. White Heat Cold Logic: British Computer Art 1960-80. Cambridge, MA: MIT Press, 2010. Brown, Susan. “CWRC-Writer.” The Canadian Writing Research Collaboratory. http://www.dh2012.uni-hamburg.de/conference/programme/abstracts/cwrc-writer-an-in-browser-xml-editor/. Brown, Susan. “Towards Best Practices in Collaborative Online Knowledge Production.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 47-64. New York, NY: Routledge, 2016. Brown, Susan, and John Simpson. “The Curious Identity of Michael Field and its Implications for Humanities Research with the Semantic Web.” IEEE Big Humanities Data (2013): 77-85. Brown, Susan, John Simpson, the INKE Research Group, and CWRC Project Team. “The Changing Culture of Humanities Scholarship: Iteration, Recursion, and Versions in Scholarly Collaboration Environments.” Scholarly and Research Communication 5.4 (2014). Brown, S., and M. Greengrass. “Research Portals in the Arts and Humanities.” Literary and Linguistic Computing 25, no. 1 (2010): 1-21. Brown, Vincent. “Mapping a Slave Revolt: Digital Tools and the Historian’s Craft.” American Historical Association, New York City, January 2-5, 2015. https://aha.confex.com/aha/2015/webprogram/Paper17474.html. Browne, Simone. Dark Matters: On the Surveillance of Blackness. Durham, NC: Duke University Press, 2015. Bruns, Axel, and Hallvard Moe. “Structural Layers of Communication on Twitter.” In Twitter and Society. Eds.
Katrin Weller, Axel Bruns, Jean Burgess, Merja Mahrt, Cornelius Puschmann. 15-28. New York, NY: Peter Lang, 2014. Bruns, Axel, and Stefan Stieglitz. “Quantitative Approaches to Comparing Communication Patterns on Twitter.” Journal of Technology and Human Services 30, nos. 3-4 (2012): 160-85. Bruns, Axel, and Stefan Stieglitz. “Towards More Systematic Twitter Analysis: Metrics for Tweeting Activities.” International Journal of Social Research Methodology (2013). Bruzelius, Caroline. Preaching, Building and Burying: Friars and the Medieval City. London, UK: Yale University Press, 2014. Bruzelius, Caroline. “Teaching with Visualization Technologies: How Information Becomes Knowledge.” Material Religion 9 (2013): 246-253. Bruzelius, Caroline. “Visualizing Venice: An International Collaboration.” In Lo Spazio Narrabile: Scritti di Storia in Onore di Donatella Calabi. Eds. Rosa Tamborrino and Guido Zucconi. 155-160. Venice, Italy: Quodlibet, 2014. Bryant, Levi. The Democracy of Objects. Ann Arbor, MI: Open Humanities Press, 2011. Bryson, Tim. “Digital Humanities.” SPEC Kit 326. Washington, DC: Association of Research Libraries, 2011. Buckland, Michael K. “Information as Thing.” Journal of the American Society for Information Science 42, no. 5 (1991): 351-360. Bulger, Monica, Eric Meyer, Grace De la Flor, Melissa Terras, Sally Wyatt, Marina Jirotka, Katherine Eccles, and Christine McCarthy Madsen. “Reinventing Research? Information Practices in the Humanities.” A Research Information Network Report, 2011. Burdette, Alan R. “EVIA Digital Archive Project.” In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 189-209. Houston, TX: Rice University Press, 2010. Burdick, Anne, Johanna Drucker, Peter Lunenfeld, Todd Presner, and Jeffrey Schnapp. Digital_Humanities. Cambridge, MA: MIT Press, 2012. Burgess, Helen J., and Jeanne Hamming. “New Media in the Academy: Labor and the Production of Knowledge in Scholarly Multimedia.” DHQ: Digital Humanities Quarterly 5, no. 3 (Summer 2011). http://digitalhumanities.org/dhq/vol/5/3/000102/000102.html. Burgoyne, John Ashley, Ichiro Fujinaga, and J. Stephen Downie. “Music Information Retrieval.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 213-228. West Sussex, UK: Wiley Blackwell, 2016. Burke, Timothy. “The Humane Digital.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 514-518. Minneapolis, MN: University of Minnesota Press, 2016. Burnard, L., K. O’Brien O’Keefe, and J. Unsworth, eds. Electronic Textual Editing. New York, NY: Modern Language Association, 2006. Burrows, John F. Computation into Criticism. Oxford, UK: Clarendon Press, 1987. Burrows, John. “Textual Analysis.” In A Companion to Digital Humanities. http://nora.lis.uiuc.edu:3030/companion/view?docId=blackwell/9781405103213/9781405103213.xml&chunk.id=ss1-4-4&toc.depth=1&toc.id=ss1-4-4&brand=9781405103213_brand Burrows, T. “A Data-Centered ‘Virtual Laboratory’ for the Humanities: Designing the Australian Humanities Networked Infrastructure (HuNI) Service.” Literary and Linguistic Computing 28 (4) (2013): 576-81. Burton, Matt. “The Joy of Topic Modeling.” http://mcburton.net/blog/joy-of-tm/. Buurma, Rachel Sagner, and Anna Tione Levine. “The Sympathetic Research Imagination: Digital Humanities and the Liberal Arts.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 274-279.
Minneapolis, MN: University of Minnesota Press, 2016. Buzzetti, Dino. “Digital Representation and the Text Model.” New Literary History 33 (2002): 61-88. Buzzetti, Dino, and Jerome McGann. “Critical Editing in a Digital Horizon.” In Electronic Textual Editing. Eds. Lou Burnard, Katherine O’Brien O’Keefe, and John Unsworth. 53-73. New York, NY: Modern Language Association, 2006. Byron, Mark. “Digital Scholarly Editions of Modernist Texts: Navigating the Text in Samuel Beckett’s Watt Manuscripts.” Sydney Studies in English 36 (2010): 150-69. Callahan, V. Reclaiming the Archive: Feminism and Film History. Detroit, MI: Wayne State University Press, 2010. Campbell, Timothy. Wireless Writing in the Age of Marconi. Minneapolis, MN: University of Minnesota Press, 2006. Cantara, Linda. “Long-Term Preservation of Digital Humanities Scholarship.” OCLC Systems and Services 22, no. 1 (2006): 38-42. Carey, Craig. “And: Marks, Maps, Media, and the Materiality of Ambrose Bierce’s Style.” American Literature 85, no. 4 (2013): 629-660. Carey, James W. Communication as Culture: Essays on Media and Society. New York and London: Routledge, 1992. Carr, Nicholas. “Is Google Making Us Stupid?” The Atlantic. July/August 2008. http://www.theatlantic.com/magazine/archive/2008/07/is-google-making-us-stupid/6868/. Carr, Nicholas. The Shallows: What the Internet Is Doing to Our Brains. New York, NY: W. W. Norton, 2010. Carr, Patrick L. “Serendipity in the Stacks: Libraries, Information Architectures, and the Problems of Accidental Discovery.” College and Research Libraries. Association of College and Research Libraries, 2015. http://crl.acrl.org/content/early/2015/01/01/crl14-655.full.pdf. Carter, Paul. The Road to Botany Bay: An Essay in Spatial History. London, UK: Faber & Faber, 1987. Carusi, A., A.S. Hoel, T. Webmoor, and S. Woolgar, eds. Visualization in the Age of Computerization. New York, NY: Routledge, 2015. Castells, Manuel. The Rise of the Network Society. Cambridge, MA: Blackwell, 1996. Causer, T. and M. Terras. “Crowdsourcing Bentham: Beyond the Traditional Boundaries of Academic History.” International Journal of Humanities and Arts Computing 8 (1) (2014): 46-64. Cavanagh, Sheila. “Living in a Digital World: Rethinking Peer Review, Collaboration and Open Access.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/living-in-a-digital-world-by-sheila-cavanagh/. Cázes, Hélène, and J. Matthew Huculak. “Understanding the Pre-digital Book: ‘Every Contact Leaves a Trace’.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, and Ray Siemens. 65-82. New York, NY: Routledge, 2016. Cecire, Natalia. “The Visible Hand.” Works Cited. http://nataliacecire.blogspot.com/2011/05/visible-hand.html. Cecire, Natalia. “When Digital Humanities Was in Vogue.” Journal of Digital Humanities 1, no. 1 (2011): 54-59. Center for Digital Research in the Humanities, University of Nebraska-Lincoln. “Promotion & Tenure Criteria for Assessing Digital Research in the Humanities.” Center for Digital Research in the Humanities. http://cdrh.unl.edu/articles/eval_digital_scholar.php. Center for Digital Research in the Humanities, University of Nebraska-Lincoln. “Recommendations for Digital Humanities Projects.” Center for Digital Research in the Humanities, n.d. http://cdrh.unl.edu/articles/best_practices.php. Chabries, D.M., S.W. Booras, and G.H. Bearman.
“Imaging the Past: Recent Applications of Multispectral Imaging Technology to Deciphering Manuscripts.” Antiquity 77, no. 296 (2003): 359-72. Chachra, Debbie. “Beyond Making.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 319-21. Minneapolis, MN: University of Minnesota Press, 2017. Chachra, Debbie. “Why I Am Not a Maker.” The Atlantic, January 23, 2015. www.theatlantic.com/technology/archive/2015/01/why-i-am-not-a-maker/384767. Champion, Erik. Critical Gaming: Interactive History and Virtual Heritage. New York, NY: Routledge, 2015. Chan, Anita Say, and Harriet Green. “Practicing Collaborative Digital Pedagogy to Foster Digital Literacies in Humanities Classrooms.” Educause Review. 2014. Chan, Seb. “Spreadable Collections: Measuring the Usefulness of Collection Data.” Museums and the Web 2010: Proceedings. Toronto, ON: Archives & Museum Informatics, 2010. http://www.archimuse.com/mw2010/papers/chan/chan.html. Chang, K-T. Introduction to Geographic Information Systems. Boston, MA: McGraw-Hill, 2009. Chartier, Roger. The Order of Books. Trans. Lydia G. Cochrane. Stanford, CA: Stanford University Press, 1994. Chassanoff, Alexandra. “Historians and the Use of Primary Sources in the Digital Age.” The American Archivist 76, no. 2 (2013): 430-471. Cheal, C. “Second Life: Hype or Hyperlearning?” On the Horizon 15, no. 4 (2004): 204-210. Chen, Chaomei. Information Visualization: Beyond the Horizon. 2nd ed. New York, NY: Springer, 2006. Chenhall, R.G. “The Archaeological Data Bank: A Progress Report.” Computers and the Humanities 5, no. 3 (1971): 159-169. Chenhall, R.G. “The Description of Archaeological Data in Computer Language.” American Antiquity 32, no. 2 (1967): 161-167. Chenhall, R.G. “The Impact of Computers on Archaeological Theory: An Appraisal and Projection.” Computers and the Humanities 3, no. 1 (1968): 15-24. Chernaik, W., C. Davis, and M. Deegan, eds. The Politics of the Electronic Text. London, UK: University of London Centre for English Studies, 1993. Chrisman, Nicholas. Exploring Geographic Information Systems. 2nd ed. New York, NY: John Wiley & Sons, Inc., 2002. Christen, K. “Ara Irititja: Protecting the Past, Accessing the Future-Indigenous Memories in a Digital Age.” Museum Anthropology 29, 1 (2006): 56-60. Christensen, Christian. “Twitter Revolutions? Addressing Social Media and Dissent.” Communication Review 14, no. 3 (2011): 155-57. Chui, M., M. Löffler, and R. Roberts. “The Internet of Things.” McKinsey Quarterly. McKinsey & Company, 2010. Chun, Wendy Hui Kyong. Control and Freedom: Power and Paranoia in the Age of Fiber Optics. Cambridge, MA: MIT Press, 2008. Chun, Wendy Hui Kyong. “Introduction: Did Somebody Say New Media?” In New Media, Old Media: A History and Theory Reader. Eds. Wendy Hui Kyong Chun and Thomas Keenan. 1-10. New York, NY: Routledge, 2006. Chun, Wendy Hui Kyong, and Matthew Fuller. Programmed Visions: Software and Memory. Cambridge, MA: MIT Press, 2013. Chun, Wendy Hui Kyong, and Lisa Marie Rhody. “Working the Digital Humanities: Uncovering Shadows Between the Dark and the Light.” Differences: A Journal of Feminist Cultural Studies 25, no. 1 (2014): 1-26. Ciula, A., and Øyvind Eide. “Reflections on Cultural Heritage and Digital Humanities: Modeling in Practice and Theory.” In Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage. 35-41. New York: ACM. http://doi.acm.org/10.1145/2595188.2595207. Cilevitz, Adam.
“The Digital Chastity Belt.” 2015. http://criticalmedia.uwaterloo.ca/crimelab/?p=1482. Clavert, Frédéric. “The Digital Humanities Multicultural Revolution Did Not Happen Yet.” L’histoire contemporaine à l’ère numérique. N.p., 2013. Clement, Tanya. “The Ground Truth of DH Text Mining.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 534-535. Minneapolis, MN: University of Minnesota Press, 2016. Clement, Tanya. “Half-baked: The State of Evaluation in the Digital Humanities.” American Literary History 24.4 (2012): 876-890. Clement, Tanya. “Multiliteracies in the Undergraduate Digital Humanities Curriculum: Skills, Principles, and Habits of Mind.” In Digital Humanities Pedagogy: Practices, Principles, and Politics. Ed. Brett D. Hirsch. Cambridge, UK: Open Book Publishers, 2012. http://www.openbookpublishers.com/htmlreader/DHP/chap15.html. Clement, Tanya. “Text Analysis, Data Mining, and Visualizations in Literary Scholarship.” In Literary Studies in the Digital Age: A Methodological Primer. Eds. K. Price and R. Siemens. New York, NY: MLA Commons, 2013. Clement, Tanya. “‘A Thing Not Beginning and Not Ending’: Using Digital Tools to Distant-Read Gertrude Stein’s ‘The Making of Americans’.” Literary and Linguistic Computing 23 (3) (2008): 361-382. Clement, Tanya. “Welcome to HiPSTAS.” HiPSTAS. https://blogs.ischool.utexas.edu/hipstas/2012/11/14/welcome-to-hipstas/. Clement, Tanya E. “When Texts of Study are Audio Files: Digital Tools for Sound Studies in Digital Humanities.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 348-357. West Sussex, UK: Wiley-Blackwell, 2016. Clement, Tanya E. “Where is Methodology in Digital Humanities?” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 153-175. Minneapolis, MN: University of Minnesota Press, 2016. Clement, Tanya, S. Steger, J. Unsworth, and K. Uszkalo. “How Not to Read a Million Books.” http://www3.isrl.illinois.edu/~unsworth/hownot2read.html#sdendnote4sym. Clement, Tanya, Wendy Hagenmaier, and Jennie Levine Knies. “Toward a Notion of the Archive of the Future: Impressions of Practice by Librarians, Archivists, and Digital Humanities Scholars.” The Library Quarterly 83, no. 2 (2013): 112-30. Clouston, Nicole, and Jentery Sayers. “Fabrication and Research-Creation in the Arts and Humanities.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 313-27. New York, NY: Routledge, 2016. Coble, Zach. “Evaluating DH Work: Guidelines for Librarians.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/evaluating-digital-humanities-work-guidelines-for-librarians-by-zach-coble. Codd, E.F. “A Relational Model of Data for Large Shared Data Banks.” Communications of the ACM 13.6 (June 1970): 377-387. Cohen, Daniel J. “Creating Scholarly Tools and Resources for the Digital Ecosystem: Building Connections in the Zotero Project.” First Monday 13.8 (2008). Cohen, Daniel J. “From Babel to Knowledge: Data Mining Large Digital Collections.” D-Lib Magazine 12, no. 3 (2006). http://www.dlib.org/dlib/march06/cohen/03cohen.html. Cohen, Daniel J. “Introducing Digital Humanities Now.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 319-321. Minneapolis, MN: University of Minnesota Press, 2012. Cohen, Daniel J. “The Ivory Tower and the Open Web: Introduction: Burritos, Browsers, and Books [Draft].” Dan Cohen, July 26, 2011.
http://www.dancohen.org/2011/07/26/the-ivory-tower-and-the-open-web-introduction-burritos-browsers-and-books-draft/. Cohen, Daniel J. “Searching for the Victorians.” Dan Cohen’s Digital Humanities Blog. October 4, 2010. http://www.dancohen.org/2010/10/04/searching-for-the-victorians/ Cohen, Daniel J. “The Social Contract of Scholarly Publishing.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 322-323. Minneapolis, MN: University of Minnesota Press, 2012. Cohen, Daniel J. “Welcome to the Digital Public Library of America.” Digital Public Library of America. April 18, 2013. http://dp.la/info/2013/04/18/message-from-the-executive-director/. Cohen, Daniel J., J. Frabetti, D. Buzzetti, and J.D. Rodriguez-Velasco. Defining the Digital Humanities. 2011. http://academiccommons.columbia.edu/catalog/ac%3A150603. Cohen, Daniel J., M. Frisch, P. Gallagher, et al. “Interchange: The Promise of Digital History.” Journal of American History 95 (2) (2008): 442-451. Cohen, Daniel J. and Roy Rosenzweig. Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web. Philadelphia, PA: University of Pennsylvania Press, 2006. Cohen, Daniel J., and Roy Rosenzweig. “To Mark Up, Or Not to Mark Up.” In Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web. University of Pennsylvania Press, 2005. http://chnm.gmu.edu/digitalhistory/digitizing/3.php. Cohen, Daniel J. and Tom Scheinfeldt, eds. Hacking the Academy: New Approaches to Scholarship and Teaching from Digital Humanities. Ann Arbor, MI: University of Michigan Press, 2013. Cohen, Julie. Configuring the Networked Self. New Haven, CT: Yale University Press, 2012. Cohen, Patricia. “Humanities Scholars Embrace Digital Technology.” New York Times, November 16, 2010. http://www.nytimes.com/2010/11/17/arts/17digital.html. Cohoon, J.M. and W. Aspray. Women and Information Technology: Research on Underrepresentation. Cambridge, MA: MIT Press, 2006. Coleman, B. Hello Avatar: Rise of the Networked Generation. Cambridge, MA and London, UK: MIT Press, 2011. Coletta, Cristina Della. “Guidelines for Promotion and Tenure Committees in Judging Digital Work.” Evaluating Digital Scholarship – NINES/NEH Summer Institutes: 2011-2012. 2011. http://institutes.nines.org/docs/2011-documents/guidelines-for-promotion-and-tenure-committees-in-judging-digital-work/. College Art Association Intellectual Property Resources. http://www.collegeart.org/ip/ College Art Association and the Society of Architectural Historians. “Guidelines for the Evaluation of Digital Scholarship in Art and Architectural History.” 2016. Collins, Harry, Robert Evans, and Michael E. Gorman. “Trading Zones and Interactional Expertise.” In Trading Zones and Interactional Expertise: Creating New Kinds of Collaboration. Ed. Michael E. Gorman. 7-23. Cambridge, MA: MIT Press, 2010. Collins, Nicolas. Handmade Electronic Music: The Art of Hardware Hacking. 2nd ed. New York, NY: Routledge, 2009. Cong-Huyen, Anne. “Thinking Through Race (Gender, Class, & Nation) in the Digital Humanities: The #transformDH Example.” Anne Cong-Huyen (Blog), January 7, 2013. http://anitaconchita.org/uncategorized/mla13-presentation/. Cong-Huyen, Anne. “Toward a Transnational Asian/American Digital Humanities: A #transformDH Invitation.” In Between Humanities and the Digital. Eds. Patrik Svensson and David Theo Goldberg. 109-120. Cambridge, MA: MIT Press, 2015. Connor, W.R.
“Scholarship and Technology in Classical Studies.” In Scholarship and Technology in the Humanities: Proceedings of a Conference Held at Elvetham Hall, Hampshire, UK, 9-12 May. Ed. May Katzen. 52-62. London, UK: British Library Research, Bowker Saur, 1991. Consalvo, Mia. Cheating: Gaining Advantage in Videogames. Cambridge, MA: MIT Press, 2007. Conway, P. “Preservation in the Age of Google: Digitization, Digital Preservation, and Dilemmas.” Library Quarterly: Information, Community, Policy 80, 1 (2010): 61-79. Cook, T. “Archival Science and Postmodernism: New Formulations for Old Concepts.” Archival Science 1, 1 (2001): 3-24. Cook, T. “Evidence, Memory, Identity, and Community: Four Shifting Archival Paradigms.” Archival Science 13 (2013): 95-120. Cook, T. “Fashionable Nonsense or Professional Rebirth: Postmodernism and the Practice of Archives.” Archivaria 51 (2001): 14-35. Cooley, Heidi Rae, and Duncan A. Buell. “Building Humanities Software That Matters: The Case of the Ward One Mobile App.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 272-87. Minneapolis, MN: University of Minnesota Press, 2017. Cooper, Andrew, and Michael Simpson. “Looks Good in Practice, but Does It Work in Theory? Rebooting the Blake Archive.” Wordsworth Circle 31, no. 1 (Winter 2000): 63-68. Cooper, D., C.D. Donaldson and P. Murrieta-Flores, eds. Literary Mapping in the Digital Age. Aldershot, UK: Ashgate, 2016. Cordell, Ryan. “How Not to Teach Digital Humanities.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 459-474. Minneapolis, MN: University of Minnesota Press, 2016. Cordell, Ryan. “How to Start Tweeting and Why You Might Want To.” April, 2010. http://chronicle.com/blogs/profhacker/how-to-start-tweeting-and-why-you-might-want-to/26065 Cordell, Ryan. “New Technologies to Get Your Students Engaged.” Chronicle of Higher Education (May 2011). Cosgrave, Mike, Anna Dowling, Lynn Harding, Róisín O’Brien, and Olivia Rohan. “Evaluating Digital Scholarship: Experiences in New Programmes at an Irish University.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/evaluating-digital-scholarship-experiences-in-new-programmes-at-an-irish-university/. Coté, Mark. “Data Motility: The Materiality of Big Social Data.” Cultural Studies Review 20, no. 1 (2014). Coté, Mark. “The Prehistoric Turn? Networked New Media, Mobility and the Body.” In The International Companions to Media Studies: Media Studies Futures. Ed. Kelly Gates. 171-94. Oxford, UK: Blackwell, 2012. Coté, Mark. “Technics and the Human Sensorium: Rethinking Media Theory Through the Body.” Theory and Event 13, no. 4 (2010). Cottom, Tressie McMillan. “More Scale, More Questions: Observations from Sociology.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 540-545. Minneapolis, MN: University of Minnesota Press, 2016. Council on Library and Information Resources. “Library as Place: Rethinking Roles, Rethinking Space.” Washington, DC: Council on Library and Information Resources, 2005. http://www.clir.org/pubs/abstract/pub129abst.html. Council on Library and Information Resources. “No Brief Candle: Reconceiving Research Libraries for the 21st Century.” Washington, DC: Council on Library and Information Resources, 2008. http://www.clir.org/pubs/abstract/pub142abst.html. Council on Library and Information Resources.
“Working Together or Apart: Promoting the Next Generation of Digital Scholarship.” Washington, DC: Council on Library and Information Resources, 2009. http://www.clir.org/pubs/reports/pub145/pub145.pdf. Cowgill, George L. “Computer Applications in Archaeology.” Computers and the Humanities 2, no. 1 (1967): 17-23. Cox, Gary W., and Jonathan N. Katz. Elbridge Gerry’s Salamander: The Electoral Consequences of the Reapportionment Revolution. Cambridge, UK: Cambridge University Press, 2002. Cox, R.J. Archives & Archivists in the Information Age. New York, NY: Neal-Schuman Publishers, 2005. Craig, A.B., W.R. Sherman, and J.D. Will. Developing Virtual Reality Applications: Foundations of Effective Design. Burlington, MA: Morgan Kaufmann, 2009. Craig, Hugh, and Arthur Kinney. Shakespeare, Computers, and the Mystery of Authorship. Cambridge, UK: Cambridge University Press, 2009. Crampton, Jeremy. The Political Mapping of Cyberspace. Chicago, IL: University of Chicago Press, 2003. Crane, G. “The Humanities in the Digital Age.” Paper presented at Big Data & Uncertainty in the Humanities, University of Kansas, 2012. http://www.youtube.com/watch?v=sVdOaYgU7qA. Crane, Gregory, and Alison Jones. “Text, Information, Knowledge and the Evolving Record of Humanity.” D-Lib Magazine 12, no. 3. http://www.dlib.org/dlib/march06/jones/03jones.html. Crane, G., D. Bamman, L. Cerrato, et al. “Beyond Digital Incunabula: Modeling the Next Generation of Digital Libraries?” European Conference on Digital Libraries. 2006. http://www.eecs.tufts.edu/~dsculley/papers/incunabula.pdf. Crane, G., B. Seales, and M. Terras. “Cyberinfrastructure for Classical Philology.” DHQ: Digital Humanities Quarterly 3 (1) (2009). Cranny-Francis, Anne. Multimedia: Texts and Contexts. London, UK: Sage, 2005. “Creating Your Web Presence: A Primer for Academics.” Profhacker. February 14, 2011. http://chronicle.com/blogs/profhacker/creating-your-web-presence-a-primer-for-academics/30458 Creative Commons. creativecommons.org. Crofts, N. “Museum Informatics: The Challenge of Integration.” University of Geneva, 2004. http://archive-ouverte.unige.ch/unige:417. Crogan, Patrick. Gameplay Mode: War, Simulation, and Technoculture. Minneapolis, MN: University of Minnesota Press, 2011. Crompton, Constance, Richard J. Lane, and Ray Siemens, eds. Doing Digital Humanities: Practice, Training, Research. 1-6. New York, NY: Routledge, 2016. Crowther, P. Phenomenology of the Visual Arts (Even the Frame). Stanford, CA: Stanford University Press, 2009. Croxall, Brian. “All Things Google: Google Maps.” Profhacker. April 5, 2011. http://chronicle.com/blogs/profhacker/all-things-google-google-maps-labs/32421. Croxall, Brian. “Build Your Own Interactive Timeline.” briancroxall.net, 2010. http://briancroxall.net/TimelineTutorial/TimelineTutorial.html. Croxall, Brian. “Tired of Tech: Avoiding Tool Fatigue in the Classroom.” Writing and Pedagogy 5, no. 2 (2013): 249-68. Cubitt, Sean. “Cybertime: Ontologies of Digital Perception.” Society for Cinema Studies, Chicago, IL, March 2000. Cudworth, A.L. Virtual World Design: Creating Immersive Virtual Environments. Boca Raton, FL: CRC Press, 2014. “Cultural Analytics.” Software Studies Initiative. http://lab.softwarestudies.com/p/cultural-analytics.html. (Watch the intro video, scroll down to the description of the work at the Software Studies lab, and explore some of the examples.) CUNY Digital Humanities Resource Guide.
http://commons.gc.cuny.edu/wiki/index.php/The_CUNY_Digital_Humanities_Resource_Guide Curry, Michael R. “The Digital Individual in the Private Realm.” Annals of the Association of American Geographers 87 (1997): 681-99. Curry, Michael R. Digital Places: Living with Geographic Information Systems. London, UK: Routledge, 1998. Curry, Michael R. “Rethinking Privacy in a Geocoded World.” In Geographic Information Systems: Principles and Applications, 2nd ed. Eds. Paul A. Longley, Michael F. Goodchild, David J. Maguire, and David W. Rhind. 757-66. New York, NY: John Wiley and Sons, Inc., 1998. Dahlström, M., J. Hansson, and U. Kjellman. “‘As We May Digitize’: Institutions and Documents Reconfigured.” LIBER Quarterly 21, no. 3-4 (2012): 455-74. Darnton, Robert. “Google and the Future of Books.” New York Review of Books, February 12, 2009. http://www.nybooks.com/articles/archives/2009/feb/12/google-the-future-of-books/. Date, C.J. An Introduction to Database Systems. Reading, MA: Addison-Wesley, 2000. David Rumsey Map Collection. http://www.davidrumsey.com/. Davidson, Cathy N. “How Can A Digital Humanist Get Tenure?” HASTAC. September 17, 2012. http://hastac.org/blogs/cathy-davidson/2012/09/17/how-can-digital-humanist-get-tenure. Davidson, Cathy N. “Humanities and Technology in the Information Age.” In The Oxford Handbook of Interdisciplinarity. Eds. Robert Frodeman, Julie Thompson Klein, and Carl Mitcham. 372-79. Oxford and New York: Oxford University Press, 2010. Davidson, Cathy N. “Humanities 2.0: Promise, Perils, Predictions.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 476-489. Minneapolis, MN: University of Minnesota Press, 2012. Davidson, Cathy N. Now You See It: How the Brain Science of Attention Will Transform the Way We Live, Work, and Learn. New York, NY: Penguin, 2011. Davidson, Cathy N. “We Can’t Ignore the Influence of Digital Technologies.” Chronicle of Higher Education Review (March 23, 2007): B20. Davidson, Cathy N., and David Theo Goldberg. The Future of Thinking: Learning Institutions in a Digital Age. Cambridge, MA: MIT Press, 2010. Davies, John, Dieter Fensel, and Frank van Harmelen. Towards the Semantic Web: Ontology-Driven Knowledge Management. Hoboken, NJ: J. Wiley, 2003. Davies, Mark. “A Corpus-Based Study of Lexical Developments in Early and Late Modern English.” In Handbook of English Historical Linguistics. Eds. Merja Kytö and Päivi Pahta. Cambridge, UK: Cambridge University Press. Davies, Mark. “Expanding Horizons in Historical Linguistics with the 400 Million Word Corpus of Historical American English.” Corpora 7, no. 2 (2012): 121-57. Davis, Robin Camille. “Gephi + MALLET + EMDA.” Robin Camille Davis’ Blog. http://www.robincamille.com/2013-07-03-gephi-emda/. Davis, Robin Camille. “Testing out the NLTK sentence tokenizer.” Robin Camille Davis’ Blog. http://www.robincamille.com/2012-02-18-nltk-sentence-tokenizer/. Davies, Robin, and Michael Nixon. “Digitization Fundamentals.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 163-176. New York, NY: Routledge, 2016. Davis, Rebecca Frost. “Learning from an Undergraduate Digital Humanities Project.” Techne. December 1, 2010. http://blogs.nitle.org/2010/12/01/learning-from-an-undergraduate-digital-humanities-project/. Dawson, Ashley. “Academic Freedom and the Digital Revolution.” AAUP Journal of Academic Freedom 1 (2010). Dawson, P.
“‘Breaking the Fourth Wall’: 3D Virtual Worlds as Tools for Knowledge Repatriation in Archaeology.” Journal of Social Archaeology 11 (3) (2011): 387-402. Dear, Michael, Jim Ketchum, Sarah Luria, and Doug Richardson, eds. Geohumanities: Art, History, Text at the Edge of Place. New York, NY: Routledge, 2011. Debord, Guy. The Society of the Spectacle. Trans. Donald Nicholson-Smith. New York, NY: Zone Books, 1994. Deegan, Marilyn. “A World of Possibilities: Digitisation and the Humanities.” In Research Methods for Creating and Curating Data in the Digital Humanities. Eds. Matt Hayler and Gabriele Griffin. 181-199. Edinburgh, UK: Edinburgh University Press, 2016. Deegan, M. and K. Sutherland, eds. Text Editing, Print and the Digital World. Aldershot, UK: Ashgate. Deegan, Marilyn and Willard McCarty, eds. Collaborative Research in the Digital Humanities. Farnham, UK: Ashgate, 2011. Deleuze, Gilles. Cinema 1: The Movement Image. Trans. Hugh Tomlinson and Barbara Habberjam. Minneapolis, MN: University of Minnesota Press, 1986. Deleuze, Gilles. Cinema 2: The Time Image. Trans. Hugh Tomlinson and Barbara Habberjam. Minneapolis, MN: University of Minnesota Press, 1989. De Man, Paul. “The Resistance to Theory.” In The Resistance to Theory. Minneapolis, MN: University of Minnesota Press, 1986. DeRose, S.J., D.G. Durand, E. Mylonas, et al. “What is Text, Really?” Journal of Computing in Higher Education 1 (2) (1990): 3-26. The Design-Based Research Collective. “Design-Based Research: An Emerging Paradigm for Educational Inquiry.” Educational Researcher 32, no. 1 (2003): 5-8. Deutschmann, Mats, Anders Steinvall, and Anna Lagerström. “Raising Language Awareness Using Digital Media: Methods for Revealing Linguistic Stereotyping.” In Research Methods for Creating and Curating Data in the Digital Humanities. Eds. Matt Hayler and Gabriele Griffin. 158-180. Edinburgh, UK: Edinburgh University Press, 2016. Deuze, Mark. Media Work. Cambridge, UK: Polity, 2007. Dictionary of Art Historians. http://arthistorians.info. Dieter, Michael, and Geert Lovink. “Theses on Making in the Digital Age.” In Critical Making. Ed. Garnet Hertz. Hollywood, CA: Garnet Hertz, 2014. Digital Art History Society. https://digitalarthistorysociety.org Digging Into Data Challenge. 2009. http://www.diggingintodata.org/ Digital Curation Centre, University of Edinburgh. DCC Curation Lifecycle Model. http://www.dcc.ac.uk/digital-curation/what-digital-curation. Digital Curation Centre, University of Edinburgh. What is Digital Curation. http://www.dcc.ac.uk/digital-curation/what-digital-curation. A Companion to Digital Literary Studies. http://www.digitalhumanities.org/companionDLS/. The Digital Humanities Manifesto 2.0. 2009. http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf. Digital Humanities Now. digitalhumanitiesnow.org. Digital Humanities Quarterly. Alliance of Digital Humanities Organizations. http://digitalhumanities.org/dhq/. Digital Humanities Questions & Answers. http://digitalhumanities.org/answers/. Digital Humanities Summer Institute Statement of Ethics and Inclusion. Led by Jacqueline Wernimont and Angel David Nieves. http://www.dhsi.org/events.php#ethics+inclusion. “Digital Humanities and the Undergraduate: Campus Projects Recognized.” National Institute for Technology in Liberal Education. October 12, 2010. http://www.nitle.org/live/news/134-digital-humanities-and-the-undergraduate-campus. “Digital Humanities at the University of Washington.” Simpson Center for the Humanities, University of Washington.
"Digital Humanities at the University of Washington." Simpson Center for the Humanities, University of Washington. http://depts.washington.edu/uwch/docs/digital_humanities_case_statement.pdf.
"Digital Humanities at Yale: About." Digital Humanities at Yale. http://digitalhumanities.yale.edu/.
Digital Labor Reference Library. Digital Labor Working Group, CUNY Graduate Center. https://digitallabor.commons.gc.cuny.edu/digital-labor-reference-library/.
Digital Librarians Initiative. "Role of Librarians in Digital Humanities Centers." White Paper. Emory University Library, August 2010. http://docs.google.com/Doc?docid=0AZbw4Qx_a5JPZGM2OWdrdzZfMTMycWRncHJwbWo&hl=en.
Digital Library Federation. diglib.org.
Digital Public Library of America (DPLA). https://dp.la/.
Digital Research Infrastructure for the Arts and Humanities. www.dariah.eu.
Digital Research Tools Wiki (DiRT). https://digitalresearchtools.pbworks.com/w/page/17801672/FrontPage.
Digital Roman Forum. http://dlib.etc.ucla.edu/projects/Forum/.
Digital Scholarship Lab. University of Richmond, 2011. http://dsl.richmond.edu/.
Digital Studies/Le champ numérique. www.digitalstudies.org.
Dillon, Sheila, and Elizabeth Palmer Baltes. "Honorific Practices and the Politics of Space on Hellenistic Delos." American Journal of Archaeology 117 (2013): 207-46.
Dillon, Sheila, and Timothy D. Shea. "Sculpture and Context: Towards an Archaeology of Greek Statuary." In Greek Art in Context. Ed. D. Rodríguez Pérez. New York, NY: Routledge, 2017.
"Discussion Area, Archived." Internet Shakespeare Editions. http://internetshakespeare.uvic.ca/Annex/discussion.html#toc_On_line_numbering_in_the_electronic_edition.
Dobrzynski, Judith H. "Modernizing Art History." The Wall Street Journal. http://online.wsj.com/news/articles/SB10001424052702304518704579519632304010744.
Doel, Ronald E., and Pamela M. Henson. "Reading Photographs: Photographs as Evidence in Writing the History of Recent Science." In Writing Recent Science. Eds. Ronald E. Doel and Thomas Söderqvist. 201-236. London, UK: Routledge, 2006.
Dombrowski, Quinn. "Drupal and other Content Management Systems." In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 289-302. New York, NY: Routledge, 2016.
Dombrowski, Quinn. "What Ever Happened to Project Bamboo?" Literary and Linguistic Computing 29, no. 4 (December 2014).
Dombrowski, Quinn. "When Not to Use Drupal." Drupal for Humanists. http://drupal.forhumanists.org/book/when-not-use-drupal.
Donahue-Wallace, Kelly, Laetitia La Follette, and Andrea Pappas, eds. Teaching Art History with New Technologies: Reflections and Case Studies. Cambridge, UK: Cambridge Scholars Publishing, 2008.
Dörk, Marian, Christopher Collins, Patrick Feng, and Sheelagh Carpendale. "Critical InfoVis: Exploring the Politics of Visualization." CHI 2013 Extended Abstracts. Paris, 2013.
Dorn, Sherman. "Is (Digital) History More than an Argument about the Past?" In Writing History in the Digital Age. Eds. Kristen Nawrotzki and Jack Dougherty. Ann Arbor, MI: University of Michigan Press, 2013.
Dooley, Jackie. "Ten Commandments for Special Collections Librarians in the Digital Age." RBM: A Journal of Rare Books, Manuscripts and Cultural Heritage 10, no. 1 (2009): 61-79.
Dougherty, Jack, and Kristen Nawrotzki, eds. Writing History in the Digital Age. Ann Arbor, MI: University of Michigan Press, 2013.
Douglas, J. Yellowlees. The End of Books—Or Books Without End? Ann Arbor, MI: University of Michigan Press, 2000.
Dourish, Paul, and Genevieve Bell. Divining a Digital Future: Mess and Mythology in Ubiquitous Computing. Cambridge, MA: MIT Press, 2014.
Downey, Greg. "Virtual Webs, Physical Technologies, and Hidden Workers: The Spaces of Labor in Information Internetworks." Technology and Culture 42, no. 2 (2001): 209-235.
"Downgrading your Website, or Why We Are Moving to WordPress." Smithsonian Cooper-Hewitt Museum. http://labs.cooperhewitt.org/2014/downgrading-your-website-or-why-we-are-moving-to-wordpress/.
Drazin, Adam. "Design Anthropology: Working On, With, and For Technologies." In Digital Anthropology. Eds. Heather A. Horst and Daniel Miller. 245-265. New York, NY: Berg, 2012.
Draxler, Bridget. "Digital Humanities Symposium: The Scholar, the Library and the Digital Future." HASTAC, February 2011. http://hastac.org/blogs/bridget-draxler/digital-humanities-symposium-scholar-library-and-digital-future.
Drucker, Johanna. Graphesis: Visual Forms of Knowledge Production. Cambridge, MA: Harvard University Press, 2014.
Drucker, Johanna. "Graphical Approaches to the Digital Humanities." In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 238-250. West Sussex, UK: Wiley-Blackwell, 2016.
Drucker, Johanna. "Humanistic Theory and Digital Scholarship." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 85-95. Minneapolis, MN: University of Minnesota Press, 2012.
Drucker, Johanna. "Humanities Approaches to Graphical Display." Digital Humanities Quarterly 5, no. 1 (2011).
Drucker, Johanna. "Is There a Digital Art History?" Visual Resources 29, no. 1-2 (2013): 5-13.
Drucker, Johanna. "Performative Materiality and Theoretical Approaches to Interface." Digital Humanities Quarterly 7, no. 1 (2013).
Drucker, Johanna. SpecLab: Digital Aesthetics and Projects in Speculative Computing. Chicago, IL: University of Chicago Press, 2009.
Drucker, Johanna. "Theory as Praxis: The Poetics of Electronic Textuality." Modernism/modernity 9, no. 4 (2002): 683-691.
Drucker, Johanna, and Emily McVarish. Graphic Design History. 2nd ed. Boston, MA: Pearson, 2012.
Duguid, Paul. "Material Matters: Aspects of the Past and the Futurology of the Book." In The Future of the Book. Ed. Geoffrey Nunberg. 63-102. Berkeley, CA: University of California Press, 1996.
Duke University Libraries Digital Humanities Research Guide. http://guides.library.duke.edu/content.php?pid=129864&sid=1114048.
Dumbill, Ed. "What Is Big Data? An Introduction to the Big Data Landscape." O'Reilly Radar. 2012. http://radar.oreilly.com.
Duncan, J., and P.L. Main. "The Drawing of Archaeological Sections and Plans by Computer." Science & Archaeology 20 (1977): 17-26.
Dunne, Anthony, and Fiona Raby. Speculative Everything: Design, Fiction, and Social Dreaming. Cambridge, MA: MIT Press, 2013.
Dziuban, Charles, Charles R. Graham, and Anthony G. Picciano, eds. Blended Learning: Research Perspectives, Vol. 2. New York, NY: Routledge, 2013.
Earhart, Amy E. "Can Information Be Unfettered? Race and the New Digital Humanities Canon." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 309-318. Minneapolis, MN: University of Minnesota Press, 2012.
Earhart, Amy E. "Challenging Gaps: Redesigning Collaboration in the Digital Humanities." In The American Literature Scholar in the Digital Age. Eds. Amy Earhart and Andrew Jewell. 27-43.
Ann Arbor, MI: University of Michigan Press, 2010.
Earhart, Amy E. Recovering the Recovered Text: Diversity, Canon Building, and Digital Studies. Video, 2012. http://www.youtube.com/watch?v=7ui9PIjDreo&feature=youtube_gdata_player.
Earhart, Amy E., and Andrew Jewell. The American Literature Scholar in the Digital Age. Ann Arbor, MI: University of Michigan Press and University of Michigan Library, 2011.
Earhart, Amy E., and Toneisha L. Taylor. "Pedagogies of Race: Digital Humanities in the Age of Ferguson." In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 251-264. Minneapolis, MN: University of Minnesota Press, 2016.
Eder, Maciej. "Visualization in Stylometry: Cluster Analysis Using Networks." Digital Scholarship in the Humanities 30 (December 2015).
Edmond, Jennifer. "Collaboration and Infrastructure." In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 54-66. West Sussex, UK: Wiley-Blackwell, 2016.
Edmond, Jennifer. "The Role of the Professional Intermediary in Expanding the Humanities Computing Base." Literary and Linguistic Computing 20, no. 3 (2005): 367-380.
Edwards, Richard. "Creating the Center for Digital Research in the Humanities." University of Nebraska-Lincoln, July 18, 2005. http://cdrh.unl.edu/articles/creatingcdrh.php.
Edwards, Charlie. "The Digital Humanities and Its Users." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 213-232. Minneapolis, MN: University of Minnesota Press, 2012.
Egan, Gabriel, and John Jowett. "Review of the Early English Books Online (EEBO)." Interactive Early Modern Literary Studies (January 2001): 1-13.
Eggert, P. "Text-Encoding, Theories of the Text, and the 'Work-Site'." Literary and Linguistic Computing 20, no. 4 (2005): 425-435.
Eisenstein, Elizabeth L. The Printing Press as Agent of Change. Cambridge, UK: Cambridge University Press, 1980.
Eisenstein, Elizabeth L. The Printing Press as Agent of Change: Communications and Cultural Transformations in Early Modern Europe. Cambridge, UK: Cambridge University Press, 2009.
Eisenstein, Elizabeth L. The Printing Revolution in Early Modern Europe. Cambridge, UK: Cambridge University Press, 1983.
Electronic Literature Organization. eliterature.org.
Eliot, Simon, and Jonathan Rose. A Companion to the History of the Book. Malden, MA: Blackwell Publishing, 2007.
Elliott, D., R. MacDougall, and W.J. Turkel. "New Old Things: Fabrication, Physical Computing, and Experiment in Historical Practice." Canadian Journal of Communication 37, no. 1 (2012): 121-128.
Elliott, Tom, and Richard Talbert. "Mapping the Ancient World." In Past Time, Past Place: GIS for History. Ed. Anne Kelly Knowles. 145-62. Redlands, CA: ESRI Press, 2002.
Emerson, Lori. Reading Writing Interfaces: From the Digital to the Bookbound. Minneapolis, MN: University of Minnesota Press, 2014.
Emirbayer, Mustafa, and Jeff Goodwin. "Network Analysis, Culture and the Problem of Agency." American Journal of Sociology 99, no. 6 (1994): 1411-54.
Endres, Bill. "A Literacy of Building: Making in the Digital Humanities." In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 44-54. Minneapolis, MN: University of Minnesota Press, 2017.
Ensign, R. "Historians Are Interested in Digital Scholarship but Lack Outlets." Chronicle of Higher Education, Wired Campus Blog, October 5, 2010. http://chronicle.com/blogs/wiredcampus/historians-are-interested-in-digital-scholarship-but-lack-outlets/27457.
Ensslin, Astrid. Canonizing Hypertext: Explorations and Constructions. London, UK: Bloomsbury Press, 2007.
Ensslin, Astrid. Literary Gaming. Cambridge, MA: MIT Press, 2014.
EPoSS. Internet of Things in 2020: A Roadmap for the Future. Brussels, Belgium: European Commission, 2008.
Ernst, W. Digital Memory and the Archive. Minneapolis, MN: University of Minnesota Press, 2013.
Erway, R. Swatting the Long Tail of Digital Media: A Call for Collaboration. Dublin, OH: OCLC Research, 2012. http://www.oclc.org/research/publications/library/2012/2012-08.pdf.
Ethington, Philip J. "Los Angeles and the Problem of Urban Historical Knowledge." American Historical Review 105, no. 5 (2000): 1667.
Ethington, P. "Placing the Past: 'Groundwork' for a Spatial Theory of History." Rethinking History 11, no. 4 (2007): 465-493.
Europeana. www.europeana.eu.
Evans, Mel. "Curating the Language of Letters: Historical Linguistic Methods in the Museum." In Research Methods for Creating and Curating Data in the Digital Humanities. Eds. Matt Hayler and Gabriele Griffin. 44-62. Edinburgh, UK: Edinburgh University Press, 2016.
Everett, Anna. Digital Diaspora: A Race for Cyberspace. Albany, NY: SUNY Press, 2009.
Eyman, Douglas. "Are You a Digital Humanist?" In Computers and Writing. Ann Arbor, MI: University of Michigan, May 21, 2011.
Eyman, Douglas. Digital Rhetoric: Theory, Method, Practice. Ann Arbor, MI: University of Michigan Press, 2015.
Ezell, M.J.M. Social Authorship and the Advent of Print. Baltimore, MD: Johns Hopkins University Press, 1999.
Fair Cite Initiative. faircite.wordpress.com.
Farman, Jason. "Mapping the Digital Empire." New Media and Society 12 (2010): 869-888.
Farman, Jason. Mobile Interface Theory: Embodied Space and Locative Media. New York and London: Routledge, 2011.
Faull, Katherine, and Diane Jakacki. "Digital Learning in an Undergraduate Context: Promoting Long Term Student-Faculty Collaboration." Digital Scholarship in the Humanities. Oxford, UK: Oxford University Press, 2015.
Favro, Diane. "Wagging the Dog in the Digital Age: The Impact of Computer Modeling on Architectural History." Paper presented at The Computer Symposium: The Once and Future Medium for the Social Sciences and the Humanities. Brock University, Toronto. May 30, 2006.
Favro, Diane, and Willeke Wendrich. "Digital Karnak." University of California, Berkeley, 2007-8.
Fayyad, Usama, Georges Grinstein, and Andreas Wierse. Information Visualization in Data Mining and Knowledge Discovery. San Francisco, CA: Morgan Kaufmann, 2001.
Fayyad, Usama, G. Piatetsky-Shapiro, and P. Smyth. "From Data Mining to Knowledge Discovery in Databases." AI Magazine 17 (1996): 37-54.
Fedora Commons. www.fedora-commons.org.
Feigenbaum, Gail. "Unlocking Archives through Digital Tech." The Getty Iris. June 9, 2014. http://blogs.getty.edu/iris/unlocking-archives-through-digital-tech/.
Felluga, Dino Franco. "Addressed to the NINES: The Victorian Archive and the Disappearance of the Book." Victorian Studies 48, no. 2 (2006): 305-319. http://muse.jhu.edu/journals/victorian_studies/v048/48.2felluga.html.
Ferster, Bill. Interactive Visualization: Insight Through Inquiry. Cambridge, MA: MIT Press, 2012.
Findlen, Paula. "How Google Rediscovered the 19th Century." Chronicle of Higher Education, July 22, 2013. https://www.chronicle.com/blogs/conversation/2013/07/22/how-google-rediscovered-the-19th-century/.
Finger, Anke, and Danielle Follett, eds. The Aesthetics of the Total Artwork: On Borders and Fragments. Baltimore, MD: Johns Hopkins University Press, 2010.
Finnegan, R. Participating in the Knowledge Society: Research beyond University Walls. Basingstoke, UK: Palgrave Macmillan, 2005.
Finneran, Richard J., ed. The Literary Text in the Digital Age. Ann Arbor, MI: University of Michigan Press, 1996.
Fiormonte, Domenico. "Toward a Cultural Critique of Digital Humanities." In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 438-458. Minneapolis, MN: University of Minnesota Press, 2016.
Fiormonte, Domenico. "Towards a Monocultural (Digital) Humanities." Infolet, July 12, 2015. http://infolet.it/2015/07/12/moncultural-humanities/.
Fischer, C. "All Tech is Social." Boston Review, August 4. http://www.bostonreview.net/blog/claude-fischer-all-tech-is-social.
Fish, Stanley. "The Digital Humanities and the Transcending of Mortality." New York Times: Opinionator, January 9, 2012. http://opinionator.blogs.nytimes.com/2012/01/09/the-digital-humanities-and-the-transcending-of-mortality.
Fish, Stanley. Is There a Text in This Class? Cambridge, MA: Harvard University Press, 1980.
Fish, Stanley. "Mind Your P's and B's: The Digital Humanities and Interpretation." New York Times, January 23, 2012. http://opinionator.blogs.nytimes.com/2012/01/23/mind-your-ps-and-bs-the-digital-humanities-and-interpretation/?_r=o.
Fister, Barbara. "Getting Serious About Digital Humanities (Peer to Peer Review)." Library Journal, May 27, 2010. http://www.libraryjournal.com/article/CA6729325.html?nid=2673&source=title&rid=#reg_visitor_id#.
Fitch, Catherine A., and Steven Ruggles. "Building the National Historical Geographic Information System." Historical Methods 36, no. 1 (Winter 2003): 41-51.
Fitzpatrick, Kathleen. "Beyond Metrics: Community Authorization and Open Peer Review." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 452-459. Minneapolis, MN: University of Minnesota Press, 2012.
Fitzpatrick, Kathleen. "Giving it Away: Sharing and the Future of Scholarly Communication." In Planned Obsolescence: Publishing, Technology, and the Future of the Academy. New York, NY: New York University Press, 2011.
Fitzpatrick, Kathleen. "The Humanities, Done Digitally." In Debates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis, MN: University of Minnesota Press, 2012.
Fitzpatrick, Kathleen. "Peer Review." In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 439-448. West Sussex, UK: Wiley-Blackwell, 2016.
Fitzpatrick, Kathleen. "Peer Review, Judgment, and Reading." Profession (2011): 196-201. http://www.mlajournals.org/doi/abs/10.1632/prof.2011.2011.1.196.
Fitzpatrick, Kathleen. Planned Obsolescence: Publishing, Technology, and the Future of the Academy. New York, NY: New York University Press, 2009.
Fitzpatrick, Kathleen, and Katherine Rowe. "Keywords for Open Review." LOGOS: The Journal of the World Book Community 21, no. 3-4 (2010): 133-141.
Flanagan, Mary. Critical Play. Cambridge, MA: MIT Press, 2009.
Flanders, Julia. "The Body Encoded: Questions of Gender and the Electronic Text." In Electronic Text: Investigations in Method and Theory. Ed. K. Sutherland. 127-144. Oxford, UK: Clarendon Press, 1997.
Flanders, Julia. "Collaboration and Dissent: Challenges of Collaborative Standards for Digital Humanities." In Collaborative Research in the Digital Humanities. Eds. Marilyn Deegan and Willard McCarty. 67-80. Farnham, UK: Ashgate, 2012.
Flanders, Julia. "The Literary, the Humanistic, the Digital: Toward a Research Agenda for Literary Studies." In Literary Studies in the Digital Age: An Evolving Anthology.
Eds. Kenneth M. Price and Ray Siemens. New York, NY: Modern Language Association, 2012.
Flanders, Julia. "The Productive Unease of 21st-century Digital Scholarship." Digital Humanities Quarterly 3, no. 3 (Summer 2009). http://digitalhumanities.org/dhq/vol/3/3/000055/000055.html.
Flanders, Julia. "Time, Labor, and 'Alternate Careers' in Digital Humanities Knowledge Work." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 292-308. Minneapolis, MN: University of Minnesota Press, 2012.
Flanders, Julia, and Fotis Jannidis. "Data Modeling." In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 229-237. West Sussex, UK: Wiley-Blackwell, 2016.
Flanders, Julia, Syd Bauman, and Sarah Connell. "Text Encoding." In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 104-122. New York, NY: Routledge, 2016.
Flanders, Julia, Syd Bauman, and Sarah Connell. "XSLT: Transforming our XML Data." In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 255-272. New York, NY: Routledge, 2016.
Flanders, J., and T. Muñoz. "An Introduction to Humanities Data Curation." In DH Curation Guide: A Community Resource Guide to Data Curation in the Digital Humanities. 2011.
Flanders, Julia, Wendell Piez, and Melissa Terras. "Welcome to Digital Humanities Quarterly." Digital Humanities Quarterly 1, no. 1 (2007). http://digitalhumanities.org/dhq/vol/1/1/000007/000007.html.
Fletcher, Pamela, and Anne Helmreich, with David Israel and Seth Erickson. "Local/Global: Mapping Nineteenth-Century London's Art Market." Nineteenth-Century Art Worldwide 11, no. 3 (Autumn 2012). http://www.19thc-artworldwide.org/index.php/autumn12/fletcher-helmreich-mapping-the-london-art-market.
Flew, T. New Media: An Introduction. 3rd edition. Melbourne, Australia: Oxford University Press, 2008.
Flynn, B. "V-Embodiment for Cultural Heritage." Digital Heritage International Congress. 347-354. Marseille: IEEE, 2013.
Folsom, E. "Database as Genre: The Epic Transformation of Archives." PMLA 122, no. 5 (October 2007): 1572-79.
Folsom, E., and K.M. Price. The Walt Whitman Archive. 2011. http://www.whitmanarchive.org.
Fong, Deanna, Katrina Anderson, Lindsey Bannister, Janey Dodd, Lindsey Seatter, and Michelle Levy. "Students in the Digital Humanities: Rhetoric, Reality and Representation." University of Victoria, DHSI Colloquium 2014.
Forer, P., and D. Unwin. "Enabling Progress in GIS and Education." In Geographical Information Systems. Eds. Paul Longley, Michael F. Goodchild, David J. Maguire, and David W. Rhind. 747-56. New York, NY: John Wiley & Sons, Inc., 1999.
Foresman, Timothy W., ed. The History of Geographic Information Systems: Perspectives from the Pioneers. Upper Saddle River, NJ: Prentice Hall, 1998.
Forte, Maurizio. Virtual Archaeology. New York, NY: Harry N. Abrams, 1997.
Forte, Maurizio. "Virtual Archaeology: Communication in 3D and Ecological Thinking." In Beyond Illustration: 2D and 3D Digital Technologies as Tools for Discovery in Archaeology. BAR International Series. Eds. B. Frischer and A. Dakouri-Hild. 75-119. Oxford, UK: Archaeopress, 2008.
Forte, Maurizio, and Stefano Campana. Digital Methods and Remote Sensing in Archaeology. Cham, Switzerland: Springer, 2017.
Foster, A.L. "Second Life: Second Thoughts and Second Doubts." Chronicle of Higher Education 54, no. 4 (2007): 24-25.
Foster, A.L. "Professor Avatar." Chronicle of Higher Education 54, no. 4 (2007): 24-26.
Foster, Hal. "The Archive without Museums." October 77 (Summer 1996): 97-119.
Fotheringham, A. Stewart, Chris Brunsdon, and Martin Charlton. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. Chichester, UK: John Wiley & Sons, Inc., 2002.
Fotheringham, A. Stewart. Quantitative Geography: Perspectives on Spatial Data Analysis. London: Sage, 2000.
Foucault, Michel. "The Discourse on Language." In The Archaeology of Knowledge. Trans. A.M. Sheridan Smith. 224. New York, NY: Pantheon Books, 1972.
Foulonneau, Muriel, and Jenn Riley. Metadata for Digital Resources: Implementation, Systems Design and Interoperability. Oxford, UK: Chandos, 2008.
Fountain, Kathleen Carlisle. "To Web or Not to Web? The Evaluation of World Wide Web Publishing in the Academy." In Digital Scholarship in the Tenure, Promotion, and Review Process. Ed. Deborah Lines Anderson. 67-80. Armonk, NY: M.E. Sharpe, 2003.
Fox, Andrea. "Bit by Bit: Tapping into Big Data." Library of Congress, Digital Preservation, March 12, 2014. http://digitalpreservation.gov/documents/big-data-report-andrea-fox0414.pdf.
Fox, Nichols. Against the Machine: The Hidden Luddite Tradition in Literature, Art, and Individual Lives. Washington, DC: Island Press, 2002.
Foys, Martin. Virtually Anglo-Saxon: Old Media, New Media, and Early Medieval Studies in the Late Age of Print. Gainesville, FL: University Press of Florida, 2007.
Frabetti, Federica. "Rethinking the Digital Humanities in the Context of Originary Technicity." Culture Machine 12 (2011).
Fraistat, Neil. "The Function of Digital Humanities Centers at the Present Time." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 281-291. Minneapolis, MN: University of Minnesota Press, 2012.
Fraistat, Neil, and Steven E. Jones. "Immersive Textuality." TEXT 15 (2003): 69-82.
Fraistat, Neil. "The Question(s) of Digital Humanities." Maryland Institute for Technology in the Humanities, February 7, 2011. http://mith.umd.edu/the-questions-of-digital-humanities/.
Freedman, Jonathan, N. Katherine Hayles, Jerome McGann, Meredith L. McGill, Peter Stallybrass, and Ed Folsom. "Responses to Ed Folsom's 'Database as Genre: The Epic Transformation of Archives'." PMLA 122, no. 5 (October 2007): 1580-1612.
French, Amanda. "Make '10' Louder; or, The Amplification of Scholarly Communication." Amandafrench.net. http://amandafrench.net/blog/2009/12/30/make-10-louder/.
Friedlander, Amy. "Foreword." In A Survey of Digital Humanities Centers in the United States. Washington, DC: Council on Library and Information Resources, 2008.
Friedlander, Amy. "Preface." In A Survey of Digital Humanities Centers in the United States. Washington, DC: Council on Library and Information Resources, 2008.
Friendly, Michael. "DataVis.ca." Gallery of Data Visualization. York University.
Frischer, B., and A. Dakouri-Hild, eds. Beyond Illustration: 2D and 3D Digital Technologies as Tools for Discovery in Archaeology. BAR International Series 1805. Oxford, UK: Archaeopress, 2008.
Froehlich, Heather. "We're up All Night Playing with Docuscope." Early Modern Digital Agendas. Folger Shakespeare Library. https://earlymoderndigitalagendas.wordpress.com/2013/07/21/were-up-all-night-playing-with-docuscope/. (January 3, 2019).
Froehlich, Heather. "How Many Female Characters Are There in Shakespeare?" http://hfroehlich.wordpress.com/2013/02/08/how-many-female-characters-are-there-in-shakespeare/.
Frost Davis, R. "Crowdsourcing, Undergraduates, and Digital Humanities Projects." 2012. http://rebeccafrostdavis.wordpress.com/2012/09/03/crowdsourcing-undergraduates-and-digital-humanities-projects.
Fry, Ben. Visualizing Data: Exploring and Explaining Data with the Processing Environment. Sebastopol, CA: O'Reilly Media, 2008.
Fuchs, Christian. Digital Labour and Karl Marx. New York, NY: Routledge, 2014.
Fuchs, Christian. Internet and Society: Social Theory in the Information Age. New York, NY: Routledge, 2008.
Furht, Borko, ed. Handbook of Augmented Reality. New York, NY: Springer, 2011.
Fuller, Matthew. Media Ecologies: Materialist Energies in Art and Technology. Cambridge, MA: MIT Press, 2003.
Fuller, Matthew. "Software Studies Workshop." 2006. http://pzwart.wdka.hro.nl/mdr/Seminars2/softstudworkshop.
Fuller, Matthew. Software Studies: A Lexicon. Cambridge, MA: MIT Press, 2008.
Fuller, S. "Humanity: The Always Already-or Never to Be-Object of the Social Sciences?" In The Social Sciences and Democracy. Ed. J. Van Bouwel. London: Palgrave Macmillan, 2010.
Fuller, S. The New Sociological Imagination. London, UK: Sage, 2006.
Funkhouser, C.T. New Directions in Digital Poetry. New York, NY: Continuum Press, 2012.
Fyfe, Paul. "Digital Pedagogy Unplugged." Digital Humanities Quarterly 5, no. 3 (2011). http://digitalhumanities.org/dhq/vol/5/3/000106/000106.html.
Fyfe, Paul. "Electronic Errata: Digital Publishing, Open Review, and the Futures of Correction." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 259-280. Minneapolis, MN: University of Minnesota Press, 2012.
Fyfe, Paul. "Mid-Sized Digital Pedagogy." In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 104-117. Minneapolis, MN: University of Minnesota Press, 2016.
Gabrys, Jennifer. Digital Rubbish: A Natural History of Electronics. Ann Arbor, MI: University of Michigan Press, 2011.
Gadd, Ian. "The Use and Misuse of Early English Books Online." Literature Compass 6 (2009): 680-692.
Gaddis, J.L. The Landscape of History: How Historians Map the Past. New York, NY: Oxford University Press, 2002.
Gaffney, Vincent. "In the Kingdom of the Blind: Visualization and E-Science in Archaeology, the Arts and Humanities." In The Virtual Representation of the Past. Eds. Mark Greengrass and Lorna Hughes. 125-34. Farnham, UK: Ashgate, 2008.
Galarza, Alex, Jason Heppler, and Douglas Seefeldt. "A Call to Redefine Historical Scholarship in the Digital Turn." Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/a-call-to-redefine-historical-scholarship-in-the-digital-turn/.
Galey, Alan, and Stan Ruecker. "How a Prototype Argues." Literary and Linguistic Computing 25, no. 4 (2010): 405-24.
Galina, Isabel. "Is There Anybody Out There? Building a Global Digital Humanities Community." Humanidades Digitales. WordPress, 2013.
Gallon, Kim. "Making a Case for the Black Digital Humanities." In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 42-49. Minneapolis, MN: University of Minnesota Press, 2016.
Galloway, Alexander R., and Eugene Thacker. The Exploit: A Theory of Networks. Minneapolis, MN: University of Minnesota Press, 2007.
Galloway, Alexander R., E. Thacker, and M. Wark. Excommunication: Three Inquiries in Media and Mediation. Chicago, IL: University of Chicago Press, 2013.
Galloway, Alexander R. The Interface Effect. Cambridge, UK: Polity, 2012.
Galloway, P. "Retrocomputing, Archival Research, and Digital Heritage Preservation: A Computer Museum and School Collaboration." Library Trends 59, no. 4 (2011): 623-636.
Gramelsberger, G., ed. From Science to Computational Sciences: Studies in the History of Computing and its Influence on Today's Sciences. Zürich: Diaphanes, 2011.
Gantz, John, and David Reinsel. "The Digital Universe Decade: Are You Ready?" International Data Corporation, 2010.
Gardin, J.-C. "The Structure of Archaeological Theories." In Mathematics and Information Science in Archaeology: A Flexible Framework. Ed. A. Voorrips. Studies in Modern Archaeology 3. Bonn, Germany: Holos, 1990: 7-25.
Gardiner, Eileen, and Ronald G. Musto. The Digital Humanities: A Primer for Students and Scholars. Cambridge, UK: Cambridge University Press, 2015.
Gardner, Chelsea A.M., Gwynaeth McIntyre, Kaitlyn Solberg, and Lisa Tweten. "Looks Like We Made It, But Are We Sustaining Digital Scholarship?" In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 95-101. Minneapolis, MN: University of Minnesota Press, 2017.
Garfinkel, Susan. "Dialogic Objects in the Age of 3-D Printing: The Case of the Lincoln Life Mask." In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 206-20. Minneapolis, MN: University of Minnesota Press, 2017.
Garrison, D.R., and H. Kanuka. "Blended Learning: Uncovering its Transformative Potential in Higher Education." Internet and Higher Education 7 (2004): 95-105.
Gatrell, Anthony C. "Any Space for Spatial Analysis?" In The Future of Geography. Ed. Ronald J. Johnston. 190-208. London, UK: Methuen, 1985.
Gatrell, Simon. "Electronic Hardy." In The Literary Text in the Digital Age. Ed. Richard Finneran. 185-92. Ann Arbor, MI: University of Michigan Press, 1996.
Gavin, Michael, and K.M. Smith. "An Interview with Brett Bobley." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 61-66. Minneapolis, MN: University of Minnesota Press, 2012.
Gee, James Paul. What Video Games Have to Teach Us about Literacy and Learning. New York, NY: Palgrave Macmillan, 2003.
Gershenfeld, N. Fab: The Coming Revolution on Your Desktop: From Personal Computers to Personal Fabrication. New York, NY: Basic Books, 2005.
Geroimenko, Vladimir, and Chaomei Chen, eds. Visualizing the Semantic Web: XML-Based Internet and Information Visualization. New York, NY: Springer, 2003.
Gershenfeld, Neil, Raffi Krikorian, and Danny Cohen. "The Internet of Things." Scientific American (December 4, 2009): 76-81.
Gershon, N., and W. Page. "What Storytelling Can Do for Information Visualization." Communications of the ACM 44, no. 8 (2001): 31-7.
"Getting Started with Topic Modeling." Digital Humanities 2013. UCLA. June 11, 2013. Web. August 9, 2013.
Gibbs, Fred. "Critical Discourse in the Digital Humanities." Journal of Digital Humanities 1, no. 1 (Winter 2012). http://journalofdigitalhumanities.org/1-1/critical-discourse-in-digital-humanities-by-fred-gibbs/.
Gibbs, Fred. Digital Methods for the Humanities. Albuquerque, NM: University of New Mexico, 2014. http://fredgibbs.net/courses/digital-methods/.
Gibson, James J. The Ecological Approach to Visual Perception. Hillsdale, NJ: Lawrence Erlbaum, 1986.
Gibson, William. Neuromancer. New York, NY: Ace Books, 1984.
Gil, Alex. "Interview with Ernesto Oroza." In Debates in the Digital Humanities. Eds. Matthew Gold and Lauren Klein. 184-193. Minneapolis, MN: University of Minnesota Press, 2016.
Gil, Alex, and Élika Ortega. "Global Outlooks in Digital Humanities: Multilingual Practices and Minimal Computing." In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 22-34. New York, NY: Routledge, 2016.
Gilbert, L.S. "Going the Distance: 'Closeness' in Qualitative Data Analysis Software." International Journal of Social Research Methodology 5, no. 3 (2002): 215-28.
Gillen, Julia, and David Barton. Digital Literacies: A Research Briefing by the Technology Enhanced Learning Phase of the Teaching and Learning Research Programme. 3. London, UK: London Knowledge Lab, Institute of Education, University of London, 2010.
Gillespie, Tarleton. "The Relevance of Algorithms." In Media Technologies: Essays on Communication, Materiality, and Society. Eds. Tarleton Gillespie, Pablo Boczkowski, and Kirsten Foot. 167-194. Cambridge, MA: MIT Press, 2014.
Gillespie, Tarleton, Pablo Boczkowski, and Kirsten Foot, eds. Media Technologies: Essays on Communication, Materiality, and Society. Cambridge, MA: MIT Press, 2014.
Gilliland, Jason. "Imag(in)ing London's Past into the Future with Historical GIS." Paper presented at the Annual Association of Canadian Geographers. Toronto, June 1, 2006.
Gillings, Mark, and David Wheatley. Spatial Technology and Archaeology: The Archaeological Applications of GIS. London, UK: Taylor and Francis, 2002.
Giordano, A., K. Huffman Lanzoni, and C. Bruzelius, eds. Visualizing Venice: Mapping and Modeling Time and Change in a City. New York and London: Routledge, 2017.
Gitelman, Lisa. Always Already New: Media, History, and the Data of Culture. Cambridge, MA: MIT Press, 2006.
Gitelman, Lisa. Paper Knowledge: Toward a Media History of Documents (Sign, Storage, Transmission). Durham, NC: Duke University Press, 2014.
Gitelman, Lisa. "Raw Data" Is an Oxymoron (Infrastructures). Cambridge, MA: MIT Press, 2013.
Gitelman, Lisa, and Geoffrey B. Pingree. "Introduction: What's New about New Media?" In New Media, 1740-1915. Eds. Lisa Gitelman and Geoffrey B. Pingree. xi-xxiv. Cambridge, MA: MIT Press, 2003.
Gladney, H.M. "Long-term Digital Preservation: A Digital Humanities Topic?" Historical Social Research/Historische Sozialforschung 37, no. 3 (2012): 201-217.
Glazier, Loss Pequeño. Digital Poetics: The Making of E-Poetries. Tuscaloosa, AL: University of Alabama Press, 2002.
Gleick, James. "Books and Other Fetish Objects." The New York Times, July 16, 2011, sec. Opinion/Sunday Review. http://www.nytimes.com/2011/07/17/opinion/sunday/17gleick.html?_r=1.
Gleick, James. The Information: A History, A Theory, A Flood. New York, NY: Pantheon, 2011.
Global Outlook: Digital Humanities. http://www.globaloutlookdh.org/.
Gold, Harvey, and Shirley E. Gold. "Implementation of a Model to Improve Productivity of Interdisciplinary Groups." In Managing High Technology: An Interdisciplinary Perspective. Eds. Brian W. Mar, William T. Newell, and Borje O. Saxberg. 255-267. Amsterdam: Elsevier, 1985.
Gold, Matthew K., ed. Debates in the Digital Humanities. Minneapolis, MN: University of Minnesota Press, 2012.
Gold, Matthew K. "Looking for Whitman: A Grand, Aggregated Experiment." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 406-408. Minneapolis, MN: University of Minnesota Press, 2012.
Gold, Matthew K. "Looking for Whitman: A Multi-Campus Experiment in Digital Pedagogy." In Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. 151-176. Open Book Publishers, 2012. http://www.openbookpublishers.com/reader/161.
Gold, Matthew K. "Whose Revolution? Towards a More Equitable Digital Humanities." The Lapland Chronicles, January 10, 2012. http://blog.mkgold.net/category/presentations/.
Gold, Matthew K., and Lauren F. Klein, eds. Debates in the Digital Humanities. Minneapolis, MN: University of Minnesota Press, 2016.
Gold, Matthew K., and Lauren F. Klein. "Introduction." In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren F. Klein. 525-526. Minneapolis, MN: University of Minnesota Press, 2016.
Gold, Matthew K., and Lauren F. Klein. "Series Introduction and Editors' Note." In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren F. Klein. 569-571. Minneapolis, MN: University of Minnesota Press, 2016.
Goldstein, Evan R. "Digitally Incorrect." Chronicle of Higher Education, October 3, 2010. http://chronicle.com/article/Digitally-Incorrect/124649/.
Goldstone, Andrew, and Ted Underwood. "The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us." New Literary History 45, no. 3 (2014): 359-384. doi:10.1353/nlh.2014.0025.
Golumbia, David. The Cultural Logic of Computation. Cambridge, MA: Harvard University Press, 2009.
Gombrich, E.H. "The Evidence of Images." In Interpretation: Theory and Practice. Ed. Charles Singleton. 35-104. Baltimore, MD: Johns Hopkins University Press, 1969.
Goodchild, Michael F. "Geographical Information Science." International Journal of Geographical Information Systems 6 (1992): 31-45.
Goodchild, Michael F. "Geographic Information Systems and Spatial Analysis in the Social Sciences." In Anthropology, Space, and Geographic Information Systems. Eds. M. Aldenderfer and H.D.G. Maschner. 241-250. New York, NY: Oxford University Press, 1996.
Goodchild, Michael F. Introduction to Spatial Autocorrelation. Concepts and Techniques in Modern Geography 47. Norwich, UK: GeoAbstracts, 1987.
Goodchild, Michael F., and Donald G. Janelle, eds. Spatially Integrated Social Science. Oxford, UK: Oxford University Press, 2004.
Goodchild, Michael F., and N.S.-N. Lam. "Areal Interpolation: A Variant of the Traditional Spatial Problem." Geo-Processing 1 (1980): 297-312.
Gooding, P., C. Warwick, and M. Terras. "The Myth of the New: Mass Digitization, Distant Reading and the Future of the Book." In Digital Humanities 2012, Hamburg, 2012. http://www.dh2012.uni-hamburg.de/conference/programme/abstracts/the-myth-of-the-new-mass-digitization-distant-reading-and-the-future-of-the-book.1.html.
Goodrick, Glyn Thomas, and Mark Gillings. "Constructs, Simulations and Hyperreal Worlds: The Role of Virtual Reality (VR) in Archaeological Research." In On the Theory and Practice of Archaeological Computing. Eds. G.R. Lock and K. Smith. 41-59. Oxford, UK: Oxbow, 2000.
Goodrum, Abby. "The Ethics of Hacktivism." Journal of Information Ethics 9 (2000): 51-59.
Gordon, Eric, and Adriana de Souza e Silva. Net Locality: Why Location Matters in a Networked World. Chichester, West Sussex, UK: Wiley-Blackwell, 2011.
Gorman, Michael. "Introduction: Trading Zones, Interactional Expertise, and Collaboration." In Trading Zones and Interactional Expertise: Creating New Kinds of Collaboration. Ed. Michael E. Gorman. 1-4. Cambridge, MA: MIT Press, 2010.
Gorman, Michael, ed. Trading Zones and Interactional Expertise: Creating New Kinds of Collaboration. Cambridge, MA: MIT Press, 2010.
Gosden, C., and Y. Marshall. "The Cultural Biography of Objects." World Archaeology 31, no. 2 (1999): 169-78.
Gouglas, S., G. Rockwell, V. Smith, S. Hoosin, and H. Quamen. "Before the Beginning: The Formation of Humanities Computing as a Discipline in Canada." Digital Studies/Le Champ Numérique 3, no. 1 (2013).
Gradmann, S., and J.C. Meister. "Digital Document and Interpretation: Re-Thinking 'Text' and Scholarship in Electronic Settings." Poiesis & Praxis 5, no. 2 (2008): 139-153.
Grafton, Anthony. "Apocalypse in the Stacks: The Research Library in the Age of Google." Daedalus 138, no. 1 (Winter 2009): 87-98.
Grafton, Anthony. The Footnote: A Curious History. Cambridge, MA: Harvard University Press, 1997.
Graham, Shawn, Ian Milligan, and Scott Weingart. "Principles of Information Visualization." In The Historian's Macroscope (working title). Under contract with Imperial College Press, 2013. http://www.themacroscope.org/?page_id=469.
Grau, Oliver. MediaArtHistories. Cambridge, MA: MIT Press, 2007.
Grau, Oliver. Virtual Art: From Illusion to Immersion. Cambridge, MA: MIT Press, 2003.
Green, Karen. "Naughty Bits." Comixology. 2008. http://www.academia.edu/4916117/Naughty_Bits.
Greenbaum, Joan M., and Morten Kyng. Design at Work: Cooperative Design of Computer Systems. Boca Raton, FL: CRC Press, 1991.
Greenberg, Hope, Elli Mylonas, Scott Hamlin, and Patrick Yott. "Supporting Digital Humanities Research: The Collaborative Approach." Northeast Regional Computing Program, March 2008. net.educause.edu/ir/library/pdf/NCP08094.pdf.
Greene, M.A. "The Power of Meaning: The Archival Mission in the Postmodern Age." The American Archivist 65, no. 1 (2002): 42-55.
Greenfield, Adam. Everyware: The Dawning Age of Ubiquitous Computing. Berkeley, CA: New Riders, 2006.
Greengrass, Mark, and Lorna Hughes, eds. The Virtual Representation of the Past. London, UK: Ashgate, 2008.
Greenhow, Christine, and Benjamin Gleason. "Social Scholarship: Reconsidering Scholarly Practices in the Age of Social Media." British Journal of Educational Technology 45, no. 3 (2014): 392-402.
Greenspan, Brian. "Are Digital Humanists Utopian?" In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 393-409. Minneapolis, MN: University of Minnesota Press, 2016.
Greenstein, Daniel, and Suzanne E. Thorin. "The Digital Library: A Biography." Washington, DC: Digital Library Federation/Council on Library and Information Resources, 2002. http://www.clir.org/pubs/abstract/pub109abst.html.
Greetham, David. "The Resistance to Digital Humanities." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 438-451. Minneapolis, MN: University of Minnesota Press, 2012.
Greetham, D.C. Textual Scholarship: An Introduction. New York, NY: Garland, 1994.
Gregory, Derek. Geographical Imaginations. Cambridge, MA: Blackwell, 1994.
Gregory, Ian N. A Place in History: A Guide to Using GIS in Historical Research. Oxford, UK: Oxbow Books, 2003.
Gregory, Ian N., C. Bennett, V.L. Gilham, and H.R. Southall. "The Great Britain Historical GIS Project: From Maps to Changing Human Geography." Cartographic Journal 39, no. 1 (2002): 37-49.
Gregory, Ian, C. Donaldson, P. Murrieta-Flores, and P. Rayson. "Geoparsing, GIS and Textual Analysis: Current Developments in Spatial Humanities Research." International Journal of Humanities and Arts Computing 9 (2015): 1-14.
Gregory, Ian N., and Paul S. Ell. Historical GIS: Technologies, Methodologies, and Scholarship. Cambridge, UK: Cambridge University Press, 2007.
Gregory, Ian N., and Paul S. Ell, eds. History and Computing 13, no. 1 (2001).
Gregory, Ian, and Patricia Murrieta-Flores. "Geographical Information Systems as a Tool for Exploring the Spatial Humanities." In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 177-192. New York, NY: Routledge, 2016.
Gregory, Ian N., Karen Kemp, and Ruth Mostern. "Geographical Information and Historical Research: Current Progress and Future Directions." History and Computing 13 (2001): 7-22.
Gregory, Ian, and R.G. Healey. "Historical GIS: Structuring, Mapping and Analyzing Geographies of the Past." Progress in Human Geography 31 (2007): 638-653.
Griffey, Jason. "3D Printers for Libraries: Types of Plastics." Library Technology Reports 50, no. 5 (2014): 13-15.
Griffin, G., and M. Hayler, eds. Research Methods for Reading Digital Data in the Digital Humanities. Edinburgh, UK: Edinburgh University Press, 2016.
Grigar, Dene. "Curating Electronic Literature as Critical and Scholarly Practice." Digital Humanities Quarterly 8, no. 4 (2015).
Grigar, Dene. "Electronic Literature and Digital Humanities: Opportunities for Practice, Scholarship and Teaching." In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 193-196. New York, NY: Routledge, 2016.
Grigar, Dene. "Electronic Literature: Where is it?" In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 227-240. New York, NY: Routledge, 2016.
Grigar, Dene. "The Present [Future] of Electronic Literature." In Transdisciplinary Digital Art: Sound, Vision and the New Screen. Eds. Randy Adams, Steve Gibson, and Stefan Muller. 127-142. Heidelberg, Germany: Springer-Verlag Publications, 2008.
Grigar, Dene, and Stuart Moulthrop. Pathfinders: Documenting the Experience of Early Digital Literature. Electronic Literature Organization, 2015.
Grimes, Sara M., and Andrew Feenberg. "Rationalizing Play: A Critical Theory of Digital Gaming." The Information Society 25, no. 2 (2009): 105-118.
Gronlund, Melissa. Contemporary Art and Digital Culture. New York, NY: Routledge, 2017.
Gruber, David. "New Materialism and a Rhetoric of Scientific Practice in the Digital Humanities." In Rhetoric and the Digital Humanities. Eds. Jim Ridolfo and William Hart-Davidson. 296-306. Chicago, IL: University of Chicago Press, 2015.
Guiliano, Jennifer. "I'll see your open access and raise you two book contracts: or why the AHA should re-think its policy." Jennifer Guiliano's Blog. http://jguiliano.com/blog/2013/07/24/can-we-get-a-re-do-please-the-aha-policy-on-embargoing-dissertations-or-why-im-disappointed-in-my-professional-organization/.
Guldi, Jo. "Spatial Turn in Art History." Spatial Humanities. http://spatial.scholarslab.org/spatial-turn/the-spatial-turn-in-art-history/index.html.
Guldi, Jo. "What Is the Spatial Turn?" Spatial Humanities, 2011. http://spatial.scholarslab.org/spatial-turn/.
Gurak, Laura, and Smiljana Antonijevic. "Digital Rhetoric and Public Discourse." In The SAGE Handbook of Rhetorical Studies. Eds. Andrea Lunsford, Kirt H. Wilson, and Rosa A. Eberly. 497-508. Thousand Oaks, CA: SAGE, 2009.
Habermas, Jürgen. The Structural Transformation of the Public Sphere: An Inquiry into a Category of Bourgeois Society. Trans. Thomas Burger, with Frederick Lawrence. Cambridge, MA: MIT Press, 2002.
Haegler, Simon, Pascal Müller, and Luc Van Gool. "Procedural Modeling for Digital Cultural Heritage." EURASIP Journal on Image and Video Processing (2009): 1-11.
Hagood, J. "Brief Introduction to Data Mining Projects in the Humanities." Bulletin of the American Society for Information Science and Technology 38, no. 4 (2012): 20-3.
Hai-Jew, Shalin, ed. Data Analytics in Digital Humanities. Cham, Switzerland: Springer, 2017.
Hale, Constance, ed. Wired Style: Principles of English Usage in the Digital Age. New York, NY: Hardwired, 1996.
Hall, Gary. "The Digital Humanities Beyond Computing: A Postscript." Culture Machine 12 (2011). http://www.culturemachine.net/index.php/cm/article/view/441/459.
Hall, Gary. Digitize This Book! The Politics of New Media, or Why We Need Open Access Now. Minneapolis and London: University of Minnesota Press, 2008.
Hall, Gary. "Has Critical Theory Run Out of Time for Data-Driven Scholarship?" In Debates in the Digital Humanities. Ed. Matthew K. Gold. 127-132. Minneapolis, MN: University of Minnesota Press, 2012.
Hall, Gary. "There Are No Digital Humanities." In Debates in the Digital Humanities. Ed. Matthew K. Gold. 133-136. Minneapolis, MN: University of Minnesota Press, 2012.
Hall, Gary. "Toward a Postdigital Humanities: Cultural Analytics and the Computational Turn to Data-Driven Scholarship." American Literature 85, no. 4 (2013): 781-809.
Hall, Stephen S. Mapping the Next Millennium. New York, NY: Random House, 1992.
Hall, Stuart. "Emergence of Cultural Studies and the Crisis of the Humanities." October 53 (1990): 11-23.
Hall, Stuart. "Encoding/Decoding." In Culture, Media, Language. Eds. Stuart Hall, Dorothy Hobson, Andrew Lowe, Paul Willis. 128-138. London, UK: Hutchinson, 1980.
Halpern, Orit. Beautiful Data: A History of Vision and Reason since 1945. Durham, NC: Duke University Press, 2014.
Hamburger, J. The Visual Culture of a Medieval Convent. Berkeley, CA: University of California Press, 1997.
Hamming, Richard. Numerical Methods for Scientists and Engineers. New York, NY: McGraw-Hill, 1973.
Han, J., M. Kamber, and J. Pei. Data Mining: Concepts and Techniques. Burlington, MA: Morgan Kaufmann, 2012.
Hancher, Michael. "Re: Search and Close Reading." In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 118-138. Minneapolis, MN: University of Minnesota Press, 2016.
Hannigan, Lee, Aurelio Meza, and Alexander Flamenco. "Reading Series Matter: Performing the SpokenWeb Project." In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 198-204. Minneapolis, MN: University of Minnesota Press, 2017.
Hansen, Derek L., Ben Shneiderman, and Marc A. Smith. Analyzing Social Media Networks with NodeXL: Insights from a Connected World. Burlington, MA: Morgan Kaufmann, 2011.
Hansen, Mark B.N. "Affect as Medium or the 'Digital-Facial-Image'." Journal of Visual Culture 2, no. 2 (2003): 205-28.
Hansen, Mark B.N. Embodying Technesis: Technology Beyond Writing. Ann Arbor, MI: University of Michigan Press, 2000.
Hansen, Mark B.N. New Philosophy for New Media. Cambridge, MA: MIT Press, 2004.
Haraway, D. "A Cyborg Manifesto: Science, Technology, and Socialist-Feminism in the Late Twentieth Century." In Simians, Cyborgs, and Women: The Reinvention of Nature. 149-181. New York, NY: Routledge, 1991.
Hardt, Michael, and Antonio Negri. Multitude. New York, NY: Penguin, 2004.
Hardy, Molly O'Hagan. "'Black Printers' on White Cards: Information Architecture in the Data Structures of the Early American Book Trades." In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 377-383. Minneapolis, MN: University of Minnesota Press, 2016.
Harley, J. Brian. "Deconstructing the Map." Cartographica 26 (1989): 1-20.
Harley, J. Brian. "Maps, Knowledge, and Power." In The Iconography of Landscape. Eds. Denis Cosgrove and Stephen Daniels. 277-312. Cambridge, UK: Cambridge University Press, 1988.
Harley, J. Brian. The New Nature of Maps. Ed. Paul Laxton. Baltimore, MD: Johns Hopkins University Press, 2001.
Harley, Diane, Jonathan Henke, Shannon Lawrence, et al. Use and Users of Digital Resources: A Focus on Undergraduate Education in the Humanities and Social Sciences. Berkeley's Center for Studies in Higher Education, April 5, 2006. http://cshe.berkeley.edu/publications/publications.php?id=211.
Harley, Diane, and University of California, Berkeley. Assessing the Future Landscape of Scholarly Communication: An Exploration of Faculty Values and Needs in Seven Disciplines. Berkeley, CA: Center for Studies in Higher Education, 2010.
Harley, J.B. Deconstructing the Map. http://hackitectura.net/osfavelados/2009_proyectos_eventos/200907_cartografia_ciudadana/Harley1989_maps.pdf.
Harrell, D.F. Phantasmal Media: An Approach to Imagination, Computation, and Expression. Cambridge, MA: MIT Press, 2013.
Harris, Katherine. "Explaining Digital Humanities in Promotion Documents." Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/explaining-digital-humanities-in-promotion-documents-by-katherine-harris/.
Harris, Katherine D. "Let's Get Real with Numbers: The Financial Reality of Being a Tenured Professor." https://triproftri.wordpress.com/2013/06/24/lets-get-real-with-numbers-the-financial-reality-of-being-a-tenured-professor/.
Harris, Trevor M. "GIS in Archaeology." In Past Time, Past Place: GIS for History. Ed. Anne Kelly Knowles. 131-143. Redlands, CA: ESRI Press, 2002.
Harrower, Mark. "Representing Uncertainty: Does It Help People Make Better Decisions?" White paper prepared for UCGIS Workshop: Geospatial Visualization and Knowledge Discovery Workshop. National Conference Center, Lansdowne, Virginia. November 18-20, 2003.
Hartman, J., et al. "Preparing the Academy of Today for the Learner of Tomorrow." EDUCAUSE. http://net.educause.edu/ir/library/pdf/pub7101f.pdf.
Hartman, Kate. Wearable Electronics: Design, Prototype, and Wear Your Own Interactive Garments. Sebastopol, CA: Maker Media, 2014.
Harvard Library Digital Humanities Café. http://guides.hcl.harvard.edu/digitalhumanities.
Harvey, Francis, Marianna Pavlovskaya, and Mei-Po Kwan. "Introduction to Critical GIS." Cartographica 40, no. 4 (2005): 1-4.
Harvey, R. Digital Curation: A How-to-do-it Manual. New York, NY: Neal-Schuman, 2010.
Harpham, Geoffrey Galt. The Humanities and the Dream of America. Chicago, IL: University of Chicago Press, 2011.
Hassan, Robert, and Julian Thomas, eds. The New Media Theory Reader. Maidenhead, UK: Open University Press, 2006.
HASTAC (Humanities, Arts, Sciences, and Technology Advanced Collaboratory). www.HASTAC.org.
Hatch, Mark. The Maker Movement Manifesto: Rules for Innovation in the New World of Crafters, Hackers, and Tinkerers. New York, NY: McGraw Hill, 2014.
Hatfield, J. "Imagining Future Gardens of History." Camera Obscura 21, no. 2 (62) (2006): 185-189.
HathiTrust Digital Library. www.hathitrust.org.
Hawkins, D.T., ed. Personal Archiving: Preserving Our Digital Heritage. Medford, NJ: Information Today, 2013.
Hawkins, Ann R. "Making the Leap: Incorporating Digital Humanities into the English Classroom." CEA Critic 76, no. 2 (July 2014). https://muse.jhu.edu/login?auth=0&type=summary&url=/journals/cea_critic/v076/76.2.hawkins.pdf.
Haworth, K.M. "Archival Description: Content and Context in Search of Structure." In Encoded Archival Description on the Internet. Eds. D.V. Pitti and W.M. Duff. 7-26. Binghamton, NY: Haworth Information Press, 2001.
Hayler, Matt, and Gabriele Griffin. "Introduction." In Research Methods for Creating and Curating Data in the Digital Humanities. Eds. Matt Hayler and Gabriele Griffin. 1-13. Edinburgh, UK: Edinburgh University Press, 2016.
Hayler, Matt, and Gabriele Griffin, eds. Research Methods for Creating and Curating Data in the Digital Humanities. Edinburgh, UK: Edinburgh University Press, 2016.
Hayles, N. Katherine. "Cybernetics." In Critical Terms for Media Studies. Eds. W.J.T. Mitchell and Mark B.N. Hansen. 145-156. Chicago, IL: University of Chicago Press, 2010.
Hayles, N. Katherine. Electronic Literature: New Horizons for the Literary. Notre Dame, IN: University of Notre Dame Press, 2008.
Hayles, N. Katherine. "Electronic Literature: What Is It?" In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 197-227. New York, NY: Routledge, 2016.
Hayles, N. Katherine. "Elit: What Is It?" Electronic Literature Organization. 2007. Retrieved October 25, 2008.
Hayles, N. Katherine. How We Became Posthuman: Virtual Bodies in Cybernetics, Literature, and Informatics. Chicago, IL: University of Chicago Press, 1999.
Hayles, N. Katherine. How We Think: Digital Media and Contemporary Technogenesis. Chicago, IL: University of Chicago Press, 2012.
Hayles, N. Katherine. "How We Think: Transforming Power and Digital Technologies." In Understanding the Digital Humanities. Ed. D.M. Berry. London, UK: Palgrave, 2012.
Hayles, N. Katherine. "How We Read: Close, Hyper, Machine." ADE Bulletin 150 (2010): 62-79. http://www.mla.org/adefl_bulletin_c_ade_150_62.
Hayles, N. Katherine. My Mother Was a Computer: Digital Subjects and Literary Texts. Chicago, IL: University of Chicago Press, 2005.
Hayles, N. Katherine. "Print Is Flat, Code Is Deep: The Importance of Media-Specific Analysis." Poetics Today 25, no. 1 (2004): 67-90.
Hayles, N. Katherine. "Speech, Writing, Code: Three Worldviews." In My Mother Was a Computer: Digital Subjects and Literary Texts. 39-61. Chicago, IL: University of Chicago Press, 2005.
Hayles, N. Katherine. Writing Machines. Cambridge, MA: MIT Press, 2002.
Hayles, N. Katherine, and Jessica Pressman, eds. Comparative Textual Media: Transforming the Humanities in the Post-Print Era. Minneapolis, MN: University of Minnesota Press, 2013.
Healey, Richard G., and Trem R. Stamp. "Historical GIS as a Foundation for the Analysis of Regional Economic Growth: Theoretical, Methodological, and Practical Issues." Social Science History 24, no. 3 (2000): 575-612.
Heasley, Lynne. "Shifting Boundaries on a Wisconsin Landscape: Can GIS Help Historians Tell a Complicated Story?" Human Ecology 31, no. 2 (2003): 183-211.
Heath, T., and C. Bizer. Linked Data: Evolving the Web into a Global Data Space. San Rafael, CA: Morgan & Claypool, 2011.
Heller, Margaret. "Lazy Consensus and Libraries." ACRL Tech Connect, March 13, 2012. http://acrl.ala.org/techconnect/?p=391.
Heidegger, M. "The Question Concerning Technology." In Martin Heidegger: Basic Writings. Ed. D.F. Krell. 311-41. London, UK: Routledge, 1993.
Hellqvist, Björn. "Referencing in the Humanities and Its Implications for Citation Analysis." Journal of the American Society for Information Science and Technology 61, no. 2 (2009).
Hendren, Sara. "All Technology is Assistive: Six Design Rules on Disability." In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 139-48. Minneapolis, MN: University of Minnesota Press, 2017.
Henry, Chuck. "Removable Type." In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 385-400. Houston, TX: Rice University Press, 2010.
Henry, Shawn Lawton. "Very Briefly: Scalable Reading." Scalable Reading. WordPress, June 1, 2012. Web. August 9, 2013.
Herbert, James. "Masterdisciplinarity and the Pictorial Turn." The Art Bulletin 77, no. 4 (1995): 537-40.
Hertz, G. "Methodologies of Reuse in the Media Arts: Exploring Black Boxes, Tactics, and Archaeologies." PhD dissertation, University of California Irvine, 2009.
Hess, Charlotte, and Elinor Ostrom. Understanding Knowledge as a Commons: From Theory to Practice. Cambridge, MA: MIT Press, 2011.
Higgin, Tanner. "Cultural Politics, Critique and the Digital Humanities." MediaCommons, May 25, 2010. http://www.tannerhiggin.com/2010/05/cultural-politics-critique-and-the-digital-humanities/.
Higgin, Tanner. "How do You Define Humanities Computing/Digital Humanities?" In Day of Digital Humanities. March 8, 2011. http://tapor.ualberta.ca/taporwiki/index.php/How_do_you_define_Humanities_Computing_/_Digital_Humanities%3F.
Higgins, S. "The DCC Curation Lifecycle Model." International Journal of Digital Curation 3, no. 1 (2008): 134-140. http://ijdc.net/index.php/ijdc/article/view/69.
Hill, Linda. Georeferencing: The Geographic Associations of Information. Cambridge, MA: MIT Press, 2006.
Hilyard, Stephen. "The Object and the Event: Time-based Digital Simulation and Illusion in the Fine Arts." In Research Methods for Creating and Curating Data in the Digital Humanities. Eds. Matt Hayler and Gabriele Griffin. 87-112. Edinburgh, UK: Edinburgh University Press, 2016.
Himanen, Pekka. The Hacker Ethic. New York, NY: Random House, 2001.
Hindley, Meredith. "The Rise of the Machines." Humanities 34, no. 4 (2013).
Hirsch, Brett D., ed. Digital Humanities Pedagogy: Practices, Principles and Politics. Cambridge, UK: Open Book Publishers, 2012.
Hitchcock, Tim. "Big Data, Small Data and Meaning." Historyonics (blog), November 9, 2014. http://historyonics.blogspot.com/2014/11/big-data-small-data-and-meaning_9.html.
Hitchcock, Tim. "Digital Searching and Re-formulation of Knowledge." In The Virtual Representation of the Past. Eds. Mark Greengrass and Lorna Hughes. 81-90. London, UK: Ashgate, 2008.
Hitchcock, Tim. "Digitising British History since 1980." In Making History: The Changing Face of the Profession in Britain. Institute of Historical Research, 2008. http://www.history.ac.uk/makinghistory/resources/articles/digitisation_of_history.html.
Hockey, Susan. Electronic Texts in the Humanities. Oxford, UK: Oxford University Press, 2000.
Hockey, Susan. "The History of Humanities Computing." In A Companion to Digital Humanities. Eds. S. Schreibman, R. Siemens, and J. Unsworth. Oxford, UK: Blackwell, 2004. http://www.digitalhumanities.org/companion.
Hockey, Susan. “Living with Google: Perspectives on Humanities Computing and Digital Libraries.” Literary and Linguistic Computing 20, no. 1 (March 1, 2005): 7-24.
Hockey, Susan. “Towards a Model for Web-Based Language Documentation and Description: Some Contributions from Digital Libraries and Humanities Computing Research.” Web-Based Language Learning Workshop, Philadelphia. December 12-15, 2000.
Hockey, Susan. “Workshop on Teaching Computers and the Humanities Courses.” Literary and Linguistic Computing 1.4 (1986): 228-29.
Hocks, Mary, and Michelle Kendrick, eds. Eloquent Images: Word and Image in the Age of New Media. Cambridge, MA: MIT Press, 2003.
Holley, R. “Crowdsourcing: How and Why Should Libraries Do It?” D-Lib Magazine 16 (3/4) (2010). http://www.dlib.org/dlib/march10/holley/03holley.html.
Holt, Jim. “Two Brains Running.” The New York Times. November 25, 2011: Sunday Book Review.
Horst, Heather A., and Daniel Miller, eds. Digital Anthropology. London and New York: Bloomsbury Academic, 2012.
Hoover, David L. “Argument, Evidence, and the Limits of Digital Literary Studies.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 230-250. Minneapolis, MN: University of Minnesota Press, 2016.
Hoover, David L., Jonathan Culpeper, and Kieran O’Halloran. Digital Literary Studies: Corpus Approaches to Poetry, Prose, and Drama. London, UK: Routledge, 2014.
Hoover, David L. “Making Waves: Algorithmic Criticism Revisited.” DH2014, University of Lausanne and Ecole Polytechnique Fédérale de Lausanne, 8-12 July 2014.
Hopes, D. “Digital CoPs and Robbers: Communities of Practice and the Use of Digital Artefacts.” Museum Management and Curatorship 29.5 (2014): 498-518.
Howard, Jennifer. “Digital Materiality; or Learning to Love Our Machines.” Wired Campus Blog at The Chronicle of Higher Education. August 22, 2012. http://chronicle.com/blogs/wiredcampus/digital-materiality-or-learning-to-love-our-machines/38982.
Howard, Jennifer. “The MLA Convention in Translation.” Chronicle of Higher Education. http://chronicle.com/article/The-MLA-Convention-in/63379/.
Howe, Jeff. “The Rise of Crowdsourcing.” Wired.com, Condé Nast Digital, June 2006. http://www.wired.com/.
Hsu, Mei-ling. “The Qin Maps: A Clue to Later Chinese Cartographic Development.” Imago Mundi 45 (1993): 90-100.
Hsu, Wendy F. “Digital Ethnography toward Augmented Empiricism: A New Methodological Framework.” Journal of Digital Humanities 3, no. 1 (2014).
Hsu, Wendy F. “Lessons on Public Humanities from the Civic Sphere.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 280-286. Minneapolis, MN: University of Minnesota Press, 2016.
Huffman, Kristin L., Andrea Giordano, and Caroline Bruzelius, eds. Visualizing Venice: Mapping and Modeling Time and Change in a City. Oxford, UK: Routledge, 2018.
Hughes, Lorna, Panos Constantopoulos, and Costis Dallas. “Digital Methods in the Humanities: Understanding and Describing their Use across the Disciplines.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 150-170. West Sussex, UK: Wiley-Blackwell, 2016.
Huhtamo, Erkki. Illusions in Motion: Media Archaeology of the Moving Panorama and Related Spectacles. Cambridge, MA: MIT Press, 2013.
Huhtamo, E., and J. Parikka, eds. Media Archaeology: Approaches, Applications, Implications. Berkeley and Los Angeles, CA: University of California Press, 2011.
Hui Kyong Chun, Wendy, Richard Grusin, Patrick Jagoda, and Rita Raley. “The Dark Side of the Digital Humanities.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 493-509. Minneapolis, MN: University of Minnesota Press, 2016.
Huitfeldt, Claus. “Scholarly Text Processing and Future Markup Systems.” Forum Computerphilologie 2003. http://computerphilologie.uni-muenchen.de/jg03/huitfeldt.html.
Humanist Discussion Group. www.digitalhumanities.org/humanist.
Hunter, John, Katherine Faull, and Diane Jakacki. “Reifying the Maker as Humanist.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 130-8. Minneapolis, MN: University of Minnesota Press, 2017.
Hunter, M. Editing Early Modern Texts: An Introduction to Principles and Practice. New York, NY: Palgrave Macmillan, 2006.
Hunyadi, Laszlo. “Collaboration in Virtual Space in Digital Humanities.” In Collaborative Research in the Digital Humanities. Eds. Marilyn Deegan and Willard McCarty. 93-103. Farnham, UK: Ashgate, 2012.
Hutchison, Coleman. “Breaking the Book Known as Q.” PMLA (2006): 33–66.
HyperCities. http://www.hypercities.com.
IDC. Digital Universe Study. December 2012. http://www.emc.com.
Igoe, T. Making Things Talk: Using Sensors, Networks, and the Arduino to See, Hear, and Feel Your World. 2nd edition. Sebastopol, CA: O’Reilly, 2011.
Ihde, Don. Postphenomenology and Technoscience: The Peking University Lectures. Albany, NY: SUNY Press, 2009.
Inkpen, Deborah. “MUNFLA: Digitizing the Past.” Gazette January 22, 2004: 9.
Inscho, Jeffrey. “Guest Post: Oh Snap! Experimenting with Open Authority in the Gallery.” Museum 2.0. March 13, 2013. http://museumtwo.blogspot.com/2013/03/guest-post-oh-snap-experimenting-with.html.
Institute for the Future of the Book. www.futureofthebook.org.
Institute of Museum and Library Services. www.imls.gov.
“Interchange: The Promise of Digital History.” Journal of American History 95, no. 2 (2008): 452-491. http://www.journalofamericanhistory.org/issues/952/interchange/.
International Journal for Digital Art History. http://www.dah-journal.org.
Itō, Mizuko. Hanging Out, Messing Around, and Geeking Out: Kids Living and Learning with New Media. Cambridge, MA: MIT Press, 2010.
Jakacki, Diane, and Katherine Faull. “Doing DH in the Classroom: Transforming the Humanities Curriculum through Digital Engagement.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, and Ray Siemens. 358-72. New York, NY: Routledge, 2016.
Jackson, S.J. “Rethinking Repair.” In Media Technologies: Essays on Communication, Materiality and Society. Eds. T. Gillespie, P. Boczkowski, and K. Foot. Cambridge, MA: MIT Press, 2014.
Jackson, William A. “Some Limitations of Microfilm.” Papers of the Bibliographical Society of America 35 (1941): 281–88.
JAH. “Interchange: The Promise of Digital History.” The Journal of American History. Retrieved December 12, 2010. http://www.journalofamericanhistory.org/issues/952/interchange/index.html.
Jagoda, Patrick. “Gamification and Other Forms of Play.” Boundary 2 40, no. 2 (Summer 2013): 113-144.
Jagoda, Patrick. “Gaming in the Humanities.” Differences: A Journal of Feminist Cultural Studies 25, no. 1 (2014): 189-215.
Jameson, Fredric. Postmodernism, or the Cultural Logic of Late Capitalism. Durham, NC: Duke University Press, 1991.
Jannidis, Fotis. “TEI in a Crystal Ball.” Literary and Linguistic Computing 24 (3) (2009): 253-265.
Jannidis, Fotis, et al. “An Encoding Model for Genetic Editions.” TEI Guidelines. http://www.tei-c.org/Vault/TC/tcw19.html.
Jannidis, Fotis, et al. “Ch. 11: Representation of Primary Sources.” TEI Guidelines. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html.
Jarmon, L., T. Traphagan, et al. “Virtual World Teaching, Experiential Learning and Assessment: An Interdisciplinary Communications Course in Second Life.” Computers & Education 53 (2009): 169-182.
Jaschik, Scott. “An Open, Digital Professoriat.” Inside Higher Ed. January 10, 2011. http://www.insidehighered.com/news/2011/01/10/mlaa_embraces_digital_humanities_and_blogging.
Jaskot, Paul B. “Commentary: Art-Historical Questions, Geographic Concepts, and Digital Methods.” Historical Geography 45 (2017): 92-99.
Jaskot, Paul B., and Ivo van der Graaff. “Historical Journals as Digital Sources: Mapping Architecture in Germany, 1914-24.” Journal of the Society of Architectural Historians 76, no. 4 (December 2017): 483-505.
Jaskot, Paul B., and Anne Kelly Knowles. “Architecture and Maps, Databases and Archives: An Approach to Institutional History and the Built Environment in Nazi Germany.” The Iris (15 February 2017). http://blogs.getty.edu/iris/dah_jaskot_knowles/.
Jaskot, Paul B., Anne Kelly Knowles, Andrew Wasserman, Stephen Whiteman, and Benjamin Zweig. “A Research-Based Model for Digital Mapping and Art History: Notes from the Field.” Artl@s Bulletin 4, no. 1 (Spring 2015): 65-74.
Jaskot, Paul B., Anne Kelly Knowles, and Chester Harvey, with Benjamin Perry Blackshear. “Visualizing the Archive: Building at Auschwitz as a Geographic Problem.” In Geographies of the Holocaust. Eds. Tim Cole, Alberto Giordano, and Anne Kelly Knowles. 158-191. Bloomington, IN: Indiana University Press, 2014.
Jebara, Tony. Machine Learning: Discriminative and Generative. New York, NY: Springer, 2004.
Jenkins, Henry. “Bringing Critical Perspectives to the Digital Humanities: An Interview with Tara McPherson (Part Three).” Confessions of an Aca-Fan, Blog, March 20, 2015.
Jenkins, Henry. Convergence Culture: Where Old and New Media Collide. New York, NY: New York University Press, 2006.
Jenson, Jennifer, Stephanie Fisher, and Suzanne De Castell. “Disrupting the Gender Order: Leveling up and Claiming Space in an After-School Video Game Club.” International Journal of Gender, Science and Technology 3.1 (2011).
Jenstad, Janelle. “Restoring Place to the Digital Archive.” In Teaching Early Modern English Literature from the Archives. Eds. Heidi Brayman Hackel and Ian Frederick Moulton. 101-12. New York, NY: Modern Language Association, 2015.
Jenstad, Janelle, and Joseph Takeda. “Making the RA Matter: Pedagogy, Interface, and Practices.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 71-85. Minneapolis, MN: University of Minnesota Press, 2017.
Jessop, Martyn. “The Inhibition of Geographical Information in Digital Humanities Scholarship.” Literary and Linguistic Computing 22 (1): 1-12.
Jessop, Martyn. “The Visualization of Spatial Data in the Humanities.” Literary and Linguistic Computing 19 (2004): 335-50.
Jockers, Matthew L. “Digital Humanities: Methodology and Questions.” Matthew L. Jockers. April 23, 2010. http://www.stanford.edu/~mjockers/cgi-bin/drupal/node/43.
Jockers, Matthew L. Macroanalysis: Digital Methods and Literary History. Urbana, IL: University of Illinois Press, 2013.
Jockers, Matthew L., and Ted Underwood. “Text-Mining the Humanities.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 291-306. West Sussex, UK: Wiley-Blackwell, 2016.
Johanson, Christopher. “Making Virtual Worlds.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 110-126. West Sussex, UK: Wiley-Blackwell, 2016.
Johanson, Christopher. “Visualizing History: Modeling in the Eternal City.” Visual Resources: An International Journal of Documentation 25 (4) (2009): 403. doi: 10.1080/01973760903331924.
Johnson, I. “Putting Time on the Map: Using TimeMap for Map Animation and Web Delivery.” GeoInformatics 7(5) (2004): 26-29.
Johnson, Jessica Marie. Diaspora Hypertext. https://diasporahypertext.com/.
Johnson, L. “Topic Maps: From Information to Discourse Architecture.” Journal of Information Architecture 2, 1 (2010): 5-18.
Johnson, Steven. Where Good Ideas Come From: The Natural History of Innovation. London, UK: Penguin, 2011.
Johnston, John. The Allure of Machinic Life: Cybernetics, Artificial Life, and the New AI. Cambridge, MA: MIT Press, 2008.
Jones, M., and N. Beagrie. Preservation Management of Digital Materials: A Handbook. London, UK: The British Library for Resource, the Council for Museums, Archives and Libraries, 2001.
Jones, R., and C. Hafner. Understanding Digital Literacies: A Practical Introduction. London, UK: Routledge, 2012.
Jones, Steven E. The Emergence of the Digital Humanities. New York and London: Routledge, 2013.
Jones, Steven E. “The Emergence of the Digital Humanities (as the Network Is Everting).” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 3-15. Minneapolis, MN: University of Minnesota Press, 2016.
Jones, Steven E. Against Technology: From the Luddites to Neo-Luddism. New York, NY: Routledge, 2006.
Jones, Steven E. “New Media and Modeling: Games and the Digital Humanities.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 84-97. West Sussex, UK: Wiley-Blackwell, 2016.
Jones-Imhotep, Edward, and William J. Turkel. “Image Mining for the History of Electronics and Computing.” In Seeing the Past: Augmented Reality and Computer Vision. Ed. Kevin Kee. Ann Arbor, MI: University of Michigan Press, 2019.
Jones-Kavalier, Barbara R., and Suzanne L. Flannigan. “Connecting the Dots: Literacy of the 21st Century.” Educause Quarterly, No. 2 (January 2010): 8-10.
Jordan, Tim. Activism! Direct Action, Hacktivism and the Future of Society. London, UK: Reaktion Books, 2002.
Jørgensen, Finn Arne. “The Internet of Things.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 42-53. West Sussex, UK: Wiley-Blackwell, 2016.
Joyce, Michael. Of Two Minds: Hypertext Pedagogy and Poetics. Ann Arbor, MI: University of Michigan Press, 1995.
Juola, Patrick. “Killer Applications in Digital Humanities.” Literary and Linguistic Computing 23.1 (2008): 73-83.
Journal of Digital Humanities: http://journalofdigitalhumanities.org/, particularly the issue on evaluation: http://journalofdigitalhumanities.org/1-4/.
Journal of Interactive Technology and Pedagogy. http://jitp.commons.gc.cuny.edu/.
Jurgenson, Nathan. “Digital Dualism versus Augmented Reality.” Cyborgology. The Society Pages. February 24, 2011. http://thesocietypages.org/cyborgology/2011/02/24/digital-dualism-versus-augmented-reality/.
Juul, Jesper. A Casual Revolution: Reinventing Video Games and Their Players. Cambridge, MA: MIT Press, 2010.
Kadushin, C. Understanding Social Networks: Theories, Concepts, and Findings. New York, NY: Oxford University Press, 2012.
Kalas, Gregor, Diane Favro, and Chris Johanson. “Visualizing Statues in the Late Antique Forum.” Inscriptions. http://inscriptions.etc.ucla.edu.
Kalay, Y.E., T. Kvan, and J. Affleck, eds. New Heritage: New Media and Cultural Heritage. London and New York: Routledge, 2008.
Kallinikos, Jannis, Aleksi Aaltonen, and Attila Marton. “A Theory of Digital Objects.” First Monday 15, no. 6 (2010).
Kamada, Hitoshi. “Digital Humanities: Roles for Libraries?” College & Research Libraries News 71, no. 9 (October 2010): 484-485.
Kasik, D.J., D. Ebert, G. Lebanon, H. Park, and W.M. Pottenger. “Data Transformations and Representations for Computation and Visualization.” Information Visualization 8(4): 275-285.
Kafai, Y.B., M.S. Cook, and D.A. Fields. “‘Blacks Deserve Bodies Too!’: Design and Discussion about Diversity and Race in a Tween Virtual World.” Games and Culture 5 (1) (2010): 43-63. doi: 10.1177/1555412009351261.
Kearney, Patrick J., and G. Legman. The Private Case: An Annotated Bibliography of the Private Case Erotica Collection in the British (Museum) Library. London, UK: J. Landesman, 1981.
Kee, Kevin, ed. Pastplay: Teaching and Learning History with Technology. Ann Arbor, MI: University of Michigan Press, 2014.
Keeling, Kara. The Witch’s Flight: The Cinematic, the Black Femme, and the Image of Common Sense. Durham, NC: Duke University Press, 2007.
Keim, D.A., F. Mansmann, J. Schneidewind, and H. Ziegler. “Challenges in Visual Data Analysis.” Proceedings in Information Visualization IV 2006. 9-16. London, UK: IEEE.
Kelland, Lara. “The Master’s Tools, 2.0.” In Public History Commons. May 5, 2014. http://publichistorycommons.org/the-masters-tools-2-0/.
Keller, Michael. “Response to Rotunda: A University Press Starts a Digital Imprint.” In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 375-83. Houston, TX: Rice University Press, 2010.
Kelley, Victoria. “Time, Wear and Maintenance: The Afterlife of Things.” In Writing Material Culture. Eds. Anne Gerritsen and Giorgio Riello. London, UK: Bloomsbury, 2015.
Kelly, T. Mills. “Making Digital Scholarship Count (Part I of III).” Edwired, June 13, 2008. http://edwired.org/2008/06/13/making-digital-scholarship-count/.
Kelly, T. Mills. Teaching History in the Digital Age. Ann Arbor, MI: University of Michigan Press, 2013.
Kelly, T. Mills. “Visualizing Information.” Edwired. October 25, 2005. http://edwired.org/2005/10/25/visualizing-information/.
Kelly, T. Mills. “Visualizing Millions of Words.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 402-403. Minneapolis, MN: University of Minnesota Press, 2012.
Kelty, Christopher M. Two Bits: The Cultural Significance of Free Software. Durham, NC: Duke University Press, 2008.
Kemman, Max, Martijn Kleppe, and Stef Scagliola. “Just Google It.” In Proceedings of the Digital Humanities Congress 2012. Eds. Clare Mills, Michael Pidd, and Esther Ward. Sheffield: HRI Online Publications, 2014. http://www.hrionline.ac.uk/openbook/chapter/dhc2012-kemman.
Kenderdine, S., J. Shaw, and T. Gremmler. “Cultural Data Sculpting: Omnidirectional Visualization for Cultural Datasets.” In Knowledge Visualization Currents: From Text to Art to Culture. Eds. E.T. Marchese and E. Banissi. 199-221. London, UK: Springer, 2012.
Kenderdine, S. “Speaking in Rama: Panoramic Vision in Cultural Heritage Visualization.” In Digital Cultural Heritage: A Critical Discourse. Eds. F. Cameron and S. Kenderdine. 301-332. Cambridge, MA: MIT Press.
Kenderdine, Sarah. “Embodiment, Entanglement, and Immersion in Digital Cultural Heritage.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 22-41. West Sussex, UK: Wiley-Blackwell, 2016.
Kennicott, P. “Pure Land Tour: For Visitors Virtually Exploring Buddhist Cave, It’s Pure Fun.” Washington Post, November 9, 2012.
Kenny, Anthony. The Computation of Style. Oxford, UK: Oxford University Press, 1982.
Kenny, Anthony. Computers and the Humanities. Ninth British Library Research Lecture. British Library, London, UK, 1992.
Keramidas, Kimon. “Interactive Development as Pedagogical Process: Digital Media Design in the Classroom as a Method for Recontextualizing the Study of Material Culture.” Museums and the Web 2014: Proceedings. Museum and the Web. http://mw2014.museumsandtheweb.com/paper/interactive-development-as-pedagogical-process-digital-media-design-in-the-classroom-as-a-method-for-recontextualizing-the-study-of-material-culture/.
Kernighan, Brian, and Rob Pike. The Unix Programming Environment. Englewood Cliffs, NJ: Prentice-Hall, 1984.
Kernighan, Brian, and D.M. Ritchie. The C Programming Language. Englewood Cliffs, NJ: Prentice-Hall, 1978. Reprint, 1988.
Kernighan, Brian W. D is for Digital: What a Well-Informed Person Should Know About Computers and Communications. CreateSpace Independent Publishing Platform, 2011.
Ketelhut, D.J. “The Impact of Student Self-Sufficiency on Scientific Inquiry Skills: An Exploratory Investigation in River City, a Multi-User Virtual Environment.” Journal of Science Education & Technology 16, 1 (2007): 99-111.
Kilbride, William. “Saving the Bits: Digital Humanities Forever?” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 408-419. West Sussex, UK: Wiley-Blackwell, 2016.
Kim, David. “Archives, Models, and Methods for Critical Approaches to Identities: Representing Race and Ethnicity in the Digital Humanities.” PhD dissertation, University of California Los Angeles, 2015.
Kim, David. “‘Data-izing’ the Images: Process and Prototype.” In Performing Archive: Curtis + the Vanishing Race. Eds. Jacqueline Wernimont, Beatrice Schuster, Amy Borsuk, David J. Kim, Heather Blackmore, and Ulia Gusart (Popova). Scalar, 2013.
Kinder, Marsha, and Tara McPherson, eds. Transmedia Frictions: The Digital, The Arts, and The Humanities. Berkeley, CA: University of California Press, 2014.
Kirsch, Adam. “Technology Is Taking over English Departments: The False Promise of the Digital Humanities.” New Republic, May 2, 2014. http://www.newrepublic.com/article/117428/limits-digital-humanities-adam-kirsch.
Kirschenbaum, Matthew G. “Ancient Evenings: Retrocomputing in the Digital Humanities.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 185-198. West Sussex, UK: Wiley-Blackwell, 2016.
Kirschenbaum, Matthew G. “Bookscapes: Modeling Books in Electronic Space.” Human-Computer Interaction Lab 25th Annual Symposium. 1-2. May 29, 2008.
Kirschenbaum, Matthew G., et al. “Collaborators’ Bill of Rights.” Off the Tracks Workshop. January 21, 2011.
Kirschenbaum, Matthew G. “Digital Humanities As/Is a Tactical Term.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 415-428. Minneapolis, MN: University of Minnesota Press, 2012.
Kirschenbaum, Matthew. “Done: Finishing Projects in the Digital Humanities.” DHQ: Digital Humanities Quarterly 3, no. 2 (Spring 2009). http://digitalhumanities.org/dhq/vol/3/2/000037/000037.html.
Kirschenbaum, Matthew G. “Hello Worlds.” The Chronicle of Higher Education. 2009. http://chronicle.com/article/Hello-Worlds/5476.
Kirschenbaum, Matthew G. Mechanisms: New Media and the Forensic Imagination. Cambridge, MA: MIT Press, 2012.
Kirschenbaum, Matthew G. “What is Digital Humanities?” ADE Bulletin 150 (2010): 1-7. http://mkirschenbaum.wordpress.com/2011/01/22/what-is-digital-humanities/.
Kirschenbaum, Matthew G. “What is Digital Humanities and What’s it Doing in English Departments?” ADE Bulletin 150 (2010): 55-61.
Kirschenbaum, Matthew G. “What is ‘Digital Humanities’ and Why are They Saying Such Terrible Things About It?” Differences: A Journal of Feminist Cultural Studies 25, no. 1 (2014): 46-53.
Kirschenbaum, Matthew G., Bethany Nowviskie, Tom Scheinfeldt, and Doug Reside. “Collaborators’ Bill of Rights.” Maryland Institute for Technology and the Humanities, January 22, 2011. http://mith.umd.edu/offthetracks/recommendations/.
Kirschenbaum, Matthew G., Richard Ovenden, and Gabriela Redwine. “Digital Forensics and Born-Digital Content in Cultural Heritage Collections.” Council on Library and Information Resources. December 2010. http://www.clir.org/pubs/abstract/pub149abst.html.
Kirton, Isabella, and Melissa Terras. “Where Do Images of Art Go Once They Go Online? A Reverse Image Lookup Study to Assess the Dissemination of Digitized Cultural Heritage.” Museums and the Web 2013: Proceedings. Museum and the Web. 2013. http://mw2013.museumsandtheweb.com/paper/where-do-images-of-art-go-once-they-go-online-a-reverse-image-lookup-study-to-assess-the-dissemination-of-digitized-cultural-heritage/.
Kissane, Erin. The Elements of Content Strategy. New York, NY: A Book Apart, 2011.
Kitchin, Rob, and Martin Dodge. Code/Space: Software and Everyday Life. Cambridge, MA: MIT Press, 2011.
Kittler, Friedrich. Discourse Networks 1800/1900. Trans. Michael Metteer, with Chris Cullens. Stanford, CA: Stanford University Press, 1990.
Klein, Julie Thompson. “The Boundary Work of Making in Digital Humanities.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 21-31. Minneapolis, MN: University of Minnesota Press, 2017.
Klein, Julie Thompson. Crossing Boundaries: Knowledge, Disciplinarities, and Interdisciplinarities. Charlottesville, VA: University of Virginia Press, 1996.
Klein, Julie Thompson. Creating Interdisciplinary Campus Centers. San Francisco, CA: Jossey-Bass and Association of American Colleges and Universities, 2010.
Klein, Julie Thompson. Humanities, Culture, and Interdisciplinarity: The Changing American Academy. Albany, NY: State University of New York Press, 2005.
Klein, Julie Thompson. Interdisciplinarity: History, Theory, and Practice. Detroit, MI: Wayne State University Press, 1990.
Klein, Julie Thompson. Interdisciplining Digital Humanities: Boundary Work in an Emerging Field. Ann Arbor, MI: University of Michigan Press, 2015.
Klein, Lauren F., and Matthew K. Gold. “Digital Humanities: The Expanded Field.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren F. Klein. ix-xv. Minneapolis, MN: University of Minnesota Press, 2016.
Klein, Lauren F. “The Image of Absence: Archival Silence, Data Visualization, and James Hemings.” American Literature 85, no. 4 (2013): 661-88.
Kline, M.-J., and S.H. Perdue. A Guide to Documentary Editing. Charlottesville, VA: University of Virginia Press, 2008.
Kling, Rob, and Lisa B. Spector. “Rewards for Scholarly Communication.” In Digital Scholarship in the Tenure, Promotion, and Review Process. Ed. Deborah Lines Andersen. 78-103. Armonk, NY: M.E. Sharpe, 2003.
Knight, Kim. “MLA 2011 Paper for ‘The Institution(alization) of Digital Humanities’.” Kim Knight. January 14, 2011. http://kimknight.com/?p=801.
Knochel, Aaron D., and Amy Papaelias. “Placeable: A Social Practice for Place-Based Learning and Co-Design Paradigms.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 288-300. Minneapolis, MN: University of Minnesota Press, 2017.
Knowles, Anne K. “A Case for Teaching Geographic Visualization without GIS.” Cartographic Perspectives 36 (2000): 24-37.
Knowles, Anne K. “A Cutting-Edge Second Look at the Battle of Gettysburg.” Smithsonian Magazine. June 27, 2013. http://www.smithsonianmag.com/history-archaeology/A-Cutting-Edge-Second-Look-at-the-Battle-of-Gettysburg.html.
Knowles, Anne K. “Introduction to the Special Issue: Historical GIS: The Spatial Turn in Social Science History.” Social Science History 24:3 (2000): 451-67.
Knowles, Anne K., ed. Past Time, Past Place: GIS for History. Redlands, CA: ESRI Press, 2002.
Knowles, Anne K., ed. Placing History: How GIS is Changing Historical Scholarship. Redlands, CA: ESRI Press, 2008.
Knowles, Anne K., ed. “Reports on National Historical GIS Projects.” Emerging Trends in Historical GIS, Historical Geography 33 (2005): 134-58.
Knowles, Anne K., and Richard G. Healey. “Geography, Timing, and Technology: A GIS-Based Analysis of Pennsylvania’s Iron Industry, 1825-1875.” Journal of Economic History 66:3 (2006): 608-34.
Kocsis, A., and S. Kenderdine. “I Sho U: An Innovative Method for Museum Visitor Evaluation.” In Digital Heritage and Culture: Strategy and Implementation. Eds. H. Din and S. Wu. Singapore: World Scientific Publishing Co., 2014.
Koh, Adeline. “The Challenges of Digital Scholarship.” The Chronicle of Higher Education. ProfHacker, January 25, 2012. http://chronicle.com/blogs/profhacker/the-challenges-of-digital-scholarship/38103.
Koh, Adeline. “First Look: Textal, A Free SmartPhone App for Text Analysis.” The Chronicle of Higher Education. https://www.chronicle.com/blogs/profhacker/first-looks-textal-a-free-smartphone-app-for-text-analysis/51109.
Koh, Adeline. “A Letter to the Humanities: DH Will Not Save You.” Hybrid Pedagogy (April 19, 2015). http://www.hybridpedagogy.com/journal/a-letter-to-the-humanities-dh-will-not-save-you/.
Koh, Adeline. “Niceness, Building, and Opening the Genealogy of the Digital Humanities: Beyond the Social Contract of Humanities Computing.” Differences 25, no. 1 (2014): 93-106.
Kopas, Merrit. “What are Games Good For? Videogame Creation as Social, Artistic, and Investigative Practice.” MKOPAS, 2013. http://mkopas.net/files/talks/UVic2013Talk-WhatAreGamesGoodFor.pdf.
Kraemer, Harald. “Art is Redeemed, Mystery is Gone: The Documentation of Contemporary Art.” In Theorizing Digital Cultural Heritage. Eds. Fiona Cameron and Sarah Kenderdine. 193-222. Cambridge, MA: MIT Press, 2007.
Kramer, Michael. “What Does Digital Humanities Bring to the Table?” Issues in Digital History, September 25, 2012. http://www.michaeljkramer.net/issuesindigitalhistory/blog/?p=862.
Kretzschmar, William A. “Large-Scale Humanities Computing Projects: Snakes Eating Tails, or Every End is a New Beginning?” DHQ: Digital Humanities Quarterly 3 (1) (2009).
Kretzschmar, William A., and William Gray Potter. “Library Collaboration with Large Digital Humanities Projects.” Literary and Linguistic Computing 25, no. 4 (December 1, 2010): 439-445.
Krug, Steve. Don’t Make Me Think!: A Common Sense Approach to Web Usability. Berkeley, CA: New Riders Publishers, 2005.
Kuhn, T.S. The Structure of Scientific Revolutions. Chicago, IL: Chicago University Press, 1996.
Kulesz, Octavio. Digital Publishing in Developing Countries. Paris: International Alliance of Independent Publishers/Prince Claus Fund for Culture and Development, 2011. http://alliance-lab.org/etude/?lang=en.
Kumar, Vijay. 101 Design Methods: A Structured Approach for Driving Innovation in Your Organization. Hoboken, NJ: Wiley, 2012.
Kurgan, L. Close Up at a Distance: Mapping, Technology and Politics. New York, NY: Zone Books, 2013.
Kvamme, K.L. “Geographic Information Systems in Regional Archaeological Research and Data Management.” Archaeological Method and Theory 1 (1989): 139-203.
Kwan, Mei-Po. “Feminist Visualization: Re-envisioning GIS as a Method in Feminist Geographic Research.” Annals of the Association of American Geographers 92 (2002): 645-61.
Kwan, Mei-Po, and J. Lee. “Geo-Visualization of Human Activity Patterns Using 3-D GIS: A Time-Geographic Approach.” In Spatially Integrated Social Science. 48-66. New York, NY: Oxford University Press, 2004.
Lakatos, I. Methodology of Scientific Research Programmes. Cambridge, UK: Cambridge University Press, 1980.
Lake, M.W., P.E. Woodman, and S.J. Mithen. “Tailoring GIS Software for Archaeological Applications: An Example Concerning Viewshed Analysis.” Journal of Archaeological Science 25 (1998): 27-38.
Lancaster, Lewis R., and David J. Bodenhamer. “The Electronic Cultural Atlas Initiative and the North American Religion Atlas.” In Past Time, Past Place: GIS for History. 163-77. Redlands, CA: ESRI Press, 2002.
Landow, George P. Hypertext: The Convergence of Contemporary Critical Theory and Technology. Baltimore, MD: Johns Hopkins University Press, 1991.
Landow, George P. Hyper/Text/Theory. Baltimore, MD: Johns Hopkins University Press, 1994.
Landow, George P. Hypertext 2.0. Baltimore, MD: Johns Hopkins University Press, 1997.
Landow, George P. Hypertext 3.0: Critical Theory and New Media in an Era of Globalization. Baltimore, MD: Johns Hopkins University Press, 2006.
Landow, George P. “What’s a Critic to Do? Critical Theory in the Age of Hypertext.” In Hyper/Text/Theory. 10-48. Baltimore, MD: Johns Hopkins University Press, 1994.
Langran, Gail. Time in Geographic Information Systems. London, UK: Taylor & Francis, 1992.
Lanier, Jaron. You Are Not a Gadget: A Manifesto. New York, NY: Vintage, 2010.
Lantham, E., and W. Zhou. “Cultural Issues in Online Learning - Is Blended Learning a Possible Solution?” International Journal of Computer Processing of Oriental Languages 16, 4 (2003): 275-292.
Lanzoni, Kristin, Mark Olson, and Victoria Szabo. “Wired! and Visualizing Venice: Scaling Up Digital Art History.” Artl@s Bulletin 4(1) (2015). Purdue, IN. http://docs.lib.purdue.edu/artlas/vol4/iss1/3/.
Lascarides, Michael, and Ben Vershbow. “What’s On the Menu?: Crowdsourcing at the New York Public Library.” In Crowdsourcing our Cultural Heritage. Ed. Mia Ridge. Surrey, UK: Ashgate, 2014.
Lascarides, Michael, Ben Vershbow, and Trevor Owens. “Digital Cultural Heritage and the Crowd.” Curator: The Museum Journal 56, no. 1 (2013): 121-30.
Latour, Bruno. “Tarde’s Idea of Quantification.” In The Social After Gabriel Tarde: Debates and Assessments. Ed. M. Candea. London, UK: Routledge, 2010.
Latour, Bruno. “Visualization and Cognition: Drawing Things Together.” LOGOS 2 (2017): 95-156.
Latour, Bruno. “Visualization and Cognition: Thinking with Eyes and Hands.” Knowledge and Society, vol. 6 (1986): 1-40.
Svensson, Patrik. “The Landscape of Digital Humanities.” Digital Humanities Quarterly 4, no. 1 (2010). http://digitalhumanities.org/dhq/vol/4/1/000080/000080.html.
Latour, Bruno. “Why Has Critique Run Out of Steam? From Matters of Fact to Matters of Concern.” Critical Inquiry 30, no. 2 (2004).
Latour, Bruno, and Peter Weibel. Making Things Public: Atmospheres of Democracy. Cambridge, MA: MIT Press, 2005.
Latour, Bruno, and Tomas Sanchez-Criado. “Making the ‘Res Public’.” Ephemera 7 (2) (2007): 364–371.
Laufer, Roger, and Domenico Scavetta. Texte, Hypertexte et Hypermedia. Paris, France: PUF, 1992.
Lawless, Séamus, Owen Conlan, and Cormac Hampton. “Tailoring Access to Content.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 171-184. West Sussex, UK: Wiley-Blackwell, 2016.
Lazer, D., et al. “Computational Social Science.” Science 323, no. 5915 (6 February 2009): 721-723.
Learning through Digital Media: Experiments in Technology and Pedagogy. 2011. http://learningthroughdigitalmedia.net/.
Lee, C.A. I, Digital: Personal Collections in the Digital Era. Chicago, IL: Society of American Archivists, 2011.
Lee, C.A., M. Kirschenbaum, A. Chassanoff, P. Olsen, and K. Woods. “BitCurator: Tools and Techniques for Digital Forensics in Collecting Institutions.” D-Lib Magazine 18, 5/6 (2012).
Lee, C.A., and H. Tibbo. “Digital Curation and Trusted Repositories: Steps toward Success.” Journal of Digital Information 8, 2 (2007).
Lee, C.A., K. Woods, M. Kirschenbaum, and A. Chassanoff. “From Bitstreams to Heritage: Putting Digital Forensics into Practice in Collecting Institutions.” BitCurator, 2013.
Lee, Maurice S. “Searching the Archive with Dickens and Hawthorne: Databases and Aesthetic Judgment after the New Historicism.” ELH 79.3 (Fall 2012): 747-771.
Rainie, Lee, and Barry Wellman. Networked: The New Social Operating System. Cambridge, MA: MIT Press, 2012.
Lehman, Robert S. “Allegories of Rending: Killing Time with Walter Benjamin.” New Literary History 39, no. 2 (2008): 233-50.
Lerman, N., A.P. Mohun, and R. Oldenziel. Technology and Culture 38 (1). Special Issue: Gender Analysis and the History of Technology (2005).
Lesk, Michael. Practical Digital Libraries: Books, Bytes, and Bucks. San Francisco, CA: Morgan Kaufmann Publishers, 1997.
Lesk, Michael. Understanding Digital Libraries. San Francisco, CA: Morgan Kaufmann Publishers, 2005.
Lessig, Lawrence. Code and Other Laws of Cyberspace. Version 2.0. New York, NY: Basic Books, 2006.
Lessig, Lawrence. The Future of Ideas: The Fate of the Commons in a Connected World. New York, NY: Random House, 2001.
Levinson, Stephen C. Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge, UK: Cambridge University Press, 2003.
Levmore, Saul, and Martha Craven Nussbaum. The Offensive Internet: Speech, Privacy, and Reputation. Cambridge, MA: Harvard University Press, 2010.
Levy, David M. Scrolling Forward: Making Sense of Documents in the Digital Age. New York, NY: Arcade, 2001.
Lévy, P. Collective Intelligence. London, UK: Perseus, 1999.
Lévy, Pierre, and Robert Bononno. Collective Intelligence: Mankind’s Emerging World in Cyberspace. London, UK: Perseus, 1997.
Library of Congress. American Memory: Historical Collections for the National Digital Library.
Library of Congress. “Metadata for Digital Content (MDC). Developing Institution-Wide Policies and Standards at the Library of Congress.” 2015.
Liestøl, Gunnar, Andrew Morrison, and Terje Rasmussen, eds. Digital Media Revisited: Theoretical and Conceptual Innovations in Digital Domains. Cambridge, MA: MIT Press, 2003.
Lilley, Keith, Chris Lloyd, and Steven Trick. Mapping the Medieval Urban Landscape: Edward I’s New Towns of England and Wales. http://www.qub.ac.uk/urban_mapping.
Lima, Manuel. Visual Complexity: Mapping Patterns of Information. Princeton, NJ: Princeton Architectural Press, 2011.
Lind, Rebecca Ann, ed. “Producing Theory in a Digital World 2.0: The Intersection of Audiences and Production in Contemporary Theory.” Digital Formations, Vol. 99. Peter Lang, 2015.
Lindhé, Cecilia. “Medieval Materiality through the Digital Lens.” In Between Humanities and the Digital. Eds. D.T. Goldberg and P. Svensson. 193-204. Cambridge, MA: MIT Press, 2015.
Linley, Margaret. “Ecological Entanglements of DH.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 410-437. Minneapolis, MN: University of Minnesota Press, 2016.
Lipson, H., and M. Kurman. Fabricated: The New World of 3D Printing. Indianapolis, IN: John Wiley & Sons, Inc., 2013.
Lipson, H., F.C. Moon, J. Hai, and C. Paventi. “3-D Printing the History of Mechanisms.” Journal of Mechanical Design 127 (5) (2004): 1029-1033.
Literary and Linguistic Computing. Oxford Journals. llc.oxfordjournals.org.
Liu, Alan. “Digital Humanities and Academic Change.” English Language Notes 47, no. 1 (2009): 117-35.
Liu, Alan. “Escaping History: New Historicism, Databases, and Contingency.” Digital Retroaction Conference, University of California at Santa Barbara, pp. 17-19. 2004.
Liu, Alan. “Friending the Past: The Sense of History and Social Computing.” New Literary History 42, no. 1 (2011): 1-30.
Liu, Alan. “Imagining the New Media Encounter.” In A Companion to Digital Literary Studies. Eds. Susan Schreibman and Ray Siemens. Oxford, UK: Blackwell, 2008.
Liu, Alan. The Laws of Cool: Knowledge Work and the Culture of Information. Chicago, IL: University of Chicago Press, 2004.
Liu, Alan. Local Transcendence: Essays on Postmodern Historicism and the Database. Chicago, IL: University of Chicago Press, 2008.
Liu, Alan. “Manifesto for the Digital Humanities.” In THATCamp Paris 2010. Hypotheses. June 3, 2010. http://tcp.hypotheses.org/411.
Liu, Alan. “The Meaning of Digital Humanities.” PMLA 128, no. 2 (2013): 409-23.
Liu, Alan. “N+1: A Plea for Cross-Domain Data in the Digital Humanities.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 559-568. Minneapolis, MN: University of Minnesota Press, 2016.
Liu, Alan. “Sidney’s Technology.” In Local Transcendence: Essays on Postmodern Historicism and the Database. 187-206. Chicago, IL: University of Chicago Press, 2008.
Liu, Alan. “The State of the Digital Humanities: A Report and a Critique.” Arts and Humanities in Higher Education 11, no. 1 (2012): 1-34.
Liu, Alan. “Theses on the Epistemology of the Digital: Advice for the Cambridge Centre for Digital Knowledge.” Author’s blog, August 14, 2014. http://liu.english.ucsb.edu/theses-on-the-epistemology-of-the-digital-page.
Liu, Alan. “Transcendental Data: Toward a Cultural History and Aesthetics of the New Encoded Discourse.” Critical Inquiry 31, no. 1 (2004): 49-84.
Liu, Alan. “Where Is Cultural Criticism in the Digital Humanities?” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 490-509. Minneapolis, MN: University of Minnesota Press, 2012.
Liu, Lydia. The Freudian Robot: Digital Media and the Future of the Unconscious. Chicago, IL: University of Chicago Press, 2010.
The Living Net. “Project Snapshot: Vibrant Lives Present.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 42-43. Minneapolis, MN: University of Minnesota Press, 2017.
Llobera, Marcos. “Building Past Landscape Perception with GIS: Understanding Topographic Prominence.” Journal of Archaeological Science 28 (2001): 1005-14.
Lock, G.R., and K. Smith, eds. On the Theory and Practice of Archaeological Computing. Oxford, UK: Oxbow, 2000.
Lock, Gary. Using Computers in Archaeology: Towards Virtual Pasts. London, UK: Routledge, 2003.
Long, Christopher. “Performative Publication.” http://cplong.org/2013/07/performative-publication/.
Long, P.O. Openness, Secrecy, Authorship: Technical Arts and the Culture of Knowledge from Antiquity to the Renaissance. Baltimore, MD: Johns Hopkins University Press, 2004.
Longley, Paul A., Michael F. Goodchild, David J. Maguire, and David W. Rhind, eds. Geographic Information Systems and Science. New York, NY: John Wiley & Sons Inc., 2001.
Lopez, Andrew, Fred Rowland, and Kathleen Fitzpatrick. “On Scholarly Communication and the Digital Humanities: An Interview with Kathleen Fitzpatrick.” In The Library with the Lead Pipe. Self-published, 2015.
Losh, Elizabeth. “Hacktivism and the Humanities: Programming Protest in the Era of the Digital Humanities.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 161-186. Minneapolis, MN: University of Minnesota Press, 2012.
Losh, Elizabeth, Jacqueline Wernimont, Laura Wexler, and Hong-An Wu. “Putting the Human Back into the Digital Humanities: Feminism, Generosity, and Mess.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 92-103. Minneapolis, MN: University of Minnesota Press, 2016.
Lotan, Gilad, Erhardt Graeff, Mike Ananny, Devin Gaffney, Ian Pearce, and Danah Boyd. “The Revolutions Were Tweeted: Information Flows During the 2011 Tunisian and Egyptian Revolutions.” International Journal of Communication 5 (2011): 1375-405.
Lothian, Alexis, and Amanda Phillips. “Can Digital Humanities Mean Transformative Critique?” Journal of e-Media Studies 3, no. 1 (2013). https://journals.dartmouth.edu/cgi-bin/WebObjects/Journals.woa/xmlpage/4/article/425.
Lothian, Alexis. “Marked Bodies, Transformative Scholarship, and the Question of Theory in Digital Humanities.” Journal of Digital Humanities 1 (2011). http://journalofdigitalhumanities.org/1-1/marked-bodies-transformative-scholarship-and-the-question-of-theory-of-digital-humanities-by-alexis-lothian/.
Lovink, Geert. Zero Comments: Blogging and Critical Internet Culture. New York, NY: Routledge, 2008.
Luff, Paul, Jon Hindmarsh, and Christian Heath. Workplace Studies: Recovering Work Practice and Informing System Design. Cambridge, UK: Cambridge University Press, 2000.
Lum, Casey Man Kong. “Notes Toward an Intellectual History of Media Ecology.” In Perspectives on Culture, Technology, and Communication: The Media Ecology Tradition. Ed. Casey Man Kong Lum. 1-60. Cresskill, NJ: Hampton, 2006.
Lupton, Julia Reinhard. “Blur Building: Softscape.” Shakespeare & Hospitality. https://folgerpedia.folger.edu/Julia_Reinhard_Lupton.
Lynch, C.A. “Institutional Repositories: Essential Infrastructure for the Digital Age.” ARL Bimonthly Report 226 (2003): 1-7.
Lynch, Michael. “Science in the Age of Mechanical Reproduction: Moral and Epistemic Relations between Diagrams and Photographs.” Biology and Philosophy 6, no. 2 (1991): 205-26.
Lyotard, Jean-François. The Postmodern Condition: A Report on Knowledge. Manchester, UK: Manchester University Press, 1986.
Maack, Mary Niles. “Toward a New Model of the Information Professions: Embracing Empowerment.” Journal of Education for Library and Information Science 38, no. 4 (1997): 283-302.
MacArthur Foundation. Reports on Digital Media and Learning. Cambridge, MA: MIT Press, 2009-11. www.scribd.com/collections/2346520/John-D-and-Catherine-T-MacArthur-Foundation-Reports-on-Digital-Media-and-Learning.
MacDonald, Bertram H., and Fiona A. Black. “Using GIS for Spatial and Temporal Analyses in Print Culture Studies: Some Opportunities and Challenges.” Social Science History 24:3 (2000): 505-36.
MacEachren, Alan M. “Visualization Quality and the Representation of Uncertainty.” In Some Truth with Maps: A Primer on Symbolization & Design. Washington, DC: Association of American Geographers, 1994.
MacEachren, Alan M., and Fraser Taylor. Visualization in Modern Cartography. London, UK: Elsevier, 1994.
Mackenzie, Adrian. Cutting Code: Software and Sociality. Oxford, UK: Peter Lang, 2006.
Mackenzie, E.S., J. McLaughlin, A. Moore, and K. Rogers. “Digitising the Middle Ages: The Experience of the ‘Lands of the Normans’ Project.” International Journal of Humanities & Arts Computing 3, no. 1/2 (March 2009): 127-42.
Mackey, Thomas P., and Trudi E. Jacobson. “Reframing Information Literacy as a Metaliteracy.” College & Research Libraries (2011): 70.
Mackay, Wendy E. “Augmented Reality: Linking Real and Virtual Worlds. A New Paradigm for Interacting with Computers.” Proceedings of the Workshop on Advanced Visual Interfaces AVI (1998): 13-21.
Madden, L. “Applying the Digital Curation Lessons Learned from American Memory.” International Journal of Digital Curation 3.2 (2008).
Maeda, John. Creative Code: Aesthetics and Computation from the MIT Media Lab. London, UK: Thames & Hudson, 2004.
Maeroff, G.I. A Classroom of One: How Online Learning is Changing Our Schools and Colleges. Basingstoke, UK: Palgrave, 2003.
Maher, Jimmy. The Future Was Here: The Commodore Amiga. Cambridge, MA: MIT Press, 2012.
Mahony, Simon, and Elena Pierazzo. “Teaching Skills or Teaching Methodology?” In Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. 212-25. Cambridge, UK: Open Book Publishers, 2012.
Maier, Andrew. “Digital Literacy, Part 1: Cadence.” UX Booth. October 3, 2013. http://www.uxbooth.com/articles/digital-literacy-part-1-cadence/.
Maling, D.H. Measurements from Maps: Principles and Methods of Cartometry. New York, NY: Pergamon, 1989.
Mak, Bonnie. “Archaeology of a Digitization.” Journal of the American Society for Information Science and Technology 65, no. 8 (2014): 1515-1526. DOI: 10.1002/asi.23061.
Maker Lab in the Humanities. http://maker.uvic.ca/.
Mandell, Laura C. Breaking the Book: Print Humanities in the Digital Age. Oxford, UK: Wiley-Blackwell, 2015.
Mandell, Laura C. “Gendering Digital Literary History: What Counts for Digital Humanities.” In A New Companion to Digital Humanities. Eds. Susan Schreibman and Ray Siemens. Oxford, UK: Blackwell, 2008.
Mandell, Laura C. “Promotion and Tenure for Digital Scholarship.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/promotion-and-tenure-for-digital-scholarship-by-laura-mandell/.
Manoff, Marlene. “Archive and Database as Metaphor: Theorizing the Historical Record.” Portal: Libraries and the Academy 10, no. 4 (2010): 385-98.
Manoff, Marlene. “Theories of the Archive from Across the Disciplines.” Portal: Libraries and the Academy 4, no. 1 (January 2004): 9-25.
Manovich, Lev. “Cultural Analytics: Analysis and Visualization of Large Cultural Data Sets.” CALIT2 White Paper, 2007. www.manovich.net/cultural_analytics.pdf.
Manovich, Lev. “Database as a Genre of New Media.” AI & Society 14, no. 2 (June 1, 2000). http://time.arts.ucla.edu/AI_Society/manovich.html.
Manovich, Lev. “Database as Symbolic Form.” Convergence: The International Journal of Research into New Media Technologies 5, no. 2 (1999): 80-99.
Manovich, Lev. “Info-Aesthetics.” New Media / Culture / Software. 3 May 2006. http://www.manovich.net.
Manovich, Lev. The Language of New Media. Cambridge, MA: MIT Press, 2002.
Manovich, Lev. “The Language of New Media (What Is New Media?),” “The Interface,” and “The Forms.” In The Language of New Media. Ed. Lev Manovich. Cambridge, MA: MIT Press, 2002.
Manovich, Lev. Software Takes Command. New York, NY: Continuum Publishing Corporation, 2013.
Manovich, Lev. “Trending: The Promises and the Challenges of Big Social Data.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 460-475. Minneapolis, MN: University of Minnesota Press, 2012.
Manovich, Lev. “Visualizing Large Image Collections for Humanities Research.” In Media Studies Futures. Ed. Kelly Gates. Oxford, UK: Blackwell, 2012. http://manovich.net/DOCS/media_visualization.2011.pdf.
Manovich, Lev. “What is New Media?” In The New Media Theory Reader. Eds. Robert Hassan and Julian Thomas. 5-10. Maidenhead, UK: Open University Press, 2006.
Mansfield, Elizabeth, ed. Art History and Its Institutions: Foundations of a Discipline. New York, NY: Psychology Press, 2002.
Mantovani, F., et al. “Virtual Reality Training for Health-Care Professionals.” CyberPsychology & Behavior 6, 4 (2003): 389-395.
Map of Early Modern London (MoEML). https://mapoflondon.uvic.ca/.
“Mapping the Stacks: A Guide to Chicago’s Hidden Archives.” http://mts.lib.uchicago.edu/.
“Mapping Initiatives.” United States Holocaust Memorial Museum. http://www.ushmm.org/maps/.
“MARC in XML.” Library of Congress. http://www.loc.gov/marc/marcxml.html.
Markus, Manfred. Wright’s English Dialect Dictionary Computerised: Towards a New Source of Information. University of Helsinki, 17 Dec. 2007.
Marche, Stephen. “Literature Is Not Data: Against Digital Humanities.” Los Angeles Review of Books. October 28, 2012. https://lareviewofbooks.org/essay/literature-is-not-data-against-digital-humanities.
Marchionini, G., C. Plaisant, and A. Komlodi. “The People in Digital Libraries: Multifaceted Approaches to Assessing Needs and Impact.” In Digital Library Use: Social Practice in Design and Evaluation. Eds. A.P. Bishop et al. 119–160. Cambridge, MA: MIT Press, 2003.
Marcum, Deanna, and Amy Friedlander. “Keepers of the Crumbling Culture.” D-Lib Magazine 9, no. 5 (May 2003). http://www.dlib.org/dlib/may03/friedlander/05friedlander.html.
Marcus, Leah S. “The Silence of the Archive and the Noise of Cyberspace.” In The Renaissance Computer: Knowledge Technology in the First Age of Print. Eds. Neil Rhodes and Jonathan Sawday. 18–28. London and New York: Routledge, 2000.
Marcuse, Herbert. “Some Social Implications of Modern Technology.” In The Essential Frankfurt School Reader. Eds. Andrew Arato and Eike Gebhardt. 138-62. New York, NY: Continuum, 1982.
Marino, Mark C. “Why We Must Read the Code: The Science Wars, Episode IV.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 139-152. Minneapolis, MN: University of Minnesota Press, 2016.
Maron, Nancy, K. Kirby Smith, and Matthew Loy. “Sustaining Digital Resources: An On-the-Ground View of Projects Today.” Ithaka Case Studies in Sustainability. Ithaka S+R, July 2009. http://www.ithaka.org/ithaka-s-r/research/ithaka-case-studies-in-sustainability/report/SCA_Ithaka_SustainingDigitalResources_Report.pdf.
Maron, N.L., and S. Pickle. Sustaining the Digital Humanities: Host Institution Support Beyond the Start-up Phase. Ithaka S+R, 2014. http://www.sr.ithaka.org/research-publications/sustaining-digital-humanities.
Marsh, Leslie. “Review of ‘Natural-Born Cyborgs: Minds, Technologies, and the Future of Human Intelligence’.” Cognitive Systems Research 6 (2005): 405–409.
Marshall, Catherine C. Reading and Writing the Electronic Book. San Rafael, CA: Morgan & Claypool, 2010.
Martin, Kim, Beth Compton, and Ryan Hunt. “Disrupting Dichotomies: Mobilizing Digital Humanities with the MakerBus.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 251-6. Minneapolis, MN: University of Minnesota Press, 2017.
Martinec, R., and T. Van Leeuwen. The Language of New Media Design: Theory and Practice. New York, NY: Routledge, 2009.
Marwick, Alice E., and Danah Boyd. “I Tweet Honestly, I Tweet Passionately: Twitter Users, Context Collapse, and the Imagined Audience.” New Media and Society 13, no. 1 (2011): 114-33.
Marx, Vivien. “Data Visualization: Ambiguity as a Fellow Traveler.” Nature Methods 10, no. 7 (July 2013): 613-615. doi: 10.1038/nmeth.2530.
Massey, Doreen. “Space-Time, ‘Science,’ and the Relationship between Physical Geography and Human Geography.” Transactions of the Institute of British Geographers: New Series 24 (1999): 261-76.
Mateas, M. “Procedural Literacy: Educating the New Media Practitioner.” In Beyond Fun: Serious Games and Media. Ed. D. Davidson. 67-83. Pittsburgh, PA: ETC Press, 2008.
Mattern, Shannon Christine. “Evaluating Multimodal Work, Revisited.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/evaluating-multimodal-work-revisited-by-shannon-mattern/.
Mauro, Aaron. “Digital Liberal Arts and Project-Based Pedagogies.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, and Ray Siemens. 373-383. New York, NY: Routledge, 2016.
Maya Mapping Project. Maya Atlas: The Struggle to Preserve Maya Land in Southern Belize. Berkeley, CA: North Atlantic Books, 1997.
McCarty, Willard. “Becoming Interdisciplinary.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 69-83. West Sussex, UK: Wiley-Blackwell, 2016.
McCarty, Willard. “Being Reborn: The Humanities, Computing and Styles of Scientific Reasoning.” New Technology in Medieval and Renaissance Studies 1 (2007): 1-23.
McCarty, Willard. “Collaborative Research in the Digital Humanities.” In Collaborative Research in the Digital Humanities. Eds. Marilyn Deegan and Willard McCarty. 2-10. Farnham, UK: Ashgate, 2012.
McCarty, Willard. “Digital Knowing, Not Digital Knowledge.” Humanist 28, no. 140 (2014).
McCarty, Willard. “Finding Implicit Patterns in Ovid’s Metamorphoses with TACT.” Digital Studies/Le champ numérique (1996).
McCarty, Willard. “The Future of Digital Humanities is a Matter of Words.” In A Companion to New Media Dynamics. Eds. J. Hartley, J. Burgess, and A. Bruns. Chichester, UK: John Wiley & Sons Ltd., 2013.
McCarty, Willard. “Getting There from Here: Remembering the Future of Digital Humanities.” Roberto Busa Award lecture 2013. Literary and Linguistic Computing 29 (3) (2014): 283-306.
McCarty, Willard. “Humanities Computing: Essential Problems, Experimental Practice.” Literary and Linguistic Computing 17, no. 1 (April 1, 2002): 103-125.
McCarty, Willard. Humanities Computing. Basingstoke, UK: Palgrave Macmillan, 2005.
McCarty, Willard. “Modeling.” In Humanities Computing. Ed. Willard McCarty. 20-72. Basingstoke, UK: Palgrave Macmillan, 2005.
McCarty, Willard. “What is Humanities Computing? Toward a Definition of the Field.” http://ilex.cc.kcl.ac.uk/wlm/essays.what/.
McCarty, Willard. “Modeling: A Study in Words and Meanings.” In A Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. Oxford, UK: Blackwell, 2004.
McCarty, Willard. “The Ph.D. in Digital Humanities.” In Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. 33-46. Cambridge, UK: Open Book Publishers, 2012.
McCarty, Willard. “A Telescope for the Mind?” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 113-123. Minneapolis, MN: University of Minnesota Press, 2012.
McCarty, Willard, ed. Text and Genre in Reconstruction: Effects of Digitalization on Ideas, Behaviors, Products and Institutions. Cambridge, UK: Open Book Publishers, 2010.
McCarty, Willard, and Matthew Kirschenbaum. “Institutional Models for Humanities Computing.” Literary and Linguistic Computing 18, no. 4 (November 1, 2003): 465-489.
McCloud, Scott. Understanding Comics: The Invisible Art. New York, NY: HarperCollins, 1994.
McCullough, Malcolm. Ambient Commons: Attention in the Age of Embodied Information. Cambridge, MA: MIT Press, 2013.
McCullough, Malcolm. Digital Ground: Architecture, Pervasive Computing, and Environmental Knowing. Cambridge, MA: MIT Press, 2005.
McDonough, J., R. Olendorf, M. Kirschenbaum, K. Kraus, D. Reside, R. Donahue, A. Phelps, C. Egert, H. Lowood, and S. Rojo. Preserving Virtual Worlds Final Report. December 20, 2010. http://www.ideals.illinois.edu/handle/2142/17097.
McEnery, Tony, and Andrew Hardie. Corpus Linguistics: Method, Theory and Practice. Cambridge, UK: Cambridge University Press, 2012.
McGann, Jerome. “Culture and Technology: The Way We Live Now, What is to Be Done?” New Literary History 36, no. 1 (2005): 71-82. http://muse.jhu.edu/journals/new_literary_history/v036/36.1mcgann.html.
McGann, Jerome. “Electronic Archives and Critical Editing.” Literature Compass 7 (2) (2010): 37-42.
McGann, Jerome. “Imagining What You Don’t Know: The Theoretical Goals of the Rossetti Archive.” Institute for Advanced Technology in the Humanities. 1997. http://www2.iath.virginia.edu/jjm2f/old/chum.html.
McGann, Jerome. “Making Texts of Many Dimensions.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 358-376. West Sussex, UK: Wiley-Blackwell, 2016.
McGann, Jerome. A New Republic of Letters: Memory and Scholarship in the Age of Digital Reproduction. Cambridge, MA: Harvard University Press, 2014.
McGann, Jerome, ed. Online Humanities Scholarship: The Shape of Things to Come. Houston, TX: Rice University Press, 2010.
McGann, Jerome. “Philology in a New Key.” Critical Inquiry 39, no. 2 (2013): 327–46.
McGann, Jerome. Radiant Textuality: Literature after the World Wide Web. New York, NY: Palgrave, 2001.
McGann, Jerome. “The Rationale of Hypertext.” In Radiant Textuality: Literature After the World Wide Web. Ed. Jerome McGann. 53-74. New York, NY: Palgrave, 2001.
McGann, Jerome. “The Rossetti Archive and Image-Based Electronic Editing.” In The Literary Text in the Digital Age. Ed. Richard Finneran. 145-83. Ann Arbor, MI: University of Michigan Press, 1996.
McGann, Jerome. “Visible and Invisible Books: Hermetic Images in N-Dimensional Space.” In The Future of the Page. Eds. Peter Stoicheff and Andrew Taylor. 143-58. Toronto, ON: University of Toronto Press, 2004.
McGann, Jerome, Andrew Stauffer, Dana Wheeles, and Michael Pickard. “Abstract of Roger Bagnall, ‘Integrating Digital Papyrology’.” In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 135. Houston, TX: Rice University Press, 2010.
McGann, Jerome, and Bethany Nowviskie. “NINES: A Federated Model for Integrating Digital Scholarship.”
McGonigal, Jane. Reality Is Broken: Why Games Make Us Better and How They Can Change the World. New York, NY: Penguin Press, 2011.
McGrail, Anne B. “The ‘Whole Game’: Digital Humanities at Community Colleges.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 16-31. Minneapolis, MN: University of Minnesota Press, 2016.
McKenzie, D.F. Bibliography and the Sociology of Texts. Cambridge, UK: Cambridge University Press, 1999.
McKenzie, Jon. “Enhancing Digital Humanities at UW-Madison: A White Paper.” http://www.labster8.net/wp-content/uploads/2012/09/FDS_White_Paper.pdf.
McKeon, Richard. “The Uses of Rhetoric in a Technological Age: Architectonic Productive Arts.” In The Prospect of Rhetoric: Report of the National Development Project. Eds. Lloyd F. Bitzer and Edwin Black. Upper Saddle River, NJ: Prentice Hall, 1979.
McLafferty, Sara. “Women and GIS: Geospatial Technologies and Feminist Geographies.” Cartographica 40:4 (2005): 37-45.
McLuhan, Marshall. The Gutenberg Galaxy: The Making of Typographic Man. Toronto, ON: University of Toronto Press, 1962.
McLuhan, Marshall. Understanding Media: The Extensions of Man. 1964. Ed. Lewis Lapham. Cambridge, MA: MIT Press, 1994.
McLuhan, Marshall, and Quentin Fiore. The Medium is the Massage. Berkeley, CA: Gingko Press, 2005.
McPherson, Tara. “Introduction: Media Studies and the Digital Humanities.” Cinema Journal 48, no. 2 (2008): 119-23. http://muse.jhu.edu/journals/cj/summary/v048/48.2.mcpherson.html.
McPherson, Tara. “Media Studies and the Digital Humanities.” Cinema Journal 48 (2) (2009): 119-123.
McPherson, Tara. “U.S. Operating Systems at Mid-Century: The Intertwining of Race and UNIX.” In Race after the Internet. Eds. Lisa Nakamura and Peter A. Chow-White. 21-37. New York, NY: Routledge, 2012.
McPherson, Tara. “Why Are the Digital Humanities So White? or Thinking the Histories of Race and Computation.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 139-160. Minneapolis, MN: University of Minnesota Press, 2012.
McPherson, Tara. Reconstructing Dixie: Race, Place and Nostalgia in the Imagined South. Durham, NC: Duke University Press, 2003.
MediaCommons. mediacommons.futureofthebook.org.
Medieval Kingdom of Sicily Image Database. http://kos.aahvs.duke.edu/index.php.
Meeks, Elijah. “The Digital Humanities as Imagined Community.” Digital Humanities Specialist. September 14, 2010. http://dhs.stanford.edu/the-digital-humanities-as/the-digital-humanities-as-imagined-community/.
Meeks, Elijah, and Scott B. Weingart. “The Digital Humanities Contribution to Topic Modeling.” Journal of Digital Humanities 2, no. 1 (April 9, 2013). http://journalofdigitalhumanities.org/2-1/dh-contribution-to-topic-modeling/.
Meeks, E., and K. Grossner. “ORBIS: An Interactive Scholarly Work on the Roman World.” Journal of Digital Humanities 1 (3) (2012). http://journalofdigitalhumanities.org/1-3/orbis-an-interactive-scholarly-work-on-the-roman-world-by-elijah-meeks-and-karl-grossner.
Mendoza, Marcelo, Barbara Poblete, and Carlos Castillo. “Twitter Under Crisis: Can We Trust What We RT?” First Workshop on Social Media Analytics (SOMA ’10). Washington, DC, 2010.
Metcalfe, A.S. Knowledge Management and Higher Education: A Critical Analysis. London, UK: Information Science, 2006.
Metraux, Stephen. “Waiting for the Wrecking Ball: Skid Row in Postindustrial Philadelphia.” Journal of Urban History 25:5 (1999): 691-716.
Muehrcke, Phillip C. “The Logic of Map Design.” In Cartographic Design: Theoretical and Practical Perspectives. 271-8. New York, NY: John Wiley & Sons, Inc., 1996.
Meyer, Eric T., and Ralph Schroeder. Knowledge Machines: Digital Transformations of the Sciences and Humanities. Cambridge, MA: MIT Press, 2015.
Miall, David S. Humanities and the Computer: New Directions. Oxford, UK: Clarendon Press, 1990.
Michel, Jean-Baptiste, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray, The Google Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak, and Erez Lieberman Aiden. “Quantitative Analysis of Culture Using Millions of Digitized Books.” Science 331: 6014 (January 14, 2011).
Milgram, P., H. Takemura, A. Utsumi, and F. Kishino. “Augmented Reality: A Class of Displays on the Reality-Virtuality Continuum.” Proceedings of Telemanipulator and Telepresence Technologies 2351 (1994): 282-292.
Miller, J.H., and S.E. Page. Complex Adaptive Systems: An Introduction to Computational Models of Social Life. Princeton, NJ: Princeton University Press, 2007.
Miller, Peter, ed. Cultural Histories of the Material World. Ann Arbor, MI: University of Michigan Press, 2013.
Milic, L. “The Next Step.” Computers and the Humanities 1: 1 (1966): 3-6.
Millon, Emma. “Project Bamboo: Building Shared Infrastructure for Humanities Research.” Maryland Institute for Technology in the Humanities Blog. July 1, 2011. http://mith.umd.edu/project-bamboo-building-shared-infrastructure-for-humanities-research/.
Milton, N. Knowledge Management for Teams and Projects. Oxford, UK: Chandos Publishing, 2005.
Mirzoeff, Nicholas. An Introduction to Visual Culture. London and New York: Routledge, 1999.
Mirzoeff, Nicholas, ed. The Visual Culture Reader. London and New York: Routledge, 1998.
Mirzoeff, Nicholas. “What is Visual Culture?” In The Visual Culture Reader. Ed. Nicholas Mirzoeff. 3-13. London and New York: Routledge, 1998.
Mitchell, Don. The Right to the City: Social Justice and the Fight for Public Space. New York, NY: Guilford Press, 2003.
Mitchell, E.T., ed. Library Linked Data: Research and Adoption. Chicago, IL: ALA TechSource, 2013.
Mitchell, E.T. “Metadata Developments in Libraries and other Cultural Heritage Institutions.” In Library Linked Data: Research and Adoption. Ed. E.T. Mitchell. 5-10. Chicago, IL: ALA TechSource, 2013.
Mitchell, Marilyn. Library Workflow Redesign: Six Case Studies. Washington, D.C.: Council on Library and Information Resources, 2007. http://www.clir.org/pubs/abstract/pub139abst.html.
Mitchell, W.J.T. Picture Theory. Chicago, IL: University of Chicago Press, 1994.
Mitchell, W.J.T., and Mark B.N. Hansen. Critical Terms for Media Studies. Chicago, IL: University of Chicago Press, 2010.
Mitchell, William J., Alan S. Inouye, and Marjory S. Blumenthal, eds. Beyond Productivity: Information Technology, Innovation, and Creativity. 22 May 2006. http://newton.nap.edu/html/beyond_productivity/.
Mod, Craig. “The Digital-Physical: On Building Flipboard for iPhone & Finding the Edges of Our Digital Narratives.” @craigmod. https://craigmod.com/journal/digital_physical/.
Modern Language Association. “Documenting a New Media Case.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/documenting-a-new-media-case-evaluation-wiki-from-the-mla/.
Modern Language Association. “Guidelines for Editors of Scholarly Editions.” Modern Language Association, n.d. http://www.mla.org/resources/documents/rep_scholarly/cse_guidelines.
Modern Language Association. “Guidelines for Evaluating Work in Digital Humanities and Digital Media.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/guidelines-for-evaluating-work-in-digital-humanities-and-digital-media-from-the-mla/.
Modern Language Association. Report of the MLA Task Force on Evaluating Scholarship for Tenure and Promotion. 2006. http://www.mla.org/tenure_promotion.
Mohl, Raymond. “Planned Destruction: The Interstates and Central City Housing.” In From Tenements to the Taylor Homes. 226-45. University Park, PA: Pennsylvania State University Press, 1985.
Monmonier, Mark. Drawing the Line. New York, NY: Henry Holt, 1995.
Monmonier, Mark. Spying with Maps. Chicago, IL: University of Chicago Press, 2002.
Monmonier, Mark. How to Lie with Maps. 2nd edition. Chicago, IL: University of Chicago Press, 1996.
Montfort, Nick. “Beyond the Journal and the Blog: The Technical Report for Communication in the Humanities.” Amodern 1 (2013). http://amodern.net/article/beyond-the-journal-and-the-blog-the-technical-report-for-communication-in-the-humanities.
Montfort, Nick. “Exploratory Programming in Digital Humanities Pedagogy and Research.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 98-109. West Sussex, UK: Wiley-Blackwell, 2016.
Montfort, Nick, and Ian Bogost. Racing the Beam: The Atari Video Computer System. Cambridge, MA: MIT Press, 2009.
Montfort, Nick. Twisty Little Passages: An Approach to Interactive Fiction. Cambridge, MA: MIT Press, 2003.
Moran, Joe. Interdisciplinarity. London and New York: Routledge, 2002.
Moravec, Michelle. “Teaching with Pinterest.” http://historyinthecity.blogspot.com/2014/01/teaching-students-in-pinterest.html.
Moretti, Franco. “Conjectures on World Literature.” New Left Review 1 (Jan-Feb 2000): 54-68.
Moretti, Franco. Distant Reading. London and New York: Verso, 2013.
Moretti, Franco. Graphs, Maps, Trees: Abstract Models for a Literary History. London and New York: Verso, 2005.
Moretti, Franco. “Network Theory, Plot Analysis.” Literary Lab, Pamphlet 2, May 1, 2011. https://litlab.stanford.edu/LiteraryLabPamphlet2.pdf.
Morgan, Paige. “How to Get your Digital Humanities Project off the Ground.” http://www.paigemorgan.net/how-to-get-a-digital-humanities-project-off-the-ground/.
Morozov, Evgeny. To Save Everything, Click Here: The Folly of Technological Solutionism. New York, NY: Public Affairs, 2014.
Moore, R. “Towards a Theory of Digital Preservation.” International Journal of Digital Curation 3.1 (2008): 63-75.
Moore, Suzanne. “Grayson Perry’s Tapestries: Weaving Class and Taste.” The Guardian (2013). https://www.theguardian.com/books/2013/jun/08/grayson-perry-tapestries-class-taste.
Moretti, Franco. “Network Theory, Plot Analysis.” New Left Review 68 (March-April 2011): 80-102. PDF.
Morgan, Colleen Leah. “Emancipatory Digital Archaeology.” PhD dissertation, University of California, 2012.
Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud. Sebastopol, CA: O’Reilly Media, 2016.
Mortensen, P. “The Place of Theory in Archival Practice.” Archivaria 47 (1999): 1-26.
Mossberger, Karen, Caroline J. Tolbert, and Mary Stansbury. Virtual Inequality: Beyond the Digital Divide. Washington, DC: Georgetown University Press, 2003.
Mostern, Ruth, and Elana Gainor. “Traveling the Silk Road on a Virtual Globe: Pedagogy, Technology, and Evaluation for Spatial History.” Digital Humanities Quarterly 7, no. 2 (2013).
Mueller, Martin. “About the future of the TEI.” August 4, 2011. http://ariadne.northwestern.edu/mmueller/teiletter.pdf.
Mueller, Martin. “Collaborative curation of Early Modern plays by undergraduates.” Scalable Reading (2012).
Mueller, Martin. “Digital Shakespeare, or Towards a Literary Informatics.” Shakespeare 4, no. 3 (December 2008): 300-317.
Mueller, Martin. “How to Fix 60,000 Errors.” Scalable Reading (2013).
Mueller, Martin. “What is a Young Scholar Edition.” Scalable Reading (2013).
Mukurtu. www.mukurtu.org.
Mullen, Lincoln. “Digital Humanities Is a Spectrum: or, We’re All Digital Humanists Now.” In Backward Glance. April 29, 2010. http://lincolnmullen.com/2010/04/29/digital-humanities-is-a-spectrum-or-were-all-digital-humanists-now/.
Mullen, Lincoln. “These Maps Show How Slavery Expanded Across the United States.” Smithsonian.com. http://www.smithsonianmag.com/history/maps-reveal-slavery-expanded-across-united-states-180951452/?no-ist.
Muñoz, Trevor. “In Service? A Further Provocation on Digital Humanities Research in Libraries.” dh + lib. June 19, 2013. http://acrl.ala.org/dh/2013/06/19/in-service-a-further-provocation-on-digital-humanities-research-in-libraries.
Munster, Anna. An Aesthesia of Networks: Conjunctive Experience in Art and Technology. Cambridge, MA: MIT Press, 2013.
Muri, Allison. “The Grub Street Project.” In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 25-58. Houston, TX: Rice University Press, 2010.
Murray, Janet H. Hamlet on the Holodeck: The Future of Narrative in Cyberspace. Cambridge, MA: MIT Press, 1997.
Murray, K.M.E. Caught in the Web of Words. New Haven, CT: Yale University Press, 2001.
Murray, Susan. “Digital Images, Photo-Sharing, and Our Shifting Notions of Everyday Aesthetics.” Journal of Visual Culture 7, no. 2 (2008): 147-63.
Mussell, James. “Doing and Making: History as Digital Practice.” In History in the Digital Age. Ed. Toni Weller. 79-94. London, UK: Routledge, 2013.
Nakamura, Lisa. Digitizing Race: Visual Cultures of the Internet. Minneapolis, MN: University of Minnesota Press, 2008.
Nakamura, Lisa, and Peter Chow-White. Race after the Internet. New York, NY: Routledge, 2012.
Nardi, B.A. My Life as a Night Elf Priest. Ann Arbor, MI: University of Michigan Press, 2010.
National Information Standards Organization (NISO). Understanding Metadata. Bethesda, MD: NISO Press, 2004.
National Initiative for a Networked Cultural Heritage (NINCH). The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials. National Initiative for a Networked Cultural Heritage, 2002. www.nyu.edu/its/pubs/pdfs/NINCH_Guide_to_Good_Practice.pdf.
Naughton, J. From Gutenberg to Zuckerberg: What You Really Need to Know About the Internet. London, UK: Quercus, 2012.
Nawrotzki, Kristen, and Jack Dougherty. “Introduction.” In Writing History in the Digital Age. Eds. Jack Dougherty and Kristen Nawrotzki. 1-18. Ann Arbor, MI: University of Michigan Press, 2013.
Neal, Mark Anthony. “Race and the Digital Humanities.” Left of Black (webcast), season 3, episode 1, John Hope Franklin Center, September 17, 2012. https://www.youtube.com/watch?v=AQth5_-QNj0.
Negroponte, Nicholas. Being Digital. New York, NY: Alfred A. Knopf, 1995.
NEH Office of Digital Humanities. www.neh.gov/odh.
Nelson, B. “Exploring the Use of Individualised, Reflective Guidance in an Educational Multi-User Environment.” Journal of Science Education & Technology 16, 1 (2007): 83-97.
Nelson, Theodor. Computer Lib/Dream Machines. Redmond, WA: Tempus Books, 1987.
Nelson, Theodor. “A File Structure for the Complex, the Changing, and the Indeterminate.” In The New Media Reader. Ed. Noah Wardrip-Fruin. Cambridge, MA: MIT Press, 1965.
Nelson, Theodor. Literary Machines: The Report on, and of, Project Xanadu concerning Word Processing, Electronic Publishing, Hypertext, Thinkertoys, Tomorrow’s Intellectual Revolution, and Certain other Topics Including Knowledge, Education, and Freedom. 0-13. San Antonio, TX: T.H. Nelson, 1987.
Nelson, Robert. “The Slide Lecture, or the Work of Art ‘History’ in the Age of Mechanical Reproduction.” Critical Inquiry 26, no. 3 (Spring 2000): 414-434.
Nesmith, T. “Seeing Archives: Postmodernism and the Changing Intellectual Place of Archives.” The American Archivist 65.1 (2002): 24-41.
Netz, R., and W. Noel. The Archimedes Codex: Revealing the Secrets of the World’s Greatest Palimpsest. London, UK: Weidenfeld & Nicolson, 2011.
Newfield, Christopher. “Ending the Budget Wars: Funding the Humanities during a Crisis in Higher Education.” Profession (2009): 270-84.
New York Public Library. “Digital Humanities and the Future of Libraries (Multimedia Conference Proceedings).” New York Public Library, June 16, 2011. http://www.nypl.org/events/programs/2011/06/16/digital-humanities-and-future-libraries.
Ngata, W., H. Ngata-Gibson, and A. Salmond. “Te Ataakura: Digital Taonga and Cultural Innovation.” Journal of Material Culture 17.3 (2012): 229-44.
Nichols, Stephen G. “Time to Change Our Thinking: Dismantling the Silo Model of Digital Scholarship.” Ariadne, no. 58 (January 30, 2009). http://www.ariadne.ac.uk/issue58/nichols/.
Notes from THATCamp Digital Humanities & Libraries. Topics include “Starting a DH Program in the Library,” “Re-Skilling Librarians for DH,” and “DHT.”
Novick, Peter. That Noble Dream: The “Objectivity” Question and the American Historical Profession. Cambridge, UK: Cambridge University Press, 1988.
Nowviskie, Bethany. “Digital Humanities in the Anthropocene.” Bethany Nowviskie (blog), July 10, 2014. http://nowviskie.org/2014/anthropocene/.
Nowviskie, Bethany. “Eternal September of the Digital Humanities.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 243-246. Minneapolis, MN: University of Minnesota Press, 2012.
Nowviskie, Bethany. “Evaluating Collaborative Digital Scholarship (or, Where Credit is Due).” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/evaluating-collaborative-digital-scholarship-by-bethany-nowviskie/.
Nowviskie, Bethany. “Mapping the Catalog of Ships.” University of Virginia Library. http://scholarslab.org/blog/mapping-the-catalogue-of-ships/.
Nowviskie, Bethany. “A Skunk in the Library.” June 28, 2011. http://nowviskie.org/2011/a-skunk-in-the-library/.
Nowviskie, Bethany. “Skunks in the Library: A Path to Production for Scholarly R&D.” Journal of Library Administration 53, no. 1 (2013): 53-66.
Nowviskie, Bethany. “On the Origin of ‘Hack’ and ‘Yack.’” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 66-70. Minneapolis, MN: University of Minnesota Press, 2016.
Nowviskie, Bethany. “Reality Bytes.” June 20, 2012. http://nowviskie.org/2012/reality-bytes/.
Nowviskie, Bethany. “Resistance in the Materials.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 176-183. Minneapolis, MN: University of Minnesota Press, 2016.
Nowviskie, Bethany. “What do Girls Dig?” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 235-241. Minneapolis, MN: University of Minnesota Press, 2012.
Nowviskie, Bethany. “Where Credit is Due: Preconditions for the Evaluation of Collaborative Digital Scholarship.” Profession 13 (2011): 169-181.
Nowviskie, Bethany, and Dot Porter. “Graceful Degradation Survey Findings: Managing Digital Humanities Projects Through Times of Transition and Decline?” Digital Humanities 2010 Conference Abstract, June 2010. http://dh2010.cch.kcl.ac.uk/academic-programme/abstracts/papers/html/ab-722.html.
Nuffield Foundation. Interdisciplinarity. London, UK: Nuffield Foundation, 1975.
Nunberg, Geoffrey. “Counting on Google Books.” The Chronicle of Higher Education. The Chronicle Review. December 16, 2010. http://chronicle.com/article/Counting-on-Google-Books/125735/.
Nunberg, Geoffrey, ed. The Future of the Book. Berkeley, CA: University of California Press, 1996.
Nygren, Thomas, Zephyr Frank, Nicholas Bauch, and Erik Steiner. “Connecting with the Past: Opportunities and Challenges in Digital History.” In Research Methods for Creating and Curating Data in the Digital Humanities. Eds. Matt Hayler and Gabriele Griffin. 62-86. Edinburgh, UK: Edinburgh University Press, 2016.
Nyhan, Julianne, Andrew Flinn, and Anne Welsh. “Oral History and the Hidden Histories Project: Towards Histories of Computing in the Humanities.” Digital Scholarship in the Humanities 30.1 (2015): 71-85. Web. 10 June 2015.
Nyhan, J., and O. Duke-Williams. “Joint and Multi-Authored Publication Patterns in the Digital Humanities.” Literary and Linguistic Computing 29 (3) (2014): 387-399.
Nyhan, Julianne, Melissa M. Terras, and Claire Warwick. Digital Humanities in Practice. Facet Publishing in association with UCL Centre for Digital Humanities, 2012.
O’Donnell, Angela N., and Sharon J. Derry. “Cognitive Processes in Interdisciplinary Groups: Problems and Possibilities.” In Interdisciplinary Collaboration: An Emerging Cognitive Science. Eds. Sharon Derry, Christopher D. Schunn, and Morton A. Gernsbacher. 51-82. Mahwah, NJ: Erlbaum, 2005.
O’Donnell, Daniel Paul, Katherine L. Walter, Alex Gil, and Neil Fraistat. In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 493-510. West Sussex, UK: Wiley-Blackwell, 2016.
O’Donnell, James J. “Engaging the Humanities: The Digital Humanities.” Daedalus 138.1 (2009): 99-104.
O’Gorman, Marcel. “The Making of a Digital Humanities Neo-Luddite.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 116-127. Minneapolis, MN: University of Minnesota Press, 2017.
Ohya, K. “Programming with Arduino for Digital Humanities.” Journal of Digital Humanities 2 (3). http://journalofdigitalhumanities.org/2-3/programming-with-arduino-for-digital-humanities.
Oishi, L. “What does Second Life have to do with Real-Life learning?” Technology & Learning 27, 11 (2007): 54.
Old Maps Online. http://oldmapsonline.org/.
Oldman, Dominic, Martin Doerr, and Stefan Gradmann. “Zen and the Art of Linked Data: New Strategies for a Semantic Web of Humanist Knowledge.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 251-273. West Sussex, UK: Wiley-Blackwell, 2016.
Olsen, P. “Building a Digital Curation Workstation with BitCurator (update).” BitCurator. August 2, 2013. http://www.bitcurator.net/building-a-digital-curation-worskstation-with-bitcurator-update.
Olsen, M. “Signs, Symbols, and Discourses: A New Direction for Computer-Aided Literature Studies.” Computers and the Humanities 27 (5-6) (1993): 309-314.
Olsen, Mark. “What Can and Cannot Be Done with Electronic Text in Historical and Literary Research.” Paper for “Modeling Literary Research Methods by Computer.” Modern Language Association Annual Meeting.
Olson, Mark J.V. “Hacking the Humanities: 21st Century Literacies and the ‘Becoming-Other’ of the Humanities.” In Humanities in the Twenty-First Century: Beyond Utility and Markets. Eds. E. Belfiore and A. Upchurch. New York, NY: Palgrave Macmillan, 2013.
Omeka. http://omeka.net.
Ong, Walter J. Interfaces of the Word. Ithaca, NY: Cornell University Press, 1977.
Ong, Walter J. Orality and Literacy: The Technologizing of the Word. London, UK: Methuen, 1982.
Ong, Walter. “Writing Restructures Consciousness.” In Orality and Literacy: The Technologizing of the Word. 77-114. London and New York: Routledge, 1982.
Open Access Directory. oad.simmons.edu.
Orr, Julian E. Talking about Machines: An Ethnography of a Modern Job. Ithaca, NY: ILR Press, 1996.
Ortiz, Santiago. “45 Ways to Communicate Two Quantities.” visual.ly (2012). https://visual.ly/blog/45-ways-to-communicate-two-quantities/.
Orwant, John. “Our Commitment to the Digital Humanities.” The Official Google Blog. July 14, 2010. http://googleblog.blogspot.com/2010/07/our-commitment-to-digital-humanities.html.
O’Sullivan, David, and David Unwin. Geographic Information Analysis. Chichester, UK: John Wiley & Sons, Inc., 2003.
O’Sullivan, Dan, and Tom Igoe. Physical Computing: Sensing and Controlling the Physical World with Computers. New York, NY: Thomson, 2004.
O’Sullivan, James, Christopher P. Long, and Mark A. Mattson. “Dissemination as Cultivation: Scholarly Communications in a Digital Age.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, and Ray Siemens. 384-98. New York, NY: Routledge, 2016.
Otty, Lisa, and Tara Thomson. “Data Visualisation and the Humanities.” In Research Methods for Creating and Curating Data in the Digital Humanities. Eds. Matt Hayler and Gabriele Griffin. 113-139. Edinburgh, UK: Edinburgh University Press, 2016.
Owens, J.B., and Laura Woodworth-Ney. “Envisioning a Master’s Degree Program in Geographically Integrated History.” Journal of the Association for History and Computing 8:2 (2005): n.p.
Owens, Trevor. “Defining Data for Humanists: Text, Artifact, Information or Evidence?” Journal of Digital Humanities 1.1 (2011).
Owens, Trevor. “The Public Course Blog: The Required Reading We Write Ourselves for the Course That Never Ends.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 409-411. Minneapolis, MN: University of Minnesota Press, 2012.
Owens, Trevor, and J. Bailey. “Viewshare: Digital Interfaces as Scholarly Activity.” Perspectives on History. American Historical Association, 2012.
Padrón, Ricardo. The Spacious Word: Cartography, Literature, and Empire in Early Modern Spain. Chicago, IL: University of Chicago Press, 2004.
Palamidese, Patrizia. Scientific Visualization: Advanced Software Techniques. New York, NY: Ellis Horwood, 1993.
Palen, Leysia, Kate Starbird, Sarah Vieweg, and Amanda Hughes. “Twitter-Based Information Distribution During the 2009 Red River Valley Flood Threat.” Bulletin of the American Society for Information Science and Technology 36, no. 5 (2010): 13-17.
Pallasmaa, J. The Embodied Image: Imagination and Imagery in Architecture. Hoboken, NJ: John Wiley & Sons, Inc., 2011.
Palmer, Carole L. “Thematic Research Collections.” In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 348-65. Houston, TX: Rice University Press, 2010.
Pannapacker, William. “Digital Humanities Triumphant?” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 233-234. Minneapolis, MN: University of Minnesota Press, 2012.
Pannapacker, William. “The MLA and the Digital Humanities.” Chronicle of Higher Education. December 28, 2009. http://chronicle.com/blog/Author/Brainstorm/3/William-Pannapacker/143/.
Pannapacker, William. “Stop Calling It ‘Digital Humanities’.” Chronicle of Higher Education. February 18, 2013. http://chronicle.com/article/Stop-Calling-It-Digital/137325/.
Papacharissi, Zizi A. A Private Sphere: Democracy in a Digital Age. Cambridge, UK: Polity, 2010.
Papacharissi, Zizi A. “Conclusion: A Networked Self.” In A Networked Self: Identity, Community, and Culture on Social Network Sites. Ed. Zizi Papacharissi. 304-18. New York, NY: Routledge, 2011.
Pappano, Laura. “The Year of the MOOC.” The New York Times, November 2, 2012. http://www.nytimes.com/2012/11/04/education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html.
Parikka, Jussi. What Is Media Archaeology? Cambridge, UK: Polity, 2012.
Parker, Cornelia. Cold Dark Matter: An Exploded View. London: Tate Modern, 1991. http://www.tate.org.uk/learn/online-resources/cold-dark-matter.
Parker, Patricia A. “Othello and Hamlet: Spying, Discovery and Secret Faults.” In Shakespeare from the Margins: Language, Culture, Context. Chicago, IL: University of Chicago Press, 1996.
Parks, Lisa. Cultures in Orbit: Satellites and the Televisual. Durham, NC: Duke University Press, 2005.
Parry, David. “Be Online or Be Irrelevant.” AcademHack. January 11, 2010. http://academhack.outsidethetext.com/home/2010/be-online-or-be-irrelevant/.
Parry, David. “The Digital Humanities or a Digital Humanism.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 429-437. Minneapolis, MN: University of Minnesota Press, 2012.
Parry, Ross, ed. Museums in a Digital Age. New York, NY: Routledge, 2010.
Pasztory, Esther. Thinking with Things: Toward a New Vision of Art. Austin, TX: University of Texas Press, 2005.
Pastorino, Cesare. “The Mine and the Furnace: Francis Bacon, Thomas Russell, and Early Stuart Mining Culture.” Early Science and Medicine 14, no. 5 (2009): 630-60.
Pearce, Celia. Communities of Play: Emergent Cultures in Multiplayer Games and Virtual Worlds. Cambridge, MA: MIT Press, 2009.
Pearce-Moses, R., ed. A Glossary of Archival and Records Terminology. SAA, 2012.
Pearson, Alastair W., and Peter Collier. “Agricultural History with GIS.” In Past Time, Past Place: GIS for History. 105-16. Redlands, CA: ESRI Press, 2002.
Penzias, Arno. Ideas and Information: Managing in a High-Tech World. New York, NY: W.W. Norton & Company, 1989.
Perkins, David. Future Wise: Educating Our Children for a Changing World. San Francisco, CA: Jossey-Bass, 2014.
Peters, John Durham. The Marvelous Clouds: Toward a Philosophy of Elemental Media. Chicago, IL: University of Chicago Press, 2015.
Peters, John Durham. Speaking into the Air: A History of the Idea of Communication. Chicago, IL: University of Chicago Press, 1999.
Petroski, Henry. The Pencil: A History of Design and Circumstance. New York, NY: Knopf, 1992.
Petroski, Henry. The Toothpick: Technology and Culture. New York, NY: Knopf, 2007.
Petzold, Charles. Code: The Hidden Language of Computer Hardware and Software. 1st edition. Microsoft Press, 2000.
Peuquet, Donna J. Representations of Space and Time. New York, NY: Guilford, 2002.
Phillips, Whitney. This Is Why We Can’t Have Nice Things: Mapping the Relationship between Online Trolling and Mainstream Culture. Cambridge, MA: MIT Press, 2015.
Pickering, A. The Cybernetic Brain: Sketches of Another Future. Chicago, IL: University of Chicago Press, 2011.
Pickles, John. “Arguments, Debates, and Dialogues: The GIS-Social Theory Debate and the Concern for Alternatives.” In Geographic Information Systems. 49-60. New York, NY: John Wiley & Sons, Inc., 1999.
Pickles, John, ed. Ground Truth: The Social Implications of Geographic Information Systems. New York, NY: Guilford Press, 1995.
Pickles, John. A History of Spaces: Cartographic Reason, Mapping, and the Geo-Coded World. New York, NY: Routledge, 2004.
Pickles, John. “Representations in an Electronic Age: Geography, GIS, and Democracy.” In Ground Truth: The Social Implications of Geographic Information Systems. 1-50. New York, NY: Guilford Press, 1995.
Pierazzo, Elena. “Digital Humanities: A Definition.” 2011. http://epierazzo.blogspot.co.uk/2011/01/digital-humanities-definition.html.
Pierazzo, Elena. “Digital Documentary Editions and the Others.” Scholarly Editing 35 (2014).
Pierazzo, Elena. “Textual Scholarship and Text Encoding.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 307-321. West Sussex, UK: Wiley-Blackwell, 2016.
Pinch, Trevor J., and Wiebe E. Bijker. “The Social Construction of Facts and Artifacts: Or How the Sociology of Science and the Sociology of Technology Might Benefit Each Other.” In The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology. Eds. Wiebe E. Bijker, Thomas P. Hughes, and Trevor Pinch. 17-50. Cambridge, MA: MIT Press, 1989.
Piper, Andrew. Book Was There: Reading in Electronic Times. Chicago, IL: University of Chicago Press, 2012.
Pitti, D.V., and W.M. Duff. Encoded Archival Description on the Internet. Binghamton, NY: Haworth Information Press, 2001.
Plant, S. Zeroes and Ones: Digital Women and the New Technoculture. New York, NY: Doubleday, 1997.
Plewe, Brandon. “The Nature of Uncertainty in Historical Geographic Information.” Transactions in GIS 6:4 (2002): 431-56.
Polefrone, Phillip R., John Simpson, and Dennis Yi Tenen. “Critical Computing in the Humanities.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, and Ray Siemens. 85-103. New York, NY: Routledge, 2016.
Poole, A. “Now is the Future Now? The Urgency of Digital Curation in the Digital Humanities.” DHQ: Digital Humanities Quarterly 7 (2). 2013. http://www.digitalhumanities.org/dhq/vol/7/2/000163/000163.html.
Poole, Steven. “Green’s Dictionary of Slang by Jonathon Green and Guardian Style by David Marsh & Amelia Hodsdon - review.” The Guardian, 2010. https://www.theguardian.com/books/2010/dec/18/dictionary-slang-guardian-style-review.
Posner, Miriam. “Here and There: Creating DH Community.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 265-273. Minneapolis, MN: University of Minnesota Press, 2016.
Posner, Miriam. “No Half Measures: Overcoming Common Challenges to Doing Digital Humanities in the Library.” Journal of Library Administration 53:1 (January 2013).
Posner, Miriam. “Think Talk Make Do: Power and the Digital Humanities.” Journal of Digital Humanities 1.2 (2012).
Posner, Miriam. “What’s Next: The Radical, Unrealized Potential of Digital Humanities.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 32-41. Minneapolis, MN: University of Minnesota Press, 2016.
Postcolonial Digital Humanities. http://dhpoco.org.
Potter, R. “Literary Criticism and Literary Computing.” Computers and the Humanities 22 (2) (1988): 93.
Potter, Claire. “Putting the Humanities in Action: Why We Are All Digital Humanists, and Why That Needs to Be a Feminist Project.” Keynote presentation, Women’s History in the Digital World Conference, Bryn Mawr College, 2015. http://repository.brynmawr.edu/greenfield_conference/2015/Thursday/14/.
Potter, R.G. “Statistical Analysis of Literature: A Retrospective on Computers and the Humanities, 1966-1990.” Computers and the Humanities 25, no. 6 (1991): 401-29.
Potter, W. James. Media Literacy. Los Angeles, CA: Sage, 2008.
Powell, Daniel. “Dispatches from Capitol Hill: #1.” http://djp2025.com/dispatches-from-capitol-hill-1/.
Powell, Daniel. “Dispatches from Capitol Hill: #2, or EEBO and the Infinite Weirdness.” http://djp2025.com/dispatches-from-capitol-hill-2-or-eebo-and-the-infinite-weirdness/.
Powell, Daniel. “Dispatches from Capitol Hill: #3, or XML and TEI are Scary.” http://djp2025.com/dispatches-from-capitol-hill-3/.
Powell, Daniel. “Dispatches from Capitol Hill: #4, or What is Transcription, Really?” http://djp2025.com/dispatches-from-capitol-hill-4/.
Power, Eugene. Edition of One. Ann Arbor, MI: University of Michigan, 1990.
Pradt Lougee, Wendy. Diffuse Libraries: Emergent Roles for the Research Library in the Digital Age. Washington, DC: Council on Library and Information Resources, 2002. http://www.clir.org/pubs/abstract/pub108abst.html.
Pratt, Vernon. Thinking Machines: The Evolution of Artificial Intelligence. Oxford, UK: Basil Blackwell, 1987.
Prescott, Andrew. “An Electric Current of the Imagination.” Digital Humanities: Works in Progress. http://blogs.cch.kcl.ac.uk/wip/2012/01/26/an-electric-current-of-the-imagination.
Prescott, Andrew. “Beyond the Digital Humanities Center: The Administrative Landscapes of the Digital Humanities.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 461-475. West Sussex, UK: Wiley-Blackwell, 2016.
Prescott, Andrew. “Consumers, Creators or Commentators? Problems of Audience and Mission in Digital Humanities.” Arts and Humanities in Higher Education 11, nos. 1-2 (2011): 61-75.
Prescott, Andrew. “An Electric Current of the Imagination: What the Digital Humanities Are and What They Might Become.” Journal of Digital Humanities, 26 June 2012.
Prescott, Andrew. “Riffs on McCarty.” Digital Riffs. http://digitalriffs.blogspot.com/2013/07/riffs-on-mccarty.html.
Presner, Todd. “Critical Theory and the Mangle of Digital Humanities.” In Between Humanities and the Digital. Eds. Patrik Svensson and David Theo Goldberg. 55-68. Cambridge, MA: MIT Press, 2015.
Presner, Todd. “The Ethics of the Algorithm: Close and Distant Listening to the Shoah Foundation Visual History Archive.” In History Unlimited: Probing the Ethics of Holocaust Culture. Cambridge, MA: Harvard University Press, 2015.
Presner, Todd. “How to Evaluate Digital Scholarship.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/how-to-evaluate-digital-scholarship-by-todd-presner/.
Presner, Todd. “Hypercities.” In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 251-72. Houston, TX: Rice University Press, 2010.
Presner, Todd. “Remapping German-Jewish Studies: Benjamin, Cartography, Modernity.” The German Quarterly 82, no. 3 (2009): 293-315.
Presner, Todd, and Chris Johanson. “The Promise of Digital Humanities: A Whitepaper. March 1, 2009 - Final Version.” http://www.itpb.ucla.edu/documents/2009/PromiseofDigitalHumanities.pdf.
Presner, Todd, David Shepard, and Yoh Kawano. HyperCities: Thick Mapping in the Digital Humanities (metaLABprojects). Cambridge, MA: Harvard University Press, 2014.
Presner, Todd, and David Shepard. “Mapping the Geospatial Turn.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 201-212. West Sussex, UK: Wiley-Blackwell, 2016.
Presner, Todd, J. Schnapp, and P. Lunenfeld. The Digital Humanities Manifesto 2.0. 2009. http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf.
Price, Jacob. “Recent Quantitative Work in History: A Survey of the Main Trends.” History and Theory 9 (1969): 11-13.
Price, Kenneth M. “Civil War Washington Project.” In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 287-310. Houston, TX: Rice University Press, 2010.
Price, Kenneth M. “Collaborative Work and the Conditions for American Literary Scholarship in a Digital Age.” In The American Literature Scholar in the Digital Age. Eds. Amy E. Earhart and Andrew W. Jewell. 9-27. Ann Arbor, MI: University of Michigan Press, 2011.
Price, Kenneth M. “Digital Scholarship, Economics, and the American Literary Canon.” Literature Compass 6, no. 2 (2009): 274-290. http://onlinelibrary.wiley.com/doi/10.1111/j.1741-4113.2009.00622.x/full.
Price, Kenneth M. “Edition, Project, Database, Archive, Thematic Research Collection: What’s in a Name?” DHQ: Digital Humanities Quarterly (2009).
Price, Kenneth M. “Electronic Scholarly Editions.” In A Companion to Digital Literary Studies. Eds. R.G. Siemens and S. Schreibman. 434-450. Oxford, UK: Blackwell, 2008.
Price, Kenneth M., and R. Siemens, eds. Literary Studies in the Digital Age: A Methodological Primer. New York, NY: MLA Commons.
Price, Kenneth. “Social Scholarly Editing.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 137-149. West Sussex, UK: Wiley-Blackwell, 2016.
Pritchard, D. “Working Papers, Open Access, and Cyber-Infrastructure in Classical Studies.” Literary and Linguistic Computing 23 (2): 149-162.
Proctor, Nancy. “Digital: Museum as Platform, Curator as Champion, in the Age of Social Media.” Curator: The Museum Journal 53, no. 1 (January 1, 2010): 35-43. http://arthistory2014.doingdh.org/readings/.
Project Gutenberg. www.gutenberg.org.
Project Bamboo. 2011. http://www.projectbamboo.org/.
Promey, Sally M., and Miriam Stewart. “Digital Art History: A New Field for Collaboration.” American Art 11, no. 2 (July 1, 1997): 36-41. http://www.jstor.org/stable/3109247.
Proot, Goran, and Leo Egghe. “Estimating Editions on the Basis of Survivals…” Papers of the Bibliographical Society of America 102, no. 2 (2008): 149-74.
Prown, Jules. “The Art Historian and the Computer.” In Art as Evidence: Writings on Art and Material Culture. New Haven, CT: Yale University Press, 2001.
Public Knowledge Project. pkp.sfu.ca.
Pumfrey, Stephen, Paul Rayson, and John Mariani. “Experiments in 17th Century English: manual versus automatic conceptual history.” Literary and Linguistic Computing 27, no. 4 (2012): 395-408.
Purdue University. “Evaluation Criteria for the Scholarship of Engagement.” n.d. http://www.vet.purdue.edu/engagement/files/documents/Evaluationcriterion.pdf.
Puschmann, Cornelius, and Jean Burgess. “The Politics of Twitter Data.” HIIG Discussion Paper Series 2013-01 (2013).
Quamen, Harvey, and Jon Bath. “Databases.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, and Ray Siemens. 145-62. New York, NY: Routledge, 2016.
Quan-Haase, Anabel, Juan Luis Suarez, and David M. Brown. “Collaborating, Connecting, and Clustering in the Humanities: A Case Study of Networked Scholarship in an Interdisciplinary, Dispersed Team.” American Behavioral Scientist 59.4 (2014): 443-456.
Race: The Floating Signifier. Dir. Sut Jhally, with Stuart Hall. Northampton, MA: Media Education Foundation, 2002.
Radford, Marie L., and Pamela Snelson. “Academic Library Research: Perspectives and Current Trends.” ACRL Publications in Librarianship no. 59. Chicago: Association of College and Research Libraries, 2008.
Raessens, Joost. “Computer Games as Participatory Media Culture.” In Handbook of Computer Game Studies. Eds. J. Raessens and J. Goldstein. 373-88. Cambridge, MA: MIT Press, 2005.
Simone, Raffaele. “The Body of the Text.” In The Future of the Book. Ed. Geoffrey Nunberg. 239-251. Berkeley, CA: University of California Press, 1996.
Rahtz, S. “Storage, Retrieval, and Rendering.” In Electronic Textual Editing. Eds. L. Burnard, K. O’Brien O’Keeffe, and J. Unsworth. 310-333. New York, NY: Modern Language Association, 2006.
Raley, Rita. “Digital Humanities for the Next Five Minutes.” Differences 25, no. 1 (2014): 26-45.
Raley, Rita. Tactical Media. Minneapolis, MN: University of Minnesota Press, 2009.
Rambsy, Kenton. “African American Literature and Digital Humanities.” January 17, 2014. http://www.culturalfront.org/2014/01/african-american-literature-and-digital.html.
Ramsay, Stephen. “Algorithmic Criticism.” In A Companion to Digital Literary Studies. Eds. Ray Siemens and Susan Schreibman. Oxford, UK: Blackwell, 2004.
Ramsay, Stephen. “Care of the Soul.” Literatura Mundana, October 8, 2010. http://lenz.unl.edu/wordpress/?p=266.
Ramsay, Stephen. “Centers are People.” April 2012. http://lenz.unl.edu/papers/2012/04/25/centers-are-people.html.
Ramsay, Stephen. “Databases.” In A Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. Oxford: Blackwell, 2004.
Ramsay, Stephen. “Hard Constraints: Designing Software in the Digital Humanities.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 449-458. West Sussex, UK: Wiley-Blackwell, 2016.
Ramsay, Stephen. “The Hermeneutics of Screwing Around; or What You Do with a Million Books.” In Pastplay: Teaching and Learning History with Technology. Ed. Kevin Kee. 111-120. Ann Arbor, MI: University of Michigan Press, 2014.
Ramsay, Stephen. “Humane Computation.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 527-529. Minneapolis, MN: University of Minnesota Press, 2016.
Ramsay, Stephen. “In Praise of Pattern.” Text Technology 2 (2005): 177-190.
Ramsay, Stephen. “On Building.” Stephen Ramsay (Author’s Blog). January 11, 2011. http://stephenramsay.us/text/2011/01/11/on-building/.
Ramsay, Stephen. Reading Machines: Toward an Algorithmic Criticism (Topics in the Digital Humanities). Urbana-Champaign, IL: University of Illinois Press, 2011.
Ramsay, Stephen. “Rules of the Order: The Sociology of Large, Multi-Institutional Software Development Projects.” Digital Humanities 2008. 2008.
Ramsay, Stephen. “Toward an Algorithmic Criticism.” Literary and Linguistic Computing 18.2 (2003): 167-174.
Ramsay, Stephen. “Who’s in and Who’s Out.” Stephen Ramsay Blog. January 8, 2011. http://lenz.unl.edu/papers/2011/01/08/whos-in-and-whos-out.html.
Ramsay, Stephen, and Geoffrey Rockwell. “Developing Things: Notes toward an Epistemology of Building in the Digital Humanities.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 75-84. Minneapolis, MN: University of Minnesota Press, 2012.
Raper, Jonathan. Multidimensional Geographic Information Science: Extending GIS in Space and Time. New York, NY: Taylor & Francis, 2000.
Ratto, M. “Critical Making: Conceptual and Material Studies in Technology and Social Life.” Information Society 27 (2011): 252-60.
Ratto, M., S. Wylie, and K. Jalbert. “Introduction to the Special Forum on Critical Making as Research Program.” Information Society 30, 2 (2014): 85-95.
Real, L.A. “Collaboration in the Sciences and the Humanities: A Comparative Phenomenology.” Arts and Humanities in Higher Education 11 (2012): 250-261.
Reed, Ashley. “Managing an Established Digital Humanities Project: Principles and Practices from the Twentieth Year of the William Blake Archive.” Digital Humanities Quarterly 8, no. 1 (2014).
Reichardt, J. Robots: Fact, Fiction, and Prediction. London, UK: Thames & Hudson, 1978.
Reid, Alexander. “The Creative Community and the Digital Humanities.” Digital Digs. October 17, 2010. http://www.alex-reid.net/2010/10/the-creative-community-and-the-digital-humanities.html.
Reid, Alexander. “Digital Digs: The Digital Humanities Divide.” Digital Digs. February 17, 2011. http://www.alex-reid.net/2011/02/the-digital-humanities-divide.html.
Reid, Alexander. “Digital Humanities: Two Venn Diagrams.” Digital Digs. March 9, 2011. http://www.alex-reid.net/2011/03/digital-humanities-two-venn-diagrams.html.
Reid, Alexander. “Graduate Education and the Ethics of the Digital Humanities.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 350-367. Minneapolis, MN: University of Minnesota Press, 2012.
Renear, Allen. “Text Encoding.” In A Companion to Digital Humanities. Eds. S. Schreibman, R. Siemens, and J. Unsworth. Oxford, UK: Blackwell, 2004. http://www.digitalhumanities.org/companion.
Renear, Allen, David Dubin, C.M. Sperberg-McQueen, and Claus Huitfeldt. “XML Semantics and Digital Libraries.” International Conference on Digital Libraries. Washington, DC: 2002.
Resh, Gabby, Dan Southwick, Isaac Record, and Matt Ratto. “Thinking as Handwork: Critical Making with Humanistic Concerns.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 149-61. Minneapolis, MN: University of Minnesota Press, 2017.
Resig, John. “Using Computer Vision to Increase the Research Potential of Photo Archives.” http://ejohn.org/research/computer-vision-photo-archives/.
Rettberg, Scott. “Electronic Literature as Digital Humanities.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 127-136. West Sussex, UK: Wiley-Blackwell, 2016.
Rheingold, Howard. Smart Mobs: The Next Social Revolution. New York, NY: Basic, 2003.
Rhody, Lisa Marie. “Why I Dig: Feminist Approaches to Text Analysis.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 536-539. Minneapolis, MN: University of Minnesota Press, 2016.
Rhyne, Charles S. “Images as Evidence in Art History and Related Disciplines.” In MW97: Museums and the Web 1997.
Ridolfo, Jim, and William Hart-Davidson, eds. Rhetoric and the Digital Humanities. 20-32. Chicago, IL: University of Chicago Press, 2015.
Rieger, Oya Y. “Framing Digital Humanities: The Role of New Media in Humanities Scholarship.” First Monday 15, no. 10 (October 4, 2010). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3198/2628.
Rigney, A. “When the Monograph is no Longer the Medium: Historical Narrative in the Online Age.” History and Theory, Theme Issue 49 (December 2010): 100-117.
Riley, Jenn, and David Becker. “Seeing Standards: A Visualization of the Metadata Universe.” Indiana University Libraries. 2010. www.dlib.indiana.edu/~jenlrile/metadatamap/.
Rimmer, Jon, Claire Warwick, Ann Blandford, Jeremy Gow, and George Buchanan. “An Examination of the Physical and the Digital Qualities of Humanities Research.” Information Processing & Management 44, no. 3 (May 2008): 1374-1392.
Risam, Roopika. “Navigating the Global Digital Humanities: Insights from Black Feminism.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 359-367. Minneapolis, MN: University of Minnesota Press, 2016.
Rivera Monclova, Marta. “Towards an Open Digital Humanities.” In THATCamp Southern California 2011. January 11, 2011. http://socal2011.thatcamp.org/01/11/opendh/.
Rizzo, Mary. “Every Tool Is a Weapon: Why the Digital Humanities Movement Needs Public History.” Public History Commons, November 12, 2012. http://publichistorycommons.org/every-tool-is-a-weapon/.
Robins, K., and F. Webster. Times of the Technoculture: From the Information Society to the Virtual Life. London, UK: Routledge, 2001.
Roberts, Colin H., and T.C. Skeat. The Birth of the Codex. London, UK: Oxford University Press, 1983.
Robertson, Stephen. “The Differences between Digital Humanities and Digital History.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 289-307. Minneapolis, MN: University of Minnesota Press, 2016.
Robertson, Stephen. “Putting Harlem on the Map.” In Writing History in the Digital Age. Eds. Jack Dougherty and Kristen Nawrotzki. http://writinghistory.trincoll.edu/evidence/robertson-2012-spring.
Robertson, Stephen, Shane White, and Stephen Garton. “Harlem in Black and White: Mapping Race and Place in the 1920s.” Journal of Urban History 39, no. 5 (2013): 864-880.
Robinson, Arthur H. The Look of Maps. Madison, WI: University of Wisconsin Press, 1952.
Robinson, Arthur H., and Barbara Bartz Petchenik. The Nature of Maps: Essays toward Understanding Maps and Mapping. Chicago, IL: University of Chicago Press, 1976.
Robinson, P. “Digital Humanities: Is Bigger Better?” In Advancing Digital Humanities: Research, Methods, Theories. Eds. P.L. Arthur and K. Bode. 243-247. Basingstoke, UK: Palgrave Macmillan, 2014.
Robinson, Peter. “Response to Roger Bagnall, ‘Integrating Digital Papyrology.’” In Online Humanities Scholarship: The Shape of Things to Come. Ed. Jerome McGann. 171-88. Houston, TX: Rice University Press, 2010.
Rockenbach, Barbara. “Digital Humanities in Libraries: New Models for Scholarly Engagement.” Journal of Library Administration 53:1 (January 2013).
Rockwell, Geoffrey. “Crowdsourcing the Humanities: Social Research and Collaboration.” In Collaborative Research in the Digital Humanities. Eds. Marilyn Deegan and Willard McCarty. 135-54. Farnham, UK: Ashgate, 2012.
Rockwell, Geoffrey, and S. Sinclair. Hermeneutica: Computer-Assisted Interpretation in the Humanities. Cambridge, MA: MIT Press, 2016.
Rockwell, Geoffrey. “Humanities Computing Challenges.” Theoreti.ca (2004).
Rockwell, Geoffrey. “Inclusion in the Digital Humanities.” philosophi.ca. June 19, 2010. http://www.philosophi.ca/pmwiki.php/Main/InclusionInTheDigitalHumanities.
Rockwell, Geoffrey. “On the Evaluation of Digital Media as Scholarship.” Profession 1 (2011): 152-168.
Rockwell, Geoffrey. “Serious Play at Hand: Is Gaming Serious Research in the Humanities?” Text Technology 12 (2): 89-99.
Rockwell, Geoffrey. “Short Guide to Evaluation of Digital Work.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/short-guide-to-evaluation-of-digital-work-by-geoffrey-rockwell.
Rockwell, Geoffrey. “Thinking-through the History of Computer-Assisted Text Analysis.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, and Ray Siemens. 9-21. New York, NY: Routledge, 2016.
Rockwell, Geoffrey. “The Visual Concordance: The Design of Eye-Contact.” Text Technology 10, no. 1 (2001): 73-86.
Rockwell, Geoffrey. “What is Text Analysis, Really?” Literary and Linguistic Computing 18.2 (2003): 209-220.
Rockwell, Geoffrey, and Stefan Sinclair. “Acculturation and the Digital Humanities Community.” In Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. 177-211. Cambridge, UK: Open Book Publishers, 2012.
Rodowick, D.N. The Virtual Life of Film. Cambridge, MA: Harvard University Press, 2007.
Roegiers, S., and F. Truyen. “History is 3D: Presenting a Framework for Meaningful Historical Representation in Digital Media.” In New Heritage: New Media and Cultural Heritage. Eds. Y.E. Kalay, T. Kvan, and J. Affleck. 67-77. London and New York: Routledge, 2008.
Rogers, Melissa. “Making Queer Feminisms Matter: A Transdisciplinary Makerspace for the Rest of Us.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 234-48. Minneapolis, MN: University of Minnesota Press, 2017.
Rogers, Richard. Digital Methods. Cambridge, MA: MIT Press, 2013.
Rogoff, Irit. “Studying Visual Culture.” In The Visual Culture Reader. Ed. Nicholas Mirzoeff. 14-26. New York, NY: Routledge, 1998.
Rorabaugh, Pete. “Twitter Theory and the Public Scholar.” Hybrid Pedagogy. March 2012.
Rosenfeld, Gabriel. “Why Do We Ask ‘What If?’ Reflections on the Function of Alternative History.” History and Theory 41 (December 2002): 90-103.
Rosenfeld, L., and P. Morville. Information Architecture for the World Wide Web. 2nd ed. Beijing, China: O’Reilly, 2002.
Rosenzweig, Roy. Clio Wired: The Future of the Past in the Digital Age. New York, NY: Columbia University Press, 2011.
Rosenzweig, Roy. “The Road to Xanadu: Public and Private Pathways on the History Web.” Journal of American History 88, 3 (September 2001).
Rosenzweig, Roy. “Scarcity or Abundance? Preserving the Past in a Digital Era.” American Historical Review 108, 3 (June 2003): 735-762. http://chnm.gmu.edu/essays-on-history-new-media/essays/?essayid=6.
Rosner, Daniela K., and Sarah E. Fox. “Legacies of Craft and the Centrality of Failure in a Mother-Operated Hackerspace.” New Media & Society 18, no. 4 (2016): 558-80.
Ross, Andrew. “Hacking Away at the Counterculture.” Postmodern Culture 1, no. 1 (1990).
Ross, Nancy. “Teaching Twentieth-Century Art History with Gender and Data Visualizations.” Journal of Interactive Technology and Pedagogy, Issue 4. http://jitp.commons.gc.cuny.edu/teaching-twentieth-century-art-history-with-gender-and-data-visualizations/.
Ruecker, Stan. “Interface as Mediating Actor for Collection Access, Text Analysis, and Experimentation.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Siemens, and John Unsworth. 397-407. West Sussex, UK: Wiley-Blackwell, 2016.
Ruecker, Stan, Luciano Frizzera, Milena Radzikowska, Geoff Roeder, Ernesto Pena, Teresa Dobson, Geoffrey Rockwell, Susan Brown, and the INKE Research Group. “Visual Workflow Interfaces for Editorial Processes.” Literary and Linguistic Computing 28.4 (2013): 615-628.
Ruecker, Stan, Milena Radzikowska, and Stéfan Sinclair. “Hackfests, Designfests, and Writingfests: The Role of Intense Periods of Face-to-Face Collaboration in International Research Teams.” Digital Humanities 2008. 2008.
Ruecker, Stan, and Milena Radzikowska. “The Iterative Design of a Project Charter for Interdisciplinary Research.” In Proceedings of the 7th ACM Conference on Designing Interactive Systems (DIS ’08). 288-294. Cape Town, South Africa, 2008. http://dl.acm.org/citation.cfm?id=1394476.
Ruecker, S., Milena Radzikowska, and S. Sinclair. Visual Interface Design for Cultural Heritage: A Guide to Rich-Prospect Browsing. Farnham, UK: Ashgate, 2011.
Ruecker, Stan, and Jennifer Roberts-Smith. “Experience Design for the Humanities: Activating Multiple Interpretations.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 259-70. Minneapolis, MN: University of Minnesota Press, 2017.
Rumsey, David, and Edith M. Punt. Cartographica Extraordinaire: The Historical Map Transformed. Redlands, CA: ESRI Press, 2004.
Rumsey, David, and Meredith Williams. “Historical Maps in GIS.” In Past Time, Past Place: GIS for History. 1-18. Redlands, CA: ESRI Press, 2002.
Rush, Michael. New Media in Late 20th Century Art (World of Art). London, UK: Thames and Hudson, 1999.
Rushkoff, D. Program or Be Programmed: Ten Commands for a Digital Age. New York, NY: OR Books, 2010.
Russell, Isabel Galina. “CASE STUDY: Digital Humanities in Mexico.” In Digital Humanities in Practice. Eds. Claire Warwick, Melissa Terras, and Julianne Nyhan. 202-4. London, UK: Facet in association with UCL Centre for Digital Humanities, 2012.
Russell, John. “Teaching Digital Scholarship in the Library: Course Evaluation.” dh + lib. ACRL Digital Humanities Discussion Group, 2013.
Russo, A., and J. Watkins. “Digital Cultural Communication: Audience and Remediation.” In Theorizing Digital Cultural Heritage: A Critical Discourse. Eds. F. Cameron and S. Kenderdine. 149-164. Cambridge, MA: MIT Press, 2007.
Ryan, Marie-Laure, ed. Cyberspace Textuality: Computer Technology and Literary Theory. Bloomington, IN: Indiana University Press, 1999.
Ryan, M.-L. “Defining Narrative Media.” Image and Narrative: Online Magazine of the Visual Narrative 6 (2003). http://www.imageandnarrative.be/inarchive/mediumtheory/marielaureryan.htm.
Rybicki, Jan, Maciej Eder, and David L. Hoover. “Computational Stylistics and Text Analysis.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, and Ray Siemens. 123-44. New York, NY: Routledge, 2016.
Rydberg-Cox, Jeffrey A. Digital Libraries and the Challenges of Digital Humanities. Chandos Information Professional Series. Oxford, UK: Chandos Publishing, 2006.
Sabharwal, Arjun. Digital Curation in the Digital Humanities: Preserving and Promoting Archival and Special Collections. Oxford, UK: Chandos Publishing, 2015.
Sabharwal, Arjun. “Digital Directions in Academic Knowledge Management: Visions and Opportunities for Digital Initiatives at the University of Toledo.” Special Libraries Association 2010 Annual Conference & INFO-EXPO. 2010.
Sabharwal, Arjun. “Digital Representations of Disability History: Developing a Virtual Exhibition at the Ward M. Canaday Center, University of Toledo.” Archival Issues: Journal of the Midwest Archives Conference 34, 1 (2012): 7-21.
Saenger, Paul. Space Between Words: The Origins of Silent Reading. Stanford, CA: Stanford University Press, 1997.
Saint-Martin, Fernande. Semiotics of Visual Language. Translated by Fernande Saint-Martin. Bloomington, IN: Indiana University Press, 1997.
Saklofske, Jon, Estelle Clements, and Richard Cunningham. “They Have Come, Why Won’t We Build It? On the Digital Future of the Humanities.” In Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. 311-30. Cambridge, UK: Open Book Publishers, 2012. http://www.openbookpublishers.com/htmlreader/DHP/chap13.html.
Salen, Katie, and Eric Zimmerman. Rules of Play: Game Design Fundamentals. Cambridge, MA: MIT Press, 2004.
Saler, Michael. “The Hidden Cost: Review of To Save Everything, Click Here, by Evgeny Morozov.” The Times Literary Supplement (May 24, 2013): 3-4.
Salter, Anastasia. What is Your Quest?: From Adventure Games to Interactive Books. Iowa City, IA: University of Iowa Press, 2014.
Salter, C. Entangled: Technology and the Transformation of Performance. Cambridge, MA: MIT Press, 2012.
Salway, Benet. “Travel, Itinerary and Tabellaria.” In Travel and Geography in the Roman Empire. Eds. Colin Adams and Ray Laurence. 22-66. London and New York: Routledge, 2001.
Sample, Mark. “Difficult Thinking about the Digital Humanities.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 510-513. Minneapolis, MN: University of Minnesota Press, 2016.
Sample, Mark. “The Digital Humanities Is Not about Building, It’s about Sharing.” Sample Reality (blog), May 25, 2011. http://www.samplereality.com/2011/05/25/the-digital-humanities-is-not-about-building-its-about-sharing/.
Sample, Mark. “On the Death of the Digital Humanities Center.” Sample Reality (blog). March 26, 2010. http://www.samplereality.com/2010/03/26/on-the-death-of-the-digital-humanities-center/.
Sample, Mark. “Renetworking House of Leaves in the Digital Humanities.” Sample Reality (blog). August 18, 2011.
Sample, Mark. “Resisting Technology: The Right Idea for All the Wrong Reasons.” Works and Days 16, no. 1-2 (1998): 423-426.
Sample, Mark. “Tenure as a Risk-Taking Venture.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/tenure-as-a-risk-taking-venture-by-mark-sample/.
Sample, Mark. “Unseen and Unremarked On: Don DeLillo and the Failure of the Digital Humanities.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 187-201. Minneapolis, MN: University of Minnesota Press, 2012.
Sample, Mark. “What’s Wrong with Writing Essays.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 404-405. Minneapolis, MN: University of Minnesota Press, 2012.
Sanchez, Elie. Fuzzy Logic and the Semantic Web. New York, NY: Elsevier, 2006.
Sandweiss, Martha A. “Artifacts as Pixels, Pixels as Artifacts: Working with Photographs in the Digital Age.” Perspectives on History (November 2013).
Sandweiss, Martha A. “Image and Artifact: The Photograph as Evidence in the Digital Age.” Journal of American History 92 (2007): 193-202.
Sau-Dufrene, Bernadette, ed. Heritage and Digital Humanities: How Should Training Practices Evolve? LIT Verlag, 2014.
Sayers, Jentery. “Dropping the Digital.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 475-492. Minneapolis, MN: University of Minnesota Press, 2016.
Sayers, Jentery. How Text Lost Its Source: Magnetic Recording Cultures. PhD dissertation, University of Washington, 2011.
Sayers, Jentery. “I Don’t Know All the Circuitry.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 1-20. Minneapolis, MN: University of Minnesota Press, 2017.
Sayers, Jentery, ed. Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Minneapolis, MN: University of Minnesota Press, 2017.
Sayers, Jentery. “Project Snapshot: ‘AIDS Quilt Touch’: Virtual Quilt Browser.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 271. Minneapolis, MN: University of Minnesota Press, 2017.
Sayers, Jentery. “Project Snapshot: Bibliocircuitry and the Design of the Alien Everyday, 2012-13.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 162. Minneapolis, MN: University of Minnesota Press, 2017.
Sayers, Jentery. “Project Snapshot: Designs for Foraging: Fruit Are Heavy, 2015-16.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 257-8. Minneapolis, MN: University of Minnesota Press, 2017.
Sayers, Jentery. “Project Snapshot: Fashioning Circuits, 2011-Present.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 232-3. Minneapolis, MN: University of Minnesota Press, 2017.
Sayers, Jentery. “Project Snapshot: Glitch Console.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 185-6. Minneapolis, MN: University of Minnesota Press, 2017.
“Project Snapshot: Glitch Console.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 185-6. Minneap- olis, MN: University of Minnesota Press, 2017. Sayers, Jentery. “Projects Snapshot: Loss Sets.” In Making Things and Drawing Bounda- ries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 205. Minneapolis, MN: University of Minnesota Press, 2017. Sayers, Jentery. “Project Snapshot: Made: Technology on Affluent Leisure Time.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jen- tery Sayers. 128-9. Minneapolis, MN: University of Minnesota Press, 2017. Sayers, Jentery. “Project Snapshot: MashBOT.” In Making Things and Drawing Bounda- ries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 55-56. Minneapolis, Min- nesota: University of Minnesota Press, 2017. Sayers, Jentery. “Project Snapshot: Mic Jammer.” In Making Things and Drawing Bound- aries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 115. Minneapolis, MN: University of Minnesota Press, 2017. Sayers, Jentery. “Project Snapshot: Movable Party.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 249-50. Minneap- olis, MN: University of Minnesota Press, 2017. Sayers, Jentery. “Prototyping the Past.” Visible Language 49, no. 3 (2015): 157-77. Sayer, Jentery. Teaching and Learning Multimodal Communications. 2013. http://sca- lar.usc.edu/maker/english-507/index. Sayers, Jentery. “Technology.” In Keywords for American Cultural Studies. 2nd edition. Eds. B. Burnett and G. Hendler. New York, NY: New York University Press. http://hdl.handle.net/2333.1/rr4xh08x). Sayers, Jentery. “Why Do Marketspaces Matter for the Humanities? For Writing Cen- ters?” Two Year College Association Pacific-Northwest, October 26, 2013. http://www.maker.uvic.ca/pnwca2013/#/title. 122 Sayers, Jentery, Devon Elliot, Kari Kraus, Bethany Nowviskie, and William J Turkel. “Be- tween Bits and Atoms: Physical Computing and Desktop Fabrication in the Humanities.” In A New Companion to Digital Humanities. Eds. by Susan Schreibman, Ray Siemens, and John Unsworth. 3-21. West Sussex, UK: Wiley-Blackwell, 2016. Sayers, Jentery, J. Boggs, D. Elliott, and W.J. Turkel. “Made to Make: Expanding Digital Humanities through Desktop Fabrication.” Digital Humanities. Scalar. http://scalar.usc.edu/. Schaffner, J., and R. Erway. “Does Every Research Library Need a Digital Humanities Cen- ter?” Dublin, OH: OCLC Research. http://www.oclc.org/content/dam/research/publica- tions/library/2014/oclcresearch-digital-humanities-center-2014.pdf. Schama, Simon. Landscape and Memory. New York, NY: Random House, 1996. Schantz, H. The History of OCR, Optical Character Recognition. Manchester Center, VT: Recognition Technologies Users Association, 1982. Scheindfeldt, Tom. “The Dividends of Difference: Recognizing Digital Humanities’ Di- verse Family Tree/s.” Found History. April 7, 2014. http://foundhistory.org/2014/04/the- dividends-of-difference-recognizing-digital-humanities-diverse-family-trees/. Scheindfeldt, Tom. “Stuff Digital Humanities Like: Defining Digital Humanities by its Val- ues.” Found History. December 2, 2010. http://www.foundhistory.org/2010/12/02/stuff-digital-humanists-like/. Scheindfeldt, Tom. “Sunset for Ideology, Sunrise for Methodology?” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 124-126. Minneapolis, MN: University of Min- nesota Press, 2012. Scheindfeldt, Tom. 
“’Where’s the Beef?’” Does Digital Humanities Have to Answer Ques- tions?” In Debates in the Digital Humanities. Ed. Matthew K. Gold 56-58. Minneapolis, MN: University of Minnesota Press, 2012. Scheindfeldt, Tom. “Why Digital Humanities is ‘Nice’?”. In Debates in the Digital Humani- ties. Ed. Matthew K. Gold. 59-60. Minneapolis, MN: University of Minnesota Press, 2012. Schell, J. The Art of Game Design: A Book of Lenses. Amsterdam and Boston: Else- vier/Morgan Kaufmann. Schmidt, Desmond. “The Inadequacy of Embedded Markup for Cultural Heritage Texts.” Literacy and Linguistic Computing 25, no. 3 (2010): 337-356. 123 Schmidt, Benjamin. “Do Digital Humanists Need to Understand Algorithms?” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 546-555. Minneapolis, MN: University of Minnesota Press, 2016. Schmidt, Benjamin. “Words Alone: Dismantling Topic Models in the Humanities.” Jour- nal of Digital Humanities 2, no. 1 (2013). http://journalofdigitalhumanities.org/2- 1/words-alone-by-benjamin-m-schmidt/. Schnapp, Jeffrey and Matthew Battles. The Library Beyond the Book (metaLABProjects). Cambridge, MA: Harvard University Press, 2014. Shneiderman, Ben. Leonardo's Laptop: Human Needs and the New Computing Technolo- gies. Cambridge, MA: MIT Press, 2002. Schöch, Christof. “Big? Smart? Clean? Messy? Data in the Humanities.” Journal of Digital Humanities 2.3 (2013): 2-13. Scholle, David. “Resisting Disciplines: Repositioning Media Studies in the University.” Communication Theory, 5 (1995): 130-43. Scholz, Sandra, and Robert Chenhall. “Archaeological Data Banks in Theory and Prac- tice.” American Antiquity 41, no. 1 (1976): 89-96. Scholz, R. Trebor, ed. Digital Labor: The Internet as Playground and Factory. New York, NY: Routledge, 2013. Scholz, R. Trebor. Digital Labor: New Opportunities, Old Inequalities. Re:public, 2013. May 7, 2013. video. http://www.youtube.com/watch?v=52CqKIR0rVM. Scholz, R. Trebor. Learning Through Digital Media. New York, NY: Institute for Distrib- uted Creativity, 2011. Schreibman, Susan. “Computer-mediated Texts and Textuality: Theory and Practice.” Computers and the Humanities 36, no. 3 (2002): 283-293. http://www.jstor.org/pss/30200528. Schreibman, Susan. “Digital Scholarly Editing.” Literary Studies in the Digital Age: An Evolving Anthology. Eds., Kenneth M. Price and Ray Siemens. Modern Language Associa- tion, 2013. Schreibman, Susan, Ray Siemens, and John Unsworth. A Companion to Digital Humani- ties. West Sussex, UK: Wiley-Blackwell, 2004. www.digitalhumanities.org/companion. 124 Schreibman, Susan, Ray Siemens, and John Unsworth, eds. A New Companion to Digital Humanities. xxiii-xxvii. West Sussex, UK: Wiley-Blackwell, 2016. Schreibman, Susan, Laura Mandela, and Olsen Stephen. “Introduction: Evaluating Digital Scholarship.” Profession 1 (2011): 123-201. http://www.mlajour- nals.org/doi/abs/10.1632/prof.2011.2011.1.123. Schuler, Douglas, and Aki Namioka. Participatory Design: Principles and Practices. Mah- wah, NJ: Erlbaum, 1993. Schulz, Kathryn. Being Wrong: Adventures in the Margin of Error. New York, NY: Harper Collins, 2010. Schulz, Kathryn. “What Is Distant Reading?” New York Times Sunday Book Review, June 24, 2011. http://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse- what-is-distant-reading.html?_r=0. Schuurman, N. “Trouble in the heartland: GIS and its Critics in the 1990s.” Progress in Human Geography, 24, 4 (2000): 569-90. ScoreAHit. “The Hit Equation.” http://scoreahit.com/TheHitEquation/. Seaman, David. 
“GIS and the Frontier of Digital Access: Application of GIS Technology in the Research Library.” Paper presented at Future Foundations: Mapping the Past-Build- ing the Greater Philadelphia GeoHistory Network. Chemical Heritage Foundation, Phila- delphia, PA. 2005. Segel, Edward, and Jeffrey Heer. “Narrative Visualisation: Telling Stories with Data.” TVCG 16, 6, (2010): 1139-48. Selfe, Cynthia. “Computers in English Departments: The Rhetoric of Technopower.” ADE Bulletin 90 (1988): 63-67. http://www.mla.org/adefl_bulle- tin_c_ade_90_63&from=adefl_bulletin_t_ade90_0. Selfe, Cynthia. and G. Hawisher. Literate Lives in the Information Age: Narratives of Liter- acy from the United States. Mahwah, NJ: Lawrence Erlbaum, 2004. Selfe, Cynthia. Technology and Literacy in the Twenty-First Century: The Importance of Paying Attention. Carbondale, IL: Southern Illinois University Press, 1999. Selisker, Scott. “Digital Humanities Knowledge: Reflections on the Introductory Gradu- ate Syllabus.” In Debates in the Digital Humanities. Eds. Matthew K. Gold, and Lauren Klein. 194-198. Minneapolis, MN: University of Minnesota Press, 2016 125 Senchyne, Jonathan. “Between Knowledge and Metaknowledge: Shifting Disciplinary Borders in Digital Humanities and Library and Information Studies.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 368-376. Minneapolis, MN: University of Minnesota Press, 2016. She, Sydney J. “Digital Materiality.” In A New Companion to Digital Humanities. Eds. Su- san Schreibman, Ray Siemens, and John Unsworth. 322-330. West Sussex, UK: Wiley- Blackwell, 2016. Sheppard, Eric. “Knowledge Production through Critical GIS: Genealogy and Prospects.” Cartographica 40:4 (2005): 5-21. Sherman, Erica. “Urban Agents: Confraternities, Devotion and the Formation of a New Urban State in Eighteenth-Century Minas Gerais.” PhD dissertation, Duke University, 2017. Sherratt, Tim. “It’s all about the Stuff: Collections, Interfaces, Power and People.” Di- scontents. November 2011. http://journalofdigitalhumanities.org/1-1/its-all-about-the- stuff-by-tim-sherratt/ Shillingsburg, Peter L. From Gutenberg to Google: Electronic Representations of Literary Texts. Cambridge, MA: Cambridge University Press, 2006. Shillingsburg, Peter L. “Principles for electronic Archives, Scholarly Editions, and Tutori- als.” In The Literary Text in the Digital Age. Ed. Richard J. Finneran. 23-35. Ann Arbor, MI: University of Michigan Press, 1996. Shields, R. “The Virtual.” In Key Ideas. London and New York: Routledge, 2002. Shirazi, Roxane. “Reproducing the Academy: Librarians and the Question of Service in the Digital Humanities.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 86-94. Minneapolis, MN: University of Minnesota Press, 2017. Sholette, Gregory. “Disciplining the Avant-Garde: The United States Versus the Critical Art Ensemble.” Circa 112 (2005): 50-59. http://www.jstor.org/pss/25564316. Shopes, Linda. “Making Sense of Oral History.” Oral History in the Digital Age. http://ohda.matrix.msu.edu/2012/08/making-sense-of-oral-history/. Shore, Daniel. “WWJD? The Genealogy of a Syntactic Form.” Critical Inquiry. 37, no. 1 (2010): 1–25. 126 Short, H., and J. Nyhan. “‘Collaboration Must be Fundamental or It’s not Going to Work’: an Oral History.” DHQ: Digital Humanities Quarterly. 3 (2) (2009). Showers, Ben. “Does the Library Have a Role to Play in the Digital Humanities?” JISC Dig- ital Infrastructure Team, February 23, 2012. 
http://infteam.jiscin- volve.org/wp/2012/02/23/does-the-library-have-a-role-to-play-in-the-digital-humani- ties/. Siebert, Loren. “Using GIS to Document, Visualize, and Interpret Tokyo’s Spatial History.” Social Science History 24:3 (2000): 537-74. Siefring, Judith. “SECT (Sustaining the EBBO-TCP Corpus in Translation).” JISC. (2013). https://www.webarchive.org.uk/wayback/ar- chive/20140614062112/http://www.jisc.ac.uk/whatwedo/programmes/preserva- tion/SECT.aspx. Siemens, Lynne. “Project Management and the Digital Humanist.” In Doing Digital Hu- manities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 343-57. New York, NY: Routledge, 2016. Siemens, Lynne. “‘It’s a Team If You Use ‘Reply All’: An Exploration of Research Teams in Digital Humanities Environments.” Literary and Linguistic Computing 24, no. 2 (June 1, 2009): 225 -233. Siemens, Lynne, Ray Siemens, Richard Cunningham, Teresa Dobson, Alan Galey, Stan Ruecker, and Claire Warwick. “INKE Administrative Structure, Omnibus Document.” New Knowledge Environments 1, no. 1. 2009. http://journals.uvic.ca/index.php/INKE/arti- cle/view/546/245. Siemens, Lynne, Richard Cunningham, Wendy Duff, and Claire Warwick. “A Tale of Two Cities: Implications of the Similarities and Differences in Collaborative Approaches Within the Digital Libraries and Digital Humanities Communities.” Literary and Linguistic Computing 26, no. 3 (2011): 335 -348. Siemens, Raymond, and S. Schreibman, eds. A Companion to Digital Literary Studies. Ox- ford, UK: Blackwell, 2007. Siemens, Raymond. and J. Sayers. “Toward Problem-Based Modeling in the Digital Hu- manities.” In Between Humanities and the Digital. Eds, P. Svensson and D.T. Goldberg. Cambridge, MA: MIT Press, 2015. 127 Siemens, Raymond, et al. “Human-Computer Interface/Interaction and the Book: A Con- sultation-derived Perspective on Foundational E-Book Research.” In Collaborative Re- search in the Digital Humanities. Eds. Marilyn Deegan and Willard McCarty. 162-89. Farnham, UK: Ashgate, 2012. Silberschatz, A., H.F. Korth, and S. Sudarshan, eds. Database System Concepts, 3rd edi- tion. New York, NY: McGraw-Hill, 1996. Simon, Herbert A. “Understanding the Natural and the Artificial Worlds.” In The Sciences of the Artificial, 3rd ed., 1–24. Cambridge and London: The MIT Press, 2000. Simon, Nina. The Participatory Museum. http://www.participatorymuseum.org/. Simsion, G. Data Modeling: Theory and Practice. Bradley Beach, NJ: Technics Publica- tions, 2007. Sinclair, S., S. Ruecker, and M. Radzikowska. “Information Visualization for Humanities Scholars.” In Literary Studies in the Digital Age: A Methodological Primer. Eds. K. Price and R. Siemens. New York, NY: MLA Commons, 2013. Sinclair, Stéfan, and Geoffrey Rockwell. “Text Analysis and Visualization: Making Mean- ing Count.” In A New Companion to Digital Humanities. Eds. Susan Schreibman, Ray Sie- mens, and John Unsworth. 274-290. West Sussex, UK: Wiley-Blackwell, 2016. Sinclair, Stéfan, and Geoffrey Rockwell. “Towards an Archaeology of Text Analysis Tools.” Digital Humanities 2014. 2014. Sinclair, Stéfan, Stan Ruecker, and Milena Radzikowska. “Information Visualization for Humanities Scholars.” Literary Studies in the Digital Age: An Evolving Anthology. Eds. Kenneth M. Price and Ray Siemens. MLA Commons. Modern Language Association of America. 2013. Sinton, Diana S., and Jennifer J. Lund. Understanding Place: GIS and Mapping Across the Curriculum. Redlands, CA: ESRI Press, 2007. Slack, Jennifer Daryl, and John Macgregor Wise. 
Culture and Technology: A Primer. New York, NY: Peter Lang, 2005. Slade, G. Made to Break: Technology and Obsolescence in America. Cambridge, MA: Har- vard University Press, 2006. Smith, H., and R. Dean. Practice-Led and Research-led Practice. Edinburgh, UK: Edin- burgh University Press, 2009. 128 Smith, I.G., ed. The Internet of Things 2012. New Horizons. Internet of Things European Research Cluster, 2012. Smith, J.B. “Computer Criticism.” STYLE XII, 4 (1978): 326-356. Smith, J.B. “Image and Imagery in Joyce’s Portrait: A Computer-Assisted Analysis.” Direc- tions in Literary Criticism: Contemporary Approaches to Literature. Eds. S. Weintraub and P. Young. 220-227. University Park, PA: The Pennsylvania State University Press, 1973. Smith, J.B. “A New Environment for Literary Analysis.” Perspectives in Computing 4, 2/3, (1984): 20-31. Smith, James. “Working with the Semantic Web.” Doing Digital Humanities: Practice, Training, Research. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 273-88. New York, NY: Routledge, 2016. Smith, Martha Nell. “Electronic Scholarly Editing.” In A Companion to Digital Humanities. Eds. Ray Siemens, John Unsworth, and Susan Schreibman. Oxford, UK: Blackwell, 2004. http://www.digitalhumanities.org/companion/. Smith Rumsey, Abby. “Creating Value and Impact in the Digital Age Through Transla- tional Humanities.” Washington, DC: Council on Library and Information Resources. 2013. Smith Rumsey, Abby. “Report of the Scholarly Communication Institute 8: Emerging Genres in Scholarly Communication.” Scholarly Communication Institute, University of Virginia Library, July 2010. Smithies, James. “Evaluating Scholarly Digital Outputs: The 6 Layers Approach.” Journal of Digital Humanities 1, no. 4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/eval- uating-scholarly-digital-outputs-by-james-smithies/. Smithies, James. “Full Stack DH: Building a Virtual Research Environment.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Say- ers. 102-14. Minneapolis, MN: University of Minnesota Press, 2017. Smithies, James. “Introduction to Digital Humanities.” March 14, 2012. http://jamessmithies.org/2012/03/14/introduction-to-digital-humanities/. Smithsonian “Smithsonian Digital Volunteers.” Smithsonian Digital Volunteers. https://transcription.si.edu. 129 Smithsonian Social Media Policy. 2011. http://www.si.edu/content/pdf/about/sd/SD- 814.pdf. Sneha, P.P. “Making Humanities in the Digital: Embodiment and Framing in Bichitra and Indiancine.ma.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 57-70. Minneapolis, MN: University of Minnesota Press, 2017. Snow, C.P. The Two Cultures and the Scientific Revolution. New York, NY: Cambridge Uni- versity Press, 1959. Snyder, Susan. The Comic Matrix of Shakespeare’s Tragedies: Romeo and Juliet, Hamlet, Othello, and King Lear. Princeton, NJ: Princeton University Press, 1979. Soja, Edward. Postmodern Geographies: The Reassertion of Space in Critical Social The- ory. London, UK: Verso, 1989. Somerson, R., and M. Hermano, eds. The Art of Critical Making: Rhode Island School of Design on Creative Practice. Hoboken, NJ: John Wiley & Sons, Inc, 2013. Sorapure, Madeleine. “Between Modes: Assessing Student New Media Compositions.” Kairos 10, No. 2 (2005): 4-14. “Sorting Algorithms as Dances.” 2011. https://www.i-programmer.info/news/150-train- ing-a-education/2255-sorting-algorithms-as-dances.html. (January 3, 2019). Sousanis, Nick. Unflattening. 
Cambridge, MA: Harvard University Press, 2015. Southall, Humphrey R. “Applying Historical GIS Beyond the Academy: Four Use Cases for the Great Britain HGIS.” In Toward Spatial Humanities. Bloomington, IN: Indiana Univer- sity Press, 2010. Spatial Humanities. spatial.scholarslab.org. Speck, R., and P. Links. “The Missing Voice: Archivists and Infrastructures for Humanities Research.” In International Journal of Humanities and Arts Computing 7 (1-2) (2013): 128-146. doi: 10.3366/ijhac.2013.0085. Sperberg-McQueen, C.M. “Classification and its Structures”. In A New Companion to Digital Humanities. Eds. by Susan Schreibman, Ray Siemens, and John Unsworth. 377- 394. West Sussex, UK: Wiley-Blackwell, 2016. 130 Spiro, Lisa. “Collaborative Authorship in the Humanities.” Digital Scholarship in the Hu- manities. April 21, 2009. http://digitalscholarship.wordpress.com/2009/04/21/collabo- rative-authorship-in-the-humanities/. Spiro, Lisa. “Computing and Communicating Knowledge: Collaborative Approaches to Digital Humanities Projects.” http://ccdigitalpress.org/cad/Ch2_Spiro.pdf. Spiro, Lisa. Digital Research Tools (DiRT) Wiki. https://digitalresearch- tools.pbworks.com/w/page/17801672/FrontPage. Spiro, Lisa. “Examples of Collaborative Digital Humanities Projects.” Digital Scholarship in the Humanities, June 1, 2009. http://digitalscholarship.word- press.com/2009/06/01/examples-of-collaborative-digital-humanities-projects/. Spiro, Lisa. “Getting Started in Digital Humanities.” Journal of Digital Humanities, vol 1, no. 1 (2011). http://journalofdigitalhumanities.org/1-1/getting-started-in-digital-human- ities-by-lisa-spiro/ Spiro, Lisa. “Getting Started in the Digital Humanities.” Digital Scholarship in the Human- ities. October 14, 2011. http://digitalscholarship.wordpress.com/2011/10/14/getting-started-in-the-digital-hu- manities/. Spiro, Lisa. “Opening Up Digital Humanities Education”. Digital Scholarship in the Hu- manities. September 8, 2010. http://digitalscholarship.word- press.com/2010/09/08/opening-up-digital-humanities-education/. Spiro, Lisa. “’This Is Why We Fight’: Defining the Values of the Digital Humanities.” In De- bates in the Digital Humanities. Ed. Matthew K. Gold. Minneapolis, MN: University of Minnesota Press, 2012. Spiro, Lisa. “Tips on Writing a Successful Grant Proposal.” Digital Scholarship in the Hu- manities, September 9, 2008. http://digitalscholarship.wordpress.com/2008/09/09/tips- on-writing-a-successful-grant-proposal/. Srinivasan, Ramesh. “Taking Power Through Technology in the Arab Spring.” Al Jazeera. October 25, 2012. http://www.aljazeera.com/indepth/opin- ion/2012/09/2012919115344299848.html. Srinivasan, Ramesh, Katherine M. Becvar, Robin Boast, and Jim Enote. “Diverse Knowl- edges and Contact Zones within the Digital Museum.” Science, Technology, and Human Values 35, no. 5 (2010): 735-768. 131 Srinivasan, R., J. Enote, K. Becvar, and R. Boast. “Critical and Reflective Uses of New Me- dia in Tribal Museums.” Museum Management and Curatorship, 24, 2 (2009): 161-181. Srinvasan, Ramesh, and Jeffrey Huang. “Fluid Ontologies for Digital Museums.” Interna- tional Journal on Digital Libraries 5, no. 3 (2005): 193-204. Staley, David J. Brain, Mind and Internet: A Deep History and Future. Basingstoke, UK: Palgrave Pivot, 2014. Staley, David J. Computers, Visualization, and History: How New Technology Will Trans- form Our Understanding of the Past. Armonk, NY: M.E. Sharpe, 2003. Staley, David J. "Historical Visualizations." 
Journal of the Association for History and Computing 3, no. 3 (2000). Staley, David J. “On the ‘Maker Turn’ in the Humanities.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 32-41. Minneap- olis, MN: University of Minnesota Press, 2017. Staley, David J. “Visual Historiography: Toward an Object-Oriented Hermeneutics.” The American Historian. https://tah.oah.org/content/visual-historiography/. Staley, David J., Scott A. French, and Bill Ferster. “Visual Historiography: Visualizing ‘The Literature of a Field’.” Journal of Digital Humanities 3, no. 1 (Spring 2014). Steinkuehler, Constance. “Massively Multiplayer Online Gaming as a Constellation of Lit- eracy Practices.” E-learning 4.3 (2007): 297-318. Sternberg, S. H. Five Hundred Years of Printing. New York, NY: Criterion Books, 1959. Sternfeld, J. “Archival Theory and Digital Historiography: Selection, Search, and Metadata as Archival Processes for Assessing Historical Contextualization.” The Ameri- can Archivist 74, 2 (2011): 544-575. Stertzer, Jennifer. “Foundations for Digital Editing, with Focus on the Documentary Tra- dition.” In Doing Digital Humanities: Practice, Training, Research. Eds. Constance Cromp- ton, Richard J. Lane, Ray Siemens. 243-54. New York, NY: Routledge, 2016. Strommel, Jesse. “The Twitter Essay.” Hybrid Pedagogy (January 2012). Suber, Peter. Open Access. Cambridge, MA: MIT, 2012. 132 Suda, Brian, and Sam Hampton Smith. “The 20 Best Tools for Data Visualization.” Crea- tive Bloq. Future Publishing Limited, 2013. https://www.creativebloq.com/design- tools/data-visualization-712402. Sullivan, Elaine, Angel David Nieves, and Lisa M. Snyder. “Making the Model: Scholarship and Rhetoric in 3-D Historical Reconstructions.” In Making Things and Drawing Bounda- ries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 301-18. Minneapolis, MN: University of Minnesota Press, 2017. Sukovic, Suzana. “Beyond the Scriptorium: The Role of the Library in Text Encoding.” D- Lib Magazine 8, no. 1 (January 2002). http://www.dlib.org/dlib/january02/sukovic/01su- kovic.html. Suri, V.R. “The Assimilation and Use of GIS by Historians: a Socio-technical Interaction Networks (STIN) Analysis.” International Journal of Humanities and Arts Computing, 5, 2 (2011): 159-188. Stafford, Barbara Maria. Good Looking: Essays on the Virtue of Images. Cambridge, MA: MIT Press, 1996. Stauffer, Andrew. “Digital Scholarly Resources for the Study of Victorian Literature and Culture.” Victorian Literature and Culture 39 (2011): 293-303. Stauffer, Andrew. “My Old Sweethearts: On Digitalization and the Future of the Print Record.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 218-229. Minneapolis, MN: University of Minnesota Press, 2016. Sterling, Bruce. Shaping Things. Cambridge, MA: MIT Press, 2005. Stern, Fritz, ed. The Varieties of History: From Voltaire to the Present. New York, NY: Vin- tage Books, 1972. Sterne, Jonathan. MP3: The Meaning of a Format. Durham, NC: Duke University Press, 2012. Sternfeld, Joshua. “Pedagogical Principles of Digital Historiography.” In Digital Humani- ties Pedagogy: Practices, Principles and Policies. Ed. Brett D. Hirsch. 255-290. Cambridge, UK: Open Book Publishers, 2012. Stone, A.R. The War of Desire and Technology at the Clone of the Mechanical Age. Cam- bridge, MA: MIT Press, 1996. Stone, Michael. “Map or Be Mapped.” Whole Earth (Fall 1998): 54. 133 Stone, S. 
“Humanities Scholars: Information Needs and Uses.” Journal of Documentation 38 (4) (1982): 292-313. Strate, Lance. “Studying Media as Media: McLuhan and the Media Ecology Approach.” MediaTropes 1 (2008): 127-142. http://www.mediatropes.com/index.php/Medi- atropes/article/view/3344/1488. Sturm, Sean, and Stephen Francis Turner. “Digital Caricature.” Digital Humanities Quar- terly 8, no. 3 (2014). http://www.digitalhumani- ties.org/dhq/vol/8/3/000182/000182.html. Suchman, Lucille Alice. Human-Machine Reconfigurations: Plans and Situated Actions. Cambridge and New York: Cambridge University Press, 2007. Sui, Daniel Z. “GIS, Cartography, and the ‘Third Culture’: Geographic Imaginations in the Computer Age.” Professional Geographer 56 (2004): 62-72. Sula, Chria Alen. “Digital Humanities and Libraries: A Conceptual Model.” Journal of Li- brary Administration 53:1 (January 2013). Summit on Digital Tools for the Humanities. The Institute for Advanced Technology in the Humanities – University of Virginia, 2006. http://www.iath.vir- ginia.edu/dtsummit/SummitText.pdf. Sunstein, Cass R. Infotopia: How Many Minds Produce Knowledge. New York, NY: Oxford University Press, 2006. “Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Infor- mation.” Washington, DC: Blue Ribbon Task Force on Sustainable Digital Preservation and Access, February 2010. http://brtf.sdsc.edu/biblio/BRTF_Final_Report.pdf. Svensson, Patrik. “Beyond the Big Tent.” In Debates in the Digital Humanities. Ed. Mat- thew K. Gold. Minneapolis, MN: University of Minnesota Press, 2012. Svensson, Patrik. Big Digital Humanities: Imagining a Meeting Place for the Humanities and the Digital. Ann Arbor, MI: University of Michigan Press, 2016. Svennson, Patrik. “The Digital Humanities as a Humanities Project.” Arts and Humanities in Higher Education 11 (1-2) (2012): 42-60. Svensson, Patrik. “Humanities Computing as Digital Humanities.” Digital Humanities Quarterly 3, no. 3 (2009). http://digitalhuma- nities.org/dhq/vol/3/3/000065/000065.html 134 Svensson, Patrik. “The Landscape of Digital Humanities.” DHQ: Digital Humanities Quar- terly 4, no. 1 (Summer 2010). http://digitalhuma- nities.org/dhq/vol/4/1/000080/000080.html Svensson, Patrik. “Sorting out the Digital Humanities.” In A New Companion to Digital Humanities. Eds. by Susan Schreibman, Ray Siemens, and John Unsworth. 476-492. West Sussex, UK: Wiley-Blackwell, 2016. Svensson, Patrik. “A Visionary Scope of the Digital Humanities.” HUMLab Blog. February 23, 2011. http://blog.humlab.umu.se/?p=2894. Svensson, Patrik and David Theo Goldberg, eds. Between Humanities and the Digital. Cambridge, MA: MIT Press, 2015. Swafford, Joanna. “Messy Data and Faulty Tools.” In Debates in the Digital Humanities. Eds. Matthew K. Gold, and Lauren Klein. 556-558. Minneapolis, MN: University of Min- nesota Press, 2016. Szabo, Victoria. “Transforming Art History Research with Database Analytics: Visualizing Art Markets.” Art Documentation 31: 2 (2012): 158-175. TaDiRAH. “TaDiRAH: Taxonomy of Digital Research Activities in the Humanities.” Dariah. 2014. http://tadirah.dariah.eu/vocab/index.php. Tally, R. Melville, Mapping and Globalization: Literary Cartography in the American Ba- roque Writer. London, UK: Continuum, 2009. Tanner, Simon. “Inspiring Research, Inspiring Scholarship. The Value and Benefits of Dig- itized Resources for Learning, Teaching, Research and Enjoyment.” Proceedings of Ar- chiving 2011. 77-82. Arlington, VA: Society for Imaging Science and Technology, 2011. 
Tanner, Simon. Measuring the Impact of Digital Resources: Balanced Value Impact Model. London, UK: King’s College, October 2012. http://www.kdcs.kcl.ac.uk/innova- tion/impact.html. Tanner, Simon. and G. Bearman. “Digitising the Dead Sea Scrolls.” Proceedings of Archiv- ing 2009. 119-23. Arlington, VA: Society for Imaging Science and Technology, 2009. Tanner, Simon, Laura Gibson, Rebecca Kahn, and Geoff Laycock. “Choices in Digitisaion for the Digital Humanities.” Research Methods for Creating and Curating Data in the Dig- ital Humanities. Eds. Matt Hayler and Gabriele Griffin. 14-43. Edinburgh, UK: Edinburgh University Press, 2016. 135 Tanopir, Carol, et al. Trust and Authority in Scholarly Communications in the Light of the Digital Transition: Final Report. University of Tennessee and CIBER Research ltd, 2013. Tate, Nicholas J., and Peter M. Atkinson, eds. Modelling Scale in Geographical Infor- mation Science. Chichester, UK: Wiley, 2001. Taylor, Pamela. “Critical Thinking in and Through Interactive Computer Hypertext and Art Education.” Innovate: Journal of Online Education 2, no. 3 (2006): 1-7. Taylor, Tina L. Play Between Worlds: Exploring Online Game Culture. Cambridge, MA: MIT Press, 2006. Teboul, Ezra. “Electronic Music Hardware and Open Design Methodologies for Post-Op- timal Objects.” In Making Things and Drawing Boundaries: Experiments in the Digital Humanities. Ed. Jentery Sayers. 177-84. Minneapolis, MN: University of Minnesota Press, 2017. TEI (Textual Encoding Initiative Consortium). http://www.tei-c.org. TEI: A Test Coding Initiative. “A Gentle Introduction to XML.” http://www.tei-c.org/re- lease/doc/tei-p5-doc/en/html/SG.html. Templeman-Kluit, Nadaleen, and Alexa Pearce. “Invoking the User from Data to De- sign.” College & Research Libraries (2014). Tenen, Dennis. “Blunt Instrumentalism: On Tools and Methods.” In Debates in the Digi- tal Humanities. Eds. Matthew K. Gold and Lauren Klein. 83-91. Minneapolis, MN: Univer- sity of Minnesota Press, 2016. Terras, Melissa. “Being the Other.“ Collaborative Research in the Digital Humanities. Eds. Marilyn Deegan and Willard McCarty. 213-30. Farnham, UK: Ashgate, 2012. Terras, Melissa. “Crowdsourcing in the Digital Humanities.” In A New Companion to Digi- tal Humanities. Eds. by Susan Schreibman, Ray Siemens, and John Unsworth. 420-438. West Sussex, UK: Wiley-Blackwell, 2016. Terras, Melissa. Defining Digital Humanities: A Reader. Farnham, UK: Ashgate, 2013. Terras, Melissa. Digital Images for the Informational Professional. Aldershot, UK: Ash- gate, 2008. Terras, Melissa. “Disciplined: Using Educational Studies to Analyze Humanities Compu- ting.” Literary and Linguistic Computing, 21.2 (2006): 229-46. 136 Terras, Melissa. “Digitization and Digital Resources in the Humanities.” In Digital Huma- nities in Practice. Eds. Claire Warwick, Melissa Terras, and Julianne Nyhan. 47-70. Lon- don, UK: Facet in Association with UCL Center for Digital Humanities, 2012. Terras, Melissa, and Julianne Nyhan. “Father Busa’s Female Punch Card Operatives.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 60-65. Min- neapolis, MN: University of Minnesota Press, 2016. Terras, Melissa. “The Impact of Social Media on the Dissemination of Research: Results of an Experiment.” Journal of Digital Humanities, Vol. 1, No. 3 (Summer 2012), http://journalofdigitalhumanities.org/1-3/the-impact-of-social-media-on-the-dissemina- tion-of-research-by-melissa-terras/. Terras, Melissa. 
“Peering inside the Big Tent: Digital Humanities and the Crisis of inclu- sion.” Author’s blog. July 26, 2011. http://melissaterras.blogspot.com/2011/07/peering- inside-big-tent-digital.html. Terras, M. “Present, Not Voting: Digital Humanities in the Panopticon: Closing Plenary Speech, Digital Humanities 2010.” Literary and Linguistic Computing 26, no. 3 (2011): 257-69. Thacker, Eugene. Biomedia. Minneapolis, MN: University of Minnesota Press, 2004. Thacker, Eugene. “Networks, Swarms, Multitudes: Part One.” CTheory. May 18, 2004. http://dhdebates.gc.cuny.edu/debates/text/422. Thacker, Eugene. “Networks, Swarms, Multitudes: Part Two.” CTheory. May 18, 2004. http://dhdebates.gc.cuny.edu/debates/text/423. Thaller, M., ed. Controversies around the Digital Humanities. Historical Social Research/ Historische Sozialforschung 37.1. Köln, Germany: QUANTUM and Zentrum für Histor- ische Sozialforschung. THATCamp: The Humanities and Technology Camp. thatcamp.org. Thomas, Douglas and John Seely Brown. A New Culture of Learning: Cultivating the Im- agination for a World of Constant Change. CreateSpace Independent Publishing Plat- form, 2011. Thomas, Lindsay, and Dana Solomon. “Active Users: Project Development and Digital Humanities Pedagogy.” CEA Critic 76, no. 2 (July 2014). http://muse.jhu.edu/login?auth=0&type=summary&url=/jour- nal/cea_critic/v076/76.2.thomas.html. 137 Thomas III, William G. “Blazing Trails toward Digital History Scholarship.” Social His- tory/Histoire Sociale 34, no. 68 (2001): 415-26. Thomas III, William G., and Elizabeth Lorang. “The Other End of the Scale: Rethinking the Digital Experience in Higher Education.” Educause Review 49, no. 5 (2014). http://www.educause.edu/ero/article/other-end-scale-rethinking-digital-experience- higher-education. Thomas III, William G. “The Promise of the Digital Humanities and the Contested Nature of Digital Humanities.” In A New Companion to Digital Humanities. Eds. by Susan Schreibman, Ray Siemens, and John Unsworth. 524-537. West Sussex, UK: Wiley-Black- well, 2016. Thompson, Ann. “Teena Rochfort Smith, Frederick Furnivall, and the New Shakespere Society’s Four-Text Edition of Hamlet.” Shakespeare Quarterly 49, no. 2 (1998): 125– 149. Thompson Klein, Julie. Interdisciplining Digital Humanities. Ann Arbor, MI: University of Michigan Press, 2015. Tiffany, Daniel. Toy Medium: Materialism and Modern Lyric. Berkeley, CA: University of California, 2000. Tiles, Mary, and Hans Oberdiek, “Conflicting Visions of Technology.” In Living in a Tech- nological Culture: Human Tools and Human Values, 12–28. London and New York: Routledge, 1995. Tillman, R L. “Pirensi: Now in 3-D.” Printeresting. Warhol Foundation. 5 October 2010. Web. 8 August 2013. Tolman, E.C. “Cognitive Maps in Rats and Men.” Psychological Review 55, no.4 (1948): 189-208. Townsend, R.B. “How is New Media Reshaping the Work of Historians?” Perspectives on History. November 2010. Trahey, Tara M. “A Black-Figure Vase in the Nasher Museum: Visualizing an Iconographic Network between Athens and Vulci in the 6th Century BCE.” BA Honors thesis, Duke Uni- versity, 2015. “#transformDH: This is the Digital Humanities.” http://transformdh.tmblr.com/. Troyano, Joan Fragaszy. “Discovering Scholarship on the Open Web: Communities and Methods.” April 1, 2013, http://pressforward.org/discovering-scholarship-on-the-open- 138 web-communities-and-methods/http://www.lotfortynine.org/2012/08/navigating-dh- for-cultural-heritage-professionals-2012-edition/. Tryon, Chuck. 
“Using Video Annotation Tools to Teach Film Analysis.” Profhacker. http://chronicle.com/blogs/profhacker/using-video-annotation-tools-to-teach-film-anal- ysis/57171. Tuan, Yi-Fu. “Images and Mental Maps.” Annals of the Association of American Geogra- phers. 65, no 2 (1975): 205-13. Tuan, Yi-Fu. Space and Place: The Perspective of Experience. reprint. Minneapolis, MN: University of Minnesota Press, 2001. Tufte, Edward. Envisioning Information. Cheshire, CT: Graphics Press, 1990. Tufte, Edward. “PowerPoint is Evil.” Wired. (2003). https://www.wired.com/2003/09/ppt2/. Tufte, Edward. The Visual Display of Quantitative Information. 2nd ed. Cheshire, CT: Graphics Press, 2001. Tufts University. Perseus Digital Library. http://www.perseus.tufts.edu/hop- per/help/versions.jsp. Tunkelang, Daniel. Faceted Search. San Rafael, CA: Morgan & Claypool, 2009. Turkel, William J. “Hacking History, from Analog to Digital and Back Again.” Rethinking History 15 (2) 287-296. Turkel, William J. Shezan Muhammedi, and Mary Beth Start. “Grounding Digital History in the History of Computing.” IEEE Annals of the History of Computing (2014): 72. Tukey, John W. Exploratory Data Analysis. Reading, MA: Addison-Wesley, 1977. Turkle, Sherry. Alone Together: Why We Expect More from Technology and Less from Each Other. New York, Ny: Basic Books, 2011. Turkle, Sherry. Life on the Screen: Identity in the Age of the Internet. New York, NY: Si- mon and Schuster, 1997. Turner, Fred. From Counterculture to Cyberculture: Stewart Brand, the Whole Earth Net- work, and the Rise of Digital Utopianism. Chicago, IL: University of Chicago Press, 2006. 139 Tversky, Barbara, and Paul U. Lee. “Pictorial and Verbal Tools for Conveying Routes.” In Spatial Information Theory: Cognitive and Computational Foundations of Geographical Information Science: International Conference Cosit ’99, stade, Germany, 25-29 August: Proceedings. Eds. Christian Freska and David Mark. 51-64. Berlin, Germany: Springer Verlag, 1999. “The 20 Best Tools for Data Visualization.” Creative Blog. Future Publishing Limited. March 18, 2013. Tweten, Lisa, Gwynaeth McIntyre, and Chelsea Gardner. “From Stone to Screen: Digital Revitalization of Ancient Epigraphy.” Digital Humanities Quarterly 10, no.1 (2016). Twycross, M. “Virtual Restoration and Manuscript Archaeology.” in The Virtual Repre- sentation of the Past. Eds. M. Greengrass and L. Hughes. 23-48. Farnham, UK: Ashgate, 2008. UCLA Library Digital Humanities Research Guide. http://guides.library.ucla.edu/digi- talhumanities. Underwood, Ted. “Distant Reading and Recent Intellectual History.” In Debates in the Digital Humanities. Eds. Matthew K. Gold and Lauren Klein. 530-533. Minneapolis, MN: University of Minnesota Press, 2016. Underwood, Ted. “Hold on Loosely, or Gemeinschaft and Gesellschaft on the Web.” In Debates in the Digital Humanities. Ed. Matthew Gold and Lauren Klein. 519-522. Minne- apolis, MN: University of Minnesota Press, 2016. Underwood, Ted. “How Much DH can you Fit in a Literature Department?” The Stone and the Shell. http://tedunderwood.com. Underwood, Ted. “Seven Ways Humanists are Using Computers to Understand Text.” The Stone and the Shell. http://tedunderwood.com. Underwood, Ted. “We don’t already understand the broad outlines of literary his- tory.” The Stone and the Shell. http://tedunderwood.com Underwood, Ted. “Where to start with text mining.” The Stone and the Shell. http://tedunderwood.com. Underwood, Ted. 
“Why Digital Humanities isn’t Actually ‘The next Thing in Literary Stud- ies.” The Stone and the Shell. http://tedunderwood.com. University of Texas Libraries “Using the four factor fair use test.” Fair Use. (2012). http://guides.lib.utexas.edu/copyright#test. 140 Unsworth, John, Raymond George Siemens, and Susan Schreibman, eds. A Companion to Digital Humanities. Blackwell Companions to Literature and Culture 26. Maiden, MA: Blackwell Pub, 2004. Unsworth, John. “Evaluating Digital Scholarship, Promotion & Tenure Cases.” University of Virginia College and Graduate School of Arts and Sciences – Office of the Dean, n.d. http://artsandsciences.virginia.edu/dean/facultyemployment/evaluating_digi- tal_scholarship.html. Unsworth, John. “The State of Digital Humanities, 2010.” Talk Manuscript. Digital Hu- manities Summer Institute, June 2010. http://www3.isrl.illinois.edu/-un- sworth/state.of.dh.DHSI.pdf. Unsworth, John. “University 2.0.” The Tower and the Cloud: Higher Education in the Age of Cloud Computing. Ed. R. N. Katz. Washington, DC: Educause, 2008. Unsworth, John. “What Is Humanities Computing and What Is Not?” Graduate School of Library and Information Sciences. Illinois Informatics Institute, University of Illinois, Ur- bana. http://computerphilologie .uni-muenchen.de/jg02/unsworth.html. Urban, Richard, and Marla Misunas. “A Brief History of the Museum Computer Net- work.” Encyclopedia of Library and Information Sciences. Boca Raton, FL: CRC Press, 2007. Urban, R. Marty, P. & Twidale, M. “A Second Life for Your Museum: 3D Multi-User Vir- tual Environments and Museums.” Museums and the Web Conference, San Francisco. (2007). www.archimuse.com/mw2007/papers/urban/urban.html. Vaidhyanathan, Siva. “Afterword: Critical Information Studies.” Cultural Studies 20, no. 2-3 (2006): 292-315. Vaidhyanathan, Siva. The Googlization of Everything (And Why We Should Worry). Oak- land, CA: University of California Press, 2011. Van Zundert, JJ., C. Van den Heuvel, B. Brumfield, ed. “Text Theory, Digital Documents, and the Practice of Digital Editions.” Digitize Humanities, 2013. Van der Weel, Adriaan van der. Changing Our Textual Minds: Towards a Digital Order of Knowledge. Manchester UK: Manchester University Press, 2012. Vandendorpe, Christian. From papyrus to hypertext: Toward the universal digital library. Vol. 14. Urbana, IL: University of Illinois Press, 2009. 141 Vanhemert, Kyle. “Artist Turns a Year’s Worth of Tracking Data into a Haunting Rec- ord.” Wired. (2013). https://www.wired.com/2013/07/a-years-worth-of-location-data- transformed-into-a-beautiful-record/. Vanhoutte, E. “Traditional Editorial Standards and in the Digital Edition.” In Learned Love: Proceedings of the Emblem Project Utrecht Conference on Dutch Love Emblems and the Internet (November 2006). Eds. E. Stronks and P. Boot. 157-174. The Hague: DANS- Data Archiving and Networked Services, 2007. Various Authors. “Reports on National Historical GIS Projects.” Historical Geography 33 (2005): 134-58. Vaughan-Nichols, Steven J. “Augmented Reality: No Longer a Novelty?” Computer 42:1 (2009): 19-22. Vectors: Journal of Culture and Technology in a Dynamic Vernacular. www.vectorsjour- nal.org. Verbeek, Peter-Paul. Moralizing Technology: Understanding and Designing the Morality of Things. Chicago, IL: University of Chicago Press, 2011. Verhoeven, Deb. “Doing the Sheep Good: Facilitating Engagement in Digital Humanities and Creative Arts Research.” In Advancing Digital Humanities: Research, Methods, Theo- ries. Eds. 
Paul Longley Arthur and Katherine Bode, 206-220. New York, NY: Palgrave MacMillan, 2014. Vershbow, Ben. “NYPL Labs: Hacking the Library.” Journal of Library Administration, 53 (2013): 79-96. Vesna, Victoria, ed. Database Aesthetics: Art in the Age of Information Overflow. Minne- apolis, MN: University of Minnesota Press, 2007. Vickers, Jill. “Diversity, Globalization, and ‘Growing Up Digital’: Navigating Interdiscipli- narity in the Twenty-First Century.” History of Intellectual Culture, 3.1 (2003). http://www.ucalgary.ca/hic/issues/vol3. Vinopal, Jennifer. “Supporting Digital Humanities in the Library: Creating Sustainable & Scalable Services.” Library Sphere, June 29, 2012. http://vinopal.org/2012/06/29/sup- porting-digital-humanities-in-the-library-creating-sustainable-scalable-services/. Vinopal, Jennifer and Monica McCormick. “Supporting Digital Scholarship in Research Libraries: Scalability and Sustainability.” Journal of Library Administration 53:1 (January 2013). 142 Vinopal, Jennifer. “Why Understanding the Digital Humanities Is Key for Libraries.” Li- brary Sphere, February 2011. http://vinopal.org/2011/02/18/why-understanding-the- digital-humanities-is-key-for-libraries/. Visconti, Amanda. “‘Songs of Innocence and of Experience:’ Amateur Users and Digital Texts.” Ann Arbor, MI: University of Michigan, 2010. http://hdl.han- dle.net/2027.42/71380. Voyant Tools. voyant-tools.org. Wajcman, Judy. Feminism Confronts Technology. Oxford, UK: Polity, 1991. Wajcman, Judy. “Reflections on Gender and Technology Studies: in What State is the Art?” Social Studies of Science 30 (3) (2000): 447-464. Walk, Paul. “Linked, Open, Semantic?” (2009). http://www.paulwalk.net. Wallace, David Foster. “Tense Present: Democracy, English and the Wars over Us- age.” Harper’s Magazine, 2001. Waltzer, Luke. “Digital Humanities and the ‘Ugly Stepchildren’ of American Higher Edu- cation.” Debates in the Digital Humanities. Ed. Matthew K. Gold. 335-349. Minneapolis, MN: University of Minnesota Press, 2012. Wands, B. Art of the Digital Age. London, UK: Thames and Hudson, 2007. Wankel, C. & Kingsley, J., eds. Higher Education in Virtual Worlds: Teaching and Learning in Second Life. Bradford, UK: Emerald, 2009. Ware, Colin. Information Visualization: Perception for Design. San Francisco, CA: Morgan Kaufman, 2004. Warwick, Claire. “The End of the Beginning: Building, Supporting and Sustaining Digital Humanities Institutions.” Digital Humanities Summer Institute, Victoria, 2015. Waters, D. “An Overview of the Digital Humanities.” Research Library Issues 284 (2013): 3-11. Warburtone, S. “Second Life in Higher Education: Assessing the Potential for the Barriers to Deploying Virtual Worlds in Learning and Teaching.” British Journal of Educational Technology, 40 (3), (2009): 414-426. 143 Warde, Beatrice. “The Crystal Goblet.” first delivered in 1930 as “Printing Should be In- visible.” In The Crystal Goblet: Sixteen Essays on Typography. London, UK: Sylvan Press, 1955. Wardrip-Fruin, Noah, and P. Harrigan, eds. First Person: New Media as Story, Perfor- mance, and Game. Cambridge, MA: MIT Press, 2004. Wardrip-Fruin, Noah. “Reading Digital Literature: Surface, Data, Interaction, and Expres- sive Processing.” In A Companion to Digital Literary Studies. Eds. by Ray Siemens and Su- san Schreibman. Oxford, UK: Blackwell, 2008. Warwick, Claire. “Building Theories or Theories of Building? A Tension at the Heart of Di- gital Humanities.” In A New Companion to Digital Humanities. Eds. 
by Susan Schreib- man, Ray Siemens, and John Unsworth. 538-552. West Sussex, UK: Wiley-Blackwell, 2016. Warwick, Claire. “Institutional Models for Digital Humanities.” In Digital Humanities in Practice. Eds. Claire Warwick, Melissa Terras, and Julianne Nyhan. 193-216. London, UK: Facet in Association with UCL Center for Digital Humanities, 2012. Warwick, Claire, Isabel Galina, Melissa Terras, Paul Huntington, and Nikoleta Pappa. “The Master Builders: LAIRAH Research on Good Practice in the Construction of Digital Humanities Projects.” Literary and Linguistic Computing 23, no. 3 (2008): 383 -396. Warwick, Claire, Melissa Terras, and Julianne Nyhan. “Introduction.” In Digital Humani- ties in Practice. Eds. Claire Warwick, Melissa Terras, and Julianne Nyhan. 1-21. London, UK: Facet in Association with UCL Center for Digital Humanities, 2012. Warwick, Claire, Melissa Terras, and Julianne Nyhan, eds. A Practical Guide to the Digital Humanities. London, UK: Facet Publishing, 2011. Watrall, Ethan. “Archaeology, the Digital Humanities, and the ‘Big Tent’.” In Debates in the Digital Humanities. Eds. Matthew K. Gold, and Lauren Klein. 345-358. Minneapolis, MN: University of Minnesota Press, 2016. Watts, Reggie. “Beats that Defy Boxes.” TED Conference, February 2012. Lecture. TED: Ideas Worth Spreading. https://www.ted.com/talks/reggie_watts_disori- ents_you_in_the_most_entertaining_way. Weber, Max. "Science as a Vocation." From Max Weber: Essays in Sociology. Trans. H. H. Gerth, C. Wright Mills. New York, NY: Oxford University Press, 1946. 129-156. Weibel, Peter, and Timothy Druckrey, eds. Net Condition Art and Global Media. Cam- bridge, MA: MIT Press, 2001. 144 Weible, Robert. “Defining Public History: Is It Possible? Is It Necessary?” In Perspectives on History, March 2008. http://www.historians.org/pubications-and-directories/per- spectives-on-history/march-2008/defining-public-history-is-it-possible-is-it-necessary. Weinberger, David. Everything is Miscellaneous. New York, NY: Henry Holt and Com- pany, 2007. Weir, George R. S., and Marina Livitsanou. “Playing Textual Analysis as Music.” Corpus, ICT, and Language Education. Eds. Weir, George R. S., and Shinʼichirō Ishikawa. Glasgow, UK: University of Strathclyde Press, 2011. Weiser, Mark. “The Computer for the Twenty-First Century.” Scientific American, Sep- tember, 94-104. 1991. Weiser, Mark. “Ubiquitous Computing.” Computer Science Lab at Xerox PARC, 1988. www.ubiq.com/ubicomp. Weiss, Sholom M., Nitin Indurkhya, Tong Zhang, and Fred J. Damerau. Text Mining: Pre- dictive Methods for Analyzing Unstructured Information. New York, NY: Springer, 2005. Weller, Martin. The Digital Scholar: How Technology is Transforming Scholarly Practice. London, UK: Bloomsbury Academic, 2011. Wellmon, Chad. Organizing Enlightenment: Information Overload and the Invention of the Modern Research University. Baltimore, MD: Johns Hopkins University Press, 2015. Werner, Sarah. “Fetishizing Books and Textualizing the Digital.” sarahwerner.net, July 24, 2011. http://sarahwerner.net/blog/index.php/2011/07/fetishizing-books-and-textu- alizing-the-digital/. Wernimont, Jacqueline. “Feminist Digital Humanities: Theoretical, Social, and Material Engagements around Making and Breaking Computational Media.” June 4, 2014. http://jwernimont.wordpress.com/2014/06/02/feminist-digital-humanities-theoretical- social-and-material-engagements-around-making-and-breaking-computational-media/. Wernimont, Jacqueline. “Whence Feminism? 
Assessing Feminist Interventions in Digital Literary Archives.” DHQ: Digital Humanities Quarterly, 7 (1) (2013). http://digitalhumani- ties.org:8080/dhq/vol/7/1/000156/000156.html. Wernimont, Jacqueline and J. Flanders. “Feminism in the Age of Digital Archives: The Women Writers Project.” Tulsa Studies in Women’s Literature 29 (2), 425-435. 145 Wernimont, Jacqueline, and Elizabeth Losh. “Problems with White Feminism: Intersec- tionality and Digital Humanities.” In Doing Digital Humanities: Practice, Training, Re- search. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 35-46. New York, NY: Routledge, 2016. Westphal, B. Geocriticism: Real and Fictional Spaces. Trans. R. Tally. New York, NY: Pal- grave Macmillan, 2011. Wetmorland, B.K., Ragas, M.W. et al. “Assessing the Value of Virtual Worlds for Post- Secondary Instructors, Early Adopters and the Early Majority in Second Life.” Interna- tional Journal of Humanities and Social Sciences, 3 (1) (2009). Whallon, Robert, Jr. “The Computer in Archaeology: A Critical Survey.” Computers and the Humanities 7, no. 1 (1972): 29-45. Wheatley, D. and M. Gillings. Spatial Technology and Archaeology: The Archaeological Applications of GIS. London, UK: Taylor & Francis, 2000. White, John W., and Heather Gilbert. Laying the Foundation. West Lafayette, IN: Purdue University Press, 2016. White, Richard. “What is Spatial History?” Stanford University Spatial History Project. 2010. http://www.stanford.edu/group/spatialhistory/cgi-bin/site/pub.php?id=29. Whitelaw. Mitchell. “Generous Interfaces for Digital Cultural Collections.” Digital Hu- manities Quarterly 9, no. 1 (2015). Whitson, Roger. “Critical Making in Digital Humanities: A MLA 2014 Special Session Pro- posal.” Washington State University, 2013. Wickham, Hadley. “Tidy Data.” Journal of Statistical Software. http://vita.had.co.nz/pa- pers/tidy-data.pdf. Wiener, Nobert. Cybernetics: Or Control and Communication in the Animal and the Ma- chine. Cambridge, MA: MIT Press, 1948. Wiener, Norbert. “Men, Machines, and the World About.” In The New Media Reader. Ed. Noah Wardrip-Fruin and Nick Montfort. 65-72. Cambridge, MA: MIT Press, 2003. Wikipedia Statistics. En.wikipedia.org/wiki/Special:Statistics. Wilkens, Matthew. “Canons, Close Reading, and the Evolution of Method.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 249-258. Minneapolis, MN: University of Minnesota Press, 2012. 146 Wilkinson, Lane. “Join the Digital Humanities…or Else.” Sense & Reference (blog). Janu- ary 31, 2012. http://senseandreference.wordpress.com/2012/01/31/join-the-digital-hu- manities-or-else/. Williams, George H. “Disability, Universal Design and the Digital Humanities. Day of DH: Defining Digital Humanities.” In Debates in the Digital Humanities. Ed. Matthew K. Gold. 202-212. Minneapolis, MN: University of Minnesota Press, 2012. Williams, Joseph C. “Architectural Practice in the Medieval Mediterranean: The Church of S. Corrado in Molfetta.” PhD dissertation, Duke University, 2017. Williams, Raymond. Keywords: A Vocabulary of Culture and Society. Revised Edition. New York, NY: Oxford University Press, 1983. Williams, William Proctor, and William Baker. “Caveat Lector. English Books 1475–1700 and the Electronic Age.” Analytical & Enumerative Bibliography 12 (2001): 1–29. Williford, Christa and Charles Henry. One Culture: Computationally Intensive Research in the Humanities and Social Sciences. A Report on the Experiences of First Respondents to the Digging into Data Challenge. 
Washington, DC: Council on Library and Information Resources, 2012. Willinsky, John. Technologies of Knowing. Boston, MA: Beacon Press, 1999. Wilson, Greg. “Software Carpentry: Lessons Learned.” Cornell University Library. (2013). https://arxiv.org/abs/1307.5448. Wilson, Stephen. Information Arts: Intersections of Art, Science, and Technology. Cam- bridge, MA: MIT Press, 2002. Winchester, Simon. The Map that Changed the World: William Smith and the Birth of Modern Geology. New York, NY: HarperCollins, 2001. Winesmith, K., and A. Carey. “Why Build an API for a Museum Collection?” San Francisco Museum of Modern Art, 2014. http://www.sfmoma.org/about/research_pro- jects/lab/why_build_an_api. Winn, James Anderson. The Pale of Words: Reflections on the Humanities and Perfor- mance. New Haven, CT: Yale University Press, 1998. Winter, Michael. “Specialization, Territoriality, and Jurisdiction in Librarianship.” Library Trends, 45.2 (1996): 343-63. 147 Wired! Group, Duke University. Wired! @ 5 (Years): Visualizing the Past at Duke Univer- sity. Visual Resources Association Bulletin 41:2 (May 2015): 1-41. Witcomb, Andrea. “The Materiality of Virtual Technologies: A New Approach to Thinking about the Impact of Multimedia in Museums.” Theorizing Digital Cultural Heritage. Eds. Fiona Cameron and Sarah Kenderine. 35-48. Cambridge, MA: MIT Press, 2007. Withington, Phil. Society in Early Modern England: The Vernacular Origins of Some Pow- erful Ideas. Cambridge, UK: Polity Press, 2010. Witmore, Michael. “Fuzzy Structuralism.” Wine Dark Sea (blog). 2013. http://winedarksea.org/?p=1693. Witmore, Michael. “Text: A Massively Addressable Object.” In Debates in the Digital Hu- manities. Ed. Matthew K. Gold. 324-327. Minneapolis, MN: University of Minnesota Press, 2012. Witmore, Michael. “The Ancestral Text.” In Debates in the Digital Humanities. Ed. Mat- thew K. Gold. 328-331. Minneapolis, MN: University of Minnesota Press, 2012. Witten, Ian H., David Bainbridge, and David M. Nichols, eds. How to Build a Digital Li- brary. San Francisco, CA: Morgan Kaufmann Publishers, 2013. Wood, Denis. The Power of Maps. New York, NY: Guilford Press, 1992. Wood, Denis. Rethinking the Power of Maps. New York, NY: Guilford Press, 2010. Woodley, Mary S. Digital Project Planning & Management Basics: Instructor Manual. 2008. Woodward, David, et al., eds. The History of Cartography. Vol. 1 and Vol. 2, books 1,2,3. Chicago, IL: University of Chicago Press. 1987-1998. Worthy, Glen. “Literary Texts and the Library in the Digital Age, or, How Library DH is Made.” Stanford Digital Humanities. March 4, 2014. https://digitalhumanities.stan- ford.edu/literary-texts-and-library-digital-age-or-how-library-dh-made. Wosh, Peter J., Cathy Moran Hajo, and Esther Katz. “Teaching Digital Skills in an Archives and Public History Curriculum.” In Digital Humanities Pedagogy: Practices, Principles and Politics. Ed. Brett D. Hirsch. Cambridge, MA: Open Book Publishers, 2012. Wouters, Paul, and Rodrigo Costas. Users, Narcissism and Control – Tracking the Impact of Scholarly Publications in the 21st Century. SURF Foundation, February 148 2012. http://www.surf.nl/nl/publicaties/Documents/Users%20narcis- sism%20and%20control.pdf. Wright, Alex. Glut: Mastering Information Through the Ages. Ithaca, NY: Cornell Univer- sity Press, 2008. Wu, Tim. “Book review: ‘To Save Everything, Click Here’ by Evgeny Morozov.” The Wash- ington Post. 2013. 
https://www.washingtonpost.com/opinions/book-review-to-save- everything-click-here-by-evgeny-morozov/2013/04/12/0e82400a-9ac9-11e2-9a79- eb5280c81c63_story.html?noredirect=on&utm_term=.1e2b4b6791f7. Wust, Markus. “Augmented Reality.” Doing Digital Humanities: Practice, Training, Re- search. Eds. Constance Crompton, Richard J. Lane, Ray Siemens. 303-27. New York: Routledge, 2016. Wynne, Martin. “Archiving, Distribution and Preservation,” in Developing Linguistic Cor- pora: A Guide to Good Practice. Eds. M. Wynne. Oxford, UK: Oxbow Books: 71–78. Yakel, E. “Digital Curation.” OSLC Systems & Services 23, 4 (2007) 335-340. Yakel, E., P. Conway, M. Hedstrom, & D. Wallace. “Digital Curation for Digital Natives.” Journal of Education for Library & Information Science 52, 1 (2011): 23-31. Yan, L., Y. Zhang, L.T. Yang, and H. Ning. The Internet of Things: From RFID to the Next- Generation Pervasive Networked Systems. Boca Raton, FL: Auerbach Publications, 2008. Young, J.R. “Virtual Reality on a Desktop Hailed as a New Tool in Distance Educa- tion.” Chronicle of Higher Education 47, 6, (2000): 43-44. Zeldman, Jeffrey. “Understanding Web Design.” A List Apart. November 20, 2007. http://alistapart.com/article/understandingwebdesign. Zhang, Jingxiong, and Michael F. Goodchild. Uncertainty in Geographical Information. London and New York: Taylor & Francis, 2002. Zimmer, Ben. “Rowling and “Galbraith”: an Authorial Analysis.” Language Log. Linguistic Data Consortium. (16 July 2013). http://languagelog.ldc.upenn.edu/nll/?p=5315. Ziemer, Tom. “Collaborative Project Pushes Discovery in Humanities, Computer Sci- ences.” University of Wisconsin-Madison College of Arts & Science: News. University of Wisconsin-Madison, 2013. 149 Zorich, Diane M. “The ‘Art’ of Digital Art History.” Presented at The Digital World of Art History, Princeton University, June 26, 2013. http://ica.princeton.edu/digitalbooks/digi- talworldofarthistory2013/7.D.Zorich.pdf. Zorich, Diane M.“ Digital Humanities Centers: Loci for Digital Scholarship.” Washington, DC: Council on Library and Information Resources, November 2008. http://www.clir.org/activities/digitalscholar2/zorich.pdf. Zorich, Diane M. A Survey of Digital Humanities Centers in the United States. CLIR Publi- cation no. 143. Washington, DC: Council on Library and Information Resources, 2008. Zorich, Diane M. A Survey of Digital Cultural Heritage Initiatives and Their Sustainability Concerns. Washington, DC: Council on Library and Information Resources, June 2003. http://www.clir.org/pubs/reports/pub118/contents.html. Zorich, Diane M. “Transitioning to a Digital World: Art History, Its Research Centers, and Digital Scholarship; A Report to the Samuel H. Kress Foundation and the Roy Rosenzweig Center for History and New Media.” May 2012. http://www.kressfoundation.org/re- search/Default.aspx?id=35379. Zoran, A., and L. Buechley. “Hybrid Reassemble: An Exploration of Craft, Digital Fabrica- tion and Artifact Uniqueness.” Leonardo 46, 4-10. Zotero. https://www.zotero.org/. Zubrow, Ezra. “Digital Archaeology: A Historical Context.” In Digital Archaeology. Bridg- ing Method and Theory. Eds. Patrick Daly and Thomas L. Evans. 8-27. London, UK: Routledge, 2005. Zundert, Joris J. van. “Screwmeneutics and Hermenumericals: The Computationality of Hermeneutics.” In A New Companion to Digital Humanities. Eds. by Susan Schreibman, Ray Siemens, and John Unsworth. 331-347. West Sussex, UK: Wiley-Blackwell, 2016. 
COLLECTION DEVELOPMENT, CULTURAL HERITAGE, AND DIGITAL HUMANITIES
This exciting series publishes both monographs and edited thematic collections in the broad areas of cultural heritage, digital humanities, collecting and collections, public history and allied areas of applied humanities. The aim is to illustrate the impact of humanities research and in particular reflect the exciting new networks developing between researchers and the cultural sector, including archives, libraries and museums, media and the arts, cultural memory and heritage institutions, festivals and tourism, and public history.

INTERSECTIONALITY IN DIGITAL HUMANITIES
edited by BARBARA BORDALEJO and ROOPIKA RISAM

We dedicate this volume to Tessa Bordalejo Robinson, who is already fighting to dismantle the heteronormative patriarchy.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library.
© 2019, Arc Humanities Press, Leeds
The authors assert their moral right to be identified as the author of their part of work. Permission to use brief excerpts from this work in scholarly and educational works is hereby granted provided that the source is acknowledged. Any use of material in this work that is an exception or limitation covered by Article 5 of the European Union’s Copyright Directive (2001/29/EC) or would be determined to be “fair use” under Section 107 of the U.S. Copyright Act September 2010 Page 2 or that satisfies the conditions specified in Section 108 of the U.S. Copyright Act (17 USC §108, as revised by P.L. 94–553) does not require the Publisher’s permission.
ISBN (print): 9781641890502
eISBN (PDF): 9781641890519
www.arc-humanities.org
Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY

CONTENTS
List of Illustrations
Acknowledgements
Introduction
BARBARA BORDALEJO and ROOPIKA RISAM
1 All the Digital Humanists Are White, All the Nerds Are Men, but Some of Us Are Brave
MOYA Z. BAILEY
2 Beyond the Margins: Intersectionality and Digital Humanities
ROOPIKA RISAM
3 You Build the Roads, We Are the Intersections
ADAM VÁZQUEZ
4 Digital Humanities, Intersectionality, and the Ethics of Harm
DOROTHY KIM
5 Walking Alone Online: Intersectional Violence on the Internet
BARBARA BORDALEJO
6 Ready Player Two: Inclusion and Positivity as a Means of Furthering Equality in Digital Humanities and Computer Science
KYLE DASE
7 Gender, Feminism, Textual Scholarship, and Digital Humanities
PETER ROBINSON
8 Faulty, Clumsy, Negligible? Revaluating Early Modern Princesses’ Letters as a Source for Cultural History and Corpus Linguistics
VERA FASSHAUER
9 Intersectionality in Digital Archives: The Case Study of the Barbados Synagogue Restoration Project Collection
AMALIA S. LEVI
10 Accessioning Digital Content and the Unwitting Move toward Intersectionality in the Archive
KIMBERLEY HARSLEY
11 All along the Watchtower: Intersectional Diversity as a Core Intellectual Value in Digital Humanities
DANIEL PAUL O’DONNELL
Appendix: Writing about Internal Deliberations
DANIEL PAUL O’DONNELL
Select Bibliography
Index

ILLUSTRATIONS
Figures
Figure 1. Gender ratio at digital humanities conferences, 2010–2013.
Figure 2. Proportion of men and women editors for the series surveyed, 1860–2016.
Figure 3. Proportions of men and women editors by decade.
Figure 4. Relative proportions of men and women editors by decade.
Figure 5. Comparison of relative proportions of men and women editors by decade.
Figure 6a. Names of editors in the Oxford editions series, and others.
Figure 6b. Names of recipients of the MLA seal, and others.
Figure 7. An American editor: Fredson Bowers.
Figure 8. An EETS edition not edited by a woman.
Figure 9. Board members of the Society for Textual Scholarship, June 2017.
Figure 10. Annotation levels in the score editor.
Figure 11. Consonant duplication in Sibylla’s page margin.
Table
Table 1. Graphic realization of in Sibylla.
ACKNOWLEDGEMENTS
This volume originated from the Intersectionality and Digital Humanities conference organized by Barbara Bordalejo at KU Leuven. We gratefully acknowledge the support of the following sponsors for the conference: Doctoral School of Humanities and Social Sciences (Doctoral School Humane Wetenschappen), KU Leuven; Faculty of Arts (Faculteit Letteren), KU Leuven; Flemish Research Foundation (Fonds Wetenschappelijk Onderzoek). We also acknowledge the Social Sciences and Humanities Research Council of Canada’s Future Commons Partnership Development Grant.

The collection was also greatly enriched by the thought-provoking presentations and conversations that took place at the conference. Thank you, in particular, to keynote speakers Alex Gil, Daniel Paul O’Donnell, Padmini Ray Murray, Melissa Terras, and Deb Verhoeven, whose support and friendship has sustained our work in many ways, long past the conference itself. Our gratitude, as well, to the presenters: Koenraad Brosens, J. De Gussem, Kimberley Harsley, Tiziana Mancinelli, Peter M. W. Robinson, Fred Truyen, Carole Urlings, Sytze Van Herck, Paola Verhaert, Tom Willaert, Sally Wyatt, and Heleen Wyffels.

Thanks are due, as well, to Dymphna Evans and Danièle Cybulskie for their guidance as editors. We also greatly appreciate the generous feedback that Jacqueline Wernimont offered during the editing process. Additionally, Roopika would like to thank Dennis Cassidy for his endless patience and support.

Chapter 11
ALL ALONG THE WATCHTOWER: INTERSECTIONAL DIVERSITY AS A CORE INTELLECTUAL VALUE IN DIGITAL HUMANITIES
Daniel Paul O’Donnell†

This problem is significant because it indicates the failure of the traditional model for scholarship adequately to describe serious intellectual work in humanities computing, whose scope cannot be delimited in the same way and to the same extent as the traditional kind … A new definition of scholarship, demanding new abilities, would seem to follow.1

The Bonfire of the (Digital) Humanities
Digital humanities came close to imploding as an organized discipline in the 2015–2016 academic year. The origins of the dispute lay in the deliberations of the program committee for Digital Humanities, the annual, usually very competitive, international conference organized by the Alliance of Digital Humanities Organizations (ADHO) and held in 2016 in Krakow, Poland. What criteria, this committee asked itself, should we use for accepting or rejecting submissions? Should we privilege “quality”—presumably as this is measured by success in the conference’s traditionally highly structured and quite thorough peer review process? Or should we privilege “diversity”—defined largely in terms of ensuring that speakers from as wide a range of demographics as possible are given slots at a conference (and in a discipline) that has been accused of skewing heavily toward white, Northern, and Anglophone men? Or, as one member of the committee put it with forceful clarity in an email:

There’s a solid consensus that the conference is there in order to hear from diverse groups, but whenever one opts for diversity, it usually means opting for less quality (otherwise there would be no issue), so the danger is that one loses sight of this, very central goal of the conference.2
† University of Lethbridge, Canada.
1 Willard McCarty, Humanities Computing (Basingstoke: Palgrave Macmillan, 2005), 1227.
2 ADHO Conference Coordinating Committee Email Listserv, “Re: DH2016 and Diversity,” September 16, 2015.

Email is an informal medium, and it would be unfair to take the position expressed here and later circulated by others on social media as having been considered in the same way as this chapter or other formal presentations that have referred to this email since this controversy first arose. As Stephen Ramsay has noted of his own apparently unintentionally provocative comments on the belief that coding is the core activity within digital humanities, “All quotes are by nature taken out of context.”3 In this particular case, it is important to remember, the passage in question comes from the middle of an internal debate (most of which has not been published or released on social media) in which members of a conference organizing committee struggled to determine the best method of fairly distributing access to a major conference with a high rejection rate.

At the same time, however, the “diversity debate” exemplified (and in part provoked) by this email was real and involved the numerous regional, national, linguistic, and other organizations that make up ADHO and run the field’s major journals, conferences, and societies. The debate led to the resignation of one of ADHO’s officers and it resulted in inter-society debates about cultural norms surrounding issues of “diversity” and “quality” that are still ongoing. This resignation and these debates led to a brief threat from one of the societies to break away from the larger consortium, taking its journal and participation in the international conference with it. The debate provoked in part by this email, in other words, was serious enough to threaten some of the most prestigious and central organs and activities that characterize global digital humanities and undo what can be considered one of the most characteristic features of international digital humanities as it is currently constituted: its strong and highly centralized international organizational collaboration and cooperation.

Moreover, while people seem wary of putting it in writing, the sentiment that there is an opposition between “quality” on the one hand and “diversity” on the other remains relatively common within some parts of institutional digital humanities (as well as other industries).4 It also aligns to a certain extent with longer-standing positions and regional trends in how the field as a whole is understood: between “those who build digital tools and media and those who study traditional humanities questions using digital tools and media,” as Mark Sample puts it: “do vs. think, practice vs. theory, or hack vs. yack.”5

I am a member of a national digital humanities society executive and a former chair of the Special Interest Group (SIG) Global Outlook::Digital Humanities (GO::DH), an organization that played a pivotal role in the recent “global turn” within digital humanities. I am also a middle-aged, white Anglophone man who enjoys the security of a tenured North American professorship. And I have been, at various times, a member of the ADHO executive, ADHO conference organizing committees, and president of one of the national societies that collectively govern the organization.
In these contexts, I have heard both dismissive complaints about “diversity” as a way of promoting the less qualified, and honest struggles with the question of how a desire to promote as wide a participation as possible within digital humanities might conflict with definitions of various forms of “quality” within the field.

As is true of many significant disciplinary debates within the digital humanities, however, much of this discussion has taken place out of public view—on closed email lists used by the ADHO executive or in closed meetings of its various committees; as Shelaigh Brantford pointed out in an unpublished paper, a person unfamiliar with the details of the internal debate provoked by this email and resignation would not be able to build an accurate sense of the issues at stake (or just how serious the crisis had become) from the organization’s own public pronouncements.6

In this chapter, I would like to tackle the question of “diversity” and “quality” within digital humanities head on. That is to say, I would like to consider the question raised in the email thread from the Digital Humanities 2016 organizing committee directly and seriously. Is there an inherent conflict between these two concepts within digital humanities? Is it the case that “whenever one opts for diversity, it usually means opting for less quality”? And is the promotion of “quality,” to the extent that it can be kept distinct from “diversity,” actually a “very central goal of the [Digital Humanities] conference,” or any other venue for disseminating our research?

To anticipate my argument, I am going to suggest that the answer to each of these questions is “no.” That is to say, first, that there is no inherent conflict between “diversity” and “quality” in digital humanities; second, that emphasizing “diversity” does not threaten the “quality” of our conferences and journals; and, finally, that “quality”—when taken by itself, without attention to questions of “diversity”—is in fact not the central goal of the Digital Humanities conference, or any other digital humanities dissemination channel. Indeed, to the extent they can be distinguished at all (and to a great degree, in fact, I argue they are the same thing), “diversity”—in the sense of access to as wide a range of experiences, contexts, and purposes as possible in the computational context of the study of problems in the humanities or application of computation to such problems, particularly as this is represented by the lived experiences of different demographic groups—is in fact more important than “quality,” especially if “quality” is determined using methods that encourage the reinscription of already dominant forms of research and experience.

3 Stephen Ramsay, “On Building,” accessed June 28, 2017, http://stephenramsay.us/text/2011/01/11/on-building/.
4 See Cleve R. Wootson Jr, “A Google Engineer Wrote That Women May Be Unsuited for Tech Jobs. Women Wrote Back,” The Washington Post, August 6, 2017, www.washingtonpost.com/news/the-switch/wp/2017/08/06/a-google-engineer-wrote-that-women-may-be-genetically-unsuited-for-tech-jobs-women-wrote-back/.
5 Mark Sample, “The Digital Humanities Is Not about Building, It’s about Sharing,” samplereality, May 25, 2011, www.samplereality.com/2011/05/25/the-digital-humanities-is-not-about-building-its-about-sharing/.
6 See Daniel Paul O’Donnell and Shelaigh Brantford, “The Tip of the Iceberg: Transparency and Diversity in Contemporary DH,” CSDH-SCHN (Congress 2016), Calgary, June 1, 2016. For a summary, see Geoffrey Rockwell, “CSDH-CGSA 2016,” philosophi.ca, August 26, 2016, http://philosophi.ca/pmwiki.php/Main/CSDH-CGSA2016. Examples of public statements showing this oblique approach include Alliance of Digital Humanities Organizations, “ADHO Announces New Steering Committee Chair,” ADHO, November 20, 2015, http://adho.org/announcements/2015/adho-announces-new-steering-committee-chair; Karina van Dalen-Oskam, “Report of the Steering Committee Chair (November 2015–July 2016),” ADHO, July 4, 2016, http://adho.org/announcements/2016/report-steering-committee-chair-november-2015-%E2%80%93-july-2016. It is important to remember that the purpose of such statements is administrative and political rather than academic and that an approach that makes things difficult for the researcher may represent good management practice.

Full of Sound and Fury …?
As intense as it was, the “quality vs. diversity” debate revolved around what can only be described as a very odd premise for a discipline that is commonly described as a “methodological commons”7 or “border land.”8 At the most literal level, the debate suggests that the two qualities in question (i.e., “diversity” and “quality”) have a zero-sum relationship to each other: the more “diversity” there is of participation on a panel or at a conference, the fewer examples (presumably) of “quality” work you are likely to find. That this is inherently problematic can be tested simply by reversing the terms: if diversity of participation is thought to lead to lower “quality,” then, presumably, greater “quality” comes from increasing the homogeneity of participation.

In certain circumstances and to certain degrees, of course, this can be true: a conference that is focused on a single discipline or subject, for example, is likely to be of higher “quality” (in the sense of creating opportunities to advance that discipline or topic) than a conference that sets no limits on the subject matter of the papers or qualifications of the participants. Faculty and students at the University of Lethbridge participate in several conferences each year where the principle of organization is geographic (“academics living in Alberta”) or educational status (“graduate students”) rather than discipline or topic. In such cases, the principal goal of the conference is less the advancement of research in a particular discipline (i.e., promoting the kind of “quality” that seemed to be at issue in the ADHO debate) than the advancement of researchers as a community. These conferences can attract a wide variety of approaches, subjects, and methods and, frankly, “quality” of contributions (in the sense of “likely to be of broad interest or impact to the field or discipline in question”). The benefit they offer lies in the practice they afford early-career academics and students in preparing papers or the cross-disciplinary networking opportunities they provide for scholars working in a particular geographic area. But while it would be wrong to measure the success of such conferences by the impact they have on their field (since there is no single field), it is also undeniable that such conferences generally have lower “quality” when measured from a disciplinary perspective.
At the same time, however, absolute homogeneity is also obviously problematic. Research, like many collaborative tasks, is an inherently dialectic process. It involves argument and counter-argument; debate over methods and results; agreement, disagreement, and partial agreement over significance and context. In many cases, this dialectic takes place within a broader context of theoretical agreement (the so-called “normal science”9); in others, it can involve sweeping changes to the framing theories or concepts (the infamous “paradigm shift”10).

7 McCarty, Humanities Computing, 2005.
8 Julie Thompson Klein, Interdisciplining Digital Humanities: Boundary Work in an Emerging Field (Ann Arbor: University of Michigan Press, 2014).
9 See Thomas S. Kuhn, The Structure of Scientific Revolutions (Chicago: University of Chicago Press, 2012).
10 See Kuhn, The Structure of Scientific Revolutions. While Kuhn is discussing science, the same pattern can be found, mutatis mutandis, in the social sciences and humanities.

Advancement in research, in other
As a cross- disciplinary area study, medieval studies covers a wide range of topics, approaches, and subjects— from archaeology to philosophy to literature to geography— and involves a number of technical and methodological skills (e.g., paleography, linguis- tics, numismatics, etc.). The ield is commonly organized along cultural and temporal lines, with often parallel (but largely unconnected) research going on otherwise similar topics within different political, cultural, or linguistic contexts. A scholar of Anglo- Saxon 11 Daniel Paul O’Donnell, “ ‘There’s No Next about It’: Stanley Fish, William Pannapacker, and the Digital Humanities as Paradiscipline,” dpod blog, June 22, 2012, http:// dpod.kakelbont.ca/ 2012/ 06/ 22/ theres- no- next- about- it- stanley- ish- william- pannapacker- and- the- digital- humanities- as- paradiscipline/ . FOR PRIVATE AND NON-COMMERCIAL USE ONLY 172 172 kingship may have little to do with somebody studying the same topic with regard to continental European or Middle Eastern cultures during the same time frame— or even with those studying the same topic in earlier or later periods in the same geographic area. Medieval vernacular literary studies, similarly, tend to focus on relatively narrowly delimited languages, movements, or periods. Apart from some common broad theoret- ical concerns, a student of early Italian vernacular literature might have very little to do with research on early French, Spanish, or English literature of the same or different periods. Even within a single time or culture, the multidisciplinary nature of the ield means that it is quite common for research by one medievalist to be of only marginal immediate relevance or interest to another medievalist trained in a different discipline or tradition: art historians debate among themselves without necessarily seeking input from (or affecting the work of ) philologists or archaeologists working the same geo- graphical or cultural area and time period. But while the range of medieval studies is huge, its de inition is still primarily about content rather than methodology. That is to say, the goal of medieval studies ultimately is to know or understand more about the Middle Ages , not, primarily, to develop new research techniques through their application to the Middle Ages. While differences between the different sub- disciplines within medieval studies are such that advanced research in one area can be dif icult or impossible to follow by researchers trained in some other area, it remains the case that the overall goal of research across domains and approaches is to develop a comprehensive picture of the time or location under discus- sion: the history, archaeology, politics, language, literature, culture, and philosophical understandings of a particular place or time in the (European) Middle Ages. 
If a piece of research focuses on Europe or the Middle East (as a rule, research involving a similar time period in Africa, Asia, or the Americas is not considered part of medieval studies) and if it involves or analyzes content or events occurring from (roughly speaking) the fall of the Roman Empire through to the beginning of the Renaissance, then that research is likely to be considered “medieval studies” and its practitioner a “medievalist”; if, on the other hand, a piece of research falls outside of these temporal and geographical bound- aries, then it is not considered “medieval studies,” even if the techniques it uses are iden- tical to those used within medieval studies or could be applied productively to material from the medieval period. 12 Content vs. Method in Historical Disciplines One implication of this is that in medieval studies, comprehensiveness or completeness can be as important a scholarly goal as novelty of method, and the discovery and expli- cation of additional examples of a concept or type of cultural object are as or more valu- able than more generalizable methods or studies. If having a scholarly edition of one Anglo- Saxon poem is thought to be useful for the study of the period, for example, then 12 For a discussion of this with regard to medieval and classical studies, see Gabriel Bodard and Daniel Paul O’Donnell, “We Are All Together: On Publishing a Digital Classicist Issue of the Digital Medievalist Journal,” Digital Medievalist 4 (2008), https:// doi.org/ 10.16995/ dm.18 . 173 173 having editions of two Anglo- Saxon poems— or, better still, all Anglo- Saxon poems— will be thought to be even more useful. A digital library of Frankish coins, similarly, is the better the more it is complete. Just how important this focus on the accumulation of examples and detail is can be seen simply by examining medievalist conference programs or publishers’ booklists. Medievalist conferences, for example, place a premium on the speci ic. While broad generalized papers synthesizing across domains are not unheard of (they are in fact characteristic of keynote addresses), by far the majority of contributions focus on quite speci ic topics: “The Music of the Beneventan Rite I (A Roundtable)” or, in a session on “ lyting” (i.e., the exchange of insults in Germanic poetry), papers on three or four spe- ci ic texts: “The Old High German St. Galler Spottverse,” “Flyting in the Hárbarðsljóð,” “Selections from Medieval Flyting Poetry,” and “Hrothgar, Wealhtheow, and the Future of Heorot [i.e., in the poem Beowulf ],” to take some examples from the 2017 International Congress on Medieval Studies at Western Michigan University. 13 Indeed, it is signi icant in this regard that the dominant form of submission to a con- ference like the International Congress on Medieval Studies is by externally organized panel (i.e., a collection of papers assembled and proposed by an external organizer) rather than through the submission of individual papers by individual scholars. Given the level of detail involved in the majority of the papers (and the lack of generalizing emphasis), this is the only way of ensuring a critical mass of background knowledge in speakers and audience. 14 Book series on topics in medieval studies, similarly, tend to justify their claims to the scholars’ attention through their comprehensiveness. 
Thus, the Early English Text Society advertises for new subscriptions by pointing to its collection of: Most of the works attributed to King Alfred or Aelfric, along with some of those by bishop Wulfstan and much anonymous prose and verse from the pre- Conquest period … all the surviving medieval drama, most of the Middle English romances, much religious and sec- ular prose and verse including the English works of John Gower, Thomas Hoccleve, and most of Caxton’s prints … 15 A similar emphasis on comprehensiveness is found in the advertisement for Early English Books Online: From the irst book published in English through the age of Spenser and Shakespeare, this incomparable collection now contains more than 125,000 titles … Libraries possessing 13 Andrew J. M. Irving, “The Music of the Beneventan Rite I (A Roundtable) [Conference Session] and Doaa Omran, “Dead Poet Flyting Karaoke [Conference Session],” both in International Congress on Medieval Studies , Kalamazoo, MI, May 11, 2017, https:// scholarworks.wmich.edu/ medieval_ cong_ archive/ 52/ . 14 This focus on speci icity is the norm across the traditional humanities; the annual conference of the Modern Language Association, for example, the largest in the humanities, ills its program entirely by means of externally proposed sessions (Nicky Agate, personal communication). 15 Anne Hudson, “The Early English Text Society, Present Past and Future,” The Early English Text Society , accessed August 29, 2017, http:// users.ox.ac.uk/ ~eets/ . FOR PRIVATE AND NON-COMMERCIAL USE ONLY 174 174 this collection ind they are able to ful ill the most exhaustive research requirements of graduate scholars— from their desktop— in many subject areas: including English liter- ature, history, philosophy, linguistics, theology, music, ine arts, education, mathematics, and science. 16 Signi icantly, this interest in completeness is such that it can even trump methodological diversity: the goal of comprehensive collections of texts or artifacts, after all, is to pro- vide researchers with a body of comparable research objects— that is to say, research objects established using (more or less) common techniques and expectations. This is both why it makes sense for scholars to regularly re- edit core texts in the ield (the better to make them compatible with current scholarly trends and interests) and why it can make sense to explicitly require researchers to follow speci ic methodological approaches and techniques. Thus, the Modern Language Association’s (MLA) Committee on Scholarly Editions codi ies its views on best practice in textual editing in the form of a checklist against which new editions can be compared. This checklist and the associated guidelines include advice on the speci ic analytic chapters or sections that ought to be included in a “certi ied edition” as well as minimum standards of accuracy and preferred work lows. 17 The Early English Text Society, likewise, warns potential editors of its strong pref- erence for editions that follow the models set by previous editions in the series, recommending against experimentation without prior consultation: We rely considerably on the precedents set by authoritative earlier editions in our series as a means of ensuring some uniformity of practice among our volumes. Clearly discre- tion must be used: departures from practice in earlier editions are likely to have been made for good, but particular, reasons, which do not necessarily suit others. 
Moreover, if they wish to make an argument from precedent, editors should follow EETS editions, in preference to those of other publishers. Once again, please consult the Editorial Secretary in cases of doubt. 18 This emphasis on continuity, consistency, and clearly identi ied standards is not (neces- sarily) evidence of unthinking conservatism. Textual criticism and editing as a method has gone through some remarkable developments in the last three decades, and while not all presses or series are prepared to accept some newer methods for representing texts and objects editorially (the Early English Text Society, for example, promises to issue separate guidelines for “electronic editions … as and when the Society decides to pursue this manner of publication in the future”), 19 others, such as the MLA, have 16 Early English Books Online, “About EEBO,” EEBO , accessed August 29, 2017, https:// eebo. chadwyck.com/ marketing/ about.htm . 17 MLA Committee on Scholarly Editions, “Guidelines for Editors of Scholarly Editions,” Modern Language Association , June 29, 2011, www.mla.org/ Resources/ Research/ Surveys- Reports- and- Other- Documents/ Publishing- and- Scholarship/ Reports- from- the- MLA- Committee- on- Scholarly- Editions/ Guidelines- for- Editors- of- Scholarly- Editions . 18 Early English Text Society, “Guidelines for Editors,” Early English Text Society, 4, accessed August 29, 2017, http:// users.ox.ac.uk/ ~eets/ Guidelines%20for%20Editors%2011.pdf . 19 See Early English Text Society, “Guidelines for Editors,” 3. 175 175 worked diligently to ensure their guidelines work with different prevailing methodol- ogies and approaches. 20 What it does suggest, however, is a belief in the necessity of minimum common standards, in a minimal degree of common understanding about expectations and purpose, and that the purpose of method is to develop reliable content rather than, as both the MLA and the Early English Text Society emphasize, experiment for the sake of experiment— a sense of minimum “quality,” in other words, that is more important than “diversity” if “diversity” produces something methodologically or con- ceptually unexpected. Given the choice between reliable content produced using a conservative, well- tested methodology and content of unknown quality produced using novel, but less well- tested methodologies, in other words, these examples suggest that mainstream medievalists will tend to prefer the reliable success over the interesting “failure.” 21 This bias against (methodological) diversity need not, in principle, lead to a bias against participation by “diverse” communities (in the sense of gender, belonging to a racialized community, economic class, or educational background)— although medi- eval studies as a ield has recently begun to recognize both its lack of diversity in this respect as well, and the degree to which this homogeneity may leave it particularly vulnerable to co- option by explicitly racist political movements. 22 But it does in current 20 MLA Committee on Scholarly Editions, “MLA Statement on the Scholarly Edition in the Digital Age,” Modern Language Association , May 2016, www.mla.org/ content/ download/ 52050/ 1810116/ rptCSE16.pdf . 21 A famous example in Medieval English Studies is the reception of the Athlone Press editions of Piers Plowman, i.e., George Kane, Piers Plowman: The A Version. Will’s Visions of Piers Plowman and Do- Well (London: University of London, 1960); George Kane and E. 
Talbot Donaldson, Piers Plowman: The B Version (London: Athlone Press, 1975); George Russell and George Kane, Piers Plowman: The C Version; Will’s Visions of Piers Plowman, Do- Well, Do- Better and Do- Best (London: Athlone, 1997). These were generally criticized on the basis that their innovative editorial method, while interesting and perhaps theoretically sound, left the texts “unreliable” and incompa- rable to other editions of the poem. See, among many others, Derek Pearsall, “Piers Plowman: The B Version (Volume II of Piers Plowman: The Three Versions), by George Kane, E. Talbot Donaldson,” Medium Aevum , 1977; John A. Alford, “Piers Plowman: The B Version. Will’s Vision of Piers Plowman, Do- Well, Do- Better and Do- Best. George Kane, E. Talbot Donaldson,” Speculum , 1977; Traugott Lawler, “Reviewed Work: Piers Plowman: The B Version. Will’s Visions of Piers Plowman, Do- Well, Do- Better, and Do- Best. An Edition in the Form of Trinity College Cambridge Ms. B. 15.17, Corrected and Restored from the Known Evidence, with Variant Readings by George Kane, E. Talbot Donaldson,” Modern Philology , 1979. Lawler’s review is an interesting example as it praises the edition while mentioning these same caveats. Robert Adams, “The Kane- Donaldson Edition of Piers Plowman: Eclecticism’s Ultima Thule,” Text 16 (2006): 131– 41, contains a discussion of the reception. 22 See, among others, Candace Barrington, “Beyond the Anglophone Inner Circle of Chaucer Studies (Candace Barrington),” In the Middle , September 11, 2016, accessed January 14, 2019, www.inthemedievalmiddle.com/ 2016/ 09/ beyond- anglophone- inner- circle- of.html ; Wan- Chuan Cao, “#palefacesmatter? (Wan- Chuan Kao),” In the Middle , July 26, 2016, accessed January 14, 2019, www.inthemedievalmiddle.com/ 2016/ 07/ palefacesmatter- wan- chuan- kao.html ; Dorothy Kim, “A Scholar Describes Being Conditionally Accepted in Medieval Studies (opinion)” Inside Higher Ed , FOR PRIVATE AND NON-COMMERCIAL USE ONLY 176 176 practice discourage it, in part because it interacts poorly with the lived experience of intersectionally diverse participants: it allows for participation by “anybody,” but is methodologically suspicious of those whose experience, training, interests, or eco- nomic situation results in work that does not easily continue the larger common pro- ject using clearly recognized methods and meeting previously recognized standards. As a new generation of medievalists tackle this problem using an explicitly intersectional theoretical approach, the ield may gradually become more hospitable to a broader and more welcoming de inition of diversity. Digital Humanities as Methodological Science The focus on content, comprehensiveness, and, in the more technical areas, method- ological conservatism that I argue characterizes the practice of a traditionally histor- ically focused ield like medieval studies contrasts very strongly against what we can easily see to be the case within digital humanities. If medieval studies can be described as a discipline that marshals speci ic types of method and theory in order to apply it to the study of a speci ic temporally and geographically bound subject, digital humanities can be described as a ield that marshals studies of a variety of (often) temporally, geo- graphically, and similarly bound subjects in order to develop different types of method and theory. 
As in medieval studies, the range of topics, approaches, and subjects covered by dig- ital humanities is extremely wide— indeed, in as much as digital humanities does not focus on a speci ic temporal period or geographic location, far wider. And as in medi- eval studies, different streams of research in different areas of digital humanities— while engaged, broadly speaking, in the same large project— commonly advance with a fair degree of independence. Advances in 3D imaging, for example, may or may not be related to or have an impact on developments in text encoding, media theory, gaming, or human- computer interaction, to name only a few areas commonly considered to be part of digital humanities. The difference, however, is that the project of digital humanities, in contrast to that of an area study like medieval studies, is primarily about the methods and theories used rather than the content developed. That is to say, the goal of digital humanities as a disci- pline is not primarily to know more about any speci ic period, text, idea, object, culture, or any other form of content (though it does no harm if it helps further this knowledge). Rather, it is to develop theories, contextual understandings, and methods that can be August 30, 2018, accessed July 14, 2019, www.insidehighered.com/ views/ 2018/ 08/ 30/ scholar- describes- being- conditionally- accepted- medieval- studies- opinion ; Dorothy Kim, “The Unbearable Whiteness of Medieval Studies,” In the Middle , November 10, 2016, accessed January 14, 2019, www. inthemedievalmiddle.com/ 2016/ 11/ the- unbearable- whiteness- of- medieval.html ; and Medieval Institute, “Featured Lesson Resource Page: Race, Racism and the Middle Ages,” TEAMS: Teaching Association for Medieval Studies , July 29, 2018, accessed January 14, 2019, https:// teams- medieval. org/ ?page_ id=76 . 177 177 used in the context of the use of computation to study such periods, texts, ideas, objects, and cultures. This is not to deny that research in digital humanities can have an impact on our knowledge of such periods, texts, ideas, objects, and cultures. In fact, much good digital humanities work does have that impact. Rather it is to claim that this impact is not the primary interest of such research to other digital humanities researchers. For example, a digital edition of an Anglo- Saxon poem can be at the same time a work of medieval studies (if it adds to our knowledge of the Anglo- Saxon period) and digital humanities (if it adds to our knowledge of how one can make digital editions or some other aspect of digital method or theory). To make such an edition a contribution to digital humanities, however, it must do something new computationally, regardless of its value to Anglo- Saxon studies. Thus, the kind of methodological conservatism we have seen as being acceptable in medieval studies is simply fatal in a ield like digital humanities. Where editing yet another Anglo- Saxon text improves our knowledge of Anglo- Saxon England, the simple application of well- known computational techniques to yet another cultural object of the same kind dealt with previously by others does nothing to advance dig- ital humanities as a paradiscipline. Advancement in digital humanities requires there to be something new, innovative, or generalizable about the work from a digital/ methodological perspective. As is the case with medieval studies, this difference in emphasis is re lected in how digital humanities dissemination channels de ine themselves and operate. 
Digital humanities book series, in contrast to the examples we have seen from medieval studies, tend to celebrate the methodological and disciplinary breadth of their catalogue, rather than the comprehensiveness of their collections. Both “Digital Culture Books,” a dig- ital humanities imprint of the University of Michigan Press, and “Topics in the Digital Humanities,” an imprint of the University of Illinois Press, for example, advertise their series in terms of the breadth of topics covered in their volumes, the methodological diversity and innovation they entail, and the diverse experiences of their authors. In the case of “Digital Culture Books”: The goal of the digital humanities series will be to provide a forum for ground- breaking and benchmark work in digital humanities. This rapidly growing ield lies at the intersections of computers and the disciplines of arts and humanities, library and infor- mation science, media and communications studies, and cultural studies. The purpose of the series is to feature rigorous research that advances understanding of the nature and implications of the changing relationship between humanities and digital technolo- gies. Books, monographs, and experimental formats that de ine current practices, emer- gent trends, and future directions are accepted. Together, they will illuminate the varied disciplinary and professional forms, broad multidisciplinary scope, interdisciplinary dynamics, and transdisciplinary potential of the ield. 23 23 University of Michigan Press, “Digital Humanities Series,” Digital Culture Books , accessed September 11, 2017, www.digitalculture.org/ books/ book- series/ digital- humanities- series/ . FOR PRIVATE AND NON-COMMERCIAL USE ONLY 178 178 For “Topics in the Digital Humanities”: Humanities computing is undergoing a rede inition of basic principles by a continuous in lux of new, vibrant, and diverse communities or practitioners within and well beyond the halls of academe. These practitioners recognize the value computers add to their work, that the computer itself remains an instrument subject to continual innovation, and that competition within many disciplines requires scholars to become and remain current with what computers can do. Topics in the Digital Humanities invites manuscripts that will advance and deepen knowledge and activity in this new and innovative ield. 24 Conference sessions, too, tend to be far less specialized and homogeneous in terms of subject. Where in the case of area or historical studies, conference papers tend to focus on very speci ic research questions and outcomes, and submissions tend to be primarily through the externally organized panel, in the case of Digital Humanities conferences, papers tend both to be on a wider variety of topics in any single session (because the content is less important than the methodology) and organized by single- paper submis- sion rather than externally organized panels. I have been on conference panels in both digital humanities and medieval studies; in the case of medieval studies conferences, committees commonly look favourably on papers that emphasize new detailed indings, while digital humanities committees commonly ask the authors of papers that concen- trate too much on the details of their “case” and not enough on its generalizability to reorganize their paper or consider presenting their indings as a short paper or poster. The Role of Diversity This brings us, inally, to the role of intersectional diversity in the advancement of dig- ital humanities. 
Thus far in this paper, I have been emphasizing the way in which digital humanities acts as what Willard McCarty and Harold Short have described as a method- ological commons: an intellectual space in which researchers active in different discip- lines, in essence, compare notes and develop new approaches and ideas about the role, context, and use of the digital in relation to humanities questions. The great change in the last ive years within digital humanities, however, has been the recognition that this “commons” also involves lived experience within the digital realm. That is to say, that diversity of personal, gendered, regional, linguistic, racialized, and economic experience and context is as important to developing our understanding of method and theory in digital humanities as is diversity of subject or focus. What this means is that it is as important to promote diversity of experience in dig- ital humanities as it is diversity of methodology or topic. The experiences of researchers working with relatively poor infrastructure in mid- and especially low- income communities, for example, are as important to the progress of digital humanities as a discipline as those working with cutting- edge infrastructure in the most advanced technological contexts. The problem of doing good humanities work with “minimal” computing infrastructure is 24 University of Illinois Press, “Topics in the Digital Humanities,” University of Illinois Press , accessed September 11, 2017, www.press.uillinois.edu/ books/ ind_ books.php?type=series&search=TDH . 179 179 at least as challenging (and interesting) for digital humanities as the problem of adapting the latest tools from Silicon Valley in a high- bandwidth environment— and it remains so, even if the research in high- bandwidth infrastructures produces “better” content for the domain specialist (e.g., colour or HD imagery vs. black and white, for example, or larger collections taking advantage of the latest interfaces and technologies). The experiences of those working in rigid or very traditional research environments that discourage novel work with computation in traditional humanities ields, likewise, bring interesting cultural and methodological challenges that enrich the understanding of researchers working in environments in which digital humanities is “the Next Big Thing.” 25 Because it also involves the application of computation to the humanities or the understanding of the humani- ties in an age of (mostly) ubiquitous networked computing, the research of underfunded researchers, those at non- research- intensive institutions, those without permanent fac- ulty positions, and those just beginning their careers as students is at least as important to our understanding of digital humanities as that of tenured researchers working with the best funding in the most elite institutions. Digital humanities, in other words, is about the intersection of the humanities and the world of networked computation. It is not (solely) about the intersection of the humanities and the world of the fastest, most expensive, and best- supported examples of networked computation. Because it is part of the contemporary humanities, the experiences of the marginalized in their use of computation or their understanding of and access to different computation contexts are at least as important to a full under- standing of digital humanities as are the experiences of those at the centre of our best- funded and most technologically advanced research and cultural institutions. 
Diversity and Quality There is in theory, of course, no reason why encouraging the contributions of the mar- ginalized alongside those of the non- marginalized (i.e., encouraging “diversity”) should result in lower “quality,” as measured by things like “impact,” citation rates, or peer review scores. Researchers working with poor infrastructure can do as “careful” work as those working with excellent infrastructure and, as Dombrowski and Ramsay 26 have pointed out, excellent infrastructure and funding does not preclude large- scale failure. The problem, however, is that measures of “quality” in the academy are as a rule, self- inscribing. That is to say, the mechanisms by which “quality” is determined strongly 25 William Pannapacker, “No DH, No Interview,” The Chronicle of Higher Education , July 22, 2012, http:// chronicle.com/ article/ No- DH- No- Interview/ 132959/ ; William Pannapacker, “The MLA and the Digital Humanities,” Brainstorm , accessed June 22, 2012, http:// chronicle.com/ blogPost/ The- MLAthe- Digital/ 19468/ . 26 See Stephen Ramsay, “Bambazooka,” accessed August 29, 2017, http://web.archive.org/web/ 20161105014445/ http://stephenramsay.us/2013/07/23/bambazooka/ and Quinn Dombrowski, “What Ever Happened to Project Bamboo?,” Literary and Linguistic Computing 29, no. 3 (September 1, 2014): 326–39 . FOR PRIVATE AND NON-COMMERCIAL USE ONLY 180 180 favour the already favoured: as my colleagues and I have demonstrated of “excellence” (a synonym for “quality” in this context): A concentration on the performance of “excellence” can promote homophily among … [researchers] themselves. Given the strong evidence that there is systemic bias within the institutions of research against women, under- represented ethnic groups, non- traditional centres of scholarship, and other disadvantaged groups, it follows that an emphasis on the performance of “excellence”— or, in other words, being able to convince colleagues that one is even more deserving of reward than others in the same ield— will create even stronger pressure to conform to unexamined biases and norms within the disciplinary culture: challenging expectations as to what it means to be a scientist is a very dif icult way of demonstrating that you are the “best” at science; it is much easier if your appearance, work patterns, and research goals conform to those of which your adjudicators have previous experience. In a culture of “excellence” the quality of work from those who do not work in the expected “normative” fashion run a serious risk of being under- estimated and unrecognized. 27 This is particularly true when measures of relative “quality” (or “excellence”) are used to distribute scarce resources among researchers. Peer review is an inherently conser- vative process— the core question it asks is whether work under review conforms to or exceeds existing disciplinary norms. In zero- sum or close to zero- sum competitions— such as the distribution of prizes or space in a conference— it has a well- established record of both rewarding the already successful and under- recognizing the work of those who do not conform to pre- existing understandings in the discipline. 28 In other words, as we have argued elsewhere: the works that— and the people who— are considered “excellent” will always be evaluated, like the canon that shapes the culture that transmits it, on a conservative basis: past per- formance by preferred groups helps establish the norms by which future performances of “excellence” are evaluated. 
Whether it is viewed as a question of power and justice or simply as an issue of lost opportunities for diversity in the cultural coproduction of 27 Samuel Moore et al., “ ‘Excellence R Us’: University Research and the Fetishisation of Excellence,” Palgrave Communications 3 (January 19, 2017): 7, https:// doi.org/ 10.1057/ palcomms.2016.105 . Internal bibliographic citations within this quotation have been silently elided. 28 This is known as the “Matthew Effect”; see Robert K. Merton, “The Matthew Effect in Science,” Science 159, no. 3810 (1968): 56– 63, www.jstor.org.ezproxy.alu.talonline.ca/ stable/ 1723414 . Dorothy Bishop, “The Matthew Effect and REF2014,” BishopBlog , October 15, 2013, http:// deevybee.blogspot.ca/ 2013/ 10/ the- matthew- effect- and- ref2014.html discusses the effect in rela- tion to the 2014 Research Excellence Framework. As Jian Wang, Reinhilde Veugelers, and Paula E. Stephan, “Bias Against Novelty in Science: A Cautionary Tale for Users of Bibliometric Indicators,” Social Science Research Network , January 5, 2016, have shown, novelty in science is consistently underestimated by most traditional measures of “impact” in the short and medium term. There is a minor industry researching the failure of peer review to recognize papers that later turned out to be extremely successful by other measures such as citation success or the receipt of major prizes. See Joshua S. Gans and George B. Shepherd, “How Are the Mighty Fallen: Rejected Classic Articles by Leading Economists,” The Journal of Economic Perspectives: A Journal of the American Economic Association 8, no. 1 (winter 1994): 165; Juan Miguel Campanario, “Rejecting and Resisting 181 181 knowledge, an emphasis on the performance of “excellence” as the criterion for the distri- bution of resources and opportunity will always be backwards looking, the product of an evaluative process by institutions and individuals that is established by those who came before and resists disruptive innovation in terms of people as much as ideas or process. 29 Diversity Instead of Quality Taken as a whole, this bias among traditional measures of quality means that they are highly likely to underestimate the value of potentially excellent work by digital humani- ties researchers from non- traditionally dominant demographic groups— especially if this work challenges existing conventions or norms in the ield. But what about poor- quality work from “diverse” researchers? That is to say, what about work from researchers outside traditionally dominant demographic groups within digital humanities that can be shown on relatively concrete grounds to be below the accepted standards in the ield? Work, for example, that does not use or recognize existing technological standards? That ignores (or appears to be unaware of ) basic disciplinary conventions? A student project, say, that encodes text for display rather than structure? Or a project from a researcher working outside mainstream digital humanities that uses proprietary software or formats or strict commercial licences? It is easy to see, in theory, how a conference programming committee that had to choose between a good project by a research team from a dominant demographic group and a lawed project by a team working outside such traditionally dominant com- munities might struggle with the question of “diversity vs. quality” when it came to assign speaking slots. The answer is that it is a mistake to see “poor quality” as a diversity issue. 
While such problems can arise with researchers from demographics that are not traditionally dominant within digital humanities, they also arise among researchers from traditionally dominant demographics as well. Indeed, the willingness to celebrate (or at the very least destigmatize) “failure” is one of the features of digital humanities that distinguishes it Nobel Class Discoveries: Accounts by Nobel Laureates,” Scientometrics 81, no. 2 (April 16, 2009): 549– 65; Pierre Azoulay, Joshua S. Graff Zivin, and Gustavo Manso, “Incentives and Creativity: Evidence from the Academic Life Sciences,” The Rand Journal of Economics 42, no. 3 (2011): 527– 54; Juan Miguel Campanario, “Consolation for the Scientist: Sometimes It Is Hard to Publish Papers that Are Later Highly Cited,” Social Studies of Science 23 (1993): 342– 62; Juan Miguel Campanario, “Have Referees Rejected Some of the Most- Cited Articles of All Times?,” Journal of the American Society for Information Science 47, no. 4 (April 1996): 302– 10; Juan Miguel Campanario, “Commentary on In luential Books and Journal Articles Initially Rejected because of Negative Referees’ Evaluations,” Science Communication 16, no. 3 (March 1, 1995): 304– 25; Juan Miguel Campanario and Erika Acedo, “Rejecting Highly Cited Papers: The Views of Scientists Who Encounter Resistance to Their Discoveries from Other Scientists,” Journal of the American Society for Information Science and Technology 58, no. 5 (March 1, 2007): 734– 43; Kyle Siler, Kirby Lee, and Lisa Bero, “Measuring the Effectiveness of Scienti ic Gatekeeping,” Proceedings of the National Academy of Sciences 112, no. 2 (January 13, 2015): 360– 65. 29 Moore et al., “ ‘Excellence R Us,’ ” 7. FOR PRIVATE AND NON-COMMERCIAL USE ONLY 182 182 from traditional area ields like medieval studies. McCarty has described digital human- ities as “the quest for meaningful failure” 30 and many authors in the ield have devoted considerable attention to the “error” part of “trial and error” 31 (I am aware of no such bib- liography or tradition within medieval studies). We have a proud tradition of accepting student papers at digital humanities conferences— indeed, there are often both spe- cial prizes and special adjudication tracks for such papers. As long as the researchers in question conform to dominant group expectations in other ways, it seems, referees and review panels are prepared to accept work that implicitly or explicitly violates disci- plinary norms on an exceptional basis because it helps de ine the ield. In the case of stu- dent papers, they also take positive steps to identify and support a demographic that, by de inition, is still presumably acquiring the skills that otherwise make for “quality” work. What this suggests, in turn, is that even “poor quality” is not a reason to avoid privileging diversity within digital humanities. Digital humanities has a tradition of encouraging accounts of failure and accounts of structurally often less accomplished researchers such as students for the same reason it has a tradition of encouraging reports from researchers working in a wide variety of disciplinary contexts— because these accounts contribute collectively to the breadth of our understanding of the application of computation to humanities problems, expanding particularly our knowledge of method (i.e., the “hows,” or, in this case perhaps, “how not tos”). 
Adding to this the occasional failed or less accomplished work of a researcher from a traditionally non-dominant demographic will neither disturb this tradition of celebrating failure nor result in the crowding out of successful projects by members of traditionally dominant or non-dominant demographics.

Conclusion

The history of digital humanities is often traced through landmark projects and movements, from the initial work by Roberto Busa on his concordance, through the stylometrics and statistical work of the 1970s and 1980s, to the "electronic editions" of the 1990s and 2000s, to big data and ubiquitous computing today. This history, however, is also a history of diversity. At each stage, progress in the field has required the introduction of new problems, new methods, and new solutions: a broadening of, rather than simple repetition or perfection of, the type of problems to which computation can be applied or which exist in an interesting computational context. Digital humanities is what it is today because we did not privilege "quality"—of concordance-making or edition-making or other early forms of humanities computing—over other novel forms of computational work. Rather, it has thrived because we have embraced new and (often initially) imperfect experiments in the application of computation to other problems or new approaches to understanding the significance of computation in the context of humanistic research. This is, indeed, as McCarty has pointed out, perhaps the most ironic thing about the decision of the editors of Computers and the Humanities to narrow the focus of their journal to Language Resources and Evaluation in 2005, just as digital humanities entered its most expansive and diverse phase. 32

Just as progress in humanities computing would have stalled if it had been unable to expand beyond Roberto Busa's early interest in concordances, or the burst of activity in text encoding and presentation that characterized the "electronic editions" of the 1990s and early years of this decade, so too digital humanities will fail to progress if it cannot expand its range of experiences beyond those whose work and experience have largely defined it for most of its history: the white, Northern, university researcher who is a man and has access to reasonably secure funding and computational infrastructure. As digital culture (and hence the scope of humanities research) expands globally, the types of methodological and theoretical questions we face have become much broader: Why are some groups able to control attention and others not? How do (groups of) people differ in their relationship to technology? How do you do digital humanities differently in high- vs. low-bandwidth contexts? How does digital scholarship differ when it is done by the colonized and the colonizer? How is what we discuss and research influenced by factors such as class, gender, race, age, and social capital in an intersectional way? This expansion requires the field, if it is to advance, to ensure that researchers with experience in these questions from different perspectives are given a place to present their findings in our conferences and journals. In some cases—and there is no reason to believe that the frequency of such cases will be more than we find whenever new approaches and ideas enter the field—this work will belong to the well-established tradition of "failure" narratives within digital humanities. Much more often—again, in keeping with what we would expect from those belonging to more traditionally dominant demographics—this work will represent the kind of "quality" we expect as the norm in our various dissemination channels. Regardless of whether such "diverse" work is a "success" or a "failure," however, it is crucial that it be heard. Digital humanities only grows as a field when researchers differ from each other in what they do, why they do it, and how they understand what it is that they are doing. Without this diversity, there is no such thing as digital humanities—of any quality.

30 Willard McCarty, "Humanities Computing," Encyclopedia of Library and Information Science (New York: Marcel Dekker, 2003), https://doi.org/10.1081/E-ELIS.

31 See, among many others, Isaac Knapp, "Creation and Productive Failure in the Arts and Digital Humanities," inspire-Lab, January 22, 2016, https://inspire-lab.net/2016/01/22/creation-and-productive-failure-in-the-arts-and-digital-humanities/; Katherine D. Harris, "Risking Failure, a CUNY DHI Talk," triproftri, March 20, 2012, https://triproftri.wordpress.com/2012/03/19/risking-failure-a-cuny-dhi-talk/; Brian Croxall and Quinn Warnick, "Failure," Digital Pedagogy in the Humanities, MLA Commons, accessed August 29, 2017, https://digitalpedagogy.mla.hcommons.org/keywords/failure/; Jenna Mlynaryk, "Working Failures in Traditional and Digital Humanities," HASTAC, February 15, 2016, www.hastac.org/blogs/jennamly/2016/02/15/working-failures-traditional-and-digital-humanities; Stephen Ramsay, "Bambazooka," accessed August 29, 2017, http://web.archive.org/web/20161105014445/http://stephenramsay.us/2013/07/23/bambazooka/; Quinn Dombrowski, "What Ever Happened to Project Bamboo?," Literary and Linguistic Computing 29, no. 3 (September 1, 2014): 326–39.

32 See Humanist Discussion Group (by way of Willard McCarty), "18.615 Computers and the Humanities 1966–2004 from Humanist Discussion Group (Humanist Archives Vol. 18)," accessed June 25, 2017, http://dhhumanist.org/Archives/Virginia/v18/0604.html.

APPENDIX: WRITING ABOUT INTERNAL DELIBERATIONS

Daniel Paul O'Donnell†

This chapter discusses the internal deliberations of the Alliance of Digital Humanities Organizations (ADHO), its constituent organizations, and committees (such as the steering committee, which I was a part of during some of this time, and its various conference committees, which I was not). These deliberations were carried out by email and in person. As the debate about "quality" vs. "diversity" broke out, parts of the debate were also discussed in social media, notably Twitter and Facebook. The debate finally became the subject of a number of conference presentations and, with this collection, chapters and articles. This history raises various ethical, evidentiary, and argumentative challenges.
As noted in the introduction, many of the key texts in this debate were composed as emails as part of an at times heated and semi-private discussion among committee members faced with the practical problem of how to distribute speaking spots at the annual and high-prestige Digital Humanities conference. As a result, they were not intended for publication (or even wide circulation) and, given the context of the discussion, they cannot be assumed to represent the considered, evidence-based, and reasoned positions of their authors. Moreover, our knowledge of the discussion from which these emails come is by nature fragmentary and partial. In my experience of participating on similar committees, the collected correspondence for a conference programming committee can range into the hundreds (or even thousands) of emails. If the committee also meets in person or by teleconference, this correspondence also has an unrecorded oral context. This means that the few emails from this debate that have circulated on social media, in addition to representing perhaps unguarded and also provisional and informal positions taken in the context of a larger discussion, are also by nature incomplete: we do not know (or it is impossible to report) the full context of the discussion from which they have been extracted or how views were modified, strengthened, or abandoned in the course of debate.

Having said all this, however, the discussion these emails prompted is important to the field. While it is true that much of the evidence discussed in this essay was not intended for publication and may not represent the considered views of their authors, the debate from which it comes was much more than a private philosophical discussion among colleagues. Conference programming committees play an important gatekeeping function in any discipline, and the debate that was going on in this case was about the practical definition of digital humanities as a discipline as it would be manifested at what is its premier conference. As such, it has the potential to affect the direction of the discipline as much as any published theoretical piece or trendsetting project.

† University of Lethbridge, Canada.

Digital humanities as a discipline, moreover, seems to me to be unusual in the degree to which such "internal" administrative and institutional debates and acts affect its intellectual growth and direction, particularly in the course of the last twenty years. There are a number of famous and not-so-famous examples of this, beginning, perhaps most famously, with the "internal" agreement between the publishers and editors of the first Companion to Digital Humanities to use "Digital Humanities" rather than "Humanities Computing" (or similar) to "brand" their collection of essays—an "administrative" decision that, as many have argued, has had a profound effect on the direction of the field. A considerable amount of published scholarly discussion within digital humanities, moreover, focuses on the intellectual and practical significance of these organizational discussions and decisions—as a glance at the foundational essays in many of the most important collections suggests. What this means, therefore, is that the history of digital humanities simply cannot be written without reference to ostensibly private conversations and documents. In some cases, these references are seemingly positive and are willingly promoted by the participants to the conversation.

For the same reasons that digital humanities also attempts to destigmatize failure, however, these conversations and documents cannot be ignored when they are less obviously flattering to the participants in the discussion, especially once, as in this case, they either become part of the public record or are hinted at in official, public pronouncements. Given the degree to which research in digital humanities is networked, collaborative, and organized, ignoring what happens "behind closed doors" is both misleading to those "not in the know" and ultimately counterproductive in a field that at least ostensibly emphasizes openness and transparency as primary values.

In this paper, I have tried to respect both aspects of this problem. On the one hand, I have, as much as possible, tried to avoid tying some of the more provocative documents to named individuals and organizations—what is significant about this debate is not who held what position but rather what these positions were and the stakes involved in the debate. On the other hand, however, I have directly quoted from and commented on specific emails from this debate as they were released on social media. A discussion about what general kind of work is and is not allowed at a discipline's major conference or what kinds of criteria should or should not be used to adjudicate access to speaking slots is more than a private conversation: it is as much about the definition of the field as any theoretical book or article.
work_d7dssdffpfd35gxifapptm2qvi ----
Journal of Siberian Federal University. Humanities & Social Sciences 12(9) (2019): 1682–1693

UDC 37.02

From Digital Humanities to a Renewed Approach to Digital Learning and Teaching

Samuel Nowakowski and Guillaume Bernard*
University of Lorraine, Loria UMR7503, 54000 Nancy, France

Received 06.08.2019, received in revised form 26.08.2019, accepted 09.09.2019

We live in a world in which digital interfaces, dematerialization, automation, and so-called artificial intelligence tools aim to drive away the human or eliminate the relationship with humans. The way other beings see us is important. What would happen if we took the full measure of this idea? How would this affect our understanding of society, culture, and the world we inhabit? How would this affect our understanding of the human, since in this world beyond the human, we sometimes find things that we prefer to attribute only to ourselves? What are the impacts on education, learning, and teaching? After having explored the field opened by these questions, we bring an answer with a reinvention of the learning platform named KOALA (KnOwledge Aware Learning Assistant). KOALA is a new online learning platform that goes back to the sources of the internet. Symmetrical and acentric, KOALA combines analyses from the digital humanities with answers to the challenges of education in the 21st century.

Keywords: computer environments, LMS, human values, digital support, teaching and learning.

Research area: psychology; pedagogy.

Citation: Nowakowski, S., Bernard, G. (2019). From digital humanities to a renewed approach to digital learning and teaching. J. Sib. Fed. Univ. Humanit. soc. sci., 12(9), 1682–1693. DOI: 10.17516/1997-1370-0486.

© Siberian Federal University. All rights reserved.
* Corresponding author. E-mail address: samuel.nowakowski@univ-lorraine.fr; ippssfu@mail.ru
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).

Introduction

"It's our dreams that turn us from machines into full-fledged human beings." K. Dick

Why this quote? It is from one of Philip K. Dick's last lectures. You know K. Dick: you have certainly seen Minority Report, Blade Runner (obviously the 1982 version), Total Recall, … In his end-of-life lectures, K. Dick poses the question of the place of man and technology, of what makes us human beings. In this world of technology, this question is central today: not to ask it is to take the risk of entrusting to the machines our capacity to decide, to think, … our humanity … and without possible turning back (Dick, 2015)!

First, a little digression, with a book entitled "How do forests think?": "As we settled down to sleep under the thatch of our hunting camp on the foothills of the Sumaco volcano, Juanicu warned me, 'Sleep on your back! If a jaguar comes, he will see that you can look back and he will not bother you. If you sleep on your stomach, he'll think you're aicha, a prey or literally meat in Quichua, and he'll attack you'" (Kohn, 2017). He said that if a jaguar sees you as a being able to look at him in return — a self like him, a you — he will leave you alone. But if he came to see you as a prey — one of them — you could end up dead meat. The way other beings see us is important.
[…] and more especially and more strongly in a world in which digital interfaces, dematerialization, automation, and so-called tools of artificial intelligence aim to distance the human or eliminate the relationship with humans! The way other beings see us is important. What would happen if we took the full measure of this idea? How would this affect our understanding of society, culture, and the world we inhabit? How does this affect our understanding of the human, since in this world beyond the human, we sometimes find things that we prefer to attribute only to ourselves?

Why tell you this? Because the question that animates me, and must animate us today, is what we are going to do with all the tools, more and more powerful, that we are developing. Recall, with Plato or Bernard Stiegler, that any tool is a pharmakon, both remedy and poison, emancipator and destroyer, and our powerful digital tools are no exception!

Destroyers, these tools are: I offer as proof their ecological impact. To take a concrete example, each consultation of a web page results in the emission of 2 g of greenhouse gases into the atmosphere and the consumption of 3 centilitres of water. Globally, the internet is a sixth continent that "weighs" annually 1037 TWh of energy, 608 million tons of greenhouse gases, and 8.7 billion cubic meters of fresh water — about two times the footprint of France! (A back-of-the-envelope sketch of how these per-page figures scale appears below.) In addition to being colossal, the environmental impacts of digital are multiple: depletion of non-renewable natural resources; pollution of the air, water, and soil, inducing health impacts and contributing to the destruction of ecosystems and biodiversity; emissions of greenhouse gases that contribute to climate change; etc. And they reinforce each other. It is therefore essential to adopt a multi-criteria approach when studying these impacts and not to limit oneself to a single environmental indicator. These environmental impacts occur at each stage of the life cycle, but they are concentrated especially in the manufacture of equipment and its end of life. It is therefore essential to extend the active life of equipment by promoting its reuse and by pushing back the inevitable stage of recycling as far as possible. It is therefore essential to put the human in the center: not to subordinate intelligence to the tools, and not to forget the human in the implementation of our tools.

Humanity and digital

Human beings obviously have a genius for designing, manufacturing, and using tools. Our innate talent for technological invention is one of the main qualities that distinguish our species from others and one of the main reasons why we have taken such a hold on the planet and its destiny. But if our ability to see the world as a raw material, something we can alter and manipulate as we please, gives us tremendous power, it also carries great risks. The first risk is that we ourselves become a technical instrument, optimized and programmed, a technology among others. The anxiety of seeing machines attack our humanity is as old as the machines themselves. Max Weber and Martin Heidegger have described how a narrow and instrumentalist vision of existence influences our understanding of ourselves and shapes the kind of societies we create. With our smartphones and other digital devices ubiquitous, most of us are permanently connected to the computer network (Internet Live Stats, 2019).
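To make the scale of the per-page figures quoted above concrete, here is a minimal back-of-the-envelope sketch in Python. The 2 g of greenhouse gases and 3 centilitres of water per page view come from the text; the traffic volume is a purely hypothetical assumption chosen for illustration.

```python
# Back-of-the-envelope scaling of the per-page footprint figures quoted above.
# The per-view constants come from the text; the traffic volume is hypothetical.

CO2_PER_VIEW_G = 2.0       # grams of greenhouse gases per page consultation
WATER_PER_VIEW_CL = 3.0    # centilitres of water per page consultation

views_per_day = 10_000     # hypothetical traffic for a mid-sized learning platform
views_per_year = views_per_day * 365

co2_tonnes = views_per_year * CO2_PER_VIEW_G / 1_000_000   # grams -> metric tons
water_m3 = views_per_year * WATER_PER_VIEW_CL / 100_000    # centilitres -> cubic meters

print(f"{views_per_year:,} views/year -> about {co2_tonnes:.1f} t of greenhouse gases "
      f"and {water_m3:.0f} m3 of water")
# 3,650,000 views/year -> about 7.3 t of greenhouse gases and 110 m3 of water
```

Even modest traffic thus accumulates a footprint measured in tons and cubic meters per year, which is one reason the authors insist on a sober, multi-criteria view of digital infrastructure.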
The companies that control the network seek to know as much as possible about their users, so as to control their senses and even their thoughts through the applications, sites, and services that they "propose". At the same time, the proliferation of connected objects, networked machines, and devices, from the home to the workplace, immerses us ever more in computer environments designed to anticipate our needs (Une Rentree Pour Tout Changer, 2019).

There are of course many advantages to an existence that is more and more mediatized. Many activities, once difficult or lengthy, have become easier, requiring less effort and reflection, with the risk of losing, if we are not careful, our ability to act on our own in the world. However, by transferring the initiative to computers and software, we give control of our desires and decisions to programmers and the companies that employ them. Already, many people rely on computer programs to choose which movie to watch, which meal to cook, which news to follow, even which person to meet!

Why think when you can click? By giving such choices to strangers, we inevitably open ourselves to manipulation. Since the design and operation of algorithms are almost always hidden from us, it can be difficult, if not impossible, to know whether the choice made on our behalf reflects our own interests or those of corporations, governments, or other third parties. We want to believe that technology strengthens our control over our lives and circumstances, but used without consideration, technology is just as likely to turn us into the puppets of those who master and deploy it (Zamiatine, 1971). Thomas Hughes' "technological momentum" is a powerful force, and opposing this force becomes possible only when we have a keen awareness of how technologies are designed and used. If we do not accept this responsibility, we risk going from the status of creator to that of creature.

What will become of us when we outsource calculation to "rational" machines in an effort to rid ourselves of the uncertainties of human "reason", which is fallible, unpredictable, submissive, subject to feelings? This reduction of human contingencies to sequential protocols makes it possible to associate economic rationality with algorithmic rationality. Philip K. Dick expresses it very well: "To become […], for want of a more appropriate term, […] an android, [which] means, […] to be transformed into an instrument, to let oneself be crushed, manipulated, made into an instrument without one's knowledge or consent — it's the same thing. But you cannot turn a human into an android if that human tends to break the law whenever he has the opportunity. Androidization requires obedience. And, above all, predictability. It is precisely when the reaction of a given person to a given situation can be predicted with scientific precision that the doors of the Trojan horse are opened wide. For what purpose would a flashlight serve if, when the button is pressed, the bulb only came on once in a while? Every machine must work without fail to be reliable. The android, like any other machine, must obey to the letter (Dick, 2015)." It would then be "like Ray Bradbury's story in which a Los Angeles man discovers in horror that the police car chasing him has no driver — and that he pursues it of his own accord!
What is frankly horrible is not that the car has its own tropism in pursuing the protagonist; it is the fact that inside the car there is a void. An empty place. The absence of something vital — that is what is horrible!" "The person, once gone, cannot be replaced in any way. Whatever our feelings about her, we cannot do without her. And once gone, nothing can make her come back. What's more, if this person is turned into an android, she will never come back to the human state either (Dick, 2015)."

What is the human in a world increasingly machinized? What crossroads is this, where we must decide alone which path to take, at the price of the dreamed promise of the other way? Remember that there is never "too much human" in a world of dehumanized multitudes. So how do we find a way out?

Find a way out

Let us take these words from Etienne Klein: "It's not because there are frogs after the rain that we have the right to say that it's raining frogs." But why? Because today, big data is erected as a means of access to ultimate knowledge! Indeed, with learning analytics for prediction and profiling in artificial intelligence, by inducing behavioral regularities, modelling consumer behaviors, and identifying patterns, we infer laws that we consider general or even universal, even though they are only the digest of what has already been given! Whereas human thought alone can predict the existence of new dimensions of reality!

Let's take a few examples! It is not thanks to big data but thanks to the equations of particle physics that we were able to predict the existence of the Higgs boson and allow its detection in 2012! It is not thanks to big data but thanks to the equations of gravitation that we were able to predict the gravitational waves that were detected in 2016, a century after their prediction by the general relativity formulated in 1916 by Einstein. Thus, thought alone makes it possible to go beyond the limits of the observable world, the empirical world, the world as it is given to us. Thought cannot be dissociated from our nature as situated and experimenting beings. The most demonstrative example is that of Einstein, who proceeded by "thought experiment", that is to say, by experiments capable of keeping the empirical world at a distance and prolonging, in a sort of elsewhere, the implications of a theory: what would happen in this or that situation that I can imagine, if this physical law were really accurate? What would the equations say if they could talk?

Big data is obviously a fantastic opportunity, but it may lead us to look only at correlations between the data that are available, and a correlation is not the same as a cause-and-effect relationship. The moral of this beautiful story is that a theory allows new data to emerge, but the reciprocal is not true. Data do not always make it possible to bring out a theory leading to understanding. So, to all the data gurus, to all who think they can read our future in the coffee grounds of algorithms, remember that without thought in action, nothing that you set up as new gods would exist! Mechanical intelligence, programmed and calculated, is in no way comparable to human intelligence.
Collecting and processing masses of data, these so-called intelligences cannot replace the ability of human thought to go beyond frames, to look beyond the hill blocking the horizon.

A question of education — towards a reinvention?

Understanding all of this necessarily involves learning, listening to nature and to others ("we always learn alone but never without others", Philippe Carré). Just as science is the most effective method we have found to understand the world, and democratic modes the best way we have found to organize collective decision-making, education must be based on tolerance, debate, rationality, the search for common ideas, learning, listening to the opposite point of view, and awareness of the relativity of one's place in the world: being aware that we can be wrong, retaining the possibility of changing our mind when we are convinced by an argument, and recognizing that views opposed to ours could prevail.

Every step forward in the scientific understanding of the world is also a transgression of what came before. So scientific thought always has something subversive, something volutionary (a term that I borrow from Alain Damasio, because a revolution always returns to its point of departure). Whenever we redraw the world, we change the very grammar of our thoughts, the frame of our representation of reality. To be open to knowledge is to be open to the subversive. Unfortunately, in school, on the contrary, science is often taught as a list of "established facts" and "laws" or as training in problem-solving. This way of teaching betrays the very nature of scientific thought (Le Guin, 1974). We must teach critical thinking, not respect for textbooks. We must invite students to question preconceptions and teachers, not to believe them blindly. Science and thought must lead us to recognize our ignorance, and to see that in "others" there is more to learn than to dread; that the truth is to be sought in a process of exchange, and not in certainties or in the common conviction that "we are the best". Teaching must therefore be the teaching of doubt and wonder, and we must be careful to put our tools at its service.

Reinventing education? Towards openness, the participative, the agile, the connected, and the human

Martin Luther King said, in his sermon delivered in New York on April 4, 1967: "When machines, computers and the quest for profit are more important than people, the fatal triptych of materialism, militarism and racism is invincible." What could be more premonitory, half a century away from a world in which the machine and materialism have never been so overwhelming? So I too, in this present, in the France of today, have a dream!

I dream of a society in which industry, economy, democracy, and education will be conceived in symbiosis. I dream of seeing people living in green cities, planted with bamboo oases that will purify the waters and produce biomass for an industry that is local and adapted to needs. Participatory workshops and FabLabs will allow everyone to upgrade their skills, to learn, to learn to learn and to accompany others, to repair objects rather than replace them, to produce what is needed without resorting to massive industrialization. I dream of a society in which we will have fewer objects.
I dream of a measured and intelligent use of the internet: an internet whose resources will be re-localized and adapted to needs, an internet which will allow us to share vehicles, various tools, and any device with intermittent uses. I dream of digital resources that will be local, "populated by artificial intelligences" that accompany us, adaptable and benevolent. I dream of a sustainable internet, responsible and respectful of humans, of life, and of the Earth. I dream of a society where renting phones and computers rather than buying them will encourage manufacturers not to promote planned obsolescence. A society in which the bulk of goods would be made from recyclable materials, all within a circular economy. I dream of a society that can reduce, reuse, recycle, repair, rent, and share, in a logic of interconnectivity and preservation of diversity: a diversity fostered by the education of a conscious citizen, enlightened, responsible, and able to reject the clutches of the Silicon Valley engineers, advertisers, and media directors who seek to make people more vulnerable, impressionable, and receptive to standardized thinking.

So, I dream of a school where, from their early years, children are taught cooperation in addition to mathematics, the digital humanities in addition to grammar and history, and the art of communicating better with others, expressing their needs, and resolving conflicts, instead of competition and coercion. I dream of an education that accompanies the citizen throughout life in their understanding of the digital and its stakes. I dream of a school that supports the construction of a digital culture accessible to all, facing head-on the difficulty of uses that have become part of our professional and personal lives. I dream of a school that, beyond familiarity, recalls the foundations of the digital sciences and the human sciences. So, at a time when we agree on the importance for every citizen of having a critical spirit in order to escape all forms of obscurantism, I dream of a school that gives the keys to understanding the issues related to the digital, so as to avoid falling into a vision of its contributions that is either too angelic or too demonizing. I dream of answers, of paths, of alternative ways. I dream, with Ivan Illich, of a convivial society "which gives to man the possibility of exercising the most autonomous and creative action, using tools less controllable by others. Productivity is conjugated in terms of having, conviviality in terms of being" (Illich, 1973). I dream of a universe-city, imagining a friendly university society open to all, without distinction of sex, age, or origin. I dream of a world in which one considers, with Deleuze, that "[…] to make your questions, [it must be] with elements coming from everywhere, from anywhere". I dream of thinking up the actions that will create the conditions for the inclusion of the university in the city: a connected university, which makes it possible to "think, in things, among things, [that makes it possible] to make rhizome, and not root", a university that "grows between and among other things" (Deleuze, Parnet, 1996). I dream of a universe-city that grows where you do not expect it, in the interstices. I dream of doing universe-city, of working "from" and not "beside": of working from society, from its issues, from the needs and expectations of students, and not "beside" the stakes of the beginning of the 21st century.
We must be careful not to "substitute the awakening of education for the awakening of knowledge", so as not to "stifle in man the poet, [and] freeze his power to give meaning to the world". I dream of the university as an active part of the city: welcoming the student, giving them a place, guiding and accompanying them. I dream of the universe-city open onto the city, giving a role to the student by adapting to and accompanying everyone's project within the city. To universe-city is to consider that learning and the construction of the personal and professional project do not take place only in formal teaching but in a university involved in the city, in which a true university citizenship applies. To universe-city is to be free, not to live in a flock while thinking yourself free! To universe-city is to create the conditions for the emancipation of the individual! So my idea for France is to dare the universe-city, to remember that people are more important than "machines, computers and the quest for profit"!

Our answer in a digital learning platform

Our answer is called KOALA, for "KnOwledge Aware Learning Assistant". Based on the observations above, and in the idea of giving meaning to the action of each person, we make ours Van Jacobson's statement of 1995: "How to kill the internet? It's easy, just invent the web." Educational platforms are not immune to this. We sought to introduce a new logic for defining authority, states, and the exchange of roles between the actors of the educational act. Our guiding principle is to work on the notion of commitment, by proposing a space that facilitates engagement in all its forms: cognitive, behavioral, social, and emotional. The goal is then to lead to emancipation, the re-enchantment of the educational act, and well-being in learning and teaching (Smolyaninova, Ovchinnikov, 2010).

Based on these observations and on the experience of previous projects and the challenges of the early 21st century learning society, we have developed a new platform that allows us to return to the sources of the Internet. KOALA is a space for facilitating engagement by installing a new dynamic of occupation of the learning and teaching space. KOALA promotes accessibility, continuity, and porosity where other platforms enclose. We thus have a vision that goes from centric to acentric, from asymmetrical to symmetrical. KOALA leaves a place for the periphery and takes another approach to authority. Access to educational resources, the realization of learning activities, and the exercise of the educational relationship are built within the spaces that each person wants to invest and will appropriate in his or her own way.

[Fig. 1. KOALA-LMS (Learning Management System) is now at the following address: https://www.koala-lms.org/fr/]

KOALA is thus a living space that guarantees socially and physically situated actions. It promotes action that fits into a lived and proxemic space and redefines the articulation between the private/individual and the collective at the level of each user. Moreover, KOALA allows for indeterminacy in the conditions under which an action is carried out: invention, freedom, autonomy, an enabling environment, and the potential of the situation.
In order to provide an answer to our questions, KOALA thus opens up a renewed approach to digital support that allows us to integrate the recommendation of content adapted to the needs, contexts, and learning and training objectives of each person. KOALA is then an ecology of the learning experience (Fig. 1).

Conclusion

You know the story of Dr. Frankenstein. Dr. Frankenstein "makes" a man; he "manufactures" him. An act so frightening that he himself abandons this creature, who has no name. The monster, as we call it, is a monster because it is both so similar to us and so different. The monster is thus a creature who will undertake his education alone, but will sink into violence when the abandonment by his creator is combined with the stupidity of men. But what a foolish enterprise, to want to make a human! Yet this is what we strive for each time we want to "build a subject by adding knowledge," or "make a student by stacking knowledge." We all want, more or less, to "do something with someone" after "making something". But like Dr. Frankenstein, we do not always quite understand that "something" and "someone" are not quite the same thing, and we often do not know that this confusion condemns us, despite all the goodwill we can deploy, to failure, to conflict, to suffering, and sometimes even to misfortune.

And in this world of connected machines, of so-called "intelligent machines," "the question of what it means to be human in the face of machines is no longer so trivial. We should not be so afraid of robots as we are afraid of becoming robots ourselves. We need to introduce human values into technology rather than let technology introduce its values into our humanity. For this, one must be able to measure when a technology is dehumanizing, or when humans do not think or behave as humans" (Meirieu, 2017).

In addition, today, some voices are raised, relayed by the higher authorities of power, to say that learning can be explained only through measurable and optimizable mechanisms, through an optimizable and modelable procedural decomposition, through an approach of optimal control of learning: forgetting the child as a human in the making, and evacuating the political dimension of choices in terms of education. Education is not just a process to be optimized. Education is a social milieu; education is a political project for the elaboration of society. What are we? A brain only, or a complex whole in perpetual construction, which learns? To educate is not to manufacture! To educate is not to equip. To educate is perhaps captured in a few words in this text of the Comité Invisible: "Do not wait any longer. Do not hope any longer. Do not let yourself be distracted; unseat. Break. Send the lie back to the ropes. Believe in what we feel. Act accordingly. Force the door of the present. Try. Fail. Try again. Fail better. Be proactive. Attack. Build. Be defeated, perhaps. In any case, overcome. Go your own way. Live, then. Now."

Adopt KOALA, and join the experience!

References

Deleuze, G., Parnet, C. (1996). Dialogues. Flammarion. 179 p.

Dick, P.K. (2015). If This World Displeases You… and Other Essays [Si ce monde vous déplaît… et autres essais]. Éditions de l'Éclat.

Illich, I. (1973). Conviviality. Chapter II.

Internet Live Stats (2019). Available at: https://www.internetlivestats.com/ (accessed 25 July 2019).

KOALA-LMS (2019). Available at: https://www.koala-lms.org/fr/ (accessed 25 July 2019).
Kohn, E. (2017). How Forests Think [Comment pensent les forêts]. Éditions Zones Sensibles.

Le Guin, U. (1974). The Dispossessed [Les dépossédés]. Robert Laffont. 301 p.

Meirieu, P. (2017). Frankenstein Pedagogue [Frankenstein pédagogue]. Éditions Sociales Françaises.

Smolyaninova, O., Ovchinnikov, V. (2010). University electronic library for human resources development in Siberia: school content. In 9th European Conference on E-Learning (ECEL 2010), 828–835.

Une Rentrée Pour Tout Changer [A Return to Change Everything] (2019). Available at: http://multimedia.ademe.fr/infographies/infographie-travail-ademe-logo-qqf/ (accessed 25 July 2019).

Zamiatine, E. (1971). We [Nous autres]. Gallimard. 200 p.

From Digital Technologies to a Renewed Approach to Digital Learning

Samuel Nowakowski and Guillaume Bernard
University of Lorraine, Loria UMR7503, 54000 Nancy, France

We live in a world where digital interfaces, dematerialization, automation, and so-called artificial intelligence tools strive to reduce relationships between people to a minimum. Yet how other beings see us is very important. What will happen if we take the full measure of this idea? How will it affect our understanding of society, culture, and the world in which we live? How will it change our perception of the human, especially since in this world beyond the human we sometimes find what we prefer to attribute only to ourselves? What affects education, learning, and teaching? Having explored the field outlined by these questions, we give an answer: KOALA (KnOwledge Aware Learning Assistant), a new educational online platform that goes back to internet sources. Symmetrical and acentric, KOALA analyses digital materials in the humanities and offers answers to the challenges facing education in the 21st century.

Keywords: computer environment, LMS, human values, digital support, teaching and learning.

Research specialties: 19.00.00 (psychological sciences); 13.00.00 (pedagogical sciences).

work_dbicdw24cnfdjclw345f6asqz4 ----

PEMM Charter - for sharing (2).pdf
Charters are amended as necessary throughout the project lifecycle to document major changes and note when the "Built by CDH" Software Warranty and "Built by CDH" Long Term Service Agreement take effect, and they serve as part of the CDH project archive. CDH charters and their planning documents exist in several forms, as we have refined them over the years and tailored them to the several types of projects we have supported. For more about CDH project management, including the charter process, visit: https://cdh.princeton.edu/research/project-management/

Cite this document: Belcher, Wendy Laura, Rebecca Sutton Koeser, Rebecca Munson, Gissoo Doroudian, and Meredith Martin. CDH Project Charter — Princeton Ethiopian Miracles of Mary 2019-20. Center for Digital Humanities at Princeton. 2019. http://doi.org/10.5281/zenodo.3359178

PEMM Charter (2019-20)

Part I: Project Overview

Stories have been told for almost two millennia about the Virgin Mary, the mother of Christ, and the miracles she has performed for the faithful who call upon her name. One of the most important collections of such folktales is the body of over 700 Ethiopian Marian miracles, written from the 1300s through the 1900s, in the ancient African language of Gəˁəz (also known as classical Ethiopic). These story collections, called the Täˀammərä Maryam (The Miracles of Mary), are central not only to the ancient church liturgy of Ethiopia, but to the daily felt and religious life of 50 million Ethiopians and Eritreans.

Princeton University has in its Firestone Library one of the largest and finest collections of Marian miracle manuscripts anywhere in the world outside of Ethiopia, with over 130 codices and hundreds of textual amulets. Worldwide, at least 100,000 Täˀammərä Maryam manuscripts exist, some with just a handful of stories, some with hundreds, and many with different versions of the same stories. While the Täˀammərä Maryam is one of the most important African archives of texts, basic information about it and its stories is lacking; as a result, scholars can authoritatively state almost nothing about them. How many are there? When was each written? What themes do they have? Have these African stories grown and changed across regions, languages, and periods? The Princeton Ethiopian Miracles of Mary (PEMM) project will collect and collate information about these hundreds of stories across hundreds of manuscripts as the basis for an open access resource that will enable researchers and Ethiopian community members around the world to conduct in-depth research on this vital corpus. Wendy Laura Belcher, professor of African literature in the departments of Comparative Literature and African American Studies, will serve as the project's Principal Investigator (PI). PEMM was begun with a Center for Digital Humanities at Princeton (CDH) Dataset Curation grant.

Description and Objectives

With the guidance of the CDH, the PEMM project team will collect and collate data about hundreds of Marian miracle stories in hundreds of Ethiopian manuscripts. Our aim is to enable computational analysis of this vital corpus of African folktales and to generate answers about their number, dating, origin, provenance, themes, recensions, translations, sources, placement, and diachronic change. The CDH and PEMM will design a robust data structure to migrate, store, connect, validate, and query the data (a rough illustrative sketch of one possible shape for such a structure is given after the research questions below).
We will discuss a preliminary web interface with sample data visualizations that will make that data available to scholars in the United States, Europe, and Ethiopia; this interface may not be possible in AY20.

Relevant Resources and Projects

PEMM builds on the work of previous scholars, using resources created by others: the most important for PEMM are the Macomber Handlist and the finding aids of the Princeton University Rare Books and Special Collections. We hope to reference the manuscripts in the British Library. We will also make use of and share data with both the Oxford Cantigas de Santa Maria (CSM) database and the University of Hamburg Beta Maṣāḥǝft (Hamburg BM) database, with the intention of avoiding overlap as much as possible. For complete information about what has been collected and catalogued thus far about the Täˀammərä Maryam, please see Appendix A.

Research Questions

We are collecting and collating data in AY20 in order to answer three main research questions.

● How many Ethiopian Marian miracle tales are there? Scholars have not been able to arrive at an accurate number of Ethiopian Marian tales despite a century of labor on the issue. One scholar says 540, another says 643; a current database project (the Hamburg BM) has likely identified over 700. Meanwhile, Princeton has some stories in its manuscripts that appear on none of those lists. PEMM has access to what those scholars did not have: thousands of digitized manuscripts (instead of dozens) and sophisticated ways of curating and analyzing the data in those manuscripts (see below).

● What are the themes of the Ethiopian Marian miracle tales? Macomber's 1980s catalog provided keywords for some of the tales, but many of the terms were dated, insufficient, and inconsistently applied. By enhancing the dataset with better keywords, refining and standardizing keywords from a controlled vocabulary, and consistently applying them, PEMM will give scholars access to an accurate dataset of tale themes and ways of studying how tales correlate with those keywords. We will build a controlled vocabulary by combining and refining Macomber's handlist, the Hamburg BM, and the Oxford CSM, and by consulting the Index of Medieval Art.

● What is the origin of each Ethiopian Marian miracle tale? No one has clearly established which of the tales were originally from Europe or the Middle East. Some say only 33 of them, others say 75, but no one has done the work to be certain. This matters not only to Ethiopianists, but to scholars working on the European Marian tales. Our work correlating the Ethiopian Marian tales with the tales in the Oxford CSM database may enable scholars to discern patterns across, and analyze, indigenous, European, and Middle Eastern Marian tales.
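The charter leaves the eventual schema to the data-structure design work described above, but the research questions already imply a small relational shape: canonical stories, physical manuscripts, and the per-manuscript occurrences that join them. The following is a rough illustrative sketch only; all class and field names here are hypothetical, not PEMM's actual terms.

```python
# Illustrative sketch only: the charter does not fix a schema. All names below
# are hypothetical, chosen to show how canonical stories, physical manuscripts,
# and per-manuscript story instances could be related.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Story:
    """A canonical Marian miracle tale, independent of any one manuscript."""
    story_id: str             # e.g. a Macomber-style identifier
    title_english: str
    origin: Optional[str]     # "Ethiopian", "European", "Middle Eastern", or unknown
    keywords: list[str]       # themes drawn from the controlled vocabulary

@dataclass
class Manuscript:
    """A physical Täˀammərä Maryam codex or amulet."""
    manuscript_id: str        # e.g. a repository shelfmark
    repository: str
    date_range: tuple[int, int]   # earliest and latest plausible year

@dataclass
class StoryInstance:
    """One occurrence of a story in one manuscript (the join between the two)."""
    story_id: str
    manuscript_id: str
    folio_start: str
    position_in_codex: Optional[int]   # ordering supports questions of placement
```

In the Google Sheets approach described under "Significance for Digital Humanities" below, each of these record types would correspond to one sheet, with the identifier columns acting as cross-sheet lookup keys.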
● Fills a scholarly gap: African literature in general, and eight centuries of Ethiopian written literature in particular, are criminally understudied. PEMM will enable more scholars to do more research on African literature. It will also bring greater global visibility to this vital corpus through a web interface.

● Serves an underserved community: The number of digital humanities projects that focus on African literature is minuscule. Indeed, perhaps the only other initiative is the Programme in African Digital Humanities, 2018–2023, at the South African universities of Cape Town, Pretoria, Stellenbosch, Western Cape, and the Witwatersrand, which “aims to examine the current forms and practices of reading and digital publishing in order to encourage and support self-directed, digital literary enquiries in the South African humanities environments.” PEMM is part of increasing the number of digital humanities projects that focus on Africa.

○ For instance, the annual Digital Humanities conference in Utrecht for 2019 decided to have an Africa focus, working to give funding for Africans to attend and noting that African DH projects “cover the spectrum of DH topics in a somewhat different way than elsewhere.” However, only one panel was about Africa, “African Languages And Digital Humanities: Challenges And Solutions” (which has a linguistics focus), and only one of the paper presentations seemed to be about Africa: Isabelle Alice Zaugg's “Global Language Justice in the Digital Sphere: The Ethiopic Case,” which is “an instrumental case study of Unicode inclusion and the development of supports for the Ethiopic script and its languages.” Thus, the DH focus seems to be on methods, underscoring the need for a project like PEMM that focuses on literature.

● Collects scattered information in one place: Ethiopian literature has been the subject of study for some centuries. There are large repositories of Ethiopian manuscripts inside and outside of Ethiopia, and massive cataloging and digitizing projects have been underway for the past sixty years. But little of this information is available online, in one place, in English, for computational analysis.

● Focuses on literature: Most projects that focus on Ethiopian manuscripts do not attend to literature. They are linguistic or philological in nature, focusing on manuscripts as material objects, and tend to prioritize biblical books. PEMM is focused on African stories and their themes, providing a useful corrective to the overemphasis on influence and apparatus, and the underemphasis on African thought and creativity, in Ethiopian studies.

● Increases information about and access to stories: Of the tens of thousands of Täˀammərä Maryam in existence, only a few hundred have been catalogued, and only a few dozen of those have been catalogued with any detail, that is, naming the exact tales in that manuscript. PEMM will increase the number of catalogued Täˀammərä Maryam manuscripts.

● Provides a foundation for Belcher's book: Belcher works on Ethiopian literature in general and has a book in progress on the Täˀammərä Maryam, titled Ladder of Heaven: The Miracles of the Virgin Mary in Ethiopian Literature and Art. It is a book of literary analysis, which will appear with many gorgeous illuminations of the tales from Princeton's manuscripts. Given the dearth of information about these stories, PEMM will provide a necessary basis for the writing of this book.
Significance for Digital Humanities

The CDH's approach to this particular project will also serve to make existing tools and approaches more robust and more useful to Digital Humanities researchers who do not have the support of a development team.

● Google Sheets as a simplified relational database: We will develop and document a model for working with Google Sheets as a simplified relational database, exploring the possibilities offered by an exportable static site based on relational data exported from Google Sheets. Working with spreadsheets and Google Sheets is obviously not new for data work or for Digital Humanities.[1] However, it seems clear that there is a need for a data curation and management solution that sits somewhere between a spreadsheet and a relational database.[2] By applying CDH Development & Design Team skills and expertise, we will push these technologies forward in a way that will benefit others, including those doing data curation and graduate students working on their own Digital Humanities projects. Our approach will include structuring the data across multiple sheets as a simplified relational database, considering the spreadsheet as a user interface, providing enhanced functionality via scripting, and documenting the data structure and the implementation. We will also write scripts to automate data export, data validation, and reporting, tools which have the potential to be generalized for wider use.

[1] For example, see Matthew Lincoln's post on using Google Sheets as part of a Getty data migration project: https://matthewlincoln.net/2018/03/26/best-practices-for-using-google-sheets-in-your-data-project.html
[2] See, for instance, the popularity of products like Airtable or the existence of projects like NodeGoat.

● Furthers DH work with static website technologies: Using static website technologies, as championed by the Minimal Computing working group, is also not new for Digital Humanities, although it is new for a CDH-sponsored project supported by the Development & Design Team. Our commitment to development best practices and documentation will help further work being done by others to make static sites more accessible to scholars. In addition, for this project the possibility of sharing the results of our work through alternate means is particularly appealing.

● Takes advantage of new initiatives: Scholars have access to only a fraction of Ethiopian manuscripts, as most are in remote monasteries. Only a few have been digitized; some estimates put the number as low as 10 percent of all Ethiopian manuscripts. Fortunately, over the last decade we have seen a huge push to digitize, as Ethiopian manuscripts are globally recognized as an important endangered archive. PEMM takes advantage of this newly available archive.

● Strengthens partnerships with scholarly communities in Oxford and Hamburg by establishing a model for sharing data across different projects using different technologies.

Audiences

The audiences for this project are multiple and overlapping.

● Scholarly audiences for the data: Scholars of Ethiopian literature (mostly scholars in Europe and North America, Ethiopian and non-Ethiopian) will find this information useful for doing their own research. In particular, the three existing comparative projects will be interested in the data we produce: the Hamburg BM project, the Oxford CSM Database, and the Miracula Mariae project.
● Scholarly audiences for a public interface: Clerical scholars (including Ethiopian clerics); non-clerical scholars (scholars of Marian miracles); and students (undergraduate and graduate).

● Non-scholarly audiences for a public interface: Ethiopian priests interested in sources and themes for writing sermons.

Project Team

Project Director (a.k.a. Project PI): Professor Wendy Belcher
● Leads and champions project
● Learns enough about project's technical components to be able to describe it on a basic level
● Oversees, participates in, and delegates project work
● Attends regular project meetings (approximately twice a month during periods of active development)
● Supervises and guides Project Manager; keeps open line of communication; responds to emails and questions in a timely manner; alerts in advance of any disruptions or PI unavailability
○ Support for PM may take the form of writing and commenting on charter, helping with project team coordination, approving project workflows, approving project publicity, or other duties TBD in consultation with PM, Technical Lead, and CDH Project Manager and Project Coordinator
● If necessary, attends quarterly check-in with Technical Lead
● If necessary, submits quarterly data progress report to CDH
● Participates in acceptance testing on software development work
● Responsible for approving the work and progress of the data team
● Responsible for project budget, including overseeing payment of students
● Responsible for final project summary of accomplishments

Project Manager: Evgeniia Lambrinaki
● Maintains regular communication with team members, partners, and groups engaged in project work
● Helps to design and implement project workflows with PI approval
● Schedules and facilitates project check-in meetings (including creating an agenda); captures meeting notes
● Tracks progress on project goals and outcomes; communicates with CDH on project progress and/or issues
● Prepares and updates project documentation
● Responsible for overseeing the day-to-day work of the data team
● Responsible for acceptance testing on software features
● Responsible for project publicity (project page, blog) with PI approval

Technical Lead: Rebecca Sutton Koeser
● Oversees design and implementation of project's technical aspects
● Acts as main CDH decision maker on project
● If necessary, modifies decisions about software tools and approach in order to more efficiently and effectively complete the project
● If needed, holds quarterly check-in meeting with PIs
● Has authority to make project decisions if PIs are unavailable
● Responsible for technical documentation at the conclusion of project

CDH Project Manager: Gissoo Doroudian
● Helps manage and coordinate development and design work
● Serves as a resource and point of contact for Project Manager
● Attends and helps create agenda for project meetings (with Project Manager)
● Helps design and implement project workflows
● Supplies periodic updates on project status and progress
● Decides in collaboration with Project Manager who will document meetings

User Experience (UX) Designer: Gissoo Doroudian
● Collaborates on data structure and architecture for project data and advises on configuration and customization for data entry user experience
● Recommends the appropriate types of data visualizations to try with the project data (diagrams/maps)
● Thinks through and conducts user research on access for target audiences specific to this project
● Consults on collaborative and iterative design for content structure and website architecture
● Helps iteratively design a usable and accessible interface

CDH Project Coordinator: Rebecca Munson
● Advises CDH Project Manager on coordinating development and design work
● Serves as a resource and point of contact for CDH Project Manager
● Advises on project workflows
● Supplies periodic updates on project status and progress
● When applicable, attends and helps to document project team meetings

CDH Developers: Rebecca Sutton Koeser, Nick Budak
● Develop or consult on data architecture and implementation
● Contribute to and review custom software developed for the project
● Document custom software and data architecture
● Write automated tests for custom software
● Experiment with and assess potential project technologies (e.g., Jekyll, Wax, Hugo) and advise Technical Lead on decisions
● Prototype and implement custom data visualizations
● Provide consultation and training on tools such as OpenRefine to empower Project Director and other project team members to work with the project data

Data Team (Student Researchers)
● Have skills in at least one of these languages: French, Italian, Amharic, Gəˁəz
● Catalog uncatalogued Täˀammərä Maryam manuscripts using Macomber handlist identifiers and Gəˁəz incipits

Budget

[ Budget available upon request. ]

Part II: Grant Year 2019-2020 Plans - Data

Data Status

Types of Data and Storage Format

The data are currently in structured text files, PDFs, Google Docs and Sheets, XML, and MS Word.

Past FY19 Data Work

Macomber Handlist cleanup. In 2018-2019, Belcher and Lambrinaki converted the handlist from a PDF (a scan of a hand-typed manuscript with many hand emendations in pen) into a structured text file. They cleaned up the file in Sublime Text, but there are still some errors, since the scan was incredibly garbled. This file will be converted into a Google Sheet titled “Macomber Canonical Stories.” Here is the structure of the text file:

● MAC###: Macomber Marian Miracle Canonical Story identifier (some of these stories are various parts of the same story, so the identifier for a story may be something like MAC034A)
● Title: English title from Macomber
● Text: Secondary source that discusses this particular Canonical Story
● English Translation: Translation of that story as it appears in one or two manuscripts (unfortunately, this is not currently a standard category in the structured file because most entries don't have English translations; not sure if this needs to be made standard in all entries before transferring to Google Sheets)
● PEth: Shelf number, beginning folio, and ending folio for where this particular Canonical Story appears in Princeton's RBSC Täˀammərä Maryam manuscripts (shelfmark and folios need splitting out). This field is often empty.
● EMIP: Shelf number, beginning folio, and ending folio for where this particular Canonical Story appears in the EMIP digital repository. This field is often empty.
● MSS: Shelf number and beginning folio only for where this particular Canonical Story appears in other repositories. Macomber uses abbreviations for these (see list).
● EMML: Shelf number and beginning folio only for where this particular Canonical Story appears in the HMML digital library.
● Keywords: Keywords from the Macomber handlist, with a few additional ones that Belcher and Lambrinaki came up with (problematic and needs to be updated; some categories are missing)
● Incipit: Current text in this field is garbage and will be deleted.
It will be globally replaced with Brown's list of incipits, matched up using the Macomber identifiers.

Earlier data work. Some earlier work was done correlating tales, manuscripts, and themes in Excel (such as matching Princeton manuscripts to Macomber based on data from the Princeton finding aid), but those files are now out of date.

Create a data structure. We designed a preliminary data structure for Canonical Stories and Story Instances, as well as for manuscripts and incipits.

Planned FY20 Data Work

Over the next year, we will:

● Migrate data. Project data (namely, Macomber's handlist), currently managed as a structured text file, will be migrated to Google Sheets by the CDH team, structured in multiple sheets based on the planned data architecture (see the Technical Design Plan section on Preliminary Data Structure, below) with data validation to share information across sheets. (A parsing sketch follows the Data Standards subsection below.)
● Enhance data. Project data will be enhanced by the CDH team by importing data provided by other teams (e.g., those from the Hamburg BM project and the Oxford CSM database).
● Match data. Use Oxford CSM story data to identify Canonical Stories that came out of Europe and the Middle East, not Ethiopia (to distinguish between foreign and indigenous).
● Configure data validation. Connect different sets of data without duplicating information (i.e., Gəˁəz Marian miracle manuscripts and Gəˁəz Marian miracle tales).
● Develop a controlled vocabulary. Wendy and Evgeniia will work with the Macomber, Oxford CSM, Hamburg BM, and Index of Medieval Art controlled vocabularies to develop a controlled vocabulary for PEMM Canonical Stories.
● Develop a simple incipit tool (or outsource). Create a tool for searching Macomber's standardized incipits so that research assistants can catalog manuscripts, preferably a dialog box in Google Sheets. Two challenges are homophones and recensions. The tool for searching must account for homophones, treating certain fidäl letters as exchangeable (say ሀ and ኅ), because scribes easily substitute one for the other. Also, when searching incipits, research assistants must be careful about using proper-noun searches. That is, a Canonical Story may give a name for the main character (e.g., Barok from Finqe [Phoenicia]), but a Story Instance in a particular manuscript may refer to him only as a “deacon” or as a “sinner.”

Data Standards and Capture Procedures

● Controlled vocabulary. We are developing and using controlled vocabulary lists (see above).
● Abbreviations. We are developing and will use abbreviations for repositories (e.g., BN, not Bibliotheque Nationale). The data currently in the structured text file uses codes for repositories holding Ethiopic manuscripts. The repository abbreviations will be used to generate brief repository records in the new Google Sheets data structure, which will be used to document current locations of materials.
● Organize data. We use Google Drive and Google Docs and Sheets to organize the project. After migration to Google Sheets, data will be stored in multiple sheets of a single document in order to allow data validation and autocomplete-style lookups for related data. A regular, automated export will be set up to export Google Sheets data to a GitHub repository. We use Slack to communicate about it.
● Validate data. In some instances, we have multiple researchers typing the same data, or cataloging the same manuscripts, to check accuracy.
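To make the migration step above concrete, here is a minimal sketch, in Python, of how the structured text file might be parsed into CSV files for import into the planned sheets. The one-"Label: value"-pair-per-line layout, the macomber.txt and canonical_stories.csv filenames, and the column selection are all assumptions for illustration; the actual CDH script will be written against the real file.

```python
import csv

def parse_records(path):
    """Yield one {label: value} dict per Canonical Story record.

    Assumes one "Label: value" pair per line and a blank line between
    records; the real handlist file's layout may differ.
    """
    record = {}
    with open(path, encoding="utf-8") as infile:
        for raw in infile:
            line = raw.strip()
            if not line:                     # blank line closes a record
                if record:
                    yield record
                    record = {}
            elif ":" in line:
                label, _, value = line.partition(":")
                label = label.strip()
                if label.startswith("MAC"):  # identifier line, e.g. MAC034A
                    record["Macomber"] = label
                else:
                    record[label] = value.strip()
    if record:                               # last record in the file
        yield record

def write_csv(records, path, columns):
    """Write selected fields to a CSV for import into one planned sheet."""
    with open(path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=columns,
                                restval="", extrasaction="ignore")
        writer.writeheader()
        writer.writerows(records)

stories = list(parse_records("macomber.txt"))
write_csv(stories, "canonical_stories.csv",
          ["Macomber", "Title", "Keywords", "Incipit"])
```

Splitting the output into one CSV per sheet mirrors the planned multi-sheet structure, so each file can be imported directly into its corresponding tab.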
Grant Year Objectives – Data

Main Outcomes/Deliverables:
● Dataset of Canonical Stories (publishable and citable)
● Dataset of 100+ MSS cataloged using list of Canonical Stories (publishable and citable)
● Import relevant data from Princeton Ethiopian manuscripts to Google Sheets
● Documentation of the new data structure
● Documentation of data entry workflow & processes[3]
● Cataloging of the MSS to provide to Hamburg BM project
● Linking to Hamburg BM IDs
● Export data from Google Sheets (possibly as XML) to share with Hamburg BM

[3] CDH developers expect to write code to do the migration, but the code itself is not a deliverable, since it's a means to an end and not something we are likely to reuse or generalize.

Possibly in Scope – Data:
● Linking PEMM Canonical Stories to other projects by cross-linking identifiers from the Hamburg BM and Oxford CSM database projects (European stories)
● Linking to Index of Medieval Art
● Linking MSS records to Princeton University Libraries (PUL) digital editions in DPUL (Treasures of the Manuscript Division)
● Linking MSS records to EMML, if the digital edition is available

Out of Scope – Data:
● Cannibal of Qəmər project with The Textual History of the Ethiopic Old Testament (THEOT)
○ 90 MSS with the story, carefully selected from different times and regions, and typed to run through the software to compare versions. Won't be done until October or November, but then all of that data will be available. May provide interesting information to help with this work, but out of scope for this phase.
● Describing characters, objects, places, and other subject matter that occur in stories
● Annotating PUL digitized images or crowd-sourcing

Project Needs – Data
● Research assistants with the language skills (either reading-level or at the level of recognizing the letters)
● Data structure for Google Sheets (see Technical Design Plan section on Preliminary Data Structure, below)
● Incipit tool
● Structured text file migrated into Google Sheets with customized data validation and formatting, including:
○ autocomplete on incipits with homophone search functionality, for matching particular stories to Canonical Stories
○ data validation for field types as appropriate, e.g., numeric values or sequential/increasing numbers for folio numbers within a manuscript (see the validation sketch after the Interdependencies list below)
● Regular check-ins with CDH staff, including dev team and project management time
● Training and support to query the data (e.g., OpenRefine training)

Concerns – Data: Risks
● Possible overlaps with Hamburg BM; we will use their data where possible, and hope to supply ours back to them to be incorporated, but want to avoid duplication of effort. In some cases we may use their data for checking and comparison. We have consulted them with this concern and they are willing to assist.
● Handling Fidel in Google Sheets and in search. One challenge is that the data will use Latin letters with diacritics (about 20 Unicode characters, like ə, ṭ, ṣ, ś, ä, ə, w, ḍ, ǧ, ḥ, ḫ, ḵ, č, ñ), as well as over 300 Ethiopic fidəl characters (also available in Unicode, like ወ፣ቀ፣ም፣ት).
● Different types of software that researchers use to handle and input Fidel
● Finding students with necessary language skills
● Changes in the scale of data may make Google Sheets unusable

Interdependencies
● Sharing / linking to data with other projects
● Manuscript access (digital or otherwise)
○ Firestone Library began digitizing its Gəˁəz manuscripts, prioritizing the Täˀammərä Maryam. All ten are now digitized and online in Digital PUL.
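As an illustration of the folio-number validation named under Project Needs above, here is a minimal sketch in Python. The CSV layout and the column names (manuscript, canonical_story, folio_start) are hypothetical, and the check assumes rows appear in catalog order within each manuscript.

```python
import csv
from collections import defaultdict

def check_folios(path):
    """Report Story Instances whose starting folio is lower than that of
    the preceding story in the same manuscript (rows are assumed to be
    listed in manuscript order)."""
    by_manuscript = defaultdict(list)
    with open(path, encoding="utf-8") as infile:
        for row in csv.DictReader(infile):
            by_manuscript[row["manuscript"]].append(
                (int(row["folio_start"]), row["canonical_story"]))
    problems = []
    for manuscript, stories in by_manuscript.items():
        previous = 0
        for folio, story in stories:
            if folio < previous:
                problems.append((manuscript, story, folio))
            previous = folio
    return problems

for manuscript, story, folio in check_folios("story_instances.csv"):
    print(f"{manuscript}: {story} starts at folio {folio}, "
          "before the preceding story")
```

A check like this could run locally during data entry or, as discussed in the Technical Design Plan, as part of continuous validation over the exported data.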
Data security considerations
● The structured text file is currently stored and shared via Dropbox; a copy has been added to the PEMM Google Team Drive for backup.
● This project does not include any personal or sensitive data.

Data management plan
● After data migration, Google Sheets will be the primary canonical data source; GitHub will be a secondary source for backup and experimentation.

Long-term preservation plan
● A released version of data exported from Google Sheets will be deposited with Zenodo or another repository for long-term secure storage, and also to make it citable.
● Where appropriate, data will be exported and shared with other relevant projects.

Future Plans

Future Data Work

● Write précis of Canonical Stories. A huge and difficult task will be writing short summaries of the 700+ indigenous Canonical Stories. Only those with an excellent understanding of Amharic, French, or Gəˁəz will be able to do this work. Maybe 100 can be done from stories available in English translation. Or, perhaps, if the keywords are good enough, no précis is needed?
● Track word length of Story Instances. This is a way of getting at the possibility of different recensions.
● Tag Canonical Stories with keywords. Another difficult task will be using the controlled vocabulary list to better tag Canonical Stories. Macomber did tag 642 of them with keywords, but many remain, and his list can be improved. Only those with a good level of Gəˁəz and English will be able to do this work.
● Identify new Canonical Stories. We need to give new identifier numbers, titles, themes, and incipits to Canonical Stories not in Macomber. We will use Hamburg BM identifiers where possible, but may need to do this a bit ourselves.
● Translate into Amharic. Translate titles, keywords, and the website into Amharic.
● Design and write static website. This can be done in the last year.
● Compare Cannibal of Qəmər transcripts. Once Steve, Jeremy, Jonah, and Ashlee complete typing up all 90 versions of the Cannibal of Qəmər tale, we will do computational analysis.

Eventually, PEMM would like to answer the following question:

● How did individual Ethiopian Marian miracle tales change over time and region? PEMM is conducting a textual history of just one of the tales, called the Cannibal of Qəmər. We already know it has three quite different recensions, but we are trying to determine how, where, and when the tale differs. With new dendrogram comparative software, we can begin to establish recensions and compare them statistically. To do this, we are collaborating with The Textual History of the Ethiopic Old Testament (THEOT) Project, which has worked out the methods and workflow necessary to carry out textual histories of Gəˁəz texts.
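A dendrogram-based comparison of the 90 transcripts could be prototyped along these lines: a minimal Python sketch that computes pairwise text similarity with difflib and clusters the versions hierarchically with scipy. The transcripts/ directory layout is an assumption, and this is a generic illustration of the technique, not THEOT's actual software or workflow.

```python
from pathlib import Path
from difflib import SequenceMatcher

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# One plain-text file per typed version of the tale (hypothetical layout).
paths = sorted(Path("transcripts").glob("*.txt"))
texts = [p.read_text(encoding="utf-8") for p in paths]
labels = [p.stem for p in paths]

# Pairwise distance = 1 - similarity ratio between two transcripts.
n = len(texts)
distances = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        ratio = SequenceMatcher(None, texts[i], texts[j]).ratio()
        distances[i, j] = distances[j, i] = 1.0 - ratio

# Average-linkage clustering over the condensed distance matrix;
# tight clusters in the resulting dendrogram suggest recensions.
tree = linkage(squareform(distances), method="average")
dendrogram(tree, labels=labels, orientation="right")
plt.tight_layout()
plt.savefig("qemer_dendrogram.png")
```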
Part III: Grant Year Plans - Interface

Grant Year Objectives – Interface

Main Outcomes/Deliverables:
● Preliminary interface (prototype):
○ Feed from Google Sheets → GitHub
○ Simple data viz, including a map
○ Showcasing images from PUL MSS via IIIF
○ Lightweight search and browse
○ Article or post to accompany visualizations

Possibly in Scope – Interface:
● Support for multilingual site capacity
● Preliminary web interface design

Project Needs – Interface:
● Geographic data to generate a map
● Script to pull data from Google Sheets into static website

Concerns – Interface: Risks
● Front-facing deliverables are dependent on progress in data work
● Working with fidäl and Amharic (multilingualism)
● Community push-back on making data on certain stories accessible and visible
● Prototyping with static website technologies may have more constraints than we expect; the CDH development team does not have substantial experience with static site technology
● Changes in the scale of manuscript data may impact static site performance
● If we shift from a static site to a database-driven site in a future phase of the project, there are likely to be changes in the site structure and architecture

Interdependencies
● Hamburg BM project
● Working with PUL IIIF + digitized content, but not annotating
● The Textual History of the Ethiopic Old Testament (THEOT) Project
● Rights on existing code for searching fidäl

Future Plans – Frontend:
● Distribution of static site with data and images on thumb drives
● Assess prototype and data work to determine how to expand, e.g., a database-driven site
● Apply for a follow-up Research Partnership grant for the next phase of project development

Part IV: Technical Design Plan

Data

In this phase of PEMM, we will migrate the data from the semi-structured text file into a Google Sheets spreadsheet comprising multiple structured and related sheets, with data validation configured to automatically connect data between different sheets within a single Google Sheets spreadsheet. The data for this project is highly relational and certainly could be implemented as a relational database, but given the phase of this project and the amount of data work still to be done, we chose Google Sheets. Sheets supports edits by multiple concurrent users and tracks versions, and working within a spreadsheet will allow project team members to manage and query the data more easily without developer intervention. We will model the data as if for a relational database (see Preliminary Data Model), but implement it in Google Sheets with an eye towards data entry usability and efficiency rather than a fully normalized database structure.

As a first step, we will prototype the Google Sheets structure based on the new data architecture and determine the appropriate data validation, formatting, and any other configuration that is useful and necessary. This will also give us a chance to experiment with Fidel characters to make sure we can get everything working as expected. We expect to write one piece of custom code in Google Apps Script to support homophone searching in Fidel for an incipit lookup,[4] which will enable project team members to match stories in a manuscript with Canonical Stories from the Macomber catalog based on unique words or phrases.

[4] The public search interface for the Hamburg BM project (https://betamasaheft.eu/as.html) includes an option for homophone searching, and their help text includes a list of orthographic variants. We will use their implementation as a reference and their list of characters as a starting point for our implementation.
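The charter plans to implement this lookup in Google Apps Script; as a language-agnostic illustration of the homophone idea, here is a minimal Python sketch. The variant table contains only the single exchangeable pair named in this charter (ሀ and ኅ); a real table would start from the Hamburg BM list of orthographic variants, and the sample incipit data is hypothetical.

```python
# Minimal sketch of homophone-insensitive incipit matching: exchangeable
# fidäl letters are collapsed to one canonical form before comparison.
HOMOPHONE_GROUPS = [
    ["ሀ", "ኅ"],   # named as exchangeable in this charter
    # ... further groups from the Hamburg BM variant list
]

# Map every letter in a group to the group's first letter.
CANONICAL = {letter: group[0]
             for group in HOMOPHONE_GROUPS for letter in group}

def normalize(text: str) -> str:
    """Collapse homophonous fidäl letters to a canonical letter."""
    return "".join(CANONICAL.get(ch, ch) for ch in text)

def find_matches(query: str, incipits: dict) -> list:
    """Return Macomber identifiers whose incipit contains the query,
    ignoring homophone differences."""
    needle = normalize(query)
    return [mac_id for mac_id, incipit in incipits.items()
            if needle in normalize(incipit)]

# Hypothetical usage: incipits keyed by Macomber identifier.
incipits = {"MAC0007": "ወሀሎ፡ በደብረ፡ ቅዱስ፡ ዐቢይ፡ አባ፡ ሳሙኤል፡"}
print(find_matches("ወኅሎ", incipits))   # matches despite ሀ vs. ኅ
```

The same normalize-then-compare approach translates directly to Apps Script, where the lookup could back an autocomplete dialog inside the spreadsheet.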
A prototype incipit lookup has already been created by Steve Delamarter and demoed to the team, which we could purchase, but we prefer to design and implement something simple based on project needs. This will give us in-house expertise to maintain and support the tool, and we can iteratively refine it if necessary. We also plan to document it as an example for others using Google Sheets for dataset work. If implementing the incipit search proves to be more difficult than anticipated, we will consult with Garry Jost or purchase the prototype.

Once the Project Director has agreed to the data structure and tested and accepted Google Sheets functionality, CDH developers will write a script to parse the structured text file and convert it into multiple CSV files for import into Google Sheets, which will create preliminary records for Manuscripts, Canonical Stories (i.e., those cataloged by Macomber), and Story Instances (a story as it occurs in a manuscript). Records for archival repositories that hold these manuscripts will be added manually to the spreadsheet by project researchers, since there are a small number and the structured text file does not supply the needed information. After data is migrated into Google Sheets, that will become the canonical data source for the project, and the project researchers can begin working on the data.

After the migration is complete, CDH developers will write a script to generate a regular, automatic export of the Google Sheets as CSV and/or JSON, which will be added to a GitHub repository (sketched below). The data in the GitHub repository will serve as both a versioned backup and as a data source for querying, visualization, and a prototype interface; it may also eventually be used to publish a citable version of the data via Zenodo or a similar service. The export will be powered by the “publish to web” functionality available in Google Sheets, if it is sufficient; otherwise it will be implemented with an existing Python Google API client to access the data. Additionally, if the Google APIs allow it without too much difficulty,[5] we will use revision information to credit the project team members who have made edits to the data as co-authors of the commit, using the GitHub co-author syntax,[6] as a way of making the contributions of project team members a visible part of the record of the data. We may also implement continuous validation and reporting on the data in GitHub, making use of continuous integration tools that are usually applied to software code in order to automate regular data validation.

[5] Documentation on the Google Drive API indicates this should be possible (https://developers.google.com/drive/api/v3/reference/revisions), but it's unclear how difficult it is.
[6] See https://help.github.com/en/articles/creating-a-commit-with-multiple-authors
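Here is a minimal sketch of that export path, assuming the “publish to web” CSV export URL proves sufficient. The spreadsheet ID, sheet gids, and contributor list are placeholders, and the commit is made with plain git via subprocess rather than any particular GitHub library.

```python
# Minimal sketch of the automated Google Sheets -> GitHub export.
# Assumes the sheet is published to the web; the spreadsheet ID, gids,
# and co-author names below are placeholders for illustration.
import subprocess
from urllib.request import urlopen

SPREADSHEET_ID = "PLACEHOLDER_SPREADSHEET_ID"
SHEETS = {"canonical_stories": "0", "story_instances": "123456"}  # name -> gid

def export_sheets(repo_dir: str) -> None:
    """Download each sheet as CSV into the data repository."""
    for name, gid in SHEETS.items():
        url = (f"https://docs.google.com/spreadsheets/d/{SPREADSHEET_ID}"
               f"/export?format=csv&gid={gid}")
        with urlopen(url) as response:
            data = response.read()
        with open(f"{repo_dir}/{name}.csv", "wb") as out:
            out.write(data)

def commit(repo_dir: str, editors: list) -> None:
    """Commit the refreshed CSVs, crediting sheet editors as co-authors
    via the GitHub co-author commit trailer (see footnote 6)."""
    trailers = "\n".join(f"Co-authored-by: {name} <{email}>"
                         for name, email in editors)
    message = f"Automated data export from Google Sheets\n\n{trailers}"
    subprocess.run(["git", "-C", repo_dir, "add", "."], check=True)
    subprocess.run(["git", "-C", repo_dir, "commit", "-m", message],
                   check=True)

if __name__ == "__main__":
    export_sheets("pemm-data")
    # The editor list would come from Drive revision data (footnote 5).
    commit("pemm-data", [("A. Editor", "editor@example.com")])
```

Run on a schedule, a script like this yields the versioned backup described above, and the committed CSVs become the input for continuous-integration validation checks.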
Interface – Prototype Website

If time permits, we will develop a prototype website as a proof of concept, which will allow us to experiment, try new technologies, and get familiar with the data and with working with Fidel characters. The ultimate goal is to know enough to decide how to proceed in the next phase of the project.

The GitHub data repository generated from the Google Sheets data will be used as a starting point for experimentation, creating a prototype static site[7] which could allow the project team to browse and search the data, and will give the development team a chance to become familiar with the data and with working with Fidel characters. We have chosen to work with static site technology because it should allow for quick prototyping and experimentation based on the data from the Google Sheets without making a heavy investment in a particular technology stack for the next phase of the project. We hope to experiment with the following static site technologies:

● Jekyll (https://jekyllrb.com/; implemented in Ruby and commonly in use for Digital Humanities projects)
● Hugo (https://gohugo.io/; implemented in Go; newer and more powerful than Jekyll)
● Gatsby (https://gatsbyjs.org/; implemented in JavaScript)
● Wax (http://marii.info/projects/wax; software for generating exhibit sites with IIIF, spreadsheets, and Jekyll)
● Elasticlunr.js (http://elasticlunr.com/; browser-based searching)

[7] Any static site code committed to GitHub will be put in a separate repository from the data, to allow the data to be easily deposited with Zenodo without including static site software or content.

The prototype website will be implemented with a responsive design that supports mobile devices, by choosing an existing theme to allow us to focus on the more innovative aspects of the project. Creating a site that is usable, accessible, and welcoming to the diverse audiences for this project will require user research; because this is a prototype website, we may begin that research during the project year to guide later phases of the project. The static site will be hosted on GitHub Pages as we prototype. If we determine we want a Princeton URL for the prototype site before the end of the current grant phase, we will request a hostname and possibly a virtual machine from OIT.

Alongside the static site development, and dependent on data delivery, CDH developers and UX designers will contribute to data visualizations and maps as appropriate to help answer the project research questions. These may or may not be part of the static site; they may be included in an essay to be published at the end of this phase of the project.

If time allows, CDH developers will experiment with internationalization, with the hope of making the prototype site available in both English and Amharic. This is not only something we're technically interested in, but also something we feel ethically challenged and compelled to do, based on Wendy Belcher's comment that no Ethiopic manuscript materials or data are currently available in Amharic. There are existing solutions for multilingual sites implemented with Jekyll in the Digital Humanities space, including notably The Programming Historian.

We are also interested in leveraging minimal computing principles and techniques to provide a version of the static site as a standalone package, including project data and any image content with permissions that allow redistribution, to be shared and distributed via alternate means, such as USB drive or inexpensive hardware.[8]

[8] For example, see Ed Summers's post about building an offline static site with React: https://inkdroid.org/2018/01/10/offline-react/

Preliminary Data Model
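The original charter presents a data model diagram at this point. As a stand-in, here is a minimal sketch, in Python dataclasses, of the entities and relationships the Technical Design Plan names: repositories hold manuscripts, manuscripts contain Story Instances, and each Story Instance points at a Canonical Story. Field choices beyond those named in this charter are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Repository:
    abbreviation: str              # e.g. "ZBNE", from Macomber's list
    name: str

@dataclass
class CanonicalStory:
    macomber_id: str               # e.g. "MAC0007"
    title: str
    keywords: list = field(default_factory=list)
    incipit: str = ""

@dataclass
class Manuscript:
    shelfmark: str
    repository: Repository
    century: Optional[float] = None    # e.g. 15.25, per this charter

@dataclass
class StoryInstance:
    canonical_story: CanonicalStory    # which tale this is
    manuscript: Manuscript             # where it occurs
    folio_start: int
    folio_end: Optional[int] = None
```

In the Google Sheets implementation, each dataclass corresponds to one sheet, and the object references become validated lookup columns between sheets.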
Part V: Deliverable Timeline

Summer 2019 (by September 1):
Data:
● Documentation & diagram describing data structure
● Google Sheets spreadsheet that implements data structure
● Data from Sublime Text file migrated to Google Sheets
Interface:
● None

Fall 2019 (through December 31):
Data:
● Incipit lookup with phonetic searching
● PUL finding aid data added to existing Google Sheets
● BM data added to existing Google Sheets
● Oxford CSM data added to existing Google Sheets
● Trained student researchers (at least orient to MSS and project, orient to Google Sheets; more if data structure is ready)
● Automated feed from Google Sheets to a GitHub repository
● Continuous integration for data validation on GitHub (nice to have)
● 40 Story Instances catalogued
Interface:
● None

Spring 2020 (through the end of the grant period):
Data:
● Data visualizations
● 20 manuscripts catalogued
Interface:
● Prototype static site, if sufficient data is available and time permits

Part VI: Grant Year Wrap-up

This section is completed after the grant year concludes. It describes the goals reached and outcomes of the project, and explains major changes and discrepancies with planned work. Continuing projects will include this in the charter for their subsequent project phase. Completed projects must provide it to the CDH Project Coordinator within one month of the conclusion of the grant period.

Part VII: Agreement

Project pause policy

To ensure that all projects receive sufficient and equitable development time, time-sensitive queries and requests must be addressed within two weeks of the initial (email) request. The PI is responsible for communication with the development team. If the PI does not respond to a task that has been indicated as time-sensitive by the CDH team within two weeks of the initial request, further project development will be paused until the project can be reasonably integrated back into the CDH development schedule.

Rights, Permissions, and Attribution

Site content and data will both be licensed under Creative Commons Attribution 4.0 International (CC-BY 4.0). If any of the datasets consist solely of factual data where authorship cannot be claimed, they will be licensed as CC0. Any software developed by CDH that merits release will be licensed under Apache 2.0. The Technical Lead will fill out an invention disclosure form in order to gain approval from the Office of Technology Licensing to release the code. Before approval is granted, the code will be owned by the Trustees of Princeton University.

Web Presence and Project Publicity

The PI will create a project page on the CDH website, keeping it up-to-date and accurate during the grant year. The Project Manager will submit at least two blog posts per year, to be published on the CDH website. The schedule for publishing blog posts will be determined in consultation with CDH staff. In the case of a public site launch, or similar event, the PI and PM will work closely with CDH and Princeton University Library staff as needed on publicity, communication, and outreach. Currently, the PI has a web page for the project at https://wendybelcher.com/african-literature/pemmproject/

Credit

All team members will be credited on the project's website and CDH project page.
The project's website will include a sponsorship statement (indicating the CDH as well as any other supporting groups, departments, and agencies) and a citation statement indicating how the project assets should be cited. The site will also list and link to other projects that contributed data.

Project PI ___________________________________________________________

CDH Faculty Director ___________________________________________________________

Date:

Appendix A: Relevant Resources and Projects

Data

Currently, the project data exists in seven separate, uncorrelated formats:

1. Macomber Handlist of Marian Miracles in the Ethiopian tradition (with identifiers for each of 642 Canonical Stories, translations and analyses, keywords, and shelfmarks and folios for Story Instances in about 200 manuscripts).
2. Brown's list of incipits for Macomber's Canonical Stories (typed up in fidäl and available in a Google Doc or Sheet; checked by Hamburg BM for accuracy; used to catalog stories in manuscripts).
3. Oxford CSM database controlled vocabulary for themes (700+ terms, which need to be cleaned up and merged with the Macomber Handlist themes, which in turn need to be updated; available in a text file).
4. Princeton RBSC finding aid for its ten Täˀammərä Maryam manuscripts (not yet cataloged with the Macomber Handlist identifiers; available in EAD XML). The project will also reference data and images from PUL digitized editions of relevant Ethiopic manuscripts held by PUL, managed by PUL and provided via the IIIF Presentation and Image APIs.
5. Hamburg BM identifiers and cataloging data for about 75 Täˀammərä Maryam manuscripts (cataloged with the Macomber Handlist identifiers; available in TEI XML).
6. Delamarter's list of 90 manuscripts across six centuries (1400s, 1500s, 1600s, 1700s, 1800s, 1900s) and five regions (north, south, central, east, and west Ethiopia) (in Google Sheets, uncataloged) at EMIP / HMML.
7. Archives of manuscripts from EMML, the Bibliotheque Nationale, and the British Library (none cataloged with the Macomber handlist).

Archives and Databases

Lots of information has been collected about the Täˀammərä Maryam, but it is spread across hundreds of obscure print catalogs, French and German articles, Italian books, and Gəˁəz monasteries. Each uses different numbering systems and tale titles, almost none have keywords, and many catalogs do not enumerate the tales in a manuscript, instead simply stating that something is a Täˀammərä Maryam and moving on. Thus, a crucial aspect of PEMM will be collating information using the following archives and databases.

Macomber Handlist

The most important PEMM resource is William Macomber's unpublished handlist of 642 Marian miracles, based on his study of 100+ manuscripts, including each story's title, translations, themes, and incipit (the unique first sentence of each story, used with medieval manuscripts as an identifier, as they have no titles). (This was quite an extraordinary accomplishment, before digital work in the humanities was common. It gives PEMM a huge leg up.)

● Macomber, William F. n.d. [1980s]. [Handlist of] The [Ethiopian] Miracles of Mary. Collegeville, MN: Hill Monastic Museum and Library, St. John's Abbey and University.

Lombardi Handlist

Chiara Lombardi disagrees with Macomber and thinks there are only 530 Canonical Marian Miracle Tales. Where available, and depending on time, we may include her identifiers in addition to Macomber's.
"Il Libro etiopico dei Miracoli di Maria (The Ethiopic Miracles of the Blessed Virgin)."BA thesis, Archeology, Università di Napoli. ● Lombardi, Sabrina. 2010. "Miracoli di Maria."MA thesis, Anthropology, Corso di laurea in lettere, Università di Firenze. Archive Catalogs An indispensable source for PEMM are extant catalogs of the ​Täˀammərä Maryam manuscripts. Macomber worked with about fifteen (see list of abbreviations). While most catalogs do not use Macomber’s numbering, these catalogs often provide manuscript dates, provenance, and folios of each story. The most important catalogs are as follows: 1. Princeton University Rare Books and Special Collections. ​ Belcher and Qesis Melaku catalogued the RBSC’s ten ​Täˀammərä Maryam ​manuscripts, although they did not use Macomber’s numbering for their stories. ll of ten of these manuscripts have been digitized and are available online. a. Treasures of the Manuscripts Division, Ethiopic manuscripts https://dpul.princeton.edu/msstreasures/catalog?f%5Breadonly_collections_ssi m%5D%5B%5D=Ethiopic+Manuscripts&q= b. Melaku Terefe, and Wendy Laura Belcher. 2009. Princeton Collections of Ethiopic Manuscripts, 1600s-1900s: Finding Aid. Princeton, NJ: Princeton University Library, Department of Rare Books and Special Collections, Manuscripts Division. ​https://findingaids.princeton.edu/collections/C0776 2. Hill Museum and Manuscript Library (HMML), St. John’s Abbey and University, Collegeville, MN. ​ ​This is a digital library; that is, it archives microfilms and digital images of manuscripts that are elsewhere. A huge project of HMML in the 1960s and 1970s was the Ethiopian Microfilm Manuscript Library (EMML), which microfilmed 8,000 manuscripts in monasteries and churches in Ethiopia. William Macomber, author of the handlist, was one of the directors for this project, and his handlist used many EMML manuscripts. HMML also hosts other digital collections of Ethiopian manuscripts, such as the Ethiopian Manuscript Imaging Project (EMIP, by Stephen Delamarter). At least 22 ​Täˀammərä Maryam ​are currently available for free online, but HMML has over 535 ​Täˀammərä Maryam ​manuscripts awaiting transfer 22 from microfilm to online digital form. In general, they only digitize microfilm if a client wants access to it and pays for it. Anyone can access the microfilm for free, but of course only on site, in Minnesota. Belcher is in correspondence with them about gaining digital access to more of their manuscripts. Developing a relationship with HMML may be useful to PEMM more broadly as they have digital copies of over 250,000 handwritten manuscripts in many languages from around the world. a. One can search parts of this collection at ​https://www.vhmml.org/readingRoom/ 3. British Library​.​ This archive has eighteen of the most splendid ​Täˀammərä Maryam manuscripts in existence, most looted from the royal scriptorium. Manuscripts from the center of power, from the clerics of the royal house, will be most useful for establishing the canon of stories, as they would be the ones to make it. The regions will have their own interesting trends; and perhaps have their own standard collections. Both are interesting, but tracking what is canon is vital. This collection has been catalogued and recently digitized, but has not used Macomber’s numbering. a. http://www.bl.uk/manuscripts/BriefDisplay.aspx?source=advanced b. Wright, William. 1877. ​Catalogue of the Ethiopic Manuscripts in the British Museum Acquired since the Year 1847​. London: British Museum. 4. 
4. UCLA Library. Belcher and Qesis Melaku also catalogued UCLA's collection, but before a big acquisition. It has very few Täˀammərä Maryam manuscripts, however.
a. http://digital2.library.ucla.edu/viewItem.do?ark=21198/zz0009gx3x

5. Catholic University of America, Institute of Christian Oriental Research (ICOR). Its catalog of its Ethiopic manuscripts is not yet online, but it has 12 Täˀammərä Maryam manuscripts.
a. Weiner Codex 233 – EMIP 2175 (17th cent.?): 178 miracles
b. Weiner Codex 260 – EMIP 2259 (1930–1974): 5 miracles
c. Weiner Codex 308 – EMIP 2340 (20th cent.): 11 miracles
d. Weiner Codex 335 – EMIP 2370 (18th cent.): 80 miracles
e. Weiner Codex 364 – EMIP 2399 (20th cent.): 31 miracles
f. Weiner Codex 366 – EMIP 2401 (20th cent.): 81 miracles
g. Weiner Codex 370 – EMIP 2405 (19th cent.): 3 miracles
h. Weiner Codex 395 – EMIP 2660 (20th cent.): 3 miracles
i. Weiner Codex 403 – EMIP 2668 (20th cent.): 36 miracles
j. Weiner Codex 428 – EMIP 2716 (18th cent.): 3 miracles
k. Weiner Codex 449 – EMIP 2737 (18th/19th cent.): 34 miracles
l. Weiner Codex 463 – EMIP 3238 (19th cent.?): 45 miracles

6. Other Archives. Many other archives exist in Europe and North America, including the Bibliotheque Nationale, the Vatican Library, the St. Petersburg Library, and so on.
a. Platt, Thomas Pell. 1823. A Catalogue of the Ethiopic Biblical Manuscripts in the Royal Library of Paris, and in the Library of the British and Foreign Bible Society: Also Some Account of Those in the Vatican Library at Rome, to which are Added, Specimens of Versions of the New Testament Into the Modern Languages of Abyssinia. London: R. Watts.

Oxford Cantigas de Santa Maria (CSM) database

This database, created by Stephen Parkinson of the Oxford University Centre for the Study of the Cantigas de Santa Maria and PI for the Oxford CSM database launched in 2005, is the most authoritative on the worldwide collection of Marian miracle stories, but is based largely on manuscript collections in Europe. The Oxford CSM database is "designed to give access to a vast range of information relevant to the processes of collection, composition and compilation" of the Marian miracle stories. It provides a fully searchable electronic version of Poncelet's list of Marian miracles (Index miraculorum B.V. Mariae quae saec. VI-XV latine conscripta sunt); brief descriptions of all European Marian miracles; and a controlled vocabulary list for Marian miracle story themes. Parkinson is eager to share any data he has, in return for better information about the Ethiopian Marian miracles for his database. This database will help us to identify which stories originated outside of Ethiopia and to develop our own controlled vocabulary theme list.
● http://csm.mml.ox.ac.uk/

Beta Maṣāḥǝft (Hamburg BM) Database

This database project is hosted by the Hiob Ludolf Centre for Ethiopian Studies at the Universität Hamburg in Germany, with Prof. Alessandro Bausi as the principal investigator. It is a very long-term project (2016–2040) with large German government funding to create a "virtual research environment" for collecting and managing data about "the predominantly Christian manuscript tradition of the Ethiopian and Eritrean Highlands." They are also using Macomber's Handlist to catalog Marian miracle stories, although they are not currently tagging themes.
They are also collating data on manuscripts (date, provenance, total folios), developing controlled vocabulary lists for people and places, and arriving at their own identifiers for Canonical Stories (since Macomber missed some). They are open access, and so we will be exchanging information as much as we can: the Hamburg BM giving PEMM its data in accessible forms, and PEMM giving the Hamburg BM whatever data we create.

● Beta maṣāḥǝft: Manuscripts of Ethiopia and Eritrea (Schriftkultur des christlichen Äthiopiens und Eritreas: eine multimediale Forschungsumgebung)
● "See the section on contributing and reusing data below."

Miracula Mariae project

This comparative project, with principal investigators Ewa Balicka Witakowska and Anthony John Lappin, is also called "Miracles of the Virgin: Medieval Short Narratives Between Languages and Cultures." Begun in 2015, it will compare six to ten individual Marian miracle stories across many languages and regions as a way of studying transmission (in image and text) (in Arabic, Armenian, Croatian, Dutch, Ethiopic, French, Georgian, Greek, Hungarian, Latin, Middle English, Old Icelandic, Old Swedish, Polish, Italian, Romanian, Slavonic, South Slavonic, Spanish, Syriac, and Ukrainian). It will also compare sociological and psychological aspects of the stories in different cultural contexts. The eventual aim is a database, but there is no evidence of this having been initiated yet.

"The overall tradition [of Marian miracle stories] offers a rich and vast body of literature, which, in its totality, has not been studied, and whose intertextuality offers a number of interesting problems and resources for further study. … The project will seek to analyse this complex set of interrelated traditions from three successive standpoints. The first will consider manuscript transmission and the physical distribution of miracle-tales; the second will compare collections and versions, in order to understand the cultural pressures that led to variation and re-elaboration of a set number of miracle-tales; and the third will look at the resulting texts from a narratological point of view, and aim to establish the limits and development of a story within a primarily manuscript culture. The first stage of the project will be a text-critical study of a selected number of miracles from the core collection, tracing their development across manuscripts, enabling sub-families and recensions to be established, and allowing the evolution of the collections to be precisely identified."
● https://hildefonsus.wordpress.com

Index of Medieval Art

This Princeton institute may have useful controlled vocabularies.

Appendix B: Planned FY20 Data Steps

Step 1. Create PEMM Canonical Stories Dataset. The Macomber handlist will be used as the basis for a Google Sheet titled PEMM Canonical Stories Dataset. It will have more data than the Macomber one. Below is its format (with invented data for one story to give a sense for the appearance of the data):

● PEMM miracle tale identifier: MAC0007
● Hamburg BM identifier: LIT3640Miracle
● Lombardi miracle tale identifier: 53
● Oxford CSM miracle tale identifier: 5 [From BM]
● Macomber title of Marian miracle tale: The monk of Dabra Qalǝmon who did not fast.
● Lombardi/Cerulli title of Marian miracle tale: Il monaco che non ha digiunato
● Dillmann title of Marian miracle tale: Monachi non ieiunant
● EMIP title of Marian miracle tale: The monk who did not fast
● Oxford CSM title of Marian miracle tale: none
● Colin title of Marian miracle tale: Le moine qui n'a pas jeûné
● Tsegaye title of Marian miracle tale: ያልበሰበው መነኩሴ
● Budge miracle tale edition/text [translator, title, page]: Budge, Hundred, item 86; Budge, Miracles, p. 42.
● Tsegaye miracle tale edition [translator, title, page]: None
● Tasfā Giyorgis edition [translator, title, page]: Tasfā Giyorgis, TM, item 77, pages 274-275.
● English translation of tale [translator, title, page]: Budge, Hundred, p. 87.
● French translation of tale [translator, title, page]: Colin, TM, p. 10.
● Amharic translation of tale [translator, title, page]: Tsegaye, TM, p. 402.
● Italian summary of tale [translator, title, page]: Cerulli, Il libro, p. 166.
● Princeton Ethiopic manuscripts with the tale and its beginning and ending folios: 47 (23r-24v)
● Other repositories' known manuscripts with the repository name, shelfmark, and beginning (and sometimes ending) folios: G-7; ZBNE 60-29; 61-27; CRA 52-35; 54 (91r); 55 (91r); SBLE 32-28; BM 2-31; 3-36; VLVE 267 (56v); 298 (23r, 50v); SALE 23-27; 43-29; LUE 30-40; 32-28; CBS-28; CCBE 951-28; AECE 1 (30r).
● EMIP (Ethiopian Manuscript Imaging Project) digitized manuscripts with the tale and their beginning and ending folios: [to come]
● EMML (Ethiopian Manuscript Microfilm Library) digitized manuscripts with the tale and its beginning and ending folios: 2275 (194r); 6938 (29v); 5520 (31v); 2378 (20v); 6640 (49r); 2060 (195r); 2066 (31v, 135v); 2802 (22v); 6196 (81r); 7543 (17v).
● Keywords/themes (controlled vocabulary list TBD):
● Story length (how many characters or words): 1,250
● Story number of total stories in manuscript (order): 231
● Story precis (100 words or fewer):
● Story translation (if in public domain):
● Story Instance illustrations no.: 4
● Story Instance illustrations characters: farmer, abbott
● Story Instance illustrations objects: bow and arrow
● Story Instance illustrations dating (if later): same
● Incipit 1 with manuscript shelfmark (imported from Brown): ወሀሎ፡ በደብረ፡ ቅዱስ፡ ዐቢይ፡ አባ፡ ሳሙኤል፡ ዘቀልሞን፡ ቤተ፡ ክርስቲያን፡ ሠናይት፡ በስመ፡ እግዝእትነ፡… ወኮነ፡ ውስተ፡ ዛቲ፡ ቤተ፡ ክርስቲያን፡ ስዕል፡ ዐቢይ፡ ወመንክር፡ (6938).
● Incipit 2 with manuscript shelfmark: ወሀሎ፡ አሐዱ፡ ብእሲ፡ መነኮስ፡ በደብረ፡ አባ፡ ሳሙኤል፡ ዘቀልሞን፡ ወያፈቅራ፡ ለእግዝእትነ፡ ማርያም፡ ወያነብብ፡ ወትረ፡ ተአምኆተ፡ መልአክ፡ ሌሊተ፡ ወመዐልተ። ወዝንቱሰ፡ ብእሲ፡ ኢይጸውም፡ ወኢይጼሊ፡ ወይትሜሰል፡ ከመ፡ አብድ፡ ወእንቡዝ፡ (2378).

Macomber's abbreviations for repositories are as follows (note that some repositories appear twice because Macomber is using catalogues of collections, not the actual collections):

● AECE = Abbaye d'En Calcat, Dourgne, France (but microfilmed for HMML)
● CBS = manuscript of the Berlin Staatsbibliothek (described by E. Cerulli)
● CCBE = manuscripts of the Chester Beatty Library (described by E. Cerulli)
● CF = manuscripts of the Biblioteca Nazionale of Florence (described by E. Cerulli)
● CL = manuscript of the Academy of Sciences of Leningrad (described by E. Cerulli)
● CRA = manuscripts of the d'Abbadie Collection of the Bibliotheque Nationale in Paris (described by Conti Rossini)
● DULE = Ethiopian manuscripts of the Duke University Library, Durham, North Carolina (but microfilmed for HMML)
● EMML = Ethiopian Manuscript Microfilm Library, of Hill Monastic Manuscript Library (HMML), St. John's Abbey and University, Collegeville, Minnesota
● G = manuscript of the Biblioteca Giovardiana in Veroli (described by E. Cerulli)
● GBAE = manuscripts of the Biblioteca Ambrosiana in Milan (described by S. Grebaut)
● GVE = manuscripts of the Vatican Library (described by S. Grebaut and E. Tisserant)
● HBS = manuscripts of the Staatsbibliothek in Berlin (described by E. Hammerschmidt)
● LUE = Ethiopian manuscripts of the Uppsala University Library (described by O. Lofgren)
● SALE = Ethiopian manuscripts of the Conti Rossini and Caetani Collections of the Accademia Nazionale dei Lincei in Rome (described by S. Strelcyn)
● SBLE = Ethiopian manuscripts of the British Library (described by S. Strelcyn)
● SGE = manuscripts of the Griaule Collection of the Bibliotheque Nationale in Paris (described by S. Strelcyn)
● SWE = Ethiopian manuscripts of the Seabury-Western Theological Seminary (described by W. F. Macomber)
● VLVE = Ethiopian manuscripts of the Vatican Library (described by A. Van Lantschoot)
● WBLE = Ethiopian manuscripts of the British Library (described by W. Wright)
● ZBNE = Ethiopian manuscripts of the Bibliotheque Nationale in Paris (described by Zotenberg)

Step 2. Create keyword fields for PEMM Canonical Stories Dataset. When consultants who read Gəˁəz start cataloging stories, they will need access to the controlled vocabularies through dropdown menus for many keyword/theme fields. Although this particular type of cataloging task will not be done until FY21, we want to be aware of the types of fields we will need in the future. For now, the dataset will have fields for characters and settings; we can add others later. Below are the possible fields.

● Story themes
● Story main human character, including proper noun [e.g., Barbara, Simon]; profession [abbess, beggar, wife]; nation/town; status [noble, royal, commoner]; gender; age [infant, child, teenager, young adult, middle-aged adult, the old]; type [protagonist, antagonist, and/or generally bad, generally good, and/or evil, sinning believer, good nonbeliever, saintly]; religion [Muslim, Jew, Christian, pagan]; type of conflict [against self, against society/group, against another, against nature]; problem/challenge/conflict [lame, blind, castration, deaf, poor, disbelief, disease, exile, false accusation, famine, childlessness, away from home, pregnant]; sin [man-eater, adultery, jealousy, arson, blasphemy, drunk, frivolity, heresy]; virtue [celibacy, chastity, belief in Mary, doing something for Mary, fasting]; threat [hell, ambush, discovery, death, drought, drowning, hanging]; activity [plowing, bathing, childbirth, dream, fall, travelling]; body part [ear, hands, penis]
Step 3. Create PEMM Manuscripts Dataset. The Macomber handlist will be used as the basis for a Google Sheet titled PEMM Täˀammərä Maryam Manuscripts Dataset. The sheet will have more data than Macomber. It will be important to include information on manuscript dating and region, where available. Below is its format (with invented data for one manuscript to give a sense of the appearance of the data):
● Manuscript title: Täˀammərä Maryam
● PEMM Ms No.: PEMM 105
● Others' Ms No.: BN Ms. No. 23
● Manuscript original repository: Dabra Libanos, Ethiopia
● Manuscript provenance (lat., long.): 9.712177, 38.848075
● Manuscript current repository: BN (Bibliothèque Nationale)
● Manuscript total no. of folios: 405
● Manuscript total no. of pages: 202
● Manuscript total no. of images: 202
● Manuscript total no. of stories: 103
● Manuscript century: 15.25
● Manuscript date range (if available): 1517-1543
● Manuscript illustrations no.: 28
● Manuscript illustrations size: 25 full page; 3 quarter page
Step 4. Create controlled vocabularies for PEMM Canonical Stories Dataset. We need to develop a controlled vocabulary of our own because (1) Macomber's is outdated (e.g., it uses "Moslem" instead of "Muslim"); (2) Hamburg BM's is designed for the Ethiopian environment, but not the Marian miracles specifically; (3) Oxford CSM's was not that well controlled, so it needs to be cleaned up; and (4) the Index's doesn't account for the Ethiopian environment specifically. Wendy and Evgeniia plan to combine all four controlled vocabulary sets and then comb through them for redundancies, to create cross references, and to identify hierarchies (e.g., we have "oxen" but also "animal", and the first is a type of the second). For instance, since catalogers may not think of the exact same word, we should have cross references (e.g., "angels" and "divine messengers [angels]"). With tightly controlled vocabulary lists, we can do better analysis of story themes.
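As a rough illustration of the combining-and-combing step described in Step 4, the following sketch merges several vocabulary lists and flags likely redundancies with fuzzy string matching. The sample terms are invented stand-ins for the four real sets, and the similarity threshold would need tuning against the actual lists:

```python
from difflib import get_close_matches

# Invented sample vocabularies standing in for the four real sets.
macomber = ["Moslem", "oxen", "angels"]
hamburg = ["Muslim", "ox", "divine messengers [angels]"]
oxford_csm = ["muslims", "animal", "angel"]

merged = sorted({term.strip().lower()
                 for vocab in (macomber, hamburg, oxford_csm)
                 for term in vocab})

# Flag near-duplicates as candidates for cross references ("see also" pairs).
candidates = []
for i, term in enumerate(merged):
    near = get_close_matches(term, merged[i + 1:], n=3, cutoff=0.8)
    candidates.extend((term, other) for other in near)

for pair in candidates:
    print("possible cross reference:", pair)
```

Hierarchies (e.g., "oxen" is a kind of "animal") cannot be found by string similarity alone; those would still be identified by hand, as the plan describes.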
Step 5. Mark beginning of incipits of each Story Instance in each manuscript. The research assistants will not know where the incipits are for each Story Instance. Those with an excellent level of Ge`ez will need to mark up manuscripts so that research assistants using the incipit tool and matching incipits don't have to read for the beginning of the incipit on the page.
Step 6. Catalog manuscripts. The biggest task will be increasing the number of Story Instances in the Google Sheet. With a bigger data set, we will be able to better answer the research questions. The research assistants will use the Incipit tool to catalog manuscripts first, giving each Story Instance a Canonical Story identifier number and then marking their degree of certainty that the incipits match. If they are not confident, someone who reads Ge`ez will check after them.
Alternate Tasks
The following is currently out of scope, but it might come into scope if we run into problems with students doing cataloging.
● Write précis of Canonical Stories. A huge and difficult task will be writing short summaries of the 700+ indigenous Canonical Stories. Only those with an excellent understanding of Amharic, French, or Ge`ez will be able to do this work. Maybe 100 can be done from stories available in English translation. Or, perhaps, if the keywords are good enough, no précis is needed?
● Track word length of Instantiation Stories. This is a way of getting at the possibility of different recensions.
● Tag Canonical Stories with keywords. Another difficult task will be using the controlled vocabulary list to better tag Canonical Stories. Macomber did tag 642 of them with keywords, but many remain and his list can be improved. Only those with a good level of Ge`ez and English will be able to do this work.
● Identify new Canonical Stories. We need to give new identifier numbers, titles, themes, and incipits for Canonical Stories not in Macomber. We will use Hamburg BM identifiers where possible, but may need to do this a bit ourselves.
● Translate into Amharic. Translate titles, keywords, and website into Amharic.
● Design and write static website. This can be done in the last year.
● Compare Cannibal of Qemer transcripts. Once Steve, Jeremy, Jonah, and Ashlee complete typing up all 90 versions of the Cannibal of Qemer tale, we will do computational analysis.
Regarding the précis specifically, there would be certain sources:
● Amharic translations (currently out of scope):
○ Täsfa Giyorgis, ed. 1931. Täˀammərä Maryam bä-Gəˁəz ənna bä-Amarəñña [The Miracles of Mary in Gəˁəz and Amharic: 111 Miracles]. Addis Ababa, Ethiopia.
○ Täsfa Gäbrä Śəllase, ed. 1996. Täˀammərä Maryam bä-Gəˁəz ənna bä-Amarəñña [The Miracles of Mary in Gəˁəz and Amharic: Part Two: 402 Miracles]. Addis Ababa, Ethiopia: Täsfa Gäbrä Śəllase Printing Press.
○ Täsfa Gäbrä Śəllase, ed. 1994. Sǝdsa Arattu Täˀammərä Maryam [Sixty-four Miracles of Mary]. Addis Ababa, Ethiopia: Täsfa Gäbrä Śəllase Printing Press.
○ Täsfa Gäbrä Śəllase, ed. 1968. Täˀammərä Maryam bä-Gəˁəz ənna bä-Amarəñña [The Miracles of Mary in Gəˁəz and Amharic: Part One: 270 Miracles]. Addis Ababa, Ethiopia: Täsfa Gäbrä Śəllase Printing Press.
● English translations (currently out of scope):
○ Budge, E. A. Wallis. 1900. The Miracles of the Blessed Virgin Mary, and the Life of Hannâ (Saint Anne), and the Magical Prayers of 'Aheta Mîkâêl: The Ethiopic Texts Edited with English Translations Etc. 2 vols, Lady Meux Manuscripts Nos. 2-5. London: W. Griggs.
○ Budge, E. A. Wallis, ed. 1933. One Hundred and Ten Miracles of Our Lady Mary. London: Oxford University Press, H. Milford.
○ Zärˀa Yaˁqob. 1992. The Mariology of Emperor Zärˀa Yaˁəqob of Ethiopia: Texts and Translations. Translated and edited by Getatchew Haile. Rome, Italy: Pontificium Institutum Studiorum Orientalium.
● French translations (currently out of scope):
○ Colin, Gérard. 2004. Le livre éthiopien des miracles de Marie (Taamra Mâryâm). Paris: Les Editions du Cerf.
● Italian translations (currently out of scope):
○ Cerulli, Enrico. 1943. Il libro etiopico dei Miracoli di Maria e le sue fonti nelle letterature del medio evo latino. Rome: G. Bardi.
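Step 6 turns on matching incipits and recording a degree of certainty. The sketch below shows one plausible way to score such matches with standard-library fuzzy matching; it is an illustration, not the project's actual Incipit tool, and the certainty thresholds and identifier format are invented:

```python
from difflib import SequenceMatcher

def match_incipit(instance_incipit, canonical_incipits):
    """Return (best canonical id, similarity score, certainty label)."""
    best_id, best_score = None, 0.0
    for story_id, incipit in canonical_incipits.items():
        score = SequenceMatcher(None, instance_incipit, incipit).ratio()
        if score > best_score:
            best_id, best_score = story_id, score
    # Invented certainty bands; a human reader of Ge`ez re-checks low ones.
    certainty = "high" if best_score > 0.9 else "medium" if best_score > 0.7 else "low"
    return best_id, best_score, certainty

canon = {"MAC-0086": "ወሀሎ፡ አሐዱ፡ ብእሲ፡ መነኮስ፡ በደብረ፡ አባ፡ ሳሙኤል፡ ዘቀልሞን፡"}
print(match_incipit("ወሀሎ፡ አሐዱ፡ ብእሲ፡ መነኮስ፡ በደብረ፡", canon))
```

The same similarity scores could also support the "track word length" and recension-comparison tasks listed above, since they quantify how far two instantiations of a story have drifted apart.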
work_df5kwu25qfgu5gapgitc4odrwu ---- Wuttke_2019_PosterDARIAH2019_Zenodo
Here be Dragons: Open Access to Research Data in the Humanities
Ulrike Wuttke, Poster, DARIAH Annual Event 2019, 15.-17.05.2019, Warsaw. Based on the winning blogpost of the Open Humanities Tools and Methods Blog Competition. DOI: 10.5281/zenodo.3233447
Contact: Fachhochschule Potsdam | Fachbereich Informationswissenschaften, Kiepenheuerallee 5, 14469 Potsdam, wuttke@fh-potsdam.de | @UWuttke
[Poster image: African penguins, Boulders beach, Simon's Town, South Africa, by Pierre-Selim Huard, https://commons.wikimedia.org/wiki/File:Simon%27s_Town_-_African_Penguins_-_2018-07-27_-_9028.jpg, CC BY 4.0]
"Fellowship of the Data"
To create a broad culture of FAIR data sharing in the humanities, we have to roll up our sleeves, team up, and distribute hats:
• Embrace Open principles
• Bridge the gap between the digital and the humanities
• Look at what we can learn from the Digital Humanities and other data-savvy disciplines
[Poster image: "open" by velkr0, CC BY 2.0, https://flic.kr/p/mzqM]
Viva la Open Revolution!
Here be dragons (key challenges)
• Ambivalence about the concept "data"
• Key concepts and recommendations, e.g. the FAIR principles, little known in research communities
• Publication of research data only as an afterthought
• Issues around incentivisation
• Fears
• Availability & sustainability of specialist support structures for humanities research data
Vision for Humanities Data: FAIR! [Graphic labels: Preserved / Lost; background image: "Digital hologram" by Ashwin Vaswani, https://unsplash.com/photos/JqZ7q_S3xOE]
• Support system for the quest for FAIR humanities data
• Incentives for FAIR data publications, e.g. DORA
• Infrastructure (aggregation, data centers & repositories, tools and services)
• Training & education of (digital) humanities researchers
• "Data infrastructure literacy"
• Added value of Research Data Management for the planning of digital projects before the start
• Active research data management (e.g. RDMO = Research Data Management Organiser)
https://ulrikewuttke.wordpress.com/2019/04/09/open-data-humanities/
There is a lot at stake for the Humanities, maybe the very question of what we want the future of the Humanities to be. When it comes to Open and FAIR research data in the Humanities, I can only say it with Queen: "I want it all, and I want it now!"
License: https://creativecommons.org/licenses/by/4.0/
work_dfpsdyuuwbdalmxgrfz26at2bm ---- A social network analysis of Twitter: Mapping the digital humanities community
HAL Id: hal-01517493, https://hal.archives-ouvertes.fr/hal-01517493, submitted on 3 May 2017.
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
A social network analysis of Twitter: Mapping the digital humanities community
Martin Grandjean
To cite this version: Martin Grandjean. A social network analysis of Twitter: Mapping the digital humanities community. Cogent Arts & Humanities, Taylor & Francis, 2016, pp. 1171458. doi:10.1080/23311983.2016.1171458. hal-01517493
Grandjean, Cogent Arts & Humanities (2016), 3: 1171458, http://dx.doi.org/10.1080/23311983.2016.1171458
DIGITAL HUMANITIES | RESEARCH ARTICLE
A social network analysis of Twitter: Mapping the digital humanities community
Martin Grandjean, Department of History, University of Lausanne, Lausanne, Switzerland. Corresponding author: martin.grandjean@unil.ch. Reviewing editor: Aaron Mauro, Pennsylvania State University, USA. Additional information is available at the end of the article.
Abstract: Defining digital humanities might be an endless debate if we stick to the discussion about the boundaries of this concept as an academic "discipline". In an attempt to concretely identify this field and its actors, this paper shows that it is possible to analyse them through Twitter, a social media widely used by this "community of practice". Based on a network analysis of 2,500 users identified as members of this movement, the visualisation of the "who's following who?" graph allows us to highlight the structure of the network's relationships, and identify users whose position is particular. Specifically, we show that linguistic groups are key factors to explain clustering within a network whose characteristics look similar to a small world.
Subjects: Internet & Multimedia - Computing & IT; Network Theory; Sociology of Science & Technology
Keywords: digital humanities; social network analysis; Twitter; digital studies; social media; data visualisation; sociometry; networks
ABOUT THE AUTHOR
Martin Grandjean is a researcher in intellectual history at the University of Lausanne (Switzerland). He studies the structuration of scientific networks in the interwar period and develops network analysis and visualisation methods for archives and texts. Specialised in data visualisation, he leads parallel experiments in the fields of data-driven journalism, open data and social media analysis. He is a member of the board of Humanistica, the French-speaking digital humanities association.
PUBLIC INTEREST STATEMENT
In recent years, the emergence of new technologies in the humanities and social sciences caused a major upheaval. Grouped under the term "digital humanities", thousands of researchers worldwide gradually structure their community around issues related to the use of new tools and methods. Understanding how this new community organises itself is a challenge because it takes very different forms depending on the institutions and scientific disciplines.
This article analyses the presence of the main actors of this community on Twitter, a social media where each user publishes very short messages to their subscribers. By analysing the "who's following who?" network among these 2,500 people, we discover who the most connected individuals are. Language groups are also very visible and allow us to question the homogeneity of this community of practice.
Received: 15 November 2015; Accepted: 02 March 2016; Published: 15 April 2016
© 2016 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.
1. Introduction: Identifying the digital humanities community
At a time when shelves could easily overflow with journal issues and monographs attempting to precisely define the nature of "digital humanities", it seems that we are now, at long last, gradually leaving the "time of definition". Acknowledging that the diversity of these definitions does not help to put an end to the debate, the actors of this field are turning to a more operational concept: the notion of "community of practice". But if this very inclusive concept, just like the "big tent",1 allows us to overcome the disciplinary clashes, it makes it difficult to identify the borders, as the common denominator seems to be the "kinda the intersection of …" definition (Terras, 2010). Our study voluntarily chooses to focus on a particular field of expression of this community, a social media that has for many years been regarded as one of the main exchange places for digital humanities. Our goal is therefore not to draw conclusions that go beyond this very specific object, but to observe it in order to offer a transversal view of this movement, otherwise difficult to map with traditional methods. Let's seize the opportunity to ditch the useful "network" metaphor and apprehend it more formally, through a social media that embodies these relationships.
2. Twitter, a growing field of study
Twitter,2 a social network created in 2006, is a place dedicated to personal expression that brings together hundreds of millions of users around its minimalist concept of microblogging. Its messages of 140 characters and its principle of "following" users without mandatory reciprocity, coupled with a very open application programming interface (API), make it an ideal medium for the study of online behaviour. Its simplicity makes it a frequently used tool to report current events. Hence the many studies analysing the diffusion of information following an event: an earthquake (Sakaki, Okazaki, & Matsuo, 2010), demonstrations such as the London riots (Beguerisse-Diaz, Garduno-Hernandez, Vangelov, Yaliraki, & Barahona, 2014; Casilli & Tubaro, 2012), international conferences (Grandjean & Rochat, 2014; Jussila, Huhtamaki, Henttonen, Karkkainen, & Still, 2014), teachings (Stepanyan, Borau, & Ullrich, 2010) or interactions on a neutral corpus (Darmon, Omodei, & Garland, 2015). These "dynamic" analyses, which typically map out networks of tweets, mentions and retweets, owe their popularity to the availability of the material and the possibility for researchers to analyse its contents. They frequently lead to questions on influence measuring (Subbian & Melville, 2011; Suh, Hong, Piroll, & Chi, 2010), especially when it comes to political communication (Stieglitz & Dang-Xuan, 2012; Vainikka & Huhtamäki, 2015) or scientometry (Haustein, Peters, Sugimoto, Thelwall, & Larivière, 2014).
But when it becomes clear that the content of a user's tweets is not always indicative of their field of specialisation—due to the noise produced by the many personal messages, jokes, politics, etc.—we need to turn to a network whose structure seems more readily analysable in terms of "community": the follow graph (Myers, Sharma, Gupta, & Lin, 2014).
3. Dataset
The prerequisite to this study was the preparation of a list of more than 2,500 Twitter users identified as part of the digital humanities community. We saw above that the definition of this field was subject to many changes: rather than stick to lists of association members or authors of a set of journals, we listed all users who identify themselves as being directly or indirectly part of this "community of practice". It is in the very short Twitter "bio" (160 characters) that we spotted the vocabulary linking these researchers together. First of all, a first selection was made by listing all the followers of the most visible users (national or international institutions, established professors and researchers in the field, Twitter accounts of scientific events, etc.) and by reviewing their biographies. Within this corpus, we then randomly selected a number of users and also analysed their subscribers. This list is then enriched in three ways: by the identification of users who tweeted with specific DH conference hashtags; through the self-reporting of users who, following the publication of blog posts about this research, announced that they were part of the corpus; and finally, through harvesting the results of the Twitter search engine on a selection of keywords related to the digital humanities. By its nature, this corpus cannot aim to be comprehensive, but it should be noted that, unlike most official lists, it includes a segment of the academic population (generally non-institutional) who don't publish or don't normally participate in official events. They did not wait to receive the DH "label" to assign themselves and see themselves as members of this community.
Specifically, this article analyses the "who's following who?" relationships inside a Twitter list containing exactly 2,538 Twitter accounts of individuals or institutions (on 1 October 2015). This network is obtained after downloading—via the Twitter API—the list of all the followers of each of the accounts, then filtering according to whether they are themselves members of the list or not. It therefore only concerns the relations within this corpus, not the tens of thousands of non-DH users who follow these 2,500 accounts.
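As an illustration of this filtering step, the sketch below builds such an intra-list directed graph from already-downloaded follower lists using the networkx library. The account names and follower sets are invented, and the download step itself is omitted:

```python
import networkx as nx

# Invented sample: each account in the curated list maps to the accounts
# that follow it (as returned, e.g., by the Twitter API).
followers = {
    "dh_user_a": {"dh_user_b", "dh_user_c", "outsider_1"},
    "dh_user_b": {"dh_user_a"},
    "dh_user_c": {"dh_user_a", "outsider_2"},
}
dh_list = set(followers)  # the 2,538-account list in the actual study

G = nx.DiGraph()
G.add_nodes_from(dh_list)
for account, its_followers in followers.items():
    for f in its_followers & dh_list:  # keep intra-list relations only
        G.add_edge(f, account)  # edge direction: follower -> followed

print(G.number_of_nodes(), "accounts,", G.number_of_edges(), "follow relations")
```

Note how the set intersection discards "outsider" followers, exactly mirroring the paper's decision to analyse only relations within the corpus.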
4. Result: an apparent small world
At first sight, the network of digital humanities on Twitter is a form of small world (Milgram, 1967), at least that is what its visual representation3 suggests (Figure 1). It indeed shows an extremely dense network. Only one cluster seems to detach itself slightly, while another one, nearby, somewhat distorts the very circular structure of the network. The size of the circles/vertices is proportional to the degree centrality of the users (the number of connections, followers and followings together); we note that only 11 of them exceed 1,000 connections. In addition, the colour of the circles shows the in-degree (inbound degree, their followers only), allowing us to see that only 17 people (white circles) are followed by more than one-third of the users in the corpus. The median user follows 59 Twitter accounts from the list and is followed by 39 of them. Are digital humanities—while describing themselves as a transversal field—finally a closed world where everybody knows everybody?4
Figure 1. Digital humanities network on Twitter: 2,500 users following each other.
In fact, despite its apparent homogeneity—its limited division into small communities—the density of the graph isn't extremely high. The density is calculated based on the number of possible edges in the network; its value here is 0.036, on a scale from 0 (no edge) to 1 (an edge between every 2,500 vertices). Even if the network can be structurally considered a small world under the terms of Watts and Strogatz (1998), with a high average clustering coefficient (0.366) and a reduced average path length (2.297, with a maximal distance of 5), the application of this concept to an asymmetrical social media remains unclear.
These first elements should not make us forget that this network is a visual representation of a set of data whose complexity is not limited to a simple graphical rendering. Beyond a certain aesthetic, sometimes very suggestive, it is in its ability to generate new research questions—pushing the researcher to get back into the data itself—that a network analysis proves its interest.5
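The density, clustering, and path-length figures quoted above can be reproduced on any such graph with networkx, roughly as follows. This is a sketch on a small random directed graph rather than the real data, and it computes clustering and paths on the undirected projection, a common simplification for small-world checks (the paper does not state its exact procedure):

```python
import networkx as nx

# Stand-in graph; the study's graph had 2,538 nodes and density 0.036.
G = nx.gnp_random_graph(200, 0.05, directed=True, seed=42)

density = nx.density(G)
U = G.to_undirected()
clustering = nx.average_clustering(U)
path_length = nx.average_shortest_path_length(U)

print(f"density={density:.3f} clustering={clustering:.3f} "
      f"avg path length={path_length:.3f}")
```

A graph counts as "small world" in the Watts-Strogatz sense when clustering is high relative to a random graph of the same density while the average path length stays short, which is the comparison the paragraph above is making.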
5. To follow or to be followed?
Before turning to more advanced structural measures, the first analysis that we propose is a comparison of the ratio between followers and followings. Figure 2 visualises this relationship as a scatter plot, supported by two bar charts that summarise the distribution of these two values. First observation: more than half of the users follow fewer than 100 people and are themselves followed by fewer than 100 people (category A, 63.1%). The vast majority of the corpus is actually made up of very weakly connected users, information that the network visualisation (Figure 1), with its totalising aim, tends to make us forget. We can now distinguish six categories of users, based on their followings/followers ratio (assuming category A users are excluded from this ranking due to their insignificant number of connections):
• Category B: users who follow at least four times more users than they have subscribers (1.3%).
• Category C: users who follow at least twice as many users as they have subscribers (6.6%).
• Category D: users who follow up to two times more users than they have subscribers (13.8%)—this is the largest population of this corpus, behind category A.
Figure 2. Followings and followers among peers, and frequency distribution.
The first three categories bring together users who use Twitter as a technological monitor. Without necessarily creating content that will make them influencers (even if it's not incompatible), a significant portion of these users is kept informed of the news of their research fields through this social media. It is also to be noted that following a large number of users obviously has a social function that has nothing pejorative. Subscribing to a large number of users typically increases the number of followers (the people being followed are notified of the subscription, they discover their new subscriber and sometimes follow him or her back if interested).
• Category E: users who are followed up to two times more than they follow themselves (8.7%).
• Category F: users who are followed at least twice as much as they follow themselves (4.1%).
• Category G: users who are followed at least four times more than they follow themselves (2.3%).
In the last three categories, we find users who are followed by more users than they follow themselves, generally because they occupy a privileged position in the field (journals, institutions, associations, advanced academic positions, prominent figures in the community or content producers). While the border between categories D and E isn't very significant, the presence of a user in categories F and G is very indicative of their behaviour on the social network. It is indeed among these last two categories that we can find some of the "stars" of the field (in the sense of Moreno, 1934, which lays the basis of network analysis, where the stars are individuals who focus incoming relations). However, with a little distance, it should be noted that presence in one or the other of these categories is not a definitive marker of the user's position in the field: having a very high ratio does not always mean being an influential person, but sometimes simply shows a rather elitist attitude (following hardly anyone, e.g.), or a popularity due to an external factor (being a renowned institution outside as well as inside the DH field, e.g.). Let's also recall that we're only analysing the followers/followings ratio inside our corpus. A user with a very low ratio may well be followed by tens of thousands of Twitter users outside the community (and a "star" user can have no followers outside this network).
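The category boundaries described above translate directly into a small classification function. A sketch with the paper's thresholds, applied to invented counts (ties at the boundaries are resolved by the order of the checks):

```python
def ratio_category(followings: int, followers: int) -> str:
    """Classify a user by the followings/followers logic described above."""
    if followings < 100 and followers < 100:
        return "A"  # weakly connected majority
    if followings >= 4 * followers:
        return "B"
    if followings >= 2 * followers:
        return "C"
    if followings > followers:
        return "D"
    if followers <= 2 * followings:
        return "E"
    if followers < 4 * followings:
        return "F"
    return "G"

# Invented example counts: (followings, followers).
for fing, fers in [(50, 40), (400, 90), (120, 110), (80, 900)]:
    print(fing, fers, "->", ratio_category(fing, fers))
```

Running this prints categories A, B, D, and G for the four examples, matching the verbal definitions of the list above.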
6. A geography of the linguistic communities
Beyond the apparent homogeneity of the network of these 2,500 Twitter users, the geographical, cultural and language distribution must be questioned. While digital humanities are often seen as an essentially English-speaking movement, many local or linguistic communities have emerged in recent years, claiming for their specificities not to be embedded in a large English-speaking congregation. While the geographical issues do not always tally with the language issues (French is spoken in Europe, Africa and North America, Spanish and Portuguese in Europe and South America, and English on every continent, at least as a second tongue), national, regional or linguistic associations are emerging,6 as is a "special interest group" of the Alliance of Digital Humanities Organizations (ADHO) dedicated to the promotion of diversity.7
However, the Internet in general, and Twitter in particular, are highly globalised places. It is not uncommon for a user to overlook national and linguistic borders as he or she follows the publications of a very wide variety of users. Therefore, are the language communities discernible in our data-set? And if so, how to judge their representativeness regarding the "real world"?8
Analysing the language of the tweets posted by users from our corpus for a given period is a chimeric operation, both because of the amount of "noise" to disambiguate and because the content of tweets is often multilingual. Fortunately, the Twitter API provides, for each of its users, the language of the interface used. Even if English is often the default language, the proportion of accounts using another language is important in our list (27%). Figure 3 visualises, on the same graph as Figure 1, the interface language of our 2,500 Twitter users. It appears very clearly that the two major "clusters" we were already able to distinguish beforehand correspond to very well-defined linguistic communities. In particular, the French-speaking community is almost completely detached from the main group. To a lesser extent, the German-speaking community is clearly circumscribed. Far behind, constituting the third largest non-English-speaking community, Spanish-speaking users are also all in the same area of the graph but do not come off from the main group as clearly as the previous two. The remaining users, particularly the small Italian-speaking and Dutch-speaking communities, are spread in a kind of "global village" at the intersection of all the other communities.
Figure 3. Highlighting the interface language.
Two important notes for reading this graph:
• The community of a given language is not limited to the individuals who use Twitter in the concerned language: there are many French or Germans using Twitter in English in the identified clusters. We estimate that we can add about 30% to the total of the linguistic communities presented in Figure 3, diminishing the English-speaking community in the same proportion.
• The spatialisation of the network is obtained by a force algorithm,9 which means that the proximity between two vertices cannot be interpreted as a real proximity to each other: as in all network visualisations, this geography is the result of a complex calculation that takes into account each of the edges (there are 236,000).
One thing remains: the French-speaking community is particularly isolated. We can elaborate several hypotheses and questions: is English less used there than in other non-English-speaking communities? Or, at the opposite, is it a language less mastered in the other regions, justifying that French users are followed less because they are less understandable? Is the French-speaking digital humanities community important and structured enough to be less dependent on English references? Or are the practices so different that the need for skills transfer is less strong with this community than with others? Is it finally only a bias related to the social media analysed, where behaviours differ according to local "cultures"?
Besides, we also note that in the French-speaking cluster, we find most of the French users in the peripheral group. Most of the Swiss, Belgian and Canadian users are rather positioned at the intersection with the other linguistic communities, and thus less isolated. If the position of the French-speaking community is surprising, it is also because of the comparison with other language communities that we would have expected to have a stronger presence. Rather than seeing the French position as abnormal, is it not worrying to observe such a fusion between the Spanish- and German-speaking communities and the main group? And what about the users using an Italian (36), Dutch (24), Portuguese (10) or other marginal language (40) interface?
Note that the language distribution within the digital humanities community on Twitter is not comparable with the general distribution of languages in the world, or with the distribution of languages in usually studied tweet sets (Hale, 2014, working on a 2011 data-set). This is not the consequence of a biased data-set but simply of a research field that is not (yet) globalised and remains for the most part a European and North American phenomenon (which is also demonstrated by the geography of THATCamps, the emblematic manifestations of this "community of practice"; see Grandjean, 2015b).
7. Measuring structural features
Using a formal network only to be satisfied by a comment on its visual characteristics is to miss its structural characteristics. Centrality is a way to quantify the importance of the vertices in a network: its different declinations are frequently used in social network analysis to identify and highlight specific positions (Newman, 2010). We will therefore seek to go beyond the visual representation in order to list the users of our corpus holding a remarkable structural position. This process is not restricted to online social networking and has been used since the works of Freeman (1978). As in Rochat (2014), Table 1 shows the values of four centrality measurements from our corpus. Two of them, the In- and Out-degrees, have already been exploited above. These are also the easiest to define, as they are immediately translatable into Twitter's language: respectively, "followers" and "followings". The Betweenness centrality, which measures the number of times a vertex is present on the shortest path between two other vertices, highlights users who are structurally in a "bridge" position between the subdivisions of the network. The Eigenvector centrality assigns each vertex a score of authority that is based on the score of the vertices with which it is connected. Table 1 is completed by Figure 4, allowing readers to get a sense of the geography of the measurements obtained and to clarify their distribution.
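All four measures are available off the shelf in networkx. A sketch of how they could be computed on a follow graph like the one built earlier (directed, edges running follower → followed) might read as follows; the graph here is again a random stand-in, not the paper's data:

```python
import networkx as nx

G = nx.gnp_random_graph(100, 0.05, directed=True, seed=1)  # stand-in graph

in_deg = dict(G.in_degree())    # "followers" within the list
out_deg = dict(G.out_degree())  # "followings" within the list
betweenness = nx.betweenness_centrality(G)
eigenvector = nx.eigenvector_centrality(G, max_iter=500)

# Rank users by in-degree, as in the first columns of Table 1.
top = sorted(in_deg, key=in_deg.get, reverse=True)[:5]
for node in top:
    print(node, in_deg[node], out_deg[node],
          round(betweenness[node], 4), round(eigenvector[node], 4))
```

Note that betweenness on a graph of this size (2,538 nodes) is the expensive step; networkx computes it exactly, while approximate variants sample a subset of source nodes.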
Table 1. Centrality measures of the 100 most followed accounts
(Columns: User; In-degree value, rank; Out-degree value, rank; Betweenness value, rank; Eigenvector value, rank)
dhnow 1,698 1 705 5 389,710 1 1.00 1
NEH_ODH 1,199 2 0 2,495 0 2,495 0.81 8
nowviskie 1,184 3 403 45 49,740 27 0.94 2
melissaterras 1,178 4 447 32 92,664 9 0.87 4
dancohen 1,170 5 237 212 35,909 38 0.90 3
mkirschenbaum 1,069 6 201 302 14,958 99 0.86 5
DHQuarterly 1,068 7 105 758 12,813 119 0.72 17
miriamkp 1,044 8 525 17 75,883 15 0.83 7
thatcamp 999 9 163 426 55,051 24 0.68 22
ADHOrg 983 10 450 31 62,772 22 0.68 24
DHInstitute 978 11 1,104 1 268,691 3 0.68 23
GrandjeanMartin 929 12 488 24 313,162 2 0.47 90
HASTAC 903 13 408 43 78,561 14 0.67 28
brettbobley 894 14 444 33 41,450 32 0.83 6
briancroxall 887 15 435 36 35,795 40 0.79 9
kingsdh 875 16 24 1,943 3,503 398 0.51 67
DHanswers 862 17 505 20 63,961 21 0.70 20
trevormunoz 841 18 580 12 58,264 23 0.77 10
elotroalex 822 19 644 8 107,208 7 0.73 12
dpla 811 20 76 1,046 13,987 109 0.58 49
RayS6 807 21 323 105 29,224 44 0.73 14
lisaspiro 793 22 312 112 22,752 61 0.75 11
DHandLib 789 23 804 3 173,640 5 0.58 47
UCLDH 772 24 85 938 9,150 181 0.53 60
kfitz 769 25 95 839 3,391 407 0.71 18
mkgold 763 26 250 187 14,019 108 0.72 16
UMD_MITH 759 27 213 269 12,041 127 0.66 31
chnm 749 28 62 1,226 3,961 358 0.60 41
ryancordell 748 29 384 55 26,960 51 0.72 15
ProfHacker 747 30 133 564 9,073 182 0.59 44
DHcenterNet 745 32 396 50 29,122 45 0.65 32
Ajprescott 745 31 312 113 40,501 34 0.60 40
foundhistory 730 33 259 172 20,907 67 0.70 21
frederickaplan 719 34 929 2 248,304 4 0.50 70
manovich 716 35 59 1,263 6,526 241 0.57 50
DayofDH 703 36 596 11 88,388 11 0.54 57
sramsay 698 37 99 799 3,074 431 0.68 26
Ted_Underwood 697 38 370 65 35,880 39 0.67 29
fraistat 689 39 355 79 19,653 70 0.73 13
sgsinclair 688 40 187 346 11,714 129 0.67 30
julia_flanders 685 41 51 1,391 2,060 570 0.65 33
alanyliu 684 42 154 457 10,640 152 0.62 37
4Hum 681 43 290 135 23,971 56 0.57 51
JenServenti 664 44 644 9 53,033 25 0.71 19
eadh_org 658 45 183 357 14,772 100 0.47 89
amandafrench 654 46 203 296 10,916 143 0.68 25
unsworth 642 47 195 320 10,768 150 0.67 27
TEIconsortium 641 48 354 81 67,328 19 0.42 112
dhgermany 639 49 661 7 118,555 6 0.37 147
Jessifer 638 50 369 68 38,218 37 0.53 59
samplereality 635 51 245 196 8,575 191 0.64 34
tjowens 623 52 245 197 13,600 113 0.60 42
DHCommons 605 53 23 1,972 605 995 0.47 91
williamjturkel 600 54 713 4 80,180 13 0.63 35
britishlibrary 599 55 4 2,434 620 986 0.30 196
DSHjournal 595 56 175 380 14,958 98 0.49 75
mljockers 583 57 65 1,188 1,711 631 0.56 52
karikraus 582 58 264 167 11,431 133 0.61 38
adelinekoh 582 59 337 97 22,458 62 0.50 72
GeoffRockwell 578 60 139 526 5,241 294 0.59 46
christof77 570 61 628 10 87,611 12 0.44 102
jenterysayers 567 62 222 253 7,079 223 0.60 43
zotero 553 63 48 1,444 4,065 351 0.45 95
cunydhi 552 64 385 54 30,468 43 0.51 68
scott_bot 549 65 214 268 15,291 94 0.53 61
ernestopriego 540 66 370 66 28,079 48 0.54 56
mia_out 539 68 96 826 6,387 244 0.48 81
omeka 539 67 96 825 5,632 282 0.44 101
jasonrhody 532 69 324 103 9,256 177 0.63 36
clioweb 523 70 441 35 28,267 46 0.60 39
jenguiliano 523 71 526 16 26,969 50 0.59 45
GCDH 515 72 443 34 72,751 16 0.34 167
wragge 514 73 533 15 65,783 20 0.54 55
CathyNDavidson 501 74 106 749 4,624 326 0.47 87
Literature_Geek 497 75 181 368 5,893 270 0.52 66
DH2014Lausanne 495 76 505 21 68,466 17 0.38 140
lornamhughes 491 77 288 137 15,165 95 0.47 88
sleonchnm 490 78 189 341 5,788 273 0.54 54
DARIAHeu 486 79 35 1,695 3,828 367 0.27 228
tmcphers 485 80 375 61 24,907 55 0.54 58
HumanisticaDH 483 81 428 38 103,554 8 0.27 243
DH_Oxford 481 82 76 1,047 3,768 374 0.32 179
GeorgeOnline 478 84 348 83 12,967 118 0.55 53
HASTACscholars 478 83 273 155 10,337 157 0.48 85
pannapacker 475 85 277 148 13,316 115 0.52 63
tanyaclement 470 86 142 506 1,777 617 0.58 48
jean_bauer 467 87 212 276 5,368 287 0.52 62
inactinique 461 88 397 49 90,683 10 0.33 171
roopikarisam 456 89 524 18 41,926 30 0.47 86
jamescummings 446 90 216 265 10,870 146 0.41 122
jasonaboyd 445 92 468 27 18,187 75 0.50 71
FrostDavis 445 91 355 80 13,818 110 0.49 78
UCLA_DH 443 93 81 985 4,716 320 0.34 169
DARIAHde 442 94 427 39 67,421 18 0.22 303
Adam_Crymble 440 95 483 25 48,675 29 0.44 103
amyeetx 432 96 211 277 4,040 353 0.52 64
patrick_mj 427 97 237 213 6,861 229 0.50 69
seth_denbo 424 98 296 130 8,269 201 0.49 73
j_w_baker 424 99 424 40 22,959 59 0.40 128
epierazzo 423 100 141 512 5,785 274 0.35 160
7.1. In-degree
The number of followers decreases very rapidly within the top 100 users. The most followed account is @dhnow (Digital Humanities Now10), whose aggregation mission seems to be recognised by the community. Then follow leading figures, institutions, associations and publishers. Spatially, a high inbound degree is not the exclusive privilege of one of the "linguistic communities" studied above. Note that the most followed users are still generally—and logically—located in the "heart" of the network, between the central "global village" and the English-speaking region.
7.2. Out-degree
Some massively followed accounts are themselves following very few users from the list. Consequence: the classification according to the out-degree is quite different from the previous one. The account that follows the largest number of users is @DHInstitute (Digital Humanities Summer Institute,11 University of Victoria), an event that presumably seeks to bring together the community. The "long tail" distribution of this measure is less pronounced than for the in-degree. This can be explained very naturally because, on Twitter, the majority of users follow more people than they are followed by. Except for a few users with a high in-degree but a very low out-degree, the distribution of this measure on the graph is very similar to the previous one.
7.3. Betweenness
In a harmoniously distributed network, highly connected accounts (with a high degree centrality) are usually also the ones most often on the shortest path between the vertices of the graph. But we have seen that our network contains clusters that are detached from the main structure. It is therefore logical that we find individuals with high betweenness in the area located at the intersection between the main network and the French and German clusters. These users—often French or German speakers engaged in international structures such as ADHO or EADH—are transmission belts between different regions of the graph.
7.4. Eigenvector
As the eigenvector centrality is assigned to the vertices according to the score their neighbours received, it produces a result that highlights the very connected users within the larger group of our graph. Here, this measure of authority no longer focuses our attention on the periphery and the inter-community "bridges", but rather on the centre and its English-speaking majority. Except for a few hyper-connected users who monopolise the top positions in almost every centrality ranking, we see here less cosmopolitan users, better "installed" in their English-speaking environment.
We will avoid considering these measures as indicators of influence. They document the network structure, not the nature and content of the relations themselves. They nevertheless allow us to pinpoint patterns whose study should be coupled with an analysis of the position of these users in the world of academic hierarchies, publications or co-directions of research projects.
Figure 4. Spatial and statistical distribution of the four metrics.
8. Limits and perspectives
The mode of creation of the data-set can, at least partially, be a factor in the small-world visual impression: if the majority of the 2,500 users were detected because they were following a more "visible" account, then the high density of the network is logical, even though an effort was made to focus on minorities. Self-determination also has its limits: a person that all his colleagues would describe as a "digital humanist" but who describes himself on Twitter with a biography that does not mention his scientific activity may pass under the spectrum of our analysis. Note that it is possible to overcome this problem by no longer focusing on the biographies of registered users but on their structural characteristics themselves. The next step of this analysis could indeed be to list the hundreds of thousands of followers "off-list" from our 2,500 selected profiles and automatically integrate into the list those who follow or are followed by a determined number of members from the original list—a systematic way of grouping the community of those who, even without being practitioners, "follow" the latest research in DH.
Concerning the debate on the linguistic structure of the network, a limitation obviously comes from the language of the author. Speaking French, he is better able to explore this part of the network than another, something that could artificially produce a high clustering of his own linguistic community. This risk is minor here due to a special effort made to find a maximum of users representing the linguistic diversity of the field, particularly in German-, Italian- and Spanish-speaking areas.
The fact that the list is public is likely to skew the results of this analysis: conscious of having been added, some users could use it to discover and follow new users, which would have the effect of increasing the network's density. Similarly, we cannot exclude that the process has led some to discover the list author's account: they may have found the initiative or the profile interesting and will therefore have followed it, which could have caused a slight upgrade of the latter in the ranking.
In the longer term, a public list is problematic because it is likely, gradually acquiring the status of "reference", to encourage compulsive subscription behaviours, such as users hoping to be "followed back" by colleagues. In itself, this behaviour is not a problem; it is a networking strategy that can be justified to socialise in a given community. But to rely on a single list for this is problematic: the more it is used for this purpose, the denser the network becomes, and the more it impoverishes the possibility of diversity in the field. On the other hand, keeping this list public is mostly a way of giving the community a chance to discover unknown profiles and is a contribution to the friendly spirit of this social media. This also allows other researchers to use this corpus to conduct other types of studies: content analysis, interactions, biographies, shared links, etc.
Also note that the representativeness of Twitter is widely debated (Mislove, Jorgensen, Ahn, Onnela, & Rosenquist, 2011; Sloan et al., 2013), and that it is established that the social network's users are not a sample image of the population (Duggan, Ellison, Lampe, Lenhart, & Madden, 2015; Miller, Ginnis, Stobart, Krasodomski-Jones, & Clemence, 2015). While this representativeness is crucial to draw political conclusions (Boyadjian, 2014; Vainikka & Huhtamäki, 2015), the universities' landscape and the digital humanities are themselves such a small segment of the population that these considerations are difficult to apply here. Hence the need to combine our analysis with a qualitative survey of these areas to assess this very special representativeness.
9. Conclusion
In this paper, we found that defining digital humanities as a "community" avoids endless debates on its disciplinary boundaries but does not allow us to know who is practicing them today. As an attempt to identify this field, leaving aside the epistemological discussion, our study shows that this item is analysable through a social media widely used by the so-called "digital humanists" (2,500 users). In analysing the network of "who's following who?", it was found that a small number of individuals and institutions focus so much attention that the graph appears to be very homogeneous around them. The fact remains that many types of behaviour can be deduced from the graph and that structural characteristics of the network enable us to highlight some users holding remarkable positions. Specifically, we showed that French-speaking users, and to a lesser extent German-speaking users, stand out: the language factor strongly influences the network structure.
Obviously, any quantification leads to a form of objectification whose limits we need to understand. But we note that the availability of this type of data-set and the opportunities offered by tools and theories such as social network analysis allow us to shed new light on this community.
Acknowledgements
Martin Grandjean is thankful to Vincent Barbay (Ecole Polytechnique Fédérale de Lausanne) for his help in the data recovery and processing and to Etienne Guilloud (Université de Lausanne) for his help correcting his English. The idea to analyse this kind of data-set was initiated with Yannick Rochat (Ecole Polytechnique Fédérale de Lausanne) during the Pegasus Data works in 2012.
Grandjean is also thankful to Frédéric Clavert (Université de Lausanne) and Olivier Le Deuff (Université Bordeaux Montaigne) for exchanging about the French position.
Funding: The author received no direct funding for this research.
Author details: Martin Grandjean, Department of History, University of Lausanne, Lausanne, Switzerland. E-mail: martin.grandjean@unil.ch. ORCID ID: http://orcid.org/0000-0003-3184-3943
Citation information: Cite this article as: A social network analysis of Twitter: Mapping the digital humanities community, Martin Grandjean, Cogent Arts & Humanities (2016), 3: 1171458.
Notes
1. "Big Tent Digital Humanities", title of the Digital Humanities 2011, ADHO Conference, Stanford 2011.
2. http://www.twitter.com.
3. Data visualisation produced with Gephi (Bastian, Heymann, & Jacomy, 2009).
4. Not forgetting that the "closed world" impression is accentuated by the fact that the data-set is itself finite. The term is rather to be taken metaphorically.
5. We develop this typology between demonstration and research visualisation in Grandjean (2015a).
6. Linguistic associations: French, German, Spanish, Portuguese and Japanese; regional associations: European, Nordic, Australasian, South American, Argentinian and Israeli.
7. Global Outlook::Digital Humanities, http://www.globaloutlookdh.org.
8. This choice of vocabulary should not make us forget that this "virtual" world is contained within the "real" world, and this especially as the use of online social networking is increasingly becoming a factor of scientific socialisation, trade and promotion.
9. Force Atlas 2 (Jacomy, Venturini, Heymann, & Bastian, 2014).
10. http://www.digitalhumanitiesnow.org.
11. http://www.dhsi.org.
References
Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media, 361–362.
Beguerisse-Diaz, M., Garduno-Hernandez, G., Vangelov, B., Yaliraki, S. N., & Barahona, M. (2014). Interest communities and flow roles in directed networks: The Twitter network of the UK riots. Journal of the Royal Society Interface, 0940. doi:10.1098/rsif.2014.0940
Boyadjian, J. (2014). Twitter, un nouveau "baromètre de l'opinion publique"? [Twitter, a new "barometer of public opinion"?]. Participations, 55–74. http://dx.doi.org/10.3917/parti.008.0055
Casilli, A., & Tubaro, P. (2012). Social media censorship in times of political unrest—A social simulation experiment with the UK riots. Bulletin de Méthodologie Sociologique, 115, 5–20. http://dx.doi.org/10.1177/0759106312445697
Darmon, D., Omodei, E., & Garland, J. (2015). Followers are not enough: A multifaceted approach to community detection in online social networks. PloS ONE, 10, e0134860. doi:10.1371/journal.pone.0134860
Duggan, M., Ellison, N. B., Lampe, C., Lenhart, A., & Madden, M. (2015). Social media update 2014. Pew Research Center. Retrieved from http://www.pewinternet.org/2015/01/09/social-media-update-2014/
Freeman, L. (1978). Centrality in social networks: Conceptual clarification. Social Networks, 1, 215–239. http://dx.doi.org/10.1016/0378-8733(78)90021-7
Grandjean, M. (2015a). Introduction à la visualisation de données: l'analyse de réseau en histoire [Introduction to data visualisation: Network analysis in history]. Geschichte und Informatik, 18, 109–128.
Grandjean, M. (2015b). THATCamp geography. Retrieved from http://www.martingrandjean.ch/data-visualization-thatcamp-geography/
Grandjean, M., & Rochat, Y. (2014). The digital humanities network on Twitter (#DH2014). Retrieved from http://www.martingrandjean.ch/dataviz-digital-humanities-twitter-dh2014/
Hale, S. A. (2014). Global connectivity and multilinguals in the Twitter network. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 833–842.
Haustein, S., Peters, I., Sugimoto, C. R., Thelwall, M., & Larivière, V. (2014). Tweeting biomedicine: An analysis of tweets and citations in the biomedical literature. Journal of the Association for Information Science and Technology, 65, 656–669. http://dx.doi.org/10.1002/asi.23101
Jacomy, M., Venturini, T., Heymann, S., & Bastian, M. (2014). ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PloS ONE, 9, 1–12.
Jussila, J., Huhtamaki, J., Henttonen, K., Karkkainen, H., & Still, K. (2014). Visual network analysis of Twitter data for co-organizing conferences: Case CMAD 2013. System Sciences (HICSS), 1474–1483.
Milgram, S. (1967). The small-world problem. Psychology Today, 2, 60–67.
Miller, C., Ginnis, S., Stobart, R., Krasodomski-Jones, A., & Clemence, M. (2015). The road to representivity, a Demos and Ipsos MORI report on sociological research using Twitter. London: Demos.
Mislove, A., Jorgensen, S. L., Ahn, Y.-Y., Onnela, J.-P., & Rosenquist, J. N. (2011). Understanding the demographics of Twitter users. International AAAI Conference on Weblogs and Social Media, 554–557.
Moreno, J. L. (1934). Who shall survive? A new approach to the problem of human interrelations. Washington, DC: Nervous and Mental Disease Publishing.
Myers, S. A., Sharma, A., Gupta, A., & Lin, J. (2014). Information network or social network? The structure of the Twitter follow graph. Proceedings of the Companion Publication of the 23rd International Conference on World Wide Web Companion, 493–498.
Newman, M. J. (2010). Networks. Oxford: Oxford University Press. http://dx.doi.org/10.1093/acprof:oso/9780199206650.001.0001
Rochat, Y. (2014). Character networks and centrality (Thesis). Lausanne: University of Lausanne.
Sakaki, T., Okazaki, M., & Matsuo, Y. (2010). Earthquake shakes Twitter users: Real-time event detection by social sensors. World Wide Web, 851–860.
Sloan, L., Morgan, J., Housley, W., Williams, M., Edwards, A., Burnap, P., & Rana, O. (2013). Knowing the Tweeters: Deriving sociologically relevant demographics from Twitter. Sociological Research Online, 1–11.
Stepanyan, K., Borau, K., & Ullrich, C. (2010). A social network analysis perspective on student interaction within the Twitter microblogging environment. The 10th IEEE International Conference on Advanced Learning Technologies, 70–72.
Stieglitz, S., & Dang-Xuan, L. (2012). Political communication and influence through microblogging, an empirical analysis of sentiment in Twitter messages and retweet behavior. System Science (HICSS), 3500–3509.
Subbian, K., & Melville, P. (2011). Supervised rank aggregation for predicting influencers in Twitter. Social Computing, 661–665.
Suh, B., Hong, L., Piroll, P., & Chi, E. H. (2010). Want to be retweeted? Large scale analytics on factors impacting retweet in Twitter network. Social Computing, 177–184.
Terras, M. (2010). Present, not voting: Digital humanities in the panopticon. Closing Plenary Speech at DH2010. Retrieved from http://melissaterras.blogspot.ch/2010/07/dh2010-plenary-present-not-voting.html
Vainikka, E., & Huhtamäki, J. (2015). Tviittien politiikkaa – poliittisen viestinnän sisäpiirit Twitterissä [The politics of tweets – the inner circles of political communication on Twitter]. Media & viestintä, 165–183.
Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of 'small-world' networks. Nature, 393, 440–442. http://dx.doi.org/10.1038/30918
Cogent Arts & Humanities (ISSN: 2331-1983) is published by Cogent OA, part of Taylor & Francis Group.
work_dgo3rarr6fgfjldjz472zd62uu ---- Comparing the Quality of Highly Realistic Digital Humans in 3DoF and 6DoF: A Volumetric Video Case Study

Shishir Subramanyam* Jie Li Irene Viola Pablo Cesar
CWI, Amsterdam, The Netherlands
*e-mail: {S.Subramanyam, Jie.Li, Irene.Viola, P.S.Cesar}@cwi.nl

Figure 1: Users Evaluating Realistic Digital Humans in 6DoF (left) and 3DoF (right)

ABSTRACT

Virtual Reality (VR) and Augmented Reality (AR) applications have seen a drastic increase in commercial popularity. Different representations have been used to create 3D reconstructions for AR and VR. Point clouds are one such representation, characterized by their simplicity and versatility, making them suitable for real-time applications such as reconstructing humans for social virtual reality. In this study, we evaluate how the visual quality of digital humans, represented using point clouds, is affected by compression distortions. We compare the performance of the upcoming point cloud compression standard against an octree-based anchor codec. Two different VR viewing conditions, enabling 3 and 6 degrees of freedom, are tested to understand how interacting in the virtual space affects the perception of quality. To the best of our knowledge, this is the first work performing user quality evaluation of dynamic point clouds in VR; in addition, contributions of the paper include quantitative data and empirical findings. Results highlight how perceived visual quality is affected by the tested content, and how current data sets might not be sufficient to comprehensively evaluate compression solutions. Moreover, shortcomings in how point cloud encoding solutions handle visually-lossless compression are discussed.

Index Terms: Human-centered computing—Human computer interaction (HCI)—HCI design and evaluation methods—User studies; —Interaction paradigms—Virtual reality

1 INTRODUCTION

Recent advances in capturing, media processing, and 3D rendering technologies make VR/AR applications popular for mass consumption [34]. In this new media landscape, point clouds are becoming commonplace due to their simplicity and versatility. Still, the size of dense point clouds is significant (a frame of roughly 1M points takes around 19-20 MBytes), so they need to be compressed before transmission. This paper provides an exhaustive quality comparison between different encoding configurations of digital humans, represented as point clouds. By investigating the differences in quality, we provide insights about how to optimise delivery for both downloading and real-time communication. One key novelty of this paper is to study the quality under realistic consumption conditions, in 3 and 6 Degrees of Freedom (DoF) scenarios.

Avatars are a core part of VR applications like social communication [28], sports training [21], and healthcare [20]. A major line of scientific work has focused on how to make such avatars more realistic, interactive, and autonomous [10, 24, 33]. In this paper, we focus instead on point clouds as a suitable representation for digital humans based on teleportation principles [25]. In this case, the research problem is not so much how to render and animate them to make them look more realistic, but how to transport them optimally.
Given current advances in technology, real-time delivery of point clouds is becoming a realistic alternative, focusing the attention of the research community [23] and industry [32] on encoding and transmission. Still, given the massive number of points per representation, decisions need to be taken regarding the delivery (type of encoder, bit-rate) to ensure an acceptable quality of experience depending on the viewing conditions (3DoF, 6DoF). This is the core research question this paper answers.

Contributions of the paper are two-fold: 1) it provides a first evaluation of the quality of highly realistic digital humans represented as dynamic point clouds in immersive viewing conditions. Existing protocols [5, 7, 8, 40, 42] did not consider the dynamic nature of point clouds, focused on one type of data set, and did not take VR viewing conditions into account; 2) it provides quantitative subjective results about the perceived quality of the contents, along with qualitative insights on what is important for users when interacting with digital humans in VR. Such results will help in better configuring the network conditions for the delivery of point clouds for real-time transmission, and have implications for ongoing research and standardisation work on the underlying compression technology.

Particularly, this paper studies this current and relevant area of research extensively by proposing 1) a new evaluation protocol, including the work to create dynamic point clouds for evaluation, and 2) quality of experience results. These results are based on an experiment with 52 participants, evaluating 72 stimuli based on eight dynamic point cloud sequences. Each point cloud sequence was compressed at four bit-rates, using two types of compression techniques. These 72 stimuli were evaluated in two viewing conditions (3DoF and 6DoF). The data gathered include rating scores, presence questionnaires, simulator sickness reports, and time spent watching the content. The results indicate that, while bit-rate savings can be obtained by choosing one compression solution over another, visually lossless compression has not been fully achieved by the algorithms under evaluation, even at rather large bit-rates. Moreover, the choice of content can have an impact on how users rate its quality, influencing the discriminating power of the selected protocol.

Figure 2: Point cloud digital humans compressed using two point cloud codecs, (a) V-PCC and (b) the MPEG anchor, at the 4 selected bit-rates.

2 RELATED WORK

2.1 Quality assessment for point clouds

Capturing and displaying volumetric videos is becoming feasible [2, 30]. Point clouds are frequently used as a data format for volumetric video in augmented reality (AR) and virtual reality (VR) applications. Point clouds collate a large number of geo-referenced points to represent humans or objects in 3D, and color information can be provided with each point [40]. To visualize 3D content sufficiently, the number of points must be high, which results in large files and makes point clouds difficult to store and transmit. To support low latency transmission in AR/VR applications within a limited bandwidth, compression is necessary. However, it remains challenging to measure and predict the acceptable quality of compressed point clouds.
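To make the bandwidth pressure behind these compression efforts concrete, a back-of-the-envelope calculation helps. The sketch below assumes one plausible uncompressed per-point layout (three 32-bit coordinates plus 8-bit RGB); the 19-20 MByte per-frame figure quoted in the introduction corresponds to a heavier on-disk representation, so these numbers are illustrative only.

    # Rough uncompressed bandwidth of a dynamic point cloud
    # (assumed layout: 3 x 32-bit coordinates + 3 x 8-bit RGB per point).
    points_per_frame = 1_000_000
    bytes_per_point = 3 * 4 + 3        # 15 bytes per point
    fps = 30

    frame_mbytes = points_per_frame * bytes_per_point / 1e6   # ~15 MB per frame
    rate_gbps = frame_mbytes * fps * 8 / 1e3                  # ~3.6 Gbit/s
    print(f"{frame_mbytes:.0f} MB/frame, {rate_gbps:.1f} Gbit/s")

Even under this conservative layout, a single dynamic human saturates typical consumer connections by an order of magnitude, which is why the codecs discussed next matter.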
There is growing interest in subjective quality assessment of point clouds rendered on 2D displays. Zhang et al. [42] evaluated the quality degradation effect of resolution, shape and color on static point clouds. The results indicate that resolution is almost linearly correlated with the perceived quality, and color has less impact than shape on the perceived quality. Zerman et al. [40] compressed two dynamic human point clouds using a state-of-the-art algorithm [22], and assessed the effects of this algorithm and input point counts on the perceived quality. Their results showed no direct correlation between human viewers' quality ratings and input point counts. In a recent study [11], a protocol to conduct subjective quality evaluations and benchmark objective quality metrics was proposed. The viewers passively assessed the quality of a set of static point clouds, shown as animations with a pre-defined movement path. In a comprehensive work by Alexiou et al. [8], the entire set of emerging point cloud compression encoders developed in the MPEG committee was evaluated through a series of subjective quality assessment experiments. Nine static models, including both humans and objects, were used in the experiments. The experiments provided insights regarding the performance of the encoders and the types of degradation they introduce.

Only a limited number of point cloud quality assessment studies have been conducted in immersive environments. Mekuria et al. [23] evaluated the subjective performance of their codec in a realistic 3D tele-immersive system, in which users were represented as 3D avatars and/or 3D dynamic point clouds, and could navigate in the virtual space using a mouse cursor in a desktop setting. Several aspects of quality, such as level of immersiveness, togetherness, realism and quality of motion, were considered. Alexiou and Ebrahimi [7] proposed the use of AR to subjectively evaluate the quality of colorless point cloud geometry. Tran et al. [37] suggested that, when evaluating video quality in an immersive setup, aspects such as cybersickness and presence should not be overlooked.

For the objective evaluation of point clouds, there are two main approaches. Considering the availability of point location and color information, either point-based or projection-based metrics can be used [36]. Current point-based approaches can assess either geometry- or color-only distortions. For geometry errors, three metrics are commonly used [3, 6, 7, 35, 36], namely the point-to-point metrics, the point-to-plane metrics and the plane-to-plane metrics. These metrics are computed using the root mean square (RMS) distance, mean square error (MSE) or Hausdorff distance. Moreover, the geometric Peak-Signal-to-Noise-Ratio (PSNR) is used for the point-to-point and point-to-plane metrics [35]. For color distortions, the total color degradation value is based either on the color MSE or the PSNR, computed in either the RGB or the YCbCr color space [8]. The projection-based approaches map the rendered models onto planar surfaces, and conventional 2D imaging metrics are employed [12]. A comprehensive study [8] showed that the performance of the current objective metrics is not ideal, revealing the need for better solutions. Therefore, in this study, we do not include any of the above-mentioned objective metrics, and focus on subjective quality assessment.
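As a concrete illustration of the point-based geometry metrics mentioned above, the symmetric point-to-point error can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names are ours, nearest neighbours are found with a k-d tree, and the peak value used for the PSNR would in practice be chosen per dataset (e.g., the voxel grid resolution or the bounding-box diagonal).

    import numpy as np
    from scipy.spatial import cKDTree

    def point_to_point_mse(reference, degraded):
        """Symmetric point-to-point MSE between two (N, 3) coordinate arrays."""
        d_ref = cKDTree(reference).query(degraded)[0]   # degraded -> reference
        d_deg = cKDTree(degraded).query(reference)[0]   # reference -> degraded
        return max(np.mean(d_ref ** 2), np.mean(d_deg ** 2))

    def geometry_psnr(reference, degraded, peak):
        """Geometric PSNR for a given signal peak value."""
        return 10 * np.log10(peak ** 2 / point_to_point_mse(reference, degraded))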
2.2 Point cloud compression

A single point cloud frame is represented by an unordered collection of points sampled from the surface of an object. In a dynamic sequence of point clouds, no correspondences of points are maintained across frames. Thus, detecting spatial and temporal redundancies is often difficult, making point cloud compression challenging. Octrees have been used extensively as a space partitioning structure to represent point cloud geometry; they are a 3D extension of the 2D quadtree used to encode video and images.

Research into point cloud compression can be broadly divided into two categories. The first is based on signal processing: Zhang et al. [41] proposed a method to compress point cloud attributes using a graph Fourier transform, assuming that an octree has been created and separately coded for geometry prior to coding attributes. De Queiroz and Chou [27] used a region-adaptive hierarchical transform in which the colors of nodes at lower levels of the octree predict the colors of nodes at the next level. As these approaches require expensive computations of graph Laplacians, they are not suitable for dynamic sequences in real-time applications. The second category of point cloud codecs is based on extending legacy solutions from image and video compression. Intra-frame coding in octrees can be achieved by entropy coding the occupancy codes, as shown in [23]; the authors then compress the color attributes by mapping them to a 2D grid and using legacy JPEG image compression. In 2017, MPEG started a standardization activity to determine a new standard codec for point clouds, to be launched in 2020, using the codec created by Mekuria et al. [23] as an anchor to evaluate proposals. To encode dynamic point cloud sequences, MPEG provides two verification models [32]: Geometry-PCC for point clouds with a sparse distribution, and Video-PCC for dense point clouds. V-PCC is based on leveraging existing 2D video codecs to compress point cloud geometry and attributes.

3 METHODOLOGY

3.1 Dataset Preparation

A dataset of dynamic point cloud sequences was used from the MPEG repository. All sequences were clipped to five seconds and sampled at 30 frames per second. This included point cloud sequences [13][14] captured using photogrammetry (Longdress, Loot, Red and black and Soldier, shown in Figure 3) and one sequence of a synthetic character sampled from an animated mesh (Queen). Four additional point cloud sequences, Manfred, Despoina, Sarge (shown in Figure 3) and Rachel, were added for the evaluation. These sequences were created using motion-captured animated mesh sequences. Keyframes were selected at 30 frames per second and extracted along with the associated mesh materials. Particular care was taken to ensure that the selected sequences have the characters facing the user and speaking in their general direction. Then, 1 million points were randomly sampled, independently per keyframe, to create a consistent ground-truth dataset. The points are sampled from the mesh surface with a probability proportional to the area of the underlying mesh face. This was done to ensure no direct point correspondences across point cloud frames, to mimic realistic acquisition and maintain consistency with the rest of the dataset. The point clouds sampled from meshes were used in test T1 and the point clouds captured using photogrammetry were used in test T2.
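The area-weighted sampling step described above can be illustrated as follows. This is a minimal sketch (not the authors' actual script): faces are drawn with probability proportional to their area, and uniform positions inside each triangle are generated with the usual square-root barycentric trick.

    import numpy as np

    def sample_surface(vertices, faces, n=1_000_000, seed=0):
        """Sample n points from a triangle mesh; vertices is (V, 3) float,
        faces is (F, 3) int. Per-face probability is proportional to area."""
        rng = np.random.default_rng(seed)
        a = vertices[faces[:, 0]]
        b = vertices[faces[:, 1]]
        c = vertices[faces[:, 2]]
        areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
        idx = rng.choice(len(faces), size=n, p=areas / areas.sum())
        # Uniform barycentric coordinates via the square-root trick.
        u = np.sqrt(rng.random((n, 1)))
        v = rng.random((n, 1))
        return (1 - u) * a[idx] + u * (1 - v) * b[idx] + u * v * c[idx]

Sampling each keyframe independently, as the protocol requires, guarantees that no point correspondences survive across frames.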
The X, Y, Z coordinates of each point are represented using unsigned integers, as is required for the current version of the V-PCC software. Colors are encoded as 8 bits per channel in the RGB color space. To encode the contents, we first use Release 7.0 of the MPEG V-PCC codec. For test T1, the configuration files provided by MPEG for the Queen sequence are used for all the contents. We select the rate points 1, 3 and 5 from the provided preset V-PCC configurations and extend them with an additional final rate point using a texture quantization parameter (QP) of 8, a geometry QP of 12 and an occupancy precision of 2. We re-label the rate points as R1, R2, R3 and R4, respectively. All sequences are encoded using the C2AI (Category 2 All Intra) configuration. For the photogrammetry sequences, we use the predefined dedicated configuration files for each sequence, at the same rate points. The V-PCC compressed bitstream was used to set the bitrate targets for R1 to R4, separately for each sequence.

We then use the MPEG anchor codec [23] in an all-intra configuration, and match the bit-rates per sequence and rate point (R1-R4) with a tolerance of 10%, as defined in the MPEG call for proposals. The codec was selected as it has a significantly lower encode and decode time and is suitable for real-time applications, as demonstrated by the authors. We use an octree depth from 7 to 10 for the rate points R1 to R4, respectively. The highest possible JPEG quantization parameter values were then chosen per sequence, while meeting the target bit rate set using V-PCC.

3.2 Experiment setup

All point cloud sequences were rendered using the Unity game engine, by storing all the points of each frame in a vertex buffer and then drawing procedural geometry on the GPU. The point clouds were rendered using a quadrilateral at each point location, with a fixed offset of 0.08 units around each point placed at the centre (this corresponds to a side length of approximately 2 mm), kept the same for all sequences to be consistent. In the case of bitrate R1 generated using the MPEG anchor, we increased the offset value to 0.16 by eye, as the resulting point clouds were too sparse (shown in Figure 2b). We maintain a fixed frame rate of 30 fps throughout the experiment.

Participants were asked to wear an Oculus Rift Head Mounted Display to view each of the point cloud sequences. For the 3DoF condition, participants were asked to sit on a swivel chair placed at a fixed location in the room and navigate using head movements alone. For the 6DoF condition, participants were allowed to navigate freely within the room, as shown in Figure 1. Each sequence was 5 seconds long, after which the playback looped. We set the background of the virtual room to mid-grey, to avoid distractions. The Oculus Guardian System was used to display in-application wall and floor markers if the participants got too close to the boundary. We used a workstation with 2 GeForce GTX 1080 Ti in SLI for the GPU and an Intel Core i9 Skylake-X 2.9GHz CPU.

3.3 Subjective methodology

To perform the experiments, the subjective methodology Absolute Category Rating with Hidden References (ACR-HR) was selected, according to ITU-T Recommendation P.910 [15]. Participants were asked to observe the video sequences depicting digital humans, and rate the corresponding visual quality on a scale from 1 to 5 (1-Bad, 2-Poor, 3-Fair, 4-Good, and 5-Excellent).
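For reference, the statistics later derived from these ACR ratings (Section 3.4) are simple to compute. The sketch below (function name ours) returns the mean opinion score of one stimulus together with the half-width of a confidence interval based on a Student's t-distribution, in the style of the ITU-T recommendations followed in this study.

    import numpy as np
    from scipy import stats

    def mos_with_ci(scores, confidence=0.95):
        """MOS and CI half-width for one stimulus, assuming a Student's
        t-distribution over the per-subject ratings (sketch)."""
        s = np.asarray(scores, dtype=float)
        t = stats.t.ppf(1 - (1 - confidence) / 2, df=len(s) - 1)
        return s.mean(), t * s.std(ddof=1) / np.sqrt(len(s))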
A series of pilot studies was conducted to determine the positioning of digital humans in the virtual space and the length of each sequence, to ensure the sequences ran smoothly within the limited computer RAM. Due to the huge size of the test material, it was not possible to evaluate all 8 point cloud contents in one single session, as long loading times would have fatigued the participants and corrupted the results. Thus, we decided to split the evaluation into two separate tests: one focused on the evaluation of contents obtained from random sampling of meshes (T1: contents Queen, Manfred, Despoina and Sarge), and one focused on contents acquired through photogrammetry (T2: contents Long dress, Soldier, Red and black, and Loot). From each sequence, a subset of frames comprising 5 seconds was selected.

Before the test took place, 3 training sequences depicting examples of 1-Bad, 5-Excellent and 3-Fair were shown to the users to help them familiarize themselves with the viewing condition and test setup, and to guide their rating. The training sequences were created using one additional content not shown during the test, to prevent biased results. For test T1, content Ana was selected, whereas for test T2, content Ulli Wagner was chosen. Each content sequence was encoded using the point cloud compression algorithms under test. For each test and viewing condition, 36 stimuli were evaluated. For each stimulus, the 5-second sequence was played at least once in full, and kept looping until the participants gave their score. The order of the displayed stimuli was randomized per participant and per viewing condition, and the same content was never displayed twice in a row to avoid bias. Moreover, the presentation order of viewing conditions was randomized between participants, to prevent any confounding effect. Two dummy samples were added at the beginning of each viewing session to ease participants into the task, and the corresponding scores were subsequently discarded.

After each viewing condition, participants were requested to fill in the Igroup Presence Questionnaire (IPQ) [31] on a 1-7 discrete scale (1=fully disagree to 7=totally agree) and the Simulator Sickness Questionnaire (SSQ) on a 1-4 discrete scale (1=none to 4=severe) [18]. The IPQ has three subscales, namely Spatial Presence (SP), Involvement (INV) and Experienced Realism (REAL), and one additional general item (G) not belonging to a subscale, which assesses the general "sense of being there" and has high loadings on all three factors, with an especially strong loading on SP [31]. The SSQ was developed to measure cybersickness in computer simulation and was derived from a measure of motion sickness [18]. For both T1 and T2, after the two viewing conditions, participants were interviewed to 1) compare their experiences of assessing quality in 3DoF and 6DoF, and 2) reflect on the factors they considered when assessing the quality.

A total of 27 participants were recruited for T1 (12 males, 15 females, average age: 22.48 years), whereas 25 participants were recruited for T2 (17 males, 8 females, average age: 28.39 years). All participants were screened for color vision and visual acuity, using Ishihara and Snellen charts, respectively, according to ITU-T Recommendation P.910 [15].
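The ordering constraint described above (randomized per participant and per viewing condition, with the same content never shown twice in a row) can be enforced with straightforward rejection sampling. The sketch below is hypothetical: it assumes each stimulus is a record exposing its content name.

    import random

    def randomized_order(stimuli, key=lambda s: s["content"], max_tries=1000):
        """Shuffle stimuli so that no two consecutive items share a content."""
        for _ in range(max_tries):
            order = random.sample(stimuli, len(stimuli))
            if all(key(x) != key(y) for x, y in zip(order, order[1:])):
                return order
        raise RuntimeError("no valid ordering found")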
Figure 3: Sequences used for the test, from left to right: Manfred, Sarge, Despoina, Queen, Longdress, Loot, Red and black, Soldier

3.4 Data analysis

Outlier detection was performed separately for each test, T1 and T2, according to ITU-T Recommendation P.913 [16]. The recommended threshold values r1 = 0.75 and r2 = 0.8 were used. One outlier was found in test T1, and the corresponding scores were discarded. No outliers were found in the scores collected for test T2.

After outlier detection, the Mean Opinion Score (MOS) was computed for each stimulus, independently per viewing condition. The associated 95% Confidence Intervals (CIs) were obtained assuming a Student's t-distribution. Additionally, the Differential MOS (DMOS) was obtained by applying HR removal, following the procedure described in ITU-T Recommendation P.913 [16]. Non-parametric statistical analysis was applied to understand whether statistical differences could be found among variables, using the MATLAB Statistics and Machine Learning Toolbox, along with the ARTool package in R [17].

4 RESULTS

4.1 Subjective quality assessment

Figures 4 and 5 show the results of the subjective quality assessment of the contents comprising test T1 and test T2, respectively, for both 3DoF and 6DoF viewing conditions. In particular, the MOS scores associated with the compressed contents are shown with solid lines, along with relative CIs, whereas the dashed lines represent the respective DMOS scores. The HR scores for each content are represented with a solid line to indicate the mean, and a shaded plot for the corresponding CIs.

To assess whether significant differences could be found between the two visual conditions under test, we ran a Wilcoxon signed-rank test on the scores obtained in the two DoF scenarios. The Wilcoxon test was chosen as the gathered data was not found to be normally distributed, according to the Shapiro-Wilk normality test (W = 0.90, p < .001 and W = 0.91, p < .001 for tests T1 and T2, respectively). Results of the Wilcoxon signed-rank test showed statistical significance for DoF for test T1 (Z = 2.97, p = 0.0029, r = 0.07), whereas for test T2, no significance was found (Z = −1.96, p = 0.0502, r = 0.05). Values seem to indicate an effect of the DoF in test T1; however, the small r-value indicates that while the effect apparently exists, it is small.

It can be observed that codec V-PCC generally performs more favorably than the MPEG anchor. This is especially evident for the contents acquired through photogrammetry (see Fig. 5), for which the gap between the two codecs is more pronounced. The Wilcoxon signed-rank test confirmed statistical significance for the two codecs (T1: Z = 9.87, p < .001, T2: Z = 20.18, p < .001), albeit with different effect sizes between test T1 and T2 (r = 0.24 and r = 0.50, respectively).

A Friedman rank test performed on the scores revealed a significant effect of the content on the final scores, for both sets of contents (T1: χ2 = 57.38, p < .001, T2: χ2 = 17.31, p < .001). Table 1 shows the results of the post-hoc test conducted using the Wilcoxon signed-rank test with Bonferroni correction (α = .05/6).
Table 1: Pairwise post-hoc test on the contents for test T1 and T2, using Wilcoxon signed-rank test with Bonferroni correction.

                                     Z        p        r
  T1  Manfred - Sarge                3.78     <.001    0.12
      Manfred - Despoina             2.09     0.036    0.07
      Manfred - Queen                7.48     <.001    0.25
      Sarge - Despoina               1.30     0.192    0.04
      Sarge - Queen                  9.94     <.001    0.33
      Despoina - Queen               8.79     <.001    0.29
  T2  Long dress - Loot              7.03     <.001    0.23
      Long dress - Red and black     1.08     0.279    0.05
      Long dress - Soldier           4.11     <.001    0.14
      Loot - Red and black           6.42     <.001    0.21
      Loot - Soldier                 3.32     <.001    0.11
      Red and black - Soldier        3.10     0.002    0.10

Table 2: Pairwise post-hoc test on the bitrates for test T1 and T2, using Wilcoxon signed-rank test with Bonferroni correction.

                  Z         p        r
  T1  R1 - R2     -14.21    <.001    0.50
      R1 - R3     -16.85    <.001    0.60
      R1 - R4     -17.08    <.001    0.60
      R2 - R3     -12.61    <.001    0.45
      R2 - R4     -14.45    <.001    0.51
      R3 - R4     -8.75     <.001    0.30
  T2  R1 - R2     -14.20    <.001    0.50
      R1 - R3     -16.85    <.001    0.60
      R1 - R4     -17.08    <.001    0.60
      R2 - R3     -12.61    <.001    0.45
      R2 - R4     -14.45    <.001    0.51
      R3 - R4     -8.57     <.001    0.30

Contents Manfred, Sarge and Despoina all show statistical significance with respect to content Queen (p < .001, r > 0.20 for all pairs). Statistical significance has also been observed between contents Manfred and Sarge, albeit with a smaller effect size (p < .001, r = 0.12). For contents acquired through photogrammetry, statistical significance was found between contents Long dress and Loot, and Loot and Red and black (p < .001, r > 0.20 in both cases), as well as between contents Long dress and Soldier, and Loot and Soldier (p < .001, r > 0.10), and Red and black and Soldier (p = 0.0019, r = 0.10). Results corroborate our previous statements on how contents Long dress and Red and black appeared to be given different scores with respect to contents Loot and Soldier.

We also ran a Friedman rank test on the scores to assess whether the selected bit-rates showed statistical significance. Results confirmed that the bit-rates have a significant effect for both tests (T1: χ2 = 682.29, p < .001, T2: χ2 = 667.39, p < .001). Post-hoc analysis using the Wilcoxon signed-rank test with Bonferroni correction (α = .05/6), shown in Table 2, further confirmed that all pairwise comparisons were statistically significant, for both test T1 and T2 (p < .001, r > 0.30 for all pairs).

Figure 4: MOS (solid line) and DMOS (dashed line) against achieved bit-rate, expressed in Mbps, for contents Manfred, Sarge, Despoina and Queen. HR scores are shown using a shaded yellow plot. Each column represents a content in test T1, whereas the first and second rows depict results obtained using the viewing conditions 3DoF and 6DoF, respectively.

Figure 5: MOS (solid line) and DMOS (dashed line) against achieved bit-rate, expressed in Mbps, for contents Long dress, Loot, Red and black and Soldier. HR scores are shown using a shaded yellow plot. Each column represents a content in test T2, whereas the first and second rows depict results obtained using the viewing conditions 3DoF and 6DoF, respectively.

In order to further analyze the effect of DoF conditions, contents, codecs and bit-rates, and their interactions, on the gathered scores, we fitted a full linear mixed-effects model on the data, accounting for randomness introduced by the participants. Due to the non-normality of our data, the aligned rank transform was applied prior to the fitting [39].
Since the transform is designed for a fully randomized test, it is not suitable for the scores collected during the test, as the HR addition makes the design matrix rank deficient. However, the transform can be applied to the differential scores used to obtain DMOS, as they follow a fully randomized design. Thus, it was decided to perform the analysis on the differential scores.

For test T1, analysis of deviance on the full mixed-effects model showed significance for main effects Content (F = 48.14, df = 3, p < .001), Codec (F = 51.01, df = 1, p < .001) and bit-rate (F = 375.35, df = 3, p < .001), but not for DoF (F = 0.0003, df = 1, p = 0.988). Moreover, significant interaction effects were found for DoF - Content (F = 4.31, df = 3, p = 0.005), Content - bit-rate (F = 5.88, df = 9, p < .001) and Codec - bit-rate (F = 4.73, df = 3, p = 0.003). Post-hoc interaction analysis with Holm p-value adjustment indicates that the difference between 3DoF and 6DoF has statistical significance at the 5% level when comparing contents Manfred and Queen (χ2 = 10.34, p = 0.008), as well as Inspector and Queen (χ2 = 8.35, p = 0.019). In other words, the relative difference in scores between contents Manfred and Queen (and Inspector and Queen) was not found to be statistically equivalent in 3DoF with respect to 6DoF. This indicates that the DoF might have an effect on how contents are scored with respect to one another, for example by increasing or reducing their differences. Regarding the interaction effect between contents and bit-rates, post-hoc interaction analysis with Holm p-value correction showed statistical significance in differences between contents Manfred and Queen at bit-rates R2 and R4 (χ2 = 29.52, p < .001), between contents Sarge and Despoina at bit-rates R2 and R4 (χ2 = 11.00, p = 0.028), between Sarge and Queen at bit-rates R2 and R4 (χ2 = 11.56, p = 0.022), and between Despoina and Queen at bit-rates R1 and R2 (χ2 = 13.75, p = 0.007), R2-R3 (χ2 = 13.59, p = 0.007) and R2-R4 (χ2 = 45.13, p < .001). Results can be explained considering that the low HR scores given to content Queen meant a narrower range of ratings. Thus, bit-rate point R2, for example, presents relatively higher differential scores for Queen with respect to the rest of the contents, whereas for bit-rate point R4, due to the HR removal, all contents have similar ratings. This is reflected in the statistical analysis conducted on the scores. Finally, post-hoc interaction analysis with Holm p-value adjustment on differences between codecs and bit-rates shows that the difference among codecs is statistically significant at the 5% level only between R1 and R2 (χ2 = 10.51, p = 0.007), R1 and R3 (χ2 = 7.09, p = 0.031), and R1 and R4 (χ2 = 10.17, p = 0.007). This indicates that the differences between codecs remain constant at all bit-rates, except for R1. This is in line with what is observed in Fig. 4, which shows similar trends for codec V-PCC with respect to the MPEG anchor, except for the lowest bit-rate point, for which V-PCC achieves better performance.

Results of analysis of deviance on the full mixed-effects model for test T2 showed significance for main effects Content (F = 139.41, df = 3, p < .001), Codec (F = 692.24, df = 1, p < .001) and bit-rate (F = 485.11, df = 3, p < .001), but not for DoF (F = 2.57, df = 1, p = 0.115), similarly to what was seen for test T1.
Interactions were found significant at the 5% level between Content and Codec (F = 3.81, df = 3, p = 0.01), Content and bit-rate (F = 3.03, df = 9, p = 0.001), and Codec and bit-rate (F = 39.40, df = 3, p < .001). The lack of significance in interactions involving DoF is in line with the results of the Wilcoxon signed-rank test, which showed no significance for DoF in test T2 (Z = −1.96, p = 0.0502, r = 0.05). Post-hoc interaction analysis with Holm p-value adjustment shows significance at the 5% level for the differences among codecs for content Long dress with respect to content Loot (χ2 = 10.09, p = 0.009). This confirms what can be seen in Fig. 5: the gap between codecs is more prominent for content Loot with respect to Long dress, probably due to the reduced range associated with a low-rated HR. Post-hoc analysis on the interaction between contents and bit-rates indicates statistical significance at the 5% level for differences among contents Long dress and Soldier when considering differences between bit-rates R1-R4 (χ2 = 17.03, p = 0.001) and R2-R4 (χ2 = 11.81, p = 0.021), and among contents Red and black and Soldier for differences between R1 and R4 (χ2 = 11.80, p = 0.021). Again, this can be explained considering that both Long dress and Red and black received remarkably lower scores, which resulted in a narrower rating range. Thus, differences among lowest and highest bit-rates are quite different between those two contents and Soldier, which benefited from a larger rating span. Lastly, post-hoc analysis on the interaction between codecs and bit-rates reveals statistical significance at the 5% level for all pairwise comparisons, except R1-R3 (R1-R2: χ2 = 14.60, p < .001, R1-R4: χ2 = 46.58, p < .001, R2-R3: χ2 = 13.81, p < .001, R2-R4: χ2 = 113.34, p < .001, R3-R4: χ2 = 48.02, p < .001). Indeed, in Fig. 5 it is quite evident that the curves for the two codecs follow different trends. In particular, codec V-PCC seems to saturate between R2 and R3, whereas a steeper slope is observed for the MPEG anchor.

Figure 6: Average time spent looking at the sequence (in seconds) and relative CIs, against score given to the sequence, for 3DoF (blue) and 6DoF (red), in test T1 (left) and T2 (right).

4.2 Additional questionnaires and interaction data

4.2.1 IPQ & SSQ Questionnaires

For T1 and T2, the collected IPQ data under each subscale are all normally distributed, as examined by the Shapiro-Wilk test (p > 0.05). A paired-sample t-test was applied to check the differences between 3DoF and 6DoF in terms of SP, INV, REAL and G. For T1, there was a significant difference in SP between 3DoF (M=4.13, SD=0.92) and 6DoF (M=5.04, SD=0.67), t(26)=-4.44, p < .001, Cohen's d = 0.52, and also a significant difference in G between 3DoF (M=4.11, SD=1.28) and 6DoF (M=4.96, SD=1.13), t(26)=-2.60, p < .01, Cohen's d = 0.64. For T2, SP was also significantly different between 3DoF (M=4.16, SD=1.17) and 6DoF (M=4.83, SD=1.12), t(24)=-3.48, p < .01, Cohen's d = 0.45, and so was G between 3DoF (M=4.20, SD=1.61) and 6DoF (M=5.08, SD=1.19), t(24)=-3.56, p < .01, Cohen's d = 0.71. Other factors showed no significant differences between 3DoF and 6DoF in either T1 or T2. With respect to the SSQ, no significant differences (p > 0.05) were found between 3DoF and 6DoF in terms of cybersickness.
We further tested whether there were order effects in experiencing cybersickness, where half of the participants started with 6DoF as the first condition and 3DoF as the second, and the remainder the inverse. No significant order effects (p > 0.05) were found in experiencing cybersickness.

4.2.2 Interaction time

Interaction time was found to be strongly correlated with MOS values in a study conducted on light field image quality assessment [38]. In particular, it was found that users tended to spend more time interacting with contents at high quality, whereas for low quality scores, less time was spent looking at the contents. In order to see whether similar trends could be observed in our data, we compared the average time spent watching the sequence in 3DoF and 6DoF, separately for each quality score given by the participants. Results are shown in Fig. 6. A positive trend can be observed between the given score and the average time spent looking at the sequence, with the exception of score 5, which for test T2 shows a negative trend with respect to the time. However, it should be considered that, on average, a small percentage of scores equal to 5 were given in test T2 (10% of the total scores); thus, variations may be due to the difference in sample size. It is also worth noting that, on average, participants spent more time looking at the sequences in 6DoF with respect to the 3DoF case. Indeed, several participants pointed out that the lowest scores were the fastest to be given, whereas for higher quality, it was harder to decide on the rating.

4.2.3 Interviews

We asked the same interview questions for T1 and T2, so we combined the interview transcripts of 52 participants (T1=27, T2=25). The categorized answers are presented as follows:

Factors considered when assessing quality. 56% of the participants mentioned that they assessed the quality based on three criteria: 1) overall outline and pattern distortion on body and clothes, 2) natural gestures and movements of the digital humans, and 3) visual artifacts such as blockiness, blurriness, and extraneous floating artifacts. 48% of the participants mentioned that the quality assessment criteria are content related; they agreed that it is easier to spot artifacts in content with complex patterns (e.g., Long dress) and dominant colors (e.g., Red and black) than in content with uniform colors (e.g., Soldier and Sarge). 46% of the participants considered facial expressions a factor that could not be ignored for quality assessment, which they believe is an important cue for social connectedness. As for the extraneous floating artifacts (e.g., bubbles flickering outside the digital humans), 23% found them very annoying and lowered their overall quality ratings for the content, but a few participants (8%) thought these artifacts did not influence their quality judgement.

Difficulties in assessment. 42% of the participants pointed out difficulties in assessing the quality, especially for the high quality contents, which are not perfect and still show imperfections such as blurry faces or wrong fingers. 15% of the participants specifically pointed out that it is difficult to distinguish between quality levels 3 to 5. 17% of the participants commented that rating the quality gradually became easier as they adapted to the contents, so the second viewing condition was easier for them.

Comparison between 3DoF and 6DoF.
52% of the participants preferred 6DoF, because it allowed them to move closer to examine details (e.g., shoes and fingers), and the experience felt more realistic when walking in the virtual space. However, they also commented that 3DoF offered a fixed distance between them and the digital humans, enabling a more stable and focused assessment. 21% of the participants preferred the relaxation and passiveness of 3DoF: they did not find much difference between 3DoF and 6DoF in terms of quality assessment, but they found 3DoF less nauseating than 6DoF.

4.3 Analysis of results

Results vary considerably depending on the content under assessment. In particular, for test T1, content Queen is generally given lower ratings with respect to the other contents in the test. This is made evident by the MOS score given to the HR, which is equal to 3.35 for both the 3DoF and 6DoF conditions, indicating that even when uncompressed, the content was never considered as having good quality. As a result, the MOS scores computed for the content have a limited range, spanning between 1.08 and 3.35 for the 6DoF case, and between 1 and 2.58 for the 3DoF case (excluding HR). Such a narrow range is inadequate for expressing the quality variations among different compression parameters: for the 3DoF case in particular, a paired t-test at 5% significance shows that bit-rate points R3 and R4 are statistically equivalent for both codecs, and for codec V-PCC, R2 is considered statistically equivalent to R4, despite the latter being 10 times as large. Statistical analysis results confirmed that content Queen showed different rating patterns with respect to the other contents. The ratings given to the rest of the contents comprising T1 have a larger range, seemingly covering the entire rating space. Trends show that codec V-PCC is generally preferred to the MPEG anchor, especially at low bit-rates, whereas for the highest bit-rate point R4, the codecs are always statistically equivalent at the 5% confidence level, in both 3DoF and 6DoF. It is worth noting that the two codecs seldom reach the same quality level as the uncompressed HRs. In particular, for the 3DoF viewing condition, transparent quality (as in, the level of quality for which the distortions are "transparent" to the user, meaning that statistical equivalence with the HR has been observed) is only achieved by content Manfred at bit-rate R4, by both codecs.
On the other hand, for contents that were given higher scores (Loot and Soldier) and use the full rating space, results indicate what was already seen in test T1: no codec is able to reach transparent quality, meaning that scores given to the compressed content are always statistically different with respect to the HR. Decisions on which codec to employ should be made depending on the use case. The MPEG anchor is more suitable for real-time system, due to its fast encoding time, and at high enough bitrates, differences with the other codec become less noticeable. V-PCC, on the other hand, might be more appropriate for on-demand streaming and storage, since it retains better quality for the same bitrate. For the majority of the contents under test, a bitstream size between 20 and 40Mbps seems to provide an acceptable quality. However, regarding the selection of the appropriate target bitrate, the decision should be made taking into account other factors, such as network conditions, available bandwith and scene complexity. Statistical analysis showed a small effect of the chosen DoF condition on the gathered scores for test T1. In general, the two visualization scenarios led to similar trends in MOS values; however, several participants pointed out that, while 3DoF offered a more stable assessment, as the same point of view is used for all contents, 6DoF felt more realistic. Any decision between the two viewing conditions for quality assessment, thus, should be made consider- ing the trade-off between immersive, personalized experience, and fairness of comparison between solutions. 5 DISCUSSION 5.1 Datasets Despite the rich literature in point cloud acquisition and compression, few point cloud datasets are publicly available. This is especially true when considering point cloud datasets depicting photo-realistic humans. One of the most popular and widely used full-body dataset, created by 8i Labs [13], consists of only 4 individual contents, whereas the HHI Fraunhofer dataset has 1 individual content [14]. In the context of point cloud compression, such scarcity of available data may lead to compression solutions being designed, optimized and tested while considering a considerably narrow range of input data, thus leading to algorithms that are overfitted to the specifics of the acquisition method used to obtain the contents. The conse- quences of such a scenario are reflected in our results. Whereas for the contents assessed in test T2 a large difference was observed between codec V-PCC and the MPEG anchor, for the contents in test T1 the gap was markedly lower, and indeed the significance of 133 the effect of the codec selection had a smaller effect size for test T1 with respect to test T2, as seen in section 4.1. Test T2 consisted of contents that had been used in multiple quality assessment experi- ments [8, 9, 11, 36], notably including the performance evaluation of the upcoming MPEG standard [32]. On the other hand, test T1 in- cluded contents that have not been used so far in assessment of point cloud compression solutions. The discrepancies in the results of the subjective quality assessment campaign indicate that performance gains may vary considerably when new contents are evaluated. A larger body of contents depicting digital humans, involving several acquisition technologies, is needed in order to properly design, train and evaluate new compression solutions in a robust way. 
5.2 Personal preferences and bias Subjective evaluation experiments are complicated by many aspects of human psychology and viewing conditions, such as participants’ vision ability, translation of quality perception into ranking scores, adaptations and personal preferences for contents. Through carefully following the ITU-T Recommendations P.913 [16], we are able to control some of the aspects. For example, eliminate the scores given by the participants with vision problems; train participants to help them understand the quality levels; randomize the stimuli and viewing conditions to minimize the order effects. However, we noticed that personal preferences towards certain contents are difficult to control. Satgunam et al. [29]) found that their participants were divided into two preference groups: prefer sharper content versus smoother content. Similarly, Kortum and Sullivan [19] found that the ”desirability” of participants had an impact on video quality responses, with a more desirable video clip being given a higher rating. In our experiments, content Queen is generally given lower ratings with respect to the other contents. In the interviews, many participants (27%) expressed dislike towards Queen, because of her lifeless look and static gestures; 40% showed their preference towards Soldier, due to his high-resolution facial features, unitoned clothes and natural movements. This observation suggests that quality assessment may need to be adjusted based on content and viewer preferences, and offering training with different contents. 5.3 Technological constraints and limitations The two codecs used in this experiment introduce different distor- tions during compression. As the MPEG anchor codec uses the octree data structure to represent geometry, the number of points in the decoded cloud varies exponentially based on the tree depth. Thus, at lower bitrates, the decoded point clouds are quite sparse, and when the point size is increased to make them appear watertight, they have a block-y appearance. This codec design allows for future optimizations based on human perception of 3D objects in VR. The low delay encoding and decoding of this codec makes it suitable for real time applications such as social VR. On the other hand, the V-PCC codec leverages existing 2D video codecs to compress both geometry and color, which introduces noise in terms of extraneous objects, and general geometric artifacts such as misaligned seams. However, the approach yields better results at low bitrates, as demon- strated in our results. The codec is optimized for human perception of 2D video and this might not transfer to perception of 3D objects in VR. The mapping from 3D to 2D is critical to codec performance, thus the encoding phase has high complexity. Decoding has a lower delay, as it benefits from hardware acceleration of video decoders on GPUs, making this approach suitable for on demand streaming. One of the main shortcoming of both compression solutions lays in their inability to reach visually-lossless quality, as demonstrated by our results. Achieving a visually pleasant result is of paramount importance for the market adoption of the technology; indeed, poor visual quality might lead consumers to tune off from the experience altogether [1]. Visual perception should be taken into account when designing compression solutions, especially at high bitrates, to en- sure that in absence of strict bandwidth constraints, excellent quality can be achieved. 
5.4 Protocols for subjective assessment in VR

Choosing the right methodology for collecting users' opinions is a delicate matter, as it can influence the statistical power of the collected scores, and in some cases lead to differences in results. Single stimulus methodologies, in particular, lead to larger CIs with respect to double stimulus methodologies, and are more subject to influence from individual content preferences [16]. An early study comparing single and double stimulus methodologies for the evaluation of colorless point cloud contents indicated that the latter was more consistent in recognizing the level of impairment, as relative differences facilitate the rating task [4]. However, the study pointed out that the single stimulus methodology shows more discrimination power for compression-like artifacts, albeit at the cost of wider CIs. Double stimulus methodologies, while commonly used in video quality assessment and widely adopted in 2D-based quality assessment of point cloud contents [8, 11, 32], are tricky to adopt in VR technology, due to the difficulty of displaying both contents simultaneously in a perceptually satisfying way [26], while ensuring a fair comparison between the contents under evaluation. When dealing with interactive methodologies, in particular, synchronous display of any viewport modification is usually enforced, to ensure that the two contents are always visible under the same conditions [8, 38]. This is clearly challenging to implement in a 6DoF scenario, in which users are free to change their position in the VR space at any given time. Positioning the two contents side by side in the same virtual space would mean that, at any given time, they are seen from two different angles; the same problem would arise when temporal sequencing is employed. A toggle-based method like the one proposed in [26] is not applicable to moving sequences, as different frames would be seen between stimuli.

In our study, we saw that content preference had an impact on the ratings, as several contents were deemed of lower quality, as the scores given to the HR exemplify. Such bias resulted in a reduced rating range for those contents. Results of the interviews also pointed out that naturalness of gestures was an important criterion in assessing the visual quality. Such components would not normally be evaluated in a double stimulus scenario; however, they are important in understanding how human perception reacts to digital humans.

6 CONCLUSION

We compare the performance of the point cloud compression standard V-PCC against an octree-based anchor codec (MPEG anchor). Participants were invited to assess the quality of digital humans represented as dynamic point clouds, in both 3DoF and 6DoF conditions. The results indicate that codec V-PCC has a more favorable performance than the MPEG anchor, especially at low bit-rates. For the highest bit-rate, the two codecs are often statistically equivalent. Results indicate that the content under test has a significant influence on how the scores are distributed; thus, new data sets are needed in order to comprehensively evaluate compression distortions. Moreover, current encoding solutions, while efficient at low bitrates, are unable to provide visually lossless results, even when large volumes of data are available, revealing significant shortcomings in point cloud compression.
We also point out that commonly-used double stimulus methodologies for quality evaluation often reduce the rating task to difference recognition, while insights on the quality of the original contents are missed.

ACKNOWLEDGMENTS

This work is funded by the European Commission H2020 program, under grant agreement 762111, VRTogether, http://vrtogether.eu/

REFERENCES

[1] OTT: Beyond Entertainment Consumer Survey Report. https://www.conviva.com/research/ott-beyond-entertainment/.
[2] D. S. Alexiadis, D. Zarpalas, and P. Daras. Real-time, full 3-D reconstruction of moving foreground objects from multiple consumer depth cameras. IEEE Transactions on Multimedia, 15(2):339–358, 2012.
[3] E. Alexiou and T. Ebrahimi. On subjective and objective quality evaluation of point cloud geometry. In 2017 Ninth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1–3. IEEE, 2017.
[4] E. Alexiou and T. Ebrahimi. On the performance of metrics to predict quality in point cloud representations. In Applications of Digital Image Processing XL, vol. 10396, p. 103961H. International Society for Optics and Photonics, 2017.
[5] E. Alexiou and T. Ebrahimi. Impact of visualisation strategy for subjective quality assessment of point clouds. In 2018 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 1–6. IEEE, 2018.
[6] E. Alexiou and T. Ebrahimi. Point cloud quality assessment metric based on angular similarity. In 2018 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6. IEEE, 2018.
[7] E. Alexiou, E. Upenik, and T. Ebrahimi. Towards subjective quality assessment of point cloud imaging in augmented reality. In 2017 IEEE 19th International Workshop on Multimedia Signal Processing (MMSP), pp. 1–6. IEEE, 2017.
[8] E. Alexiou, I. Viola, T. M. Borges, T. A. Fonseca, R. L. de Queiroz, and T. Ebrahimi. A comprehensive study of the rate-distortion performance in MPEG point cloud compression. APSIPA Transactions on Signal and Information Processing, 8:27, 2019. doi: 10.1017/ATSIP.2019.20
[9] E. Alexiou, P. Xu, and T. Ebrahimi. Towards modelling of visual saliency in point clouds for immersive applications. In 26th IEEE International Conference on Image Processing (ICIP), 2019.
[10] J. Constine. Facebook animates photo-realistic avatars to mimic VR users' faces, 2018.
[11] L. A. da Silva Cruz, E. Dumić, E. Alexiou, J. Prazeres, R. Duarte, M. Pereira, A. Pinheiro, and T. Ebrahimi. Point cloud quality evaluation: Towards a definition for test conditions. In 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), pp. 1–6. IEEE, 2019.
[12] R. L. de Queiroz and P. A. Chou. Motion-compensated compression of dynamic voxelized point clouds. IEEE Transactions on Image Processing, 26(8):3886–3895, 2017.
[13] E. d'Eon, B. Harrison, T. Myers, and P. A. Chou. 8i Voxelized Full Bodies - A Voxelized Point Cloud Dataset, ISO/IEC JTC1/SC29 Joint WG11/WG1 (MPEG/JPEG) input document WG11M40059/WG1M74006, Geneva. January 2017.
[14] T. Ebner, I. Feldmann, O. Schreer, P. Kauff, and T. v. Unger. HHI Point cloud dataset of a boxing trainer, ISO/IEC JTC1/SC29 Joint WG11/WG1 (MPEG/JPEG) input document MPEG2018/m42921, Ljubljana. July 2018.
[15] ITU-T P.910. Subjective video quality assessment methods for multimedia applications. International Telecommunication Union, April 2008.
[16] ITU-T P.913.
Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment. International Telecommunication Union, March 2016.
[17] M. Kay and J. Wobbrock. mjskay/artool: ARTool 0.10.6, Feb. 2019. doi: 10.5281/zenodo.2556415
[18] R. S. Kennedy, N. E. Lane, K. S. Berbaum, and M. G. Lilienthal. Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness. The International Journal of Aviation Psychology, 3(3):203–220, 1993.
[19] P. Kortum and M. Sullivan. The effect of content desirability on subjective video quality ratings. Human Factors, 52(1):105–118, 2010.
[20] S. Y. Liaw, G. A. C. Carpio, Y. Lau, S. C. Tan, W. S. Lim, and P. S. Goh. Multiuser virtual worlds in healthcare education: A systematic review. Nurse Education Today, 65:136–149, 2018.
[21] J.-L. Lugrin, M. Landeck, and M. E. Latoschik. Avatar embodiment realism and virtual fitness training. In 2015 IEEE Virtual Reality (VR), pp. 225–226. IEEE, 2015.
[22] K. Mammou. PCC test model category 2 v0. ISO/IEC JTC1/SC29/WG11 N17248, 1, 2017.
[23] R. Mekuria, K. Blom, and P. Cesar. Design, implementation, and evaluation of a point cloud codec for tele-immersive video. IEEE Transactions on Circuits and Systems for Video Technology, 27(4):828–842, 2017.
[24] S. Narang, A. Best, A. Feng, S.-h. Kang, D. Manocha, and A. Shapiro. Motion recognition of self and others on realistic 3D avatars. Computer Animation and Virtual Worlds, 28(3-4):e1762, 2017.
[25] S. Orts-Escolano, C. Rhemann, S. Fanello, W. Chang, A. Kowdle, Y. Degtyarev, D. Kim, P. L. Davidson, S. Khamis, M. Dou, et al. Holoportation: Virtual 3D teleportation in real-time. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology, pp. 741–754. ACM, 2016.
[26] A.-F. Perrin, C. Bist, R. Cozot, and T. Ebrahimi. Measuring quality of omnidirectional high dynamic range content. In Applications of Digital Image Processing XL, vol. 10396, p. 1039613. International Society for Optics and Photonics, 2017.
[27] R. L. de Queiroz and P. A. Chou. Compression of 3D point clouds using a region-adaptive hierarchical transform. IEEE Transactions on Image Processing, 25, June 2016.
[28] D. Roth, K. Waldow, M. E. Latoschik, A. Fuhrmann, and G. Bente. Socially immersive avatar-based communication. In 2017 IEEE Virtual Reality (VR), pp. 259–260. IEEE, 2017.
[29] P. N. Satgunam, R. L. Woods, P. M. Bronstad, and E. Peli. Factors affecting enhanced video quality preferences. IEEE Transactions on Image Processing, 22(12):5146–5157, 2013.
[30] O. Schreer, I. Feldmann, T. Ebner, S. Renault, C. Weissig, D. Tatzelt, and P. Kauff. Advanced volumetric capture and processing. SMPTE Motion Imaging Journal, 128(5):18–24, 2019.
[31] T. W. Schubert. The sense of presence in virtual environments: A three-component scale measuring spatial presence, involvement, and realness. Zeitschrift für Medienpsychologie, 15(2):69–71, 2003.
[32] S. Schwarz, M. Preda, V. Baroncini, M. Budagavi, P. Cesar, P. A. Chou, R. A. Cohen, M. Krivokuća, S. Lasserre, Z. Li, et al. Emerging MPEG standards for point cloud compression. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 9(1):133–148, 2018.
[33] M. Seymour, K. Riemer, and J. Kay. Actors, avatars and agents: potentials and implications of natural face technology for the creation of realistic visual presence. Journal of the Association for Information Systems, 19(10):953–981, 2018.
[34] M. Slater and M. V. Sanchez-Vives.
work_dgpqd6codbg2xgxz7q6bcgdncq ----

Exporting Finnish Digitized Historical Newspaper Contents for Offline Use

D-Lib Magazine, July/August 2016, Volume 22, Number 7/8

Tuula Pääkkönen, National Library of Finland, Centre for Preservation and Digitization, tuula.paakkonen@helsinki.fi
Jukka Kervinen, National Library of Finland, Centre for Preservation and Digitization, jukka.kervinen@helsinki.fi
Asko Nivala, University of Turku, Department of Cultural History, asko.nivala@utu.fi
Kimmo Kettunen, National Library of Finland, Centre for Preservation and Digitization, kimmo.kettunen@helsinki.fi
Eetu Mäkelä, Aalto University, eetu.makela@aalto.fi

DOI: 10.1045/july2016-paakkonen

Abstract

Digital collections of the National Library of Finland (NLF) contain over 10 million pages of historical newspapers, journals and some technical ephemera. The material ranges from the earliest Finnish newspapers of 1771 until the present day. The material up to 1910 can be viewed in the public web service, whereas anything later is available at the six legal deposit libraries in Finland. A recent user study found that researcher use is one of the key uses of the collection. The National Library of Finland has received several requests to provide the content of the digital collections as one offline bundle in which all the needed content is included.
For this purpose we introduced a new format, which contains three different information sets: the full metadata of a publication page, the actual page content as ALTO XML, and the raw text content. We consider these the most useful formats to provide as raw data for researchers. In this paper we describe how the export format was created, how other parties have packaged the same data, and what the benefits of the current approach are. We shall also briefly discuss the word-level quality of the content and show a real research scenario for the data.

1 Introduction

The historical newspaper, journal and ephemera collection of NLF is available up to 1910. The online system offers page images of the content, which can be accessed either by browsing or by using the search engine. Recently, the text content of the page as ALTO XML (Analyzed Layout and Text Object) was also released to users. However, for doing any mass operations on the XML files, there needs to be a way to offer the whole content at once, instead of having to download it page by page.

Part of the Finnish newspaper corpus has been made available via the FIN-CLARIN consortium and Europeana Newspapers. FIN-CLARIN offers the data in two different formats. In the more recent one the data is provided in the original ALTO XML format, but the directory structure follows the original file system order, where one newspaper title can span different archive files. FIN-CLARIN offers the original ALTO format via the Language Bank service for the years 1771-1874. Secondly, FIN-CLARIN also provides a processed version by offering the Finnish word n-grams of the 1820-2000 newspaper and periodicals corpus, partitioned by decade, with word frequency information. Thirdly, the data is available as one of the materials of the Korp corpus environment.

In addition, NLF has provided a data set to Europeana Newspapers via The European Library (TEL), which was one of the partners of Europeana Newspapers. At TEL, for example, eleven representative Finnish newspapers that are in the public domain were selected. TEL does not offer the texts themselves, but only the metadata associated with them (with links to page images). These are offered in RDF (in RDF/XML and Turtle serializations), as well as Dublin Core XML. The metadata records are also available in JSON format via the Europeana project, this time also including the raw text of the page. Presently, there are plans to move the TEL portal to a new newspaper channel of Europeana.

2 Creation of the Export

Before beginning export creation we needed to decide which metadata would be added to the set and in which formats the data would be offered. During the outlining of the digital humanism policy of the NLF, there was a lot of internal discussion about the various formats the library should provide for researchers. For the newspaper corpus we decided to create the export with the original ALTO XML and raw text plus the metadata. Initially we considered offering just ALTO XML, as it is the main format from the post-processing of the production line. The raw text format was chosen to enable more convenient use for those who might not feel comfortable with ALTO XML, as it was relatively straightforward to extract the text from the NLF XML format. However, just when we were starting the export, the need to also have the metadata arose. In order to avoid having several separate datasets, we incorporated the metadata into each XML file.
We had earlier provided the raw text at the Digital Humanities Hackathon of 2015 (DHH15), where the raw texts were used alongside the precompiled word n-grams from FIN-CLARIN (the Language Bank of Finland). The ALTO format of the package contains the information that has been captured from the original page images. The quality assurance process of NLF is quite light, and it aims mainly to check the metadata. For example, no layout or reading order issues are typically fixed from what the backend system (DocWorks) provides off-the-shelf as the text content and logical layout of the page. The digitization of the NLF has spanned several years, averaging about 1 million pages a year. Nearly every year the backend system has also improved, new versions with new features or improvements have appeared, and the METS/ALTO files also contain this version information. The generic structure of the export is shown in Figure 1 (metadata, ALTO XML and the plain text).

Figure 1: General structure of the export files

2.1 Technical Implementation

In the generation of the export, our main database, which contains the information on the material (i.e. title metadata, page data, and file data containing the archive directory information), was utilized. At the beginning of the export, a title is selected and its metadata and the locations of its ALTO XML files are extracted from the database. Then the corresponding ALTO XML files are extracted and combined with the raw text of the page, which is available in the database. All of these are combined into an NLF-specific XML file. We decided to produce one XML format to ease our internal generation of the export and to keep the export as one complete set. In the long run, we hope that this will make management and versioning of the export set easier in comparison with having three separate data sets. Our goal is to have one master set, identical for all the researchers who do offline analysis. If we get feedback on the export set, it will be easier to trace back any suggestions and improve the originals or regenerate the export with enhancements from only one main export.

Implementation of the export script took a few days, but the most time-consuming task was to run it against the material set. To speed up the creation of the export packages, execution of the generation script was replicated so that it was possible to run different exports in parallel. Extracting one particular file from the database is quite fast, but the database operations and other processing add up when they have to be done for millions of files. With several simultaneous batch runs the extraction was completed faster.
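As a concrete illustration of the loop described above, here is a minimal sketch in Python. The NLF production schema and scripts are not published in this article, so the table, column, and wrapper element names below are illustrative assumptions rather than the library's actual code; only the overall flow (select a title, fetch metadata and ALTO locations, combine them with the raw page text into one XML file) follows the description in the text.

```python
# A minimal sketch of the export loop, under assumed table/column names.
import sqlite3

def export_title(db_path, issn):
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT page_id, binding_id, pub_date, issue, page_no, alto_path, raw_text "
        "FROM pages WHERE issn = ? AND pub_date < '1911-01-01'",
        (issn,))
    for page_id, binding_id, pub_date, issue, page_no, alto_path, raw_text in rows:
        with open(alto_path, encoding="utf-8") as f:
            alto_xml = f.read()  # the page content as produced by post-processing
        out_name = f"{issn}_{pub_date}_{issue}_{int(page_no):03d}.xml"
        with open(out_name, "w", encoding="utf-8") as out:
            out.write("<page>\n")
            # pageIdentifier/bindingIdentifier are real tags discussed in 2.2;
            # the surrounding <page> wrapper is an assumption.
            out.write(f"  <pageIdentifier>{page_id}</pageIdentifier>\n")
            out.write(f"  <bindingIdentifier>{binding_id}</bindingIdentifier>\n")
            out.write(alto_xml)
            # Raw page text goes last as CDATA, as in Figure 2; production code
            # would have to guard against "]]>" occurring inside the text.
            out.write(f"<![CDATA[{raw_text}]]>\n")
            out.write("</page>\n")
    conn.close()
```

Runs like this parallelize naturally: each worker can be given a disjoint set of titles, which matches the simultaneous batch runs mentioned above.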
2.2 Metadata within the Exports

The format of the NLF-specific XML file is presented in detail in Figure 2. The metadata fields of rows 5-9 and 18-22 are currently visible in the web presentation system on the page itself, but the publisher details of rows 12-15 are only visible via the detailed information of the title. This information is not conveniently located in the event that some page-level calculations or analysis need to be done. The page label is not visible in the presentation system, where the page numbers are given as in PageOrder. Language will be made available as one of the search criteria in an upcoming release, but currently it is not visible anywhere.

Figure 2: Example of an exported XML file: metadata, ALTO XML and CDATA

Thanks to a suitable infrastructure for the exporting, where we have one table with information about the storage location of the ALTO XML files and about the publication date of the material, we were able to extract the material in a straightforward way. All pages before 1.1.1911 were exported and extracted into two different directory structures, one based on ISSN and the other based on the publication year. After the metadata comes the matching ALTO XML file for the given page id (starting from line 27 in Figure 2). For any processing that might be fed back from the researchers, the tags pageIdentifier and bindingIdentifier are useful for the NLF, as they identify that particular item in a reliable manner: every page and binding has a unique identifier. This enables a feedback loop, in case there is an opportunity to get improvements from the researcher community. The ALTO XML is generated by the Content Conversion Specialists' DocWorks program in the post-processing phase of the digitization. ALTO XML contains the layout and the actual words of the page text, with the coordinates that indicate the location of each word on the page. Finally, at the end of each of the files, there is the raw text of the page from the database (CDATA, starting from line 2717 in Figure 2), which is what the search of the presentation system uses when giving search results to the users.

2.3 Top-level Structure

In the planning phase of the export we thought of potential user scenarios for the export. Based on earlier feedback we figured that the most probable needs are based on years or on ISSNs, which were therefore used as the ways to divide the material before packaging. The language of the pages was also included as one folder level. Figure 3 shows the folder structure.

Figure 3: The folder structure

Below the lowest-level folder are the actual ALTO XML files, which are named descriptively, for example: fk25048_1906-10-03_9_002.xml (= ISSN_YEAR_DATE_ISSUE_PAGE). We hope that the folder-level structure and the file naming policy will provide enough flexibility for us to be able to generate subsequent exports based on user needs. If the request is for all of the material, then everything under the by_year structure is given. The selected year divisions are estimations based on the number of pages and estimated sizes, to get each year span's export package to be roughly the same size.
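Because the naming policy is regular, downstream users can recover the bibliographic fields directly from a file name. A small sketch of that (the group names are my own labels for the parts of the stated convention):

```python
# Parse export file names of the form ISSN_DATE_ISSUE_PAGE, e.g.
# fk25048_1906-10-03_9_002.xml
import re

NAME = re.compile(
    r"^(?P<title>[^_]+)_(?P<date>\d{4}-\d{2}-\d{2})_(?P<issue>[^_]+)_(?P<page>\d+)\.xml$")

def parse_export_name(filename):
    m = NAME.match(filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename}")
    return m.groupdict()

print(parse_export_name("fk25048_1906-10-03_9_002.xml"))
# -> {'title': 'fk25048', 'date': '1906-10-03', 'issue': '9', 'page': '002'}
```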
2.4 Export File Sizes

The sizes of the export files are considerable: 187 gigabytes as zip files. For this reason, the downloadable packages were in the first phase put on an external service, which provides enough storage space. The pilot version was offered to the key contact points via Funet FileSender, and the upload speed was at some point 6 MB/second, meaning that just uploading the whole data set took nearly nine hours. In the future, the export dumps will be offered via the http://digi.kansalliskirjasto.fi website, which enables us to generate new versions more fluently. Table 1 shows the sizes of the zip package parts.

    Export file name                              Size
    nlf_ocrdump_v0-2_journals_1816-1910.zip       18G
    nlf_ocrdump_v0-2_newspapers_1771-1870.zip     12G
    nlf_ocrdump_v0-2_newspapers_1871-1880.zip     13G
    nlf_ocrdump-v0-2_newspapers_1881-1885.zip     12G
    nlf_ocrdump-v0-2_newspapers_1886-1890.zip     16G
    nlf_ocrdump-v0-2_newspapers_1891-1893.zip     13G
    nlf_ocrdump-v0-2_newspapers_1894-1896.zip     15G
    nlf_ocrdump-v0-2_newspapers_1897-1899.zip     16G
    nlf_ocrdump-v0-2_newspapers_1900-1902.zip     16G
    nlf_ocrdump-v0-2_newspapers_1903-1905.zip     18G
    nlf_ocrdump-v0-2_newspapers_1906-1907.zip     18G
    nlf_ocrdump_v0-2_newspapers_1908-1909.zip     20G
    nlf_ocrdump_v0-2_newspapers_1910.zip          9.9G

Table 1: Packages and their sizes

The number of ALTO XML files in the newspaper part of the export is presented in Figure 4 by language. The total number of pages is ca. 1.961 M. The language is the primary language of the title as listed in our newspaper database. For clarity, information about the number of Russian and German pages has been omitted (in total 8,997 and 2,551 pages, respectively) and we show only the numbers of Finnish and Swedish pages. The total number of Finnish pages is 1,063,648, and the total number of Swedish pages 892,101. Journal data is more varied: out of its 1,147,791 pages, 605,762 are in Finnish and 479,462 are in Swedish (total 1,085,224 pages). The rest are either multilingual or in other languages.

Figure 4: Number of pages in Finnish and Swedish in different packages

3 Nordic Situation Briefly

Opening of data is in its early phases in Scandinavia. In Finland, the National Library of Finland has started by opening the main portal Finna (Finna API, 2016) and the ontology and thesaurus service Finto. The Fenno-Ugrica service has opened material from the digital collection of Uralic languages. In comparison to these, the Finnish newspaper and journal collection data would be quite large, consisting of around three million pages.

In Sweden, the National Library of Sweden has opened its data via the http://data.kb.se service. The opened data is an interesting cross-cut of different material types, including a couple of newspapers, namely Post- och inrikes tidningar (1645-1705) (aka POIT) and Aftonbladet (1831-1862). Interestingly, these have made different data available: POIT offers original page images (.tif), the text of the pages, and the corrections of the page texts as .doc files. The data set of Aftonbladet, on the other hand, offers the combination of post-processing outputs, i.e. METS, ALTO and page images (.jp2). In Denmark the current practice is that access to the digitized newspapers is provided via the State and University Library, the Royal Library and the Danish Film Institute. In Norway, the newspaper corpus is offered via the Språkbanken for the years 1998-2014.

In the Nordic context the NLF export is thus quite extensive, with a time range of 1771-1910 and coverage of all the titles published during that time. The year 1910 is currently the year before which we feel that the risk of copyright problems in Finland is minimized. However, in the feedback from some researchers there is a desire to get data from later than 1910, especially as the centenary of Finland's independence in 2017 is nearing.
4 Quality of the OCRed Word Data in the Package

Newspapers of the 19th and early 20th century were mostly printed in the Gothic (Fraktur, blackletter) typeface in Europe. Most of the contents of Digi have been printed in Fraktur; Antiqua is in the minority. It is well known that the typeface is difficult for OCR software to recognize (cf. e.g. Holley, 2008; Furrer and Volk, 2011; Volk et al., 2011). Other aspects that affect the quality of the OCR recognition include the quality of the original source and microfilm; scanning resolution and file format; layout of the page; OCR engine training; a noisy typesetting process; and unknown fonts (cf. Holley, 2008). As a result of these difficulties, scanned and OCRed document collections have a varying number of errors in their content. The number of errors depends on the period and printing form of the original data. Older newspapers and magazines are usually more difficult for OCR; newspapers from the early 20th century are usually easier. Tanner et al. (2009), for example, report an estimated word correctness of 78% for the British Library's 19th Century Newspaper Project.

There is no single available method or software to assess the quality of large digitized collections, but different methods can be used to approximate quality. In Kettunen and Pääkkönen (2016) we discussed and used different corpus-analysis-style methods to approximate the overall lexical quality of the Finnish part of the Digi collection. For the Swedish part, an assessment is missing so far. The methods include the use of parallel samples and word error rates, the use of morphological analyzers, frequency analysis of words, and comparisons to comparable edited lexical data. Our results show that about 69% of all the 2.385 billion word tokens of the Finnish Digi can be recognized with a modern Finnish morphological analyzer, Omorfi. If the orthographical variation of v/w in 19th-century Finnish is taken into account and the number of out-of-vocabulary words (OOVs) is estimated, the recognition rate increases to 74-75%. The rest, about 625 million words, is estimated to consist mostly of OCR errors, at least half of them hard ones. Out of the 1 million most frequent word types in the data, which make up 2.043 billion tokens, 79.1% can be recognized.

The main result of our analysis is that the collection has a relatively good quality level, about 69-75%. Nevertheless, a 25-30% share of the collection needs further processing so that the overall quality of the data can improve. However, there is no direct way to higher-quality data: re-OCRing of the data may be difficult or too expensive and laborious due to the licensing models of proprietary OCR software or the lack of font support in open-source OCR engines. Post-processing of the data with correction software may help to some extent, but it will not cure everything. Erroneous OCRed word data is a reality with which we have to live.
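The recognition-rate idea sketched above can be reproduced in a few lines. The article uses the Omorfi morphological analyzer; in this sketch a plain word set stands in for it, and the toy lexicon is invented, so treat the output as illustration only. The w-to-v normalization mirrors the 19th-century v/w variation mentioned above.

```python
# Schematic lexical-quality estimate: the share of tokens a lexicon recognizes.
import re

def recognition_rate(text, lexicon, normalize_wv=True):
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    hits = 0
    for tok in tokens:
        if normalize_wv:
            tok = tok.replace("w", "v")  # e.g. 'wirallinen' -> 'virallinen'
        if tok in lexicon:
            hits += 1
    return hits / len(tokens)

toy_lexicon = {"suomalainen", "virallinen", "lehti"}  # invented stand-in
print(recognition_rate("Suomalainen Wirallinen Lehti", toy_lexicon))  # 1.0
```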
5 Use Case of the Material: Text Reuse Detection and Virality of Newspapers

Computational History and the Transformation of Public Discourse in Finland 1640-1910 (COMHIS), a joint project funded by the Academy of Finland, is one of the first large projects that will utilize the newspaper and journal data of Digi. The objective of the consortium is to reassess the scope, nature, and transnational connections of public discourse in Finland 1640-1910. As part of this consortium, the work package "Viral Texts and Social Networks of Finnish Public Discourse in Newspapers and Journals 1771-1910" (led by Prof. Hannu Salmi) will be based on text mining of all the digitized Finnish newspapers and journals published before 1910. The export package will be the main data source for this.

The project tracks down the viral texts that spread in Finnish newspapers and journals by clustering the 1771-1910 collection with a text reuse detection algorithm. This approach enables one to draw new conclusions about the dissemination of news and the development of the newspaper network as a part of Finnish public discourse. We study, for example, what kinds of texts were widely shared. How fast did they spread, and what were the most important nodes in the Finnish media network?

The historical newspaper and journal collection of NLF from the 1770s to the 1900s enables various digital humanities approaches, like topic modeling, named-entity recognition, or vector-space modeling. However, the low OCR quality of some volumes sets practical restrictions on the unbiased temporal distribution of the data. Fortunately, text reuse detection is relatively resistant to OCR mistakes and other noise, which makes the virality of newspapers a feasible research question. The COMHIS project is currently testing a program called Passim, which detects similar passages with matching word n-grams, building on the methods developed for a similar study of 19th-century US newspapers by Ryan Cordell, David A. Smith, and their research group (Cordell, 2015; Smith et al., 2015).

Text reuse detection is an especially suitable method for a newspaper corpus. In contrast to novels, poetry, and other forms of fiction, the authorship of a newspaper article was usually irrelevant in 19th-century publication culture. The more the news was copied, the more important it was, whereas fictional texts were considered an individual expression of their author. Circulation of the texts was reinforced by the lack of effective copyright laws prior to the Berne Convention in 1886. By analyzing the viral networks of Finnish newspapers, our aim is not to track down the origin of a specific text, but rather to study the development of media machines and the growing social and political influence of the public sphere from a new perspective.
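Passim itself is a full system with alignment and clustering stages, but its core signal, shared word n-grams, is easy to sketch. The snippets below are invented examples; two passages that share enough 5-gram "shingles" become reuse candidates.

```python
# Minimal text-reuse signal: count the 5-grams two passages share.
import re

def shingles(text, n=5):
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_shingles(a, b, n=5):
    return len(shingles(a, n) & shingles(b, n))

a = "The official statement of the new ruler was printed in several newspapers."
b = "the official statement of the new ruler appeared again years later"
print(shared_shingles(a, b))  # 3 shared 5-grams -> a reuse candidate
```

Because a single OCR error only breaks the few shingles that contain it, this kind of matching tolerates noisy text fairly well, which is exactly the property the project relies on.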
In the Finnish case, the rapid increase in newspaper publishing happened relatively late. As shown in Figure 4, the growth of the press was almost exponential at the beginning of the 20th century. It thus makes sense to analyze the 1900-1910 data as its own group. Because the project has just started, a complete analysis of the clustering results is not finished yet, but it is possible to give a few examples of widely shared texts. For the time period of 1770-1899, one interesting cluster encountered so far concerns the succession of the Russian throne to Tsar Alexander III. These dramatic events started with the assassination of Tsar Alexander II, who died in a bomb attack by Polish terrorists on 13 March 1881. The death of the beloved emperor became viral news. The official statement of the new ruler Alexander III to the Finnish people was first printed in Suomalainen Wirallinen Lehti on 17 March 1881. Between 17 and 26 March the text was printed in 15 major newspapers around Finland. The same text was also reprinted in the following years, e.g. on 26 July 1890 in Waasan Lehti and on 30 December 1890 in Lappeenrannan Uutiset (Table 2). Finally, the death of Alexander III was reported on 2 November 1894 in Suomalainen Wirallinen Lehti, and it became a similar viral text.

    Newspaper title (number of reprints)                 Date
    Suomalainen Wirallinen Lehti, Uusi Suometar (2)      1881-03-17
    Aura (1)                                             1881-03-18
    Sanomia Turusta, Tampereen Sanomat, Satakunta (3)    1881-03-19
    Waasan Lehti, Vaasan Sanomat (2)                     1881-03-21
    Savo (1)                                             1881-03-22
    Tapio, Karjalatar (2)                                1881-03-23
    Hämeen Sanomat (1)                                   1881-03-25
    Ahti, Kaiku, Savonlinna, Waasan Lehti (4)            1881-03-26
    Lappeenrannan Uutiset (1)                            1890-12-30
    Turun Lehti (1)                                      1892-02-02

Table 2: Reprints of the inauguration statement of the emperor Alexander III

Virality was a more typical feature of newspapers, but it is of course possible to process all Finnish journals with Passim as well. When it comes to journals, the most shared texts are Bible quotations. Among the biggest clusters mined so far is one consisting of 87 references to the Finnish translation of the Book of Revelation (3:20). In the Swedish material, John 3:16 is reprinted 58 times over the time period, and there are many other widely shared Bible passages, which could provide an interesting perspective on the tempo of secularization in Finland. The biggest group of clusters after the religious texts appears to be advertisements by publishing houses, banks, and insurance companies. Only after these come stories and magazine articles, which can be reprinted many years after their original time of publication.

Historical newspapers have been an important genre of sources for historical research. Before becoming available as a digitized collection, the Finnish newspapers and journals were read as originals or on microfilm. Traditionally, historical scholarship has been based on the ideal of close reading, which implies careful interpretation of each document before one is able to draw any conclusions based on the sources. However, in the case of the digitized newspapers, the mere quantity of available textual data makes this approach virtually impossible. Franco Moretti (2007) has asserted that in order to readjust the methodology of the humanities to the scale of big data, we need to develop methods of distant reading to process substantial amounts of textual documents without losing analytical rigor. Moreover, humanist scholars may have to abandon any strict dichotomy between narrative and database (Hayles, 2012). The research questions of the digital humanities often entail converting a textual corpus into a database for various forms of computational analysis and heuristics. After this, the new textual locations discovered by the computational algorithm are then closely read. For example, the text reuse detection algorithm can cluster widely shared text passages together for further analysis, but it cannot typically distinguish whether the text is an advertisement, a common Bible quotation, or an interesting news article. In the future it could be possible to use ALTO XML metadata or machine learning methods (like a Naive Bayes classifier) to group the detected clusters according to their textual genre.
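To indicate what that last suggestion could look like in practice, here is a minimal Naive Bayes sketch with scikit-learn. The training snippets and genre labels are invented toy examples, not COMHIS data, and a real classifier would of course need far more training material.

```python
# Toy genre classifier for detected clusters (invented training data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "behold i stand at the door and knock",         # Bible quotation
    "for god so loved the world",                   # Bible quotation
    "insurance company offers favourable terms",    # advertisement
    "the publishing house announces a new volume",  # advertisement
    "the emperor addressed the people yesterday",   # news
]
train_labels = ["bible", "bible", "ad", "ad", "news"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)
print(model.predict(["the bank offers new insurance terms"]))  # -> ['ad']
```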
6 Conclusion

As Tanner (2016) says, the digital library can face a "growing unfunded mandate", as additional requirements on digital material, such as availability and accessibility, create a new kind of demand for the material while data production resources decrease or remain the same. Therefore it is crucial that research use is considered an impact factor and that it becomes more observable to a content or raw research data provider such as the National Library of Finland. Collaboration with research units should be increased so that researchers can express their needs regarding what should be digitized and tell whether there is a need for curation or special formats. However, this is a balancing act: how should a library organization best serve researchers with its limited resources? Is it possible to find joint funding opportunities and actually complete tasks together, in a close relationship? The National Library of Finland is aiming to improve this via the forthcoming Digital Humanism and Open Data policies, which are due to be published during 2016. The key tenet of the policies is to work together and to consider researchers as one customer segment. Research usage should also be considered a result metric for NLF, either via collaborative projects or by keeping track of where the digitized materials are used.

The work on the export continues. During early summer we plan to set up our own website to share the export data sets with anyone who is interested. For the statistics, we will have a short survey about research use, which we hope will also give us insight into what kinds of exports would be useful in the future and into anything that should be taken into consideration in the long-term digitization roadmaps. The big question is the structure of the export. Should it contain all of the content within one file, or should it be structurally divided so that each data item (metadata, ALTO, page text) is available separately? There are advantages and disadvantages to both approaches, so it needs to be determined which would be the best long-term solution with the available resources. In addition, we need to consider the content itself, to see if there is a need for the METS files and the page images, too. With input from the users of the first released versions we have the opportunity to develop the export further by adding data or adjusting the fields and their content accordingly. We anticipate a clear increase in researchers' needs, which could lead to closer collaboration between NLF and researchers.

Acknowledgements

Part of this work is funded by the EU Commission through its European Regional Development Fund and the program Leverage from the EU 2014-2020. COMHIS is funded by the Academy of Finland, decision numbers 293239 and 293341.

References

[1] Cordell, R. (2015). Reprinting, Circulation, and the Network Author in Antebellum Newspapers. American Literary History, 27(3), pp. 417-445. http://doi.org/10.1093/alh/ajv028
[2] Finna API (in English) — Finna — Kansalliskirjaston Kiwi. (2016).
[3] Furrer, L. and Volk, M. (2011). Reducing OCR Errors in Gothic-Script Documents. In Proceedings of the Language Technologies for Digital Humanities and Cultural Heritage Workshop, pp. 97-103, Hissar, Bulgaria, 16 September 2011.
[4] Hayles, N. K. (2012). How We Think: Digital Media and Contemporary Technogenesis. Chicago; London: The University of Chicago Press.
[5] Holley, R. (2009). How Good Can It Get? Analysing and Improving OCR Accuracy in Large Scale Historic Newspaper Digitisation Programs. D-Lib Magazine, March/April 2009. http://doi.org/10.1045/march2009-holley
[6] Hölttä, T. (2016). Digitoitujen kulttuuriperintöaineistojen tutkimuskäyttö ja tutkijat. M.Sc. thesis (in Finnish), University of Tampere, School of Information Sciences, Degree Programme in Information Studies and Interactive Media.
[7] Kettunen, K. and Pääkkönen, T. (2016). Measuring Lexical Quality of a Historical Finnish Newspaper Collection – Analysis of Garbled OCR Data with Basic Language Technology Tools and Means. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016).
[8] Kingsley, S. (2015). Eteenpäin sopimalla, ei lakia muuttamalla. Kansalliskirjasto, 57(1), pp. 18-19.
[9] Moretti, F. (2007). Graphs, Maps, Trees: Abstract Models for a Literary History. London; New York: Verso.
[10] Smith, D. A., Cordell, R. and Mullen, A. (2015). Computational Methods for Uncovering Reprinted Texts in Antebellum Newspapers. American Literary History, 27(3), pp. E1-E15. http://doi.org/10.1093/alh/ajv029
[11] Tanner, S. (2016). Using Impact as a Strategic Tool for Developing the Digital Library via the Balanced Value Impact Model. Library Leadership and Management, 30(3).
[12] Tanner, S., Muñoz, T. and Ros, P. H. (2009). Measuring Mass Text Digitization Quality and Usefulness: Lessons Learned from Assessing the OCR Accuracy of the British Library's 19th Century Online Newspaper Archive. D-Lib Magazine, July/August 2009. http://doi.org/10.1045/july2009-munoz
[13] Volk, M., Furrer, L. and Sennrich, R. (2011). Strategies for Reducing and Correcting OCR Errors. In C. Sporleder, A. Bosch, and K. Zervanou, editors, Language Technology for Cultural Heritage, 3-22. Springer-Verlag, Berlin/Heidelberg.

About the Authors

M.Sc. Tuula Pääkkönen works at the National Library of Finland as an Information Systems Specialist. Her work includes the development of some of the tools that support digitization efforts, as well as technical specification and projects for the DIGI.KANSALLISKIRJASTO.FI service. She has worked in the library on a project dealing with copyright and data privacy topics, as well as on other development projects concerning digital contents, crowdsourcing and metrics.

Mr. Jukka Kervinen has worked as a Systems Analyst at the National Library of Finland since 1999. His main responsibilities have been designing and setting up the Library's in-house digitization workflows and post-processing. His experience encompasses metadata development (METS, ALTO, PREMIS, MODS, MARCXML), system architecture planning and database design. He has been a member of the ALTO XML editorial board since 2009 and of the METS editorial board since 2011.

Dr. Asko Nivala is a Postdoctoral Researcher in Cultural History at the University of Turku, Finland. His research focuses on early nineteenth-century cultural history and especially on the Romantic era. His other research interests include the recent discussions on the digital humanities and posthumanism. Nivala has co-edited the collection "Travelling Notions of Culture in Early Nineteenth-Century Europe" (Routledge, 2016) and written the monograph "The Romantic Idea of the Golden Age in Friedrich Schlegel's Philosophy of History" (Routledge, to be published in 2017).

Dr. Kimmo Kettunen works at the Centre for Preservation and Digitisation of the National Library of Finland as a research coordinator in the Digitalia project. His work includes research related to the DIGI.KANSALLISKIRJASTO.FI service. He has been involved especially in the quality estimation and improvement of the content of the service. He has also conducted research on Named Entity Recognition (NER) of the OCRed newspaper material. Kimmo is part of the digital humanities research team at the Centre.

D.Sc. Eetu Mäkelä is a computer scientist from Aalto University, Finland.
His current main interest is to further computer science by tackling the complex issues faced by scholars in the humanities when using their tools and data. Thus, he currently spends most of his time collaborating with multiple international digital humanities research projects. Through these collaborations, and through having attained his doctorate in research on data integration and Linked Data, he has ample experience in best practices for publishing data in as usable a form as possible.

Copyright © 2016 Tuula Pääkkönen, Jukka Kervinen, Asko Nivala, Kimmo Kettunen and Eetu Mäkelä

work_dgqqsodngzbe3pkxr3s4zt73ty ----

Türk Kütüphaneciliği 30, 3 (2016), 552-554

Dijital İnsanî Bilimler Konferansı 2016 Raporu / The Report of the Digital Humanities 2016 Conference

Sümeyye Akça, Research Assistant, Hacettepe University, Department of Information Management. E-mail: sumeyyesakca@gmail.com
Müge Akbulut, Research Assistant, Yıldırım Beyazıt University, Department of Information and Records Management. E-mail: mugeakbulut@gmail.com

Received: 12.08.2016; Accepted: 17.08.2016

Abstract

This paper presents impressions from the Digital Humanities 2016 Conference held in Krakow, Poland, on 11-16 July 2016.

Keywords: Digital Humanities Conference; DH2016; Poland.

Introduction

Digital humanities, defined as the application of computer technologies to the humanities (McCarty, 1998), is an umbrella term for a broad field of practice concerned with the creation, application, and interpretation of digital technologies (Presner and Johanson, 2009). Studies in this field present a historical fact, event, or work by making it concrete (Krischel and Fangerau, 2012, pp. 51-52). A conference in this field is organized every year under the main sponsorship of the Alliance of Digital Humanities Organizations (ADHO), which brings together the institutions working on digital humanities. The first edition of this conference, whose participants come from all over the world and from a wide range of professions (computer engineers, historians, archaeologists, educators, librarians, and academics), was held in 1989. This year the conference was hosted in Krakow, Poland, by the Jagiellonian University in Krakow and the Pedagogical University in Krakow. The conference, whose main theme was Past and Future, was attended by 902 people from 48 different countries. We represented our country at the conference with our social network analysis study of Reşat Nuri Güntekin's letters (Akça and Akbulut, 2016).

Workshops

The first two days of the conference (11-12 July) were devoted to workshops. The first workshop we attended was titled A Demonstration of Multispectral Imaging. This half-day workshop introduced an imaging system used to make manuscripts that have been damaged for various reasons (such as humidity, fading, heat, or overwriting) readable again.
The Lazarus Project (www.lazarusprojectimaging.com), run by the University of Mississippi, was introduced, and the portable multispectral imaging devices built for it and their use were demonstrated. The workshop we attended that same afternoon was on TEI (Text Encoding Initiative) text encoding (TEI Processing Model Toolbox: Power To The Editor). In this workshop, exercises were carried out with the eXist-db application (http://exist-db.org/exist/apps/public-repo/index.html). The hands-on work focused in particular on how literary texts can be brought into conformance with the TEI standard and published digitally.

On the second day we attended the workshop titled Digital Archiving and Storytelling in the Classroom with Omeka and CurateScape. Building a digital archive and telling stories (storytelling) with the Omeka (http://omeka.org/) platform were demonstrated hands-on. Following tutorials developed by the Duke University Lab for Digital Art History & Visual Culture, a digital archive was created in the Omeka content management system. Information was also given about CurateScape (http://curatescape.org), a location-based application, and examples were worked through. Participants practiced customizing a website with Omeka plugins, and examples of projects created with the platform were shown (e.g., the Gothicpast project, http://www.gothicpast.com).

Opening

In her opening address, particle physicist Agnieszka Zalewska, president of the CERN Research Council, touched on CERN's overall structure, its fields of work, and its cooperation with different countries. The talk, written in a technical register, mainly discussed the key factors behind CERN's success. Zalewska also mentioned CERN's efforts in recent years to open up to the world and briefly described ongoing projects. The membership statuses of states and their collaborations were among the most striking parts of the presentation. Pointing to the way knowledge and skills are passed on to students within CERN's working structure, Zalewska suggested that this system could serve as a model for the digital humanities.

Conference Days

On 13 July the conference moved on to parallel sessions, held simultaneously in eleven rooms. We followed presentations of work in very different areas, such as semantic technologies, the use of big data in culture and the arts, social networks, and cultural heritage. The poster presentations held the same day were organized in two formats: authors who wished to do so presented their work orally within 1-5 minutes, after which came the poster presentations of the authors who preferred not to give an oral presentation.

The parallel sessions continued on 14 July. That evening we listened to the talk by Helen Agüera, this year's winner of the Busa Award given by ADHO. Agüera reviewed the role of the American National Endowment for the Humanities in supporting projects and the way the endowment operates. She noted that while the endowment's early grants (1970-1980) centered on topics such as the development of library cooperation networks and databases, with the growth of the digital humanities field, support has been given more to education (mostly on developing computing skills), technology tools, and language studies (statistical analyses).
Also touching on relations with other funding bodies, the speaker gave examples of international projects supported by the endowment (such as TEI and Unicode). On the final day, while the parallel sessions continued, the closing keynote of the conference was given by Claire Warwick of Durham University. In her talk, titled Touching the Interface: Bishop Cosin and Unsolved Problems in (digital) Information Design, Warwick discussed the problem that digital resources have not yet been fully internalized and that people still have not adapted to using them. Noting that many factors play a role in this problem, from attachment to physical place to the importance of the sense of touch, the speaker said that they are working to provide a more effective experience between the internet and the user, and gave examples from the designs of medieval monastic libraries (such as that of Bishop Cosin of Durham). Next year this conference will be held in Montreal, Canada, hosted by McGill University and the Université de Montréal.

References

Akça, S. and Akbulut, M. (2016). Content based social network analysis of Reşat Nuri Güntekin's letters. Digital Humanities 2016 (DH2016), Krakow, Poland, 12-16 July 2016. Krakow: Jagiellonian University and Pedagogical University.
Krischel, M. and Fangerau, H. (2012). Historical network analysis can be used to construct a social network of 19th-century evolutionists. Classification and Evolution in Biology, Linguistics and the History of Science, 45-60. Retrieved from http://www.steiner-verlag.de/fileadmin/Dateien/Steiner/EBook/9783515105897_eb.pdf#page=46
McCarty, W. (1998). What is humanities computing? Toward a definition of the field. Retrieved from http://www.dighum.kcl.ac.uk/legacy/teaching/dtrt/class1/mccarty_humanities_computing.pdf
Presner, T. and Johanson, C. (2009). The promise of digital humanities: A whitepaper. Retrieved from http://www.itpb.ucla.edu/documents/2009/PromiseofDigitalHumanities.pdf

work_di3hwira4zelpeo7ekcfsj5kpm ----

Originally developed by Rachel Arteaga in collaboration with Rebecca Kilpatrick, Andrew Archer, and Peter Gallo. For project information, see spartrees.wordpress.com.

Teacher Notes: Digital Humanities Lesson Plan

Originally designed for: 7th – 12th Grades · Computer Lab · Nonfiction · Digital Narrative Mapping
Note: this lesson plan supports instruction for CCSS RI.1 and CCR Anchor Standards for Reading 7

Thank you for participating in this pilot project! These materials are entirely open access. You are encouraged to share, adapt, and revise them as you see fit. Note that the digital humanities (DH) definition provided on the lesson handout is all you and your students should need to get started. However, more information is available on the project website in the form of a one-page, accessible overview: http://spartrees.wordpress.com/digital-humanities/

StoryMap requires a Google account. Note: because I wrote the handout to cover grades 7-12, you may want to change the expectations for each grade. The most efficient way to do that would be to expect the 12th graders, for example, to include 5 locations instead of 3, etc.
Lesson Rationale and Student Learning Goals:
- Review and reinforce learning goals emphasized by the teacher in previous class discussions
- Use the story maps as new objects for reading and discussion
- Provide a shared platform and visualization method for students to use for presentations of different books across grade levels and individual interests

Digital Humanities Tool: StoryMap: http://storymap.knightlab.com/

Introduction to Digital Humanities: Digital Narrative Mapping

The digital humanities (DH, for short) help us study subjects related to human culture, like English and History, by using computer programs. For example, we can use computers to help us read nonfiction. This handout will give you instructions for using a DH tool to map out the locations and movement in a book. The tool is called StoryMap.

Digital Narrative Mapping

First, we need to define our terms. Digital mapping is a way to visualize the places in a book (setting) and the movement or journeys taken by the characters. In a nonfiction book, mapping the setting and connecting the places can help us better understand the structure of the story the writer is trying to tell. It can also help us to think about space and time in a story. StoryMap lets you make a digital map with links and pictures, and share it.

Preparing to make your StoryMap

First, you'll want to take some notes on the book you are reading. What is the setting of the book? Choose three locations that are important to the book – countries, states, cities, landmarks – and write them down.
1.
2.
3.

Making your StoryMap
1. Go to the StoryMap homepage: http://storymap.knightlab.com/
2. You will be prompted for a Google account. Ask your instructor for details.
3. What type of story do you want to create? Click Map.
4. Name your map using the title of your nonfiction book and your initials so your instructor can identify you.
5. A new screen will appear with many features. On the left side, click "add slide" three times.
6. Click on each slide and in the "HEADLINE" field type in the name of a location (repeat three times!).
7. On each slide, move the red map pin to the location matching that slide.
8. On each slide, beneath the headline, write what happens in your book at that location.
9. On each slide, add a link to a trusted source that gives more information about the location. You can also upload images from your computer to show more visual information about the location.
10. Now click the tab at the top that says "Preview" – you can see your StoryMap all in one place, with details.
11. Remember to save your work using the "save" button in the top left hand side.
12. To share your work, click the "share" button in the top right hand side and copy the link.
13. Show your StoryMap to your peers and instructor. Does it help you better understand the book? Why or why not? Does the map give you a different way of thinking about space and time in the book?

Questions for Discussion: Conclusion:
- Now that you have read the book yourself, and used a computer program to help you read it in new ways, what do you think are the advantages of both strategies? What is best about reading the story yourself – perhaps this is something that the computer can't offer – and what is best about using the tool? In other words, what are the strengths and weaknesses of both?
- If you could make a computer program that could change the way we read, what would it do?
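A note for teachers curious about what a story map looks like under the hood: at bottom it is just an ordered list of places with text attached. The short Python sketch below builds such a structure; the coordinates and book details are invented examples, and StoryMap's own internal file format differs.

```python
# A story map reduced to data: ordered locations, each with a headline and note.
import json

story_map = {
    "title": "Example Story Map",
    "slides": [
        {"headline": "London", "lat": 51.5074, "lon": -0.1278,
         "text": "Where the journey in our book begins."},
        {"headline": "Paris", "lat": 48.8566, "lon": 2.3522,
         "text": "The characters travel here next."},
        {"headline": "Rome", "lat": 41.9028, "lon": 12.4964,
         "text": "The story's final destination."},
    ],
}
print(json.dumps(story_map, indent=2))
```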
Teacher Notes: Digital Humanities Lesson Plan

Originally designed for: 9th Grade · Computer Lab · Reading Fiction · Sentiment Analysis
Note: this lesson plan supports instruction for CCSS RL.1 and CCR Anchor Standards for Reading 7

Thank you for participating in this pilot project! These materials are entirely open access. You are encouraged to share, adapt, and revise them as you see fit. Note that the digital humanities (DH) definition provided on the lesson handout is all you and your students should need to get started. However, more information is available on the project website in the form of a one-page, accessible overview: http://spartrees.wordpress.com/digital-humanities/

The digital humanities tool used in this lesson plan is a demo page from the Stanford Natural Language Processing (NLP) Group. It allows a reader to type or paste a small excerpt of a text into the browser; it then gives an analysis of the mood of the excerpt based on the range of words it contains. In other words, this tool attempts to assess the emotional content of a text. It uses five classes of sentiment: very negative, negative, neutral, positive, and very positive. The sentence "I am so happy today!" is read by the tool as "very positive," while the sentence "I am so sad today!" is interpreted as negative, and so on. Of course, literary texts are much more complicated, and this is where things become interesting.

Lesson Rationale and Student Learning Goals:
- Review and reinforce learning goals emphasized by the teacher in previous class discussions
- Use the sentiment analysis charts as new objects for reading and discussion
- Connect prior reading comprehension and discussion of texts to the sentiment analysis charts

Primary text available online: Of Mice and Men by John Steinbeck
http://www.kgbsd.org/cms/lib3/AK01001769/Centricity/Domain/664/Of_Mice_and_Men_-_Full_Text.pdf

Digital Humanities Tool: Sentiment Analysis by the Stanford Natural Language Processing (NLP) Group
http://nlp.stanford.edu:8080/sentiment/rntnDemo.html
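For teachers who would like an offline demonstration of the same idea, the sketch below uses NLTK's VADER analyzer. Note the hedge: VADER is a lexicon-based method, not the Stanford neural model the handout uses, so its scores will differ; it is shown only to make the concept concrete.

```python
# Offline sentiment demo with NLTK's VADER (not the Stanford model).
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
sia = SentimentIntensityAnalyzer()

for sentence in ["I am so happy today!", "I am so sad today!"]:
    scores = sia.polarity_scores(sentence)
    # 'compound' runs from -1 (very negative) to +1 (very positive)
    print(sentence, "->", scores["compound"])
```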
Introduction to Digital Humanities: Sentiment Analysis

The digital humanities (DH, for short) help us study subjects related to human culture, like English and History, by using computer programs. For example, we can use computers to help us read fiction. This handout will give you instructions for using a DH tool to make charts of the emotions of a story. The tool is called sentiment analysis.

Sentiment Analysis

First, we need to define our terms. Sentiment analysis is a tool that asks the computer to read words and tell us what type of emotions they express. This tool can only tell us whether these emotions are more positive or negative. You can use the Stanford Sentiment Analysis tool to determine whether the emotions in the story are mostly positive or mostly negative.

Before you start, answer this question: Based on your reading of the entire novella Of Mice and Men by John Steinbeck, would you say that the book expresses mostly positive emotions or mostly negative ones? Why? In your answer, use evidence from the book.

1. Go to the Stanford Sentiment Analysis demo page: http://nlp.stanford.edu:8080/sentiment/rntnDemo.html
2. Copy and paste this excerpt from the beginning of the novella into the screen: For a moment the place was lifeless, and then two men emerged from the path and came into the opening by the green pool.
3. Click "submit".
4. Notice that the sentiment charts that pop up in the next screen give a color to each word. If there is no color, then the program thinks that the word is "neutral," or without emotion. The more blue the color, the more positive the emotions the word makes us feel. The more red, the more negative the emotions. The highest colored circle on the chart is what the program thinks of the whole phrase. So, the phrase above is light red, or somewhat negative, overall. Do you agree?
5. If you disagree with the program's results, and you want to change a word's color, click on the circle above the word. Change the color to the one you think is more accurate. Then click on the check mark to save it. For example, you might think that "lifeless" is very negative. Make it dark red, and save your work!
6. Once you have changed the colors to match your interpretation of the emotions of the sentence, print the page (just the first page, because the comments below are not important to your work) or save the page as a PDF. If you don't save your work by printing or creating a PDF, it will be lost.
7. Go back to the original link. Copy and paste this excerpt from the end of the novella into the screen: George shivered and looked at the gun, and then he threw it from him, back up on the bank, near the pile of old ashes.
8. Click "submit".
9. Repeat the instructions above, then discuss!

Questions for discussion:
- Do you think that the overall feeling of the story was mostly positive or mostly negative?
- Do these sentences from the story express mostly positive or negative emotions? Why?
- Which parts of the sentiment charts did you think were accurate? Which did you change? Why?

Questions for discussion: conclusion:
- Now that you have read the book yourself, and used a computer program to help you read it in new ways, what do you think are the advantages of both strategies? What is best about reading the story yourself – perhaps this is something that the computer can't offer – and what is best about using the tool?
- If you could make a computer program that could change the way we read books, what would it do?

Teacher Notes: Digital Humanities Lesson Plan

Originally designed for: 9th Grade · Chromebooks · Short Stories · Google Ngrams
Note: this lesson plan supports instruction for CCSS RL.1 and CCR Anchor Standards for Reading 7

Thank you for participating in this pilot project! These materials are entirely open access. You are encouraged to share, adapt, and revise them as you see fit. Note that the digital humanities (DH) definition provided on the lesson handout is all you and your students should need to get started. However, more information is available on the project website in the form of a one-page, accessible overview: http://spartrees.wordpress.com/digital-humanities/

The digital humanities tool used in this lesson plan is introductory, and you may have already heard about or used it yourself. Google Ngrams produces a timeline charting the frequency of any words you choose, during any time period you choose, based on the vast digital library Google has created. This tool may seem simple, even "just for fun" – and it is fun – but it also echoes the most advanced scholarship in digital humanities today. The learning goals for this lesson plan help the students move from the fun part to the rigorous analysis part of what is made possible by DH.
The learning goals for this lesson plan help the students move from the fun part to the rigorous analysis part of what is made possible by DH. Lesson Rationale:  Review and reinforce learning goals emphasized by teacher in previous class discussions o Angelou: making predictions and inferences  Introduce students to the term “digital humanities,” and to Ngrams  Use the timeline charts as new objects for analysis and discussion  Connect prior reading comprehension and discussion of texts to the charts Student Learning Goals:  Learn the term “digital humanities” and understand Ngrams as part of the field  Use Ngrams to produce timeline charts and note the quantitative aspect of the tool  Gain a new perspective on and deeper comprehension of the assigned primary texts Primary text available online: “New Directions” by Maya Angelou http://www.nexuslearning.net/books/holt_elementsoflit-3/Collection%204/new%20directions.htm Digital Humanities Tool: Ngrams: https://books.google.com/ngrams http://spartrees.wordpress.com/digital-humanities/ http://www.nexuslearning.net/books/holt_elementsoflit-3/Collection%204/new%20directions.htm https://books.google.com/ngrams Introduction to Digital Humanities: Google Ngrams The digital humanities (DH, for short) help us study subjects related to human culture, like English and History, by using computer programs. For example, we can use computers to help us read short stories. This handout will give you instructions for using a DH tool online to make timeline charts. The tool is Google Ngrams. Google Ngrams First, we need to define our terms. An “Ngram” is a continuous unit of language. This means that it can be a word, a phrase, a single letter, and so on. Instead of calling these units “words,” we have the word “Ngram” to show that the unit might be less or more than a whole word. When you search for these units in the Google Ngram viewer, it creates a timeline chart based on all of the many thousands of books in its digital library. You can make timeline charts using the prominent words from the story as your Ngrams: 1. Go to https://books.google.com/ngrams 2. Delete the examples in the google search box after the words “Graph these comma-separated phrases:” 3. Type in your words there. For “New Directions,” you might choose a list like this one: pie,cotton,mill,road,workers,lumber,gin,path,fresh,marriage 4. In the boxes for “between” and “and” choose two years for your starting and ending points. Google Ngram viewer tracks words from anytime between 1800 and 2008. For “New Directions,” you might choose 1903-2008, because based on the first line of the story, the story takes place in 1903. 5. The word “corpus” simply means a big library full of books. In this case, that library is the digital library that Google has made and shared with us. You will notice that the drop down menu shows that there is a corpus available for many other languages than English. You can search terms in these languages as well. 6. When you change the “smoothing” number, you can see that there will be more jagged lines the smaller the number is, and smoother lines the larger the number is. This is basically the same effect of “rounding up” numbers from their decimals. You’ve done this in your math classes, for example, when you round up from 9.7 to a smooth 10. 7. The best way to save your timeline is to print it out. Otherwise, when you clear the fields, you start again. Questions for Discussion:  Which words are used most frequently during the time span you chose? 
Questions for Discussion:
• Which words are used most frequently during the time span you chose? In other words, which words are attached to the top lines on the chart? Which are the least frequently used? So, the bottom lines?
• Are there any words that are used more often earlier in the century, and less often later? What do you think this means?
• Some people have used the Google Ngram viewer to track two different words that have a similar meaning. In the example list above, the words "road" and "path" are similar. Are they used at the same rate? Do they overlap? What do you think this means?
• Think about the language used in the story. Using your own knowledge of the world, did the story sound like it was set in the year 1903, over 100 years ago? Would using the Ngram viewer help us to know that, for example, if Maya Angelou had not told us the year?
Questions for Discussion: Conclusion:
• Now that you have read the stories yourself, and used a computer program to help you read them in new ways, what do you think are the advantages of both strategies? What is best about reading the story yourself – perhaps this is something that the computer can't offer – and what is best about using the tools? In other words, what are the strengths and weaknesses of both?
• If you could make a computer program that could change the way we read stories, what would it do?

Teacher Notes: Digital Humanities Lesson Plan
Originally designed for: 9th Grade · Chromebooks · Short Stories · Wordle Word Clouds
Note: this lesson plan supports instruction of CCSS RL.1, RL.4, and RL.7
Thank you for participating in this pilot project! These materials are entirely open access. You are encouraged to share, adapt, and revise them as you see fit. Note that the digital humanities (DH) definition provided on the lesson handout is all you and your students should need to get started. However, more information is available on the project website in the form of a one-page, accessible overview: http://spartrees.wordpress.com/digital-humanities/
The digital humanities tool used in this lesson plan is introductory, and you may have already heard about or used it yourself. Wordle makes word clouds instantly from any text that you copy and paste into your web browser. This tool may seem simple, even "just for fun" – and it is fun – but it also echoes the most advanced scholarship in digital humanities today. The learning goals for this lesson plan help the students move from the fun part to the rigorous analysis part of what is made possible by DH.
Lesson Rationale:
• Review and reinforce learning goals emphasized by teacher in previous class discussions
  o Angelou: making predictions and inferences
• Introduce students to the term "digital humanities," and to word clouds
• Use the word clouds as new objects for analysis and discussion
• Connect prior reading comprehension and discussion of texts to the word clouds
Student Learning Goals:
• Learn the term "digital humanities" and understand word clouds as part of the field
• Create word clouds and see the quantitative aspect of their design
• Gain a new perspective on and deeper comprehension of the assigned primary texts
Primary text available online: "New Directions" by Maya Angelou http://www.nexuslearning.net/books/holt_elementsoflit-3/Collection%204/new%20directions.htm
Digital Humanities Tool: Wordle: http://www.wordle.net/
Note: tech requirement: Wordle requires a Java plug-in.
Introduction to Digital Humanities: Word Clouds
The digital humanities (DH, for short) help us study subjects related to human culture, like English and History, by using computer programs. For example, we can use computers to help us read short stories. This handout will give you instructions for using a DH tool online to make word clouds. The tool is Wordle.
Before we get started, let's do a quick memory exercise. Without looking at the story you read for class, take one minute to write down the words or concepts you remember most strongly from it. We will return to this after you make your word clouds.
"New Directions" by Maya Angelou
Words I remember most:
Important ideas from the story I remember most:
Making Word Clouds
Go to www.wordle.net and answer the following question based on what you see on the first page. When you have an answer, share it with the class.
What is a word cloud? ____________________________________________________________
1. Go to www.wordle.net and click "create"
2. Open a new tab or window in your web browser. Go to the online copy of "New Directions" http://www.nexuslearning.net/books/holt_elementsoflit-3/Collection%204/new%20directions.htm and highlight all of the text on the page (except for the words "click here to navigate…etc."). After you have highlighted the text, either right click your mouse and select "copy" or hit CTRL+C on your keyboard.
3. Go back to the Wordle "create" page that you opened. In the box, right click and select "paste" or hit CTRL+V on your keyboard.
4. Click "Go" on the Wordle "create" page
5. A word cloud should appear
6. You can use the Language, Font, Layout, and Color tabs in the top left hand side of the screen to change the way the word cloud looks
7. If you want to remove a word from the cloud, right click on the word (for example, the word "the" is sometimes very prominent and distracting, and you might want to remove it)
8. Show your word cloud to your peers and teacher! What do you notice about it?
9. The best way to save your word cloud is to print it. Otherwise, you will have to start again.
10. To make a word cloud for a different story, copy and paste it into another Wordle "create" page.
(For teachers: the short sketch below shows the word counting that decides which words appear biggest.)
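A word cloud looks painterly, but the ranking underneath is plain counting. The Python sketch below shows that arithmetic; the sentence in `text` is an invented stand-in rather than a quotation from "New Directions," and the little set of "stop words" plays the role of step 7's right-click removal. Wordle itself adds fonts, color, and layout on top of exactly this kind of count.

```python
from collections import Counter
import re

# The counting behind a word cloud: a word is drawn bigger the more
# often it occurs. `text` is an invented stand-in for a pasted story.
text = "She cut a new path and the new path she cut fed her family"

words = re.findall(r"[a-z']+", text.lower())
stop_words = {"the", "a", "and", "she"}   # dropped, as in step 7 above
counts = Counter(w for w in words if w not in stop_words)

for word, n in counts.most_common(5):
    print(word, "appears", n, "time(s)")  # size in the cloud tracks n
```

Comparing this ranked list with the students' memory-exercise lists sets up the discussion questions that follow.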
Questions for Discussion:
• Which words are most prominent? In other words, which words stand out to you visually? Write them down here:
• Do these words match up with the words you wrote down for the memory exercise?
• Wordle shows the words used the most often in a story. How does this help us understand the story?
• In class, you looked for significant details in "The Washerwoman." Does the word cloud give you different answers to which details are significant in the story? The same answers?
• In class, you practiced making predictions and inferences while reading "New Directions." Do you think that your predictions would have changed if you had seen the word cloud before reading the story?
Questions for Discussion: Conclusion:
• Now that you have read the stories yourself, and used a computer program to help you read them in new ways, what do you think are the advantages of both strategies? What is best about reading the story yourself – perhaps this is something that the computer can't offer – and what is best about using the tools? In other words, what are the strengths and weaknesses of both?
• If you could make a computer program that could change the way we read stories, what would it do?

work_dkcnnr2bgfg7fay2k2g6cw67lq ----

A Certain World sparkles when compared to Thomas Merton's voluminous Journals. While Merton is cynical and mostly without wit, Auden laughs at himself and the world: 'If the rich could hire other people to die for them, the poor could make a wonderful living.' (Yiddish proverb, 203). Unlike Merton's nagging soul-searching, Auden promised us to 'let others, more learned, intelligent, imaginative, and witty' tell his life story. Letting others speak throughout, Auden has revised the genre of the autobiography here. A Certain World is in the tradition of the early modern commonplace book. His oeuvre both as poet and prose-writer shows a continuum with what is now neglected in literary studies—the study or reading of literature itself. Like literature, there is nothing certain about Auden.
There is nary a better introductory essay on George Herbert than Auden's Introduction to Herbert (562–7). Rosemond Tuve and Helen Gardner pale in comparison to Auden's assessment of Herbert, being only equal to T. S. Eliot's understanding of Herbert. Auden's genius in understanding Herbert is borne out by his statement that Herbert's poems 'cannot be judged by aesthetic standards alone', since 'all of Herbert's poems are concerned with the religious life' and they are 'the counterpart of Jeremy Taylor's prose' (564). Three points emerge from these observations on Herbert: Auden was convinced that there are standalone aesthetic standards which are sufficient for a work of art to exist sui generis—since poetry makes nothing happen—religion can produce beautiful literature which surpasses Chaucer's caricatures of religion, and Jeremy Taylor's prose is art.
Edward Mendelson needs to be better known among English literature students than Roland Barthes, Jonathan Culler, and Terry Eagleton. Literature is hard back-breaking work having little to do with reading snappy papers using presentation software or commenting on what Derrida might have thought of Auden. It has everything to do with understanding Robert Browning's A Grammarian's Funeral. If Mendelson's clarion call does not convert self-professed literature scions, nothing will.
Subhasis Chattopadhyay
Psychoanalyst
Assistant Professor of English
Ramananda College, Bishnupur

Interdisciplining Digital Humanities: Boundary Work in an Emerging Field. Julie Thompson Klein. University of Michigan Press, 839 Greene Street, Ann Arbor, Michigan 48104–3209. USA. www.press.umich.edu. 2015. 218 pp. hb. $70. ISBN 9780472072545.
The best way to test scholarship is to remove paywalls and put up one's academic work online. Plagiarists and snobs will scoff at these suggestions. Hence Julie Thompson Klein had to write A Culture of Recognition (144–51). The histrionics regarding the value of web scholarship she documents at the Modern Language Association and the Council of the American Historical Association are worth noting. Thompson Klein's book is the single most important book on the subject of web scholarship available now and should complement the MLA Handbook.
Is it believable that in this era of webinars and countless online tools for academics, one needs to beg donors from the 'developed' countries for doles to study the humanities in their nations? One should get rid of seminars—huge wastes of money—all sorts of 'prestigious' scholarships and halt the demeaning culture of begging. It does an academician no good to beg to read a paper at some conference at an 'established' university. As Klein mentions, what we need is the computational turn in the humanities (63). Those who still go to libraries to study in original some medieval manuscripts are potential dangers to their own domains. What if one spoils the manuscript? Why not use digital tools to study it from one's own laptop? A thorough study of Klein's text will hopefully open some perennially shut eyes. Andy Engel's Resourcing at the end of the book is valuable to beginners who want to learn the techne of doing digital humanities.
The cultural work of Klein is to chronicle and even inaugurate a new era in reading, scholarship, and interdisciplinary collaboration. After Gutenberg's press, the Internet is the biggest event in the world. Her book will be remembered as one of the first texts to chronicle the inevitable. Everyone can now study and network with like-minded scholars. Nepotism, political favouritism, and all sorts of cronyism in getting published, crucial for tenure, are going to be eased out through the Internet.
Subhasis Chattopadhyay

The Complete Works of W. H. Auden: Prose: Volume V, 1963–1968. Ed. Edward Mendelson. Princeton University Press, 41 William Street, Princeton, New Jersey 08540. USA. www.press.princeton.edu. 2015. 608 pp. hb. $65. ISBN 9780691151717.
It is fascinating to read Auden's opinions on Robert Browning's The Pied Piper of Hamelin (7–8). Both Browning and Auden have been forgotten by Indian humanists. Auden's huge prose-corpus is unknown to even admirers of his poetry. Edward Mendelson and Princeton University Press have done literature a big service by publishing the prose of Auden in these definitive volumes.
Auden, like every great writer, engages with that one problem which matters most according to the Russian philosopher Nikolai Berdyaev. This is the problem of evil. Auden's 'Good and Evil in The Lord of the Rings' (331–5) is worth careful reading to understand fairy tales, to understand the role of the family in creating stable societies, and the dialectics of the Kantian good and the bad. Auden's prose in this essay takes on a universal sheen.
Auden's prose is a plea against xenophobia, ethnic cleansing, and fascism. He celebrates the family as a locus for self-actualisation; indeed of agape. Research scholars and general readers will be swept away by Auden's range of reading and Mendelson's scrupulous editing. This definitive volume should be in all English departments throughout the world.
Subhasis Chattopadhyay

Intellectuals and Power. François Laruelle in conversation with Philippe Petit. Polity Press, 65 Bridge Street, Cambridge CB2 1UR. UK. www.politybooks.com. 2015. xxvi + 155 pp. pb. $19.95. ISBN 9780745668413.
Are we not all tired of the endless rantings of 'intellectuals' in the electronic media at the slightest act of injustice? To what end do these 'guardians of knowledge' express their opinions? These and many other questions are critically explored in this volume, which is the outcome of long conversations of Philippe Petit with François Laruelle.
The translator Anthony Paul Smith tells in his preface that 'Laruelle marks a difference between what he terms dominant intellectuals, who carry various adjectives like engaged, humanitarian, right-wing, left-wing, etc., and what he terms the determined intellectual. … The determined intellectual is an intellectual whose character is determined in the sense of conditioned or driven by his or her relationship to the victim' (xiv–v). It is this attempt to relate to the victim that propels him to 'undertake … a philosophical re-contextualization of the intellectual' (5). He ventures to classify intellectuals 'on a philosophical basis, a true intellectual function' (7). He is concerned with the overarching 'mediatization' of the intellectual. This book aims to see how the victim and the 'identity of the Real' are wedded to philosophers and intellectuals. Towards this aim Laruelle does not 'leave philosophy to its own authority' just as he does not 'leave theology or religious beliefs to their own authorities' (119). A militant activist related to the victim is Laruelle's vision: 'The non-humanitarian intellectual is not necessarily someone who would refuse to go to demonstrations, someone who would refuse to sign petitions. He looks for another usage. He can absolutely participate in these things, but he will not limit his own action to the belief that sustains them' (131). Anyone concerned with the suffering needs to dive deep into this book.
Editor
Prabuddha Bharata

work_dlega65id5autotq6spq7tcwrq ----

Refracting Digital Humanities
Critical Race, Gender and Queer Theories as (Digital Humanities) Methods
August 4 – 8, 2014
HILT Course Description
The methods and tools used and produced by Digital Humanists function as organizing principles that frame how race, gender, sexuality, and ability are embodied and understood within and through projects, code-bases, and communities of practice. The very 'making' of tools and projects is an engagement with power and control. Through a critical theoretical exploration of the values in the design and use of these tools and methods, we begin to understand that these methods and practices are structures which are themselves marginalizing, tokenizing, and reductionist. By pairing hands-on learning/making with Critical Race Theory, Queer, and Gender Theories, we will interrogate the structures of the tools themselves while creating our own collaborative practices and methods for 'doing' (refracting) DH differently.
To accomplish this, each day will focus on one tool or method. Mornings will be a combination of reading-based discussion and experimental structural/tools-based exercises, while afternoon sessions will focus on pulling it all together in collaborative analytical projects. While no prior technical experience is necessary, you will be experimenting with, and creating, your own theoretical practice that incorporates key themes in critical race, gender and queer theories with digital humanities methods and tools. Therefore, the key requirement for this course is curiosity and a willingness to explore new ideas in order to fully engage with the materials. Students are also encouraged to bring their own research questions to explore through these theories and practices.
Course process website: http://refractivemapping.wordpress.com/about/
This site reflects our process over the course, and is not a complete repository.
Student review of course (Kayla Hammond Larkin) at Digital Library Foundation: http://www.diglib.org/archives/6338/
Dec 2014, update: This syllabus is now part of the Modern Language Association Book Project Digital Pedagogy in the Humanities: Concepts, Models, and Experiments, which is "an open-access, curated collection of downloadable, reusable, and remixable pedagogical resources for humanities scholars interested in the intersections of digital technologies with teaching and learning."
To read more about this project go here: https://github.com/curateteaching/digitalpedagogy/blob/master/announcement.md
To look at other resources for the keyword queer, go here: https://github.com/curateteaching/digitalpedagogy/blob/master/keywords/queer.md
schedule: refracting DH
Day 1: SEE
1.1 introductions; discussion
1.2 discussion: images as objects
1.3 iteration 1: exploration & prototyping
1.4 iteration 2: images
Day 2: HEAR
2.1 discussion: what does race, gender sound like?
2.2 workshop: audio, exploration
2.3 iteration 1: audio editing & prototyping
2.4 iteration 2: out into the world
Day 3: KNOW
3.1 discussion: critical code
3.2 workshop: basic electronics, intro to arduino
3.3 iteration 1: prototyping
Day 4: MOVE
4.1 discussion: critical cartography & mapping
4.2 iteration 1: exploring & prototyping
4.3 workshop: mapping
4.4 iteration 2: prototyping
Day 5: MAKE
5.1 workshop: next steps
5.2 workshop: completion of work
5.3 wrap-up: research & teaching
software list: refracting DH
This is the software we will be using throughout the week. Please pre-load it onto your laptop to keep the course on schedule.
• Dropbox – on phone and on laptop. Please email me with your Dropbox info so I can invite you to join a shared folder.
• Google docs – please email me with your Google email info so I can invite you to join a shared folder.
• Audacity: http://audacity.sourceforge.net/
• Soundplant: http://soundplant.org/
• Arduino: http://arduino.cc/
• Processing: https://www.processing.org/
reading list: refracting DH
Reading List for Refracting Digital Humanities
Critical Race, Gender and Queer Theories as (Digital Humanities) Methods
note: students in this course will have access to book chapters via dropbox

Digital Humanities Background Reading
Open Thread: The Digital Humanities as "Refuge" from Race/Class/Gender/Sexuality/Disability? http://dhpoco.org/blog/2013/05/10/open-thread-the-digital-humanities-as-a-historical-refuge-from-raceclassgendersexualitydisability/
DHpoco: founding principles http://dhpoco.org/founding-principles/
#Transformdh: http://transformdh.org/
Smith, Martha Nell. The Human Touch Software of the Highest Order: Revisiting Editing as Interpretation. Textual Cultures, Vol. 2, No. 1 (Spring, 2007), pp. 1-15

Day 1: SEE
Bailey, Moya Z. All the digital humanists are white, all the nerds are men, but some of us are brave. http://journalofdigitalhumanities.org/1-1/all-the-digital-humanists-are-white-all-the-nerds-are-men-but-some-of-us-are-brave-by-moya-z-bailey/
Gershenson, Olga and Barbara Penner.
Ladies and Gents: Public Toilets and Gender
• Foreword
• Introduction: The Private Life of Public Conveniences
• Chapter 13: "our little secrets": A Pakistani Artist Explores the Shame and Pride of her Community's Bathroom Practices
hooks, bell. The Oppositional Gaze, in Black Looks: Race and Representation.
• p 115-31
Lutz, Catherine A. and Jane L. Collins. The Photograph as Intersection of Gazes, in Reading National Geographic
McFadden, Syreeta. Teaching The Camera To See My Skin: Navigating photography's inherited bias against dark skin. http://www.buzzfeed.com/syreetamcfadden/teaching-the-camera-to-see-my-skin
McPherson, Tara. Why are the Digital Humanities so white? http://dhdebates.gc.cuny.edu/debates/text/29
Muñoz, José Esteban. Disidentifications: Queers of Color and the Performance of Politics
• Introduction, p 3-11

Day 2: HEAR
Bradley, Regina. Death Wish Mixtape: Sounding Trayvon Martin's Death. http://soundstudiesblog.com/2012/03/26/death-wish-mixtape-sounding-trayvon-martins-death/
Bradley, Regina. Fear of a Black in the Suburb. http://soundstudiesblog.com/2014/02/17/fear-of-a-black-in-the-suburb/
Casillas, D. Ines. Speaking "Mexican" and the Use of "Mock Spanish" in Children's Books. Or, Do not read Skippyjon Jones. http://soundstudiesblog.com/2014/05/05/speaking-mexican-and-the-use-of-mock-spanish-in-childrens-books-or-do-not-read-skippyjon-jones/
Laorale. Óyeme Voz: U.S. Latin@ & Immigrant Communities Re-Sound Citizenship and Belonging. http://soundstudiesblog.com/2014/01/13/oyeme-voz-u-s-latin-immigrant-communities-re-sound-citizenship-and-belonging/

Day 3: KNOW
Enteen, Jillana. virtual_english: queer internets and digital creolization
• Introduction: Life Skills, p 1-20
• Booting Up: The Languages of Computer Technologies, p 21-45
Introna & Nissenbaum. Shaping the Web: Why the Politics of Search Engines Matter. http://www.indiana.edu/~tisj/readers/full-text/16-3%20Introna.html
Koh, Adeline. Less yack more hack: Modularity Theory & Habitus in the Digital Humanities. http://www.adelinekoh.org/blog/2012/05/21/more-hack-less-yack-modularity-theory-and-habitus-in-the-digital-humanities/
Kolko, B.E. Erasing @race: Going White in the (Inter)Face, in Race in Cyberspace: p 213-232
Nakamura, Lisa and Peter Chow-White. Race After the Internet
• Introduction
HASTAC Book Review of Race After the Internet: http://www.hastac.org/content/reviews-race-after-internet
Posner, Miriam. Some things to think about before you exhort everyone to code: http://miriamposner.com/blog/some-things-to-think-about-before-you-exhort-everyone-to-code/

Day 4: MOVE
Chun, Wendy. Control and Freedom
• Introduction
Farman, Jason. Mapping the Digital Empire: Google Earth and the Process of Postmodern Cartography. New Media Society 2010 12: 869. http://nms.sagepub.com/content/12/6/869
Kwan, Mei-Po. Affecting Geospatial Technologies: Toward a Feminist Politics of Emotion. The Professional Geographer 59.1 (2007): 22-34
Radical Cartography: http://www.radicalcartography.net/

Other Related Readings (recommended):
Mulvey, Laura. Visual Pleasure and Narrative Cinema
Monmonier, Mark. How to Lie with Maps
• Introduction
• Chapter 3: Map Generalizations
NPR. Light And Dark: The Racial Biases That Remain In Photography (audio). http://www.npr.org/blogs/codeswitch/2014/04/16/303721251/light-and-dark-the-racial-biases-that-remain-in-photography
Paglen, Trevor. Blank Spots on the Map
• Chapter 1
• Chapter 4
Review: http://content.time.com/time/arts/article/0,8599,1876478,00.html
Smith, David. Racism of early colour photography explored in art exhibition. http://www.theguardian.com/artanddesign/2013/jan/25/racism-colour-photography-exhibition
work_dplcocllrbdozp4rl4pm7p4ssu ----

[The following is the full text of an essay published in differences 25.1 (2014) as part of a special issue entitled In the Shadows of the Digital Humanities edited by Ellen Rooney and Elizabeth Weed. Duke UP's publishing agreements allow authors to post the final version of their own work, but not using the publisher's PDF. The essay as you see it here is thus a PDF that I created and formatted myself from the copy edited file I received from the press; subscribers, of course, can also read it in the press's published form direct from the Duke UP site. Other than accidentals of formatting and pagination this text should not differ from the published one in any way. If there are discrepancies they are likely the result of final copy edits just before printing—I'd appreciate having them pointed out. These and other comments can be sent to mgk@umd.edu. This article is copyright © 2014 Duke University Press.]

What Is "Digital Humanities," and Why Are They Saying Such Terrible Things about It?
Matthew Kirschenbaum
University of Maryland

I.

In the midst of the 2009 MLA Convention, Chronicle of Higher Education blogger William Pannapacker wrote, "Amid all the doom and gloom [. . .] one field seems to be alive and well: the digital humanities. More than that: Among all the contending subfields, the digital humanities seem like the first 'next big thing' in a long time, because the implications of digital technology affect every field." Two years later Pannapacker titled his MLA Chronicle column with the seemingly unnecessary interrogative "Digital Humanities Triumphant?" But would that more in life were so predictable as an academic dialectic: in 2013, Pannapacker's by-now anticipated convention coverage centered on "The Dark Side of the Digital Humanities," the special session event from which this journal issue is derived. I was not a participant on the panel, but I was in the crowded ballroom at the Boston Sheraton. The mood beforehand was festive, not contentious. Everyone was expecting a good show from the A-list speakers assembled. From somewhere up front the strains of the Star Wars Imperial March, made tinny by a laptop speaker, were accompanied by scattered laughter. Nonetheless, the issues raised and the charges leveled were of the most serious order. Richard Grusin, who had convened the session, built toward an arresting summation: "I would assert that it is no coincidence that the digital humanities has emerged as 'the next big thing' at the same moment that the neoliberalization and corporatization of higher education has intensified in the first decades of the twenty-first century." This short essay is not intended as a defense of digital humanities, not least because I don't think I disagree with Grusin, at least insofar as his articulation of the institutional environment that surrounds digital humanities is concerned. (I work in a university too, I have eyes, I have ears.)
Yet next big thing or no, when it comes to digital humanities we are still only ever talking about someone's or several someones' work, the errors and limitations of which, whatever they may be in their particulars, should require no special forum or occasion for airing. So let me say it at the outset: everything produced by digital humanities—and I do mean every thing, every written, scripted, coded, or fabricated thing—in whatever its guise or form, medium or format, may be subject to criticism and critique on the basis of its methods, assumptions, expressions, and outcomes. All of that is completely normative and part of the routine conduct of academic disciplines. Yet in the last couple of years events that are not normative or routine have occurred, and it is those events that we are addressing with this special journal issue and that were addressed at the MLA special session. These events, I would maintain, concern not the papers, projects, and other material pursuits of digital humanities—not the things of the digital humanities—but rather the advent of a construct of a "digital humanities." Lest anyone think I am beginning with a semantic slip-slide, what I have just asserted is not only uncontroversial, it is also unoriginal, echoing as it does statements by the MLA session's invited participants. Wendy Chun, for example, insisted: "But let me be clear, my critique is not directed at DH per se. DH projects have extended and renewed the humanities and revealed that the kinds of critical thinking (close textual analysis) that the humanities have always been engaged in is and has always been central to crafting technology and society" (emphasis in original). By this account, then, DH "projects" have "extended and renewed" the humanities and have also helped historicize its activities in ways Chun finds salutary. Rita Raley, meanwhile, commenting afterward on the response to the session (which unfolded in real time on Twitter), is even more direct, noting: "[T]hough our roundtable referred in passing to actually existing projects, collectives, and games that we take to be affirmative and inspiring, the 'digital humanities' under analysis was a discursive construction and, I should add, clearly noted as such throughout" (my emphasis). Whatever else we are talking about in this special issue, then, whatever else the MLA session was addressing itself to, and whatever else I am engaging in my contribution here, it is not the material conduct of digital humanities or, if you prefer, "actually existing projects," an especially clarifying phrase to keep in mind. It is, instead, and still in Raley's terms, a "discursive construction." I have written about the existence of such a construct before, in two previous essays to which this, I suppose, contributes a third and final entry in an unanticipated trilogy. The first and most widely circulated of these, "What Is Digital Humanities and What's It Doing in English Departments?" began as an assignment for a 2010 Association of Departments of English meeting (hence the specificity of its address). I opened it by enjoining anyone truly interested in the first half of the titular question to Google it, or perhaps consult Wikipedia.
At the time I was merely acting out my impatience, since whatever else one could say about digital humanities, there had been no shortage of writing seeking to define it and so, as I put it then, "Whoever asks the question has not gone looking very hard for an answer." But my real point wasn't that Google or Wikipedia were the de facto authorities, but rather that they offered convenient portals to layers of consensus that are shaped, over time, by a community of interested persons. In other words, digital humanities was a construct, and the state of the construct could be more or less effectively monitored by checking in on its self-representations in aggregate. (The remainder of the piece did some historical spadework, excavating the actual origin of the term digital humanities and explaining why I thought English departments had—again, historically—been especially hospitable to its emergence.) But while the essay historicizes and characterizes DH, at no time does it actively define it; instead, in retrospect, here is what I see as its most clearly spoken moment:

Digital humanities has also, I would propose, lately been galvanized by a group of younger (or not so young) graduate students, faculty members (both tenure line and contingent), and other academic professionals who now wield the label "digital humanities" instrumentally amid an increasingly monstrous institutional terrain defined by declining public support for higher education, rising tuitions, shrinking endowments, the proliferation of distance education and the for-profit university, and, underlying it all, the conversion of full-time, tenure-track academic labor to a part-time adjunct workforce.

I don't see this description of what I term a "monstrous" institutional terrain differing substantially from Grusin's view of where we are in the academy today. For several years thereafter, whenever asked to define digital humanities, my response was thus to say "a term of tactical convenience." The contention that "DH" was usefully understood as a tactical term, then, became the subject of the second of these two essays, a contention necessary in order to, as I next wrote, "insist on the reality of circumstances in which it ['digital humanities'] is unabashedly deployed to get things done—'things' that might include getting a faculty line or funding a staff position, establishing a curriculum, revamping a lab, or launching a center" ("Digital" 415). That second piece does some further historical work, examining in detail one such tactical deployment of DH at one specific institution, and also, in a separate section, attempting to delineate how "DH," as a signifier, was increasingly operationalized algorithmically on the network, actively mobilized via hashtags and metadata. This essay has been criticized by Brian Lennon on the grounds that "tactical," if read to follow de Certeau's usage, invokes an outsider position that DH can no longer (or indeed, ever) claim the luxury of inhabiting; that DH is, rather, a strategic formation complicit with the state, or at the very least, complicit with the aims of conniving deans and administrators and foundation officers who are actively seeking to dismantle the bare, ruined choirs of the professoriate.
As I previously responded: "[F]or those of us who have [built] centers/programs/curricula/what-have-you one proposal, one hire, one lecture series, one grant, one server, one basement room at a time, the institutional interiority and strategic complicity of digital humanities seems perhaps equally unpersuasive" (Comment). Be that as it may. Why write a third piece on the topic? While questions about digital humanities did not originate with the 2013 MLA, that moment does seem to me to mark the onset of an increasingly aggressive challenge that deserves recognition, and response. Some elements of that challenge, like the MLA session or this journal issue, assume conventional shapes and forms that will be familiar to the uninitiated and easily processed. Others, like blog entries (perhaps with comments appended), are also increasingly accepted as part of the space of our conversations, a grey literature that requires only a link passed in an email or on Facebook to access and assimilate. Yet other maneuvers have unfolded in more hermetic environments, largely inaccessible to outsiders, defined especially by Twitter but more specifically by the interaction between Twitter and other online services (including Facebook and blogs), the result being a complex, always evolving ecology that rewards the 24/7 attention cycle. This particular discourse network is characterized by subtle layers of indirection and innuendo (sometimes called "subtweets" for subliminal tweets, i.e., oblique commentaries in which particular individuals may or may not recognize themselves), a kind of social steganography (danah boyd's term) whose stratifications render individual agendas transparent to the initiated and opaque to the neophyte. While no one can be plugged in all the time, for a number of the contributors to this issue, these discussions form a normative part of their routines, an extension or facet of their critical engagement over the course of a day as the feed refreshes and the notifications chime. (I pause for these details because online speech denaturalizes the register of the discourse here; and I lay emphasis on them to break down the dualism between the landscape of social media and traditional venues of professional record, like a Duke University Press journal.) If you follow the right Twitter accounts, then, if you read the right blogs, if you're on the right lists, and if you're included in the right backchannels . . . if you do these things, you'll be within your rights to wonder (all over again) what digital humanities is anyway, and why on earth anyone would want it in their English (or any other) department. Herewith, then, are some of the terrible things of my title, hardly any of which are exaggerated for effect: Digital humanities is a nest of big data ideologues. Digital humanities digs MOOCs. Digital humanities is an artifact of the post-9/11 security and surveillance state (the NSA of the MLA). Like Johnny, digital humanities can't read. Digital humanities doesn't do theory. Digital humanities never historicizes. Digital humanities is complicit. Digital humanities is naive. Digital humanities is hollow huckster boosterism. Digital humanities is managerial. Digital humanities is the academic import of Silicon Valley solutionism (the term that is the shibboleth of bad-boy tech critic Evgeny Morozov).
Digital humanities cannot abide critique. Digital humanities appeals to those in search of an oasis from the concerns of race, class, gender, and sexuality. Digital humanities does not inhale (easily the best line of the bunch). Digital humanities wears Google Glass. Digital humanities wears thick, thick glasses (guilty). Perhaps most damning of all: digital humanities is something separate from the rest of the humanities, and—this is the real secret—digital humanities wants it that way.[1] Terrible things indeed these are! But while terrible can mean repugnant, the etymology of the word (Greek treëin, "to tremble") also encompasses that which is terrific, by which we can mean possessed of great intensity (see also contemporary French usage). It is not then so inappropriate to be saying "terrible" things about digital humanities at this particular moment, a moment when the institutions we inhabit are indeed at the epicenter of seismic shifts in attitude, means, and mission. But we should be clear about exactly what it is we are addressing with these terrible allegations: we are (almost always) addressing and investing a construct, a construct that is variously journalistic (note the straight line from Grusin's MLA comments to Pannapacker), administrative, algorithmic, and opportunistic (for which one might read, yes, tactical). Collectively, and above all else, it is discursive, as Raley so astutely noted. The very orthographic contours of "digital humanities" have been subject to unprecedented scrutiny: not long ago, William Germano, now Dean of Humanities and Social Sciences at Cooper-Union, pronounced upon "[t]he spectacular rise of 'DH' as the most powerful digraph in the non-STEM academy." It is appropriate that Germano, editor-in-chief for twenty years at Columbia University Press prior to his Cooper-Union appointment, selects exactly the right term of art here. The digraph "DH"—variously also dh/DH/D_H/#dh as well as #transformdh and #dhpoco—is especially conspicuous on Twitter, where it functions not only as economical shorthand but also, as I have noted previously, as a hashtag—metadata—to be operationalized through search engines, aggregators, and notification services. The orthographic (and very often orthogonal) tensions around digital humanities—is it the digital humanities or just digital humanities, is it capitalized or not capitalized—are further emblematic in this regard. The agon par excellence of the construct is of course the question of definition: what is digital humanities? The insistence on the question is what allows the construct to do its work, to function as a space of contest for competing agendas. But more importantly—and this is precisely where the logic of the construct most readily reveals itself—there is no actual shortage of definitions of digital humanities. They are, by contrast, always latent and very often explicit in every curriculum and program proposal, every search committee charge and hiring brief, every grant application and book project that sees fit to invoke the term. The definitions may not align, indeed they may at times prove inimical to one another.
But variegation is not the same as absence or ineffability, and so we may conclude that the continued insistence on definition is precisely what allows the construct to function as a self-evident given, perpetuating itself through brute repetition and the proliferation of localized, sometimes media-specific digraphic focalizers. You may recall that the Construct was also the name given to the self-contained emulation of the Matrix in the Wachowskis' films, the dojo where Neo spars with Morpheus to hone his Kung Fu technique. The construct in this sense is overtly a place of ritualized (and dematerialized) contest. This is not incidental to the sense in which I use it here, literalizing the meaning of the term beyond (I am sure) Raley's intentions. In the construct, the habitus of social media disrupts the traditional comity of academic exchange. Just as Neo learns to bend—hack—the physics of his programmed reality, here one bends collegial niceties in competition for hits, retweets, likes, and replies, the very stuff—the Fu, in Internet parlance—of such odious reputation trackers as Klout. Indeed, we know that when this journal issue is published its availability will be widely tweeted. Brief excerpts from the essays (140 characters, remember) will circulate on Twitter. Blog posts characterizing or responding to the essays at greater length will appear; the essays themselves may be uploaded to personal sites or institutional repositories by their authors. The authors and others will engage one another in the tweets and blog comments. All of this will happen over a course of days, weeks, and months. While the records of those responses will linger thereafter on the Web, they will be mute remainders, mere husks, of the frisson, the serotonin- and caffeine-fueled jags that propel real-time online exchange. Only much more slowly will these essays pass into the collected professional literature, where they will be indexed, quoted, and referenced in the usual way. This issue on the dark side of the digital humanities is itself an artifact (an issue) of the construct and will serve to sustain it, not least through (again) the cascade of agonistic reductionism that will inevitably characterize those engaging it through channels of metrical (that is, reputation-based) circulation on social media. Metrical, and often brutal. Brutalism, or what some have dubbed the rhetoric of contempt, like ex cathedra pronouncement and aphorism, is a recognized online interactive mode, and the take-down is its consummate expression as genre and form. Such is in fact the signature style of Evgeny Morozov, the caustic technology critic whose first book was titled
The critique, honed on the whetstone of personal contact and up-close immersion in the day-to-day doings of the technorati (see, for example, his 15,000 words on Tim O’Reilly in The Baffler), is aimed at technological essentialism and technological determinism, and above all idealism—what Morozov brands solutionism—which his second book, To Save Everything, Click Here, effectively demolishes. As a break-out public intellectual, Morozov is in his element online, cultivating an uncompromising, acerbic persona (his Twitter bio reads simply: “There are idiots. Look around”). The transposition to digital humanities by some of his followers was predictable: DHers are themselves solutionists, pretenders who arrive to fix the ills of the present-day academy with tools, apps, and the rhetorical equivalent of TED talks, all driven by a naive (and duplicitous) agenda that has its roots if not (yet) in an IPO then in the academic currency of jobs, funding, and tenure. But this is poor critique and worse history, suggesting, as it does, that the differences between venture capital and public institutions are, quite literally, immaterial. Digital humanities in the United States at least has its beginnings not in California and not (for the most part) on the Ivy campuses, but instead in mostly eastern land-grant institutions. When a full documentary and archivally sound history of “digital humanities” is written, it will have to take into account the idiosyncrasies of this particular class of institution, and these will, I think, reveal a very different set of contexts than Silicon Valley’s orchards, lofts, and technology parks.

Charges of brutalism and lack of civility are de facto subject to infinite regress, for the very charges become the object of brutal ridicule, and the cycle perpetuates. But at some level it should be uncontroversial to observe that many of the terrible things uttered about “digital humanities” as a construct simply lack an elemental generosity, as if there were no critical (let alone ethical) distinctions obtainable between data mining a corpus of nineteenth-century fiction and data mining your telephone calling records, as if those who “do” DH haven’t been educated in the same critical traditions (indeed, sometimes in the same graduate programs) as their opponents, as if those who do DH aren’t also politically committed and politically engaged, and as if they don’t (as a result) typically find Morozov himself both amusing and smart and profoundly uncontroversial. (And you will not convince me otherwise: here I unapologetically rely on my own stores of anecdote and personal interaction, on conversations and relationships that go back in some cases decades, to make these determinations.) To indulge digital humanities only ever as a construct and a site of contest is also thus to give in to a world view that seems to me precisely neoliberal, precisely zero sum and agonistic—disembodied, desocialized, and evacuated of materiality or material history.

II.

I am finishing this essay in the weeks immediately following the conclusion of the Digital Humanities 2013 conference, held in Lincoln, Nebraska.
DH13 was this year’s conference of record for the Alliance of Digital Humanities Organizations (ADHO), first formed in 2005 as an administrative entity shared by two scholarly associations, the predominantly North American Association for Computers and the Humanities and the predominantly European Association for Literary and Linguistic Computing, which have themselves been holding joint conferences since 1989 and individually since the early 1970s. Today ADHO encompasses six constituent organizations, also including the Canadian Society for Digital Humanities / Société pour l’étude des médias interactifs (SDH-SEMI, now SDH/SCHN), the Australasian Association for Digital Humanities (aaDH), centerNet: An International Network of Digital Humanities Centers, and the Japanese Association for Digital Humanities (JADH). I mention these particulars to place two sets of facts before us: one, that digital humanities, even in its current configuration (what Steve Ramsay has dubbed “DH Type 2”), has a history going back nearly a decade (and, as “humanities computing,” much longer than that), and two, that digital humanities has become thoroughly internationalized. Indeed, an attendee at the 2013 conference might have heard papers such as “Uncovering the ‘Hidden Histories’ of Computing in the Humanities 1949–1980: Findings and Reflections on the Pilot Project” or “Authorship Problem[s] of Japanese Early Modern Literatures in Seventeenth Century.” Or else papers like “Are Google’s Linguistic Prosthesis Biased toward Commercially More Interesting Expressions? A Preliminary Study on the Linguistic Effects of Autocompletion Algorithms” or “The Digitized Divide: Mapping Access to Subscription-Based Digitized Resources” or “Against the Binary of Gender: A Case for Considering the Many Dimensions of Gender in DH Teaching and Research.” While the conference is heavily attended by humanities faculty and graduate students, it also includes significant representation from information studies, computer science, and library and archives professionals, as well as the so-called alt-ac space. Consequently, critical methods, assumptions, and discourse networks do not always align, even within the same panel; for every scrupulously written and carefully read paper citing Judith Butler or Bruno Latour, there were slide decks with data sets, graphs, and bullet points.

If definition is the first great agon of the construct, inclusion and extent—who’s in, who’s out—is the second. The stakes are obvious: when a federal funding agency flies the flag of the digital humanities, one is incentivized to brand their work as digital humanities. When an R1 does a cluster hire in digital humanities, one is incentivized to be on the market as a digital humanist. When a digital humanities center has institutional resources, one is incentivized to seek to claim them by doing DH. None of this is disingenuous or cynical, nor can anyone who has looked in detail at the history of academic disciplines think digital humanities is in any way exceptional with regard to dependencies between its intellectual currency and bottom-line ways and means. Yet we frequently ignore these institutionalized realities in favor of an appeal to the “digital humanities” construct, as though the construct (and not the institution) were the desired locus of our agency and efficacy.
In fact, digital humanists are recognized in the same way as individuals working in other fields: by doing work that is recognizable as digital humanities. My publishing in differences does not make me a scholar of feminist cultural studies; were I to wish to have myself considered as such, though, I would seek to publish in differences (and kindred venues), and I would develop my work within a network of citations recognizable to the already active participants who are publishing and speaking and teaching in that area, with the goal of being listened to by them. In time, if my contributions had merit, they might be taken up and cited by others and thus assimilated into an ongoing conversation. So it is with digital humanities: you are a digital humanist if you are listened to by those who are already listened to as digital humanists, and they themselves got to be digital humanists by being listened to by others. Jobs, grant funding, fellowships, publishing contracts, speaking invitations—these things do not make one a digital humanist, though they clearly have a material impact on the circumstances of the work one does to get listened to. Put more plainly, if my university hires me as a digital humanist and if I receive a federal grant (say) to do such and such a thing that is described as digital humanities and if I am then rewarded by my department with promotion for having done it (not least because outside evaluators whom my department is enlisting to listen to as digital humanists have attested to its value to the digital humanities), then, well, yes, I am a digital humanist. Can you be a digital humanist without doing those things? Yes, if you want to be, though you may find yourself being listened to less unless and until you do some thing that is sufficiently noteworthy that reasonable people who themselves do similar things must account for your work, your thing, as part of the progression of a shared field of interest. That is what being a digital humanist is; it is almost all of what being a digital humanist is. And while the material particulars of the work may vary in certain respects, including some very consequential respects, it is different not at all from being a Victorianist or a feminist cultural studies scholar or a scholar of Victorian feminist cultural studies.

Digital humanists don’t want to extinguish reading and theory and interpretation and cultural criticism. Digital humanists want to do their work. They want jobs and (if the job includes the opportunity for it) they want tenure and promotion. They (often) want to teach. They (often) want to publish. They want to be heard. They want professional recognition and stability, whether as contingent labor, ladder faculty, graduate students, or in “alt-ac” settings. In short, they want pretty much the same things that every working academic wants, and the terrible truth is that they go about it in more or less familiar ways that include teaching, publishing, and administration.

Take, for example, Matthew Jockers, a collaborator and past colleague of Franco Moretti, he who gave us the term distant reading: now Jockers is on the English faculty at the University of Nebraska–Lincoln, an institution that has developed an exceptionally strong capacity in digital humanities (hence its hosting the recent conference).
If anybody is “in” DH, surely it is Jockers. He has recently published a book titled Macroanalysis: Digital Methods and Literary History as part of the University of Illinois Press’s Topics in the Digital Humanities series. In one early chapter, over the span of about a page, Jockers deploys a sequence of metaphors gleaned from strip mining to articulate his work’s relation to the literary history of his subtitle:

[W]hat is needed now is the equivalent of open-pit mining or hydraulicking. [. . .] Close reading, traditional searching, will continue to reveal nuggets, while the deeper veins lie buried beneath the mass of gravel layered above. What are required are the methods for aggregating and making sense out of both the nuggets and the tailings [. . .] [to] exploit the trommel of computation to process, condense, deform, and analyze the deeper strata from which these nuggets were born, to unearth, for the first time, what these corpora really contain. (9–10)

I am quoting selectively, and elsewhere Jockers develops his argument along paths more subtle, perhaps more comfortable, than mountaintop removal. But that’s secondary to my point, which is that to receive even such passages as these with the agonistic zero-sum view that the author seeks to somehow eradicate traditional close reading and interpretation makes sense only in the construct. The facts, after all, are these: Jockers’s book was published in late 2012 with a print run of such and such. It will be bought by university libraries and some number of individuals. Some fewer number of those who bought it will read it. It will be reviewed in some number of venues, though the reviews will fall off after the first few years as they always do. Eventually (we do not know when) it will go out of print. It will be cited, by how many we do not yet know. It will be assigned, to how many classes we do not yet know. It will inspire some number of students, some fraction of whom may perhaps go to Nebraska, to work with Jockers. At some point the approaches in the book may pass out of fashion, and it may thus appear dated or naive. At some point the approaches may become more widespread, in which case the book will appear prescient and wise. Regardless, the book will do what almost all serious books do, albeit to greater or lesser extents: contribute to a conversation.

Right now there is an especially lively such conversation around how we read. My colleague Lee Konstantinou has been collecting the different modalities; besides close and distant, his list includes also uncritical reading (Michael Warner), reparative reading (Eve Sedgwick), generous reading (Timothy Bewes), disintegrated reading (Rita Raley), surface reading (Stephen Best and Sharon Marcus; also Heather Love), and the hermeneutics of situation (Chris Nealon and Jeffrey Nealon). Jockers’s interventions in Macroanalysis have precisely no chance of displacing or discouraging any of these other modes of reading even if such were his intent, which it manifestly is not. Jockers does not wish for us all to become text miners and for none of us to read symptomatically or generously or reparatively; he likely wishes for more of us to mine texts (surely that is a motive in writing the book), and then talk to those who read reparatively and generously and closely (surely that is the motive in doing the mining).
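(To make concrete, at its most elementary, what such mining can mean in practice, consider the minimal sketch below. It is an illustration of the general technique only, not Jockers’s actual method, and the three “novels” are invented placeholders for real full texts:

    # A self-contained sketch of elementary corpus "mining": count the most
    # frequent content words across a toy corpus. The texts are invented
    # stand-ins, not a real nineteenth-century collection.
    import re
    from collections import Counter

    corpus = {
        "Novel A": "The mill stood silent above the grey northern town.",
        "Novel B": "The town forgot the mill, and the river forgot the town.",
        "Novel C": "Silent rivers carry grey histories past the mill.",
    }

    STOPWORDS = {"the", "and", "a", "of", "above", "past"}

    def tokens(text):
        # Lowercase, split into words, and sieve out stopwords.
        return [w for w in re.findall(r"[a-z']+", text.lower())
                if w not in STOPWORDS]

    counts = {title: Counter(tokens(text)) for title, text in corpus.items()}
    aggregate = sum(counts.values(), Counter())  # nuggets and tailings together
    print(aggregate.most_common(5))

Scaled to thousands of volumes and refined statistically, such counting becomes macroanalysis; it does not thereby become a rival to interpretation so much as a new occasion for it.)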
None of this differs in any substantial way from the publication of a special journal issue collecting papers from a group of scholars around an intervention such as “surface reading,” for example.

Let me offer an example from another quarter. Peter Robinson, who has had a long and distinguished career as an editorial theorist and textual scholar, has lately been giving papers in which he purports to explain “[w]hy digital humanists should get out of textual scholarship. And if they don’t, why we textual scholars should throw them out.” Robinson’s argument is predicated on the belief that digital humanists build tools and that textual studies now more or less has all the tools it needs to go about its work, which is that of making critical editions (electronic or otherwise). He ends with this: “We may use digital humanities to be better textual scholars, but we do not pretend to be digital humanists. In return, digital humanists might also declare: we do digital humanities, and we try to help textual scholars to be better textual scholars through digital humanities, but we do not pretend to be textual scholars.” There are many ways in which one might seek to answer Robinson, starting with the assumption that digital humanities is confined to the activity of tool building. But we can also say this: Robinson’s concluding statement is a catechism that makes sense only in the construct, that virtual discursive space where Morpheus and Neo (who are both really on the same side, remember) can battle without regard for bodies, history, or physics. Outside of the construct, Robinson’s statement has no sense, indeed, no context. It speaks to no body. Why? Because it presumes the existence of entities called digital humanities (or for that matter textual scholarship) that exist apart from the practices of the people who identify with them. (To be sure, there are exemplars of digital humanists who have no great interest in textual scholarship just as there are textual scholars who have no investments in the digital humanities—but these individual cases merely reflect the reality of individual choices and careers, not the fractal coastlines of some metadisciplinary geography exposed at low tide.) Robinson is thus making a purely discursive move in a purely discursive space. Put more plainly, it is not as if one could sit in the audience and hear his talk and say, “Yes, Robinson has this right, and so I will return to my campus and dissociate digital humanities from textual scholarship forthwith.” Indeed, Robinson himself clearly knows this, since the most tangible action items in his paper refer to the material circumstances of scholarly production: copyrights, costs, the quality of markup and metadata, and the interoperability of tools. In any case, Robinson’s positions would have been unimaginable just a few years ago, before the first large-scale deployments of the “digital humanities” construct. Not because there are no intellectual distinctions to be drawn between what digital humanities does and what textual scholarship does, but rather because the number of actual people—outside the construct—who would wish to concern themselves with the things Robinson concerns himself with who do not also have a history and identity in the “digital humanities” is nowadays vanishingly small.
I have written as I have to suggest neither that all dark side critiques are disingenuous nor that any questioning of “digital humanities” is universally reducible to a construct. Of course one should ask questions about any set of disciplinary practices that have been as visible and prodigious as digital humanities in recent years. And the construct serves its purpose too; reductionism is often nothing more (and nothing less) than a concession to the limitations of the human capacity for attention. Indeed, the formation of discursive constructs around areas of critical engagement is itself entirely normative (see, for example, “New Historicism” or “Romanticism”); Brian McHale once chose exactly that phrase—discursive construct—to characterize “postmodernism” (4–5). Thus it is also not surprising that “DHers” themselves have written innumerable statements which contribute to the construct’s formation and perpetuation. But it is also necessary and appropriate to draw attention to what seems to me to be a recent and particular and peculiarly conspicuous set of moves, those suggested by the serial repetition of qualifying language seeking to establish discursive distance between critiques of “digital humanities” as such and those addressed to individual projects and productions. Drawing attention to that move (I have sought to do this typographically through my own use of quotation marks around “digital humanities,” much as Morozov insists on “the Internet”) ought to remind us of the limits of critique when critique is exercised according to recognizable and repeatable (and procedural) stances. So-called “dark side” critiques could therefore productively probe the “digital humanities” construct in relation to what we know of prior academic discursive formations, an inquiry remarkably absent from those critiques to date despite their own charges that “digital humanities” is not sufficiently invested in its histories. Moreover, critiques of “digital humanities” can ameliorate the construct (as opposed to indulging its brutal and metric perpetuation) by acknowledging—historically, materially—that “digital humanities” is in fact a diversified set of practices, one whose details and methodologies responsible critique has a responsibility to understand and engage. Such I would dearly like to see, for it is needed not just by “digital humanities” but by the constituencies of the humanities. Recent revelations notwithstanding, we cannot proceed as though such suddenly public phenomena as “metadata” or “data mining” are simply the calling cards of the state.

I know of at least one exemplar already at hand. I am thinking of Alan Liu’s essay “The Meaning of the Digital Humanities” in the March 2013 issue of PMLA. Given the title, one could be forgiven for expecting the usual bout with definitions and measures of inclusion. But the essay offers little in that regard. It makes a remarkably novel move instead: a close reading (if you will) of one particular digital humanities project, specifically, a paper published out of Stanford University’s Literary Lab based on experiments with computational analysis of a data corpus.
In focusing his address on the research reported in this one paper, Liu hews very close to the science and technology studies (STS) approach that I believe offers the best basis for relevant critique of and in the digital humanities, a critique focused around the illumination of the antecedents, assumptions, and material dependencies of particular tools, methods, parentages of mentoring, and institutional settings. Digital humanities, after all, is sometimes said to suffer from physics envy. Let us, then, take that as it may be and avail ourselves of a singularly powerful intellectual precedent for examining in close (yes, microscopic) detail the material conditions of knowledge production in scientific settings or configurations. Let us read citation networks and publication venues. Let us examine the usage patterns around particular tools. Let us treat the recensio of data sets. Liu gives us a more-than-passing glimpse of what all this may look like: he undertakes to correspond, for example, with the managing editor of the Historical Thesaurus of the Oxford English Dictionary (HTOED), a reference whose data furnishes the main ingredient in what Liu terms an essential “adjustment step” in the authors’ methodology. Liu, a master reader, rightly recognizes this as the crux of the narrative he is unspooling, and so he follows the thread to the source in order to expose the implications of the dependencies to the HTOED. Liu further notes that the HTOED, though historically “precomputational,” is not “pretechnological” and has in fact been implemented and transposed through a series of online databases since its origins in the 1960s; it thus (now) manifests a rich range of media archaeological layers. The essay succeeds not only because it offers up a critique with which we may better see the contributions and limits of a particular project but also because it is actively interested in—I would go so far as to say fascinated by—digital humanities. Liu, in short, seeks to give us the digital humanities in action, and so he sites critique amid the evidentiary details of data sets and databases and algorithms, as well as literary historical interpretation and disciplinary knowledge.[2]
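(What might “reading citation networks” look like in practice? The sketch below is offered as an illustration of the general idea only: the citing/cited pairs are invented for the purpose, loosely echoing names from this essay’s works cited, and are not data that Liu or anyone else has analyzed.

    # A minimal citation-network sketch using the networkx library.
    # Edges run from citing work to cited work; all pairs are hypothetical.
    import networkx as nx

    citations = [
        ("Kirschenbaum 2014", "Liu 2013"),
        ("Kirschenbaum 2014", "Jockers 2013"),
        ("Liu 2013", "Jockers 2013"),
        ("Gibbs 2011", "Liu 2013"),
    ]
    G = nx.DiGraph(citations)

    # In-degree: how often each work is cited within this tiny network.
    print(sorted(G.in_degree(), key=lambda pair: pair[1], reverse=True))

    # PageRank weights a citation more when the citing work is itself cited.
    print(sorted(nx.pagerank(G).items(), key=lambda pair: pair[1], reverse=True))

Applied to a real bibliography, the same two measurements begin to expose which papers, tools, and data sets a field materially depends on; that is the kind of dependency Liu traces by hand in following the HTOED thread.)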
In previous essays, I’ve described digital humanities as both a “methodological outlook” (“What Is”) and as a “tactical term” (“DH As/Is”). In closing, I will be as plain as I can be: we will never know what digital humanities “is” because we don’t want to know nor is it useful for us to know. John Unsworth, who may well have written the foundational naming document for digital humanities (given as a talk on May 25, 2001), introduced digital humanities as a “concession” arrived at for want of other, different terms. From that very day, we were already in the construct, a concession that exists to consolidate and propagate vectors of ambiguity, affirmation, and dissent.[3] Regardless, there is one thing that digital humanities ineluctably is: digital humanities is work, somebody’s work, somewhere, some thing, always. We know how to talk about work. So let’s talk about this work, in action, this actually existing work.[4]

Notes

1 This paragraph consolidates and paraphrases (but exaggerates hardly at all) a number of ongoing discourses around digital humanities, principally online. Those wishing to reconstruct the particular sources that inspired me (which are by no means coextensive with the totality of the “dark side” critique) are advised to consult the following. For digital humanities as big data ideology (and antitheoretical/historical/hermeneutical/critical), see the 2012–13 Twitter feeds of David Golumbia and Brian Lennon. See also the various entries in the “digital humanities” category on Golumbia’s Uncomputing blog. For digital humanities and MOOCs, see (if only as a starting point) Grusin. For digital humanities and the post-9/11 surveillance state, see (esp.) Lennon on Twitter. For digital humanities as managerial, see Allington. For Morozov worship, see (again) Golumbia and Lennon (Twitter). “Digital humanities never once inhaled” is from Alan Liu’s trenchant essay, “Where Is Cultural Criticism in the Digital Humanities?” For an extensive discussion around race, class, gender, sexuality, and disability—and the extent to which DH is or is not a refuge from them all—see Smith and Koh and Risam, including comments. Though this accounting is not exhaustive, a reader who spends any length of time with these sources (including also comments, replies, and the other dialogic features of online expression) will, I think, see voiced most if not all of the “terrible things” I seek here to address.

2 Fred Gibbs, reacting to an earlier essay of Liu’s, has also delineated the need for such a situated critique. He asserts, “digital humanities criticism needs to go beyond typical peer review and inhabit a genre of its own—a critical discourse, a kind of scholarship in its own right.”

3 That mere definitions of digital humanities are commonplace and easy to come by—Ashgate has now devoted a reader to collecting them—only accentuates the point. See Terras, Nyhan, and Vanhoutte.

4 A number of wise friends commented on an initial draft of this essay. I am grateful to them.

Works Cited

Allington, Daniel. “The Managerial Humanities; or, Why the Digital Humanities Don’t Exist.” Weblog entry. Daniel Allington. 31 Mar. 2013. http://www.danielallington.net/2013/03/the-managerial-humanities-or-why-the-digital-humanities-dont-exist/.
Chun, Wendy Hui Kyong. “The Dark Side of Digital Humanities--Part 1.” Center for 21st Century Studies. 9 Jan. 2013. http://www.c21uwm.com/2013/01/09/the-dark-side-of-the-digital-humanities-part-1/.
Germano, William (WmGermano). “The spectacular rise of ‘DH’ as the most powerful digraph in the non-STEM academy.” 19 Jan. 2013, 9:35 a.m. Tweet.
Gibbs, Fred. “Critical Discourse in Digital Humanities.” Journal of Digital Humanities 1.1 (2011). http://journalofdigitalhumanities.org/1-1/critical-discourse-in-digital-humanities-by-fred-gibbs/ (accessed 10 Oct. 2013).
Gold, Matthew, ed. Debates in the Digital Humanities. Minneapolis: U of Minnesota P, 2011.
Golumbia, David. Uncomputing. Weblog. http://www.uncomputing.org/?cat=235 (accessed 1 Oct. 2013).
---- (@dgolumbia). Twitter.
Grusin, Richard. “The Dark Side of Digital Humanities--Part 2.” Center for 21st Century Studies. 9 Jan. 2013. http://www.c21uwm.com/2013/01/09/the-dark-side-of-the-digital-humanities-part-2/.
Jockers, Matthew L. Macroanalysis: Digital Methods and Literary History. Urbana: U of Illinois P, 2013.
Kirschenbaum, Matthew. Comment on “Digital Humanities: Two Definitions.” Uncomputing. 21 Jan. 2013. http://www.uncomputing.org/?p=203&cpage=1#comment-2135.
----. “Digital Humanities As/Is a Tactical Term.” Gold 415–28.
----. “What Is Digital Humanities and What’s It Doing in English Departments?” ADE Bulletin 150 (2010). http://mkirschenbaum.files.wordpress.com/2011/03/ade-final.pdf (accessed 5 Aug. 2013).
Koh, Adeline, and Roopika Risam. “Open Thread: The Digital Humanities as a Historical ‘Refuge’ from Race/Class/Gender/Sexuality/Disability?” Weblog entry. Postcolonial Digital Humanities. 10 May 2013. http://dhpoco.org/blog/2013/05/10/open-thread-the-digital-humanities-as-a-historical-refuge-from-raceclassgendersexualitydisability/.
Konstantinou, Lee. Personal communication. 20 Jul. 2013, 8:46 p.m.
Lennon, Brian. Comment on “Digital Humanities: Two Definitions.” Uncomputing. 21 Jan. 2013. http://www.uncomputing.org/?p=203&cpage=1#comment-2129.
---- (@cesgnal). Twitter.
Liu, Alan. “The Meaning of the Digital Humanities.” PMLA 128.2 (March 2013): 409–23.
----. “Where Is Cultural Criticism in the Digital Humanities?” Gold, 2011.
McHale, Brian. Postmodernist Fiction. New York and London: Routledge, 1987.
Morozov, Evgeny. The Net Delusion: The Dark Side of Internet Freedom. New York: PublicAffairs, 2011.
Pannapacker, William. “The MLA and the Digital Humanities.” Chronicle of Higher Education 28 Dec. 2009. http://chronicle.com/blogPost/The-MLAthe-Digital/19468/.
Raley, Rita. “The Dark Side of Digital Humanities--Part 4.” Center for 21st Century Studies. 9 Jan. 2013. http://www.c21uwm.com/2013/01/09/the-dark-side-of-the-digital-humanities-part-4/.
Ramsay, Stephen. “DH Types One and Two.” Weblog entry. Stephen Ramsay. 5 May 2013.
Robinson, Peter. “Why Digital Humanists Should Get Out of Textual Scholarship. And If They Don’t, Why We Textual Scholars Should Throw Them Out.” Scholarly Digital Editions 29 Jul. 2013. http://scholarlydigitaleditions.blogspot.com/2013/07/why-digital-humanists-should-get-out-of.html.
Smith, Martha Nell. “The Human Touch Software of the Highest Order: Revisiting Editing as Interpretation.” Textual Cultures: Texts, Contexts, Interpretation 2.1 (2007): 1–15.
Terras, Melissa, Julianne Nyhan, and Edward Vanhoutte, eds. Defining Digital Humanities: A Reader. Farnham, Surrey: Ashgate, 2013.
Unsworth, John. “A Master’s Degree in Digital Humanities: Part of the Media Studies Program at the University of Virginia.” 2001 Congress of the Social Sciences and Humanities. Université Laval, Québec, Canada. Lecture. 25 May 2001. http://people.lis.illinois.edu/~unsworth/laval.html.

work_dtqpmklv3fgkhlutvc4a3ukmze ----

SurveyMonkey Analyze - Export
LIBER DH & DCH Working group - use case survey

#1 COMPLETE

Q1 Name: Demmy Verbeke
Q2 Library: KU Leuven Libraries, Artes
Q3 Is it OK if we share this use case (including the name of library)? Yes, you may publish it on the LIBER website and please include my name as contact person
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: research data management support; teaching about data science, RDM, databases, etc.; support for the preparation of project proposals with a digital component; R&D projects initiated by the library (OCR, NER and Linked Data)
Q5 Please describe the positive aspects of this activity: re-establish the library as a central partner in research
Q6 Please describe what could be going better concerning this activity: not all library staff has received the necessary training, so there is a big train-the-trainer need
Q7 How long have you been doing this activity in your library? 3-5 years
Q8 How many people of your library are involved in the activity? 6-10
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library’s policy
Q10 Do you have a dedicated budget for this activity? No
Q11 Are you assessing the impact of your work? No; not exactly clear how we would do this
Q12 What kind of collection are you using in the activity? A born-digital collection we have a license to; A digitised collection we have a license to; A born-digital collection we created and curate; A digitised collection we created and curate; Metadata
Q13 How is the data you are using licensed? Public domain, CC0, any other CC-license, copyrighted
Q14 How did you find/build a relationship with the researchers working in this activity? Training / outreach events; Physical space – facilitating/hosting; Other forums – committees, faculty liaison, others; Through existing relationships
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Advisory / consultation roles; Skills training and development
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.)
Q17 Did you follow or offer any training for librarians as part of this activity? Yes, namely RDM training, OA training, copyright training
Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to a personal professional development programme; It belongs to a library-wide training programme; It belongs to the scope of this activity
Q19 How aware are academics in your institution of the DH activities the library is active in? Aware (they know (some of) the activities the library is active in)
Q20 If the library promotes the activity, where does it do so? Articles in journals, conferences, own website
Q21 Where can we find any extra information about the activity? Please list all the links to websites, news items, blog posts, articles or any other communication about the activity: https://bib.kuleuven.be/english/research; and esp.
https://bib.kuleuven.be/english/research/digital-humanities; see also http://www.tijdschriftkarakter.be/de-vrije-hand-de-valorisatie-van-digitaal-onderzoek-in-de-menswetenschappen/; https://lirias2repo.kuleuven.be/bitstream/id/428482/; https://lirias2repo.kuleuven.be/bitstream/id/485856/; http://www.northernrenaissance.org/renaissance-studies-digital-humanities-and-the-library/; https://www.digitisation.eu/tools-evaluation-university-library-ku-leuven/; https://acrl.ala.org/dh/2014/08/06/opportunistic-librarian/; https://lirias2repo.kuleuven.be/bitstream/id/269607/

#2 COMPLETE

Q1 Name: Andreas Degkwitz
Q2 Library: Library of the Humboldt University Berlin
Q3 Is it OK if we share this use case (including the name of library)? Yes, you may publish it on the LIBER website and please include my name as contact person
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: Digitising the fairy tale collection within the private library of Jacob and Wilhelm Grimm and making the annotations in these books searchable. We started to prepare the project and an application to the German Research Foundation.
Q5 Please describe the positive aspects of this activity: The fairy tale collection - like the entire private library of the two Grimm brothers - is relevant for DH projects because of the annotations, comments and remarks made by Jacob and Wilhelm Grimm and other researchers of the 19th century. The corpus of books covers about 600 volumes (10% of the private library). We have to apply for funds at the German Research Foundation so that we can run the activity as a project.
Q6 Please describe what could be going better concerning this activity: The cooperation between the library and the Grimm researchers (Grimm-Arbeitsstelle) of the university is going very well. We have had (and still have) some problems finding the necessary technical expertise to identify the annotations in the full texts automatically and to make these elements searchable. Our own competence is too limited to realise the project alone. We need technical cooperation with an appropriate institution.
Q7 How long have you been doing this activity in your library? 1-2 years
Q17 Did you follow or offer any training for librarians as part of this activity? Yes, namely Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to a personal professional development programme 5 / 67 LIBER DH & DCH Working group - use case survey Q19 How aware are academics in your institution of the DH activities the library is active in? Aware (They know (some of) the activities the library is active in) Q20 If the library promotes the activity, where does it do so? Articles in journals, Conferences, Own website Q21 Where can we find any extra information about the activity? Please list all the links to websites, news items, blog posts, articles or any other communication about the activity The project is in the phase of preparation. Therefore no project site is available yet. Here are some links about our activities and collections: - https://www.digi-hub.de/viewer/ - https://www.digi-hub.de/viewer/sammlungen/ 6 / 67 LIBER DH & DCH Working group - use case survey Q1 Name Sinéad Keogh Q2 Library University of Limerick Q3 Is it OK if we share this use case (including the name of library)? Yes, you may publish it on the LIBER website and please include my name as contact person Q4 Please describe one of the activities that you are doing in your library that you would define as DH We created an online exhibition following the lives of one family throughout the First World War. Using the letters, diaries and photographs from the Armstrong Family archive, we add a new post each week to show their lives 100 years ago that week. Q5 Please describe the positive aspects of this activity Engagement with the archive has dramatically increased, with subscribers, followers and likes on the site and social media platforms. We have also had people contact us to give further information, to correct errors and even to contribute material to the archive. Q6 Please describe what could be going better concerning this activity With the WW1 centenary commemorations approaching, we really wanted to do this project but the resources were not available. So it became a labour of love to do it. More staff and time resources would have been useful. Q7 How long have you been doing this activity in your library? 3-5 years Q8 How many people of your library are involved in the activity? 2-5 #3#3 COMPLETECOMPLETE Collector:Collector: Started:Started: Last Modified:Last Modified: Time Spent:Time Spent: W Page 1: LIBER DH & DCH WG Use case survey 7 / 67 LIBER DH & DCH Working group - use case survey Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? No, it is an ad-hoc activity Q10 Do you have a dedicated budget for this activity? No Q11 Are you assessing the impact of your work? If yes, how? , Google analytics, social media, publications Please specify: Q12 What kind of collection are you using in the activity? A digitised collection we created and curate Q13 How is the data you are using licensed? Copyrighted Q14 How did you find/built a relationship with the researchers working in this activity? Online Presence – DH Lab/Portal, Social Media , Physical Space – Facilitating/Hosting , Other forums – committees, faculty liaison, others? Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content / collections , Skills training and development Q16 What were the most significant skill-gaps you identified for this activity? 
Hard skills (coding, tools, etc.) Q17 Did you follow or offer any training for librarians as part of this activity? Yes, namely , html, php, digitisation Please specify: Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to a personal professional development programme 8 / 67 LIBER DH & DCH Working group - use case survey Q19 How aware are academics in your institution of the DH activities the library is active in? Not aware (They have no idea the library does anything related to DH) Q20 If the library promotes the activity, where does it do so? Articles in journals, Conferences, Own website, Partner's website, Social media, Blog posts Q21 Where can we find any extra information about the activity? Please list all the links to websites, news items, blog posts, articles or any other communication about the activity http://longwaytotipperary.ul.ie/ https://www.facebook.com/ArmstrongFamilyMoyaliffe https://twitter.com/ww1ul 9 / 67 LIBER DH & DCH Working group - use case survey Q1 Name Anda Baklāne Q2 Library National Library of Latvia Q3 Is it OK if we share this use case (including the name of library)? Yes, you may publish it on the LIBER website and please include my name as contact person Q4 Please describe one of the activities that you are doing in your library that you would define as DH Provide collections of text files for researchers ("corpus on demand") Q5 Please describe the positive aspects of this activity (1) There is a growing interest among researchers, we can see potential in this activity. (2) Allows to build upon already existing digital resources/services. Q6 Please describe what could be going better concerning this activity Unclear long-term funding solutions. Q7 How long have you been doing this activity in your library? Under a year Q8 How many people of your library are involved in the activity? 2-5 Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library’s policy #4#4 COMPLETECOMPLETE Collector:Collector: Started:Started: Last Modified:Last Modified: Time Spent:Time Spent: W Page 1: LIBER DH & DCH WG Use case survey 10 / 67 LIBER DH & DCH Working group - use case survey Q10 Do you have a dedicated budget for this activity? No Q11 Are you assessing the impact of your work? If yes, how? , Not yet, but we plan to do so. Please specify: Q12 What kind of collection are you using in the activity? A digitised collection we created and curate Q13 How is the data you are using licensed? Public domain Q14 How did you find/built a relationship with the researchers working in this activity? Training / outreach events , Online Presence – DH Lab/Portal, Social Media , Physical Space – Facilitating/Hosting , Through existing relationships Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content / collections , Advisory / consultation roles , Skills training and development Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.), Everything starting from understanding what kinds of files/data are needed, how to deliver them, what advice to give about the further handling and computation of data. If yes, please specify which skill gaps you indentified: 11 / 67 LIBER DH & DCH Working group - use case survey Q17 Did you follow or offer any training for librarians as part of this activity? 
Yes, namely , This summer we are organizing a 4 day course at the library (with invited lecturers). Try to learn from colleagues in conferences, seminars, by asking for advice. Please specify: Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to the scope of this activity Q19 How aware are academics in your institution of the DH activities the library is active in? Aware (They know (some of) the activities the library is active in) Q20 If the library promotes the activity, where does it do so? Conferences, Own website, Partner's website, Social media Q21 Where can we find any extra information about the activity? Please list all the links to websites, news items, blog posts, articles or any other communication about the activity http://www.digitalhumanities.lv/ http://www.digitalhumanities.lv/BSSDH/ https://www.lnb.lv/en/researchers/digital-humanities 12 / 67 LIBER DH & DCH Working group - use case survey Page 1: LIBER DH & DCH WG Use case survey Q1 Name Q2 Library Stockholm University Library Q3 Is it OK if we share this use case (including the name of library)? Yes, you may publish it on the LIBER website, but please do not include my name Q4 Please describe one of the activities that you are doing in your library that you would define as DH Digitization (scanning and, in some cases, optical character recognition of printed materials) Q5 Please describe the positive aspects of this activity Increases findability and availability of research materials. Full-text scans also allow for various types of new inquiry (text-mining, etc.). Q6 Please describe what could be going better concerning this activity A long-term plan with dedicated funding. We could also benefit from more detailed workflows. Q7 How long have you been doing this activity in your library? More than 5 years Q8 How many people of your library are involved in the activity? 2-5 Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? No, it is an ad-hoc activity #5#5 COMPLETECOMPLETE Collector:Collector: Started:Started: Last Modified:Last Modified: Time Spent:Time Spent: W 13 / 67 LIBER DH & DCH Working group - use case survey Q10 Do you have a dedicated budget for this activity? Yes, with internal funding Q11 Are you assessing the impact of your work? If yes, how? , We have some limited statistics on numbers of views/downloads, but only for certain items, and these statistics are not collected systematically. Please specify: Q12 What kind of collection are you using in the activity? A born-digital collection we have a license to , A digitised collection we have a license to , A digitised collection we created and curate Q13 How is the data you are using licensed? Public domain, Copyrighted Q14 How did you find/built a relationship with the researchers working in this activity? Other forums – committees, faculty liaison, others? , Through existing relationships Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content / collections Q16 What were the most significant skill-gaps you identified for this activity? Soft skills (communication, project management, etc.) Q17 Did you follow or offer any training for librarians as part of this activity? No, because, Our digitization personnel were self-taught. Please specify: Q18 If you have followed or offered any DH training for librarians, how is it organised? 
Not applicable, we did not follow or offer any training 14 / 67 LIBER DH & DCH Working group - use case survey Q19 How aware are academics in your institution of the DH activities the library is active in? Not aware (They have no idea the library does anything related to DH) Q20 If the library promotes the activity, where does it do so? Own website Q21 Where can we find any extra information about the activity? Please list all the links to websites, news items, blog posts, articles or any other communication about the activity SU Library Digitization Information Site: https://www.su.se/english/library/research-support/digitalisation SU Map Collections (in Swedish): https://kartavdelningen.sub.su.se/kartrummet/ Digitized SU Special Collections in LIBRIS (federated catalog in Sweden): http://libris.kb.se/hitlist? d=libris&q=db%3aDIGI+images.sub.su.se&f=simp&spell=true&hist=true&p=1 Digitized dissertations from SU (in DiVA, federated infrastructure for academic publications in Sweden): http://su.diva- portal.org/smash/resultList.jsf?dswid=-3428&language=sv&searchType=RESEARCH&query=&af=[]&aq=[[]]&aq2= [[{%22dateIssued%22%3A{%22from%22%3A%221906%22%2C%22to%22%3A%222003%22}}%2C{%22publicationTypeCode%22%3 A[%22monographDoctoralThesis%22%2C%22comprehensiveDoctoralThesis%22]}]]&aqe= []&noOfRows=50&sortOrder=author_sort_asc&sortOrder2=title_sort_asc&onlyFullText=false&sf=all 15 / 67 LIBER DH & DCH Working group - use case survey Q1 Name Lotte Wilms Q2 Library KB, National Library of the Netherlands Q3 Is it OK if we share this use case (including the name of library)? Yes, you may publish it on the LIBER website and please include my name as contact person Q4 Please describe one of the activities that you are doing in your library that you would define as DH We run a researcher-in-residence programme where we invite an early career researcher to join the library for 6 months for 0,5 fte to do a research project with our digital collections. We offer help from a programmer, advisor, and collection specialists. Q5 Please describe the positive aspects of this activity We build our network in the academic community, we learn a great deal from how researchers work with our collection and how we can improve access to and our collections for them. Q6 Please describe what could be going better concerning this activity Last year we had very few proposals, so this year we decided to open the call for proposals for a longer period, invite more researchers and set up consultation slots. Q7 How long have you been doing this activity in your library? 3-5 years Q8 How many people of your library are involved in the activity? 6-10 #7#7 COMPLETECOMPLETE Collector:Collector: Started:Started: Last Modified:Last Modified: Time Spent:Time Spent: W Page 1: LIBER DH & DCH WG Use case survey 19 / 67 LIBER DH & DCH Working group - use case survey Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library’s policy Q10 Do you have a dedicated budget for this activity? Yes, with internal funding Q11 Are you assessing the impact of your work? If yes, how? , We recently did an evaluation and interviewed previous participants. Please specify: Q12 What kind of collection are you using in the activity? A born-digital collection we created and curate , A digitised collection we created and curate , Metadata Q13 How is the data you are using licensed? 
Public domain, Copyrighted Q14 How did you find/built a relationship with the researchers working in this activity? Online Presence – DH Lab/Portal, Social Media , Other forums – committees, faculty liaison, others? , Through existing relationships , Other (please specify): Call for proposals Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content / collections , Advisory / consultation roles , Other services or technical expertise (please specify): Tool development 20 / 67 LIBER DH & DCH Working group - use case survey Q16 What were the most significant skill-gaps you identified for this activity? None Q17 Did you follow or offer any training for librarians as part of this activity? No, because Q18 If you have followed or offered any DH training for librarians, how is it organised? Not applicable, we did not follow or offer any training Q19 How aware are academics in your institution of the DH activities the library is active in? Aware (They know (some of) the activities the library is active in) Q20 If the library promotes the activity, where does it do so? Articles in journals, Conferences, Own website, Partner's website, Social media, Blog posts Q21 Where can we find any extra information about the activity? Please list all the links to websites, news items, blog posts, articles or any other communication about the activity https://www.kb.nl/en/organisation/research-expertise/researcher-in-residence http://lab.kb.nl/about-us/blog https://arbido.ch/de/ausgaben-artikel/2017/zusammenarbeit/the-researcher-in-residence-programme-at-the-kb-national-library-of-the- netherlands 21 / 67 LIBER DH & DCH Working group - use case survey Q1 Name Kirsty Lingstadt Q2 Library University of Edinburgh Q3 Is it OK if we share this use case (including the name of library)? Yes, you may publish it on the LIBER website and please include my name as contact person Q4 Please describe one of the activities that you are doing in your library that you would define as DH We are working on improving accuracy of OCR for mid 17th to mid 19th Century Scottish printed text with a focus on legal material. As part of this work we are looking at automated metadata extraction including places and people etc and are developing tools to facilitate this. Q5 Please describe the positive aspects of this activity Though project specific the methods and tools can be more widely applied and will take us further forward with the development of digital humanities tools required for students and researchers. Q6 Please describe what could be going better concerning this activity The challenge is funding as just now we are working on project specific funding. Once this comes to an end we are looking for how we can ensure that we continue this work. Q7 How long have you been doing this activity in your library? 1-2 years Q8 How many people of your library are involved in the activity? 2-5 #9#9 COMPLETECOMPLETE Collector:Collector: Started:Started: Last Modified:Last Modified: Time Spent:Time Spent: W Page 1: LIBER DH & DCH WG Use case survey 25 / 67 LIBER DH & DCH Working group - use case survey Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library’s policy Q10 Do you have a dedicated budget for this activity? Yes, with external funding Q11 Are you assessing the impact of your work? If yes, how? 
, Measuring accuracy of tools and processes as well as the ability to use these for other projects. Please specify: Q12 What kind of collection are you using in the activity? Other (please specify): Collection we are digitising and are making available CC-BY Q13 How is the data you are using licensed? Any other CC- license Q14 How did you find/built a relationship with the researchers working in this activity? Other (please specify): Currently working to build relationships with researchers. Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content / collections , Digital storage / preservation / hosting , Other services or technical expertise (please specify): Tools for metadata extraction. Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.), Also attracting researchers to the content If yes, please specify which skill gaps you indentified: Q17 Did you follow or offer any training for librarians as part of this activity? No, because, More general training has been offered to Library staff eg Library Software Carpentry Please specify: 26 / 67 LIBER DH & DCH Working group - use case survey Q18 If you have followed or offered any DH training for librarians, how is it organised? Other (please specify): Wider DH training being delivered within the University and the library is engaged in developing some of the programming Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (They know that the library does something, but are not sure what) Q20 If the library promotes the activity, where does it do so? Conferences, Own website, Social media, Blog posts Q21 Where can we find any extra information about the activity? Please list all the links to websites, news items, blog posts, articles or any other communication about the activity Website area not live yet - due to go live late July 2018. 27 / 67 LIBER DH & DCH Working group - use case survey Q1 Name Per Cullhed Q2 Library Uppsala university library Q3 Is it OK if we share this use case (including the name of library)? Yes, you may publish it on the LIBER website and please include my name as contact person Q4 Please describe one of the activities that you are doing in your library that you would define as DH The hosting of a digital repository (Alvin) and services to promote the use of its content. Q5 Please describe the positive aspects of this activity The creation of digital media for use in for example DH without the subscription costs associated with bought digital media Q6 Please describe what could be going better concerning this activity Growth is slowed down due to the need of maintaining traditional library services Q7 How long have you been doing this activity in your library? More than 5 years Q8 How many people of your library are involved in the activity? More than 10 Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library’s policy #10#10 COMPLETECOMPLETE Collector:Collector: Started:Started: Last Modified:Last Modified: Time Spent:Time Spent: W Page 1: LIBER DH & DCH WG Use case survey 28 / 67 LIBER DH & DCH Working group - use case survey Q10 Do you have a dedicated budget for this activity? Yes, with a combination of internal and external funding Q11 Are you assessing the impact of your work? If yes, how? 
Download statistics and other knowledge of usage.
Q12 What kind of collection are you using in the activity? A born-digital collection we have a licence to; a digitised collection we have a licence to; a digitised collection we created and curate; metadata.
Q13 How is the data you are using licensed? Public domain; CC0; any other CC licence; copyrighted.
Q14 How did you find/build a relationship with the researchers working in this activity? Training/outreach events; online presence (DH lab/portal, social media); other forums (committees, faculty liaison, others); through existing relationships.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections; advisory/consultation roles; skills training and development; other services or technical expertise: use of the web for building contexts based on digital content.
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.); there is a general lack of combined programming and subject skills.
Q17 Did you follow or offer any training for librarians as part of this activity? Yes, namely promoting the new competences needed (in many ways).
Q18 If you have followed or offered any DH training for librarians, how is it organised? Other: lectures, workshops, study visits and regular courses.
Q19 How aware are academics in your institution of the DH activities the library is active in? Aware (they know some of the activities the library is active in).
Q20 If the library promotes the activity, where does it do so? Articles in journals; conferences; own website; social media; blog posts; other: lectures.
Q21 Where can we find any extra information about the activity? www.alvin-portal.org ; http://ub.uu.se/om-biblioteket/forum-for-digital-humaniora ; https://liber2016.org/wp-content/uploads/2015/10/WS7_Cullhed_Digital-Humanities-Forum.pdf ; http://www.ub.uu.se/anvand-biblioteket/labb/

Q1 Name: Caleb Derven
Q2 Library: Glucksman Library, University of Limerick
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: Building and deploying a digital library service.
Q5 Please describe the positive aspects of this activity: Offering a new service to the research community in our university; engaging with new technologies; creating a (hopefully) sustainable digital scholarship tool.
Q6 Please describe what could be going better concerning this activity: More dedicated staff for creating resources; full-time developer resources.
Q7 How long have you been doing this activity in your library? 3-5 years.
Q8 How many people of your library are involved in the activity? 6-10.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy?
Yes, it is part of the library's policy.
Q10 Do you have a dedicated budget for this activity? Yes, with internal funding.
Q11 Are you assessing the impact of your work? If yes, how? We regularly assess analytic metrics related to the use of the resource.
Q12 What kind of collection are you using in the activity? A digitised collection we created and curate; metadata.
Q13 How is the data you are using licensed? CC0; any other CC licence; copyrighted.
Q14 How did you find/build a relationship with the researchers working in this activity? Scoping tools (environmental scans, surveys); training/outreach events; other forums (committees, faculty liaison, others); through existing relationships.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections; digital storage/preservation/hosting; advisory/consultation roles; skills training and development.
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.).
Q17 Did you follow or offer any training for librarians as part of this activity? Yes, namely metadata training.
Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to the scope of this activity.
Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (they know that the library does something, but are not sure what).
Q20 If the library promotes the activity, where does it do so? Not applicable, the library does not promote the activity.
Q21 Where can we find any extra information about the activity? Respondent skipped this question.

Q1 Name: Clara Riera
Q2 Library: Universitat Oberta de Catalunya
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: Giving training and consultancy to DH researchers for digital projects.
Q5 Please describe the positive aspects of this activity: Interdisciplinary; connecting inside research to outside infrastructure.
Q6 Please describe what could be going better concerning this activity: More time at the beginning of the research project to spend with library research services.
Q7 How long have you been doing this activity in your library? Under a year.
Q8 How many people of your library are involved in the activity? 2-5.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? No, it is an ad-hoc activity.
Q10 Do you have a dedicated budget for this activity? No.
Q11 Are you assessing the impact of your work? If no, why not?
At the moment the pilot is in an initial phase.
Q12 What kind of collection are you using in the activity? Other: catalogues.
Q13 How is the data you are using licensed? CC0; copyrighted.
Q14 How did you find/build a relationship with the researchers working in this activity? Training/outreach events; through existing relationships.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections; digital storage/preservation/hosting.
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.).
Q17 Did you follow or offer any training for librarians as part of this activity? Yes, namely we took part in a consortium course.
Q18 If you have followed or offered any DH training for librarians, how is it organised? Not applicable, we did not follow or offer any training.
Q19 How aware are academics in your institution of the DH activities the library is active in? Not aware (they have no idea the library does anything related to DH).
Q20 If the library promotes the activity, where does it do so? Not applicable, the library does not promote the activity.
Q21 Where can we find any extra information about the activity? http://transfer.rdi.uoc.edu/en/node/42020 (project website); http://www.ub.edu/openscienceandthehumanities/2018/04/12/social-networks-of-the-past-mapping-hispanic-and-lusophone-literary-modernity-1898-1959/ (conference presenting the research project).

Q1 Name: Susan Halfpenny
Q2 Library: University of York
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: Digital Creativity Week (https://www.york.ac.uk/library/news/2018/digital-creativity/): a week-long event working with archival data in creative ways. Students spent the week looking at data from the Yorkshire Dictionary project (https://www.york.ac.uk/borthwick/projects/yorkshire-dictionary/). Activities included data cleaning and visualisation, image and audio editing, and coding. Students showcased their outputs to staff at an event.
Q5 Please describe the positive aspects of this activity: Collaborative endeavour: staff from the Library, IT Services and Archives collaborated to host the event and provide a range of activities. Damien Murphy, the research champion for Creativity, supported the development of the event and provided useful academic contacts. Archival data: Yorkshire Dictionary data provided by the Archives enabled students to get involved in a current project. Students used this as a source of inspiration and also picked up other data sources as they progressed with the project. Creative digital output: students produced a digital story linked to the data they had explored. The project ended with them presenting their work in the 360 space to enable an immersive experience for the audience, which incorporated a visual presentation, audio and AR triggers.
Q6 Please describe what could be going better concerning this activity: This is the first year we have run this event; however, the plan is to run the event again in 2019. Some of the considerations for going forward include: Structure of the week: students came up with their own ideas and the output wasn't fixed; more direction and structure would be worthwhile in the future to enable students to fix on an idea for their projects earlier in the week. Marketing and communications: we had a lot of students drop out of the event, which meant we only had 5 participants; in future events, earlier communications and over-recruitment would be considered to enable us to work with a bigger group. Smaller events: as well as running a week-long event, we would like to run more one-off events to enable us to reach more staff and students.
Q7 How long have you been doing this activity in your library? Under a year.
Q8 How many people of your library are involved in the activity? 6-10.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? No, it is an ad-hoc activity.
Q10 Do you have a dedicated budget for this activity? Yes, with internal funding.
Q11 Are you assessing the impact of your work? If yes, how? Participant feedback.
Q12 What kind of collection are you using in the activity? Other: archive data.
Q13 How is the data you are using licensed? Copyrighted.
Q14 How did you find/build a relationship with the researchers working in this activity? Through existing relationships.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Advisory/consultation roles; skills training and development.
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.).
Q17 Did you follow or offer any training for librarians as part of this activity? No, because staff involved had the skills to support the activity.
Q18 If you have followed or offered any DH training for librarians, how is it organised? Not applicable, we did not follow or offer any training.
Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (they know that the library does something, but are not sure what).
Q20 If the library promotes the activity, where does it do so? Own website; social media; blog posts.
Q21 Where can we find any extra information about the activity? https://www.york.ac.uk/library/news/2018/digital-creativity/

Q1 Name: Despoina Gkogkou
Q2 Library: Library and Information Center, University of Patras
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: We hold several digital collections (digitised periodicals, books, manuscripts and other archival materials).
We also run a digital publishing platform (open-access serials on several humanities and social science disciplines). The library hosted a series of DH seminars on literary networks organised by the Department of Mathematics, and a meeting on DH in Greece by the Department of Philology.
Q5 Please describe the positive aspects of this activity: Our collections of digitised press of the 19th and 20th centuries are considered among the most important in Greece, being used widely by scholars and researchers. The content of our collections is valuable for Greek and comparative philologists, historians of the press and archivists, and the activity has given the library a stimulus towards digital humanities.
Q6 Please describe what could be going better concerning this activity: It would be of great benefit if we could enhance the processing of our data (OCR, text-mining techniques).
Q7 How long have you been doing this activity in your library? More than 5 years.
Q8 How many people of your library are involved in the activity? 1.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? No, it is an ad-hoc activity.
Q10 Do you have a dedicated budget for this activity? No.
Q11 Are you assessing the impact of your work? If yes, how? Through our in-person relations and communication with the academics in our university who mostly use the collections.
Q12 What kind of collection are you using in the activity? A digitised collection we have a licence to; a born-digital collection we created and curate; a digitised collection we created and curate; metadata.
Q13 How is the data you are using licensed? Public domain; CC0; any other CC licence.
Q14 How did you find/build a relationship with the researchers working in this activity? Training/outreach events; physical space (facilitating/hosting); through existing relationships.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections; advisory/consultation roles; skills training and development.
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.).
Q17 Did you follow or offer any training for librarians as part of this activity? Yes, namely offering training and guidance to interns adding content and metadata.
Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to a personal professional development programme.
Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (they know that the library does something, but are not sure what).
Q20 If the library promotes the activity, where does it do so? Articles in journals; conferences; partner's website; social media.
Q21 Where can we find any extra information about the activity?
https://library.upatras.gr/english (general information); https://library.upatras.gr/digital (digital collections, in Greek only).

Q1 Name: Ioannis Clapsopoulos
Q2 Library: University of Thessaly Library & Information Centre
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: Our main activity in DH is through a recently initiated research project titled "THESSALY MEMORY DOCUMENTATION and COMMUNICATION" [THE.ME.DO.COM], which constitutes the main part of the more wide-ranging "University of Thessaly Historical Archive Design and Implementation" project. THE.ME.DO.COM is a collaboration project, with a research team comprising Central Library [http://www.lib.uth.gr/LWS/en/en_hp.asp] staff and teaching-research staff members of the University's Department of History, Archaeology and Social Anthropology (http://www.ha.uth.gr/setlanguage.php?lang=en). An initial project webpage (in Greek and English), including a progress log, is maintained on the ResearchGate platform: http://bit.ly/2uyAqbH
Q5 Please describe the positive aspects of this activity: Same description as given under Q4.
Q6 Please describe what could be going better concerning this activity: Since the THE.ME.DO.COM project is at an early stage of implementation, its progress will be evaluated for the first time by the end of 2018.
Q7 How long have you been doing this activity in your library? Under a year.
Q8 How many people of your library are involved in the activity? 2-5.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library's policy; yes, it is part of the parent organisation's policy.
Q10 Do you have a dedicated budget for this activity? Yes, with a combination of internal and external funding.
Q11 Are you assessing the impact of your work? If no, why not? The first impact assessment will be done within 2019.
Q12 What kind of collection are you using in the activity? A born-digital collection we created and curate; a digitised collection we created and curate; metadata.
Q13 How is the data you are using licensed? CC0; any other CC licence.
Q14 How did you find/build a relationship with the researchers working in this activity? Other forums (committees, faculty liaison, others);
through existing relationships.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections; skills training and development.
Q16 What were the most significant skill-gaps you identified for this activity? None; the THE.ME.DO.COM project is at an early stage of implementation, so no skill-gaps have been identified yet.
Q17 Did you follow or offer any training for librarians as part of this activity? No, not yet: required training needs will be evaluated by the end of 2018.
Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to the scope of this activity.
Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (they know that the library does something, but are not sure what).
Q20 If the library promotes the activity, where does it do so? Own website; social media; other: articles in journals and conference presentations will follow as the project evolves.
Q21 Where can we find any extra information about the activity? http://bit.ly/2uyAqbH : initial THE.ME.DO.COM project website (on the ResearchGate platform); a dedicated project website will be published by October 2018 (social media project accounts, i.e. Twitter, Facebook and Instagram, will also be published by October 2018). http://bit.ly/2LlmMSJ : a short PowerPoint presentation of the University of Thessaly Historical Archive (in Greek).

Q1 Name: Emmanuelle Bermès
Q2 Library: Bibliothèque nationale de France
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: The Corpus project, a 4-year internal research programme aimed at defining new services for researchers using digital collections (Gallica, web archives, metadata). Cf. c.bnf.fr/fom
Q5 Please describe the positive aspects of this activity: Iterations with researchers on their needs; concrete results using BnF collections; support from the library's direction.
Q6 Please describe what could be going better concerning this activity: Currently we struggle to get a clear view on the programming and budget of the physical space works. We are lacking dedicated staff.
Q7 How long have you been doing this activity in your library? 3-5 years.
Q8 How many people of your library are involved in the activity? 6-10.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library's policy.
Q10 Do you have a dedicated budget for this activity? Yes, with internal funding.
Q11 Are you assessing the impact of your work? If yes, how? Yearly reviews.
Q12 What kind of collection are you using in the activity?
A born-digital collection we created and curate; a digitised collection we created and curate; metadata.
Q13 How is the data you are using licensed? Any other CC licence.
Q14 How did you find/build a relationship with the researchers working in this activity? Training/outreach events; physical space (facilitating/hosting); through existing relationships; other: API website, api.bnf.fr
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections; advisory/consultation roles; skills training and development; other services or technical expertise: infrastructure, physical space (hosting research teams), tools to analyse the data.
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.).
Q17 Did you follow or offer any training for librarians as part of this activity? No, not yet, but we will.
Q18 If you have followed or offered any DH training for librarians, how is it organised? Not applicable, we did not follow or offer any training.
Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (they know that the library does something, but are not sure what).
Q20 If the library promotes the activity, where does it do so? Articles in journals; conferences; blog posts; other: workshops.
Q21 Where can we find any extra information about the activity? c.bnf.fr/fom ; Eleonora Moiraghi's report: hal-bnf.archives-ouvertes.fr/hal-01739730 ; BnF research blog: bnf.hypotheses.org/2809 , bnf.hypotheses.org/2299 , bnf.hypotheses.org/2214 (reports from the workshops); webcorpora.hypotheses.org

Q1 Name: Agnes Ponsati
Q2 Library: Spanish National Library
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: At the BNE we have created a portal, BNELAB, under which we will place the different activities we are developing to improve the use and re-use of our digital/print collections and the enrichment of our collection. Take a look at www.bne.es/bnelab
Q5 Please describe the positive aspects of this activity: Promotion of BNE heritage collections with different narratives, and also within new communities apart from the academic public.
Q6 Please describe what could be going better concerning this activity: n.a.
Q7 How long have you been doing this activity in your library? 1-2 years.
Q8 How many people of your library are involved in the activity? 2-5.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library's policy.
Q10 Do you have a dedicated budget for this activity? Yes, with external funding.
Q11 Are you assessing the impact of your work? If yes, how?
We follow up the impact through some analysis tools: the social media impact, and the impact in newspapers, radio and TV programmes.
Q12 What kind of collection are you using in the activity? A digitised collection we have a licence to; a digitised collection we created and curate; metadata.
Q13 How is the data you are using licensed? Public domain; any other CC licence.
Q14 How did you find/build a relationship with the researchers working in this activity? Online presence (DH lab/portal, social media); through existing relationships.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections; digital storage/preservation/hosting; advisory/consultation roles.
Q16 What were the most significant skill-gaps you identified for this activity? Soft skills (communication, project management, etc.).
Q17 Did you follow or offer any training for librarians as part of this activity? Yes.
Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to the scope of this activity.
Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (they know that the library does something, but are not sure what).
Q20 If the library promotes the activity, where does it do so? Own website; partner's website; social media; blog posts.
Q21 Where can we find any extra information about the activity? http://www.bne.es/bnelab/

Q1 Name: Helga Kardos
Q2 Library: Library of the Hungarian Parliament
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: Digitisation of parliamentary documents, legal books, journals and other textual materials; the Digitised Legislative Knowledge Base.
Q5 Please describe the positive aspects of this activity: To know the collection deeply; to protect the collection; to make the documents familiar to researchers and students.
Q6 Please describe what could be going better concerning this activity: Better and richer metadata (even in English). Language limits are narrowing the audience and the possibilities.
Q7 How long have you been doing this activity in your library? 3-5 years.
Q8 How many people of your library are involved in the activity? 6-10.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library's policy; yes, it is part of the parent organisation's policy.
Q10 Do you have a dedicated budget for this activity? Yes, with internal funding.
Q11 Are you assessing the impact of your work? If yes, how? With surveys.
Q12 What kind of collection are you using in the activity? A digitised collection we created and curate; metadata.
Q13 How is the data you are using licensed?
Public domain; Copyrighted.
Q14 How did you find/build a relationship with the researchers working in this activity? Online presence (DH lab/portal, social media); through existing relationships.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections.
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.).
Q17 Did you follow or offer any training for librarians as part of this activity? No, because of lack of money.
Q18 If you have followed or offered any DH training for librarians, how is it organised? Not applicable, we did not follow or offer any training.
Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (they know that the library does something, but are not sure what).
Q20 If the library promotes the activity, where does it do so? Articles in journals; own website; social media; blog posts.
Q21 Where can we find any extra information about the activity? dtt.ogyk.hu (digital collection of the library); ogyk.hu (website of the library).

Q1 Name: Karine Bacher-Eyroi
Q2 Library: Université Toulouse Capitole / Library Research Support Dept.
Q3 Is it OK if we share this use case (including the name of the library)? Yes, you may publish it on the LIBER website, and please include my name as contact person.
Q4 Please describe one of the activities that you are doing in your library that you would define as DH: Digitisation of research materials and publications; digital research publishing.
Q5 Please describe the positive aspects of this activity: Developing expertise and skills in our team; added-value projects highlighting the library in our university; working closely with researchers and research units as real partners and stakeholders in projects.
Q6 Please describe what could be going better concerning this activity: Better evaluation of the time needed; having a dedicated budget; upskilling the team on the technical aspects of DH; integrating DH in our organisation chart; better advocacy towards researchers and the university board; having an institutional policy on DH.
Q7 How long have you been doing this activity in your library? 1-2 years.
Q8 How many people of your library are involved in the activity? 2-5.
Q9 Is the activity being undertaken in an embedded program in your library? In other words, is it part of a policy? Yes, it is part of the library's policy.
Q10 Do you have a dedicated budget for this activity? No.
Q11 Are you assessing the impact of your work? If no, why not? The activity is quite recent in the library; we lack elements and evaluation tools. But it is a priority to develop the assessment of our work.
Q12 What kind of collection are you using in the activity? A born-digital collection we created and curate; a digitised collection we created and curate; metadata.
Q13 How is the data you are using licensed? Public domain; copyrighted.
Q14 How did you find/build a relationship with the researchers working in this activity?
Online presence (DH lab/portal, social media); through existing relationships; other: responding to research units' demands.
Q15 What would be the main topics describing your relationship with the researchers working in the activity? Digital content/collections; advisory/consultation roles; skills training and development; other services or technical expertise: expertise on tools, formats, standards and legislation.
Q16 What were the most significant skill-gaps you identified for this activity? Hard skills (coding, tools, etc.); the potential of TDM; advocacy and communication.
Q17 Did you follow or offer any training for librarians as part of this activity? No; we didn't really develop that for the moment, except for some presentations of specific tools. We use the programmes of professional training institutions.
Q18 If you have followed or offered any DH training for librarians, how is it organised? It belongs to the scope of this activity.
Q19 How aware are academics in your institution of the DH activities the library is active in? Somewhat aware (they know that the library does something, but are not sure what).
Q20 If the library promotes the activity, where does it do so? Conferences; own website; other: internal university board of research units meetings.
Q21 Where can we find any extra information about the activity? http://www.ut-capitole.fr/bibliotheques/

Responding libraries: KU Leuven Libraries; Library of the Humboldt University Berlin; University of Limerick; National Library of Latvia; Stockholm University Library; KB, National Library of the Netherlands; University of Edinburgh; Uppsala University Library; Glucksman Library, University of Limerick; Universitat Oberta de Catalunya; University of York; Library and Information Center, University of Patras; University of Thessaly Library & Information Centre; Bibliothèque nationale de France; Spanish National Library; Library of the Hungarian Parliament; Université Toulouse Capitole / Library Research Support Dept.
work_dyvz74kqyjb6hnzcol4jrnqr6y ----
http://www.dhsi.org/events.php
http://adho.org/administration/conference-coordinating-program-committee/adho-conference-code-conduct
http://eadh.org/about/diversity-and-inclusivity
https://ach.org/activities/advocacy/ach-statement-in-the-aftermath-of-the-2016-election/
https://csdh-schn.org/inclusivity-and-diversity-statement/
https://www.youtube.com/watch?v=t4DT3tQqgRM
https://implicit.harvard.edu/implicit/selectatest.html
https://docs.google.com/presentation/d/1JQH1RvKiNxZbOq6mzV8QKBCB8n_kqIPNrd3lj8JIsKc/edit?usp=sharing
Try it at home (or here, right now): https://privilege.huc.knaw.nl
Series celebrate repetition of method across lots of examples
● The important thing is not "What can we edit next?"
● Rather: − "Can we edit that?" − "Can we do something other than 'edit'?" − "What can we apply computation to next?" − "How does this affect our computation?"
Series celebrate diversity of problem rather than comprehensiveness
It is the variety of new problems, not the number of successful examples, that moves the field forward.
Variety of new problems
● McCarty and Short's image has boxes and bubbles, not columns and silos
● It is the way that the domains intersect through computing methods that is "the field"
✔ This is Digital Humanities
✔ This is (still) Digital Humanities
x This is a Special Interest Group for Latin Concordance Builders
A DH where everyone agrees with me is dead. A DH where everyone's like me is dying.
Three implications
1. It is possible to do digital work in the Humanities without doing "Digital Humanities":
● Use computation to advance historical work rather than use historical examples to advance our understanding of how to solve Humanities problems computationally
● e.g. a structurally marked-up transcription and edition of a straightforward medieval manuscript is (today) Medieval Studies, not Digital Humanities
2. Diversity (of problem) is more important than "Quality" (of work) if you are doing Digital Humanities
● DH began as a text-focussed discipline: databases, stylistics, and text-representation
● It is exciting because it isn't that any more: new subjects (text, images, 3D); new techniques (XML, GIS, crowdsourcing, wikis, visualisation, etc.); new arenas (academy, GLAM, popular, etc.); new people (scholars, crowd, journalists, citizen scientists, etc.)
3. It's not (just) a diversity of problem
− The flaw in McCarty and Short's diagram is that it assumes there is a single methodological commons: "Communications & Hypermedia"
Not just...
diversity (of problem)
● Why are some groups able to control attention and others not?
● How do (groups of) people differ in their relationship to technology?
● How do you do digital humanities differently in high- vs. low-bandwidth environments?
● How does digital scholarship differ when it is done by the colonised and the coloniser?
● How is what we discuss and research influenced by factors such as class, gender, race, age, social capital?
● Etc.
Conclusion
● DH depends on a supply of problems to continue its development
● Because it exists at the intersection of fields and involves the study of this intersection, its growth needs to be measured by its width rather than its bulk
● A DH that never got beyond a traditional interest in text, concordances and editing would be a DH that had died
● The same is true for a DH that cannot get beyond a narrow group of practitioners bringing a relatively limited set of problems... no matter how well "they" do it.
Funding: SSHRC • ADHO • HuC Humanities Cluster KNAW
work_dz6hv2rzkbhrvefe5gfu4v3yvu ----
For this second approach, nevertheless, rich collections of letters are mandatory, because cultural interpretations have to be tested against a large quantity of data that represents the norm followed by social actors, and a small quantity of exceptions that constituted possible marginal behaviours. At the Linguistics Centre of the University of Lisbon (CLUL), such a large collection is being assembled, the CARDS-FLY corpus, in order to serve both the needs of cultural historians and those of historical linguists. Historical linguistics is the study of language change through time, and original, non-literary sources are the preferred data for the description and interpretation of such change. Spontaneous oral utterances would be the ideal data, but since their retrieval is impossible for language as spoken in past centuries, the personal letter discourse is the next best candidate. It offers the linguist the recording of a behaviour carried out by interactive speakers with a more informal attitude than the one adopted by writers of literary or institutional texts.

The CARDS corpus (Cartas Desconhecidas – Unknown Letters) is a collection of 2,000 personal Portuguese letters written between the 16th and the 19th century. The ones dating from 1500 to 1800 were mainly seized by a religious court (the Portuguese Inquisition) as instrumental proof to prosecute individuals accused of heretical beliefs. As for the 19th-century ones, they were mainly seized by a Crown court (the Casa da Suplicação) as instrumental proof exhibited either by the prosecution or by the defence of individuals accused of anti-social or anti-political behaviour. The project ran from 2007 to 2010, carried out by a mixed team of historians and linguists. The role of the linguists was to decipher and publish the manuscripts with philological care in order to preserve their relevance as sources for the history of language variation and change. The role of the historians was to contextualise the letters' discourse as social events. The whole set of transcriptions, accompanied by a context summary, was given a machine-readable format, which allowed for the assemblage of an online Portuguese historical corpus of the Early Modern Age.

Following CARDS, the FLY project (Forgotten Letters, Years 1900–1974) was launched in 2010 by the same core team, now accompanied by modern history experts as well as sociologists. The aim was to enlarge the former corpus with data from the 20th century. Since collecting personal papers from contemporary times is a delicate task, given the need to guarantee the protection of private data from public scrutiny, the letters of the FLY project come mostly from donations by families willing to contribute to the preservation of Portuguese collective memory having to do with wars (World War I and the 1961–1974 colonial war), emigration, political prison, and exile. These were also favourable contexts for a high production of written correspondence with family and friends, because in such circumstances strong emotions such as fear, longing, and loneliness were bound to arise.

The CARDS-FLY corpus is thus a linguistic resource prepared for the historical study of Portuguese language and society. Its strength lies in its broad social representativeness: it is entirely composed of documents whose texts belong to the letter genre, the personal domain, and the informal linguistic register.5 The final goal is to have a total of 4,000 letters.
By May 2013, the team had already transcribed a total of 3,809 letters involving 2,286 different participants (82 per cent men, 18 per cent women) and around 1.1 million words. The digital encoding of the letters follows a set of guidelines prepared by the Flemish project DALF (Digital Archive of Letters in Flanders), based on the TEI P4 Guidelines.6 This encoding offers a machine-readable file format that allows for the philological care critical editions demand. The mark-up language is XML, and the label contents are those fixed by DALF for letter idiosyncrasies and by TEI for primary sources.7 The letter manuscripts were transcribed in a conservative way, and features such as unreadable parts, scratched-out parts or perforations in the letters are encoded explicitly in the XML mark-up. The spelling of the original document is also maintained, as this is relevant for the history of language change, a prospect that is always compromised when spelling normalisations are practised by editors. On the other hand, the lack of spelling normalisation creates a problem when the letters are seen as a target for corpus linguistics operations: morphological annotation, parsing, semantic annotation, concordancing, word lists, and keywords. Such a level of processing demands a corpus in standard spelling, a resource also invaluable for historians focusing on the discursive features that manifest themselves through keywords and semantic fields present in the corpus.8 As we intend to use the corpus for this purpose, we are in need of a normalised version. Manual spelling correction is a laborious and time-consuming effort, and we therefore decided to explore the possibilities of automatic normalisation. We have already carried out some exploratory experiments along that path.9 Here we first give a detailed analysis of how spelling varied and changed over time in our corpus, based on a statistical analysis of a sample taken from the CARDS-FLY corpus. Next we present some practical results of automatic spelling normalisation. We close with a discussion of the benefits and limits of using statistical methods for spelling normalisation, and find those benefits to be remarkable.
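To make the role of the explicit mark-up concrete, the sketch below parses a tiny TEI-style fragment and produces both a conservative and a normalised reading of it. The <unclear>, <del> and <gap> elements mirror the features named above (unreadable, scratched-out and perforated passages), and the <choice> element with <orig>/<reg> is one standard TEI way of keeping a normalised reading next to the conservative one; the fragment itself, and whether the project stores a normalised layer this way, are illustrative assumptions, since the real CARDS-FLY/DALF files follow the projects' own, much richer schema.

# Minimal sketch: processing an explicitly encoded letter fragment.
# The markup below is an assumption built from standard TEI devices,
# not an actual CARDS-FLY file.
import xml.etree.ElementTree as ET

SAMPLE = """<p>Escrevo-lhe <choice><orig>desta treste prizão</orig>
<reg>desta triste prisão</reg></choice>, <del>onde</del>
<unclear>aonde</unclear> me acho <gap reason="damage"/> sem remédio.</p>"""

def reading(elem, prefer="reg"):
    """Flatten a paragraph, choosing the original or the normalised layer."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            pick = child.find(prefer)
            if pick is not None:
                parts.append("".join(pick.itertext()))
        elif child.tag == "del":
            pass                      # scratched-out text: omitted here
        elif child.tag == "gap":
            parts.append("[...]")     # perforation or hole in the paper
        else:                         # e.g. <unclear>: keep the best guess
            parts.append(reading(child, prefer))
        parts.append(child.tail or "")
    return " ".join("".join(parts).split())

root = ET.fromstring(SAMPLE)
print("original  :", reading(root, "orig"))
print("normalised:", reading(root, "reg"))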
There was, from the beginning of the works, two systems that could be followed. One of them was the French orthography, which, more or less coherently, is being imitated in Portugal for some time now. The other system is the one of the Spanish and Italian orthographies, much simpler, more rational, logical and easy to learn, much more adapted to the natural and even literary evolution of those languages, which is also similar to the evolution of Portuguese. What radically differentiates the orthography of those two official languages [Spanish and Italian] is the modification of the Latin spelling of innumerable Romanised Greek words to other spellings, much more similar to the value of the letters of such words in modern times. In order to make the teaching of reading and writing an easier task, the Commission found that the time had come to banish once and for all from the Portuguese writing, as they were banished from the Spanish and the Italian for a long time, [. . . ] the symbols ph, th, rh, and y [. . . ]. Translated from the Portuguese Bases da Reforma de 1911.12 The 1911 reform put an end to a long search for a Portuguese standard for spelling. But it raised a diplomatic misunderstanding between Portugal and Brazil, a problem that took a new period of 100 years to be solved. In 1990, all the Portuguese speaking countries signed an agreement on a decisive spelling 68 The Automatic Replacement of Spelling Variants reform. In 2011 that reform was finally adopted by the Portuguese education system.13 3. automatic spelling normalisation 3.1. Related Work Here we first give some examples of recent related studies that handle spelling variation in historical corpora in general and then focus on studies for the Portuguese language. The VARiant Detector (VARD) tools aimed to detect spelling variation in Early Modern English and were created for corpus linguistic research.14 The first version of the tool was based on a list of manually created mappings between historical variants and their modern versions. The latest version combined several different modules such as a list of letter replacement rules, a phonetic matching algorithm and an edit distance search method to detect spelling variation. We discuss a Portuguese version of VARD in the next section. Craig and Whipp have also worked on a tool for automatic spelling variation detection for Early Modern English but in the perspective of authorship attribution.15 For the corpus of Early Modern German, a spelling variation detection tool is currently under development.16. For the Spanish diachronic corpus, a study of the effect of automatic spelling normalisation has been conducted.17 They compared two different strategies, namely to first automatically normalise the data before using an NLP tool or to adapt the NLP tool itself to handle spelling variation. For their purpose of Parts of Speech tagging, they argued that tool adaption is better as the original spelling is kept. As for Portuguese, most of the available studies concerning the spelling change along Early Modern and Modern times have a cultural historical perspective, which means that what they analyse is the discourses of contemporary élite writers, mostly grammar authors and dictionary authors. 
Such discourses were either bitter criticisms because of the lack of a spelling standard for the language, or concrete proposals for a solution to that void.18 As for quantitative corpus-based approaches of the same spelling change, they had to wait for the assemblage of large Portuguese historical corpora covering the Early Modern and Modern era, a work that is being mostly undertaken in Brazil. The Tycho Brahe team, of Campinas University, was the first to present statistical measurements of the spelling change phenomenon in order to solve the processing problems it raised,19 followed by the Historical Dictionary of Brazilian Portuguese team (Dicionário Histórico do Português do Brasil).20 This dictionary is constructed on the basis of a historical Portuguese corpus (16th to 19th century) of approximately 5 million tokens. As they needed a normalised 69 Rita Marquilhas and Iris Hendrickx corpus to produce reliable frequency counts for the dictionary, they developed a rule-based method to automatically cluster spelling variants together. They clustered spelling variants around one common word form that is not always a modern word form, but the most central word form in the cluster of related variants leading to a spelling variants dictionary.21 A resource very similar to the CARDS-FLY corpus is the Shared Diachronic Corpus: Personal Brazilian Letters (Corpus Compartilhado Diacrônico: cartas pessoais brasileiras), which consists of a Brazilian collection of historical personal letters from the 18th to 20th century.22 The aim is to provide the academic community with a resource for the sociolinguistic history of Rio de Janeiro’s society along 300 years. The documents in this collection have also been normalised for spelling, but all normalisation was done manually, with the help of a friendly tool, namely E-Dictor, offered by the above-mentioned Tycho Brahe project.23 3.2. DICER Similarly to the Brazilian experiments, our study also uses a statistical corpus- based approach to get a better insight in the Portuguese spelling variation over the 16th–20th century time span. Our major originality is that we deal with an ultra-varied corpus, entirely made up of text within original letter manuscripts, either written by common people, or by élite people in common moments of their lives. We extracted a random sample from the CARDS-FLY corpus of 200 letters. These letters were manually normalised to the modern spelling by a linguist. Each word in the documents that was labelled as spelling variant was paired with its modern spelling counterpart. This sample was intended both for a manual inspection and analysis of the spelling variation present in the data, and for the development of an automatic tool for spelling normalisation. For the latter purpose, we split the sample in two parts. We used a hundred letters for training and tuning the automatic normalisation tool for this specific genre. The other hundred letters are used for evaluation of the tool as we can compare the manual normalisation against the automatic normalisation produced by the tool. We set apart the evaluation set and excluded it from any manual analysis. Tuning an automatic tool to the errors in the evaluation set would lead to a tool that performs very well on this one set but it might lead to an overly optimistic estimation of the true performance of the tool on other, unseen material. 
DICER (Discovery and Investigation of Character Edit Rules) is a statistical tool that creates a list of edit rules on the basis of a corpus labelled with spelling variants and their modern counterparts.24 The tool uses these pairs to detect which character(s) differ between the variant and the modern word, and it produces simple edit rules that capture the steps to rewrite the old word form to the modern form. The edit rules express which characters are changed, what type of operation (deletion, insertion or substitution) is applied, and at which location in the word (start, second, middle, penultimate or end). Rewriting a spelling variant to its modern form may require multiple different rewrite rules. For example, apezare is a variant in our historical data of the modern form apesar 'despite', and the transformation requires two edit rules: 'substitute <z> with <s>' and 'delete <e>'. DICER creates a new rule for every edit that it encounters in the corpus and therefore gives a full statistical and systematic overview of the spelling changes present in the corpus.
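DICER itself is described in Baron's thesis (note 24). Purely as an illustration of the idea, not of the actual implementation, the following Python sketch derives character-level edit rules with DICER-style position labels from (variant, modern) pairs; the rule format and position bucketing are our assumptions, modelled on the description above.

```python
from collections import Counter
from difflib import SequenceMatcher

def position(i, n):
    # DICER-style position label for character index i in a word of length n.
    if i == 0:
        return "start"
    if i == 1:
        return "second"
    if i >= n - 1:
        return "end"
    if i == n - 2:
        return "penultimate"
    return "middle"

def edit_rules(variant, modern):
    """Yield (operation, variant_chars, standard_chars, position) tuples
    describing the edits that turn `variant` into `modern`."""
    for op, i1, i2, j1, j2 in SequenceMatcher(a=variant, b=modern).get_opcodes():
        if op == "replace":
            yield ("substitution", variant[i1:i2], modern[j1:j2], position(i1, len(variant)))
        elif op == "delete":
            yield ("deletion", variant[i1:i2], "", position(i1, len(variant)))
        elif op == "insert":
            yield ("insertion", "", modern[j1:j2], position(i1, len(variant)))

# The paper's own example: apezare -> apesar needs two rules
# (substitute <z> with <s>; delete final <e>).
print(list(edit_rules("apezare", "apesar")))

counts = Counter()
for letter in train:                      # the training half from the split above
    for old, new in letter:
        if old != new:                    # only labelled spelling variants
            counts.update(edit_rules(old.lower(), new.lower()))
for rule, n in counts.most_common(10):    # cf. the ten top rules in Table 1
    print(n, rule)
```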
Below we show a detail of the DICER results summary after processing the CARDS-FLY corpus sample of a hundred letters. The summary shows the operations involving word types (not tokens). The table captures the ten top edit rules in the modernisation of those types. We can see that the substitution of <z> by <s>, especially when the <z> appears in the middle or in the penultimate position, is the edit rule that has been applied most frequently, namely 193 times, as shown in the column labelled 'Total' (see Table 1). Since DICER finds all the edit rules involved in the modernisation process, it follows that a close examination of the column 'Variant' versus the column 'Standard', combined with the number of different word types that changed (column 'Total'), gives a good snapshot of the variation problems we have to face when dealing with the CARDS-FLY corpus. The letters' authors were either following old spelling traditions, later abandoned, or, in the case of half-illiterate authors, also struggling with the rationale of the general spelling usage of their time, whether old or modern. A computation of the spelling behaviour of those authors, as compared to modern Portuguese orthography, tells us that a total of 718 edit rules were needed in order to modernise the sample of 100 letters, and that these rules affected, one or more times, a sum of 3,450 different word types. When summing all operations of the 718 edit rules, we counted 4,225 different operations, which means that several of these word types had to be standardised step by step by multiple edit rules. In order to have a manageable, humanly observable sample of this large population of data, we only examined the rules that were applied at least three times, leaving aside the less frequent ones. The resulting sample had a large lexical representativeness (3,590 operations) but a feasible number of edit rules (only 171). In the following two tables we show an interpretation of how the 171 top edit rules of the DICER tool can be distributed in terms of rule contents. The most frequent changes involved the spelling of phonological features (67 per cent), and, within these, the spelling of coronal fricatives was the most critical problem presented by our corpus variation (see Table 2 and Table 3).

Table 1. The DICER standardizing edit rules on the CARDS-FLY corpus (detail).

# | ID | Operation | Variant | Standard | Total | Start | Second | Middle | Penultimate | End
1 | 8 | Substitution | z | s | 193 | 0 | 4 | 132 | 46 | 11
2 | 20 | Substitution | ss | s | 164 | 2 | 20 | 89 | 53 | 0
3 | 149 | Substitution | m | n | 137 | 0 | 50 | 86 | 1 | 0
4 | 76 | Insertion | | - | 123 | 0 | 1 | 117 | 5 | 0
5 | 40 | Substitution | ão | am | 121 | 0 | 2 | 0 | 0 | 119
6 | 10 | Substitution | s | c | 118 | 23 | 10 | 78 | 7 | 0
7 | 45 | Substitution | i | e | 117 | 33 | 41 | 37 | 3 | 3
8 | 22 | Substitution | i | í | 107 | 2 | 11 | 90 | 2 | 2
9 | 68 | Substitution | e | i | 106 | 23 | 35 | 40 | 8 | 0
10 | 6 | Substitution | a | á | 92 | 5 | 8 | 36 | 9 | 34

Table 2. Causes for spelling variation in the CARDS-FLY corpus.

General cause | Specific cause | Word types to standardise
Phonology | coronal fricatives | 860
Phonology | unstressed oral vowels written with <i>, <e>, <u>, <o> | 456
Phonology | nasal vowels and diphthongs | 426
Phonology | stressed oral vowels | 408
Mixed | mixed | 308
Graphic Tradition | abbreviations | 267
Graphic Tradition | learned consonant groups, digraphs, and double consonants: <ct>, <pt>, <ph>, <pf>, <pp>, <ff>, etc. | 233
Syntax | enclisis: hyphenated verbal forms, with or without sandhi, followed by clitic pronoun vs. non-hyphenated verbal forms | 154
Graphic Tradition | etymological vs. non-etymological initial <h> | 136
Phonology | non-standard phonology (dialectal variation) | 132
Graphic Tradition | archaic letters: <y> vs. <i>, <u> vs. <v>, <i> vs. <j> | 95
Phonology | liquids /l, r, R/ | 63
Phonology | labialised velar stops /kw, gw/ vs. velar stops /k, g/* | 52
TOTAL | | 3590

* We follow here Maria Helena Mateus and Ernesto d'Andrade, who present a case for the existence of the segment /kw/ in the phonology of Portuguese: M. H. Mateus and E. d'Andrade, The Phonology of Portuguese (Oxford, 2000).

The fact that the CARDS-FLY corpus is composed of original manuscripts, instead of printed texts, together with the large variety of their authors' social status, accounts for such a distribution of spelling variants. This means that much of the correspondence was written in a close-to-spoken manner, without the opportunity of being revised by a more literate copyist. The above results also reveal the most important stumbling block in the modern Portuguese spelling system when the researcher wants to modernise historical written matter. That stumbling block is the lack of corresponding letters for the distribution of voiced and voiceless coronal fricatives.

Table 3. Summary of spelling variation in the CARDS-FLY corpus.

General cause for variation | Frequency of word types to standardise | Rate of word types to standardise
Phonology | 2397 | 66.7%
Graphic Tradition | 731 | 20.4%
Mixed | 308 | 8.6%
Syntax | 154 | 4.3%
Totals | 3590 | 100%

In the Middle Ages, Southern Portuguese dialects were already experiencing seseo (the merger of the dental-alveolar affricates /ts, dz/ with the dental-alveolar fricatives /s, z/).25 Today only the archaic variety of the North-Eastern area keeps a distinction between four segments, articulating different fricatives in the middle of passo 'step', paço 'palace', coser 'sew', and cozer 'bake, steam'.
Later, from the 17th century on, the voiceless palatal affricate (traditionally written <ch>) also merged with the voiceless palatal fricative (traditionally written <x>) in Southern and Central dialects, so that the phonological difference between words like chá 'tea' and xá 'shah' was lost.26 All affricates disappeared in the innovative dialects, but since their traditional spelling was always kept by learned writers, including the ones who established the 20th century Portuguese orthography, it became a major source of variation in texts by poor writers over the centuries. Nevertheless, if we split our data into chronological segments, it is clear that the major problem for 20th century uneducated letter writers is not the spelling of coronal fricatives. That problem is specific to earlier writers, especially those of the 18th and 19th centuries. The major problem with standardizing the spellings of 20th century poor writers resides in the system of stressed vowels, which they normally write without the phonographic diacritics prescribed by the standard rules. The other two most important sets of rules applied by the DICER tool have to do with the spelling of unstressed vowels and the spelling of nasal vowels and diphthongs, two phonological categories that are insufficiently mirrored by the Portuguese standard spelling. Neither Spanish nor Italian, the overt examples that guided the creators of the Portuguese standard spelling in 1911, compares to Portuguese with regard to the phonology of unstressed vowels and nasal vowels and diphthongs. So here the Portuguese spelling system became more etymological, less shallow, a feature that triggers several problems when it comes to standardizing historical data with many spelling variations.

3.3. VARD2

As a next step in our study we used the edit rules automatically generated by DICER to further improve the VARD2 tool for automatic spelling normalisation of historical Portuguese.27 We had already experimented with VARD2 in a previous study, and here we show how DICER can contribute to a better performance. VARD2 was initially developed for Early Modern English, but we converted it to Portuguese. The system uses a modern lexicon to detect possible spelling variants in a historical input text. Words that do not occur in the modern lexicon are marked as possible candidates. For each candidate, the system checks whether it occurs in a variant dictionary, which lists frequent spelling variants and their normalised equivalents. If the variant is listed, it is recognised as a true spelling variant and is replaced automatically by its modern equivalent. Otherwise, both rules based on phonological information and character rewrite rules are used to generate possible modern equivalents for the variant, with associated confidence weights. One of the parameters of VARD2 is a confidence threshold that determines what weight is needed to replace the variant with the highest-weighted modern equivalent that exceeds the minimum threshold. If no likely candidates are found, the variant is kept.
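The decision logic of this pipeline can be summarised in a short Python sketch. This is only our schematic reading of VARD2's behaviour, not the tool's actual code; the (old, new, weight) rule format and the toy candidate generator are assumptions made for illustration.

```python
def generate_candidates(word, rules):
    """Toy candidate generator: apply each single rewrite rule once.
    `rules` as (old, new, weight) triples is an assumption, not VARD2's format."""
    return [(word.replace(old, new, 1), weight)
            for old, new, weight in rules if old in word]

def normalise(word, lexicon, variant_dict, rules, threshold):
    """Schematic VARD2-style decision for one token."""
    if word in lexicon:                  # only non-word errors are detectable,
        return word                      # so known modern forms are kept
    if word in variant_dict:             # pre-listed frequent variant
        return variant_dict[word]
    candidates = [(form, w) for form, w in generate_candidates(word, rules)
                  if form in lexicon]    # keep only plausible modern forms
    if not candidates:
        return word                      # no likely candidate: keep the variant
    best, weight = max(candidates, key=lambda c: c[1])
    return best if weight >= threshold else word

lexicon = {"apesar", "lançar", "para"}
variant_dict = {"fforão": "foram"}
rules = [("z", "s", 0.9), ("s", "ç", 0.8)]
print(normalise("lansar", lexicon, variant_dict, rules, threshold=0.5))  # lançar
print(normalise("fforão", lexicon, variant_dict, rules, threshold=0.5))  # foram
```

The threshold makes the behaviour deliberately conservative: when no generated candidate is weighted confidently enough, the historical form is left in place rather than over-corrected.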
To convert VARD2 to the Portuguese language we replaced the English modules by Portuguese ones.28 As modern lexicon we used the Multifunctional Computational Lexicon of Contemporary Portuguese.29 We created the variant list of spelling variants and their modern equivalents on the basis of an existing spelling variants dictionary extracted from the Historical Corpus of Brazilian Portuguese mentioned above.30 We made several small improvements to the Portuguese modules in VARD2. When inspecting the modern lexicon, we noticed that even though it was extracted from a contemporary dictionary, it still contained several archaic word forms. We attempted to filter out these word forms on the basis of a list of archaic word forms from the Houaiss dictionary.31 We also used the list of spelling variants from the training sample of a hundred letters to filter the lexicon, by deleting the variants and adding the modern word forms. Furthermore, a manual check of the most frequent items in the spelling variant list was needed, as we had already noticed that some variants were not mapped to a modern word form but to another, more frequent archaic word form. For example, in our previous experiments the variant list contained the archaic form fforão '(they) were/went' matched with the equivalent forão instead of the correct modern counterpart foram. VARD2 uses a set of rewrite rules to generate the modern word form candidates. In our first approach we manually constructed such a list of rewrite rules, based on our own intuitions and on the rule set described by Giusti et al.

Table 4. VARD2 scores on the development set with different thresholds for the rule set.

Threshold | Accuracy | Recall | Precision | F-score
5 | 93.0 | 74.3 | 98.5 | 84.7
10 | 93.0 | 73.9 | 98.5 | 84.5
25 | 92.8 | 73.0 | 98.6 | 83.9
50 | 92.2 | 70.6 | 98.7 | 82.3

Here we intend to investigate to what extent the rewrite rules automatically generated by the DICER tool can help improve the performance of VARD2. Our analysis and interpretation of the generated rule set presented above showed that DICER was able to produce edit rules that capture a broad and diverse set of spelling changes. As DICER generates a large rule list, and some of the rules are based on evidence of only one occurrence, we decided to search for an optimal minimum frequency threshold for the rule set.32 To get an indication of a suitable cut-off point, we ran experiments on the training set to see the effect of using rules that occurred at least 5, 10, 25 and 50 times. The higher the cut-off threshold, the smaller the rule set: the rule set with cut-off threshold 5 has 99 rules, while a cut-off of 50 leaves only 14 rules. We split the training sample into 80 letters for training and 20 letters as a development set to determine the optimal rule set, and ran experiments with the different thresholds on the development set. To evaluate the performance of the tool, we compute accuracy, recall, precision and F-score for the words (excluding punctuation marks) in the held-out evaluation data. Precision expresses the number of cases in which there was a spelling variant in the text and the modern variant was correctly predicted by the tool, divided by the total number of predictions (errors occur because the tool predicted too many cases). Recall, on the other hand, focuses on the number of correct predictions divided by the number of true spelling variants in the data (errors occur because the tool missed some cases).
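For concreteness, these token-level scores could be computed along the following lines, given an (original, gold, predicted) triple for each word. The exact counting conventions of the original evaluation may differ in detail; this sketch only illustrates the definitions just given.

```python
def scores(tokens):
    """tokens: iterable of (original, gold, predicted) word triples."""
    tp = fp = fn = correct = total = 0
    for orig, gold, pred in tokens:
        total += 1
        correct += (pred == gold)
        is_variant = orig != gold        # a true spelling variant
        changed = pred != orig           # the tool proposed a replacement
        if changed and pred == gold and is_variant:
            tp += 1                      # correctly normalised variant
        elif changed:
            fp += 1                      # replacement that was wrong or unneeded
        elif is_variant:
            fn += 1                      # variant the tool missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fscore = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
    accuracy = correct / total if total else 0.0
    return (100 * accuracy, 100 * recall, 100 * precision, 100 * fscore)
```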
In Table 4 we show the effect of varying the threshold on the development set. We do not observe huge differences between the thresholds, but as the threshold of 5 had a slightly higher score, we decided to use this cut-off threshold for the experiments on the test set. As we aim to study the effect of the DICER edit rules on the VARD2 system, we made a comparison between the DICER edit rules and the set of rules that we had manually created for our previous experiments. The manual rule set contains 62 different rules, while the DICER rule set with threshold 5 contains 99 rules. When we compare the two rule sets, we notice only a few overlapping rules. Both sets contain the rules to remove the double consonants <ll>, <nn> and <tt>, the substitution of <y> with <i>, and some accent changes. The manual rule set contains many specific rules that cover multiple-character strings, such as 'substitute <zente> with <sente> at the End position'. The DICER tool, however, has more general rules that capture the same event; for example, the rule substituting <z> with <s> is a generalisation of the 'substitute <zente>' rule.

Table 5. A comparison on the test set of two versions of the VARD2 tool: one with the DICER rule set and one with handcrafted rules.

Rule set | Accuracy | Recall | Precision | F-score
handcrafted | 92.7 | 64.9 | 98.4 | 78.3
DICER | 94.2 | 73.4 | 97.0 | 83.6

In Table 5 we show the results of comparing VARD2 with the handcrafted rule set against a version of VARD2 trained with the DICER rule set with threshold 5, on the held-out test sample of a hundred letters. Overall, we observe that VARD2 has a very high precision. The automatically generated rule set leads to a higher performance of 84 per cent F-score and 94 per cent accuracy. As shown in the table, the automatically generated rule set leads to a higher overall performance due to an increase in recall. The DICER rule set enables the VARD2 tool to create a larger list of possible modern candidates, thereby reducing the number of missed variants. For example, the variant lansar was not corrected by VARD2 trained with the handcrafted rule set, but it was correctly changed to lançar 'to launch' by the version trained with the DICER rule set, as the latter included the edit 'substitute <s> with <ç>'. In general, the limitation of VARD2 to detecting only non-word errors causes a major part of the errors. To give an example, the noun circunstancia was not detected as a spelling variant because it is listed in the modern lexicon, where it represents a conjugation of the verb circunstanciar 'to state in detail'. However, the modern equivalent of the noun has an accent: circunstância 'circumstance'. Information about the grammatical function of a word in the sentence is not available, and therefore the system cannot detect this variant. In other cases VARD2 will choose the most likely and closest modern variant, and this may not be the best option in a given context; for example, the form frea can be either an abbreviation of freguesia 'parish' or a variant of fria 'cold'. A context-sensitive tool is needed to solve this type of problem, but this is a line of future research, as there are currently not many context-sensitive spelling normalisation tools available, certainly not for historical texts.33
4. conclusions

We have presented an analysis of the main types of spelling variation that we encountered in the CARDS-FLY corpus, a corpus of Portuguese historical personal letters that lacks standardisation because it corresponds to extremely varied sources, which were transcribed in a semi-palaeographic way. The systematic account of all spelling changes in the corpus sample, as generated by the DICER tool, shows the mixed nature of modern Portuguese orthography, not as shallow as its inventors wanted it to be. This mixed nature of the modern standard clashes both with etymological spellings within the corpus and with phonological ones. As spelling variation can be a hindrance for certain types of research and for automatic search in the corpus, we presented a series of experimental results with the VARD2 statistical normalisation tool. This tool can automatically normalise variants with an F-score of 84 per cent and a precision of 97 per cent. A high precision means that when VARD2 makes a correction, it is in general correct; the errors that it makes are caused by missing spelling variants. This score is more than sufficient to be useful for automatic correction of the corpus, as it is preferable to have a conservative tool making only those corrections that it is certain about. We have shown that a systematic statistical analysis of spelling variation is a powerful way both to consolidate known changes in spelling conventions and to discover new insights into the way people wrote in earlier times. We have also shown that both diachronic linguists and historians wanting to subject historical Portuguese sources to processing operations can have them modernised in an automatic way. They do not have to wait long years, nor exhaust large human resources, in the operation of manually modernising the variant spellings of such texts, even if they were written by the poor-writer type of author. Additionally, the same procedure can always be adapted to new languages, since the tools we worked with were originally designed for English historical texts.

end notes

1 Acknowledgements: This research is funded by the Portuguese Foundation of Science and Technology (FCT), under the project FLY (PTDC/CLE-LIN/098393/2008), and the FCT program Ciência 2007/2008.
2 Translated from C. Dauphin, 'Pour une histoire de la correspondance familiale', Romantisme 90 (1995), 89–99. Cited here at 89.
3 A. Petrucci, Public lettering: script, power, and culture (Chicago, 1993).
4 Translated from Dauphin, 'Pour une histoire de la correspondance familiale', 89.
5 D. Y. W. Lee, 'Genres, registers, text types, domains, and styles: clarifying the concepts and navigating a path through the BNC jungle', Language Learning & Technology 5, 3 (2001), 37–72. Cited here at 46 and 50.
6 DALF, Guidelines for the description and encoding of Modern correspondence material, Version 1.0, 2003, http://ctb.kantl.be/project/dalf/.
7 TEI, Text Encoding Initiative, P5 guidelines, http://www.tei-c.org/index.xml, last accessed 24 May 2013.
8 Recent examples are D. Archer and J. Culpeper, 'Identifying key sociophilological usage in plays and trial proceedings (1640–1760): An empirical approach via corpus annotation', Journal of Historical Pragmatics 10, 2 (2009), 286–309, and D. Z. Mohd, G. Knowles and Ch. K.
Fatt, ‘Nationhood and Malaysian identity: a corpus-based approach’, Text & Talk – An Interdisciplinary Journal of Language, Discourse & Communication Studies 30, 3 (2010), 267–287. 9 I. Hendrickx and R. Marquilhas, ‘From old texts to modern spellings: an experiment in automatic normalisation’, Journal for Language Technology and Computational Linguistics 26, 2 (2011), 65–76. 10 M. F. Gonçalves, As ideias ortográficas em Portugal: de Madureira Feijó a Gonçalves Viana (1734–1911) (Lisboa, 2003), 779–786. 11 F. Coulmas, The Blackwell encyclopedia of writing systems (Oxford & Cambridge, Mass., 1996), 380. 12 Reprinted by I. Castro, I. Duarte and I. Leiria, eds, A demanda da ortografia portuguesa (Lisboa, 1987), 152. 13 Presidência do Conselho de Ministros, ‘Resolução do Conselho de Ministros n.o 8/2011’, Diário da República, 1.a Série, n.o 17, January 25, 2011. 14 P. Rayson, D. Archer and N. Smith, ‘VARD versus Word: A comparison of the UCREL variant detector and modern spell checkers on English historical corpora’, Proceedings of the corpus linguistics conference (Birmingham, 2005). 15 H. Craig and R. Whipp, ‘Old spellings, new methods: automated procedures for indeterminate linguistic data’ , Literary and Linguistic Computing 25, 1 (2010), 37–52. 16 S. Scheible, R. J. Whitt, M. Durrell and P. Bennett, ‘For the A Gold Standard Corpus of Early Modern German’, Proceedings of the 5th linguistic annotation workshop (Portland, Oregon, 2011), 124-128. 17 C. Sánchez-Marco, G. Boleda, J. M. Fontana and J. Domingo, ‘Annotation and representation of a diachronic corpus of Spanish’, Proceedings of the seventh conference on international language resources and evaluation (Malta, 2010), 2713–2718. 18 Gonçalves, As ideias ortográficas em Portugal; M. L. C. Buescu, Gramáticos portugueses do século XVI (Lisboa, 1978); R. Marquilhas, ‘O acento, o hífen e as consoantes mudas nas Ortografias antigas portuguesas’, in I. Castro, I. Duarte, and I. Leiria, eds., A demanda da ortografia portuguesa (Lisboa, 1987), 103–116; M. H. Paiva, ‘Variação e evolução da palavra gráfica: o testemunho dos textos metalinguísticos do século XVI’, in Actas do XII encontro nacional da Associação Portuguesa de Linguística, 2 (Coimbra, 1997), 233–252. 19 T. A. Menegatti, Regras lingüísticas para o tratamento computacional da variação de grafia e abreviaturas do corpus Tycho Brahe (Campinas, 2002). 20 R. Giusti, et al., ‘Automatic detection of spelling variation in historical corpus: An application to build a Brazilian Portuguese spelling variants dictionary’, in Proceedings of the corpus linguistics conference CL2007 (Birmingham, 2007). 21 BP spelling variants dictionary is available at: http://www.nilc.icmc.usp.br/nilc/projects/hpc/, last accessed 24 May 2013. 22 The Corpus Compartilhado Diacrônico was created by the Laboratório de História do Português Brasileiro from the Universidade Federal do Rio de Janeiro in Brazil. More information can be found at http://www.letras.ufrj.br/laborhistorico/, last accessed 24 May 2013. 23 M. C. Paixão de Sousa, F. N. Kepler and P. P. F. Faria, ‘E-Dictor: novas perspectivas na codificação e edição de corpora de textos históricos’, in Caminhos da linguística de corpus (Campinas, 2010). 24 DICER is described in chapter 4 of the following thesis: A. Baron, ‘Dealing with spelling variation in Early Modern English texts, PhD dissertation’ (Lancaster University, 2011). 79 Rita Marquilhas and Iris Hendrickx 25 L. F. 
Lindley Cintra, ‘Observations sur l’orthographe et la langue de quelques textes non littéraires galicien-portugais de la seconde moitié du XIIIe siècle’, Revue de Linguistique Romane 27 (1963), 59–77. 26 P. Teyssier, História da língua portuguesa (Lisboa, 1982); I. Castro, Introdução à História do Português (Lisboa, 2006). 27 A. Baron and P. Rayson, ‘VARD 2: A tool for dealing with spelling variation in historical corpora’, in Proceedings of the postgraduate conference in corpus linguistics (Birmingham, UK, 2008). 28 For a detailed description of the Portuguese modules in our version of the VARD2 tool, we refer to the following paper: Hendrickx and Marquilhas, ‘From old texts to modern spellings’, sec 4. 29 This Lexicon is available for download at: Multifunctional Computational Lexicon of Con- temporary Portuguese, 2010, http://www.clul.ul.pt/en/resources/88-project-multifunctional- computational-lexicon-of-contemporary-portuguese-r. 30 Giusti, et al., ‘Automatic detection of spelling variation in historical corpus’, sec 9. 31 A. Houaiss, et al., Dicionário Houaiss da língua portuguesa (Rio de Janeiro, 2001). We wish to thank Mauro Villar for kindly granting us access to the digital form of the Houaiss dictionary’s archaic lexicon. 32 The Dicer rules were manually converted to the VARD format and some rules were adapted as very general rules such ‘insert e anywhere’ slow down and ultimately crash the VARD program as they generate too many possibilities. To elevate this problem, such general rules were converted to more specific rules. 33 Baron, ‘Dealing with spelling variation in Early Modern English texts’, sec 6.4, and sec 7. 80 work_dzpx2hl4kvdblaufhkagfpeu3i ---- Creative Destruction in Libraries: Designing our Future – In the Library with the Lead Pipe Skip to Main Content chat18.webcam Open Menu Home About Awards & Good Words Contact Editorial Board Denisse Solis Ian Beilin Jaena Rae Cabrera Kellee Warren Nicole Cooke Ryan Randall Emeritus Announcements Authors Archives Conduct Submission Guidelines Lead Pipe Publication Process Style Guide Search Home About Awards & Good Words Contact Editorial Board Denisse Solis Ian Beilin Jaena Rae Cabrera Kellee Warren Nicole Cooke Ryan Randall Emeritus Announcements Authors Archives Conduct Submission Guidelines Lead Pipe Publication Process Style Guide Search 2013 Nov 20 Caro Pinto /9 Comments Creative Destruction in Libraries: Designing our Future In Brief: Joseph Schumpeter defines creative destruction as a “process of industrial mutation that incessantly revolutionizes the economic structure from within, incessantly destroying the old one, incessantly creating a new one.” As libraries struggle with how to position themselves to thrive in the digital age, how can we balance the traditional elements of librarianship like collecting and reference with the demands of the present, all without sacrificing staffing and support for collections, space, and community? Image Credit: Rebecca Partington by Caro Pinto In my first job after library school, I worked in Manuscripts & Archives in the Yale University Library. There I worked adjacent to an extraordinary archivist named Laura Tatum. Laura was the architectural archivist and she worked with firm records and personal papers, forging unique relationships with donors to streamline the processing of manuscript and records collections. Through Laura I became familiar with Eero Saarinen, the Finnish architect who designed the TWA Terminal at Kennedy Airport and the Gateway Arch in St. Louis. 
Saarinen’s structures and aesthetic mesmerized me. I spent hours poring over plans, drawings, and photographs of his completed projects during the slower moments of my reference shifts. At home I began reading widely about his work. I continue to take field trips to his completed projects whenever time allows. Saarinen designed furniture and buildings with the intention to build a vision for the present that also leaned forward to the future. Considering his projects and his vision for futurism in the built environment, I began to connect my interest in Saarinen with my exploration of the role of creative destruction in academic libraries. Through the course of my reading, I came across these words from Saarinen: “Each age must create its own architecture out of its own technology and one which is expressive of its own Zeitgeist-the spirit of the time.” (Serraino, 2009) Within our own libraries and within the field of librarianship at large, creative destruction is the idea that in order to create new ways of knowing and thinking, we must break with the past to plan and shape our future. Through my relationship with Laura, my devotion to Saarinen scholarship, and my interest in futurism, I often consider what creative destruction can and should mean for libraries. What should libraries be in the twenty-first century? What should twenty-first century librarians do? As our collection bases transition from print to hybrid print to digital collections, libraries face new challenges around budgets, space, personnel, and questions of relevance. Many organizations have shuttered their reference desks in favor of unified information desks like the Info Bar at Hampshire College or programs like the Personal Librarian Program at Yale. Technical services and acquisitions departments manage spreadsheets of data to make selection decisions, rather than relying on a monkish bibliographer ordering title by title. Libraries are increasingly loud, bustling, collaborative places, out of step with the image so many have of the classic library-a somber building governed by a stern cat lady who demands silence. Can librarians and libraries evolve to meet new challenges and expectations, or will these things require  a new generation of managers who will, as a colleague remarked to me in 2010, “turn off the lights?” Librarians are guardians of our profession: we are the stakeholders in our future. Libraries have long survived threats to their existence and as Scott Bennett discussed in 2009, have experienced “paradigm shifts” from “reading centered” spaces into “learning centered spaces.” (Bennett, 181-182)  The nature of librarianship in the digital age demands that we continue to re-evaluate our work and confront the reality that our personnel, job descriptions, and spaces must change. In order to facilitate that change, what should we give up? If libraries do what Saarinen suggests – creating their own architecture reflective of the time, how will libraries creatively destroy traditional aspects of our profession without too much collateral damage? How can we make creative destruction in libraries, particularly in the context of higher education, sustainable and constructive as we create a profession that fits the evolving demands of our digital age? 
Students are the heart of today's academic libraries: engaging students as collaborators in library work, redesigning spaces to be active hubs of student engagement and learning, and putting ourselves in the role of students for a continuous arc of learning all help us continually revise how we provide and promote library services.

Tools of the Trade: Once Pencils, Now Pinterest

Recently, while I was sitting on the reference desk in Archives & Special Collections at Mount Holyoke College, I ran into a colleague from my days as an archives assistant at the University of Massachusetts. We caught up after having not seen each other since 2007, when I graduated. While I was working with other patrons, he walked around the reading room, marvelling at the readers, poring over the card catalog that houses descriptive details of collections and remarking, "the tools of the trade: the pencils, the cards, the boxes." Indeed, those were the tools of the trade when I worked at UMass processing collections and responding to reference requests. But will they be for much longer? Recently, the Taiga forum posted a "gentle disturbance," The End of Library Scut Work?, responding to an earlier piece in Library Journal in which Stanley Wilder asserted that the decline in library support and student worker staffing since 2008 in ARL (Association of Research Libraries) libraries is less a byproduct of the recession than an impact of the "evolving nature of library work." Wilder writes, "The iconic image of library workers pushing book trucks is quickly slipping into obsolescence…Lower skill library work is disappearing, and it will never come back." (Wilder, 2013) At Mount Holyoke College, we continue to hire student workers to manage the stacks, and to staff service points like the circulation desk and the research help desk. Indeed, I see students pushing book trucks daily as physical books return to the library and to their rightful places in the stacks. However, these are not the only types of student positions we offer at Mount Holyoke; in true "learning paradigm" fashion, we engage students in library work that leverages critical thinking skills and creative imaginations. The Library at Mount Holyoke College employs students to conduct outreach, publicize events, and generate content for our social media channels. These positions leverage the excellent communication skills that the Mount Holyoke College curriculum cultivates while preparing these students to apply skills learned in the classroom, exercised in student positions and applied in internships and jobs off campus. Students as collaborators incubating projects and actively engaging in daily work is a core part of how we can promote and sustain a user-centered library experience. The increasing disappearance of piecemeal library work among student workers is a new opportunity to train undergraduates to meet the demands of today's workplace; we may give up solitary, meditative, repetitive tasks for these workers, but the students and staff who supervise them gain much more. Where students like me once relied on pencils for our library work, today's students rely on Pinterest.

This Used to be my Playground? Revising Job Descriptions

As Stanley Wilder discussed the end of low-wage library work in Library Journal, he also described the simultaneous 40% increase in professional library salaries. (Wilder, 2013) Citing the impact of digital scholarship, Wilder wrote, "There is a second answer as to how libraries managed to raise skills and salaries: they had to.
For every physical process that no longer exists, a new and complex digital process has sprung up in its place. These digital processes employ far fewer people but the expertise required is greater." Indeed, the trend that Wilder reports at ARL institutions is similar to trends at liberal arts colleges; new developments in digital scholarship, collections, and workflows supplant traditional library work. I made this connection over the summer when the Five Colleges (Five Colleges, Incorporated is a consortium of colleges in western Massachusetts) held a Digital Humanities Symposium to consider how to build an effective community of practice in the digital humanities, especially at liberal arts colleges. We circulated a call for proposals and invited speakers from Colgate University, Haverford College, and Washington & Lee University to present on how they were conducting digital scholarship in their local contexts; how they were adapting to the new scholarly landscape; and how their organizations were changing to meet the growing demands of digital scholarship. In all cases, staffing changed to reflect the new missions and charges of departments. Washington & Lee created a brand new position of Digital Scholarship Librarian; Haverford underwent an organizational shift that resulted in one of their unit heads becoming the digital scholarship coordinator; and finally, Colgate saw sweeping changes in terms of how their library shifted from a 20th century model of reference librarians to a dynamic team of 21st century instructional designers. Joanne Schneider of Colgate reflected on the process: "This effort also has focused on rebuilding the Collaboration for Enhanced Learning (CEL) Group, a partnership of the Libraries and Information Technology Services composed of librarians and technologists who provide coordinated support to faculty who wish to rethink courses and pedagogical approaches using current and emerging technologies to enhance student learning and engagement with information." (Digital Humanities for Liberal Arts Colleges Symposium, 2013) In order to accomplish this transition, the organization had to destroy old job descriptions and create new ones in their stead. The type of human capital transformation described at Colgate is also represented well at Columbia University, where librarians in the history and humanities division cultivated The Developing Librarian Project as an effort to empower their librarian staff to reinvent themselves to meet the challenges of the present and position themselves for success in the future: "In the fall of 2012, and running in parallel with the expansion of the Digital Humanities Center, we initiated the Developing Librarian Project (DLP), a two-year training program, with the goal of acquiring new skills and methodologies in digital humanities. The DLP is created by and for librarians and other professional staff in the Humanities and History division." (dh+lib, 2013) Columbia recognizes Schumpeter's "incessant revolution" and responds by empowering its staff to gain the skills necessary to participate in the digital scholarship ecosystem by participating in the process themselves. The team reflected in their announcement on dh+lib, the Association of College & Research Libraries digital humanities interest group's project, earlier this summer, stating, "We realize training is no longer a thing to do a couple of times a year, but a continual process of learning integrated into the fabric of what we do every day.
In that sense it would be more accurate to say that ours is not a training program, but part of our continuing professional development and research. We are committed to gaining a better understanding of emergent technologies and to being partners in the research process." (dh+lib, 2013) Projects like the Developing Librarian Project and organizational shifts like the one described at Colgate University reinforce the idea that in order to stay agile and relevant, librarians and libraries must have organizational structures and programs in place to promote change. Libraries cannot realize radical change to support emerging digital scholarship unless we build organizations and cultures with the human capital to scaffold instruction, resources, and technical support to enact new models for scholarship. Just as the Jet Age demanded new architecture to acculturate Americans to air travel, libraries must design new types of organizational structures and cultures to acculturate faculty and students to the changing demands of our rapidly shifting scholarly landscape.

Trading Spaces: A Slide Library Becomes a Media Lab

The end of "scut work" Wilder describes and new trends in student library employment have coalesced in a project at Mount Holyoke College called the Media Lab. I first learned about the lab during a webinar I hosted last February about new types of learning spaces at liberal arts colleges. My colleague, Nick Baker, presented on the development of the media lab he built in collaboration with arts faculty at Mount Holyoke College in the former MHC slide library. In 2002, the slide library at Mount Holyoke enjoyed a triumphant renovation; faculty packed the library reviewing slides for their lectures. As time passed and database products like ARTstor matured – and other faculty members began digitizing slides to embed in PowerPoint presentations – by 2009 Mount Holyoke faculty no longer stood "elbow to elbow" in the slide library. The space stood idle. In 2010, the library created a new department, Digital Assets and Preservation Services (DAPS), and absorbed the slide librarian into their group. The slide library effectively closed; the art librarian and the former slide librarian shifted to the main library. In response, the Art and Architecture departments hosted a contest for students to propose new plans to revise the space. Students across the Five Colleges submitted proposals. The winning proposal devised a pop-up media lab; the students wanted to add new furniture, computers, and some minor physical modifications to the space. While plans moved forward with an architecture consultation and a modest budget proposal of $50,000, the financial landscape at the College rendered those changes impractical. In spite of this, Baker and the Art department moved forward with small changes: couches from elsewhere on campus moved into the space, along with older computers and some grant-funded studio supplies. With minimal intervention, Baker and faculty programmed the lab slowly with workshops and projects. Baker hired students to do experimental projects and serve as ambassadors to evangelize about the space and its potential for interdisciplinary studio work. The students' outreach efforts drew more students into the space. Faculty and library staff recognized that in order for precious campus space to remain vital, it was necessary for the slide library to close and transform into something entirely new.
Baker also found ways to ground the space in the past in spite of its experimental nature. As Baker cleared out projectors and obsolete technologies, it inspired him to save some items and create a slide museum that demonstrates for students how the building was used in the past. What was state of the art in 2002 became obsolete by 2009. A creative intervention transformed a slide library into a dynamic teaching and learning space. The evolving nature of the curriculum demanded a new type of space informed by student needs. Given the constraints of budget and space at Mount Holyoke College, librarians, faculty, and students collaborated to remake an obsolete space into an energized and relevant one.

Which Way Do We Go?

As guardians of the profession, we all must decide how to proceed. In many cases, change is hard, even emotional for some employees, users, and organizations. There are clearly tasks that librarians will no longer do: sit at reference desks for regular shifts, only develop collections by ordering monographs title by title, or shush patrons as they labor in rows of tables in pristine reading rooms without a machine or whiteboard in sight. There are librarians who mourn the loss of some of these activities, their hours spent reading book reviews, days at the reference desk where people asked questions of fact now easily accessible through a plethora of online resources. On the other hand, there are a growing number of librarians like me who have "library" in their job titles, but who also work in instructional technology or digital scholarship or digital humanities, or as digital archivists. Transformations like the Developing Librarian program at Columbia or the staff reorganization Joanne Schneider initiated at Colgate require bold leadership, vision to build new programs and positions that did not exist, the balancing of budgets by dissolving positions like reference librarian or cataloger in favor of different choices – relevant ones. We may throw out older copies of AACR2 as our supply closets burst with materials discarded from our desks, but we are not discarding the contributions of our librarian forebears. Those communities built the foundations that our positions of the future depend upon; we create new opportunities unimaginable by previous generations, but we must do so with an eye towards respecting the past, too.

Acknowledgements: Many thanks to Emily Ford for shepherding the project from idea to article; Alex Gil (external editor) for astute edits; my writing group at Mount Holyoke College, especially Julie Adamo, Sarah Oelker, and Alice Whiteside, for their support; and, finally, to Laura Tatum, whose encouragement, friendship, and brilliance inspired me to evolve and grow as a librarian.

References and Further Readings:
Serraino, Pierluigi. Eero Saarinen, 1910–1961: A Structural Expressionist. Köln; London: Taschen, 2005.
Schumpeter, Joseph A. Capitalism, Socialism, and Democracy. New York; London: Harper & Brothers, 1947.
Bennett, Scott. "Libraries and Learning: A History of Paradigm Change." Portal: Libraries and the Academy 9, no. 2 (2009): 181–197.
Booth, Char. "The Library As Indicator Species: Evolution, or Extinction?" October 18, 2011. http://www.slideshare.net/charbooth/the-library-as-indicator-species-evolution-or-extinction.
"The End of Library Scut Work? | Taiga Forum." Accessed September 8, 2013. http://taiga-forum.org/the-end-of-library-scut-work/.
"The End of Lower Skill Employment in Research Libraries | BackTalk." Accessed September 8, 2013.
http://lj.libraryjournal.com/2013/06/opinion/backtalk/the-end-of-lower-skill-employment-in-research-libraries-backtalk/.
"Digital Humanities for Liberal Arts Colleges Symposium." Accessed October 31, 2013. https://sites.google.com/a/mtholyoke.edu/digital-humanities-for-liberal-arts-colleges-symposium/.
"The Developing Librarian Project." Accessed October 31, 2013. http://acrl.ala.org/dh/2013/07/01/the-developing-librarian-project/.
Nick Baker, interview by Caro Pinto, Mount Holyoke College, August 21, 2013.

Tags: academic libraries, creative destruction, higher education, liberal arts colleges, makerspaces, saarinen, social media

9 Responses

laborlibrarian, 2013-11-20 at 9:36 am: "As Stanley Fish discussed the end of the low wage library work in Library Journal, he also described the simultaneous 40% increase in professional library salaries. (Fish, 2013)" In referring to the source, readers will learn that this figure covers a 10-year period, that 27% is accounted for by 'routine wage growth', and that it is only applicable to ARL member libraries. Please try to employ stats with more care. I'd hate to see folks throwing around that 40% number indiscriminately without actually looking at aggregate salary data (from the ARL or ALA-APA salary surveys) or labor market statistics.

Robert Teeter, 2013-11-20 at 12:54 pm: There's a reference to "Fish 2013" in the article, but it doesn't show up in the references. The one link to LJ doesn't work.

Caro Pinto, 2013-11-20 at 7:20 pm: Here's the link to the LJ article: http://lj.libraryjournal.com/2013/06/opinion/backtalk/the-end-of-lower-skill-employment-in-research-libraries-backtalk/#_ It should be Stanley Wilder, not Stanley Fish. Thanks for catching that!

Caro Pinto, 2013-11-20 at 8:28 pm: It's been corrected in the article, too.

SteveM, 2013-12-02 at 10:45 am: I applaud your search for a new way of thinking about the future of libraries and librarianship in this new millennium. Almost 3 years ago I wrote in my blog 21st Century Library: "Discontinuous thinking sounds very impressive. Some might call it thinking outside the box, or lateral thinking, or creativity, or whatever. The point is still that conventional thinking and incremental decision making will not address the changes that confront 21st Century libraries. Charles Handy based the title of his book THE AGE OF UNREASON on George Bernard Shaw's observation that 'all progress depends on the unreasonable man.' His argument was that the reasonable man adapts himself to the world, while the unreasonable [person] persists in trying to adapt the world to himself; therefore for any change of consequence we must look to the unreasonable man, or, I must add, to the unreasonable woman." [Handy, C. (1990). THE AGE OF UNREASON. Harvard Business School Press, Boston, MA.]

This work is licensed under a CC Attribution 4.0 License.
ISSN 1944-6195

work_e76ry4ouybginjc554eibya46u ----

STREAM (Spatiotemporal Research Infrastructure for Early Modern Brabant and Flanders): Sources, Data and Methods

Isabelle Devos 1, Torsten Wiedemann 1, Ruben Demey 1, Sven Vrielinck 1, Sofie De Veirman 1, Thijs Lambrecht 1, Philippe De Maeyer 2, Elien Ranson 2, Michiel Van den Berghe 2, Glenn Plettinck 3, Anne Winter 3
1: History Department, Ghent University
2: Geography Department, Ghent University
3: History Department, Vrije Universiteit Brussel

Corresponding author: Isabelle Devos, History Department, Ghent University, Isabelle.Devos@ugent.be

Biographical note: Isabelle Devos is Associate Professor at the History Department of Ghent University, Belgium. Over the years her research has revolved around social and economic issues of the early modern period and the long nineteenth century in a comparative perspective, with a particular focus on demography.

Abstract: This article presents the technical characteristics of the Belgian STREAM project (2015-2019). The goal of STREAM is to facilitate and innovate historical research into local and regional processes through the development of a spatiotemporal infrastructure for early modern Brabant and Flanders, two of the most urbanized and developed areas of pre-industrial Europe. To this end, STREAM systematically collects a range of key data from a diversity of historical sources to provide a geographically comprehensive and long-run quantitative and spatial account of early modern society at the local level (parishes, villages, towns) regarding territory, transport, demography, agriculture, industry and trade, related to the development of a tailored historical geographical information system (GIS) based on the well-known Ferraris map (1770-1778). This article discusses the possibilities and pitfalls of the data collection and the construction of a spatial infrastructure for the pre-statistical era.

Keywords: spatial history, digital history, data infrastructure, GIS, economic history, historical demography, Belgium

1. INTRODUCTION

High-level historical research is extremely dependent upon access to primary source materials. Over the last decade the demand for large-scale databases in historical research has increased enormously (Digital History). Due to the powerful advances in ICT, there are hardly any technological limitations to the development and analysis of large datasets, while the integration of geographical information systems (GIS) has greatly enriched the uses that can be made of historical databases. Spatial analysis has enabled historians not only to visually present their research results, but more importantly to use space to integrate, collect and study historical data in new ways. This geographical approach to history (Spatial History) has proved its usefulness and reliability over the past decade and has had a significant impact on the progress of historical research, in particular of the nineteenth century. 1 For the early modern period (ca. 1500-1850), the Cambridge Group for the Study of the Population and Social Structure (CAMPOP) set up an impressive regional data infrastructure for England.
Based on the infrastructure, Shaw-Taylor and Wrigley showed that in some areas of England the economy was far more advanced by 1750 than had previously been supposed, and they suggest that economic growth and population change in the two preceding centuries must have been decisive in bringing about the divergence of England. 2 Yet, as they rightly point out, their argument is problematic in the sense that their account – as that of others 3 – highlights the country's exceptional demographic and economic history, without being able to rely on comparable data for the European continent. 4

In Belgium, early modern historians are insufficiently able to profit from these new research opportunities because of the absence of suitable databases and GIS infrastructures that collect and integrate original data from archival and manuscript sources at a local level for a sufficiently large territory. While medieval and modern historians have built a solid tradition in developing various tools facilitating the study of primary sources and statistics, most early modern sources for Belgium are only available on paper in manuscript form in various archives. The connection between such archival data and digitized repositories so far remains poorly developed. Likewise, as early modern maps are often available only on paper or as raster images, historians have been slow to bring a geographical dimension to analyses of the pre-industrial world. According to Wrigley, the creation of historical maps is nonetheless indispensable for the articulation of new historical insight and a necessary prerequisite for detecting regional patterns and temporal changes which might otherwise remain unnoticed. 5 As a result, our knowledge of Belgium's social and economic history during the early modern period is characterized by a strong fragmentation. On the one hand, there are hundreds of micro studies examining different phenomena and processes at the local level. Although these works are often of excellent quality and high scientific relevance, the extent to which the results can be generalized is often questionable. On the other hand, there are many studies that describe society at the macro level. They usually lack the necessary detail to reach a deeper understanding of social processes and the geographical dimension of population patterns, social change and economic developments within countries. This imbalance is due to a lack of quantitative studies providing a comprehensive description of Belgium's early modern society. The collection of quantitative information for the early modern period is a labour-intensive task, as the available sources are not easily quantifiable for research purposes. To this end, the STREAM project (www.streamproject.ugent.be), carried out by historians and geographers at Ghent University and the Vrije Universiteit Brussel, is currently systematically collecting a range of key data from a diversity of historical sources in order to provide a geographically comprehensive and long-run quantitative and spatial account of early modern society at the level of localities (parishes, villages, towns). 6 As such, STREAM joins in with and contributes to recent developments in digital and spatial history. Its goal is to facilitate and innovate historical research into local and regional processes through the development of a spatiotemporal research infrastructure for early modern Brabant and Flanders that allows for spatial analysis of key historical data.
The Duchy of Brabant and the County of Flanders, two of the most urbanized and developed areas of pre-industrial Europe, are regions par excellence to tackle new research questions and re-examine 'old questions', such as the debate on the birth of modern economic growth and our understanding of the preconditions of Europe's leading economic and demographic role in the eighteenth and nineteenth century.

The development of the STREAM spatial data infrastructure involves two strands of data collection. On the one hand, a series of high-quality datasets is compiled relating to the population and the social and economic structure of the Duchy of Brabant and the County of Flanders between 1550 and 1815. On the other hand, a geographical information system is developed to spatially structure and map the historical data, which also allows for manipulation and analysis of the data at different spatial levels, from parishes to bishoprics, castellanies and counties. Working backwards from the map of Ferraris (1770-1778), the GIS not only includes administrative boundaries, but also key data on roads and waterways, building density and economic infrastructure in the second half of the eighteenth century. The combination of both will result in a geographically comprehensive and long-run quantitative and spatial account of early modern society at the level of localities. This account will improve our understanding of the timing of regionally and locally differentiated economic, social and demographic developments in the long run and bring novel insight into the origins of economic and demographic growth in Europe in general and Brabant and Flanders in particular. The STREAM infrastructure will be fully operational by 2020.

In what follows, we first provide a short overview of the political and socio-economic history of the Duchy of Brabant and the County of Flanders (section 2). Next, we discuss the data collection (section 3). Finally, we consider the construction of a spatial infrastructure for the pre-statistical era, by means of two tailor-made GIS tools: Ferraris Vectorized and Ferraris Georeferenced (section 4).

2. A REGIONAL APPROACH: BRABANT AND FLANDERS

In the STREAM project, the focus is on the Southern Netherlands and two of its core regions, the Duchy of Brabant and the County of Flanders (figure 1). From 1555 onwards the Southern Low Countries were in the possession of successively the Spanish Habsburgs and, after the Spanish War of Succession (1701–1714), the Austrian Habsburgs. 7 The region was united politically only to the extent that the different regional entities (castellanies, duchies and seigneuries) recognized the same personal ruler, who from 1482 on was a Habsburg. Throughout the early modern period, the Southern Low Countries therefore formed a composite state within the larger territories of the Habsburg dynasties. Habsburg rule ended in 1794, when the Southern Netherlands and the principality of Liège were conquered by France. The Duchy of Brabant and the County of Flanders were two of the most urbanized and developed areas of pre-industrial Europe. Between c. 1500 and 1800, the population in these areas doubled, while economic growth was gradual, though not considerably slower than in Britain.
8 Both regions, comprising 538 and 574 parishes respectively, represent interesting cases for comparative analysis of socio-economic and demographic change in the early modern period, as variations within these regions (together 12,000 km²) were far greater than in many other parts of Europe, with regard to soil types (clay, sandy to loamy soils), rural systems (market-oriented agriculture, subsistence agriculture and proto-industrial areas), urban types (regional market towns, ports, large cities oriented towards urban export), labour relations (from areas with capitalist labour relations to areas where wage labour was only of marginal importance), poor relief institutions (from formalized, elite-controlled to informal, community-based), transport infrastructure (from relatively isolated areas to areas with a 'modern' network of paved roads), military damage (some areas were spared, others heavily targeted), etc. 9 Yet although we already know a lot about specific localities in particular sub-periods, we know much less about the economic and demographic history of these areas on a more comprehensive scale.

Figure 1. Duchy of Brabant and County of Flanders with territorial subdivisions. Source: UGent Quetelet Center, STREAM (2017). Note: The study area of STREAM covers 17 administrative entities with fiscal and judicial powers, not including the territories annexed by France and the Dutch Republic in the seventeenth century. The Duchy of Brabant consists of the quarters (kwartieren) of Antwerp, Brussels and Louvain, whereas the County of Flanders comprises the castellanies (kasselrijen) of Furnes, Ypres, Courtrai, Oudenaarde, the Liberty of Bruges, Oudburg, Vier Ambachten, Land of Aalst, Land of Dendermonde, and Land of Waas.

The data collected via STREAM will allow us to shed light on the geography of the early modern economy and population in Brabant and Flanders, and to improve our understanding of the timing and articulation of regionally and locally differentiated economic, social and demographic developments in the long run. This will permit us to link up with current discussions on the origins of modern economic and population growth and with recent historiographical trends that strongly emphasize a regional approach. The results of the STREAM project thus have the potential to transcend the time- and place-specificity of research, and represent a new step in the socio-economic and demographic history of early modern Europe.

3. LARGE-SCALE DATA COLLECTION

The period before the nineteenth century is known as the 'prestatistic era', implying that aggregated data on demographic, economic and social developments for this period is scarce. Before the late eighteenth century, governments organized few censuses or surveys about the country, its citizens and society in a way that transcended the local level. To discover 'the big picture', historians have to retrieve data by tapping into a wide array of sources and quantifying them. This is arduous for several reasons. First, experience with old terminology, language and writing is required. Second, an abundance of sources containing micro data must be consulted and data must be counted prior to the compilation of usable data series. Moreover, to be able to apply quantitative and computerized research techniques, the original data must comply with strict criteria, such as continuity and regularity. This is not always the case, as historical documents often contain gaps and display a lack of standardization.
Because of these limitations a lot of preparatory work is needed to convert the sources into solid research material. During the last decades, local historians and historical societies have covered and collected a wide range of sources, from status animarum to taxation lists, while numerous genealogists in search of their ancestry have rifled through a multitude of parish registers. Their efforts have led to the creation of many databases documenting baptisms, marriages and burials. None of these data have, however, been collected in a systematic way. No effort has so far been undertaken to bring source collections together, in combination with the systematic collection of raw data to identify the 'blind spots' where further collection is needed. As a result, we lack an overview of the timing and geography of demographic and economic change between the mid-sixteenth and early nineteenth centuries. Through the creation of a series of major high-quality datasets relating to population development and economic change in early modern Brabant and Flanders, STREAM meets this need and will make valuable quantitative data available for early modern research.

In the data collection, STREAM takes into account all quantitative and quantifiable sources on the basis of which regionally diverse datasets can be compiled. The applicability and suitability of the early modern data for scientific research is the main selection criterion for incorporating sources into the project. Priority is given to data collectable from sources at the local level (parishes, villages, towns) for wide geographical areas. It concerns, for instance, the collection and critical assessment of cross-sectional data derived from population and hearth surveys (1571, 1614, 1695, 1702, 1748, 1755, 1796), fiscal registers (1571, 1615, 1748, 1755) and surveys on agriculture (1556, 1709, 1724, 1771), industry (1738, 1764), poor relief (1695, 1702, 1748, 1755, 1808), seasonal migration (1800, 1811) and cadastral surveys (1570, 1686, 1815), together with time series on enumerations of communicants and on annual births, deaths and marriages collected from parish and civil registers between 1550 and 1815. Data regarding the Brabant and Flemish territory are collected by means of the Ferraris Vectorized tool (see next section). Still, STREAM is not limited to collecting, cleaning and harmonizing early modern data into one extensive repository; it also consists of a critical appraisal of the data by drawing up a meta-description and source criticism. As such, it will allow users to assess changes in data definitions and discuss issues of data comparability.

Although the STREAM infrastructure is work in progress, the data collected so far already demonstrate its exciting analytical possibilities. 10 By way of example, we refer to two very recent studies. A study by Devos and Van Rossem was able to highlight the tremendous variations in mortality levels across Brabant and Flanders at successive points in time by mapping the annual numbers of burials and baptisms gathered for ca. 330 parish registers via STREAM. As a result, certain areas with distinct health experiences could be identified, in particular the exceptionally high mortality in the coastal areas. The authors pointed to topography as an important determinant of the differences. 11
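To make the kind of aggregation behind such mortality mapping concrete, here is a minimal sketch (not STREAM code) of computing a burial/baptism ratio per parish with pandas; the file name and column names are hypothetical.

```python
# Minimal sketch: aggregating parish-register time series into a
# burial/baptism ratio per parish, the kind of indicator mapped in the
# mortality study described above. Column names are hypothetical.
import pandas as pd

# columns: parish_id, year, baptisms, burials
records = pd.read_csv("parish_registers.csv")

totals = records.groupby("parish_id")[["baptisms", "burials"]].sum()
totals["burial_baptism_ratio"] = totals["burials"] / totals["baptisms"]

# Flag parishes where burials structurally exceed baptisms, a crude marker
# of the excess mortality observed in the coastal areas.
high_mortality = totals[totals["burial_baptism_ratio"] > 1.0]
print(high_mortality.sort_values("burial_baptism_ratio", ascending=False).head())
```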
Likewise, correlating STREAM datasets on poor relief, occupational structure and land holdings, Van den Broeck, Lambrecht and Winter examined geographic trends in relief income and expenditure. They concluded that parishes in the coastal areas, dominated by capital-intensive commercial agriculture, were more likely to have high relief incomes, which were distributed among relatively few people. Conversely, parishes characterized by subsistence-oriented and proto-industrial cottagers or small-scale independent farmers, which were more prominent in the inland areas, were more likely to have less relief income, more relief recipients, and lower hand-outs per recipient. They claim therefore that in early modern Flanders and Brabant different roles were assigned to poor relief: as an instrument for labour regulation or one for social cohesion. 12

4. SPATIAL DATA VISUALIZATION AND ANALYSIS: GIS

Since maps provide an extremely powerful way to organize, investigate and visualize data, STREAM is developing a geography-driven data infrastructure: a historical geographic information system. This HISGIS functions as the backbone of the project: it is responsible for the storage, linkage, editing and presentation of the historical data at the different spatial scales, from parishes to bishoprics, castellanies and counties. It entails, in fact, a tailored digital infrastructure which will allow us to place the historical data within their local geographic context – a necessary prerequisite for detecting regional patterns and temporal changes.

The geographic component of the STREAM infrastructure is based on a full-coverage base map, more specifically the map drawn up between 1770 and 1778 by count Joseph Jean François de Ferraris, who was assisted by a staff of about 70 soldiers. The Ferraris map is currently being manually vectorized by the STREAM team and subsequently compared with recent topographic maps to adjust for geometrical deviations. To speed up the digitization and georeferencing of the Ferraris map, we created two GIS tools: one for vectorizing the scanned Ferraris map, called Ferraris Vectorized, and one for georeferencing the vectorized Ferraris maps, called Ferraris Georeferenced. In what follows, we first describe the context in which the original Ferraris map was created and the difficulties the map entails for current research. Subsequently, we describe the Ferraris Vectorized tool and the development of the spatial infrastructure. This section ends with a short explanation of the Ferraris Georeferenced tool.

4.1 CARTE DE CABINET OF COUNT JOSEPH DE FERRARIS (1770-1778)

The starting point for the development of the STREAM HISGIS is the Ferraris map, or in full the Carte de Cabinet des Pays-Bas Autrichiens levée à l'initiative du comte de Ferraris (figure 2). This 1:11,520 map of the Austrian Netherlands and the Prince-Bishopric of Liège covers more or less the current territory of Belgium and is as such the only source that provides a very detailed 'national' overview of the delineation and topography of all localities in the late eighteenth century. For historians and geographers, the Ferraris map constitutes a primary source of information on the pre-industrial landscape. 13

Figure 2. Details of the Ferraris map (1770-1778), Mechelen and Liège. Source: Carte de Cabinet des Pays-Bas Autrichiens levée à l'initiative du comte de Ferraris.
The Carte de Cabinet was drawn up between 1770 and 1778, following the example of the new map of France led by César François Cassini de Thury (Cassini map, 1756-1789). 14 Three copies of the Ferraris manuscript were produced, each consisting of 275 multicolored sheets (0.9 × 1.4 meters each) 15 : one for Empress Maria Theresia (now preserved in The Hague, National Archives), one for Charles de Lorraine, governor of the Austrian Netherlands at the time (now preserved in Brussels, Royal Library of Belgium) and one which was to be preserved in the Chancellery of Court and Nation in Vienna (now preserved in Vienna, Kriegsarchiv) (Bracke 2009). The Viennese copy, which was the first to be drawn up, is used as the starting point of the STREAM project.

The hand-drawn Ferraris map contains a lot of historically interesting information. For one thing, it shows the administrative and legal situation of the Austrian Netherlands at the end of the early modern period, before the reforms imposed by the French Republic in the 1790s. The borders of counties, duchies and seigneuries are indicated, as well as the rough size of parishes. To this end, each parish church was assigned a number. The same number was written in the houses whose residents belonged to that parish. The map also gives an idea of the extent of urbanization in the Austrian Netherlands. Rural houses were depicted by small red rectangles or squares, while houses in towns and cities were drawn as part of residential blocks (instead of separately). In addition, the map shows the complex network of paved and unpaved roads, exit roads of polders, waterways, bridges, etc. From the symbols for pine forests, orchards, hedges, meadows, fields, etc. we get an indication of eighteenth-century topography. Because of its level of detail, the Ferraris map is of invaluable importance for contemporary historical and geographic research.

Nonetheless, the Ferraris map poses some difficulties for contemporary use. An important obstacle is the interpretation of the symbols used by eighteenth-century cartographers. These cartographic symbols were either self-evident or conventional. For example, a church was drawn as a little church and gallows were drawn as they were in reality. Other geographic indicators, such as markings of land cover, were based on conventions. Land and forest were so common on maps that an easy style for representing them was necessary (figure 3). 16

Figure 3. Symbols on the Ferraris map for different types of vegetation: bushes, high-trunk trees and pines. Source: Carte de Cabinet des Pays-Bas Autrichiens levée à l'initiative du comte de Ferraris, sheets 2 (Nieuwpoort) and 188 (Rekem).

At the time when Ferraris's Carte de Cabinet was produced, the need for a legend was not taken for granted. In the case of Ferraris, there is a written key included in the introduction of Ferraris's Mémoires, which accompany the Carte de Cabinet (Eclaircissement). However, as this is only a text without visual representation, it is not always straightforward for present-day scholars to distinguish between symbols, especially as they were not always standardized. This probably has to do with the way in which the map sheets were drafted. Usually, the surveyor in the field gave only an indication of the semiotics on his planchet (for example, a 'p' for prez, pasture, and a 'b' for bois, forest), while a cartographer completed the map in his office.
As several cartographers, each with their own individual style, continued the work, differences between the sheets could occur. Additionally, we notice that the level of detail decreased as more sheets were finished. 17 Fortunately, this project can benefit from legend-making initiatives by contemporary scholars. 18

A second difficulty relates to the determination of borders. This was as much a problem for eighteenth-century cartographers as it is for scholars today. One of the biggest concerns of Ferraris's contemporaries was a correct representation of borders. The feudal chaos that characterized the frontiers in Ferraris's time was so complex that surveyors made mistakes and considered parts of free states, such as Liège or Stavelot, to be Austrian territories. Through the courtesy of the prince bishop of Liège, the prince abbot of Stavelot and the rulers of the many free states, a rectification of the border sections of the map was carried out between 1777 and 1779. As a result, the Carte de Cabinet often shows a double pattern in the border symbols because the borders were corrected later. 19

In the development of a GIS for the period before 1800, we are not only faced with difficulties in interpreting the eighteenth-century borders; the construction is also complicated by the fact that early modern units of administration, be they ecclesiastical (parishes and bishoprics) or civil (counties, castellanies, etc.), are different from the nineteenth-century administrative units (municipalities, departments and provinces). The different content of eighteenth- and nineteenth-century administrative units entails difficulties in merging, on the one hand, eighteenth- and nineteenth-century data into a longitudinal dataset and, on the other hand, nineteenth-century data with eighteenth-century maps (and vice versa). 20

4.2 FERRARIS VECTORIZED TOOL

The STREAM HISGIS is based on the digitization of the Ferraris map. To speed up the digitization process, we developed an online GIS tool for vectorizing the scanned Ferraris map, called Ferraris Vectorized. Vectorizing means that we convert raster data (in this case, scans of Ferraris map sheets) to vector data (a series of digital points, lines, and polygons). Points are used to indicate the position of buildings or structures such as houses, churches, chapels, bridges, mills and gallows. Lines are used to show the geometry of linear features such as different types of roads, rivers, canals and shorelines, but also administrative boundaries and territorial circumscriptions. Polygon features are enclosed areas like forests, pastures, domains and city centers. All these different features put together in digital form give us a detailed picture of the Belgian territory in the second half of the eighteenth century (figure 4).

The Ferraris Vectorized tool is custom-made to facilitate and accelerate the vectorizing process. Compared to conventional GIS systems, Ferraris Vectorized is more user-friendly due to its simplicity (few buttons), the easy linking of attributes to map elements through standardized drop-down menus, and the flexibility with which users can switch between different map layers. The tool is accessible from any web browser and requires no additional software. The result is non-georeferenced vector files in GeoJSON format, which can be further processed in any standard GIS environment.
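As a rough illustration of the kind of non-georeferenced GeoJSON output described above (the feature types and attribute names here are hypothetical, not the actual Ferraris Vectorized schema):

```python
# Illustrative sketch of non-georeferenced GeoJSON output, similar in spirit
# to what a vectorization tool might emit; the attribute vocabulary is
# hypothetical, not the project's actual schema.
import json

features = [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [1024.5, 768.2]},  # pixel coords on the scan
     "properties": {"object_type": "church", "parish": "Nieuwpoort"}},
    {"type": "Feature",
     "geometry": {"type": "LineString", "coordinates": [[0, 10], [55, 12], [90, 40]]},
     "properties": {"object_type": "paved_road"}},
    {"type": "Feature",
     "geometry": {"type": "Polygon",
                  "coordinates": [[[5, 5], [5, 25], [30, 25], [30, 5], [5, 5]]]},
     "properties": {"object_type": "forest", "vegetation": "high-trunk trees"}},
]

with open("ferraris_sheet_002.geojson", "w") as f:
    json.dump({"type": "FeatureCollection", "features": features}, f, indent=2)
```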
Figure 4. Vectorization process by means of the 'Ferraris Vectorized' tool: scan of the Ferraris map (raster data), points (vector data), lines (vector data) and the resulting vector file.

Vectorization entails two types of difficulties. First, we are faced with difficulties in interpreting the symbols on the scanned Ferraris map sheets (see previous section). Second, topological problems arise from the vectorization process itself. There are different types of topological errors and they can be grouped according to whether the vector feature types are polygons or lines. Topological errors with polygon features can include unclosed polygons, gaps between polygon borders or overlapping polygon borders. A common topological error with line features is that they do not meet perfectly at a point (node). This type of error is called an undershoot when a line feature such as a river does not exactly meet another feature to which it should be connected, and an overshoot if a line ends beyond the line it should connect to. The result of overshoot and undershoot errors are so-called 'dangling nodes' at the end of the lines. Dangling nodes are acceptable in special cases, for example if they are attached to dead-end streets. Figure 5 demonstrates what undershoots and overshoots look like. Over- and undershoot errors most often occur when different map sheets are connected to each other.

Figure 5. Examples of an undershoot error and an overshoot error.

Topological errors break the relationship between features. These errors need to be fixed in order to be able to analyze vector data with procedures like network analysis (e.g. finding the best route across a road network) or measurement (e.g. finding out the length of a river). The technique most often used to fix over- and undershoot errors is snapping. Snapping is an automatic editing operation in which points or features within a specified distance of other points or features are moved to match or coincide exactly with each other's coordinates. This cleaning process, together with the correction of some man-made mistakes such as misidentifications (wrong parish name, wrong type of object), contributes to the time-consuming character of this project.
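A minimal sketch of the snapping operation, using the Shapely library rather than the project's actual cleaning pipeline: an undershooting road segment is moved onto the river vertex it should meet, within a given tolerance.

```python
# Sketch of snapping with Shapely: a road that falls 0.8 units short of the
# river it should connect to (an undershoot) is snapped onto the river's
# nearest vertex, removing the dangling node.
from shapely.geometry import LineString
from shapely.ops import snap

river = LineString([(0, 0), (50, 0), (100, 0)])
road = LineString([(50, 40), (50, 0.8)])  # undershoot: ends short of the river

cleaned_road = snap(road, river, 1.0)     # tolerance of 1.0 coordinate unit
print(cleaned_road)                       # LINESTRING (50 40, 50 0)
print(cleaned_road.touches(river))        # True: the dangling node is gone
```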
4.3 FERRARIS GEOREFERENCED TOOL

In order to increase STREAM's potential for spatial analysis, we need to relate the collected quantitative data to topographical elements and boundaries as attribute data. This is known as georeferencing. It entails that the historical geographic data, in this case from the Ferraris map, are assigned to a known coordinate system so they can be viewed, queried, and analysed together with other geographic data. Therefore, a second GIS tool was built by the STREAM team, Ferraris Georeferenced, to compare the line data – the roads – from the vectorized Ferraris map (made with Ferraris Vectorized) with the current network of roads and as such create a georeferenced map of the road system that existed between 1770 and 1778 in Brabant and Flanders.

Ferraris Georeferenced is a specific web-based editor tailored to STREAM's needs and designed for user-friendliness. 21 As STREAM consists of a collaboration between researchers from different disciplines, Ferraris Georeferenced was custom designed by UGent geographers – as opposed to extending existing GIS tools like QGIS or ArcGIS. Consequently, researchers do not have to immerse themselves in all the functions available in a typical GIS platform, which usually results in a slow and limited learning curve. Ferraris Georeferenced, for example, allows users to easily compare different source images (historical maps, aerial imagery, soil maps, …) and multiple users can work simultaneously without conflicts. Different historical maps were entered into the Ferraris Georeferenced tool: (1) a map of the current network of roads, provided by the Belgian National Geographic Institute (NGI) in the Lambert projection; (2) this contemporary map layer is placed over the vectorized Ferraris map for comparison; (3) other maps such as the Vandermaelen and Popp maps (nineteenth-century cadastral maps), for which vectorized maps are available, were imported as additional map layers. By means of a slider, it is possible to compare the eighteenth-century (Ferraris) with the nineteenth-century (Popp and Vandermaelen) and twentieth-century (NGI) road systems.

In the comparison, we start from the current roads map (NGI). Each map sheet of the Carte de Cabinet is roughly positioned onto the modern map through an affine transformation of its four vertices. 22 Once the map sheet is positioned, the current road network is visualized by purple vector lines, indicating that they are in a pending state. When there is a clear resemblance between a present road and a road on the Ferraris map, based on visual interpretation, the purple line is turned into a green one by right-clicking the segment of the road and choosing 'confirm'. If a current road does not correspond to a road on the Ferraris map, this road is turned orange by right-clicking the segment and choosing 'delete'. By choosing 'confirm' or 'delete' for every road segment of the current network, we get a clear picture of which roads were present during the time the Ferraris maps were made. A fourth option is when there is a road present on the Ferraris map that has disappeared from the current network. In that case the missing road has to be digitized.

During the process of comparison, a few problems can occur. A first complication appears when part of the current road corresponds to the old one and another part does not. In that case, a drawing tool allows the user to split the road so that the newly created segments can be analyzed separately. Every segment can be split as many times as necessary. A second problem arises when certain roads do not correspond perfectly, but the general shape is present on the Ferraris map. In that case it is useful to consult the Popp map. This cadastral map was edited by Philippe Christian Popp in the nineteenth century and resembles the current road system in great detail. This map was integrated in the software as a separate layer which can be inspected by sliding the Ferraris layer to the left (figure 6).

Figure 6. Comparison with the Popp map (by means of a scrollbar).

In case the current roads correspond perfectly with those on the Popp map, these roads are selected as 'confirm', since it can be assumed that the differences on the Ferraris map are due to measuring flaws. If the Popp map shows little resemblance to the current roads, then these roads are selected as 'delete', since the presence of the general shape on the Ferraris map will then be due to mere coincidence. Since the Popp map is dated after the Ferraris map, it is safe to assume that current roads that are missing on the Popp map could not have existed during the time the Ferraris maps were made either. In this respect, it is easier to first delete all the roads from the current network that do not correspond to the Popp map, given that this map is much less difficult to read than the Ferraris map.
A third problem is the subjective nature of the interpretation. Even when certain roads seem to correspond with one another, it is still possible that this correspondence is an unfortunate coincidence. Decisions based on visual resemblance or difference therefore need confirmation by combining the visual interpretation with analyses of other cartographic documents.
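The rough positioning of each map sheet mentioned above rests on a standard affine estimation. A minimal sketch, with entirely hypothetical coordinates, of fitting the six affine parameters from the four sheet corners by least squares:

```python
# Sketch (hypothetical coordinates): estimate the 6-parameter affine transform
# that roughly positions a Ferraris sheet onto the modern (Lambert) map, from
# the four sheet corners and their manually identified modern counterparts.
import numpy as np

# Sheet corners in scan pixel coordinates and their modern map equivalents.
src = np.array([(0, 0), (9000, 0), (9000, 14000), (0, 14000)], dtype=float)
dst = np.array([(152300, 201150), (157800, 200900),
                (158100, 192400), (152600, 192650)], dtype=float)

# Solve dst = [x, y, 1] @ A for the affine coefficients A in a
# least-squares sense (8 equations, 6 unknowns).
ones = np.ones((len(src), 1))
X = np.hstack([src, ones])                   # shape (4, 3)
A, *_ = np.linalg.lstsq(X, dst, rcond=None)  # shape (3, 2)

def to_modern(x, y):
    """Map a point from sheet coordinates to modern map coordinates."""
    return np.array([x, y, 1.0]) @ A

print(to_modern(4500, 7000))  # roughly the sheet centre in Lambert coordinates
```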
5. CONCLUSIONS

The origins of modern economic and population growth have been the subject of longstanding international debate. At the centre of current discussions are the demographic and economic developments in the early modern period, when the foundations of modern growth were laid. Most of these studies follow either a macro-approach based on 'national' estimates that ignore regional differences or a micro-approach using analysis of households and individuals of which the representativeness remains unclear. In recent years, however, a number of methodological and conceptual innovations have opened up new directions and approaches in research. STREAM capitalizes on three of these promising new avenues for research, which we briefly discussed in this article:

(1) A regional approach. Recent research has increasingly argued that questions concerning the interaction between demographic and economic developments in early modern Europe can best be tackled with regional analysis, since social relations and economic activities in pre-industrial times were predominantly articulated regionally rather than nationally. By collecting data at the local level and analysing from the local level upwards (bottom-up approach), it is possible to investigate whether and which broader regional dynamics can be discerned. In the STREAM data collection, priority is given to data collectable from sources at the local level (parishes, villages, towns) for wide geographical areas in the Duchy of Brabant and the County of Flanders. Brabant and Flanders, as two of the most urbanized and developed areas of pre-industrial Europe, represent interesting cases for comparative analysis because of the large socioeconomic and demographic variations within these regions. Moreover, fundamental economic transitions, such as the transformation of the agrarian sector and the rise of industrial production, started early here.

(2) The deployment of large-scale databases. This development is rapidly gaining ground in historical research and is transforming the field of socio-economic and demographic history (Digital History). So far, most exciting databases in the rapidly expanding field of digital history are based on published sources, of which digitization can be partly automated. However, for early modern socio-economic and demographic history, most relevant data are stored only on paper in manuscript form in various archives. Through the implementation of key data from a diversity of sources related to the early modern population and social and economic structure, as illustrated in this article, STREAM will protect and make accessible a multitude of historical data for diverse research applications.

(3) The development of a historical geographic information system. This geographical approach to history (Spatial History) has proved its usefulness and reliability over the past decade and has had a significant impact on the progress of historical research. Spatial analysis enables historians not only to visually present their research results, but more importantly to use space to integrate, collect and interpret historical data in new ways. As most early modern source materials and maps are available only on paper, historians have been slow to bring a geographical dimension to analyses of the pre-industrial world. Starting from what is considered the first topographic map available for the Southern Netherlands, the Ferraris map (1770-1778), STREAM is developing a historical geographic information system which will allow us to place the early modern data within their local geographical context – a necessary prerequisite for detecting regional patterns and temporal changes. The HISGIS is developed by means of two GIS tools: Ferraris Vectorized and Ferraris Georeferenced. The objectives of the two tools are complementary: Ferraris Georeferenced comprises the digitisation of the historical road network in a geographic reference system, whereas Ferraris Vectorized aims at digitizing all topographic characteristics – roads, buildings, administrative and judicial entities, etc. – of eighteenth-century Brabant and Flanders. Together, the tools will enable us to study the historical evolution of infrastructure and to make spatio-temporal analyses of the demographic, social and economic data for Flanders and Brabant in order to improve our understanding of regionally differentiated economic, social and demographic developments during the early modern period.

1 See special issue of Journal of Interdisciplinary History, 42 (2011).
2 L. Shaw-Taylor and E.A. Wrigley, 'Occupational structure and population change', in R. Floud, J. Humphries and P. Johnson, eds., The Cambridge Economic History of Modern Britain (Cambridge, 2014), 53-88.
3 S. Broadberry et al., British economic growth 1270-1870 (Cambridge, 2015); R. Allen, The British Industrial Revolution in Global Perspective (Cambridge, 2009); J. Mokyr, The Enlightened Economy: an Economic History of Britain (New Haven, 2009).
4 L. Shaw-Taylor, Male Occupational Change and Economic Growth, 1750-1851, End of Award Report (available via CAMPOP, 2007).
5 E.A. Wrigley, 'The Region as a Unit of Study. History and Geography in Harmony', Romanian Journal of Population Studies, 7 (2013), 107-119.
6 Funded by the Hercules Foundation (Medium-Scale Research Infrastructure), Research Foundation – Flanders (FWO).
7 The Northern Low Countries became an independent Republic during the Eighty Years' War (1568-1648). As a result, Brabant was split into two parts, the north belonging to the Dutch Republic, the south belonging to the Spanish Netherlands.
8 P. Klep, 'Population Estimates of Belgium, by Province (1375-1831)', in: E. Hélin, ed., Historiens et Populations. Liber amicorum Etienne Hélin (Brussels, 1991), 498-507; J. Blomme and H. Van der Wee, 'The Belgian Economy in a Long-Term Historical Perspective: Economic Development in Flanders and Brabant, 1500-1812', Proceedings of the eleventh international economic history congress (Milan, 1994).
9 I. Devos, T. Lambrecht and R. Paping, 'The Low Countries, 1000-1750', in I. Devos, T. Lambrecht and E. Vanhaute, eds., Rural Economy and Society in North-Western Europe, 500-2000: Making a Living: Family, Income and Labour (Turnhout, 2011).
10 Some of its research potential was recently demonstrated in two conference sessions devoted entirely to STREAM: in April 2016 at the European Social Science History Conference in Valencia and in September 2016 at the conference of the European Society for Historical Demography in Leuven.
11 I. Devos and T. Van Rossem, 'Oud, ouder, oudst. Regionale en lokale verschillen in sterfte in het graafschap Vlaanderen tijdens de zeventiende eeuw', Jaarboek De Zeventiende Eeuw, (2017), 39-53.
12 N. Van den Broeck, T. Lambrecht and A. Winter, 'Preindustrial welfare between regional economies and local regimes: Rural poor relief in Flanders around 1800', Continuity and Change, in press.
13 For a detailed discussion, see S. Vervust, Deconstructing the Ferraris Maps (1770-1778). A study of the map production process and its implications for geometric accuracy (Ghent University, Department of Geography, doctoral dissertation, 2016).
14 A digitization (vectorization) of the Cassini map is currently carried out by the GeoHistoricalData project (https://www.geohistoricaldata.org/). See J. Perret, M. Gribaudi and M. Barthelemy, 'Roads and Cities of 18th Century France', Scientific Data, 2 (2015).
15 In addition, a smaller-scale (1:86,400) engraved map in 21 sheets was drawn up from the Carte de Cabinet. This less detailed map is known as the Carte Marchande or Carte Chorographique and was intended for a larger audience.
16 K. De Coene, T. Ongena, F. Stragier, S. Vervust, W. Bracke and P. De Maeyer, 'Ferraris, the Legend', Cartographic Journal, 49 (2012), 30-42.
17 K. De Coene et al., 'Ferraris, the Legend', 30-42.
18 K. De Coene et al., 'Ferraris, the Legend', 30-42; M. Beyaert, M. Antrop, P. De Maeyer, C. Vandermotten, C. Billen, J.-M. Decroly, C. Neuray, T. Ongena, S. Queriat, I. Van Den Steen and B. Wayens, België in kaart. De evolutie van het landschap in drie eeuwen cartografie (Tielt, 2006); W. Bracke, De grote atlas van Ferraris (Tielt, 2009).
19 K. De Coene et al., 'Ferraris, the Legend', 30-42.
20 The LOKSTAT project (www.lokstat.ugent.be), developed at Ghent University during the past decade, has systematically collected a range of quantitative data from nineteenth- and twentieth-century censuses, alongside the borders of nineteenth-century municipalities. However, the LOKSTAT territorial subdivision cannot be used to organize eighteenth-century data, as parish boundaries do not necessarily correspond to municipal boundaries.
21 P. De Maeyer, E. Ranson, K. Ooms, K. De Coene, B. De Wit, M. Van den Berghe, S. Vrielinck, T. Wiedemann, A. Winter, R. Kruk and I. Devos (2018), 'User-centered design of a collaborative object oriented historical GI-platform', in: The Dissemination of Cartographic Knowledge, Springer ICA series, in press.
22 In some cases it would be useful to rotate as well as move the map. In the future it would therefore be practical to implement a rotation tool.

work_ebmpwo5ymjb2jnp6d7kgtlbkna ---- Perseids: Experimenting with Infrastructure for Creating and Sharing Research Data in the Digital Humanities

Almas, B 2017 Perseids: Experimenting with Infrastructure for Creating and Sharing Research Data in the Digital Humanities. Data Science Journal, 16: 19, pp.
1–17, DOI: https://doi.org/10.5334/dsj-2017-019

PRACTICE PAPER

Perseids: Experimenting with Infrastructure for Creating and Sharing Research Data in the Digital Humanities

Bridget Almas, Perseids Project, Tufts University, 5 The Green, Medford, MA 02155, US. balmas@gmail.com

The Perseids project provides a platform for creating, publishing, and sharing research data, in the form of textual transcriptions, annotations and analyses. An offshoot and collaborator of the Perseus Digital Library (PDL), Perseids is also an experiment in reusing and extending existing infrastructure, tools, and services. This paper discusses infrastructure in the domain of digital humanities (DH). It outlines some general approaches to facilitating data sharing in this domain, and the specific choices we made in developing Perseids to serve that goal. It concludes by identifying lessons we have learned about sustainability in the process of building Perseids, noting some critical gaps in infrastructure for the digital humanities, and suggesting some implications for the wider community.

Keywords: infrastructure; digital humanities; data sharing; interoperability; research data

Overview

The Perseids project provides a platform for creating, publishing, and sharing research data, in the form of textual transcriptions, annotations and analyses. An offshoot and collaborator of the Perseus Digital Library (PDL), Perseids is also an experiment in reusing and extending existing infrastructure, tools, and services. This paper discusses infrastructure in the domain of digital humanities (DH). It outlines some general approaches to facilitating data sharing in this domain, and the specific choices we made in developing Perseids to serve that goal. It concludes by identifying lessons we have learned about sustainability in the process of building Perseids, noting some critical gaps in infrastructure for the digital humanities, and suggesting some implications for the wider community.

General Approaches

What constitutes infrastructure, and how does it facilitate data sharing in the domain of DH, and in the Perseids project in particular? According to Mark Parsons, Secretary General of the Research Data Alliance (RDA), infrastructure can be defined as 'the relationships, interactions and connections between people, technologies, and institutions that help data flow and be useful (Parsons 2015).' In the realm of DH, any of the following might be considered infrastructure: original digital collections, linked data providers, general purpose and domain-specific platforms, content management systems (CMSs), virtual research environments (VREs), online tools and services, repositories and service providers, aggregators and portals, APIs and standards. Table 1 provides some specific examples of these in the DH and digital classics (DC) communities, illustrating the diversity and breadth of infrastructure in this community.

Enabling data sharing includes ensuring that data objects have persistent, resolvable identifiers, providing descriptive and structural metadata, providing licensing and access information, and using standard data formats and ontologies. The recent W3C recommendation 'Data on the Web Best Practices' (Loscio, et al. 2016) cites many strategies such as providing version history, provenance information, and data quality information.
Table 1: Examples of infrastructure in digital humanities and digital classics.

Original digital collections: PDL, Papyri.info, NINES, Digital Latin Library, Coptic Scriptorium, Roman de La Rose
Linked data providers and gazetteers: Pleiades, PeriodO, Syriaca.org, VIAF, Getty, Trismegistos, DBPedia
General purpose platforms, CMS, VREs, tools and services: Omeka, MediaWiki, Heurist, TextGrid, Voyant, Mirador, CollateX, JUXTA, Neatline
Domain-specific platforms, CMS, VREs, tools and services: Perseids, Recogito, Symogih, PECE
Repositories and service providers: CLARIN, DARIAH, EUDAT, MLA Commons/CORE, HumaNum, Hathi Trust Research Center, California Digital Library
Aggregators and portals: Europeana, Digital Public Library of America, HuNi, EHRI
APIs and standards: IIIF, OA, TEI, OAUTH, Shibboleth/SAML, CTS

Above and beyond this, ensuring that adequate editorial and/or peer review has taken place before data is shared is often an important criterion for data sharing in the humanities.

Background

Perseids evolved to fill a critical need of the digital classics community of scholars and students (Bodard and Romanello 2016): infrastructure that supports textual transcription, annotation, and analysis at a large scale, with review, in both scholarly and pedagogical contexts. Such infrastructure would give us the ability to work with text-centric publications containing a variety of different data types, and would include:

• stable, persistent identifiers for all publications;
• a versioned, collaborative editing environment;
• the ability to extend the environment with data type-specific behaviors and tools;
• customized review workflows.

Perseids is, in part, a successor to a prior ambitious, but ultimately unsuccessful, infrastructure effort in the humanities, Project Bamboo (Dombrowski 2014). One of the aims of Project Bamboo was to develop a Service Oriented Architecture (SOA) that could serve a wide variety of use cases and requirements for textual analysis and humanities research. This accorded with the goal of the PDL: to begin to decouple the many services making up the Perseus 4 application, so that they could be recombined and reused to build new applications (Almas 2015). The PDL's contribution to Bamboo included development (and implementation) of service APIs for morphological analysis and syntactic annotation. These services, intended to be shared on the Bamboo Services Platform, reused code from two main sources: the PDL's web application and the Alpheios Project's reading environment, and were designed to be easily extended to serve additional languages and use cases. They provided essential functionality for textual analysis and annotation.

At the same time, we also began separately investigating development of a scalable solution for engaging undergraduate students in the production of original transcriptions and translations of medieval Latin manuscripts and Greek epigraphy. This work was inspired by, and involved reuse of architecture and tools from, two major projects in digital classics, the Homer Multitext and Papyri.info (Almas and Beaulieu 2013).

One thing that prevented Bamboo from succeeding was the assumption that scholars would be willing to give up their domain-specific tools and services for a more general infrastructure to which everyone would contribute (Dombrowski 2015). Humanities use cases at the time appeared too diverse for that, and technologies were moving very fast. It is unclear whether or not Bamboo could have succeeded, but the project ended before we could develop a critical component needed for our own use cases: a platform for management of the data and scholarly workflow which would allow for full peer and professorial review.

Perseids took up in part where Bamboo left off, but with a more modest goal of providing infrastructure for our own specific set of use cases. We reused the services we built for Bamboo in Perseids, and also reused
It is unclear whether or not Bamboo could have succeeded but the project ended before we could develop a critical component needed for our own use cases, a platform for management of the data and scholarly workflow which would allow for full peer and professorial review. Perseids took up in part where Bamboo left off, but with a more modest goal of providing infrastructure for our own specific set of use cases. We reused the services we built for Bamboo in Perseids, and also reused http://www.syriaca.org/ Almas: Perseids Art. 19, page 3 of 17 an existing piece of infrastructure from another project, the Son of SUDA Online (SoSOL), to fill the role of managing the data and review workflows. Drawing on the experiences of Bamboo, we decided that Perseids would support a looser coupling of existing tools and services. One goal of infrastructure is to connect what already works, adding value and capacity without reinventing solutions. Our development approach for Perseids was thus based on three principles: 1. data interoperability; 2. flexibility and agility; 3. tool interoperability. We wanted not only to support our scholarly workflows, but also to be sure that the outputs would be fully sharable and preservable. Perseids currently serves an active user base, averaging between one and two thousand sessions by at least five hundred unique users per month during the academic year, the majority of which come from six active DH communities: Tufts, the University of Nebraska at Lincoln, the College of Letters and Science of the Sao Paulo State University, the University of Leipzig, the University of Lyon, and the University of Zagreb. Several external projects also connect to Perseids’s tools and review workflow via its API. Functionality Use Cases Perseids offers functionality for creation, curation and review of texts, translations and annotations. It enables its users to: 1. Create and edit a new text transcription. 2. Edit an existing text transcription. 3. Create and edit a new text translation. 4. Edit an existing text translation. 5. Create and edit a new commentary annotation. 6. Create and edit a new treebank1 annotation. 7. Create and edit a new text alignment2 Annotation. 8. Ingest and edit simple annotation data from external sources. 9. Create and edit simple annotations on texts. The process of creating a publication on Perseids involves workflows fulfilling one or more of these use cases (Figure 1). Workflows A workflow, in this context, is a series of actions carried out by a user to achieve some goal. In a typical workflow on Perseids the user creates a publication containing one or more of the supported data types. She uses an editing tool appropriate to the data type to edit and curate her work and then submits it to a review board for acceptance. For example, she may choose to create and edit a Treebank annotation using the Arethusa editing tool (Figure 2). If the work is being done in the context of a pedagogical assignment, the review board is likely to be made up of the professor and teaching assistants for the class. If the work is being done in the context of a specific project or community, the review board will be composed of peers or expert members of an editorial team (Figures 3 and 4). The ability to support peer-review functionality is a distinguishing feature of the Perseids infrastructure, and an important driving factor behind the architectural decision to built it upon the SoSOL platform. 
Figure 1: The Perseids home screen, showing a variety of data types and actions.

Figure 2: Annotating a treebank in Arethusa.

Figure 3: Perseids user interface – voting on a publication.

Figure 4: Perseids review workflow.

Architecture

The Perseids architecture (Figures 5–7) supports these workflows through a complex sequence of interactions between its core components, hosted tools and services, third-party applications and platforms, and external identity providers and content repositories.

SoSOL is the core of the Perseids platform. It is a Ruby on Rails application, built on top of a Git repository, that provides an open-access, version-controlled, multi-author web-based editing environment that supports working with collections of related data objects as publications. SoSOL was developed for the Papyri.info site by the Integrating Digital Papyrology project, a multi-institution project aimed at supporting interoperability between five different digital papyrological resources (Baumann 2013), and is now maintained jointly by the Duke Collaboratory for Classics Computing and the Perseids project.

A Git repository provides versioning support for all documents, annotations and other related objects managed on the platform. SoSOL also provides additional functionality on top of Git's, including document validation, templates for document creation, review boards, and communities. It uses a relational database (MySQL) to store information about document status and to track the activity of users, boards, and communities. SoSOL uses the OpenID and Shibboleth/SAML protocols to delegate responsibility for user authentication to social or institutional identity providers. Social identity providers (IdP) are supported through a third-party gateway, currently Janrain Engage.

The Perseids deployment of SoSOL incorporates the Canonical Text Services (CTS) protocol. The CTS specification defines an API protocol and a URN syntax for identifying and retrieving text passages via machine-actionable, canonical identifiers (Smith and Blackwell 2012). To support CTS, as well as provide features such as tokenization of texts, the Perseids deployment of SoSOL delegates some functionality to external databases and services.

The SoSOL application itself provides lightweight user interfaces for creating and editing documents and annotations, but in order to support an open-ended set of different editing and annotation activities, we rely on integrations with external web-based tools for editing and annotating. These integrations are enabled by API interactions between the tools and the SoSOL application. The Perseids Client Applications component acts as a broker between the end-user, the SoSOL platform, external repositories and services, and the web-based editing and annotation tools. 3 Built on the Python Flask framework, this component implements a client-side workflow for the creation of new annotations of text passages identified by CTS URN. It uses the CTS abstraction libraries from the CapiTainS infrastructure for CTS URN resolution and processing, as does the Nemo browsing interface, which offers a discovery interface for identifying texts to annotate and an anchoring point for front-end annotation tools and visualizations.

3 The Perseids Client Applications were co-developed by Perseids and the Humboldt Chair for Digital Humanities at the University of Leipzig.
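As an illustration of the CTS URN syntax mentioned above, the following sketch decomposes a URN into its components; it is simplified relative to the full specification (it ignores subreferences and the exemplar level, and assumes a passage component is present).

```python
# Sketch of how a CTS URN decomposes; simplified relative to the full
# CTS specification.
def parse_cts_urn(urn):
    # A passage-bearing URN has exactly five colon-separated fields;
    # passage-less URNs (work identifiers only) would need extra handling.
    prefix, protocol, namespace, work, passage = urn.split(":")
    assert (prefix, protocol) == ("urn", "cts")
    parts = work.split(".")  # textgroup, work, optional version
    return {
        "namespace": namespace,
        "textgroup": parts[0],
        "work": parts[1] if len(parts) > 1 else None,
        "version": parts[2] if len(parts) > 2 else None,  # edition or translation
        "passage": passage,  # a canonical citation or citation range
    }

# Iliad 1.1-1.10 in the Perseus Greek edition:
print(parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1-1.10"))
```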
Figure 5: Perseids infrastructure and ecosystem.

Figure 6: Perseids core components.

A recent addition to the platform is a Flask-based GitHub Proxy Service which enables us to send data directly to external GitHub repositories after it has passed through the review workflow. 4 (See the 'Tool Interoperability' section below for further details on these scenarios.) The role that each component of the architecture and ecosystem plays in supporting the workflow is described in the 'Tool Interoperability' section below.

4 Development of this component was supported by an NEH-funded collaboration with the Syriaca.org project.

Figure 7: Perseids hosted tools and services.

Information Model

Data publications produced on Perseids are collections of related data objects of different types. The SoSOL information model was designed for this type of publication. The 'Publication' is the container for a collection of data objects belonging to a parent abstract class of 'Identifier.' Different object types are implemented as derivations of the 'Identifier' class, which add type-specific behaviors and properties, such as schema validation rules. Figure 8 shows how this design applies in Perseids.

Figure 8: Perseids information model.

However, Perseids publications can also be thought of as research objects (Bechhofer, et al. 2013), where the object of the research is a passage or passages of canonically identifiable text. Figure 9 shows our original vision for a CTS-focused publication on Perseids. 5

5 The vision in Figure 3 has largely been implemented, with the exception of the CITE collections server component. We now expect this function to be filled by an implementation of a multidisciplinary Collections API we are working on as part of the Research Data Alliance's Research Data Collections Working Group.

Tool Interoperability

Decoupling data creation tools from the sources and destinations of the data was a key part of our design approach. APIs and standards are critical components of infrastructure, and integration and sharing require that data be retrievable from and persistable to any source (Hilton 2014). Perseids offers an API for Create, Read, Update, and Delete operations for all data types supported by the platform. API clients can authenticate using the OAuth 2.0 protocol (Hardt 2012), or co-hosted tools have the option of using a shared session cookie. These approaches enable integration with specific tools and services, such as the Arethusa Annotation Framework and the Alpheios Alignment Editor, as well as external projects such as Sematia (Vierros and Henriksson 2016) and the Syriaca.org Gazetteer (Figure 10).

Perseids also uses external APIs to pull data from other infrastructures. We use the Canonical Text Services URN protocol and API (Smith and Blackwell 2012) to identify and retrieve textual transcription, translation, and annotation targets (Figures 11 and 12).
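To make the CTS retrieval step concrete, here is a minimal sketch of a standard GetPassage request; the endpoint URL is an assumption for illustration, not a guaranteed service address.

```python
# Minimal sketch of retrieving a passage via the CTS API's GetPassage request.
import requests

CTS_ENDPOINT = "https://cts.perseids.org/api/cts"  # assumed endpoint, for illustration

params = {
    "request": "GetPassage",
    "urn": "urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1",  # Iliad 1.1
}
response = requests.get(CTS_ENDPOINT, params=params, timeout=30)
response.raise_for_status()

# The reply is an XML document wrapping the requested TEI passage.
print(response.text[:500])
```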
Figure 9: Perseids publication as a CTS-focused research object.

Figure 10: Creating and submitting a publication from an external application using OAuth2.

Figure 11: Sequence of API interactions for creating and editing a CTS-focused annotation template using the Perseids Client Apps and a locally hosted editing tool.

Figure 12: Using the Perseids Client Apps to create a new translation alignment annotation in Perseids for editing via the Alpheios Alignment Editor. Texts available for use are populated via a call to the CTS API.

We also offer a lightweight URL-based API which lets individual scholars and smaller projects, particularly those without the time or skills to develop client software, pull their own data in or integrate Perseids with their applications. Professors such as Robert Gorman at the University of Nebraska-Lincoln (Gorman and Gorman, forthcoming) are using this feature to produce templates for new annotations that they publish on their university Learning Management Systems (LMS). They then include links to Perseids in their syllabi that instruct Perseids to pull the templates from the LMS to create a new annotation publication (Figure 13).

Figure 13: Sequence of actions for creating a publication from an LMS-hosted syllabus and annotation template.

Other applications, such as Digital Athenaeus, use Perseids's URL API to offer links to Perseids with specific content already identified for transcribing, translating, or annotating (Figures 14 and 15). We also implemented a workflow for Marie-Claire Beaulieu's Journey of the Hero course which allows students to use the Hypothes.is annotation tool to annotate named entities and social networks of mythological characters from Smith's Dictionary of Greek Names. This workflow uses the Hypothes.is API to pull the annotations into Perseids for review and publication (Figure 16).
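A sketch of the kind of harvesting step this workflow performs against the public Hypothes.is search API; the target URI and result handling are illustrative assumptions, not the Perseids implementation.

```python
# Sketch of pulling annotations from the Hypothes.is search API; the "uri"
# value is a hypothetical annotated page, not an actual course target.
import requests

resp = requests.get(
    "https://api.hypothes.is/api/search",
    params={"uri": "http://www.example.org/smith-dictionary/hero-entry",
            "limit": 50},
    timeout=30,
)
resp.raise_for_status()

for row in resp.json().get("rows", []):
    # Each row carries the annotating user, the annotated target, and the body.
    print(row["user"], row.get("text", "")[:80])
```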
http://www.syriaca.org/ http://www.eagle-network.eu/wiki file://192.168.1.10/TypeSetting/Silicon%20Chips/UP_Journals/005_DSJ/Application%20CS5.5/2017/dsj-681_almas\ file://192.168.1.10/TypeSetting/Silicon%20Chips/UP_Journals/005_DSJ/Application%20CS5.5/2017/dsj-681_almas\ Almas: Perseids Art. 19, page 13 of 17 Designing for Flexibility and Agility From the outset, we have taken an agile approach to development of Perseids. While we do not use official sprints and strictly scheduled iterations, we approach planning in short increments, guided by a long-term vision and goals. In addition, we aim to deploy features to users as quickly as possible, so that we can get feedback from them. We do this not only for internal-facing features, but also to prototype new integrations with external services and projects. This flexibility allows us to try many things, keeping those that work and prove to be useful and deprecating those that do not. To support this approach, we could not commit to a specific set of hardware requirements in advance, as we needed the flexibility to extend and reduce resources used as development proceeded. We therefore chose to budget for cloud-based resources on the Amazon Web Services (AWS) platform rather than using university IT resources. Full ownership and control over our infrastructure allowed us to experiment with features and integrations that otherwise would not have been possible; however, it did have some drawbacks and unexpected costs. These are described in the ‘Sustainability’ section below. Standards for Data Data Interoperability A strategic principle in our development is to take steps to ensure data interoperability through the use of stable identifiers and standard formats. We use CTS URNs to identify both texts and annotation targets. These URNs can be considered stable identifiers, but do not quite qualify as persistent identifiers as they are not universally resolvable or guaranteed to be available. Other identifier systems, such as Handles (Sun et. al. 2003), are designed for persistence, and one approach we might take in the future to address this would be to map CTS URNs to the Handles (Almas and Schroeder, 2016), but in the absence of this piece of infrastructure, the CTS URNs do provide stable, machine actionable identifiers that are technology independent. We also use other types of stable identifiers within our annotations and texts, including the URIs published by the Pleiades Gazetteer. We are working towards ensuring that any data published by the platform has a persistent identifier as well. We are therefore participating in the Research Data Alliance’s Research Data Collections working group to develop a multidisciplinary, collections-based approach to data management that supports persistent identifiers for the collections themselves, and for the items within a collection. We also strive to use standard data formats and ontologies for our data and to validate all objects against these. The primary data format standards supported on the platform include the TEI Epidoc Schema for textual transcriptions and translations, the Open Annotation protocol for annotations, the ALDT/ALGT schemas for treebank data, the Alpheios Alignment Scheme for translation alignments, and the SNAP ontology for social network annotations. Provenance and Preservation Incorporating provenance information in our publications is an important enabling factor for data sharing. 
Provenance and Preservation

Incorporating provenance information in our publications is an important enabling factor for data sharing. We have taken steps in this direction, for example by supporting the Shibboleth/SAML protocol for authentication on Perseids, in order to ensure a chain of authority for university repository systems. We have also included provenance information for tokenization services and tools in our annotation documents, and have explored models for more comprehensive approaches (Almas, Berti, et al. 2013). However, capturing and recording provenance information reliably across a diverse ecosystem of tools and services is difficult, and we need general-purpose solutions that we can reuse. As articulated by Padilla (2016):

"A researcher should be able to understand why certain data were included and excluded, why certain transformations were made, who made those transformations, and at the same time a researcher should have access to the code and tools that were used to effect those transformations. Where gaps in the data are native to the vagaries of data production and capture, as is the case with web archives, these nuances must be effectively communicated."

We recognize that we currently fall short of meeting these goals and aim to do a more complete job of this in the future.

It is also very important to us that the research data produced with Perseids be preserved. However, our data models and approach to publications are constantly evolving, making coordination with the university library to preserve this data challenging, as the data do not necessarily fit the models the library is already able to support. As a publicly available and open infrastructure, we also have many users from many institutions across the world, and it is not clear what responsibility Tufts, the university hosting the infrastructure, should have for data created by external users. We mitigate this with Perseids by providing links that users can use to access and download their data, and by encouraging them to take responsibility for publishing and preserving it on their own. We continue to explore general models, such as the Research Object (Belhajjame et al. 2015) or BagIt, which will enable users to export data in a format that is ready to store in a repository.

Another question is that of software preservation (Rios 2016). As the Perseids software is under active development, it is difficult to keep the code for digital publications up to date with all the underlying services providing the data (Rios and Almas 2016). We need to plan better for this preservation, including taking into account the need to represent interdependencies between visualizations and the underlying services and software (Lagos and Vion-Dury 2016).

Sustainability

Human and Governance Factors

We have learned much about infrastructure building throughout the course of this project. The technical hurdles to interoperability and sharing are usually much less difficult to overcome than those of social issues, funding, and governance. Even where there was a clear interest in interoperability and it was technically possible, we sometimes failed to implement or sustain an integration because doing so wasn't in the funded mandate of the partner project. This was the case for us with the Recogito application from the Pelagios Project. But even where explicit funding support doesn't exist, interoperability can still succeed if one project can fill a key gap in another, and if there are people willing to champion the effort to ensure its success.
One example was our integration with the EAGLE project, where Perseids provides a review workflow for EAGLE: it was implemented without being a funded deliverable for either project, but it remains to be seen if we can sustain it indefinitely. This is an area where more formal governance structures, such as those offered by larger research infrastructures like CLARIN and DARIAH (Lossau 2012), could be useful. The key challenge for the community is to encourage and support ad-hoc collaborations to get initial solutions working, and then move from there to more formal agreements to ensure sustainability.

Hardware and Software Factors

Laura Mandell talks about the various models being considered for where and how to position DH, and points out that the question of how to support diverse infrastructure needs is still unsolved (Dinsman 2016). A second lesson we have learned from our experience on Perseids is that for development of interoperable infrastructure to succeed and be sustainable, we need better collaborative models for working with our university Information Technology departments and libraries. We knew we needed the flexibility to change our hardware requirements as we developed, and to deploy new code and services quickly to support rapid prototyping. This allowed us to develop and try out new solutions more rapidly than would have been possible had we gone through university policies and procedures, but it also involved a lot of extra system administration work we had not anticipated, leaving us with a somewhat over-complicated infrastructure at the end of the first phase of the project. Accordingly, in the second phase we built in funding for a devops consultant, who helped us move to a fully configuration-managed system, so that the Perseids platform can be deployed easily by others and sustained for the long term. This is a critical characteristic for software-related infrastructure: in order for it to be reproducible by others, building and deploying it must be automated. In hindsight, having such consultancy from the outset would have been beneficial; collaboration between developers and the IT staff responsible for deploying and sustaining software is a more viable model than throwing code 'over the wall' at the end of a project (Arundel 2016). As cloud computing becomes increasingly cost-efficient, and new models of deployment, such as container-based solutions, are introduced, there is a need for models in which university IT departments can partner with projects to provide expertise and facilities (for example, private cloud or container infrastructure, or extending university infrastructure to the public cloud).

Conclusion

With Perseids, we have explored an agile approach to infrastructure development, emphasizing reuse of both software and data. This has been successful on many levels. Reuse of existing infrastructure components leads to collaborations which increase the chances of sustainability, such as the joint maintenance of the SoSOL application. Agile approaches to prototyping cross-project integration also benefit all parties involved. However, transitioning to more formal governance models and increased engagement with host institutions will be essential to longer-term success.

Acknowledgements

The author wishes to thank her colleagues, John Arundel, Frederik Baumgardt, Marie-Claire Beaulieu and Thibault Clérice, for contributing their energy and ideas in reviewing this paper.
Competing Interests

The author has no competing interests to declare.

Author Information

Bridget Almas is the software architect and co-director of the Perseids Project at Tufts University. Bridget has worked in software development since 1994, in roles which have covered the full spectrum of the software development life cycle, focusing since 2007 on the fields of digital humanities and pedagogy. Bridget served as an elected member of the Technical Advisory Board of the Research Data Alliance (RDA) from 2013 to 2015, is currently co-chair of the Research Data Collections Working Group and the Data Fabric Interest Group, and acts as liaison between the Alliance of Digital Humanities Organizations (ADHO) and RDA.

References

Almas, B 2015 The Road to Perseus 5 – why we need infrastructure for the digital humanities. Blog post on the Perseus Updates Blog (18 May 2015). Available at: http://sites.tufts.edu/perseusupdates/2015/05/18/the-road-to-perseus-5-why-we-need-infrastructure-for-the-digital-humanities/.

Almas, B and Beaulieu, M-C 2013 Developing a New Integrated Editing Platform for Source Documents in Classics. Literary and Linguistic Computing, 28: 493–503. DOI: https://doi.org/10.1093/llc/fqt046

Almas, B, Berti, M, Choudhury, S, Dubin, D, Senseney, M and Wickett, K 2013 Representing Humanities Research Data Using Complementary Provenance Models. In: Building Global Partnerships – RDA Second Plenary Meeting, Washington, D.C.: RDA. Available at: https://www.rd-alliance.org/filedepot_download/694/158.

Almas, B and Schroeder, C T 2016 Applying the Canonical Text Services Model to the Coptic SCRIPTORIUM. Data Science Journal, 15: 13. DOI: http://doi.org/10.5334/dsj-2016-013

Arundel, J 2016 Build bridges not walls: devops is about empathy and collaboration. Available at: http://bitfieldconsulting.com/bridges-not-walls.

Baumann, R 2013 The Son of Suda Online. In: Dunn, S and Mahoney, S (Eds.) The Digital Classicist 2013. Offprint from BICS Supplement-122. London: The Institute of Classical Studies, University of London, pp. 91–106.

Bechhofer, S, Ainsworth, J, Bhagat, J, Buchan, I, Couch, P, Cruickshank, D, Delderfield, M, Dunlop, I, Gamble, M, Goble, C, Michaelides, D, Missier, P, Owen, S, Newman, D, De Roure, D and Sufi, S 2013 Why Linked Data is Not Enough for Scientists. Future Generation Computer Systems, 29(2): 599–611. DOI: https://doi.org/10.1016/j.future.2011.08.004

Belhajjame, K, Zhao, J, Garijo, D, Gamble, M, Hettne, K, Palma, R, Mina, E, Corcho, O, Gómez-Pérez, J M, Bechhofer, S, Klyne, G and Goble, C 2015 (May) Using a suite of ontologies for preserving workflow-centric research objects. Journal of Web Semantics, 32: 16–42. DOI: https://doi.org/10.1016/j.websem.2015.01.003

Bodard, G and Romanello, M (Eds.) 2016 Digital Classics Outside the Echo-Chamber. London: Ubiquity Press.

Dinsman, M 2016 (April 24) The Digital in the Humanities: An Interview with Laura Mandell – Los Angeles Review of Books. Available at: https://lareviewofbooks.org/article/digital-humanities-interview-laura-mandell/.

Dombrowski, Q 2014 What Ever Happened to Project Bamboo? Literary and Linguistic Computing, 29(3): 326–339. DOI: https://doi.org/10.1093/llc/fqu026

Gorman, R and Gorman, V forthcoming Approaching questions of text reuse in Ancient Greek using computational syntactic stylometry. Open Linguistics, Topical Issue on Treebanking and Ancient Languages.

Hardt, D (Ed.) 2012 The OAuth 2.0 Authorization Framework, RFC 6749. Available at: http://www.rfc-editor.org/info/rfc6749.
DOI: https://doi.org/10.17487/RFC6749

Hilton, J L 2014 Enter Unizin. EDUCAUSE Review, 49(5).

Lagos, N and Vion-Dury, J-Y 2016 (September 13–16) Digital Preservation Based on Contextualized Dependencies. DocEng 2016. Available at: http://www.xrce.xerox.com/content/download/93294/1307736/file/2016-031.pdf.

Loscio, B F, Burle, C and Calegari, N 2016 (30 August) Data on the Web Best Practices. W3C Candidate Recommendation. Available at: https://www.w3.org/TR/2016/CR-dwbp-20160830/.

Lossau, N 2012 An Overview of Research Infrastructures in Europe – and Recommendations to LIBER. LIBER Quarterly, 21(3–4): 313–329. DOI: https://doi.org/10.18352/lq.8028

Padilla, T 2016 Humanities Data in the Library: Integrity, Form, Access. D-Lib Magazine, 22(3/4). DOI: https://doi.org/10.1045/march2016-padilla

Parsons, M 2015 (22 September) e-Infrastructures & RDA for data intensive science. Available at: https://rd-alliance.org/sites/default/files/attachment/Infrastructures,%20relationship,%20trust%20and%20RDA_MarkParsons.pdf.

Rios, F 2016 The Pathways of Research Software Preservation: An Educational and Planning Resource for Service Development. D-Lib Magazine, 22(7/8). DOI: https://doi.org/10.1045/july2016-rios

Rios, F and Almas, B 2016 Preserving Digital Scholarship in Perseids: An Exploration. Blog post. DOI: https://doi.org/10.5281/zenodo.159569

Smith, N and Blackwell, C W 2012 Four URLs, limitless apps: Separation of concerns in the Homer Multitext architecture. In: Donum natalicium digitaliter confectum Gregorio Nagy septuagenario a discipulis collegis familiaribus oblatum: A Virtual Birthday Gift Presented to Gregory Nagy on Turning Seventy by His Students, Colleagues, and Friends. Boston: The Center of Hellenic Studies of Harvard University.

Sun, S, Lannom, L and Boesch, B 2003 Handle System Overview, RFC 3650. Available at: http://www.rfc-editor.org/info/rfc3650. DOI: https://doi.org/10.17487/RFC3650

Vierros, M and Henriksson, E 2016 Preprocessing Greek Papyri for Linguistic Annotation. Hal-01279493. Preprint. Available at: https://hal.archives-ouvertes.fr/hal-01279493.

Projects, Websites, Software

Alpheios [WWW Document] n.d. Available at: http://alpheios.net/ (accessed 9.29.16).

Alpheios Alignment Editor [Software] n.d. Available at: https://github.com/alpheios-project/alignment-editor (accessed 9.29.16).

Arethusa [Software] n.d. Available at: https://github.com/alpheios-project/arethusa (accessed 9.29.16).

CapiTainS [WWW Document] n.d. Available at: http://capitains.github.io/ (accessed 9.29.16).

Digital Athenaeus – A digital edition of the Deipnosophists of Athenaeus of Naucratis [WWW Document] n.d. Available at: http://digitalathenaeus.org/ (accessed 9.29.16).

EpiDoc Guidelines 8.22 [WWW Document] n.d. Available at: http://www.stoa.org/epidoc/gl/latest/ (accessed 9.29.16).

Flask (A Python Microframework) [WWW Document] n.d. Available at: http://flask.pocoo.org/ (accessed 11.8.16).

flask-github-proxy: Github proxy to push resource to github [Software] n.d. Available at: https://github.com/PonteIneptique/flask-github-proxy (accessed 9.29.16).

Hypothes.is [WWW Document] n.d. Available at: https://hypothes.is/ (accessed 9.29.16).
Journey of the Hero [WWW Document] n.d. Available at: http://perseids.org/sites/joth/#index (accessed 9.29.16).

Morphological Analysis Service Contract Description – v1.1.1 [WWW Document] n.d. Available at: https://wikihub.berkeley.edu/display/pbamboo/Morphological+Analysis+Service+Contract+Description+-+v1.1.1 (accessed 9.29.16).

OpenID Authentication 2.0 – Final [WWW Document] n.d. Available at: http://openid.net/specs/openid-authentication-2_0.html (accessed 11.3.16).

Pleiades Gazetteer [WWW Document] n.d. Available at: https://pleiades.stoa.org/ (accessed 9.29.16).

RECOGITO [WWW Document] n.d. Available at: http://pelagios.org/recogito (accessed 9.29.16).

Research Data Collections WG [WWW Document] n.d. Available at: https://rd-alliance.org/groups/pid-collections-wg.html (accessed 9.29.16).

Sematia [WWW Document] n.d. Available at: http://sematia.hum.helsinki.fi (accessed 9.29.16).

Shibboleth [WWW Document] n.d. Available at: https://shibboleth.net/ (accessed 11.3.16).

Standards for Networking Ancient Prosopographies [WWW Document] n.d. Available at: https://snapdrgn.net/ontology (accessed 9.29.16).

Syntactic Annotation Service Contract Description – v1.1.1 [WWW Document] n.d. Available at: https://wikihub.berkeley.edu/display/pbamboo/Syntactic+Annotation+Service+Contract+Description+-+v1.1.1 (accessed 9.29.16).

Syriaca.org: The Syriac Reference Portal [WWW Document] n.d. Available at: http://syriaca.org/ (accessed 9.29.16).

The Ancient Greek and Latin Dependency Treebank by PerseusDL [WWW Document] n.d. Available at: https://perseusdl.github.io/treebank_data/ (accessed 9.29.16).

The BagIt File Packaging Format (V0.97) [WWW Document] n.d. Available at: https://tools.ietf.org/html/draft-kunze-bagit-08 (accessed 9.29.16).

How to cite this article: Almas, B 2017 Perseids: Experimenting with Infrastructure for Creating and Sharing Research Data in the Digital Humanities. Data Science Journal, 16: 19, pp. 1–17. DOI: https://doi.org/10.5334/dsj-2017-019

Submitted: 10 November 2016    Accepted: 17 March 2017    Published: 18 April 2017

Copyright: © 2017 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.

Data Science Journal is a peer-reviewed open access journal published by Ubiquity Press.

work_ebs62kxihzd2fnlqoaouq5ylii ----

Born-digital archives

EDITORIAL

Thorsten Ries & Gábor Palkó

Published online: 25 March 2019
© Springer Nature Switzerland AG 2019

The first special issue of International Journal of Digital Humanities (IJDH) is about born-digital archives, their preservation, and research perspectives involving born-digital primary records in the humanities.
This is not only a result of the collaboration between the journal's editor-in-chief, Gábor Palkó, Co-Director of the Centre for Digital Humanities at the Eötvös University, who is interested in the practice and theory of digital archives, and the editor of this volume, Thorsten Ries, who conducts research on born-digital dossiers génétiques with digital forensic methods at Ghent University. It is also meant to be a programmatic call to intensify cross-sectoral collaboration between galleries, libraries, archives, and museums (GLAM institutions), digital preservation projects, and humanities research working with born-digital primary records. The born-digital historical record of the present age poses great challenges for archival science, librarianship, museology, and information science on the one hand, and for humanities research on the other, next to offering exciting opportunities. Personal digital archives; the documentation records or datasets of legal, governmental, institutional, scientific, public, and non-governmental organisations; public repositories of digital publications; web archives; and social media archives are incredibly rich, diverse and multi-faceted treasure troves for historians, political scientists, sociologists, philologists, literary scholars, art historians, digital humanists, and researchers from other humanities disciplines. The effort of long-term preservation, curator- and custodianship for these records, and the development of setups, applications and application programming interfaces (APIs) to make them available for research, has been the subject of multiple large, successful international projects in archival science, librarianship, and information science. Landmark projects such as the archiving of the digital collections of Salman Rushdie at Emory University Library (Rockmore 2014; Waugh and Russey Roke 2017); Hanif Kureishi at The British Library (Foss 2017); Friedrich Kittler at the German Literature Archive Marbach/Neckar (Enge and Kramski 2014); Franz Josef Czernin at the Austrian National Library (Catalogue ÖNB, accessed 2018); and the Thomas Kling Archive at Stiftung Insel Hombroich (Ries 2017, 2018), to name but a few, as well as national (e.g. UK, Germany, the Netherlands, Belgium, etc.) and international repositories and web archives (Internet Archive, etc.) with sophisticated frontends such as RESAW (REsearch Infrastructure for the Study of Archived Web materials), the SHINE UK Web Archive interface and the Wayback Machine, are just some of the most visible results of this broad development of born-digital archiving. Memory institutions and international archival and information science projects are very active in addressing fundamental issues of born-digital archiving, such as developing workflows for identification, selection, triage and bibliographic documentation. Management of the sheer data volumes, and curatorship that caters for the fragility and obsolescence of legacy hardware, software and formats of complex, context-dependent digital records, are ongoing challenges.
Key research and development areas in this interdisciplinary sector are the development of preservation formats and workflows that ensure authenticity, fixity, physical as well as logical stability, and accessibility by forensic imaging, virtualisation, emulation and migration, and the development of environments, tools and APIs for secure, controlled access to the archive for researchers. Currently, archival and information science, memory institutions, and archiving projects are working towards interoperable standards and towards making standardised workflows, protocols, expert resources, tools and infrastructure for born-digital curation available to archives, libraries, memory institutions, and projects of all sizes and at all levels. The early beginnings of born-digital archiving practice and applications of digital forensic methodology in libraries and archives are mostly associated with the names of individual archivists, librarians, archival and information scientists, and humanists such as Susan Thomas, Kirschenbaum (2008, 2013, 2016a, b); Kirschenbaum et al. (2009, 2010); Jeremy Leighton John (2012); Duranti (2009); Duranti and Endicott-Popovsky (2010) and Doug Reside (2011a, b, 2017). Since then, we have seen an enormous growth of these efforts in archival research, development and professional practice, which today are orchestrated by large, national and international, often high-level projects such as InterPARES and InterPARES Trust (International Research on Permanent Authentic Records in Electronic Systems, Canada, Europe, international, since 1994, 4th phase); the Digital Preservation Coalition, DPC (Europe, UK, international, 2002–today); Paradigm (Personal Archives Accessible in Digital Media, Europe, UK, 2005–2007); CASPAR (Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval) and Digital Preservation Europe, DPE (Europe, 2006–2009); PLANETS (Preservation and Long-term Access through NETworked Services, Europe, 2006–2010); nestor (Kompetenznetzwerk Langzeitarchivierung, Germany, 2003–2006, 2006–2009, since 2009 self-sustained); PREMIS (Preservation Metadata: Implementation Strategies, USA, since 2003); the CLIR and OCLC research initiatives (Council on Library and Information Resources; Online Computer Library Center, Incorporated, USA, international, 2010: CLIR report; 2013: "Demystifying Born Digital"); VIMM (Virtual Multimodal Museum, Europe, Cyprus, 2016–2019); and the Computational Archival Science working group (international, since 2016). National repositories for born-digital publications, research infrastructures, and web archives are mostly hosted and run by the national library systems of individual countries, and complemented by supranational humanities research infrastructures such as DARIAH (Digital Research Infrastructure for the Arts and Humanities) and CLARIN (Common Language Resources and Technology Infrastructure) in the European context. On a meso-level, however, there seem to be fewer institutions and projects enabling born-digital preservation, curation, and research at the level of smaller archives and individual researchers. We would like to highlight PACKED (Centre of Expertise in Digital Heritage, Belgium, since 2003), the DCC (Digital Curation Centre, UK, 2004–today), the BitCurator project (USA, 2011–2014, now BitCurator NLP) and BitCurator Access (USA, 2014–2016).
It is encouraging to see that, at least every now and then, memory institutions reach out to humanities research in order to collaboratively identify in which digital formats, with which metadata, and through which access tools born-digital records might be most useful for researchers, and to encourage them to find out about the possibilities. Excellent examples are the hands-on exhibition of Salman Rushdie's emulated computer at Emory Libraries (Rockmore 2014), the pilot of a born-digital reading room at the British Library featuring materials from the Hanif Kureishi Archive (Foss 2017), the workshop on born-digital archives access at Wellcome Collection (Sloyan 2018), the inclusion of both humanities researchers and representatives of memory institutions responsible for web archiving in the RESAW network (Winters 2018a), and the Personal Digital Archiving conference series (e.g. the PDA conferences 2017 at Stanford University Libraries and 2018 at Houston, TX). This interdisciplinary and intersectoral collaboration between archival and humanities research, methodological development and practice is of crucial importance, and the humanities certainly need to take Matthew Kirschenbaum's imperative to heart:

Digital archivists need digital humanities researchers and subject experts to use born-digital collections. Nothing is more important. If humanities researchers don't demand access to born-digital materials then it will be harder to get those materials processed in a timely fashion, and we know that with the born-digital every day counts. (Kirschenbaum 2013, 38)

Despite the fact that Kirschenbaum rather stated the obvious when he defined that "the concept of a primary record can no longer be assumed to be coterminous with that of a physical object" and that "electronic texts, files, feeds and transmissions of all sorts are also indisputably primary records" relevant to historical research (Kirschenbaum 2016b, 25:27), humanities researchers still seem to be rather reluctant when it comes to including born-digital primary sources in their research. There is probably no simple answer to the question why this is the case. If we look at personal digital archives, legal and ethical considerations concerning the protection of privacy and personal rights of the data subjects and of third parties, as well as copyright, are probably the most important reasons for the hesitation of humanities researchers (Carroll et al. 2011; Baker 2018). Jane Winters argues that "web archives, and other kinds of born-digital data, do bring the possibility of, and perhaps even necessitate, a radical reframing of humanities research – through their scale, their heterogeneity, their complexity, their fragility", which might not be sufficiently accessible with "the tools and methods available to us at present" (Winters 2018b). Further concerns about born-digital archives, especially web archives, might have to do with inherent biases and misrepresentations introduced through a focus on "significant and/or traumatic events, [...] personal interest and enthusiasm or a serendipitous partnership" that comes with individual archiving efforts, triggered by events or specific research interests (Winters 2018b). Born-digital primary sources (and archives), according to Winters, are different from analogue ones in many ways, and she further makes the point that historians still need to embrace the fact "that a digit[al] manuscript is an object in its own right, with its own context of production" (Winters 2018a).
For this delayed development among historians, she identifies disciplinary and sectoral boundaries as reasons, next to the methodological issues:

One explanation is that while digital history has embraced a range of historical sub-disciplines, and borrowed readily from subjects like archaeology and historical geography, it has largely failed to take account of developments in two crucial areas: library, archive and information studies; and digital preservation. Libraries and archives have necessarily been at the forefront of web archives research and practice. [...] (Winters 2018b)

This diagnosis is indeed consequential. The gap between the progress in born-digital preservation development and archival science research, on the one hand, and (digital) humanities research on the other, needs to be closed, first and foremost in order to enable GLAM institutions, institutional networks and infrastructures to develop their born-digital collections in meaningful ways and to improve preservation formats, curation workflows, repositories, services, and access for researchers. This can only be achieved by cross-sectoral and interdisciplinary collaboration to support active research on born-digital collections. This is precisely what this first special issue of International Journal of Digital Humanities seeks to encourage. The necessary collaboration will benefit from the new European General Data Protection Regulation (GDPR) guidelines, as these regulations provide an excellent basis for GLAM institutions and researchers to establish trust relationships with archive depositors and creators. This will, moreover, encourage depositors to enable research by having their materials preserved, archived, and made available to research in a secure, controlled, and authentic way, with security procedures that empower them as data subjects.

Kirschenbaum and Winters urge those in the field of humanities to embrace the born-digital historical record as an object and primary source in its own right – a claim which, of course, has precursors in media history and theory, historical bibliography, textual scholarship, and digital humanities (see Dahlström 2000; Manovich 2001; Gitelman 2006). This implies the critical appraisal of the born-digital primary record's specific historical materiality, along the lines of philological and forensic disciplines such as diplomatics, palaeography, philology, and analytical bibliography. Since Kirschenbaum's Mechanisms: New Media and the Forensic Imagination (2008) and Duranti's introduction of the concept of digital diplomatics (Duranti 2009; Duranti and Endicott-Popovsky 2010), digital forensic methods and tools, especially bitstream-preserving imaging (also known as forensic imaging), became standard practice in memory institutions for the preservation of digital storage media and born-digital records. Kirschenbaum's seminal definition of formal and forensic digital materiality (2008, 10–11; see also Ries 2018, 389–401) – a selective focus on one dichotomic dimension in the spectrum of digital materiality (for an overview, see Drucker 2013) – conceptually enabled an analytical perspective on the materiality of the physical
While his work on digital materiality is certainly indebted to New Bibliography (Lebrave 2011) and oriented towards a relative physical stability of the forensic record, his more recent theoretical considerations of the born-digital archive seem to rather reflect issues of the instability, context-depen- dency, authenticity, and intangibility of the born-digital historical record as a logical digital object within formal materiality (Kirschenbaum 2013, 2016b). But this also means that this data is fundamentally unstable in the sense that they rest upon the foundations of other data, what is quite literally in the trade known as metadata, in order to be legible under the appropriate computational regiments, which I have previously termed as formal materiality in my own work. (Kirschenbaum 2016b, 30:05) In the terms I put forth in Mechanisms, each access engenders a new logical entity that is forensically individuated at the level of its physical representation on some storage medium. Access is thus duplication, duplication is preservation, and preservation is creation — and recreation. That is the catechism of the .txtual condition, [...]. (Kirschenbaum 2013, 16) As questions of stability, authenticity, and technological context-dependency (on different concepts of authenticity relating to context, see Rogers 2015, 100) and materiality of the born-digital historical record become even more complex for preser- vation and research tasks, the role of archival custodian- and curatorship, digital signing (Blanchette 2012), digital forensic methodology, and context-preservation of complete operating systems by emulation or virtualisation and even computer hardware (Kirschenbaum et al. 2009) becomes even more prominent. In the light of inevitable ageing and obsolescence of hard- and software, bitrot, fading network contexts, and online services going offline, memory institutions already today have to decide accord- ing to which standards and criteria to select relevant materials. They have to decide which aspects of digital objects and their contexts are relevant to future research and have to be preserved in order to achieve an authentically preserved record, and what would be acceptable loss. Is it just the text or the content of a document that has to be preserved, metadata in the document or in the surrounding operating system, contextual material in file folders, the materiality of the complete operating system or file server – as „dead“ system in a forensic, fixed, bit-precise image or emulated at runtime –, or is the hardware or network context an important aspect to be preserved? Or is the experience of contemporary interaction a main factor that needs documentation? Some of the contributions to this special issue of IJDH revolve around these key questions of born-digital archives. The archival and digital forensic perspective sheds light on the specific historicity of the born-digital record. Digital historicity does not only become apparent when one interacts with still functional legacy hard- and software in computing musea, experienc- ing the look and feel of historic operating systems and applications, the today unusual feel of thick cables, old port connectors and adapters, motherboards, controllers and storage media. 
The forensic materiality of the born-digital record, preserved in the form of forensic images and other forensic formats, bears a highly specific signature of historical computing that can best be understood from the vantage point of Jean-François Blanchette's A Material History of Bits (2011). He remarkably takes the perspective of a historian who analyses historical hard- and software architectures, such as the processing and networking stack, and principles such as the layering and modularity of operating systems and applications, read as historical documents of design decisions taken by hard- and software engineers, programmers, and tech companies in their pursuit to overcome the physical constraints of computing by architecture abstraction and error-correction mechanisms, in order to maintain an 'illusion of immateriality' (Kirschenbaum 2008, p. 135). Blanchette stresses that maintaining the illusion of immateriality of resources, and hiding their physical limitations and characteristics from programmers and users, is in itself a resource-intensive, critical and error-prone task that is mostly implemented at the cost of technical 'efficiency trade-offs'.

This purported independence from matter would have two distinct and important consequences: (a) digital information can be reproduced and distributed at negligible cost and high speed, and thus, is immune to the economics and logistics of analogue media; (b) digital information can be accessed, used, or reproduced without the noise, corruption, and degradation that necessarily results from the handling of material carriers of information. [...] Yet, this abstraction from the material can never fully succeed. Rather, it stands in dialectical tension with the evolution of these material resources and with the efficiency trade-offs their abstraction requires. (Blanchette 2011, p. 1042)

Blanchette especially names the efficiency trade-offs implied by modularity, and the efficiency cost of necessary garbage collection and error correction at runtime, as 'design trade-offs inherent in abstracting from physical resources [that] are rarely acknowledged in the computing literature' (Blanchette 2011, p. 1047). While some might want to nuance Blanchette's argument and note that modularity, as a foundational principle of system architecture, code organisation, and programming language implementation, is a necessity to ensure the maintainability, manageability and extensibility of almost any larger system, rather than being regarded as a performance penalty (which it can be), most will agree that overcoming the quirks of physical materiality is a resource-intensive task:

The digital abstraction can be maintained in spite of this "noise" because, as Kirschenbaum notes, through error-correction codes, buffering, and other techniques, computers can self-efface the static—scratches on a record, smudges on paper—that typically signals the materiality of media: […] These mechanisms, formally described in information theory, are used throughout networked computing systems: the impact of media irregularities on hard drive platters can be mitigated through the use of error-correction codes; the unpredictability of network bandwidth can be mitigated through the use of buffering, ensuring smooth delivery of latency-sensitive content [...]. It is this ability to ceaselessly clean up after its own noise that so powerfully enables computers to seemingly sever their dependency on physical processes that underlie processing, storage, and connectivity. Yet the physical characteristics of a resource (be it computation, storage, or networking) cannot simply be transcended, and noise can only be
Yet the physical characteristics of a resource (be it computation, storage, or networking) cannot simply be transcended, and noise can only be 6 Ries, Palkó conquered at the expense of other resources. [...] error-correcting codes, widely used to protect against transmission interference, result in both data expansion (and thus, reduced capacity) and increased processing load. [...] Once again, then, independence from the material can only be obtained at the costs of certain trade- offs. (Blanchette 2011, p. 1047) Blanchette’s reasoning could serve as a foundation for a historical theory of digital forensics, an explanatory framework for many digital forensic phenomena, and the specific historicity of forensic digital materiality. Many phenomena that digital forensic tools and methods analyse are ultimately rooted in the mitigation of material constraints of hard- and software. Deleted data can be recovered because effective deletion through overwriting is a very resource-expensive task that would slow down a computer, which is why effective deletion does not take place by default. Often deleted data and documents “survive” on a system because of bugs, file system corruption, and system crashes: in CHKDSK error correction or hibernation files created by the operating system, in temporary and auto-recovery files not deleted because of system crashes. Temporary files are created on hard drives especially when a runtime environment runs out of physical RAM and has to swap memory with the storage medium. On some operating systems, automatic system snapshots are being created (e.g. VSS shadow copy partitions) in order to mitigate the risk of data loss through system instability. Files and file fragments are preserved in the so-called “drive slack” of data clusters because modern storage media are organised in blocks, which speeds up the process of data lookup and the navigation of large storage spaces on storage media with physical moving parts, such as conventional hard drives: it is the physical block size on the storage medium that determines where exactly a file is cut off. Fastsave artefacts in Microsoft Word documents and in temporary files are a result of a saving mechanism that was implemented to mitigate the relatively slow operation of early hard drives, at the cost of deleted text passages still present in documents and temporary files (Ries 2017, 2018). This incomplete list names just a few of the effects, mechanisms and design decisions that digital forensics is about and which are based on the computing- historical perspective that Blanchette describes. The digital forensic record, in turn, is deeply informed by designs that are specific for different types of hardware, versions of operating systems and application software, giving it a highly specific historicity that is accessible and readable through the forensic traces of digital processing. The latent digital forensic features of the born-digital historical record are not only of interest for philologists who search for hidden draft versions of a text. They are also relevant for historians and archivists who have to determine whether a historical record is authentic or might have been manipulated. Furthermore, they are relevant to the historian who investigates the history of the digitisation of society using original archived computing systems. 
When we speak about the born-digital record, there is another aspect to be kept in mind, an aspect that is not in the foreground in this volume, but will hopefully be scrutinized in more detail in further issues of IJDH. As Blanchette rightly emphasizes, the historicity of born-digital phenomena is rooted in the material constraints of hard- and software; it is embedded in an infrastructure context without which it cannot be understood. The infrastructure of the digital archive, which serves as an interface between the researcher and their subject of research, requires attention in itself, regardless of whether the research is based on born-digital or digitised materials. Michel de Certeau pointed out in his seminal work The Writing of History (De Certeau 1988; Palkó 2019) that the computer, as an archive, forms a new apparatus for research and as such will fundamentally change the way historical documents are formed. The materiality of the archive as a medium of knowledge formation is one of the main research questions media archaeology focuses on (Ernst 2011; Parikka 2012). Parikka sheds light on the interdependence between the problems that current archiving practices face in a born-digital culture and the theoretical challenges of understanding how a digital archive, as an apparatus, forms our documents of the past and present:

the theoretical problems of recent media archaeologies of technical media and software along with a rethinking of the archive, go hand in hand with the practical challenges faced by cultural heritage institutions and professionals: how do you archive processes and culture which is based on both technical processes (software and networks) and social ones (participation and collaboration, as in massive online role-playing platforms as cultural forms). (Parikka 2012, 115)

The analysis of institutional archiving practices has always been complicated, for their medial and material mechanisms tend to stay in the shadow (Groys 2000; Palkó 2017). However, the analysis of the apparatus of the digital archive, which includes born-digital, processual, network- or environment-based material, is even more complicated. Although a lot has been done in the last decade to provide a stable digital object by forensic imaging on the level of forensic materiality, the actual documents extracted from a forensic image depend highly on the technical infrastructure (e.g. the chosen software and workflow), and extracting them requires technical skills that are normally not part of a humanities scholar's qualification. The same is true for the growing importance and complexity of searching the digital medium. As both digitized and born-digital records are available in a quantity impossible to fathom through the methodology of close reading, records relevant to a research question will mostly be gathered by using query services. Digital archives normally radically limit the possibility of using custom search tools and query languages; they only provide predefined and simplified options. A lot has been done by national institutions and international projects, at the technical, institutional, and discursive levels, to extend the traditionally analogue field of scientifically relevant material to the born-digital.
Trusted formats, standards, methodologies, and services are available for GLAM institutions and researchers alike, but it remains an open question how the complexity of handling born-digital primary records, and of the digital archives thus established, will be manageable for the humanists of the twenty-first century.

The current special issue of International Journal of Digital Humanities features articles by international researchers from the libraries and archives sector, as well as from the (digital) humanities, that address born-digital archives on several levels, ranging from the digital forensic perspective on individual records (Archival Methodology: Digital Forensics), via personal digital archives and born-digital cultural heritage archives (Digital Culture and Literature Archives) and web archives (Web Archives), to born-digital archiving in large digital infrastructures (Born-Digital Archives and Infrastructures). Corinne Rogers (University of British Columbia, Vancouver, Canada) strikes the connection between digital forensics and born-digital archival science and
Eveline Vlassenroot (Ghent University, Belgium), Sally Chambers (Ghent University, Belgium), Emmanuel Di Pretoro (Haute École Bruxelles-Brabant, Brussels, Belgium), Friedel Geeraert (Royal Library and State Archives of Belgium, Brussels, Belgium), Gerald Haesendonck (Ghent University, Belgium), Alejandra Michel (Namur Univer- sity, Belgium) and Peter Mechant (Ghent University, Belgium) discuss national and international web archives as a data resource for digital scholars in Europe. In the Born-Digital Archives and Infrastructures section, Tibor Kálmán (GWDG Göttingen, Germany), Matej Ďurčo (Austrian Academy of the Sciences, Austria), Frank Fischer (Higher School of Economics, Moscow, Russia), Nicolas Larrousse (Huma- Num, Paris, France), Claudio Leone (State and University Library Göttingen, Germa- ny), Karlheinz Mörth (Austrian Academy of the Sciences, Austria) and Carsten Thiel (State and University Library Göttingen, Germany) map the challenges, approaches and solutions of born-digital archiving and access, especially for born-digital research datasets, learning materials, services and software in the context of the European DARIAH research infrastructure, and beyond. The special issue concludes with Peter Mechant‘s (Ghent University, Belgium) review of Web 25. Histories from the first 25 years of the world wide web, edited by Niels Brügger in 2017. Born-digital archives 9 References Baker, J. (2018). Outlook: Email archives, 1990–2007, guest lecture at Ghent University, Belgium, 8 May 2018. Blanchette, J.-F. (2011). A material history of bits. Journal of the American Society for Information Science and Technology, 62(6), 1042–1057. Blanchette, J.-F. (2012). Burdens of proof. Cryptographic culture and evidence law in the age of electronic documents. Cambridge: MIT Press. Carroll, L., Farr, E., Hornsby, P., & Ranker, B. (2011). A comprehensive approach to born-digital archives. Archivaria, 72, 61–92. Catalogue ÖNB (2018). Catalogue entry at Austrian National Library , ÖNB. https://www.onb.ac. at/bibliothek/sammlungen/literatur/bestaende/personen/czernin-franz-josef-geb-1952/. Accessed 11 July 2018. Dahlström, M. (2000). Drowning by versions. Human IT 4, http://etjanst.hb.se/bhs/ith/4-00/md.htm. Accessed 11 July 2018 De Certeau, M. (1988). The writing of history, (trans: Conley, T.). New York: Columbia UP. Drucker, J. (2013). Performative materiality and theoretical approaches to interface. Digital Humanities Quarterly, 7(1). https://www.digitalhumanities.org/dhq/vol/7/1/000143/000143.html. Accessed 11 July 2018. Duranti, L. (2009). From digital diplomatics to digital records forensics. Archivaria, 68, 39–66. Duranti, L., & Endicott-Popovsky, B. (2010). Digital records forensics. a new science and academic program for forensic readiness. In ADFSL Conference on Digital Forensics, Security and Law. http://arqtleufes. pbworks.com/w/file/fetch/94919918/Duranti.pdf. Accessed 11 July 2018. Enge, J., & Kramski, H. W. (2014). “Arme Nachlassverwalter ...”. Herausforderungen, Erkenntnisse und Lösungsansätze bei der Aufbereitung komplexer digitaler Datensammlungen. Filthaut, J. (Ed.), Von der Übernahme zur Benutzung. Aktuelle Entwicklungen in der digitalen Archivierung. 18. Tagung des Arbeitskreises Archivierung von Unterlagen aus digitalen Systemen on 11–12 March 2014 in Weimar, (pp. 53–62) Weimar: Thüringisches Hauptstaatsarchiv. Ernst, W. (2011). Media archaeography method and machine versus history and narrative of media In: Huhtamo, E. and Parikka, J. (Eds.), Media archaeology. 
Approaches, applications, and implications (pp 239–255). Los Angeles: University of California Press. Foss, R. (2017). The British Library: Learning from users of personal digital archives at The British Library. Research paper at conference Personal Digital Archiving 2017, 29-31 Mar 2017, Stanford Libraries. Gitelman, L. (2006). Always already new. Media, history, and the data of culture. Cambridge: MIT Press. Groys, B. (2000). Unter Verdacht. Eine Phänomenologie der Medien. Hanser. John, J. L. (2012). Digital forensics and preservation. DPC technology watch report 12–03 November 2012. Digital Preservation Coalition. https://doi.org/10.7207/twr12-03. Accessed 11 July 2018. Kirschenbaum, M. (2008). Mechanisms. New media and the forensic imagination. Cambridge: MIT. University Press. Kirschenbaum, M. (2013). The .txtual condition: digital humanities, born-digital archives, and the future literary. Digital Humanities Quarterly, 7(1), http://www.digitalhumanities.org/dhq/vol/7/1/000151 /000151.html. Accessed 11 July 2018. Kirschenbaum, M. (2016a). Track changes. A literary history of word processing. Cambridge: Harvard University Press. Kirschenbaum, M. (2016b): The transmissions of the archive. Literary remainders in the late age of print. Kirschenbaum, M. (Ed.), Bitstreams. The Future of Digital Literary Heritage. Lecture Series at KISLAK Center for Special Collections, Rare Books and Manuscripts, Penn Libraries. 14 March 2016. https://www.youtube/6TuA4dkRegQ. Accessed 11 July 2018. Kirschenbaum, M., Farr, E.L., Kraus, K., et al. (2009). Digital materiality. Preserving Access to Computers as Complete Environments. In: iPress 2009. 6th International Conference on Preservation of Digital Objects. 5. October 2009. University of California. California Digital Library. http://www.escholarship. org/uc/cdl_ipres09. Accessed 11 July 2018. Kirschenbaum, M., Ovenden, R., Redwine, G., et al. (Eds.) (2010). Digital forensics and born-digital content in cultural heritage collections. Washington, D.C.: Council on library and information resources, Washington, D.C. Dec. 2010. http://www.clir.org/pubs/reports/reports/pub149/pub149.pdf. Accessed 11 July 2018. Lebrave, J.-L. (2011). Computer forensics: la critique génétique et l’écriture numérique. Genesis, 33, 137–147. 10 Ries, Palkó https://www.onb.ac.at/bibliothek/sammlungen/literatur/bestaende/personen/czernin-franz-josef-geb-1952/ https://www.onb.ac.at/bibliothek/sammlungen/literatur/bestaende/personen/czernin-franz-josef-geb-1952/ http://etjanst.hb.se/bhs/ith/4-00/md.htm https://www.digitalhumanities.org/dhq/vol/7/1/000143/000143.html http://arqtleufes.pbworks.com/w/file/fetch/94919918/Duranti.pdf http://arqtleufes.pbworks.com/w/file/fetch/94919918/Duranti.pdf https://doi.org/10.7207/twr12-03 http://www.digitalhumanities.org/dhq/vol/7/1/000151/000151.html http://www.digitalhumanities.org/dhq/vol/7/1/000151/000151.html https://www.youtube/6TuA4dkRegQ http://www.escholarship.org/uc/cdl_ipres09 http://www.escholarship.org/uc/cdl_ipres09 http://www.clir.org/pubs/reports/reports/pub149/pub149.pdf Manovich, L. (2001). The language of new media. Cambridge: MIT Press. Owens, T. (Ed.) (2013a). Preserving.exe: Toward a national strategy for software preservation. National Digital Information Infrastructure and Preservation Program at the Library of Congress, Washington, D.C. http://www.digitalpreservation.gov/multimedia/documents/PreservingEXE_report_final101813.pdf. Accessed 11 July 2018. Owens, T. (2013b). 
Historic iPhones: Personal digital media devices in the collection. 15 November 2013. http://www.trevorowens.org/2013/11/historic-iphones-personal-digital-media-devices-in-the-collection/. Accessed 11 July 2018.

Palkó, G. (2017). Media archaeology of institutional archives? Studia UBB Digitalia, 7(62), 75–82. https://doi.org/10.24193/subbdigitalia.2017.1.05. Accessed 11 July 2018.

Palkó, G. (2019). Sites of digital humanities: About virtual research environments. In Kelemen, P., & Pethes, N. (Eds.), Philology in the making: Analog/digital cultures of scholarly writing and reading (pp. 221–230). Bielefeld: Transcript.

Parikka, J. (2012). What is media archaeology. Cambridge: Polity.

Reside, D. (2011a). "Last modified January 1996": The digital history of RENT. Theatre Survey, 52(2), 335–340.

Reside, D. (2011b, April 22). "No day but today": A look at Jonathan Larson's Word files [Blog post]. http://www.nypl.org/blog/2011/04/22/no-day-today-look-jonathan-larsons-word-files. Accessed 11 July 2018.

Reside, D. (2017). West Side Story: The journey to Lincoln Center Theater. In L. MacDonald & W. A. Everett (Eds.), The Palgrave handbook of musical theatre producers (pp. 359–367). New York: Palgrave Macmillan.

Ries, T. (2017). Philology and the digital writing process. Cahier voor Literatuurwetenschap, 9, 129–158.

Ries, T. (2018). The rationale of the born-digital dossier génétique: Digital forensics and the writing process: With examples from the Thomas Kling archive. Digital Scholarship in the Humanities, 33(2), 391–424.

Rockmore, D. (2014). The digital life of Salman Rushdie. The New Yorker, published 29 July 2014. http://www.newyorker.com/tech/elements/digital-life-salman-rushdie. Accessed 11 July 2018.

Rogers, C. (2015). Authenticity of digital records: A survey of professional practice. Canadian Journal of Information and Library Science, 39(2), 97–113.

Sloyan, V. (2018). Overview of a born-digital archives access workshop held at Wellcome Collection: Findings of a workshop held on 24th November 2017. https://doi.org/10.6084/m9.figshare.6087194.v1. Accessed 1 March 2019.

Vauthier, B. (2014). Tanteos, calas y pesquisas en el dossier genético digital de El Dorado de Robert Juan Cantavella. In M. Kunz & S. Gómez-Rodríguez (Eds.), Nueva narrativa española (pp. 311–345). Barcelona: Linkgua.

Vauthier, B. (2016). Genetic criticism put to the test by digital technology: Sounding out the (mainly) digital genetic file of El Dorado by Robert Juan-Cantavella. Variants, 12-13, 163–186.

Waugh, D., & Russey Roke, E. (2017). Second-generation digital archives: What we learned from the Salman Rushdie project. Research paper at conference Personal Digital Archiving 2017, 29–31 Mar 2017, Stanford Libraries.

Winters, J. (2018a). Humanities and the born digital: Moving from a difficult past to a promising future? Keynote at DHBenelux 2018, Amsterdam, 7 June.

Winters, J. (2018b). Web archives and (digital) history: a troubled past and a promising future? In N. Brügger & I. Milligan (Eds.), Sage Handbook of Web History (pp. 593–606). Newcastle: Sage.
RESEARCH ARTICLE

Web archives as a data resource for digital scholars

Eveline Vlassenroot, Sally Chambers, Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Alejandra Michel and Peter Mechant

Published online: 8 March 2019
© Springer Nature Switzerland AG 2019
International Journal of Digital Humanities (2019) 1:85–111. https://doi.org/10.1007/s42803-019-00007-7
Corresponding authors: Eveline Vlassenroot (Eveline.Vlassenroot@UGent.be) and Sally Chambers (Sally.Chambers@UGent.be). Extended author information available on the last page of the article.

Abstract The aim of this article is to provide an exploratory analysis of the landscape of web archiving activities in Europe. Our contribution, based on desk research and complemented with data from interviews with representatives of European heritage institutions, provides a descriptive overview of the state-of-the-art of national web archiving in Europe. It is written for a broad interdisciplinary audience, including cultural heritage professionals, IT specialists and managers, and humanities and social science researchers. The legal, technical and operational aspects of web archiving and the value of web archives as born-digital primary research resources are both explored. In addition to investigating the organisations involved and the scope of their web archiving programmes, the curatorial aspects of the web archiving process, such as the selection of web content, the tools used and the provision of access and discovery services, are also considered. Furthermore, general policies related to web archiving programmes are analysed. The article concludes by offering four important issues that digital scholars should consider when using web archives as a historical data source. Whilst recognising that this study was limited to a sample of only nine web archives, this article can nevertheless offer some useful insights into the technical, legal, curatorial and policy-related aspects of web archiving. Finally, this paper could function as a stepping stone for more extensive and qualitative research.

Keywords Web archives · Digital scholarship · Curation of digital collections · Copyright · Technology for web archiving

1 Setting the scene: Archiving the web as a historical source

The history of web archiving goes back more than 20 years, with the first initiatives launched in 1996 by the Internet Archive, the National Library of Australia and Sweden (Schroeder and Brügger 2017). France was also a pioneer in the field, with the National Library of France (BnF) undertaking its first web archiving experiments in 1999 (BnF 2014). However, web archiving has roots in a wider digital preservation movement, which emerged in the 1980s–1990s.
Led by memory institutions, this movement aimed to develop strategies to respond to the rise of digital technologies and, in particular, to address their ability to capture and preserve digital artefacts as 'records of social phenomena' (Schneider and Foot 2008).

As web archiving is still a nascent field, clear definitions are sometimes difficult to find. For this reason, the phrase 'web archiving' is often used interchangeably with 'web preservation', without any clarification or distinction between the two. For example, the definition of the International Internet Preservation Consortium (IIPC) includes both terms: 'Web archiving is the process of collecting portions of the World Wide Web, preserving the collections in an archival format, and then serving the archives for access and use' (IIPC 2017). 'Web archiving', therefore, refers to the whole process, whereas 'web preservation' is one of the steps in the process of archiving the web. Web preservation is a crucial step as, in the words of Reyes Ayala, it is 'the process of maintaining internet resources in a condition suitable for use' (2013: 1). A website can be captured and stored, but preserving this content ensures that it will still be accessible over time. Given this long-term perspective, web archiving requires a strategic approach: technologies, systems, policies, procedures and resources are all needed to make web archiving more than merely harvesting and storing online content.

For digital scholars in the social sciences and humanities, web archives are increasingly recognised as an essential source for studying cultural and social phenomena of recent decades (Schneider & Foot 2005). Some examples include: Brügger et al. (2017), who have been studying the evolution of national domains; Helmond et al. (2017), who used the Internet Archive Wayback Machine for empirically surveying the historical dynamics of social media industry partnerships and partner programmes; Chakraborty and Nanni (2017), who used archived websites as primary sources to examine activities of scientific institutions through the years; and Weber (2017), who traces the tumultuous history of news media on the web through an examination of archived news media content maintained within the Internet Archive. Furthermore, in the BUDDAH (Big UK Domain Data for the Arts and Humanities) project, a number of bursaries were awarded to researchers for carrying out research in their subject area using the UK web archive (BUDDAH 2014). At the European level, RESAW, the Research Infrastructure for the Study of Archived Web Materials, has been established 'with a view to promoting the establishing of a collaborative European research infrastructure for the study of archived web materials' (RESAW 2012).

Legal issues have implications for web archiving as they influence selection policies and users' access to archived online content:

1. Copyright legislation.1
2. Personal data protection, as web archiving is likely to imply personal data processing. However, it is important to keep in mind that the General Data Protection Regulation (GDPR) authorises legal derogations from the rights of the data subjects when personal data are processed for historical or scientific purposes and for archiving purposes in the public interest.
3. The legal framework on authenticity and integrity of online content, as web archives could be used before courts for probative reasons.
4. The issue of illegal content violating public policy, and its potential interest for researchers, given the automatic nature of web archiving tools.
5. Legally delimiting the national scope of competence in a web archiving context with unclear digital boundaries. Indeed, regarding the potential overlap between legislation on legal deposit and on public records, it is important to have clear criteria to determine the country and the national heritage institution in charge of the archiving of a particular website.
6. Legislation concerning the reuse of public sector information.

1 For instance, obtaining prior authorisation of right holders, creating new exceptions for reproduction or communication to the public for archiving purposes and obtaining a fair balance between the public interest in preserving information of cultural or historical significance and the interests of rights holders.

As the web has evolved from a publishing to a communication medium, it now presents a vast collection of primary sources for our past. This wealth of diverse information provides the necessary conditions for the emergence of web archiving as a truly interdisciplinary field, bringing together practitioners and scholars from different backgrounds: humanities, social sciences, computer and information sciences, libraries, archives, etc. (Ogden et al. 2017). However, the sheer quantity of information, and the constant evolution of the web, complicate its preservation and make diachronic study for researchers very challenging (Chakraborty and Nanni 2017). As Laursen notes: 'Curators do what they can to capture what they can, and their practices and opportunities change over time' (2017: 220).

The following sections report the findings of a review of web archiving activities in Europe. After a short description of our research methodology, we discuss the aspects of web archiving that affect the users of web archives. First, the web archiving selection process is analysed from an operational point of view, including an in-depth analysis of legal deposit legislation. The different ways in which the concept of a 'national web' is defined and the different selection strategies used by the studied web archiving institutions are explored. Second, the differences in policies regarding access to web archives are analysed, taking into account the legal framework with regard to copyright and the inclusion of illegal content in web archives. On an operational level, the user-friendliness of the studied web archives is explored based on an analysis of the available search functionalities. The role of metadata, and the importance of obtaining a thorough understanding of user needs and requirements, are stressed. Third, the 'hands-on' or technical aspects of working with web archives are introduced and some of the challenges and main techniques to keep in mind when working with web archives are discussed. Our explorative analysis of European web archives ends with a discussion underlining four important considerations for digital scholars.

2 Methodology

The research methodology consisted of three phases. In the first phase, a secondary research approach (also known as desk research) was taken. This involved summarising, collating and/or synthesising documentation related to existing web archiving projects. A number of web archiving initiatives were selected and analysed in depth.
These included the National Library and National Archive of the Netherlands, the Royal Danish Library (Netarkivet), the National Library of Ireland, the National Library of France (BnF), the National Library of Luxembourg, the British Library, The National Archives UK and Arquivo.pt in Portugal. With regard to the selection of our sample of web archiving initiatives, a number of characteristics were taken into account:

– Established web archiving initiatives
– Web archiving initiatives in countries where both the national library and the national archives are involved in web archiving (as the PROMISE project is a collaboration between the Belgian Royal Library and State Archives, useful lessons could be drawn from countries where both institutions engage in web archiving)
– Web archiving initiatives in countries with multiple official languages
– Web archiving initiatives in countries of different sizes
– A combination of web archiving initiatives relying on external service providers and initiatives that manage all aspects of the process in-house

Not all of these features are applicable to each initiative; the main aim was to study a representative mix of web archiving initiatives, based on the above characteristics. The main research question for this study is: how are other European national libraries and national archives engaging in web archiving and how are the web archiving processes organised? The web archives were studied from a legal, technical and operational point of view. The aim was to create an overview of the web archiving processes in place in each of the institutions covering a) the selection (selection policy, legal framework), b) the web archiving process itself (crawling, quality control, indexation, preservation and storage) and c) access to, and use of, the web archive (policies, search functionalities and legal framework). Operational questions, such as the composition of the web archiving teams in terms of professional profiles or the storage requirements in terabytes (TB) or petabytes (PB), were also included in the mix.

In the second research phase, interviews were conducted with representatives from the aforementioned institutions. The aim of the interviews was to fill in the gaps that remained on the specific initiatives following the literature review, so that a complete overview of the web archiving activities was obtained for each of the institutions. All participants were interviewed either in face-to-face meetings or by conference call. The interviews were semi-structured, using both closed and open questions. Some interviewees already provided written replies to (some of) these questions beforehand, in which case the interview consisted mainly of follow-up questions. Interviewees included a mix of archivists, librarians, IT specialists, managers, digital curators and researchers (see Appendix A).

The third and final research phase encompassed further validation and synthesis. The answers to the questions that were obtained during the literature review and in the interviews were integrated. On this basis, comparisons were drawn, thereby obtaining an answer to the research questions and creating an overarching view of the selected web archiving initiatives. This allowed us to distil the relevant aspects that are important for digital scholars.

3 Selection of content for web archives

3.1 How is web archiving framed by the law?
In all of the countries where our selected European web archiving institutions are based, the National Library is legally responsible for preserving and opening up cultural and historical heritage to the public, even if there is no legal deposit law (e.g. The Netherlands). There is a lot of information available online; thus, institutions believe that the preservation of online cultural heritage is naturally part of their legal mandate. In addition to the mandate to preserve a nation's heritage, there are two legal ways to enable web archiving: on the one hand, legal deposit legislation; on the other hand, legislation on public archives.

The majority of countries have gradually modified their national legal deposit legislation in order to widen it to the Internet and thus allow the collection and preservation of online information.2 In Ireland, this process is ongoing as the legal deposit legislation is now under review to broaden its scope to include online content (Ryan 2017). As Maria Ryan (2017) stated: 'The Irish situation is difficult because Irish Legal Deposit legislation does not extend to digital or online publications. The legislation is under review at the moment'. The scope of this legislation is often very broad in regard to determining which websites should be archived. However, national legislation generally excludes personal correspondence and private spaces available on intranets, for privacy reasons. Still, a minority of countries do not have any legal texts relating to legal deposit (or at least none covering web legal deposit).3 In these countries, the deposit of websites of cultural and/or historical significance to the National Library is in principle done on a voluntary basis (Beunen and Schiphof 2006, p. 18; Kunze and Power n.d., p. 2). Indeed, in the absence of a legal obligation to deposit publications, the consent of website owners is necessary.4 These right holders are, therefore, able to refuse web archiving.

In the Web 2.0 world, obtaining the prior consent of each right holder is impracticable, especially since their identification can be very difficult. Therefore, heritage institutions acting in countries that do not have legislation for web legal deposit do not always ask permission from website owners before proceeding to collect their websites, preferring to take a pragmatic approach.

2 This is the case for France with the DADVSI Law (see « Loi n° 2006–961 du 1er août 2006 relative au droit d'auteur et aux droits voisins dans la société de l'information »), for Luxembourg (see « Loi luxembourgeoise du 25 juin 2004 portant réorganisation des instituts culturels de l'Etat »), for the United Kingdom (see « Legal Deposit Libraries (Non-Print Works) Regulations of 5th April 2013 ») and for Denmark (see "Danish Act n° 1439 on Legal Deposit of Published Material of 22nd December 2004").
3 For instance, The Netherlands, Portugal and Switzerland (at the federal level).
4 Prior authorization of the right holders is not necessary for websites that have fallen into the public domain or that were made available under the system of Creative Commons Licenses (Beunen and Schiphof 2006, p. 16).
On the one hand, they may notify the website owner of their intention to archive their website and, if he/she does not object, consider that the website owner implicitly consents to the archiving.5 As Kees Teszelszky states:

The biggest problem for web archiving in the Netherlands and for our national library is that we do not have a legal deposit like you have in Belgium. [...] So then we decided [...] to [use] the opt out method. So if we want to archive the website, [...] we do not ask permission, we say we are going to archive and if people are not reacting on our wish, then we are archiving. (Teszelszky 2017a)

On the other hand, they may choose to archive all websites included in their selection policy without prior notification, but allow the website owner to object to the archiving by using Robot Exclusion Protocols (a minimal sketch of how a crawler can honour such an opt-out follows below).6 In any case, these heritage institutions are generally very cautious. In this way, they develop a very effective takedown policy in the event of subsequent objections by website owners, through the removal of the archived content from their database.

5 This is the approach of the National Library of The Netherlands (KB Nederland, n.d.-b and n.d.-d).
6 This is the approach of Arquivo.pt in Portugal (Arquivo.pt, n.d.-c).
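To make the opt-out mechanism concrete, the following minimal sketch shows how a harvester could check a site's robots.txt before fetching a page, using only Python's standard library. The site URL and crawler user-agent are hypothetical placeholders; this is not the crawler code of any of the institutions studied here.

from urllib import robotparser

SITE = "https://example.org"         # hypothetical website under consideration
USER_AGENT = "heritage-archive-bot"  # hypothetical crawler user-agent

# Fetch and parse the site's robots.txt once per host.
parser = robotparser.RobotFileParser()
parser.set_url(SITE + "/robots.txt")
parser.read()

# Queue a page for harvesting only if the owner has not opted out.
if parser.can_fetch(USER_AGENT, SITE + "/"):
    print("No exclusion found: the page may be harvested.")
else:
    print("Robots exclusion in place: record the opt-out and skip the page.")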
There are a number of advantages for heritage institutions in relying on legal provisions that frame their web archiving activities, as these help to solve the aforementioned difficulties. Firstly, legislation on web legal deposit has the advantage of offering greater legal certainty and facilitating web archiving by obliging website owners to comply with the legal deposit obligation. Indeed, such legislation means that heritage institutions are not required to ask for prior permission from website owners (Beunen and Schiphof 2006). Without that legislation, the owners' consent would be required, because the archiving of a website composed of various protected contents7 necessarily triggers an act of reproduction,8 likely to infringe copyright. Alongside a web legal deposit obligation, some countries have created copyright exceptions covering activities intrinsically linked to web archiving and access.9 It is, in fact, technically impossible to archive a website without reproducing it. In this way, these kinds of exceptions have proved unavoidable in order to permit acts of reproduction (Graff and Sepetjan 2011: 179–180). Secondly, some countries have a legal provision allowing the heritage institution responsible for web archiving to require domain name management bodies to help them identify website owners.10 Thirdly, some legislations go even further by allowing heritage institutions to require website owners to hand over the passwords and access keys necessary for collecting their website.11 This makes it considerably easier for the heritage institutions to obtain the web material covered by legal deposit.

7 Let us indicate that websites are composed of a set of elements that can each be protected by copyright (original texts, images, search engine, database, etc.) and may each have a different right holder (KB Nederland n.d.-e). We also have to underline the fact that websites can also be composed of elements protected by other rights such as trademark law, database right, neighbouring rights and image right (KB Nederland n.d.-b).
8 An act for which the consent of the right holders is in principle required.
9 In France, the DADVSI Law has introduced an exception allowing acts of reproduction and communication related to the web legal deposit (see French Heritage Code, art. L132–4 to L132–6). In the United Kingdom, Sections 19 to 31 of the Legal Deposit Libraries (Non-Print Works) Regulations of 5th April 2013 and Section 44A of the Copyright, Designs and Patents Act of 15th November 1988 allow certain activities related to web legal deposit to be carried out without infringing copyright.

Concerning the criteria for deciding the scope of the national web archive at the national level, we noticed some similarities in the choices made by the studied countries with legal deposit legislation. Considering that online information falls within the scope of competence of one state or another, there are three main principles to be followed. Firstly, a state considers itself competent to archive online content published within its national domain name. Secondly, a state also considers itself competent to archive online content published on other domain names if one of these additional conditions is met: if the website was registered with the national body responsible for managing domain names or by a citizen of the state; if the content of the website is related to the state (i.e. concerns the general affairs of the state); or if the content of the website was drafted by a citizen of the state or in the national territory. Luxembourg also has an additional criterion which was not found elsewhere: if the production of the publication has been supported by the state. Thirdly, the language of the content is an additional criterion. However, this criterion only applies to countries with a single national language and does not work for countries with multiple national languages that are also national languages of other countries.

In countries without legal deposit legislation for online content, the scope is defined in a similar way. In Ireland, in addition to the national top-level domain, web material that is of Irish interest, has heritage value and treats a subject of interest is also considered to be within scope (National Library of Ireland 2017a). In Portugal, the top-level domains of all Portuguese-speaking countries, except for the Brazilian domain, are included, as are websites on other domain names that are of broad interest to the Portuguese community. In The Netherlands, websites about Dutch language, history and culture on both the national domain and other domain names are within the scope of the project (Arquivo.pt n.d.-c; Sierman and Teszelszky 2017).

There is a marked difference between public record legislation that regulates the activities of national archives and legal deposit legislation that frames the missions of national libraries. Where web legal deposit legislation exists (UK, France, Denmark, …) numerous detailed provisions are included to frame web archiving activities. For instance, the text of the UK legal deposit legislation12 comprises more than 20 legal provisions specifically related to web legal deposit.
However, in public records legislation, the same text that applies to classic public records also applies to online public records, meaning that there are no specific legal provisions where web archiving is concerned, except for the Library and Archives of Canada Act. The legal text on public records therefore applies to websites of public institutions only because the notion of 'records' is broadly defined (for instance, as 'all types of medium').

10 For instance, in France, Article L132–2-1 of the French Heritage Code authorizes the 'Bibliothèque Nationale de France' to turn to domain name management bodies or to the Higher Audiovisual Council to identify the publishers and producers of websites. There is also a similar legal provision in Denmark (see Danish Act n° 1439 on Legal Deposit of Published Material of 22nd December 2004, §11).
11 This is the case in France (see French Heritage Code, art. R132–23-1, II), the United Kingdom (see Legal Deposit Libraries (Non-Print Works) Regulations of 5th April 2013, Section 16 (4)) and Denmark (see Danish Act n° 1439 on Legal Deposit of Published Material of 22nd December 2004, §10).
12 The Legal Deposit Libraries (Non-Print Works) Regulations 2013.

3.2 How is web archived content selected?

Our analysis shows a great deal of variation when it comes to selection strategies and criteria. Furthermore, the terminology for describing the web archiving approach differs between web archiving initiatives. As can be seen in Table 1, in the case of Arquivo.pt, two main strategies can be distinguished: broad crawls (covering top-level domain crawls (e.g. .be, .fr) and relevant content outside of the national domain(s)) and selective crawls (thematic or events-based collections, for example). The selection policy of national archives with regard to web archiving differs in the sense that it is mostly limited to the public records of governmental organisations. For national libraries, the scope of collection is broader as web archiving is seen as part of the legal deposit legislation or as a complement to the more traditional electronic or paper collections of publications in countries without legal deposit legislation. All national libraries and Arquivo.pt in Portugal combine broad crawls with selective crawls, except for the National Library of France (BnF), where a representative sample of the web is taken instead of a complete top-level domain crawl, and the National Library of The Netherlands, where only a selective approach is taken (see Table 1).13

Different methods are used to identify the content that does not reside under URLs of the national domain. The British Library, for example, uses Geo-IP localisation to locate information on servers in the UK or makes use of UK postal addresses (Hockx-Yu 2014). At the Royal Danish Library a specific system has been developed to identify this content. As Jakob Moesgaard explained:

We've built a system that basically looks at everything we harvest. It looks at all the links that point out [...] and then it analyses the content on all of those pages. [...] It scans for regular expressions that cover Danish phone numbers and [...] we try to have this sort of validation process ranking [...] to see if [...] it looks Danish enough for us to trust that we should automatically add it to the archive. (Moesgaard and Larsen 2017a, b)
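As an illustration of the kind of heuristic Moesgaard describes, the sketch below scores a harvested page on Danish-looking signals. The patterns, weights and threshold are invented for illustration only and do not reproduce Netarkivet's actual rules.

import re

# Danish-looking signals: 8-digit phone numbers (optionally prefixed +45)
# and a few common Danish words. Both lists are illustrative assumptions.
PHONE = re.compile(r"(?:\+45\s?)?\d{2}\s?\d{2}\s?\d{2}\s?\d{2}")
WORDS = re.compile(r"\b(og|ikke|hvad|hvorfor|København)\b", re.IGNORECASE)

def looks_danish(text: str, threshold: int = 3) -> bool:
    """Score a page found outside the national domain; accept it for the
    archive if the (illustrative) score passes the threshold."""
    score = 2 * len(PHONE.findall(text)) + len(WORDS.findall(text))
    return score >= threshold

# Toy example: one phone number and two Danish words score 4, so it passes.
print(looks_danish("Ring på 12 34 56 78, og besøg os i København."))  # True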
In the case of selective crawls, there are different ways of determining these collections. Some institutions have defined overarching selection criteria for these collections. The British Library, for example, focuses on websites that publish research, reflect the diversity of lives, interests and activities in the UK, and demonstrate web innovation for the UK Web Archive (UK Web Archive n.d.-a). In general, websites deemed of interest to the nation are included in the selection, meaning websites that are representative of the diverse society or that are linked to the history and culture of a nation. It is interesting to note that the popularity, uniqueness or degree of innovation of websites is sometimes also taken into account, as well as websites that publish research (KB Nederland n.d.-a; National Library of Ireland 2017a; Maurer and Els 2017a; Gomes 2017b).

Another way to create selective collections is to build them based on specific themes, events or even emergencies (mostly focusing on natural disasters or other unforeseeable events). There is a large variety in how thematic collections are defined. They can, for example, be centred around the different collection departments in the institution, as is the case in the National Library of France (BnF 2017a), or focus on other themes such as literary collections or health and social issues amongst others, which is the case in the National Library of Ireland (National Library of Ireland (n.d.-d)). Event-based collections, on the other hand, are more coherent between institutions. Most often they are about events such as elections (national or local), commemorations, referendums or sporting events such as the Olympics.

With regard to social media, a number of web archiving initiatives include them in their collections. From a technical point of view, archiving social media is challenging (e.g. due to the vast amount of data generated or changing access policies), which explains why increasingly sophisticated proprietary and open source software and services are available to support social media archiving. The policies with regard to social media differ widely between institutions. Table 2 provides an overview of which institution preserves which social media.14 The social media accounts that are captured in general focus on important people, organisations and events, such as political parties, politicians, newspapers, journalists, athletes, other celebrities, etc. In the case of Arquivo.pt no special efforts are made to harvest social media, although their web archive does contain some material stemming from Facebook and Twitter (Gomes 2017b). The National Library of the Netherlands is also not currently harvesting social media, but they have included it in their 10-year plan. At the National Archive of The Netherlands social media are not yet included in their collection either, but tests have been scheduled in 2018 to archive social media (Teszelszky 2017a; Posthumus and van Luin 2017a).

13 Sierman and Teszelszky 2017; BnF 2017a, b; Maurer and Els 2017b; UK Web Archive (n.d.-a); Hockx-Yu 2014; Brügger et al. 2017; Arquivo.pt n.d.-c; Ryan 2017; National Library of Ireland 2017a, b.
14 Tanésie et al. 2017; Maurer and Els 2017b; British Library 2017a; British Library (n.d.-b); National Archives (n.d.-a); Netarkivet.dk 2017; Moesgaard and Larsen 2017a.
Table 1 Overview of general selection strategies for web content

Country | Institution | Broad crawl | Selective crawl
The Netherlands | National Library | No | Yes
France | National Library | No (representative sample) | Yes
Luxembourg | National Library | Yes | Yes
UK | British Library | Yes (non-print legal deposit) | Yes (open UK web archive)
Denmark | Royal Danish Library | Yes | Yes
Portugal | Foundation for Science and Technology (FCT) (Arquivo.pt) | Yes | Yes
Ireland | National Library | Yes | Yes

Table 2 Overview of social media included in web archives

Country | Institution | Facebook | Twitter | YouTube | Instagram | Flickr
France | National Library | (used to, not anymore) | Yes | No | No | No
Luxembourg | National Library | Yes | Yes | Yes | Yes | No
UK | British Library | Yes | Yes | No | No | No
UK | National Archives | No | Yes | Yes | No | No
Denmark | Royal Danish Library | Yes | Yes | Yes | Yes | No
Ireland | National Library | No | Yes | Yes | No | (starting in 2018)

Some institutions also make use of certain exclusion criteria, some of which concern the legality of the content. The national legislations are unanimous on what constitutes illegal content: child pornography; hate, xenophobic or racist speech; speech inciting violence, etc. Some institutions take specific measures to exclude this content automatically. The National Library of France, for example, makes use of a filtering tool (Tanésie et al. 2017). Additional exclusion criteria are sometimes in place, for instance, excluding content that is already included in other web archives or material that cannot be captured for technical reasons (KB Nederland (n.d.-d); Moesgaard and Larsen 2017a). In the case of The National Archives UK, additional selection criteria have been developed for Twitter content: for example, tweets written by the selected government organisations are included, but retweets or tweets sent from non-governmental accounts to government accounts are excluded (National Archives (n.d.-a)).

When digital scholars use web archives for their research, it is important that they take into account how the archived web content is selected and who is responsible for making that selection. In some institutions specific collection specialists are responsible for making the selection, while in other cases selection is a responsibility that is shared between a large number of people, each devoting only a limited amount of time to selecting the content. This is, for instance, the case at the National Library of France (BnF 2016), where the selection is done transversally, meaning that each department contributes to the web archiving by entering URLs into the system (Tanésie et al. 2017). Furthermore, some institutions collaborate with external partners. The National Library of Ireland sometimes contacts specialists in the field. For their collection on the Irish elections, they contacted political analysts, lecturers and journalists in order to obtain their feedback on what should be included in the collection (Ryan 2017).

The role of digital scholars, along with the general public, in the selection of content for web archives is a topic worthy of consideration. For example, engagement from digital scholars as well as the general public is already being sought: the national libraries of France, The Netherlands, Luxembourg, Denmark and Ireland, and Arquivo.pt, all provide a way for people to make suggestions for websites to be included in the selection (BnF 2017c; KB (n.d.-d); BnL n.d.; Netarkivet.dk 2016a; Ryan 2017; Arquivo.pt (n.d.-e)).

As 'all web archives to a greater or lesser degree can only suggest comprehensiveness' (Koerbin 2017: 194), web archiving institutions have a very important role to play as facilitator.
They should ensure that sufficient information about the web archiving context is made available so that researchers can find the answers to the questions evoked by Webster (2017: 175–176): 'Why has this content been archived, by whom and on whose behalf?' There is a clear demand for this information. Sara Aubry of the National Library of France (BnF) stated:

This is information researchers increasingly request, meaning that they wish to understand the context of the production of the archive in order to gain insight into whether [a resource] was archived as part of a selective crawl or of a broad crawl, if it was part of a specific project, how long the crawl lasted, [...], so really everything about the context of the capture. (Tanésie et al. 2017, translated from French)

However, even though the importance of this contextual information is understood, it is sometimes not made available. From a research perspective, this lack of contextual information is problematic. Finally, the web archiving process itself has an impact on what digital scholars can do with the material:

The purpose, strategies and technology of an archive affect what is archived and the manner in which it can be accessed, and in this way influence the possibility of constructing a research object on the basis of the material in the archive. (Nielsen 2016)

It is important that digital scholars keep these various aspects in mind when they undertake their research using data from web archives.

4 Consultation, access and ease of use of web archives

4.1 How to consult and access web archived content?

It is essential to underline that access conditions differ widely between web archives, as can be seen in Table 3. Some of the web archives are freely accessible online, such as Arquivo.pt in Portugal or the web archive developed by the National Library of Ireland15 (Arquivo.pt n.d.-a; National Library of Ireland n.d.-a). For the national libraries, this mission of making national heritage accessible to the public is complementary to their national heritage preservation mandate. However, granting such access to the public must comply with the legal provisions related to copyright. Indeed, the vast majority of archived online content is protected by copyright and, while it is clear that merely archiving it is not likely to cause too much damage to right holders, this is not the case when making this content available to the public (Beunen and Schiphof 2006).

15 In the case of the National Library of Ireland, this only counts for the web archive collections that were based on a selective policy. Access conditions to the web material collected during the top-level domain crawl that started in 2017 were not yet defined at the time of the interview.
As a result, in a number of web archives, only specific parts of the collections are freely accessible. In the case of the British Library, the Open UK web archive and the JISC UK web domain dataset, for example, are freely accessible, whereas the UK non-print legal deposit web archive is not (UK Web Archive (n.d.-b); British Library (n.d.-b)). At the National Archive of the Netherlands, a specific status for access is assigned to each archived website: open, restricted or offline (Posthumus and van Luin 2017b).

Some web archives, which are not freely available, are only accessible on the premises of the library from specific workstations. In the case of the UK non-print legal deposit web archives, the law also specifies that only one user can access a certain piece of online content at any given time.16 A reader card needs to be obtained in some cases to gain access to the reading rooms, as is the case in the National Library of The Netherlands (KB Nederland (n.d.-e)). At the National Library of France (BnF),17 however, the legislation is more flexible: accredited users are allowed to bring their own laptop to connect to the network. At the Royal Danish Library remote access is provided for PhD-level researchers (Moesgaard and Larsen 2017b). Some web archives are also only open to researchers and others are not accessible at all, as is the case for the web archive of the National Library of Luxembourg, where the technical infrastructure is not yet in place (Maurer and Els 2017a). However, in most cases, the access restrictions are in place because of copyright reasons. As Webster states: 'A common feature of most web archiving backed by legal deposit legislation is some sort of restrictions on the access afforded to the end user of the archive' (Webster 2017: 180; see Table 3).

The British Library found a way to avoid certain access restrictions with their interface SHINE, of which the beta version was launched in December 2017 (UK Web Archive (n.d.-d)). Their archive is open to anyone, but for content that is not publicly available, only the metadata is shown (Webber 2017). Other web archiving specialists have shown interest in the SHINE interface; Yves Maurer from the National Library of Luxembourg stated that:

The SHINE interface of the UK British Library would be very useful for digital humanities researchers, for sociologists, political scientists maybe or even journalists. (Maurer and Els 2017a)

In the context of access to web archives, it is important to keep in mind the interests of rights holders. Table 4 provides an overview of how the studied institutions allow web archives to be used. Some countries are keen to put in place a fair balance between the interests of website owners and the interest of the public in accessing archived online content. Indeed, some heritage institutions respect a kind of 'embargo' on access (meaning that content can only be made accessible to the public at the end of a certain period) upon a duly justified request of right holders. For instance, in the United Kingdom right holders have the opportunity to submit a written request to the deposit library to prevent readers' access for a renewable period of three years in order to protect their commercial interests. The British Library grants this 'embargo' request if it considers that providing access to readers during the specified period would unreasonably prejudice the interests of right holders (see Legal Deposit Libraries (Non-Print Works) Regulations of 5th April 2013, Section 25). Arquivo.pt in Portugal also makes use of an automatic 'embargo' for all online publications.

16 See Legal Deposit Libraries (Non-Print Works) Regulation of 5th April 2013, Section 23.
17 See French Heritage Code, art. R132–23-2.
They are attentive to the interests and rights of authors by respecting an access embargo period of one year after the collection of the website, to avoid the archived content competing with the live website (Gomes 2017a; see Table 4).

Table 3 Overview of access methods to the web archives

Country | Institution | Open & freely accessible online | Physical access on location | Who has access?
The Netherlands | National Library | No | Yes | Everyone with a paid library card. Big data researchers can gain access after a meeting and having signed a contract.
The Netherlands | National Archive | Yes (for websites with an 'open' status) | Yes (for websites with a 'restricted' or 'offline' status) | 'Open' and 'offline' status websites: everybody. Some items are 'restricted', which means a special permission is needed (a research proposal is required to obtain this permission, or proof that the subject of the archived content is dead). Together with the special permission, a signed form is needed stating that the user understands their own responsibilities under the privacy law.
France | National Library | No | Yes (but also from within the 26 partner libraries) | Authorised users of the BnF (18 years or older and for university studies, professional or personal research; for the latter two categories, interviews are conducted before accreditation is given).
Luxembourg | National Library | No | No | No public system yet.
UK | British Library | Yes (for the UK web archive) | Yes (for the legal deposit UK web archive and JISC domain dataset) | Everyone with a reader's pass.
UK | National Archives | Yes | No | Everyone.
Denmark | Royal Danish Library | Yes (only for researchers conducting research at PhD level or above) | Yes (only for researchers) | Only for research purposes, after filling in an application form that needs to be evaluated.
Portugal | Foundation for Science and Technology | Yes | No | Everyone.
Ireland | National Library | Yes | No | Everyone.

The information included in this table can be found in: KB Nederland (n.d.-e); Posthumus and van Luin 2017b; BnF 2017c; Maurer and Els 2017b; UK Web Archive (n.d.-c); British Library (n.d.-a); Webber 2017; National Archives (n.d.-b); Moesgaard and Larsen 2017b; Arquivo.pt (n.d.-b); National Library of Ireland 2017a, b.

Finally, web archives raise the question of how to proceed in relation to illegal content. Since most web archiving procedures are automatic, it is inevitable that so-called 'illegal' content is sometimes collected.
This was also noted by the National Library of France (BnF), where Sara Aubry stated:

We will not collect them, we will not take active steps to collect [illegal content] in the context of selective crawls [thematic or events-based collections, for example]. In the broad crawls [covering the capture of a representative sample of the French web], however, we will not refrain from collecting them. (Tanésie et al. 2017, translated from French)

Table 4 Overview of allowed use of the web archives

Country | Institution | Functionalities
The Netherlands | National Library | Copy only for themselves.
The Netherlands | National Archive | Not specified.
France | National Library | Short quotations and screenshots only for teaching and research. It is forbidden to download archived files, and other technical restrictions may prevent copying of texts or screenshots.
Luxembourg | National Library | No functionalities, as no access is currently provided.
UK | British Library | Printing of material in the legal deposit UK web archive is allowed, but very limited.
UK | National Archives | Most Crown copyright material within the Web Archive can be used without formal permission under the terms of the Open Government Licence. Where the copyright of material is owned by a third party, it is the responsibility of the user to obtain the necessary permission for re-use.
Denmark | Royal Danish Library | Possible to make a copy of the website for personal use, and to display the archive or websites from the archive for teaching (non-public classes or courses). Use in public, scientific and television presentations and for scientific publications is also possible, but with certain restrictions.
Portugal | Foundation for Science and Technology (FCT) (Arquivo.pt) | Access is intended to support work of an educational, scientific or research nature. Use for commercial purposes is strictly forbidden.
Ireland | National Library | Available for the purposes of research and private study only. For publication, permission is needed from the National Library. When copyright exists and is not held by the National Library, the copyright holder's permission is also needed.

The information included in this table can be found in: KB Nederland (n.d.-e); BnF (2017c); Maurer and Els 2017b; UK Web Archive (n.d.-c); National Archives (n.d.-a); Netarkivet.dk 2016b; Arquivo.pt (n.d.-d); National Library of Ireland (n.d.-b, n.d.-c).

If web archives contain illegal content, heritage institutions usually ensure that these archived web pages are not made accessible to the public. Nevertheless, such content might be of interest for digital scholars and researchers in certain disciplines to understand and analyse the history and culture of the country. To paraphrase Valérie Schafer: having, for example, access to past Neo-Nazi websites, which, in fact, contain hate speech, is of utmost importance, both for the study of digital cultures and of history in general (Tanésie and Aubry 2017).

4.2 What makes a web archive easy to use?

Once users have obtained access to a web archive, archived websites are often not easily discoverable via the available search and browse methods (see Table 5). This inhibits use (Dooley 2016). Two main challenges to ensuring discoverability in the context of a web archive were revealed: the lack of descriptive metadata guidelines and the lack of a clear understanding of user needs and behaviour (Dooley et al. 2017).
It is necessary to address these two challenges in order to guarantee the discoverability of web archives. The lack of descriptive metadata guidelines related to web archiving is also problematic for initiatives where the aim is to link different web archives, as is the case for the National Coalition for Digital Preservation (NCDD) in The Netherlands. The NCDD is working on promoting cooperation and creating an inventory of which material is present in which web archive (NCDD n.d.). Related to this initiative, Teszelszky (2017a) said: 'If we want to have a national web collection, we need to use the same software. We need to have common standards and that's something that will be worked on'. Increasing standardisation of metadata management would, therefore, be advantageous for the users.

Table 5 Overview of search options in the web archives

Country | Institution | URL | Full-text | Topical browsing | Alphabetic browsing
The Netherlands | National Library | Yes | No | No | No
The Netherlands | National Archive | No | No | No | No
France | National Library | Yes | Yes | Yes | No
Luxembourg | National Library | Not open to the public yet | Not open to the public yet | Not open to the public yet | Not open to the public yet
UK | British Library | Yes | Yes | Yes | No
UK | National Archives | Yes | Yes | No | Yes
Denmark | Royal Danish Library | Yes | Yes | No | No
Portugal | Foundation for Science and Technology | Yes | Yes | No | No
Ireland | National Library | Yes | Yes | No | Yes

The information included in this table can be found in: Teszelszky 2017b; Posthumus and van Luin 2017b; BnF (2017c); Maurer and Els 2017b; UK Web Archive (n.d.-b); The National Archives n.d.; Gomes 2017b; National Library of Ireland (n.d.-a).

The second most frequently mentioned challenge is the need for a better understanding of user needs and behaviour to ensure discoverability of archived websites (Costa and Silva 2010; Dougherty et al. 2010). Many web archiving institutions do not have accurate statistics on the number of visitors to their web archive. Often the number of visitors to the web archive is merged with the number of visitors of the whole website (as is the case at the National Archive of the Netherlands), or in other cases the internal use of the staff is included (as is the case at the National Library of the Netherlands). Furthermore, numbers like these do not indicate who these visitors are, why they are visiting, what they expect to find, what they take away with them and whether they experienced any degree of satisfaction. As Maria Ryan (2017) of the National Library of Ireland stated: 'It's difficult to get good analytics on web archive users, due to the fact the selective web archive can be accessed remotely'.

In the case of Arquivo.pt, efforts are made to target the right people to stimulate them to make use of the web archive. In this regard, they have a well-defined communication strategy in place to encourage researchers and academia to use their collections. For example, they organise contests offering prize money to researchers working with their collections (Gomes 2017b). That user engagement is also considered an important matter at the British Library is underscored by the fact that they have a 'Web Archiving Engagement Manager' for the web archive (British Library 2017b). This contrasts with other web archiving initiatives that find it hard to attract users:

Not many people are using our web archive. I think we have 100 visitors a year [...]
We only see this year that these kind of researchers come to our web archive because some websites are not in the Internet Archive. (Teszelszky 2017a)

In general, most interfaces of web archives afford a form of URL search (either searching for an exact URL or a specific part of a URL), combined with full-text searches. The URL approach has been dominant for years (Ben-David and Huurdeman 2014), but, recently, full-text search has also come to be supported by most of the web archives. Research by Costa and Silva (2010) shows that users prefer full-text search to URL search. However, some web archives have also permitted other types of searches for some time now. In such web archives, the user can also explore topical collections or undertake alphabetical browsing (see Table 5).

5 Overview of tools used in web archiving

This section briefly describes web archives from a technical viewpoint. In particular, it discusses software tools involved in the process of gathering web content and analysing this content that might be relevant for digital scholars. Not all available tools are described, however, nor are the long-term preservation systems or the back-ends of archives.

Web archiving starts with harvesting or crawling websites, which means trying to get a copy of websites. Since web content is diverse (static pages, dynamic pages, multimedia, social media, etc.), different harvesting tools focus on different types of content. Typically they produce output that can be stored or archived, for instance, as a directory structure on disk, mimicking the original website, or as Web Archive (WARC) files (ISO 2017).

HTTrack (Roche 2018) copies the website(s) to disk so the user can simply open it in a browser. It uses a single thread, so one instance is only suited for limited crawls. Webrecorder (Webrecorder n.d.) uses a browser to harvest content of websites, hereby addressing typical issues of other harvesting tools: dynamic content, flash, multimedia, etc. It 'records' web pages as the user browses them, so it is suited for very selective, high quality crawling. Although it requires some technical skills to install, an online demo is available. The content is saved in the WARC format. Wget (Free Software Foundation 2017) and the similar tool Wpull (Foo 2016) are versatile command line tools that have built-in web crawling functionality, comparable to HTTrack. They can write to a directory structure or to WARC files. Wpull is better suited for large crawls because it stores detected URLs to disk, as opposed to Wget, which stores them in often limited computer memory, and it offers deduplication (i.e. it crawls a page only once). Both tools are rather easy to install and to run; the art is to compose the right commands to instruct them. Grab-site (Grab-site GitHub 2018) provides a graphical interface for Wpull.

Social media require specialised tools to capture their content because of their very dynamic nature. Capturing content is typically done programmatically using Application Programming Interfaces or APIs, offered by the social media providers. F(b)arc (Fbarc GitHub 2018) is a command line tool that can be used to archive data using the Facebook Graph interface. Twarc (Twarc GitHub 2018) is a command line tool and library that makes using the Twitter APIs easy. It can be used to archive data, detect trends, search friends, etc. Social Feed Manager (Social Feed Manager 2018) can harvest data from Twitter, Tumblr, Flickr, and Sina Weibo.
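As a minimal sketch of such API-based capture, the snippet below uses the twarc library (version 1.x, contemporary with this article) to save one account's recent tweets as JSON lines. The credentials and the account name are placeholders, not real values.

import json
from twarc import Twarc

# Placeholder Twitter API credentials; real keys are issued by Twitter.
t = Twarc("consumer_key", "consumer_secret", "access_token", "access_secret")

# Store one tweet per line (JSON Lines), a common format for tweet captures.
with open("tweets.jsonl", "w", encoding="utf-8") as out:
    for tweet in t.timeline(screen_name="example_account"):
        out.write(json.dumps(tweet) + "\n")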
Web archiving organisations tend to use more advanced tools, which often require technical skills to install and use. Heritrix (Webarchive.jira.com, July 2016) is a general purpose web crawler designed with web archiving in mind. It can be configured for broad crawls or targeted crawls, on one machine or in clusters, it can be extended with custom code, etc. It is suited for large scale crawling activities, but less so for dynamic pages or social media. It produces WARC files. The NetarchiveSuite (Rosenthal 2017) is built with Heritrix at its core, but provides extra functionality in the areas of deployment, long-term preservation and access. Brozzler (Brozzler GitHub 2018) uses the engine of the Chrome browser to harvest pages, which offers the same advantages Webrecorder offers, but it requires no user interaction during crawling. It can be set up on a cluster.

Besides tools to get the data, there are also tools for doing something with the data. Tools to view the archived websites include Webrecorder Player (Webrecorder Player for Desktop GitHub 2018), OpenWayback (IIPC 2018), pywb (Pywb GitHub 2018) and WAIL (Web Archiving Integration Layer) (Kelly 2017). Webrecorder Player is relatively easy to install and use and can open content from ARC, WARC and HAR (http Archive) files. OpenWayback reads and indexes WARC files and lets users browse or search the archived content in a web browser. Pywb offers OpenWayback functionality, but it also enables web pages to be recorded while the user surfs the web. It is the software used in Webrecorder and Webrecorder Player. Note that OpenWayback and pywb require technical skills to set up. WAIL is an easy-to-use tool with a graphical user interface that combines Heritrix for capturing websites and OpenWayback for viewing the captured content.

Several tools and libraries exist to enable processing archived data, but don't do a lot of actual processing. Tools that can read and write data, or validate and extract metadata from WARC files include JWAT (Clarke 2016), node-warc (Node-warc GitHub 2018), WARCAT (Web ARChive (WARC) Archiving Tool) (WARCAT GitHub 2017), warcio (Warcio GitHub 2017) and warctools (Warctools GitHub 2016). These tools often require programming skills to write software that processes the data itself.

Some tools go a step further and provide a framework for analysing web archives. The Archives Unleashed Toolkit (AUT), part of the Archives Unleashed Project (Archives Unleashed Project 2018), provides a flexible data model for storing and managing raw content as well as metadata and extracted knowledge. Although basic programming or scripting skills are required, a lot of built-in functions (including extracting links, popular images, and named entity extraction) help the writing of powerful code. A version running in the cloud, providing a user interface, is currently being developed. A tool similar to AUT is ArchiveSpark (ArchiveSpark GitHub 2018). This tool focuses somewhat more on entity recognition and linking than AUT. Another difference is that ArchiveSpark extensively uses CDX files, which are indexes generated from WARC files to speed up some processing. Both tools are built using the Apache Spark analytics engine, enabling a plethora of (big) data processing and analysis tools on top of their own functionality.
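To give an idea of what processing archived data looks like in practice, here is a minimal sketch using the warcio library mentioned above to iterate over the records of a (possibly gzipped) WARC file; the file name is hypothetical.

from warcio.archiveiterator import ArchiveIterator

with open("example.warc.gz", "rb") as stream:        # hypothetical WARC file
    for record in ArchiveIterator(stream):
        if record.rec_type == "response":            # skip requests, metadata, etc.
            url = record.rec_headers.get_header("WARC-Target-URI")
            payload = record.content_stream().read()
            print(url, len(payload), "bytes")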
A final aspect worth mentioning is how to access publicly available archived data from organisations. As described before, most organisations make this data accessible by means of a web page. However, there is a standardised way of getting web resources near a given timestamp, with a specific URL: Memento (Van de Sompel et al. 2013). It is not necessary to know which organisation holds the data, as long as it runs a Memento-aware web service. Organisations supporting Memento include Arquivo.pt, the National Library of Ireland, the UK Government Web Archive, the UK Web Archive, the Internet Archive and many more (Kremer 2016). OpenWayback and pywb, for instance, are tools that provide Memento functionality. A number of Memento clients exist, as standalone libraries or browser plugins, which can be used to access data in this way. A demo is available online at http://timetravel.mementoweb.org/.
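A small sketch of such Memento-based access is given below, using the public Time Travel demo service mentioned above. The JSON fields read here are an assumption based on the service's documented response format, and the target URL and timestamp are arbitrary examples.

# pip install requests
# Query the Memento Time Travel aggregator for an archived copy of a
# URL near a given timestamp (YYYYMMDDhhmmss). The response fields
# used here are assumptions based on the service's JSON API.
import requests

ts, url = "20150601000000", "http://www.bl.uk/"
resp = requests.get(f"http://timetravel.mementoweb.org/api/json/{ts}/{url}")
resp.raise_for_status()

closest = resp.json().get("mementos", {}).get("closest", {})
print("Closest capture:", closest.get("datetime"), closest.get("uri"))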
6 Discussion and conclusion

Our explorative analysis of European web archives for use by digital scholars underlines four important considerations:

(1) Digital scholars need to investigate why, by whom and on whose behalf web archiving is being done. This is important because it '(…) serves to orient users as to some of the questions they should be asking of their sources, and of the institutions that provide them' (Webster 2017: 176). With regard to why the content in question has been archived, it has been shown that the selection is based on a variety of strategies and criteria. Sometimes the collection scope is defined by law, as is the case in countries with legal deposit legislation; in other cases, the scope is defined by the heritage institution itself. In the case of national libraries and Arquivo.pt, it has been shown that two main approaches exist: broad crawls and selective crawls, although some institutions combine both. Broad crawls cover top-level domain crawls and relevant content outside of the national domain(s), and selective crawls mostly focus on specific events, themes or emergencies. When it comes to social media in the studied web archives, it has been demonstrated that approaches differ widely: some institutions do not (yet) include any social media content, while others cover several platforms. Twitter, Facebook and YouTube are the social network platforms that are most often included by web archives. Who does the web archiving is another important factor from the user perspective. Sometimes specific collection specialists are responsible, whereas in other cases selection is a responsibility that is shared collectively by a large number of people. A number of institutions also work together with external experts for the selection of web content, and most of the studied initiatives offer the public the possibility to submit suggestions for inclusion in the web archive. The context in which web archiving has taken place is, therefore, very important for researchers, as it has a significant impact on its use as a source in scholarly studies (Webster 2017).

(2) Access conditions differ widely between web archives, and the vast majority of archived online content is protected in order to respect the legal provisions relating to copyright. Once access to the archive is gained, most web archive interfaces only afford simple tools (e.g. URL or full-text searches). Researchers also need to take into account the integrity and authenticity of the information captured (which is strongly linked with the quality assurance and metadata management procedures of the web archive). Nielsen remarks that the 'ongoing efforts that are being made to enhance access to the archives' (2016: 22) also reveal new challenges. For example, full-text keyword search provides different possibilities for finding material, but can potentially create challenges for digital scholars, such as data overload and the task of filtering out the relevant results by themselves. The latter is difficult, as most digital scholars are so accustomed to seeking information by querying search engines such as Google, where the results are ordered by relevance, that they expect to find information in web archives in the same way (Costa and Silva 2011).

Ultimately, digital scholars need data-level access to web archives to undertake analysis using digital tools and methods. A pioneer in this area is Ian Milligan, who made a number of datasets available in the context of the Web Archives for Longitudinal Knowledge project (WALK n.d.), including information about those datasets and how to cite them. It is anticipated that data-level access to web archives will increase in the future (Lin et al. 2017). Digital scholars will thus need to become aware of the characteristics of web archive search results and of the fact that they can sometimes be problematic. For example, web archive search results will often be very numerous, ordered by neither relevance nor importance, and full of irrelevant material and false returns (Deswarte 2015). The tools and the interfaces offered by web archives are very much in an early stage of development, and web archivists are only beginning to tackle the strengths and weaknesses of both their data and interfaces. Another identified challenge is the need for a better understanding of user needs and behaviour, which is hard to meet given the limited resources of web archiving institutions. However, a variety of studies have investigated the practices of web archiving and of researchers using web archives. Studies done by BUDDAH underlined, amongst other issues, the lack of guidance for humanities researchers: 'A shared conceptual framework of the web archives research process is essential to systematize practices, advance the field, and to welcome new entrants to this area. [...] Such a framework would be structurally useful to describe any research that investigates social questions based on web archives' (Maemura et al. 2016: 3251–3252). Web archiving institutions could play a role in providing this guidance for researchers.

(3) With regard to legal frameworks, it is important for digital scholars to understand the general legal provisions governing web archiving. Increasingly, many European countries have extended their legal deposit legislation to include web archiving. While this means that national libraries have a formal mandate to archive the web content of their nation, there are still challenges to providing access to this content, for example for research use. Sometimes this access can only be provided on-site within a national library. For countries where there is no legal deposit legislation in place, a number of often pragmatic solutions, such as approaching website owners to ask permission to archive their website content, are in place to enable cultural heritage organisations to archive websites.
It is also important to bear in mind that the General Data Protection Regulation (GDPR) gives the Member States the possibility to put in place a softened regime when personal data are processed in specific contexts such as archiving in the public interest, historical or scientific research, and statistical purposes.

(4) Using web archives as a basis for research requires, perhaps even more than other digital research materials, a relatively high level of technical knowledge. Not only is it important to understand the context in which the websites were archived (e.g. how they were selected, and when and with which tools they were archived), but there are also technical challenges to accessing this content (e.g. full-text search is not always readily available) and to understanding the file formats (e.g. WARC) that have been used for web archiving. However, thanks to the growing community that is building around web archiving (e.g. IIPC) and research using web archives (e.g. RESAW), the expertise, tools and knowledge are also growing.

Given the importance of the legislative, technical and policy-related elements linked to the creation of a web archive as a research object, it is paramount to provide adequate information and documentation about this context to the users of the web archive in order to open up the black box of web archiving. The Portuguese web archive can be considered a good example in this context, as videos are created that shed light on the inner workings of the web archive, thereby furthering transparency (Arquivo.pt 2018). The features and history of a web archive are pertinent to all its users. They are particularly relevant for evaluating the web archive as a data source. As Laursen states: 'In short, the story of an archive is relevant for the trustworthiness of the archive' (2017: 223).

We have shown that many challenges are associated with web archiving. However, some of the greatest challenges, seen from the user's perspective, come down to two factors. Firstly, it is impossible to save everything, and the choices made are significant for the research object. As Masanès states: 'Web archiving is often a matter of choices, as perfect and complete archiving is unreachable' (Masanès 2005: 77). Secondly, in most cases the object researchers are attempting to preserve when creating a web archive will be distorted by the actual archiving process (Nielsen 2016). It could be argued that it is unlikely, if not impossible, that we can preserve all of the attributes and functionality of digital materials. However, little is known about the levels of loss that are acceptable to digital scholars (Harvey 2005).

7 Limitations and future research

Although this research produced useful insights on how European web archiving initiatives select and open up archived web content, the research design had some limitations. Most importantly, this study was limited to a sample of only nine web archives, eight of which are managed by heritage institutions. This is not meant to be a representative sample of the web archiving landscape, as it only includes European web archives. In addition, all these web archives are members of the International Internet Preservation Consortium (IIPC), except for the National Archive of the Netherlands. Despite these limitations, this article can function as a point of departure for more extensive and qualitative research.
With regard to selection, research into the retrieval of examples of the earliest web pages of a national web domain would be very interesting, as would studies about how to ensure the representative inclusion of web material about and from minority groups in web archives. Furthermore, the different models of collaboration for selection with partners external to national heritage institutions, such as digital scholars and members of the general public, could also be an interesting research subject. From an access perspective, it could be worthwhile to explore how secure remote access to web archives could be provided for researchers, in compliance with the related legal provisions. Furthermore, research related to data-level access to web archives would be another valuable research area, backed by a solid evidence base from user studies. From a legal point of view, future research can centre around two legal developments that will impact web archiving: on the one hand, the impact of the GDPR on the legislation of the various EU member states; on the other hand, the reform of copyright exceptions and limitations at the European level. From a technical point of view, it has been noted that 'the archive separates itself increasingly from the live web the archive tries to preserve' (Laursen and Møldrup-Dalum 2017: 216) and that further research into the development of solutions and tools for the various technical challenges web archives are confronted with is, therefore, essential.

Acknowledgements The research outlined in this article was conducted in the context of the PROMISE project. This project received funding from the Belgian Science Policy Office (BELSPO) in December 2016, through their Belgian Research Action through Interdisciplinary Networks (BRAIN) research programme, for a 24-month period. The project was initiated by the Royal Library of Belgium and the State Archives of Belgium, and the project consortium also includes the universities of Ghent and Namur and the Information and Documentation School of the Brussels-Brabant Institute of Higher Education (HE2B IESSID). We would like to thank the interviewees and their colleagues for taking the time to answer our many questions.

List of institutions and representatives consulted
– National Library of The Netherlands: Kees Teszelszky (Researcher web archiving, Digital Preservation Department)
– National Archive of The Netherlands: Antal Posthumus (Adviser recordkeeping, Directie Infrastructuur & Advies) and Jeroen van Luin (Acquisition and Maintenance of Digital Archives)
– National Library of France (BnF): Pascal Tanésie (Assistant to the head of the department of digital legal deposit), Sara Aubry (Web Archiving Project Manager, IT department) and Bert Wendland (IT Department)
– National Library of Luxembourg: Yves Maurer (Webarchiving Technical Manager) and Ben Els (Digital Curator)
– The Royal Danish Library: Jakob Moesgaard (Specialkonsulent, Department of Digital Legal Deposit and Preservation) and Tue Hejlskov Larsen (IT analyst)
– The UK National Archives: Tom Storrar (Head of Web Archiving) and Claire Newing (Web Archivist)
– The British Library: Jason Webber (Web Archiving Engagement and Liaison Manager)
– Arquivo.pt: Daniel Gomes (Head of Arquivo.pt, the Portuguese web archive, Advanced Services Department)
– National Library of Ireland (NLI): Maria Ryan (Web Archivist)

References
Archives Unleashed Project. (2018). The Archives Unleashed Project. Retrieved from http://archivesunleashed.org/. Last accessed on 20/04/2018.
ArchiveSpark GitHub. (2018). Helgeho/ArchiveSpark: An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed by the Internet Archive and L3S Research Center. Retrieved from https://github.com/helgeho/ArchiveSpark. Last accessed on 20/04/2018.
Arquivo.pt. (2018). Arquivo.pt (Portuguese web-archive): official playlist. Retrieved from https://www.youtube.com/playlist?list=PLKfzD5UuSdETtSCX_TM02nSP7JDmGFGIE. Last accessed on 12/02/2018.
Arquivo.pt. (n.d.-a). Arquivo.pt. Retrieved from http://www.arquivo.pt. Last accessed on 22/01/2018.
Arquivo.pt. (n.d.-b). Knowledge. Retrieved from https://www.fccn.pt/en/knowledge/arquivo-pt/. Last accessed on 20/10/2017.
Arquivo.pt. (n.d.-c). Crawling and archiving Web content. Retrieved from http://sobre.arquivo.pt/en/help/crawling-and-archiving-web-content/#qe-faq-2416. Last accessed on 20/10/2017.
Arquivo.pt. (n.d.-d). Terms and conditions. Retrieved from http://sobre.arquivo.pt/en/about/terms-and-conditions/. Last accessed on 31/01/2017.
Arquivo.pt. (n.d.-e). What is Arquivo.pt - the Portuguese Web Archive? Retrieved from http://sobre.arquivo.pt/en/help/what-is-arquivo-pt/. Last accessed on 20/10/2017.
Ben-David, A., & Huurdeman, H. (2014). Web archive search as research: Methodological and theoretical implications. Alexandria, 25(1–2), 93–111.
Beunen, A., & Schiphof, T. (2006). Legal aspects of web archiving from a Dutch perspective (report commissioned by the National Library in The Hague).
BnF. (2014). Historique de l'archivage web. Retrieved from http://www.bnf.fr/fr/professionnels/archivage_web_bnf/a.depot_legal_internet_histoire.html#SHDC__Attribute_BlocArticle1BnF. Last accessed on 22/01/2018.
BnF. (2016). BnF Collecte de web (BCWeb). Retrieved from https://collecteweb.bnf.fr/login.html. Last accessed on 04/02/2018.
BnF. (2017a, February). Collectes ciblées de l'internet français. Retrieved from http://www.bnf.fr/fr/collections_et_services/anx_pres/a.collectes_ciblees_arch_internet.html. Last accessed on 16/12/2017.
BnF. (2017b). Internet archives. Retrieved from http://www.bnf.fr/en/collections_and_services/book_press_media/a.internet_archives.html. Last accessed on 21/09/2017.
BnF. (2017c). Guide des archives de l'Internet [Brochure]. Retrieved from http://www.bnf.fr/documents/guide_archives_internet.pdf. Last accessed on 20/09/2017.
BnL. (n.d.). Appel à participation - Bibliothèque nationale de Luxembourg. Retrieved from http://crawl.bnl.lu/2017/06/appel-a-participation-bibliotheque-nationale-de-luxembourg-web-archive/. Last accessed on 26/01/2018.
British Library. (2017a, April 18). The challenges of web archiving social media [web log message]. Retrieved from http://blogs.bl.uk/webarchive/2017/04/the-challenges-of-web-archiving-social-media.html. Last accessed on 30/10/2017.
British Library. (2017b, May 17). Web Archiving Engagement Manager. Retrieved from https://www.bl.uk/people/experts/jason-webber. Last accessed on 04/02/2018.
British Library. (n.d.-a). UK web archive. Retrieved from https://www.bl.uk/collection-guides/uk-web-archive. Last accessed on 31/10/2017.
British Library. (n.d.-b). Explore the British Library. Non-print legal deposit: FAQs. Retrieved from http://www.bl.uk/catalogues/search/non-print_legal_deposit.html. Last accessed on 31/10/2017.
Brozzler GitHub. (2018). internetarchive/brozzler: brozzler - distributed browser-based web crawler. Retrieved from https://github.com/internetarchive/brozzler. Last accessed on 20/04/2018.
Brügger, N., Laursen, D., & Nielsen, J. (2017). Exploring the domain names of the Danish web. In N. Brügger & R. Schroeder (Eds.), The web as history. Using web archives to understand the past and present (pp. 62–80). London: UCL Press.
BUDDAH, Big UK Domain Data for the Arts and Humanities. (2014). Bursaries. Retrieved from https://buddah.projects.history.ac.uk/news/bursaries/. Last accessed on 04/02/2018.
Chakraborty, A., & Nanni, F. (2017). The changing digital faces of science museums: A diachronic analysis of museum websites. In N. Brügger (Ed.), Web 25. Histories from the first 25 years of the world wide web (pp. 157–174). New York: Peter Lang.
Clarke, N. (2016). JWAT. Retrieved from https://sbforge.org/display/JWAT/JWAT. Last accessed on 20/04/2018.
Costa, M., & Silva, M. (2010). Understanding the information needs of web archive users. In Proceedings of the 10th International Web Archiving Workshop (pp. 9–16).
Costa, M., & Silva, M. (2011). Characterizing search behavior in web archives. In Proceedings of the 1st International Temporal Web Analytics Workshop.
Deswarte, R. (2015). Revealing British euroscepticism in the UK web domain and archive case study. Retrieved from http://sas-space.sas.ac.uk/6103/#undefined. Last accessed on 25/01/2018.
Dooley, J. (2016, October). Metadata to meet user needs. Presented at the OCLC Member Forum, Los Angeles.
Dooley, J. M., Farrell, K. S., Kim, T., & Venlet, J. (2017). Developing web archiving metadata best practices to meet user needs. Journal of Western Archives, 8(2), Art. 5, 15 pp.
Dougherty, M., Meyer, E. T., Madsen, C., van den Heuvel, C., Thomas, A., & Wyatt, S. (2010). Researcher engagement with web archives: State of the art. London: JISC.
Fbarc GitHub. (2018). justinlittman/fbarc: A command line tool and Python library for archiving data from Facebook using the Graph API. Retrieved from https://github.com/justinlittman/fbarc. Last accessed on 20/04/2018.
Foo, C. (2016). Welcome to Wpull's documentation! - Wpull 2.0.1 documentation. Retrieved from https://wpull.readthedocs.io/en/master/#. Last accessed on 20/04/2018.
Free Software Foundation. (2017). Wget - GNU Project - Free Software Foundation. Retrieved from https://www.gnu.org/software/wget/. Last accessed on 20/04/2018.
Gomes, D. (2017a, November 30). Web preservation demands access. Retrieved from http://www.dpconline.org/blog/idpd/web-preservation-demands-access. Last accessed 14/12/2017.
Gomes, D. (2017b, November 24). Personal interview via Zoom with Daniel Gomes / Interviewers: Sally Chambers, Friedel Geeraert, Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot. [M4A file].
Grab-site GitHub. (2018). ludios/grab-site: The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns. Retrieved from https://github.com/ludios/grab-site. Last accessed on 20/04/2018.
Graff, E., & Sepetjan, S. (2011). Le dépôt légal en France. Les cahiers de la propriété intellectuelle, 2011/1, 179–180.
Harvey, D. R. (2005). Preserving digital materials. München: KG Saur.
Helmond, A., Nieborg, D., & van der Vlist, F. N. (2017). The political economy of social data: A historical analysis of platform–industry partnerships. In Proceedings of the 8th International Conference on Social Media & Society (SMSociety 17). New York: ACM Press. https://doi.org/10.1145/3097286.3097324.
Hockx-Yu, H. (2014). Archiving social media in the context of non-print legal deposit. Paper presented at IFLA, Lyon.
IIPC. (2017). Why archive the web? Retrieved from http://netpreserve.org/web-archiving/. Last accessed on 22/01/2018.
IIPC. (2018). OpenWayback. Retrieved from http://netpreserve.org/web-archiving/openwayback/. Last accessed on 09/02/2018.
ISO. (2017). Information and documentation - WARC file format (ISO 28500:2017).
KB Nederland. (n.d.-a). Selectie bij webarchivering. Retrieved from https://www.kb.nl/organisatie/onderzoek-expertise/e-depot-duurzame-opslag/webarchivering/selectie-bij-webarchivering. Last accessed on 19/12/2017.
KB Nederland. (n.d.-b). Legal issues. Retrieved from https://www.kb.nl/en/organisation/research-expertise/long-term-usability-of-digital-resources/web-archiving/legal-issues. Last accessed on 22/09/17.
KB Nederland. (n.d.-c). Web archiving. Retrieved from https://www.kb.nl/en/organisation/research-expertise/long-term-usability-of-digital-resources/web-archiving. Last accessed on 22/09/17.
KB Nederland. (n.d.-d). KB-webarchief: veelgestelde vragen. Retrieved from https://www.kb.nl/organisatie/onderzoek-expertise/e-depot-duurzame-opslag/webarchivering/kb-webarchief-veelgestelde-vragen. Last accessed 08/12/2017.
KB Nederland. (n.d.-e). Gebruiksvoorwaarden webarchief Koninklijke Bibliotheek. Retrieved from https://www.kb.nl/bronnen-zoekwijzers/databanken-mede-gemaakt-door-de-kb/webarchief-kb/gebruiksvoorwaarden-webarchief-koninklijke-bibliotheek. Last accessed on 08/12/2017.
Kelly, M. (2017). Web Archiving Integration Layer (WAIL). Retrieved from https://machawk1.github.io/wail/. Last accessed on 20/04/2018.
Koerbin, P. (2017). Revisiting the world wide web as artefact: Case studies in archiving small data for the National Library of Australia's PANDORA archive. In N. Brügger (Ed.), Web 25. Histories from the first 25 years of the world wide web (pp. 191–206). New York: Peter Lang.
Kremer, I. (2016). About the Time Travel Service. Retrieved from http://timetravel.mementoweb.org/about/. Last accessed on 20/04/2018.
Kunze, S., & Power, B. (n.d.). The 1916 Easter Rising Web Archive Project, p. 2. Retrieved from https://archivedweb.blogs.sas.ac.uk/files/2017/06/RESAW2017-PowerKunze-The_1916_Easter_Rising_web_archive_Project.pdfp_.pdf. Last accessed on 2/11/2017.
Laursen, D., & Møldrup-Dalum, P. (2017). Looking back, looking forward: 10 years of web development to collect, preserve and access the Danish web. In N. Brügger (Ed.), Web 25. Histories from the first 25 years of the world wide web (pp. 207–228). New York: Peter Lang.
Lin, J., Milligan, I., Wiebe, J., & Zhou, A. (2017). Warcbase: Scalable analytics infrastructure for exploring web archives. Journal on Computing and Cultural Heritage, 10(4), 1–30. https://doi.org/10.1145/3097570.
Maemura, E., Becker, C., & Milligan, I. (2016). Understanding computational web archives research methods using research objects. In James Joshi, George Karypis, Ling Liu, et al., 2016 IEEE International Conference on Big Data (Big Data) (pp. 3250–3259).
Masanès, J. (2005). Web archiving methods and approaches: A comparative study. Library Trends, 54(1), 72–90.
Maurer, Y., & Els, B. (2017a, November 24). Personal interview via GoToMeeting with Yves Maurer and Ben Els / Interviewers: Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Eveline Vlassenroot. [M4A file].
Maurer, Y., & Els, B. (2017b, November 24). Written answers given by the Bibliothèque nationale de Luxembourg via Google Docs before the personal interview with Yves Maurer and Ben Els / Interviewers: Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Eveline Vlassenroot.
Moesgaard, J., & Larsen, T. H. (2017a, November 30). Personal interview via GoToMeeting with Jakob Moesgaard & Tue Hejlskov Larsen / Interviewers: Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Sally Chambers and Alejandra Michel.
Moesgaard, J., & Larsen, T. H. (2017b, November 30). Written answers given by the Danish Royal Library via Google Docs before the personal interview with Jakob Moesgaard & Tue Hejlskov Larsen / Interviewers: Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Sally Chambers and Alejandra Michel.
National Archives. (n.d.-a). How to use the web archive. Retrieved from http://www.nationalarchives.gov.uk/webarchive/information/. Last accessed on 19/10/2017.
National Archives. (n.d.-b). UK Government web archive. Retrieved from http://www.nationalarchives.gov.uk/webarchive/. Last accessed on 31/10/2017.
National Library of Ireland. (2017a). NLI Review 2016. Retrieved from https://www.nli.ie/GetAttachment.aspx?id=011e629f-1a5a-4cde-91d7-8a62ccf84bef. Last accessed 9/10/2017.
National Library of Ireland. (2017b). Web Archive FAQ & Resources. Retrieved from https://www.nli.ie/en/web-archive-faq.aspx. Last accessed on 9/10/2017.
National Library of Ireland. (n.d.-a). NLI Web Archive: A record of the online life in Ireland. Retrieved from http://collection.europarchive.org/nli. Last accessed on 1/02/2018.
National Library of Ireland. (n.d.-b). Rights and Reproductions. Retrieved from https://www.nli.ie/en/rights-reproductions.aspx. Last accessed on 31/01/2018.
National Library of Ireland. (n.d.-c). Web Archive. Retrieved from https://www.nli.ie/en/web_archive.aspx. Last accessed on 31/01/2018.
National Library of Ireland. (n.d.-d). Web archive collections. Retrieved from http://www.nli.ie/en/udlist/web-archive-collections.aspx. Last accessed on 20/10/2017.
NCDD. (n.d.). Expertgroep webarchivering. Retrieved from http://www.ncdd.nl/kennis-en-advies/expertgroepen/expertgroep-webarchivering/. Last accessed on 08/12/2017.
Netarkivet.dk. (2016a). Selektive høstninger. Retrieved from http://netarkivet.dk/om-netarkivet/selektive-hostninger_2016/. Last accessed on 31/10/2017.
Netarkivet.dk. (2016b). Adgang til Netarkivet. Retrieved from http://netarkivet.dk/adgang/. Last accessed on 31/01/2018.
Netarkivet.dk. (2017). Brugermanual til Netarkivet. Retrieved from http://netarkivet.dk/wp-content/uploads/2015/03/Netarkivet_Strategi_Langtidsbevaring_1.0_150115.pdf. Last accessed on 1/02/2018.
Nielsen, J. (2016). Using web archives in research - an introduction. Retrieved from http://www.netlab.dk/wp-content/uploads/2016/10/Nielsen_Using_Web_Archives_in_Research.pdf. Last accessed on 18/01/2018.
Node-warc GitHub. (2018). N0taN3rd/node-warc: Parse And Create Web ARChive (WARC) files with node.js. Retrieved from https://github.com/N0taN3rd/node-warc. Last accessed on 20/04/2018.
Ogden, J., Halford, S., & Carr, L. (2017). Observing web archives. The case for an ethnographic study of web archiving. WebSci, June (25-28). https://doi.org/10.1145/3091478.3091506.
Posthumus, A., & van Luin, J. (2017a, December 6). Personal interview via UC4all with Antal Posthumus and Jeroen van Luin / Interviewers: Eveline Vlassenroot and Friedel Geeraert.
Posthumus, A., & van Luin, J. (2017b, December 6). Written answers given via Google Docs by the National Archive before the personal interview with Antal Posthumus and Jeroen van Luin / Interviewers: Eveline Vlassenroot and Friedel Geeraert.
Pywb GitHub. (2018). webrecorder/pywb: Core Python Web Archiving Toolkit for replay and recording of web archives. Retrieved from https://github.com/webrecorder/pywb. Last accessed on 20/04/2018.
RESAW (Research Infrastructure for the Study of Archived Web Materials). (2012). About RESAW. Retrieved from http://resaw.eu/about/. Last accessed on 04/02/2018.
Reyes Ayala, B. (2013). Web archiving bibliography 2013. Texas: UNT Digital Library.
Roche, X. (2018). HTTrack Website Copier. Retrieved from http://www.httrack.com/. Last accessed on 20/04/2018.
Rosenthal, C. (2017, July). NetarchiveSuite. Retrieved from https://sbforge.org/display/NAS/NetarchiveSuite. Last accessed on 20/04/2018.
Ryan, M. (2017, November 16). Personal interview via GoToMeeting with Maria Ryan / Interviewers: Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot. [M4A file].
Schneider, S. M., & Foot, K. A. (2005). Web sphere analysis: An approach to studying online action. In C. Hine (Ed.), Virtual Methods - Issues in Social Research on the Internet (pp. 157–171). Oxford: Berg Publishers.
Schneider, S., & Foot, K. (2008). Archiving of internet content. In W. Donsbach (Ed.), The international encyclopedia of communication. Oxford: Blackwell. https://doi.org/10.1002/9781405186407.wbieca051.
Schroeder, R., & Brügger, N. (2017). Introduction: The web as history. In N. Brügger & R. Schroeder (Eds.), The web as history. Using web archives to understand the past and present (pp. 1–19). London: UCL Press.
Sierman, B., & Teszelszky, K. (2017). How can we improve our web collection? An evaluation of web archiving at the KB National Library of the Netherlands (2007-2017). Alexandria, 27, 94–107. https://doi.org/10.1177/0955749017725930.
Social Feed Manager. (2018). Social Feed Manager. Retrieved from https://gwu-libraries.github.io/sfm-ui/. Last accessed on 20/04/2018.
Tanésie, P., & Aubry, S. (2017, December 12). Le dépôt légal du web à la BnF : organisation, procédures et outils. Presentation given at the Bibliothèque nationale de France, Paris.
Tanésie, P., Aubry, S., & Wendland, B. (2017, December 12). Personal interview at the BnF with Pascal Tanésie, Sara Aubry & Bert Wendland / Interviewers: Sally Chambers, Rolande Depoortere, Friedel Geeraert, Alejandra Michel, and Eveline Vlassenroot. [mp3 file].
Teszelszky, K. (2017a, November 8). Personal interview via GoToMeeting with Kees Teszelszky / Interviewers: Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot. [M4A file].
Teszelszky, K. (2017b, November 8). Written answers given via Google Docs by the KB Nederland before the personal interview with Kees Teszelszky / Interviewers: Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot.
The National Archives. (n.d.). UK Government Web Archive. Retrieved from http://www.nationalarchives.gov.uk/webarchive/. Last accessed on 1/02/2018.
Twarc GitHub. (2018). DocNow/twarc: A command line tool (and Python library) for archiving Twitter JSON. Retrieved from https://github.com/DocNow/twarc. Last accessed on 20/04/2018.
UK Web Archive. (n.d.-a). About. Retrieved from https://www.webarchive.org.uk/ukwa/info/about. Last accessed on 30/10/2017.
UK Web Archive. (n.d.-b). Browse. Retrieved from https://www.webarchive.org.uk/ukwa/browse. Last accessed on 1/02/2018.
UK Web Archive. (n.d.-c). Frequently asked questions. Retrieved from https://www.webarchive.org.uk/ukwa/info/faq. Last accessed on 30/10/2017.
UK Web Archive. (n.d.-d). SHINE. Retrieved from https://www.webarchive.org.uk/shine. Last accessed on 05/02/2018.
Van de Sompel, H., Nelson, M. L., & Sanderson, R. (2013). RFC 7089: HTTP Framework for Time-Based Access to Resource States—Memento. Retrieved from http://tools.ietf.org/rfc/rfc7089.txt. Last accessed on 20/04/2018.
WALK (Web Archives for Longitudinal Knowledge). (n.d.). Datasets. Retrieved from http://webarchives.ca/datasets. Last accessed on 04/02/2018.
WARCAT GitHub. (2017). chfoo/warcat: Tool and library for handling Web ARChive (WARC) files. Retrieved from https://github.com/chfoo/warcat. Last accessed on 20/04/2018.
Warcio GitHub. (2017). webrecorder/warcio: Streaming WARC/ARC library for fast web archive IO. Retrieved from https://github.com/webrecorder/warcio. Last accessed on 20/04/2018.
Warctools GitHub. (2016). internetarchive/warctools: warctools. Retrieved from https://github.com/internetarchive/warctools. Last accessed on 20/04/2018.
Webber, J. (2017, November 16). Personal interview via GoToMeeting with Jason Webber / Interviewers: Sally Chambers, Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot. [M4A file].
Weber, M. S. (2017). The tumultuous history of news on the web. In N. Brügger & R. Schroeder (Eds.), The web as history. Using web archives to understand the past and the present (pp. 83–100). London: UCL Press.
Webrecorder. (n.d.). Collect & revisit the web. Retrieved from https://webrecorder.io/. Last accessed on 19/02/2019.
Webrecorder Player for Desktop GitHub. (2018). webrecorder/webrecorderplayer-electron: Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder). Retrieved from https://github.com/webrecorder/webrecorderplayer-electron. Last accessed on 20/04/2018.
Webster, P. (2017). Users, technologies, organisations: Towards a cultural history of world web archiving. In N. Brügger (Ed.), Web 25. Histories from the first 25 years of the world wide web (pp. 175–190). New York: Peter Lang.
Affiliations

Eveline Vlassenroot (imec-mict-UGent, Ghent, Belgium), Sally Chambers (Ghent Centre for Digital Humanities, UGent, Ghent, Belgium), Emmanuel Di Pretoro (URF-SID, Haute École Bruxelles-Brabant, Bruxelles, Belgium; edipretoro@he2b.be), Friedel Geeraert (Royal Library and State Archives of Belgium, Brussel, Belgium; Friedel.Geeraert@kbr.be), Gerald Haesendonck (Department of Electronics and Information Systems, Ghent University - imec - IDLab, Ghent, Belgium; Gerald.Haesendonck@UGent.be), Alejandra Michel (NADI/CRIDS, UNamur, Namur, Belgium; alejandra.michel@unamur.be), Peter Mechant (imec-mict-UGent, Ghent, Belgium; Peter.Mechant@UGent.be)

work_eegwbnm2wff7hb5di2ihomq3ki ----

The Importance of Pedagogy: Towards a Companion to Teaching Digital Humanities
Hirsch, Brett D. (University of Western Australia) and Timney, Meagan (University of Victoria)
The growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn the sorts of "transferable skills" and "applied computing" that digital humanities offers (Jessop 2005), and the desire of practitioners to consolidate and validate their research and methods. We propose a volume, Teaching Digital Humanities: Principles, Practices, and Politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. We plan to structure the volume according to the four critical questions educators should consider, as emphasized recently by Mary Breunig, namely:
- What knowledge is of most worth?
- By what means shall we determine what we teach?
- In what ways shall we teach it?
- Toward what purpose?
In addition to these questions, we are mindful of Henry A. Giroux's argument that "to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak" (45). Consequently, we will encourage submissions to the volume that address these wider concerns.

References

Breunig, Mary (2006). 'Radical Pedagogy as Praxis'. Radical Pedagogy. http://radicalpedagogy.icaap.org/content/issue8_1/breunig.html.
Giroux, Henry A. (1994). 'Rethinking the Boundaries of Educational Discourse: Modernism, Postmodernism, and Feminism'. Margins in the Classroom: Teaching Literature. Myrsiades, Kostas, Myrsiades, Linda S. (eds.). Minneapolis: University of Minnesota Press, pp. 1-51.
Schreibman, Susan, Siemens, Ray, Unsworth, John (eds.) (2004). A Companion to Digital Humanities. Malden: Blackwell.
Jessop, Martyn (2005). 'Teaching, Learning and Research in Final Year Humanities Computing Student Projects'. Literary and Linguistic Computing. 20.3 (2005): 295-311.
McCarty, Willard, Kirschenbaum, Matthew (2003). 'Institutional Models for Humanities Computing'. Literary and Linguistic Computing. 18.4 (2003): 465-89.
Unsworth et al. (2006). Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences. New York: American Council of Learned Societies.

work_efsupyn3jrhhzlgdtqytilaiqq ----
Navratil, Jiri; Ubik, Sven; Melnikov, Jiri (2015): Performing Arts Across the Continents: Our Way to Digital Humanities and Arts. In: Proceedings of the ENTRENOVA - ENTerprise REsearch InNOVAtion Conference, Kotor, Montenegro, 10-11 September 2015, IRENET - Society for Advancing Innovation and Research in Economy, Zagreb, Vol. 1, pp. 227-234. Available at: http://hdl.handle.net/10419/183653

Performing Arts Across the Continents: Our Way to Digital Humanities and Arts
Jiri Navratil, Sven Ubik, Jiri Melnikov
CESNET, Czech Republic

Abstract
One of the strategic projects initiated in Europe in the period 2011-2014 was DARIAH-EU, whose main goal was to advance research in the humanities and arts using digital technologies and to create a dedicated infrastructure across Europe. The Czech Republic is in the process of joining this infrastructure. Digital Humanities is a new scientific discipline that has appeared over the last decade at many universities around the world and has spread in many directions. We joined this activity in the field of performing arts. Our objective was to verify whether modern computer network and audio-visual technologies can enable the collaborative work of performing artists when they are distributed across large distances, and what the requirements and limitations are. We describe our experience from the events that we organized or participated in during the last 4 years in Europe, the US, Malaysia, Korea and Taiwan. The experiments showed that Global Performances, as a new form of performing arts, can be arranged for different types of artists. Global Performances bring very interesting new impressions for the artists and for the spectators. We demonstrated that Global Performances can be used for the presentation of performing arts at festivals, cultural exhibitions and fairs. We believe this could have a very positive economic effect. It is our IT contribution to Digital Humanities and Arts.
Keywords: HD video, 4K video, 3D technology, cyber performance, live surgery
JEL classification: C88

Acknowledgments: This work was supported by the CESNET Large Infrastructure project (LM201005) funded by the Ministry of Education, Youth and Sports of the Czech Republic and used multiple academic networks including Geant, TEIN, GLIF, Gloriad, KREONET, TWAREN and Internet2.

Introduction
In this paper we describe our way into the field of Digital Humanities (DH). DH is an emerging scientific discipline that seeks to integrate the principles of the humanities and science. It is based on the assumption that mankind has ever larger amounts of data, which allows us to use new ways of analysis and presentation. Due to the digital nature of the data, it can be easily processed using various computational algorithms, analysed more deeply and then used to acquire new knowledge about mankind, bringing new views on our history and our behavior. DH, just like other scientific disciplines, appeared as a result of evolution. Every university, faculty or department involved in the humanities has a profile which was shaped over the years of its existence. However, the development of information technologies (IT) and of the Internet itself, with millions of users, brought entirely new possibilities into many areas, and each year brings many other directions that will further differentiate. The EU reacted to this phenomenon several years ago by opening the project DARIAH-EU (Digital Research Infrastructure for the Arts and Humanities). As seen from the title, it explicitly included the arts in this field. The main task of this project was the establishment of an ERIC (European Research Infrastructure Consortium). This was achieved in August 2014, when fifteen EU members established this consortium. The tasks for the ERIC are defined quite broadly, which corresponds to the current reality. Although the Czech Republic is not a member of the ERIC, we are interested in participating in this infrastructure because some of the activities that we have carried out for years certainly belong to the area covered by the ERIC. Most results presented in this paper were achieved thanks to the activity of CESNET in global networking. CESNET researchers have developed several tools which allow transmitting video signals in HD formats and in the latest formats, such as 4K and 8K, used in the TV and film industry, over high-speed networks. In the following paragraphs we describe how we were able to use them for different performing art sessions and also for the presentation of a national heritage project.

Our work in the field of art

Remote concerts
One of our first works in the field of culture was the live transmission of a harp and guitar concert in 3D from the Czech Republic to the US. The goal was to create the illusion for remote visitors that they were sitting in the concert hall close to the musicians.

Figure 1 Musicians in the Studio (left) and Remote View via 3D Glasses (right)
Source: Author's illustration

We presented this concert in 2013 at the meeting of the APAN cultural working group. Its success opened up future collaboration with this group for us. The two pictures in Figure 1 show this performance: the musicians in the Prague studio (left) and a remote view of them via 3D glasses (right).
We continued this effort of remote music performances with the aim of demonstrating the capability of current networks and transmission technology to enable such types of art sessions for the public. Later, we coordinated a European project, E-music, supported by the open call of the pan-European GN3plus project, which brought us an opportunity to collaborate with more partners. In this project we focused on the study of the relationship between transmission delay and the personal feelings of musicians playing in a distributed environment.
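To give a sense of why transmission delay matters so much in such distributed sessions, the sketch below estimates one-way propagation delays over long fibre routes. The route lengths, the fixed equipment delay and the tolerance threshold mentioned afterwards are illustrative assumptions, not measured values from the project.

# A rough, illustrative estimate of one-way latency between performance
# sites; all numbers are assumptions for the sake of the example.
SPEED_IN_FIBER_KM_PER_S = 200_000  # light in fibre travels at ~2/3 of c

def one_way_delay_ms(route_km: float, equipment_ms: float = 5.0) -> float:
    # Propagation delay over the fibre route plus a fixed allowance for
    # cameras, codecs and other equipment in the chain.
    return route_km / SPEED_IN_FIBER_KM_PER_S * 1000 + equipment_ms

# Fibre routes are longer than great-circle distances; these are guesses.
for route, km in [("Prague-Barcelona", 2_500), ("Prague-Daejeon", 12_000)]:
    print(f"{route}: ~{one_way_delay_ms(km):.0f} ms one-way")

Ensemble playing is commonly reported to become difficult once one-way delays exceed roughly 25-30 ms, which helps explain why the intercontinental performances described below exchanged lightweight data such as motion skeletons rather than relying on tightly synchronised audio alone.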
Figure 2. Two views of the final stage of the Global Performance for APAN36. Source: Author's illustration

A slightly different GP was prepared for APAN38, held in Taiwan. The performance was called "Dancing in Space". We again collaborated with partners across continents. In this case, one big screen was used to compose the video of musicians and dancers together. The main video of the musical trio was transferred from Prague, and into it were composed the live video of a cellist who played in Miami (US) and of dancers who danced in Barcelona (Spain) and in Nantou (Taiwan). The final screen from this performance is shown in Figure 3.

Figure 3. Global Performance for APAN38. Source: Author's illustration

Work in the field of national heritage
Langweil model of old Prague
The Langweil model of Prague, a spatial representation of the city, is a unique work of art of its kind. The model, made of cardboard on a wooden structure, was created in the years 1826-1837 by Antonín Langweil, an assistant at the University Library in the Klementinum. The model contains over two thousand buildings of the historic centre of Prague in a life-like rendering, with all building façades and ornate details. Approximately half of the buildings were later demolished or rebuilt. Most of the buildings are 3-6 cm tall but contain various dominant features (towers, columns, etc.) that are around 20 cm tall. The physical model is kept in a kind of "greenhouse" with tinted glass and limited interior lighting.

Figure 4. Langweil model in the Prague Museum (left) and zoomed details (right). Source: Photographs are a reproduction of a collection item administered by the City of Prague Museum; the author of the item is Antonín Langweil

The digitization of the Langweil model was an incredibly difficult project from a technical point of view. The model could not be touched in any way, as there was a risk of irretrievable damage. Digitization therefore had to be carried out without contact, purely on optical principles, with a special camera. The photography yielded approximately 250,000 photographs at 16 Mpix resolution, which represents a considerable amount of data to process into a 3D model. The digitization was done by the professional company Kit Digital in collaboration with the Department of Computer Graphics and Interaction of the Czech Technical University. The digital model is owned by the City of Prague Museum and is used only in this museum, for local video presentations. We obtained permission to use one part of the model for our network experiments. We decided to use the model interactively, to allow people "to be inside" it. In the first step, the data were converted into a form usable in a CAVE. CAVE stands for CAVE Automatic Virtual Environment; it takes the form of a cube-like space in which images are displayed by a series of 3D projectors. Visitors who enter the CAVE find themselves in a new virtual reality, in this case on the streets of old Prague. As the next step, we extended this idea to allow users to walk through the model remotely. The idea of remote interaction was first introduced in the project called "C2C". The goal of this joint project of CESNET and the Institute of Intermedia (IIM) at the Czech Technical University was to send and share data from one CAVE into other immersive environments.
For this remote interaction, a remote position sensor was integrated into the system to allow a remote user to move around in the model, which is stored and rendered locally. The remote user, located anywhere in the world, sees the video from the CAVE (or at least its front wall) and uses a hand-held joystick to simulate walking. This principle is illustrated in Figure 5. The model was projected on the CAVE walls and on the remote 3D screen according to the position of the person in the CAVE, or according to the joystick movements of the remote person. To see the video in 3D, all attendees of such a session must wear special glasses. We called this demo "Virtual Walking in historical Prague", because visitors could move inside the model along a path of their own choosing. The project was demonstrated for the first time at a CineGrid Workshop in San Diego in 2012 and a second time at a joint Internet2 and APAN meeting in Honolulu in 2013.

Figure 5. Diagram of using the CAVE with remote control and 3D presentation. Source: Author's illustration

The network configuration is shown in Figure 6. It also indicates the time needed for the video transfer from Prague to Hawaii, as well as the backward channel through which we sent the position information to the CAVE.

Figure 6. Scheme of networking for the APAN demo. Source: Author's illustration

Discussion of necessary conditions
In this part we discuss several conditions that play an important role in our work. The first is networking, the second the devices for video transmission, and the third the presentation devices (large screens, multidimensional screens, etc.). Only the perfectly integrated work of all three elements can bring results that are well recognized by the international community. CESNET is part of the GEANT community (the EU academic network) and a member of GLIF (a global experimental network), and we collaborate with other leading networking partners such as Internet2 in the US and APAN (Asia-Pacific Advanced Network). This allows us to bring video to many places on the globe. The video transport tools we use in our projects are MVTP-4K and UltraGrid. MVTP-4K is a hardware device based on FPGA technology that allows fast video transmission with minimal internal latency. UltraGrid (UG) is a free software solution that can be used on PC or Mac platforms. The availability of these tools was the second main reason for our invitations to participate in global projects. Each tool has its own advantages. For the demonstrations described in this paper we used MVTP-4K. This device was originally designed for the transmission of 4K video, but experience showed that it can be used in many other applications thanks to its very low added latency. It can be used for remote access to scientific visualizations, for medical sessions connecting operating theatres with lecture halls and conference venues, or for eCulture events and collaborations where several parallel videos are needed. The limiting factor is that the same device is needed at all sites. The device is currently commercially available under the name 4KGateway. CESNET has several units available for experiments, and we provide them for important events on a non-profit basis.

Figure 7. MVTP-4K box enabling multiple video transmissions. Source: Author's illustration

Conclusion
In this paper we presented our work to date in a field that can be understood as the start of a national effort to join the European ERIC project.
In the future we plan to continue this work, to investigate more deeply the use of immersive visualizations for collaboration in the performing arts, to involve other kinds of artistic expression, such as fine arts and paintings, and to install more permanent infrastructures for use in university lectures.

References
1. 4KGateway (2015), "About", available at: http://www.4kgateway.com/ (accessed May 30th 2015).
2. APAN (2015), "About APAN", available at: http://apan.net (accessed May 30th 2015).
3. APAN-35 (2015), "An International Conference of Networking Experts", available at: https://meetings.internet2.edu/2013-01-jt/ (accessed May 30th 2015).
4. APAN-36 (2015), "Asia-Pacific Advanced Network 36th Meeting", available at: http://www.apan.net/meetings/Daejeon2013/0101.php (accessed May 30th 2015).
5. DARIAH-EU (2015), "About DARIAH-EU", available at: https://dariah.eu/about.html (accessed May 15th 2015).
6. DARIAH-ERIC (2015), "DARIAH-EU: Annual Report 2011", available at: https://dariah.eu/fileadmin/Documents/DARIAH-EU_Annual_report_2011.pdf (accessed May 15th 2015).
7. GLIF (2015), "Connecting research worldwide with lightpaths", available at: http://www.glif.is/publications/info/brochure.pdf (accessed May 15th 2015).
8. Halak, J., Ubik, S. (2009), "MTPP - Modular Traffic Processing Platform", Proceedings of the IEEE Symposium on Design and Diagnostics of Electronic Circuits and Systems, Liberec, pp. 170-173.
9. Halak, J., Krsek, M., Ubik, S., Zejdl, P., Nevrela, F. (2011), "Real-time long-distance transfer of uncompressed 4K video for remote collaboration", Elsevier Future Generation Computer Systems, Vol. 27, No. 7, pp. 886-892.
10. IIM (2015), "Projects", available at: http://www.iim.cz/?id=148&lang=1#node (accessed May 30th 2015).
11. Langweil (2015), "O projektu" [About the project], available at: http://www.langweil.cz/projekt.php (accessed May 15th 2015).
12. MVTP-4K (2015), "About us", available at: http://www.ces.net/project/qosip/hw/mvtp-4K-v2.pdf (accessed May 30th 2015).
13. Navratil, J., Sarek, M., Ubik, S., Halak, J., Zejdl, P., Peciva, P., Schraml, J. (2011), "Real-time stereoscopic streaming of robotic surgeries", 13th IEEE International Conference on e-Health Networking, Applications and Services (Healthcom), 13-15 June 2011.
14. Ubik, S., Navratil, J., Melnikov, J., Goo, B., Cuenca, D., Santana, I. (2014), "Collaborative visual environments for performing arts", The 11th International Conference on Cooperative Design, Visualization and Engineering (CDVE), Seattle, USA, September 14-17, 2014.
15. Ubik, S., Navratil, J., Zejdl, P., Halak, J. (2012), "Real-Time Stereoscopic Streaming of Medical Surgeries for Collaborative eLearning", CDVE 2012, Mallorca, Spain, Proceedings pp. 73-77.

About the authors
Jiri Navratil received his PhD in Computer Science from the Czech Technical University in Prague in 1984.
He worked for 30 years at the Computing and Information Centre of CTU in various positions linked with high-performance computing and communications. During several sabbatical leaves he worked in Switzerland, Japan and the USA in the field of networking. In 2006 he started working for CESNET, the Czech education and scientific network, as leader of a group supporting special research applications that use the high-speed Internet. In recent years he has participated in several multimedia performances organized within large international collaborations in different fields. The author can be contacted at jiri@cesnet.cz

Sven Ubik received his MSc and Dr. in Computer Science from the Czech Technical University. He is currently a senior researcher at CESNET and head of the research group Technologies for Network Applications. He has created a Network Visualization Lab in collaboration with the Czech Technical University. His research interests include novel applications for distance collaboration; digital representation of, and distance access to, cultural heritage; 3D models; hardware-accelerated video processing; and optical networks. The author can be contacted at ubik@cesnet.cz

Jiri Melnikov works at CESNET as administrator of the high-resolution multimedia laboratory and has experience in developing applications for low-latency, high-quality transmissions. He received his MSc in Computer Science from the Czech Technical University in 2012. His research interests also include software-defined networking and software development for high-resolution tiled display walls. He currently resides in Prague and can be contacted at melnikov@cesnet.cz

work_ej2okgjm65e5rirjm3rpfmetpe ----

index.comunicación | nº 8(2) 2018 | Pages 13-31. E-ISSN: 2174-1859 | ISSN: 2444-3239 | Legal deposit: M-19965-2015. Received 29_04_2018 | Accepted 10_05_2018

EL ROL DEL DOCENTE UNIVERSITARIO Y SU IMPLICACIÓN ANTE LAS HUMANIDADES DIGITALES [THE ROLE OF THE UNIVERSITY TEACHER AND THEIR INVOLVEMENT WITH THE DIGITAL HUMANITIES]

Daniel Rodrigo-Cano | Patricia de-Casas-Moreno | Ignacio Aguaded
daniel.rodrigo@alu.uhu.es | patricia@grupocomunicar.com | ignacio@aguaded.es
Universidad de Sevilla | Universidad Antonio de Nebrija | Universidad de Huelva
To cite this article: Rodrigo-Cano, D., de-Casas-Moreno, P. and Aguaded, I. (2018). El rol del docente universitario y su implicación ante las Humanidades digitales. index.comunicación, 8(2), 13-31.

Abstract. Technological convergence is driving rapid change in the communicative and, above all, the educational context. This research pursues two aims: first, to identify the skills university teachers need in the face of the digital humanities for social learning; and second, to recognize good university practices, identifying the most widely used tools as well as the motivations that lead to success with collaborative methodologies on the Web 2.0. The research follows a quantitative-qualitative methodology, collecting 537 student questionnaires and running focus groups with a total of 20 teachers from the Universities of Cadiz, Seville and Huelva. Among the most notable results, the use of new technologies in the classroom is taking on a visible role, from the point of view of both students and teachers, in teaching collaboratively and in developing a sound critical attitude towards the current context. Therefore, in times of digital humanities, teachers must seek to empower university students so that they acquire skills such as articulating critical discourses.

Keywords: Digital Humanities; teaching-learning; ICT; educommunication; collaborative work; Web 2.0.

1. Introduction
Information and Communication Technologies (ICT) are causing great changes in the knowledge society and, therefore, in the university context, where the teaching-learning process enables lifelong learning through tools connected to the Internet, to social networks and to the digital footprints generated by every student and every teacher. Internet connections at Spanish universities run over wifi technology, which is accessed by 87 per cent of students as well as by the teachers themselves, generating almost nine million connections per year. These students are also habitual users of each university's virtual learning platform (95 per cent), as were 93 per cent of teaching staff in 2016 (Llorens et al., 2016). It should be stressed that today's educommunicative context is defined by technological convergence and by the notion of ubiquity (Burbules, 2012), that is, being connected at any moment and through any mobile device (Mojarro, Rodrigo-Cano and Etchegaray-Centeno, 2015). This model of interaction has modified the forms of teaching and learning, closing gaps and revolutionizing the educational context (García et al., 2016).
Moreover, the digital footprint allows an in-depth analysis of the teaching-learning process through Learning Analytics, defined as the process of analysing very large volumes of data (Big Data), which makes it possible to establish learning-support systems (Scheffel et al., 2014; Papamitsiou and Economides, 2014; Freitas et al., 2015). The technological push is evident and exposes society to great new challenges, building our digital footprint and new identities within the communication framework. Given the incessant use of new technologies, a critical awareness must be established within the cultural dimension in order to improve people's digital skills and competences (Gértrudix, Borges and García, 2017). In Spanish households, mobile telephone service is present in 98.4 per cent of homes (Urueña, Prieto, Seco, Ballestero, Castro and Cadenas, 2018) and has become a habitual technology for consuming the Internet. Seven out of ten children aged 10 to 15 have a mobile phone, and more than 90 per cent have used the Internet in the last three months (Urueña, Seco, Castro and Cadenas, 2017). Smartphones put at university students' fingertips the main tools chosen for learning, such as YouTube, the Google search engine, Google Drive or Twitter (Hart, 2017). These facts, together with the immediacy of access to information at any time and in any place, are allowing the personalized and localized teaching-learning process to continue beyond the classroom (Economides, 2009; Burbules, 2012; Vázquez-Cano and Sevillano-García, 2017). In short, all these changes bear directly on the teaching role and imply that educators move away from being sole transmitters of knowledge to become guides, posing suitable problems and situations for students to solve by applying skills such as searching for complementary information, communicating ideas to their peers or to the teacher, selecting the best solution to a problem or assessing the decision taken (Del-Moral and Villalustre, 2010; Espinosa, 2014; Figueras-Maz, Ferrés and Mateu, 2017).

2. Digital Humanities: experiences and projects
In times of the Web 2.0, social networks, the Internet, Big Data, post-truth and algorithms, among other related terms, it is necessary to establish philosophical and, above all, ethical thinking through the digital humanities. These are understood as a set of humanistic disciplines which, together with the use of technologies, aim to create new disruptive paradigms so as to bring humanist thought critically into the digital and technological construction of our society (Rodríguez-Ortega, 2014). In this line of study,
En esta línea de estudio, 16 | index.comunicación | nº 8 (2) | Número especial Educación mediática y factor relacional además de la ética, es necesario que las humanidades digitales incluyan algunos conceptos como la interdisciplinariedad, la transdisciplinariedad, la multidisci- plinariedad, las fuentes abiertas (open source), los recursos abiertos, las licencias abiertas, la promoción de licencias como Creative Commons, la redefinición de las comunidades de investigación y sus límites, el reequilibrio en las relaciones entre docentes y alumnos, y el compromiso e impacto social, muy en la línea de la ética Hacker propuesta por Himanen (2002). Según Piscitelli (2015), se trata de una deriva cultural en la que conviven tecnología y cultura, provocando la necesidad de estudiar e investigar sobre la cultura digital con el fin de crear vínculos a través de la reproducción cultural, que se realiza a través del uso de las TIC, Moodle o los MOOC (Romero-Frías, 2014). Por otro lado, es importante valorar la evolución de las TIC, ya que permi- ten la participación social y democrática, así como el empoderamiento del ciudadano (Biesta, 2013). A raíz de esta afirmación, surgen las Tecnologías para el Empoderamiento y la Participación (TEP), posibilitando el proceso a las múltiples conversaciones con los iguales en el aula con el fin de ser capa- ces de presionar a políticas, marcas y el establishment hacia una democracia 2.0. Al fin y al cabo, el objetivo no es otro que el de lograr implicar a los ciuda- danos. En palabras de Gozálvez y Contreras-Pulido (2014: 130), se trata de «empoderar a la ciudadanía, lo que significa reforzar la libertad, la autonomía crítica y la participación de los ciudadanos en cuestiones políticas, sociales, económicas, ecológicas e interculturales a partir del buen uso de los medios y la tecnología comunicativa». En otro orden de cosas, hay que destacar el entorno de prácticas culturales y digitales, que se experimentan en los ámbitos tecnológicos como las TRIC (Tecnologías+Relación+Información+Comunicación) (Gabelas, Marta-Lazo, y Aranda, 2012; Gabelas, Marta-Lazo y González-Aldea, 2015; Marta-Lazo, Hergueta-Covacho y Gabelas, 2016; Garrido-Lora, Busquet y Munté-Ramos, 2016). Este nuevo concepto y enfoque educomunicativo exhibe una realidad en la que sólo el 95 por ciento de la población mundial posee bajas habilida- des tecnológicas (Kankaraš et al., 2016), es decir, que no pueden utilizar de forma eficaz la tecnología de la que disponen por lo que se debe promover un nuevo modelo de aprendizaje, donde prevalezca la interacción, la creatividad y el pensamiento crítico. Por lo tanto, un correcto modelo educomunicativo debe atender a las siguientes premisas (Marcelo, Yot, Murillo y Mayor, 2016; Marta-Lazo y Gabelas, 2016): → Actividades asimilativas, que busquen promover la comprensión del alum- nado acerca de determinados conceptos o ideas. El profesor debe presen- index.comunicaci�n El rol del docente universitario y su... | Rodrigo Cano, De Casas Moreno, Aguaded | 17 tar esta tarea, basándose en recursos como las presentaciones multimedia, vídeos, documentos de textos digitales, audios, fotografías, etc. → Actividades de gestión de la información, que requieren que el alumnado tenga que buscar, contrastar, sintetizar o realizar un análisis de una determi- nada información, utilizando para ello navegadores web, programas infor- máticos específicos, etc. 
→ Actividades comunicativas, en las que se solicita a los alumnos tareas de presentación de información, discusiones, debates, puestas en común, etc., usando herramientas de comunicación online síncronas o asíncronas. → Actividades productivas, con las que se le pide al alumnado que diseñe, elabore o cree algún producto manejando tecnologías digitales (paquete ofimático, otro software específico, etc.). → Actividades experienciales, intentando ubicar a los alumnos en un ambien- te cercano al ejercicio profesional futuro, bien de forma real o simulada. → Actividades evaluativas, su principal objetivo es la evaluación del alumna- do por medio de tecnologías digitales (e-rúbricas, portafolios, etc.). 3. Características del docente 2.0 El papel del docente en la actualidad sufre un desarrollo constante, ya que estos deben exponerse a una formación continua por la inclusión de las nuevas tecno- logías. En este sentido, para identificar las habilidades del educador 2.0, hay que analizar una serie de características importantes. Por un lado, debe poseer un dominio de la materia o disciplina a impartir. Sin duda, no se puede instruir a los alumnos sin tener una serie de conocimientos y competencias básicas. En este sentido, la comunicación será́ la base de la enseñanza. Además, conocer al grupo de estudiantes es un hecho importante, si se pretende sintonizar con ellos, así como conocer y experimentar técnicas de dinámica con diferente finalidad (presentación, fomentar la interacción, debatir, colaborar, simular…) (Gutiérrez- Porlán, Román-García y Sánchez-Vera, 2018). Según Ibermón (2009), es necesario saber elaborar un guion de la sesión distribuyendo el tiempo y atendiendo a los objetivos que se persigan, el tipo de actividades que se propongan, la curva de fatiga del alumnado, etc. (reflexión en la acción). Asimismo, hay que tener preparado un sistema para evaluar tanto al alumnado como su propia intervención (reflexión sobre la 18 | index.comunicación | nº 8 (2) | Número especial Educación mediática y factor relacional acción). Otros de los atributos que debe tener un buen docente es el de producir conocimientos científicos, favorecer estrategias de aprendizaje y desarrollo de competencias, generar una actitud de respeto, conseguir desa- rrollar la capacidad de escucha, establecer una buena comunicación, prepa- rar las clases de manera adecuada, ser una persona justa y paciente con inte- rés por los alumnos y con capacidad de planificar el proceso de enseñanza y el de aprendizaje, ser capaz de seleccionar y presentar los contenidos disciplinares, ofrecer informaciones y explicaciones comprensibles, poseer un buen grado de alfabetización tecnológica y el manejo didáctico de las TIC, saber gestionar las metodologías de trabajo didáctico y las tareas de aprendizaje, ser capaz de relacionarse constructivamente con los alumnos, reflexionar e investigar sobre la enseñanza, e implicarse institucionalmente (Fernández-Borrero y González-Losada, 2012). 
En muchas de estas carac- terísticas se pone de manifiesto la presencia de las TIC, por lo que, para el docente 2.0 serán necesarias competencias digitales, entendiendo estas como experto en contenidos pedagógicos emergentes, práctico-reflexivo aumentado, experto en entornos de aprendizaje enriquecidos, sensible al uso de la tecnología desde la perspectiva del compromiso social, generador y gestor de prácticas pedagógicas emergentes y capaz de usar las TIC para expandir su relación con la familia y el entorno del estudiante (Castañeda, Esteve y Adell, 2018). Por otro lado, el nuevo educador 2.0 debe ser válido para diseñar la guía docente de acuerdo con las necesidades, el contexto y el perfil profesional, todo ello, en coordinación con otros profesionales. También debe desarrollar el proceso de enseñanza-aprendizaje, propiciando oportunidades de formación tanto individual como grupal; tutorizar el proceso de instrucción del alumno, contribuyendo con acciones que le permitan una mayor autonomía; evaluar el proceso de enseñanza-aprendizaje; aportar activamente a la mejora de la docencia y participar activamente en la dinámica académico-organizativa de la institución (Mas-Torrelló, 2012). Según Espinosa (2014), un buen docente, hoy en día, debe llevar a cabo tareas de índole interpersonal, metodológica, comunicativa, de planificación, de gestión de la docencia, de trabajo en equipo y de innovación. Los profeso- res asignan la máxima importancia a la competencia de comunicación, segui- da de la interpersonal y metodológica, quedando relegado a un segundo plano la planificación y gestión, la innovación y, por último, el trabajo en equipo. Por su parte, Gargallo y otros (2010) señalan que su papel debe ser el de una persona que consiga establecer relaciones entre los conceptos; fomentar el aprendizaje significativo; enseñar a aprender a aprender; motivador; conectar index.comunicaci�n El rol del docente universitario y su... | Rodrigo Cano, De Casas Moreno, Aguaded | 19 la teoría con la práctica; fomentar la participación; y utilizar metodologías variadas y complementarias, en función de las necesidades. En definitiva, y con esta visión, se puede concluir que un buen docente es aquel que tiene vocación por la enseñanza con un buen dominio de la mate- ria que imparte, adaptándola al nivel y la aplicación demandada. Además, se tratará de una persona comprensible, favorecedora del diálogo y que se actua- liza constantemente, no sólo en los conocimientos propios de la materia, sino en metodologías, tecnologías y actitudes personales adecuadas. 4. Metodología La presente investigación trata de analizar una doble vertiente: por un lado, iden- tificar las habilidades del docente universitario ante las humanidades digitales para el aprendizaje social y, por otro lado, reconocer las buenas prácticas univer- sitarias, que permitan identificar las herramientas más utilizadas, así como las motivaciones que llevan al éxito a través del uso de metodologías colaborativas en la Web 2.0. Para ello, este estudio se ha abordado desde un enfoque mixto (cuanti-cualitativo) con el fin de comprender las experiencias, percepciones y expectativas de docentes universitarios respecto a la integración de las Tecnolo- gías de la Información y Comunicación (TIC) dentro de las aulas, lo que permite identificar las humanidades digitales que presentan estos docentes. Para realizar el estudio se ha partido de una muestra de 82.394 alumnos matriculados en las universidades de Cádiz, Huelva y Sevilla. 
A raíz de este cómputo, se determina una muestra aleatoria de un total de 537 estudiantes durante el curso 2015-2016. En este sentido, para conseguir esta muestra se ha atendido a todos aquellos alumnos contactados a través de los docentes de las universidades seleccionadas por el uso del mailing. Estas respuestas parten de un margen de error del cinco por ciento y un nivel de confianza del 95 por ciento. Además, el instrumento generó un registro de las respuestas obtenidas, que fueron exportadas al programa estadístico SPSS 22.0, desde el que se analizaron e interpretaron los resultados. Esta encuesta se elaboró a través de la herramienta de Formularios de Google, que permite el envío masivo y la recepción de datos online de forma agrupada y visual, basada en el cues- tionario Social Software Survey Used With Unpaced Undergrad (Anderson, Poellhuber y McKerlich, 2010) en los ítems de la dimensión preferencia de aprendizaje. A este cuestionario se le incluyeron cuestiones relacionadas con la identificación, el uso y la frecuencia de las TIC, experiencia y las frecuen- cias de uso con herramientas de la Web 2.0. El cuestionario final contenía 48 ítems, aunque en este artículo analizamos las elaboradas a partir de Anderson, Poellhuber y McKerlich (2010), concre- 20 | index.comunicación | nº 8 (2) | Número especial Educación mediática y factor relacional tamente las cuestiones «¿Las redes sociales favorecen la colaboración en el trabajo por parte del profesorado?» y «¿Las redes sociales favorecen el contac- to y la colaboración entre el alumnado y el profesorado?». La herramienta se ha conformado a través de una escala tipo Likert con cuatro respuestas posibles de ‘Totalmente en desacuerdo’ (0) a ‘Muy de acuerdo’ (3) evitando tendencias centrales. En el análisis de la fiabilidad de estas preguntas se utili- zó el coeficiente de alfa de Cronbach, que obtuvo 0,89 en ambas preguntas, consiguiendo un alto grado de fiabilidad en el instrumento. Para el análisis cualitativo, en esta investigación, se utilizó la técnica de Focus Group o grupo de discusión. Se trata de una técnica adecuada para obtener información valiosa e ilustradora, que permite la exploración y el descubrimiento. Además, se convierte en un encuentro orientado a estimular los procesos de comunicación y, por tanto, deben ser flexibles y dinámicos (Taylor y Bogdan, 1987). Para ello, el papel del dinamizador es fundamental y debe estar predispuesto a aprender (Morgan, 1997). Los grupos focales, dada su densidad para la construcción multicriterial y por sus potenciales partici- pativos y de autoconocimiento grupal, posibilitaron convertir colectivos de discusión en dispositivos dinámicos de autorreflexión (Espina, 2007). En este caso, las líneas de debate que se desarrollaron en el focus group fueron: identificar las metodologías colaborativas en la Web 2.0 que faciliten el aprendizaje universitario; identificar las habilidades del docente 2.0 para el aprendizaje social; metodologías colaborativas en la Web 2.0 de éxito en el aula universitaria. Por lo tanto, se realizaron tres focus group, uno por cada universidad de estudio, entre los meses de febrero y marzo de 2016, dentro del curso universitario 2015-2016, con una duración de 90 minutos aproxi- madamente en todos ellos. En la Universidad de Sevilla y en la Universidad de Cádiz participaron siete docentes respectivamente y en la Universidad de Huelva participaron seis docentes. 5. Resultados 5.1. 
Resultados de la encuesta Los datos de la encuesta realizada para esta investigación permiten realizar una descripción socio-demográfica de los estudiantes universitarios, consi- guiendo los siguientes valores: las respuestas recogidas han sido en mayor número de mujeres (65 por ciento), con una moda de edad de 20 años y una media de 23,06 años (desviación típica de 6,699). La residencia habitual de los encuestados está relacionada con una gran ciudad (60 por ciento) y, preferen- temente, se encuentran contextualizados en el hogar familiar (58 por ciento) frente a las que comparten piso (34 por ciento). Asimismo, hay que destacar index.comunicaci�n El rol del docente universitario y su... | Rodrigo Cano, De Casas Moreno, Aguaded | 21 según los datos extraídos que una gran mayoría de alumnado no dispone de beca (62 por ciento). Por otro lado, en cuanto a la exposición de los resultados de la encuesta, hay que destacar que las redes sociales se han convertido en herramientas que permiten las relaciones entre docentes y alumnos. En este sentido, la estadísti- ca descriptiva llevada a cabo arroja los siguientes resultados según las dimen- siones delimitadas para el estudio. En relación al ítem «Las redes sociales favorecen la colaboración en el trabajo por parte del profesorado» se ha conse- guido una media de 1,83 (con valores de 0 a 3) y una desviación típica (DT) de 0,934. Por su parte, con datos similares, el ítem «Las redes sociales favorecen el contacto y la colaboración entre el alumnado y el profesorado» ha obtenido un media de 1,78 y una DT de 0,906. Con los resultados obtenidos, se puede destacar que el 70,4 por ciento de los alumnos indican estar de acuerdo o muy de acuerdo con el uso de las redes sociales por parte del profesorado como metodología de colaboración (tabla 1). Tabla 1. Porcentaje de respuesta al ítem «Las redes sociales favorecen la colaboración en el trabajo por parte del profesorado». Frecuencia Porcentaje Porcentaje válido Totalmente en desacuerdo 57 10,6 11,9 En desacuerdo 84 15,6 17,6 De acuerdo 219 40,8 45,9 Muy de acuerdo 117 21,8 24,5 Total 477 88,8 100,0 Fuente: elaboración propia. En este ítem de «Las redes sociales favorecen la colaboración en el trabajo por parte del profesorado» se pueden destacar, en función de la universidad a la que pertenecen y el sexo, los siguientes resultados. En cuanto a la Universidad de Cádiz, 126 indican que están de acuerdo con esta afirmación junto con los 81 que indican estar muy de acuerdo. En total, 207 alumnos de la Universidad de Cádiz muestran estar de acuerdo o muy de acuerdo con este ítem, mientras que sólo 87 indican estar en desacuerdo o totalmente en desacuerdo (tabla 2). Sin embargo, son los alumnos de la Universidad de Huelva los que se muestran más de acuerdo con este ítem, atendiendo que por encima del 83 por ciento indican estar de acuerdo (27) o muy de acuerdo (9) frente a los 9 participantes que están en desacuerdo o totalmente en desacuerdo. Por su parte, la Universi- dad de Sevilla tan sólo el 65 por ciento (78 respuestas) se muestran de acuerdo o muy de acuerdo con el ítem «Las redes sociales favorecen la colaboración 22 | index.comunicación | nº 8 (2) | Número especial Educación mediática y factor relacional en el trabajo por parte del profesorado». Es en esta Universidad de Sevilla en la que se producen más discrepancias, superando las respuestas en el apartado ‘Totalmente en desacuerdo’ a las respuestas de ‘Muy de acuerdo’. Tabla 2. 
Tasa de respuestas por Universidad y sexo al ítem «Las redes sociales favorecen la colaboración en el trabajo por parte del profesorado». Universidad Total Total desacuerdo Desacuerdo De acuerdo Muy de acuerdo UCA Sexo Hombre 12 21 33 27 93 Mujer 15 39 93 54 201 Total 27 60 126 81 294 UHU Sexo Hombre 0 3 9 6 18 Mujer 3 6 27 9 45 Total 3 9 36 15 63 US Sexo Hombre 15 9 24 12 60 Mujer 12 6 33 9 60 Total 27 15 57 21 120 Fuente: elaboración propia. Si atendemos al sexo de las personas participantes en la encuesta (tabla 2), se puede destacar que en la Universidad de Cádiz están de acuerdo o muy de acuerdo 147 mujeres sobre un total de 201, representándose en el 73 por ciento del total, frente a las 60 respuestas de los hombres (64,5 por ciento). En la Universidad de Huelva prácticamente no hay diferencias entre el sexo y en ambos casos están de acuerdo o muy de acuerdo tanto los alumnos (83,3 por ciento) como las alumnas (80 por ciento). Es en la Universidad de Sevilla en la que las diferencias entre los hombres y las mujeres se hace más evidente, así mientras 36 hombres, de 60 respuestas totales, han respondido estar de acuer- do o muy de acuerdo con este ítem son 42 mujeres, de 60 respuestas totales, las que han mostrado estar de acuerdo o muy de acuerdo con esta afirmación, lo que indica que hay 10 puntos porcentuales entre las mujeres (70 por ciento) y los hombres (60 por ciento). De la misma forma, respecto al ítem «Las redes sociales favorecen el contacto y la colaboración entre el alumnado y el profesorado» (tabla 3), los alumnos han respondido en 324 ocasiones que están de acuerdo (225) o muy de acuerdo (99), lo que representa un 69 por ciento de las respuestas. index.comunicaci�n El rol del docente universitario y su... | Rodrigo Cano, De Casas Moreno, Aguaded | 23 Tabla 3. Porcentaje de respuesta al ítem «Las redes sociales favorecen el contacto y la colaboración entre el alumnado y el profesorado». Frecuencia Porcentaje Porcentaje válido Totalmente en desacuerdo 54 10,1 11,5 En desacuerdo 93 17,3 19,7 De acuerdo 225 41,9 47,8 Muy de acuerdo 99 18,4 21,0 Total 471 87,7 100,0 Fuente: elaboración propia. Respecto al ítem «Las redes sociales favorecen el contacto y la colabora- ción entre el alumnado y el profesorado» (tabla 4), atendiendo a las Univer- sidades de referencia se puede destacar los resultados de la Universidad de Huelva, en la que 48 de 63 respuestas están de acuerdo (39) o muy de acuerdo (9) con el ítem, lo que representa el 76 por ciento del total de las respuestas respondidas desde esta Universidad. En el caso de la Universidad de Cádiz con un total de 288 respuestas, 207 de las mismas son entre de acuerdo (138) y muy de acuerdo (69), lo que representa un 72 por ciento. Tan sólo en la Univer- sidad de Sevilla los alumnos que han respondido a la encuesta muestran más dudas sobre si las redes sociales favorecen el contacto y la colaboración entre el alumnado y el profesorado; sólo 69 de las 120 respuestas han indicado estar de acuerdo o muy de acuerdo, lo que constituye un 57,5 por ciento. Tabla 4. Tasa de respuestas por Universidad y sexo al ítem «Las redes sociales favorecen el contacto y la colaboración entre el alumnado y el profesorado». Universidad Total Total desacuerdo Desacuerdo De acuerdo Muy de acuerdo UCA Sexo Hombre 18 15 30 27 90 Mujer 9 39 108 42 198 Total 27 54 138 69 288 UHU Sexo Hombre 0 3 12 3 18 Mujer 3 9 27 6 45 Total 3 12 39 9 63 US Sexo Hombre 12 9 30 6 57 Mujer 12 18 18 15 63 Total 24 27 48 21 120 Fuente: elaboración propia. 
24 | index.comunicación | nº 8 (2) | Número especial Educación mediática y factor relacional Existen diferencias en las respuestas que tanto alumnos como alumnas han respondido a la encuesta; así, mientras en la Universidad de Cádiz 150 (76 por ciento) mujeres indicaron estar de acuerdo o muy de acuerdo con este ítem, sólo 57 hombres indicaron lo mismo (63 por ciento). En las otras dos universidades analizadas se observa que la tendencia es inversa, tanto en la Universidad de Huelva como en la de Sevilla son los hombres los que indi- can estar más de acuerdo con esta afirmación que las mujeres. Mientras en la Universidad de Huelva 15 de los 18 alumnos que respondieron a la encuesta indicaron estar de acuerdo o muy de acuerdo (un 83 por ciento), las 33 de 45 mujeres también indicaron estar de acuerdo o muy de acuerdo (73 por cien- to). Y en la Universidad de Sevilla 33 de las 63 mujeres (52 por ciento) que respondieron a la encuesta muestran estar de acuerdo o muy de acuerdo con esta afirmación, mientras que 36 de 57 hombres (63 por ciento) indican estar de acuerdo o muy de acuerdo con este ítem (tabla 4). En suma, para los alumnos de las Universidades de este estudio las redes sociales son herramientas que favorecen el contacto y la colaboración entre el alumnado. Además, estos instrumentos se han alzado como las nuevas meto- dologías de enseñanza-aprendizaje en el currículo educativo, cuyas nuevas formas favorecen el desarrollo crítico. Sin duda, el nuevo ecosistema comu- nicativo ofrece al nuevo estudiante un escenario interdisciplinar donde el docente ayuda y forma a través de las competencias digitales oportunas para favorecer las humanidades digitales de todos los componentes. 5.2. Opiniones y actitudes de la muestra a través del ‘focus group’ Los cambios en las metodologías, la integración de las TIC y las incertidumbres de un mundo cambiante exigen preguntarse si el docente requiere nuevas habilidades o identificar cuáles son las que se asociarían con el docente universitario para el siglo xxi. De este modo, los participantes han identificado las siguientes habilida- des: el docente debe asumir la horizontalidad pero no la simetría; debe aportar otro rol, otro conocimiento y definir el contexto; y, por último, el profesor no es el único que tiene el conocimiento y la verdad absoluta. Sin embargo, también han apunta- do a otras ideas como: Red-Arquía (concepto contrario a la jerarquía) o Apren-Red (como juego de palabras para denotar el aprendizaje en red). Frente a lo expuesto, es importante señalar que se requiere de una serie de características para la nueva docencia: «otras formas de dar la clase; se trata de retar y desafiar a los alumnos; aprender a aprender; ser autodidacta». Los docentes tienen claro que se enfrentan a nuevas situaciones, indicando que: «En mi opinión las RRSS han revolucionado las clases». index.comunicaci�n El rol del docente universitario y su... | Rodrigo Cano, De Casas Moreno, Aguaded | 25 Coincidiendo con lo que han indicado los docentes en las entrevistas, las TIC han revolucionado la educación, el educador se ha convertido en auto- didacta en muchas ocasiones, buscando aplicaciones y herramientas para la interactuación y la motivación, reclamando que: Pedagogía, didáctica, comunicación y herramientas TIC: «docente clave en el proceso para el uso adecuado de las TIC» y «el docente tiene que tener herramientas y estrategias para utilizarlas adecuadamente como cualquier otra herramienta». 
Ante los cambios inminentes, que identifican una universidad distinta y más abierta, el docente tiene que adaptarse a los nuevos tiempos y, por ello, es necesa- rio ser investigador, que no escritor de artículos. Asimismo, es importante la inter- colaboración entre docentes para lograr metodologías colaborativas. Por tanto, como ellos mismos han indicado, los docentes se han de adaptar al cambio. Ante estos cambios los informantes ya indican que: «La dificultad es que venimos de un mundo, con una estructura jerárquica, organizada, disciplinada, de arriba-abajo y nos encontramos en un mundo que no sólo lo combate si no que se opone radicalmente». Reconocer las buenas prácticas universitarias en el uso de herramientas para las metodologías colaborativas en la Web 2.0 es uno de los objetivos de esa investigación. Ante este objetivo los informantes de los focus group indicaron aspectos relevantes para el proceso enseñanza-aprendizaje en la universidad. La forma de trabajo influye en las herramientas a utilizar, por eso requieren nuevas formas de enseñanza: «trabajo por proyectos»; «trabajo por problemas»; «trabajo en colaboración» o trabajar en temas de actuali- dad o significativos para los alumnos. Otras soluciones propuestas están rela- cionadas con trabajar a través del humor, como viñetas o la muy utilizada «clase invertida» (flipped classroom). Asimismo, los docentes reconocen estas buenas prácticas con las TIC porque les reportan beneficios como: «El uso habitual de la plataforma mejora el proceso y facilita el mismo, tanto alumnos como docentes, pero requiere un proceso formativo»; «la accesibilidad, la capacidad de almacenamiento y la inmediatez son las grandes ventajas. En una palabra, agilidad»; «capacidad de difusión y de compartir conocimiento»; «las RRSS permiten la conexión directa y la interacción entre los alumnos y el docente». 26 | index.comunicación | nº 8 (2) | Número especial Educación mediática y factor relacional Sin embargo, indican algunos inconvenientes como los problemas psico- lógicos (miedo) sobre el uso y puesta en marcha de las TIC. Nos cuesta reinventarnos; no conocemos las estructuras de los proce- sos de la comunicación y me parece necesaria una alfabetización digital. En consecuencia, advierten que Internet no incide directamente en el cambio metodológico y que hay que repensar el uso de metodologías que busquen vincular la participación. Con todo esto, parece evidente que es cues- tión personal del docente establecer buenas prácticas. De este modo, algunas frases reconocen una necesidad de romper algunas normas: «Interactividad entre el docente y los alumnos»; «Transgredo el carte- lito de prohibido el teléfono móvil»; «Todo el mundo al mismo nivel y utilizando las redes»; «Deberíamos poner el objetivo en otro sitio, que sería aprender a aprender». En definitiva, de la misma forma que los docentes en las entrevistas habían identificado algunas herramientas de la Web 2.0 para el desarrollo de buenas prácticas, en los focus group identificaron Facebook, Blog, YouTube y, como novedad, han indicado la tecnología móvil Whatsapp; y, por supuesto, la plataforma virtual. Esta selección está determinada por el uso y aplicación de las nuevas tecnologías como método de enseñanza-aprendizaje a través del trabajo cola- borativo entre el alumno y el docente. En esta ocasión se puede afirmar que, tanto las respuestas de los estudiantes, como las de profesores, se acercan y convergen hacia el uso de las mismas tecnologías. 6. 
Discusión y conclusiones Las redes sociales se descubren como herramientas que, en opinión de los alum- nos de las universidades de este estudio, favorecen la colaboración y el contacto entre el alumnado y el profesorado. Este tipo de tecnologías deberían posibilitar el desarrollo de tutorías online como herramientas de comunicación, dado que es una de las acciones más habituales entre un grupo importante de docentes que utiliza las tecnologías en el aula universitaria como demuestra el estudio de Marcelo, Yot, y Mayor-Ruiz (2015). Aún sin estrategias formales, se encuentran docentes que utilizan de forma habitual tecnologías como el correo electrónico, presentaciones multimedia o la plataforma virtual (Cerezo, Sánchez-Santillán, Puerto, y Núñez, 2016) de forma significativa, si entendemos el trabajo del index.comunicaci�n El rol del docente universitario y su... | Rodrigo Cano, De Casas Moreno, Aguaded | 27 docente, como orientador y gestor de la información de relevancia (Sancho-Gil y Hernández-Hernández, 2018). La docencia universitaria se ha caracterizado por dos extremos. Por un lado, la transmisión del conocimiento como parte del docente hacia los alum- nos en un mensaje unidireccional que genera un aprendizaje pasivo. En este sentido, el docente no tiene en cuenta las características del alumnado, ni el contexto. En el otro extremo, podríamos situar las estrategias colaborativas en las que se introducen, tanto el trabajo en pequeños grupos como el debate con todo el grupo clase. Además, se pueden encontrar zonas intermedias como en la clase expositiva, que permita la interacción y que pretenda la implicación del grupo clase. Este modelo de clase requiere del docente un alto componente comunicativo (Ibermón, 2009), incluyendo la investigación y uso de las TIC para el desarrollo de actividades de forma limitada y para acciones concretas (Flavin, 2012). Además, las TIC deben estar al servicio de metodologías que ya se estaban implementando y no para transformarlas (Ng’ambi, 2013). En el momento actual, el docente universitario ante las humanidades digi- tales debe decidir el rol de las tecnologías en el aula. Para ello, el profesor universitario requiere, entre otras capacidades: tendencia hacia el trabajo cola- borativo y en equipos docentes; y la presencia generalizada de las TIC en la educación (Prendes, Martínez y Gutiérrez, 2018). En todo caso, para cual- quiera de estas formas de enseñanza descritas es necesario atender al contexto educativo y al de los alumnos. A lo largo del siglo xxi se han incorporado a las aulas universitarias la denominada generación ‘Google’, ‘Net’ o ‘Eins- tein’ (Aguaded y Cabero, 2014: 77) caracteriza por: «carecer de conciencia sobre sus necesidades de información por lo que no saben satisfacerlas autó- nomamente, acceden a Internet y dominan su mecánica, pero no saben usarla de manera significativa, dedican poco tiempo a evaluar críticamente el mate- rial en línea, no saben identificar lo relevante y fiable, pero tampoco reciben instrucción en la escuela al respecto, y suelen leer como promedio sólo entre el 20 por ciento y el 28 por ciento del total del contenido de una web» (Igle- sias-Onofrio y Rodrigo-Cano, 2013: 715). 
Sin embargo, en las aulas no están ellos solos, sino que comparten las aulas y los docentes, con los inmigrantes digitales y visitantes de la Red, junto con los aprendices del nuevo milenio, la instant message generation, Net Generation, nativos digitales, alfabetos digitales, alfabetos tecnológicos, estudiantes residentes, prosumidores y los prosumidores mediáticos (Sevillano, Quicios y González, 2016). En suma, es importante establecer que en momentos de humanidades digitales el docente debe procurar empoderar a los alumnos universitarios con el fin de que estos adquieran capacidades como introducir discursos 28 | index.comunicación | nº 8 (2) | Número especial Educación mediática y factor relacional críticos, que cuestionen el funcionamiento del sistema; que actúen bajo principios de horizontalidad, intercambio de mensajes de igual a igual y en ausencia de jerarquización; en definitiva, que sean capaces de comunicar desde una posición de libertad, en lo que Aparici y García-Marín (2018) han descrito como emirecs. 7. Bibliografía AguAded, I. y CAbero, J. (2014). Avances y retos en la promoción de la innova- ción didáctica con las tecnologías emergentes e interactivas. Educar, especial 30 aniversario, 67-83. Doi: http://dx.doi.org/10.5565/rev/educar.691 Anderson, T.; Poellhuber, B. y MCKerliCh, R. (2010). Social Software survey used with unpaced undergrad. Recuperado de: http://goo.gl/TGSVVI APAriCi, R. y gArCíA-MArín, D. (2018). Prosumers and emirecs: Analysis of two confronted theories. [Prosumidores y emirecs: Análisis de dos teorías enfren- tadas]. Comunicar, 55, 71-79. Doi: https://doi.org/10.3916/C55-2018-07 biestA, G. (2013). Responsible citizens: Citizenship education between social inclusion and democratic politics, En M. Priestley, y G. biestA (Eds). Reinventing the curriculum: New trends in curriculum policy and practice. London: Bloomsbury. burbules, N. (2012). El aprendizaje ubicuo y el futuro de la enseñanza. Encoun- ters/Encuentros/ Rencontres on Education, 13, 3-14. CAstAñedA, L.; esteve, F. y Adell, J. (2018). ¿Por qué es necesario repensar la competencia docente para el mundo digital. RED, Revista de Educación a Distancia, 56. Doi: http://dx.doi.org/10.6018/red/56/6 Cerezo, R.; sánChez-sAntillán, M.; Puerto, M. y núñez, J. C. (2016). Students’ LMS interaction patterns and their relationship with achieve- ment: A case study in higher education. Computers & Education, 96, 42-54. Doi: http://dx.doi.org/10.1016/j.compedu.2016.02.006 del-MorAl, M.E. y villAlustre, L. (2010). Formación del profesor 2.0: desa- rrollo de competencias tecnológicas para la escuela 2.0. Magister, 23, 59-70. eConoMides, A. (2009). Adaptive context-aware pervasive and ubiquitous lear- ning. International Journal of Technology Enhanced Learning, 1(3), 169-192. esPinA, M. (2007) Complejidad, transdisciplina y metodología de la investiga- ción social. Utopía y Praxis Latinoamericana, 12(38), 29-43. esPinosA, M. T. (2014). Necesidades formativas del docente universitario. REDU, 12(4), 161-177. Fernández-borrero, M. y gonzález-losAdA, S. (2012). El perfil del buen docente universitario. Una aproximación en función del sexo del alumnado. REDU, 10(2), 237-249. index.comunicaci�n http://dx.doi.org/10.5565/rev/educar.691 http://goo.gl/TGSVVI https://doi.org/10.3916/C55-2018-07 http://dx.doi.org/10.6018/red/56/6 http://dx.doi.org/10.1016/j.compedu.2016.02.006 El rol del docente universitario y su... | Rodrigo Cano, De Casas Moreno, Aguaded | 29 FiguerAs-MAz, M.; Ferrés, J. y MAteu, J. C. (2017). 
Percepción de los/as coor- dinadores/as de la innovación docente en las universidades españolas sobre el uso de dispositivos móviles en el aula. Prisma Social, 20, 161-179. FlAvin, M. (2012). Disruptive technologies in higher education. Research in Learning Technology, 20, 102-111. FreitAs, S.; gibson, D.; du-Plessis, C.; hAllorAn, P.; WilliAMs, E.; AMbro- se, M.; dunWell, I. y ArnAb, S. (2015). Foundations of dynamic learning analytics: Using university student data to increase retention. British Journal of Educational Technology, 46(6). 1175-1188. Doi: http://dx.doi.org/10.1111/bjet.12212 gAbelAs, J. A.; MArtA-lAzo, C. y ArAndA, D. (2012). Por qué las TRIC y no las TIC. Comein, 9. Recuperado de: http://goo.gl/80dK9a gAbelAs, J. A.; MArtA-lAzo, C. y gonzález-Aldez, P. (2015). El factor rela- cional en la convergencia mediática: una propuesta emergente. Anàlisi, 53, 20-34. Doi: http://dx.doi.org/10.7238/a.v0i53.2509 gArCíA, M. L. S.; Flores, M. D. P. G.; CAno, E. V. y yedrA, L. R. (2016). Ubicui- dad y movilidad de herramientas virtuales abren nuevas expectativas formati- vas para el estudiantado universitario. Ensayos Pedagógicos, 11(2), 99-131. gArgAllo, B.; sánChez Peris, F.; ros ros, C. y FerrerAs reMesAl, A. (2010). Estilos docentes de los profesores universitarios. La percepción de los alumnos de los buenos profesores. Revista Iberoamericana de Educación, 51 (4), 1-16. gArrido-lorA, M.; busquet, J. y Munté-rAMos, R. A. (2016). De las TIC a las TRIC. Estudio sobre el uso de las TIC y la brecha digital entre adultos y adolescentes en España. Anàlisi, 54, 44-57. Doi: http://dx.doi.org/10.7238/a.v0i54.2953 gértrudix, B. M.; borges, R. E. y gArCíA, G. F. (2017). Redes sociales y jóve- nes en la era algorítmica. Telos, 107, 62-70. gozálvez, V. y ContrerAs-Pulido, P. (2014). Educar a la ciudadanía mediática desde la educacomunicación. Comunicar, revista científica de comunicación y educación, 42, 129-136. Doi: http://dx.doi.org/10.3916/C42-2014-12 gutiérrez-Porlán, I.; roMán-gArCíA, M. y sánChez-verA, M. (2018). Estra- tegias para la comunicación y el trabajo colaborativo en red de los estudian- tes universitarios. Comunicar, 54, 91-100. Doi: https://doi.org/10.3916/C54-2018-09 hArt, J. (2017). Top 100 tools for learning 2017. Las mejores 100 herra- mientas para aprender 2017. Recuperado de: http://goo.gl/z7nNDN hiMAnen, P. (2002). La ética del hacker y el espíritu de la era de la información. Barcelona: Destino. iMberMón, F. (2009). Mejorar la enseñanza y el aprendizaje en la Universidad. Cuadernos de Docencia Universitaria, 14, 1-15. http://dx.doi.org/10.1111/bjet.12212 http://goo.gl/80dK9a http://dx.doi.org/10.7238/a.v0i53.2509 http://dx.doi.org/10.7238/a.v0i54.2953 http://dx.doi.org/10.3916/C42-2014-12 https://doi.org/10.3916/C54-2018-09 http://goo.gl/z7nNDN 30 | index.comunicación | nº 8 (2) | Número especial Educación mediática y factor relacional iglesiAs-onoFrio, M. y rodrigo-CAno, D. (2013). La Web 2.0 en el proceso de enseñanza-aprendizaje: Una experiencia de innovación docente universitaria. Cuestiones Pedagógicas, 22, 287-298. KAnKArAš, M.; Montt, G.; PACCAgnellA, M.; quintini, G. y thorn, W. (2016). Skills Matter: Further Results from the Survey of Adult Skills. OECD Skills Studies. OECD Publishing. Doi: http://dx.doi.org/10.1787/9789264258051-en llorens, F.; Fernández, A.; CAnAy, J. R.; Fernández, S.; rodeiro, D.; ruzo, E. y sAMPAlo, F. J. (2016): Descripción de las TIC, en góMez, J. (Ed.) UNIVERSI- TIC 2016. Análisis de las TIC en las Universidades Españolas. 
Madrid: Crue Universidades Españolas.
Marcelo, C., Yot, C., & Mayor-Ruiz, C. (2015). University Teaching with Digital Technologies. [Enseñar con tecnologías digitales en la universidad]. Comunicar, 45, 117-124. DOI: https://doi.org/10.3916/C45-2015-12
Marcelo, C., Yot, C., Murillo, P., & Mayor, C. (2016). Actividades de aprendizaje con tecnologías en la universidad. ¿Qué uso hacen los profesores? Profesorado, Revista de Currículum y Formación del Profesorado, 20(3), 283-312.
Marta-Lazo, C., & Gabelas, J. A. (2016). Comunicación digital: Un modelo basado en el Factor R-elacional. Madrid: UOC.
Marta-Lazo, C., Hergueta-Covacho, E., & Gabelas-Barroso, J. A. (2016). Applying Inter-methodological Concepts for Enhancing Media Literacy Competences. Journal of Universal Computer Science, 22(1), 37-54.
Mas-Torelló, O. (2012). Las competencias del docente universitario: la percepción del alumno, de los expertos y del propio protagonista. REDU, 10(2), 299-318.
Mojarro, A., Rodrigo-Cano, D., & Etchegaray-Centeno, M. C. (2015). Educación personalizada a través de e-learning. Alteridad, 10(1), 21-30. DOI: https://doi.org/10.17163/alt.v10n1.2015.02
Morgan, D. (1997). The Focus Group Guidebook. California: Sage Publications, Inc.
Ng'ambi, D. (2013). Effective and ineffective uses of emerging technologies: Towards a transformative pedagogical model. British Journal of Educational Technology, 44(4), 652-661.
Papamitsiou, Z., & Economides, A. (2014). Learning Analytics and Educational Data Mining in Practice: A Systematic Literature Review of Empirical Evidence. Educational Technology & Society, 17(4), 49-64.
Plaza, J., & Acuña, A. (2017). El docente ante las TIC: roles, tradiciones y nuevos desafíos / Teachers and ICT: Roles, traditions and new challenges. (En)clave Comahue. Revista Patagónica de Estudios Sociales, 23, 157-168.
Piscitelli, A. (2015). Humanidades digitales y nuevo normal educativo. Revista TELOS, 101, 1-10.
Prendes Espinosa, M. P., Martínez Sánchez, F., & Gutiérrez Porlán, I. (2018). Competencia digital: una necesidad del profesorado universitario en el siglo XXI. RED, 56. DOI: http://dx.doi.org/10.6018/red/56/7
Rodríguez-Ortega, N. (2014). Prólogo: Humanidades Digitales y pensamiento crítico. In E. Romero-Frías & M. Sánchez-González (Eds.), Ciencias Sociales y Humanidades Digitales (pp. 13-17). Tenerife: Cuadernos Artesanos de Comunicación.
Romero-Frías, E. (2014). Ciencias Sociales y Humanidades Digitales: una visión introductoria. In E. Romero-Frías & M. Sánchez-González (Eds.), Ciencias Sociales y Humanidades Digitales (pp. 19-50). Tenerife: Cuadernos Artesanos de Comunicación.
Sancho-Gil, J. M., & Hernández-Hernández, F. (2018). La profesión docente en la era del exceso de información y la falta de sentido. RED, Revista de Educación a Distancia, 56. DOI: http://dx.doi.org/10.6018/red/56/4
Scheffel, M., Drachsler, H., Stoyanov, S., & Specht, M. (2014). Quality Indicators for Learning Analytics. Educational Technology & Society, 17(4), 117-132.
Sevillano, M. L., Quicios, M. P., & González, J. L. (2016). Posibilidades ubicuas del ordenador portátil: percepción de estudiantes universitarios españoles. Comunicar, 46, 87-95. DOI: http://dx.doi.org/10.3916/C46-2016-09
Taylor, S. J., & Bogdan, R. (1987).
Introducción a los métodos cualitativos de investigación: La búsqueda de significados. Barcelona: Paidós.
Urueña, A., Prieto, E., Seco, J. A., Ballestero, M. P., Castro, R., & Cadenas, S. (2018). Las TIC en los hogares españoles, estudio de demanda y uso de Servicios de Telecomunicaciones y Sociedad de la Información. Retrieved from: https://goo.gl/x7Sz5S
Urueña, A., Seco, J. A., Castro, R., & Cadenas, S. (2017). Perfil sociodemográfico de los internautas, análisis de datos INE 2017. Retrieved from: https://goo.gl/AB9kx1
Vázquez-Cano, E., & Sevillano-García, M. L. (2017). Lugares y espacios para el uso educativo y ubicuo de los dispositivos digitales móviles en la educación superior. EDUTEC, 62. DOI: http://dx.doi.org/10.21556/edutec.2017.62.1007

To cite this article: Rodrigo-Cano, D., de-Casas-Moreno, P., & Aguaded, I. (2018). El rol del docente universitario y su implicación ante las Humanidades digitales. index.comunicación, 8(2), 13-31.

work_eklg3piwvbdxdlxggels6ggrmm ----

Digital Evaluation of Sitting Posture Comfort in Human-Vehicle System under Industry 4.0 Framework

TAO Qing 1,2, Jinsheng Kang* 3, SUN Wenlei 1, LI Zhaobo 1, HUO Xiao 1
1 School of Mechanical Engineering, Xinjiang University, Urumqi, China 830046
2 Center for Post-doctoral Studies of Mechanical Engineering, Urumqi, China 830046
3 College of Engineering, Design and Physical Sciences, Brunel University London, Uxbridge, UB8 3PH, UK

Abstract: Most of the previous studies on the vibration ride comfort of the human-vehicle system focused on only one or two aspects of the investigation. This paper proposes a hybrid approach which integrates investigation methods in both a real environment and a virtual environment. The real experimental environment includes the WBV (whole body vibration) test, questionnaires for human subjective sensation, and motion capture. The virtual experimental environment includes the theoretical calculation on a simplified 5-DOF human body vibration model, the vibration simulation and analysis within the ADAMS/Vibration™ module, and the digital human biomechanics and occupational health analysis in Jack software. While the real experimental environment provides realistic and accurate test results, it also serves as the core of, and validation for, the virtual experimental environment. The virtual experimental environment takes full advantage of currently available vibration simulation and digital human modelling software, and makes it possible to evaluate the sitting posture comfort of a human-vehicle system for various human anthropometric parameters. How this digital evaluation system for car seat comfort design fits in the Industry 4.0 framework is also proposed.

Key words: parameter identification, vibration characteristic, sitting posture comfort, human-vehicle system, human body model, digital design, digital evaluation, Industry 4.0

1 Introduction

The process of designing a new vehicle involves satisfying a large number of requirements and following multiple guidelines. Among the various vehicle design parameters, the most critical one with a direct effect on users is the level of "comfort". The application of ergonomic methodologies to vehicle design processes is therefore becoming increasingly important.
Prolonged seated postures have been regarded as a potential risk factor for several musculoskeletal disorders, especially in the car [1]. An occupational epidemiological study by Gyi [2] showed that people exposed to over 4 h of driving per day were more than twice as likely to suffer from low back pain compared to those with over 4 h of sedentary work per day, and vibration from the road can lead to a higher risk of musculoskeletal disorders. Sitting comfort needs can be divided into sitting comfort and discomfort. Several studies have suggested that comfort and discomfort be treated as complementary but independent entities [3]. Similarly, Hancock and Pepe [4] showed that discomfort and comfort sit at different stages of needs, the latter being placed at a higher stage than the former. In other studies, comfort was not measured and only a discomfort scale was used, with supplemental objective measures such as electromyography (EMG), center of pressure (COP), or interface pressure [5]. Driving postures are related to both comfort and discomfort. In a study by Hanson et al. [6], participants described their preferred driving posture using adjectives. Zhang's findings [3] suggest that the driving posture is indeed related to both comfort and discomfort. From this, it can again be argued that subjective responses to driving postures should be rated in terms of comfort and discomfort using two separate scales.

As addressed above, designing a car seat is a challenging task that must meet multiple requirements within a confined space where vibration is generally present. Comfort is a complex construct influenced by several factors; besides subjective responses, the other important influence is vibration. At the stage of developing a seat system, prediction of the vibration responses at the human body-seat interface by computer simulation is required. In order to carry out such computer simulations, a dynamic model of the human body is needed, which is an effective tool for describing the simulation for ergonomics design. It also plays an important role in predicting the vibration characteristics of the human body and the impact of vibration on the human body. It is known that the vibration behavior of a human body on a seat is affected not only by the vibration environment [7-8] but also by the sitting posture [9]. As early as 1974, M. Robert and others built a six-degrees-of-freedom nonlinear vibration model of the human body [10]. In 1994, W. Qassem put forward an 11-degrees-of-freedom vibration model describing both vertical and horizontal vibration of the human body [11]. In 1996, Magnusson et al. made recommendations on how to design a driver's cab to reduce exposure to whole body vibration and to other risk factors having a negative effect on health [12]. In 1998, P.-É. Boileau and S. Rakheja proposed a 4-DOF human body vertical vibration model [13] based on STH (seat-to-head transmissibility) and DPM (driving-point mechanical impedance). In 2005, T. H. Kim reported a vertical vibration model of the seated human body based on STH and AM (apparent mass) [14].

* Corresponding author. E-mail: jinsheng.kang@brunel.ac.uk
Supported by the National Natural Science Foundation of China (Grant No. 51465056), the Xinjiang Provincial Natural Science Foundation of China (Grant No. 2015211C265), and the Xinjiang University PhD Start-up Funds.
In 2008, E. Zhang studied the vibration characteristics of the human body under different vibration parameters [15]. Recently, Reddy P. S. et al. proposed a 15-DOF human body model which includes the left and right lower and upper arms, as well as the neck, in the human-vehicle vibration model [16]. Ten subjects were tested in whole body vibration at five frequencies in the vertical direction, and a hybrid Polaris Spectra system was used to obtain the seat-to-head transmissibility [17]. Rantaharju T. et al. compared five different assessment methods for whole body vibration and shocks, and their impacts on the interpretation of the experimental results [18].

Most of the previous studies on the vibration ride comfort of the human-vehicle system focused on only one or two aspects of the investigation. Some of the research concentrated on human perception [3,4,5], while other work emphasized the sitting posture [6,9], and many different human body vibration models were proposed and human whole body vibration tests conducted [11-18]. With the rapid development of computational power, simulation software and VR (Virtual Reality) technology, it is now possible to integrate all the simulation and modelling aspects together, backed up with a limited number of human whole body vibration tests and questionnaire results as validation, to provide a simulation method for rapidly evaluating the sitting posture comfort of human-vehicle systems. This paper describes pioneering research along this line, in which whole body vibration tests together with questionnaires were conducted on a limited number of human subjects, a simplified 5-DOF human body vibration model was established, human body vibrations were simulated on the ADAMS software platform, and sitting postures were evaluated via a digital human and joint angles. The application of the proposed method under the Industry 4.0 framework is also discussed.

2 Methods

2.1 Vibration comfort questionnaire

During a long-term drive in a dynamic environment, the driver is exposed to whole body vibrations. The frequency range considered in the ISO 2631 standard is 0.5-80 Hz for health, comfort and perception, and 0.1-0.5 Hz for motion sickness. When measuring vibration, the primary quantity of vibration magnitude should be acceleration. The measurement should be done according to a coordinate system originating at the point from which vibration is considered to enter the human body; the principal basicentric xyz coordinate systems are shown in the ISO 2631-1:1997 standard. The root-mean-square values of the frequency-weighted acceleration were obtained according to the ISO 2631-1:1997 standard, and they were compared with the human subjective sensation acquired from the questionnaire during the whole body vibration test, as summarized in Table 1.

Table 1. Frequency-weighted r.m.s. acceleration vs. human subjective sensation (vehicle riding comfort)

  Acceleration (m/s^2)   Comfort level
  less than 0.315        not uncomfortable
  0.315 to 0.63          a little uncomfortable
  0.5 to 1               fairly uncomfortable
  0.8 to 1.6             uncomfortable
  1.25 to 2.5            very uncomfortable
  greater than 2         extremely uncomfortable
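As a minimal illustration of how Table 1 can be applied in software, the sketch below maps a frequency-weighted r.m.s. acceleration value to the comfort descriptors of the table. The function name and structure are our own illustrative choices, not part of the standard; note that the bands deliberately overlap, so a single value may carry two adjacent labels.

```python
# Minimal sketch: map a frequency-weighted r.m.s. acceleration (m/s^2)
# to the comfort descriptors of Table 1. The overlapping bands are
# intentional in ISO 2631-1, so all matching labels are returned.

COMFORT_BANDS = [
    (0.0,   0.315,        "not uncomfortable"),
    (0.315, 0.63,         "a little uncomfortable"),
    (0.5,   1.0,          "fairly uncomfortable"),
    (0.8,   1.6,          "uncomfortable"),
    (1.25,  2.5,          "very uncomfortable"),
    (2.0,   float("inf"), "extremely uncomfortable"),
]

def comfort_labels(a_w_rms: float) -> list[str]:
    """Return every Table 1 descriptor whose band contains a_w_rms."""
    return [label for lo, hi, label in COMFORT_BANDS if lo <= a_w_rms < hi]

if __name__ == "__main__":
    for a in (0.2, 0.55, 0.9, 1.4, 3.0):
        print(f"{a:.2f} m/s^2 -> {comfort_labels(a)}")
```

A value such as 0.55 m/s^2 correctly returns both "a little uncomfortable" and "fairly uncomfortable", reflecting the overlap built into the table.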
2.2 Simplified 5-DOF human body vibration model

The human body can be treated as a vibration system with various degrees of freedom [1-18]. The reason for using this simplified 5-DOF vertical human body vibration model is that it captures the most important features of human perception of, and response to, vibration in a human-vehicle system, while the model itself is simple enough to implement in a remote, portable or distributed environment within the Industry 4.0 framework. The human body is a flexible organization, and its vibration responses are therefore similar to those of an elastic system. Lumped masses, springs, dampers and multibody dynamics were used to model the human body vibration response characteristics. The human body was divided into five parts: head, torso, lower torso (including hip), left leg and right leg. The vibration of the thigh is passed through the body parts up to the head; the vibration of the head is the most important factor affecting comfort, and it can also produce visual impairment. To predict the vibration responses of the human body in a human-vehicle system under a dynamic environment, the simplified 5-DOF vertical vibration model shown in Fig. 1 was developed. This is a mechanical-equivalent model of the vibration response of a seated human body: each part of the body is treated, like a mechanical component, in terms of its mass, stiffness and damping.

Fig. 1. 5-DOF human body vibration model (head m1, torso m2, lower torso m3 and legs m4, m5, connected by springs ki and dampers ci; the seat spring-damper k6, c6 and the legs rest on the car body, which imposes the displacement excitation z0)

As shown in Fig. 1, the kinetic parameters of the model are as follows: m1, m2, m3, m4, m5 are the masses of the head, the torso, the lower torso (including hip), the left leg and the right leg; k1, ..., k5 are the stiffnesses of the corresponding parts of the human body; c1, ..., c5 are the dampings of each part; z1, ..., z5 are the displacements of the centre of gravity of each part; k6 and c6 are the stiffness and damping of the seat; z0 is the input displacement excitation.

According to Newton's second law, the vibration differential equation of this 5-DOF seated human body model is

$[M]\{\ddot{z}\} + [C]\{\dot{z}\} + [K]\{z\} = [B]\{q\}$   (1)

where [M], [K], [C] and [B] are, respectively, the mass, stiffness, damping and excitation matrices of the human-vehicle system; {z} is the output vector, $\{z\} = [z_1, z_2, z_3, z_4, z_5]^T$; and {q} is the excitation vector, $\{q\} = [z_0, \dot{z}_0]^T$.

For convenience of modelling and calculation, the mass, stiffness and damping of the right and left legs are assumed equal, that is, m4 = m5, k4 = k5, c4 = c5. Applying the Fourier transform to Eq. (1) gives

$(-\omega^2 [M] + j\omega [C] + [K])\{z(j\omega)\} = [B]\{Q(j\omega)\}$   (2)

where $\omega$ denotes the angular frequency. Since $\{Q(j\omega)\} = z_0(j\omega)\,[1,\ j\omega]^T$, it follows that

$\{z(j\omega)\} = (-\omega^2 [M] + j\omega [C] + [K])^{-1} [B]\{Q(j\omega)\}$   (3)

Finally, the general human body response transfer function is

$\{H(\omega)\} = (-\omega^2 [M] + j\omega [C] + [K])^{-1} [B]\,[1,\ j\omega]^T$   (4)

Eq. (4) is the general formula of the transfer function of the human body; it is a column vector. For the 5-DOF human vibration model shown in Fig. 1 it reads

$\{H(\omega)\} = [H_1(\omega), H_2(\omega), H_3(\omega), H_4(\omega), H_5(\omega)]^T$   (5)

where H1(ω), H2(ω), H3(ω), H4(ω), H5(ω) are the transfer functions from the car body to the head, the torso, the lower torso, the left leg and the right leg, respectively.
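To make Eqs. (1)-(5) concrete, the sketch below assembles the system matrices under our reading of the Fig. 1 topology (head-torso-lower-torso chain; each leg tied to the lower torso through k3, c3 and to the car floor through k4, c4 and k5, c5; the seat k6, c6 between floor and lower torso), and then evaluates the transfer function of Eq. (4) over 0-20 Hz. Both the topology reading and every numerical value are illustrative assumptions, not the parameters identified in this paper.

```python
# Sketch, under our reading of Fig. 1 (illustrative values only):
# build [M], [C], [K], [B] of Eq. (1), then evaluate Eq. (4) and |H1| of Eq. (6).
# DOF order: head, torso, lower torso, left leg, right leg.
import numpy as np

def assemble_matrices(m, k, c, k6, c6):
    """m, k, c: per-segment mass/stiffness/damping lists; k6, c6: seat."""
    k1, k2, k3, k4, k5 = k
    c1, c2, c3, c4, c5 = c
    M = np.diag(m)

    def couple(a, b):
        """Unit stiffness/damping pattern of a connector between DOFs a and b."""
        X = np.zeros((5, 5))
        X[a, a] = X[b, b] = 1.0
        X[a, b] = X[b, a] = -1.0
        return X

    K = k1 * couple(0, 1) + k2 * couple(1, 2) + k3 * (couple(2, 3) + couple(2, 4))
    C = c1 * couple(0, 1) + c2 * couple(1, 2) + c3 * (couple(2, 3) + couple(2, 4))
    for dof, ki, ci in ((2, k6, c6), (3, k4, c4), (4, k5, c5)):
        K[dof, dof] += ki          # grounded links: seat and the two feet
        C[dof, dof] += ci
    B = np.zeros((5, 2))           # {q} = [z0, z0_dot]^T enters via grounded links
    B[2], B[3], B[4] = [k6, c6], [k4, c4], [k5, c5]
    return M, C, K, B

def transmissibility(M, C, K, B, freqs_hz):
    """|H(omega)| of Eq. (4) for every DOF; rows = frequencies, cols = DOFs."""
    out = np.empty((len(freqs_hz), M.shape[0]))
    for i, f in enumerate(freqs_hz):
        w = 2.0 * np.pi * f
        A = -w**2 * M + 1j * w * C + K                       # dynamic stiffness
        out[i] = np.abs(np.linalg.solve(A, B @ np.array([1.0, 1j * w])))
    return out

if __name__ == "__main__":
    M, C, K, B = assemble_matrices(m=[5.5, 30.0, 15.0, 6.0, 6.0],
                                   k=[310e3, 180e3, 120e3, 90e3, 90e3],
                                   c=[400.0, 1200.0, 2000.0, 1000.0, 1000.0],
                                   k6=90e3, c6=1500.0)
    f = np.linspace(0.5, 20.0, 400)
    h1 = transmissibility(M, C, K, B, f)[:, 0]               # head DOF
    print("peak |H1| = %.2f at %.1f Hz" % (h1.max(), f[h1.argmax()]))
```

The `couple` helper expresses each spring-damper connector once, so the symmetry of [K] and [C] is guaranteed by construction.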
For the transfer function from the car body to the head, H1(ω), the expression is

$H_1(\omega) = \dfrac{z_1(j\omega)}{z_0(j\omega)}$   (6)

and the corresponding acceleration transfer function is

$S(\omega) = \dfrac{\ddot{z}_1(j\omega)}{\ddot{z}_0(j\omega)}$   (7)

2.3 Whole body vibration (WBV) experiment setup

The vibration experiment on the human body was conducted on vibration equipment manufactured by Suzhou SUSHI Test Instrument Co., Ltd, China. The test seat was a purely rigid steel seat. There were 29 participants, undergraduate and postgraduate students of Xinjiang University, with a male to female ratio of 6:4 and aged from 21 to 28 years. Their mean stature was 173.6 cm and their average weight 63.4 kg. All the participants were in good health and without skeletal or muscular disorders. Before the experiment, participants completed the general information part of the questionnaire; during the experiment they were asked to answer the questionnaire to indicate their comfort level in the different frequency bands. The principle of the experiment is shown in Fig. 2. The vibration table produces fixed-frequency vibration according to the experimental requirements. Through six acceleration sensors, the vibration signals on the human body were recorded and saved by the oscilloscope.

Fig. 2. System diagram of the whole body vibration test (vibration table driven by a power amplifier; acceleration sensors feeding a charge amplifier and signal collector; computer data analysis)

The parameter selections of the human body vibration experiment were as follows: 12 frequency bands were chosen as the experimental frequencies, namely 2, 2.5, 3, 3.5, 4, 5, 6, 8, 10, 12, 16 and 20 Hz. The strength of vibration was in accordance with the ISO 2631 standard, and the measurement time was 5 min for each frequency under the working condition.

2.4 Vibration simulation in ADAMS software

The virtual vibration test was conducted in ADAMS/Vibration™, a plug-in module for MSC ADAMS software. The centre of the virtual vibration table was set up as the input channel, and the input signal applied was 0.3 g (0.3 × 9.81 m/s^2). The output channels were set on the centres of mass of the head, torso, lower torso and the lower extremities of the virtual human, where acceleration output channels were defined. Carrying out the analysis over the frequency range 0-20 Hz in the Vibration Analysis of the ADAMS/Vibration™ module, the acceleration frequency response characteristics of each body part were obtained for the virtual human-vehicle system.

3 Data Analysis and Results

For the WBV test, the measured piezoelectric acceleration signals were transferred, after A/D conversion, into MATLAB software for filtering and spectral analysis, and the frequency response functions of the human body were then obtained. The human response transfer functions obtained from the WBV test under the different vibration frequencies are shown in Fig. 3.

Fig. 3. Curve of seat-to-head transmissibility from the experiment (transmissibility vs. frequency over 0-20 Hz; separate curves for female, male and all subjects)

Fig. 3 plots the seat-to-head transmissibility curve, drawn with ORIGIN software from the experimental data; it shows the averaged results of the WBV test over all subjects.
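The filtering and spectral-analysis step that produced Fig. 3 can be sketched as follows: from simultaneously recorded seat and head accelerations, the frequency response is estimated with the standard H1 cross-spectral estimator. The signal names, sampling rate and use of scipy.signal are our illustrative assumptions; the authors performed the equivalent processing in MATLAB.

```python
# Sketch: estimate the seat-to-head transmissibility from measured seat and
# head acceleration signals with the H1 estimator  H1(f) = P_sh(f) / P_ss(f).
# Synthetic placeholder signals stand in for the recorded sensor data.
import numpy as np
from scipy import signal

fs = 500.0                                   # sampling rate (Hz), assumed
t = np.arange(0, 300.0, 1.0 / fs)            # 5 min record at one test frequency
seat = np.sin(2 * np.pi * 4.0 * t)           # placeholder seat acceleration
head = 1.9 * np.sin(2 * np.pi * 4.0 * t - 0.3) + 0.05 * np.random.randn(t.size)

f, p_ss = signal.welch(seat, fs=fs, nperseg=4096)        # auto-spectrum of input
_, p_sh = signal.csd(seat, head, fs=fs, nperseg=4096)    # cross-spectrum
h1 = p_sh / p_ss                                         # complex FRF estimate

band = (f >= 0.5) & (f <= 20.0)
i = np.argmax(np.abs(h1[band]))
print("peak transmissibility %.2f at %.2f Hz" % (np.abs(h1[band])[i], f[band][i]))
```

Averaging the per-frequency estimates across the 29 participants, as done for Fig. 3, would simply repeat this computation per subject and per test band before taking the mean.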
Although the whole body vibration responses differ between subjects, the differences were small, with most of the resonance peaks occurring at about 4 Hz. Meanwhile, the human body comfort level dropped significantly at around 4-5 Hz. The questionnaire results showed that the outcomes of the WBV test and the subjective sensations from the questionnaires coincided.

For the theoretical calculation with the simplified 5-DOF human body vibration model, the whole process of identifying the human body dynamic characteristics was accomplished in the MATLAB optimization toolbox. In this process, the constants for each part of the human body were taken from GB/T 17245-2004, the Chinese national standard for the inertia parameters of the adult human body. The bounds on the stiffness and damping constants of each part of the human body were 50 ≤ ci ≤ 100000 N·s/m (i = 1, 2, 3, 4, 5, 6); k1, k4, k5, k6 ≥ 0 kN/m; 50 ≤ k2, k3 ≤ 300 kN/m.

Fig. 4 shows the theoretical calculation curve obtained from the 5-DOF human body vibration model, contrasted with the transfer function curve obtained from the WBV experiment. The trends of the two curves are basically the same, with resonance peaks appearing at about 4 Hz.

Fig. 4. Comparison of the transmissibility curves from experiment and from theoretical calculation

Carrying out the Laplace transform on Eq. (1), the acceleration transfer function of each part of the human body was analyzed in the frequency domain. In this analysis the frequency range was set from 0 to 20 Hz and, using the MATLAB Bode-diagram command bode, the transfer function Bode diagrams for the displacement, velocity and acceleration of all parts of the human body were obtained. The resonance frequencies of the four parts of the human body differed: the resonant frequencies of the head, torso and lower torso (including hip) were around 4 Hz, and the resonance frequency of the left and right lower limbs was around 12 Hz.

The acceleration frequency response characteristics of the simulated virtual human-vehicle vibration system obtained from the ADAMS/Vibration software were basically identical to the results of the 5-DOF theoretical calculation model: the resonance frequencies of the head, torso and lower torso were concentrated at 4 Hz, and those of the lower limbs at around 12 Hz. The ADAMS simulation results further confirmed that the 5-DOF human body vibration model, and the analysis of human body vibration characteristics we made with it, are sound.
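The identification step carried out by the authors in the MATLAB optimization toolbox can be sketched as a bounded least-squares fit: the stiffness and damping values are adjusted, within the bounds quoted above, until the model transmissibility of Eq. (6) matches the measured curve. Everything below, from the synthetic "measured" data to the starting point, is illustrative, and the sketch reuses the assemble_matrices and transmissibility helpers from the earlier sketch.

```python
# Sketch of the identification step: fit the model seat-to-head transmissibility
# to a measured curve under the bounds quoted in the text
# (50 <= c_i <= 100000 N s/m; k1, k4=k5, k6 >= 0; 50 kN/m <= k2, k3 <= 300 kN/m).
# Masses are fixed from the GB/T 17245-2004 inertia tables. Illustrative only.
import numpy as np
from scipy.optimize import least_squares

f_meas = np.array([2, 2.5, 3, 3.5, 4, 5, 6, 8, 10, 12, 16, 20], float)
sth_meas = np.array([1.1, 1.2, 1.4, 1.7, 1.9, 1.6, 1.3, 1.0,
                     0.9, 0.8, 0.6, 0.5])            # placeholder measured data
m = [5.5, 30.0, 15.0, 6.0, 6.0]                      # fixed segment masses (kg)

def residuals(p):
    k1, k2, k3, k45, k6 = p[:5]                      # k4 = k5 = k45 (symmetry)
    c1, c2, c3, c45, c6 = p[5:]                      # c4 = c5 = c45
    M, C, K, B = assemble_matrices(m, [k1, k2, k3, k45, k45],
                                   [c1, c2, c3, c45, c45], k6, c6)
    return transmissibility(M, C, K, B, f_meas)[:, 0] - sth_meas

x0 = [1e5, 1e5, 1e5, 5e4, 5e4, 1e3, 1e3, 1e3, 1e3, 1e3]
lb = [0, 50e3, 50e3, 0, 0] + [50] * 5
ub = [1e7, 300e3, 300e3, 1e7, 1e7] + [1e5] * 5
fit = least_squares(residuals, x0, bounds=(lb, ub))
print(fit.x)                                         # identified k's and c's
```

The leg symmetry assumption m4 = m5, k4 = k5, c4 = c5 stated with Eq. (2) is what allows the ten-parameter vector above instead of twelve.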
Unfortunately, these virtual human software do not have the capability to do the vibration simulation. In our previous research, we have obtained some useful sitting posture data [19] . In these experiments, the hardware used to capture the body motion of participants was the Motion Analysis Eagle Digital System (Fig. 5). This is an optical motion capture system, consisting of Digital Cameras, the Eagle Hub, to which all cameras are connected and uplinks to a computer terminal. All the hardware components are controlled by EVaRT Real Time software. It is within this software where all data is recorded, processed and displayed, and where post processing takes place. Fig 6 shows the generated skeleton model which then be exported to Jack software for occupational health analysis. Fig.5 Motion capture system camera setup The analysis of the data was accomplished using a self-developed program in MATLAB. A total of 20 joint angles were evaluated in relation to the sitting posture and body motion tasks. The definition of joint angles was made with reference to G. Andreoniet et al [20] , and with reference to other joint angle definition conventions as well. Using a self-developed program in MATLAB, the joint angle were calculated, and compared with the occupational health analysis, like fatigue etc. in Jack software. Fig.6 Sitting posture skeleton model generated 4.2 Sitting posture knowledge base Although motion capture system provided a mean to capture human body posture and motion in a realistic way, it is not possible to do motion capture for the wide population with all different anthropometric parameters. A realistic way would be to collect some motion capture data on sitting posture from typical and extreme sized subjects, and conduct biomechanics and occupational health analysis. Through interpolation and extrapolation, the sitting posture knowledge base for all population with all different anthropometric parameters can be established, with corresponding biomechanics and occupational health analysis values as index. This constitutes part of the knowledge base for the digital evaluation of sitting posture comfort in human-vehicle system. 4.3 Hybrid system for sitting posture evaluation Similarly to the motion capture techniques, human subjects WBV (whole body vibration) test could not be applied to a wide population. Actually, all the reported WBV experiment were conducted on a very limited number of human subjects, from several to one or two dozens of. The limited number of WBV test results can be interpolated and extrapolated, with the guideline from the theoretical calculation model of human body vibration. These data, together with the human subjective perception obtained from questionnaires, constitute another part of the knowledge base for the digital evaluation of sitting posture comfort in human-vehicle system. Fig.7 shows the proposed hybrid system for sitting posture comfort evaluation, which consists of two experimental environments: real and virtual. The real experimental environment includes the WBV (whole body vibration) test, questionnaires for human subjective sensation and motion capture. The virtual experimental ·6· environment includes the theoretical calculation on simplified 5-DOF human body vibration model, the vibration simulation and analysis within ADAMS/ Vibration TM module, and the digital human biomechanics and occupational health analysis in Jack software. 
4.2 Sitting posture knowledge base

Although a motion capture system provides a means to capture human body posture and motion in a realistic way, it is not possible to motion-capture the whole population with all its different anthropometric parameters. A realistic approach is to collect motion capture data on sitting postures from typically and extremely sized subjects, and to conduct biomechanics and occupational health analyses on them. Through interpolation and extrapolation, a sitting posture knowledge base for the whole population, covering all different anthropometric parameters, can then be established, indexed by the corresponding biomechanics and occupational health analysis values. This constitutes part of the knowledge base for the digital evaluation of sitting posture comfort in the human-vehicle system.

4.3 Hybrid system for sitting posture evaluation

Similarly to motion capture, the human subject WBV (whole body vibration) test cannot be applied to a wide population: all the reported WBV experiments were conducted on very limited numbers of human subjects, from a few to one or two dozen. The limited WBV test results can be interpolated and extrapolated, guided by the theoretical human body vibration model. These data, together with the human subjective perception obtained from the questionnaires, constitute another part of the knowledge base for the digital evaluation of sitting posture comfort in the human-vehicle system.

Fig. 7 shows the proposed hybrid system for sitting posture comfort evaluation, which consists of two experimental environments, real and virtual. The real experimental environment includes the WBV test, questionnaires for human subjective sensation, and motion capture. The virtual experimental environment includes the theoretical calculation on the simplified 5-DOF human body vibration model, the vibration simulation and analysis within the ADAMS/Vibration™ module, and the digital human biomechanics and occupational health analysis in Jack software. While the real experimental environment provides realistic and accurate test results, it also serves as the core of, and validation for, the virtual experimental environment. The virtual experimental environment takes full advantage of currently available vibration simulation and digital human modelling software, and makes it possible to evaluate the sitting posture comfort of a human-vehicle system for various human anthropometric parameters.

Fig. 7. Hybrid system for sitting posture evaluation (human and car seat models; WBV test, motion capture and questionnaire in the real environment; posture, comfort and vibration evaluation in the virtual environment; a satisfaction/dissatisfaction loop driving car seat parameter optimization)

5 Discussion on How the System Fits in the Industry 4.0 Framework

Industry 4.0 represents the trend and direction of the fourth industrial revolution. It is recognized that there are four key components and six design principles within the Industry 4.0 framework [21]. All of them appear to be related to digital evaluation, and digital evaluation can contribute to the implementation of the four key components and six design principles of Industry 4.0, as indicated in Table 2.

Table 2. Contribution of digital evaluation to Industry 4.0

  Industry 4.0 component / design principle   Digital evaluation could contribute to its implementation
  Cyber-Physical Systems                      √
  Internet of Things                          √
  Smart Factory                               √
  Internet of Services                        √
  Interoperability                            √
  Virtualization                              √
  Decentralization                            √
  Real-Time Capability                        √
  Service Orientation                         √
  Modularity                                  √

For example, Cyber-Physical Systems mean a higher level of integration and combination of physical and computational elements and processes, in which digital evaluation is inevitable. Virtualization, Decentralization, Real-Time Capability and Modularity will all need digital evaluation to be integrated into the digital design and digital manufacturing circle under the Internet of Things, Internet of Services and Smart Factory environment, to deliver the best product and service to the customer.

Fig. 8 shows how the current digital evaluation of car seat comfort fits in the Industry 4.0 framework. Currently, all car component and assembly design is carried out in CAD software such as SolidWorks, NX or CATIA. Car dynamic simulation can be conducted in the Adams-Car and Adams-Chassis environments. The vibration parameters of the car body and car seat identified from these dynamic simulations are the input to the car sitting comfort evaluation, which contains both the computer simulation and the knowledge of human perception of riding comfort obtained in real human vibration tests. After the sitting comfort evaluation, the results can be fed back to the design of the car, and the manufacturing process simulation can be performed, for example under Siemens FactoryCAD, FactoryFLOW, Plant Simulation or RealNC. These processes will take place within the Cyber-Physical Systems and Smart Factory environment, in which integrated simulation and synthesis, remote visualization for humans, and collaborative diagnostics and decision making are considered to be the core elements at the cognition level [22]. The interaction with customers can be achieved through the Internet of Things and Internet of Services. Service Orientation will be offered both internally and across company borders, based on customer-specific requirements [21].
Fig. 8. How the digital evaluation of car seat comfort fits in the Industry 4.0 framework

6 Conclusion

The communication, integration and synthesis between different simulated and physical systems are considered one of the core aspects of the implementation of Industry 4.0 [21, 22]. Most of the previous research on sitting posture comfort in human-vehicle systems concentrated on one or two experimental methods. This paper explored an integrated approach to the digital evaluation of sitting posture comfort in the human-vehicle system which takes advantage of all the experimental methods in the real and virtual environments. In the real environment, the WBV (whole body vibration) test, questionnaires for human subjective sensation and motion capture were conducted on a limited number of subjects, while in the virtual environment the vibration simulation and the digital human biomechanics and occupational health analysis can be extended to an unlimited number of manikins with all possible anthropometric parameters. The role of the experiments conducted in the real environment is to validate the simulation in the virtual environment; the role of the simulation in the virtual environment is to generate quick assessment results in the digital evaluation of sitting posture comfort for potential customers with various anthropometric parameters. Finally, how this digital evaluation system for car seat comfort design fits in the Industry 4.0 framework is proposed and opened for discussion.

References
[1] Rajput, B., Abboud, R. J. The inadequate effect of automobile seating on foot posture and callus development. Ergonomics, 2007, 50(1): 131-137.
[2] Gyi, D. E. Driver Discomfort: Prevalence, Prediction and Prevention. Loughborough University, UK, 1996.
[3] Zhang, L., Helander, M., Drury, C. Identifying factors of comfort and discomfort. Human Factors, 1996, 38(3): 377-389.
[4] Hancock, P. A., Pepe, A. A. Hedonomics: the power of positive and pleasurable ergonomics. Ergonomics in Design, 2005, 13(1): 8-14.
[5] Fenety, P. A., Putnam, C., Walker, J. M. In-chair movement: validity, reliability and implications for measuring sitting discomfort. Applied Ergonomics, 2000, 31(4): 383-393.
[6] Hanson, L., Sperling, L., Akselsson, R. Preferred car driving posture using 3-D information. International Journal of Vehicle Design, 2006, 42(1/2): 154-169.
[7] Griffin, M. J. Handbook of Human Vibration. Academic Press, London, 1990.
[8] Rakheja, S., Dong, R. G., Patra, S., Boileau, P.-É., Marcotte, P., Warren, C. Biodynamics of the human body under whole-body vibration: synthesis of the reported data. International Journal of Industrial Ergonomics, 2010, 40: 710-732.
[9] Wang, W., Rakheja, S., Boileau, P.-É. Relation between measured apparent mass and seat-to-head transmissibility response of seated occupants exposed to vertical vibration. Journal of Sound and Vibration, 2008, 314: 907-922.
[10] Robert, M., Charles, D. N. A model for the response of seated humans to sinusoidal displacements of the seat. Journal of Biomechanics, 1974, 7(3): 209-215.
[11] Qassem, W., Othman, M. O., Abdul-Majeed, S. The effects of vertical and horizontal vibrations on the human body. Medical Engineering and Physics, 1994, 16(2): 151-161.
[12] Magnusson, M. L., Pope, M. H., Wilder, D. G., Areskoug, B. Are occupational drivers at an increased risk for developing musculoskeletal disorders? Spine, 1996, 21(6): 710-717.
[13] Boileau, P.-É., Rakheja, S.
Whole-body vertical biodynamic response characteristics of the seated vehicle driver: measurement and model development. International Journal of Industrial Ergonomics, 1998, 22(6): 449-472.
[14] Kim, T. H., Kim, Y. T., Yoon, Y. S. Development of a biomechanical model of the human body in a sitting posture with vibration transmissibility in the vertical direction. International Journal of Industrial Ergonomics, 2005, 35(9): 817-829.
[15] Zhang, E., Xu, L.-A., et al. Dynamic modeling and vibration characteristics of multi-DOF upper part system of seated human body. Journal of Engineering Design, 2008, 15(4): 244-249. (in Chinese)
[16] Reddy, P. S., Ramakrishna, A., Ramji, K. Study of the dynamic behaviour of a human driver coupled with a vehicle. Proc IMechE Part D: J Automobile Engineering, 2015, 229(2): 226-234.
[17] Li, W., Zhang, M., Lv, G., et al. Biomechanical response of the musculoskeletal system to whole body vibration using a seated driver model. International Journal of Industrial Ergonomics, 2015, 45(2): 91-97.
[18] Rantaharju, T., Mansfield, N. J., Ala-Hiiro, J. M., et al. Predicting the health risks related to whole-body vibration and shock: a comparison of alternative assessment methods for high-acceleration events in vehicles. Ergonomics, 2015, 58(7): 1071-1087.
[19] Tao, Q., Kang, J., Orphanides, S., Hong, J., Sun, W. Application of JACK on evaluation of a split seat chair. Proceedings of the 19th International Conference on Automation and Computing (ICAC), IEEE, 2013: 1-6.
[20] Andreoni, G., Santambrogio, G. C., et al. Method for the analysis of posture and interface pressure of car drivers. Applied Ergonomics, 2002, 33: 511-522.
[21] Hermann, M., Pentek, T., Otto, B. Design Principles for Industrie 4.0 Scenarios: A Literature Review. TU Dortmund, 2015. Available at: http://www.snom.mb.tu-dortmund.de/cms/de/forschung/Arbeitsberichte/Design-Principles-for-Industrie-4_0-Scenarios.pdf
[22] Lee, J., Bagheri, B., Kao, H.-A. A Cyber-Physical Systems architecture for Industry 4.0-based manufacturing systems. Manufacturing Letters, 2015, 3: 18-23.

Biographical notes

TAO Qing, born in 1978, is currently an associate professor at Xinjiang University, China. He received his MSc degree from Huazhong University of Science and Technology, China, in 2003, and his PhD degree from Xinjiang University, China, in 2015. His research interests include mechanical design and industrial design. Tel: +86-991-8592301; E-mail: xjutao@qq.com

Jinsheng Kang, born in 1952, is currently a Senior Lecturer at Brunel University London, UK. He received his PhD degree from Bournemouth University, UK, in 2001. His research interests include human modelling and simulation, CAD, and industrial design. Tel: +44-1895-266330; E-mail: Jinsheng.kang@brunel.ac.uk

SUN Wenlei, born in 1962, is currently a professor at Xinjiang University, China. He received his PhD degree from Huazhong University of Science and Technology, China, in 2012. His research interests include mechanical design, CAD, and advanced manufacturing. Tel: +86-991-8592308; E-mail: Sunwenxj@163.com

LI Zhaobo, born in 1994, is currently an MSc candidate at the School of Mechanical Engineering, Xinjiang University, China. E-mail: liangshanlzb@163.com

HUO Xiao, born in 1993, is currently an MSc candidate at the School of Mechanical Engineering, Xinjiang University, China.
E-mail: 1505054791@qq.com

work_epd3m3kblbgdnohw24udk3giay ----

DOI: 10.12862/Lab18MZR
Roberto Mazzola
The Future of Humanities Studies in the Age of Big Data (Il futuro degli studi umanistici al tempo dei Big Data) *
Laboratorio dell'ISPF, XV, 2018

Since the pages that follow are ideally addressed above all to young scholars, intellectual honesty compels me to admit the pre-judgement that drives my reflection. I believe that the new wave of converging technologies mediated by the marriage of Big Data and Artificial Intelligence not only conflicts outright with the methods proper to the humanistic disciplines, but aspires to bring the very plurality of Digital Humanities practices back within the channel of the dominant computational paradigm, functional to the utopian dream of techno-scientific control over every aspect of individual and collective action.

In particular, the spread of an uncritical social acceptance of the presumed heuristic neutrality of data, which in reality are not neutral because they are extracted from predefined models, risks condemning critical and speculative thought to irrelevance.

I should also say in advance that the first keyword in the title of my paper deliberately excludes the social sciences, which have always made extensive use of mathematical-statistical methods. The second keyword, too, is to be understood in its broadest sense. On the one hand, then, as Wikipedia puts it, «the humanities are academic disciplines that study man and the human condition using mainly analytical, critical or speculative tools, as opposed to the empiricism proper to science»; on the other hand, the term Big Data «refers to things that can be done only at a large scale, to extract new insights or create new forms of value, in ways that come to change markets, organizations, relations between citizens and governments, and more»1.

As for the future, not only of humanities studies but of the Digital Humanities themselves, since I cannot imagine it except in continuity with the past and the present, I wonder whether, in the age of technoscience, the centuries-old admiration of humanists for the benefits of technological progress still holds.

The recent experiment conducted on two Herculaneum papyri, held at the National Library of Naples and «read» without damaging the carbonized scrolls thanks to a mix of engineering and computing technologies, invites optimism, if one considers the prospects it opens for papyrology and for scholars of Epicurean philosophy2. Other examples could be given of how technological mediation offers many opportunities to revitalize the human sciences, called upon to take up the challenge of redefining their role in the age of the globalization of culture spread through the network. After all, ever since the days of Gutenberg humanists have adapted profitably to new technologies, so much so that it is hard for us to imagine the last five centuries of humanistic culture

* Paper presented at the conference «L'Umanista nella Rete. Teorie e pratiche delle Digital Humanities», Urbino, 3-4 May 2017.
1 V. Mayer-Schönberger - K. Cukier, Big Data.
Una rivoluzione che trasformerà il nostro modo di vivere e già minaccia la nostra libertà, Milano, Garzanti, 2013, p. 16.
2 V. Mocella - E. Brun - C. Ferrero - D. Delattre, Revealing Letters in Rolled Herculaneum Papyri by X-ray Phase-contrast Imaging, in «PNAS», 113, 14, 2016, pp. 3751-3754; published ahead of print March 21, 2016. https://doi.org/10.1073/pnas.1519958113

without the invention of printing which, incidentally, is a successful example of «creative destruction»: if, on the one hand, calligraphers and copyists disappeared, on the other, new professional figures emerged, tied to the nascent world of publishing. Obviously, then as now, novelty frightened conservatives and, at its appearance, as had happened with the invention of writing criticized in Plato's Phaedrus, the new product of human ingenuity also aroused the worries of the learned, called upon to manage the enormous mass of new information, and drew the arrows of those who regarded books reproduced serially and rapidly as a diabolical instrument for spreading heretical and subversive ideas.

In our own day as well, a certain elitist narrative of the decline of humanistic culture idealizes its past prestige, forgetting that the supposed universal values of modern humanism remained buried in the trenches of the Great War, and that the complementarity of humanistic and scientific knowledge, of Wissenschaft and Bildung, inspired by the Humboldtian model of higher education and taken for granted in the intellectual formation of artists, scientists, men of letters and, in general, of the educated bourgeoisie until the mid-twentieth century, has definitively waned.

This brief digression is meant to clear the field of any sterile rehearsal of the debate on the «two cultures», humanistic and scientific, which developed on both sides of the Atlantic between the late nineteenth century and the first decades of the twentieth, and was revived by Charles P. Snow at the end of the 1950s. I consider that approach wholly inadequate for addressing the complex position of the humanistic disciplines in the digital era, which has redefined the relationship between humanism and science, given that by now all of us use computing resources at every stage of our research, to say nothing of the constant growth of digital humanities centres and laboratories scattered around the world, which by now number in the hundreds.

The relationship between humanists and computing is of long standing and, as is well known, the mythical figure of this encounter is the Jesuit father Roberto Busa (1913-2011), who at the end of the 1940s managed to obtain IBM's support to launch his lexicographic project, aimed at a rigorous analysis of the theological-philosophical complexity of the Opera Omnia of Thomas Aquinas.

Traditional methods of textual research found a powerful ally in the enormous mainframes, run by engineers in jacket and tie and watched over by technicians in white coats, which blazed the trail for humanities computing and for the success of the new discipline, computational linguistics, of which the Index Thomisticus remains an admirable example to this day.

It is worth stressing that without IBM and a constant flow of funding the Index would hardly have survived unscathed the problems of technological obsolescence that marked the project's path: it began with punched cards, moved to magnetic tapes and then to CD-ROMs, and finally to the Web, where the Index arrived in 2005.
It should also be added that once he had entered IBM's orbit, Roberto Busa, even had he wished to, would not have had the economic resources to continue his work with another company. IBM, for its part, had made a high-risk investment, but an investment nonetheless. The dependence of Roberto Busa's work on the centres of technological and political power did not remain an isolated case and, under changed conditions, remains a far from negligible problem for digital humanists.

Be that as it may, almost seventy years on, that first historic encounter is still a textbook case of the instrumental relationship of humanists with computing that prevailed before the advent of the web and of search engines. For Roberto Busa, in fact, what many then called the «electronic brain» was a precious resource added to the scholar's toolbox, because in theory the immense work could have been accomplished even without the machine's help; so much so that the first technologies used were analogue and mechanical, not digital.

Needless to add, it would be vain to look in Roberto Busa's intellectual personality for traces of what has been called the utopian heart of the Digital Humanities, rooted in the cyber counterculture of the 1970s. The Jesuit father remained all his life a humanist on loan to computing, who asked the computer neither to «change the world» nor, still less, to solve the hermeneutic problems of Thomistic thought, but more modestly to provide a fast and reliable means of producing the lexical apparatus to support scholars of the work and thought of Saint Thomas.
In particolare i cultori di studi filologici e di criti- ca letteraria si sono avvalsi dell’informatica adattandola in modo creativo al proprio modus operandi, passando dalle edizioni critiche presentate in formato ipertestuale alle edizioni critiche scientifiche born-digital. Il futuro degli studi umanistici 5 In generale dobbiamo però riconoscere che, nonostante gli studi, avviati fin dagli anni Sessanta, sui cambiamenti strutturali in atto nella nascente società dell’informazione, gli umanisti sono stati restii a confrontarsi con le sollecita- zioni, le sfide e le implicazioni culturali e sociali delle nuove tecnologie, limi- tandosi in molti casi ad un loro uso pre-cognitivo; valga per tutti il progetto Gutenberg, lanciato nei primi anni Settanta con il fine dichiarato di promuove- re la lettura attraverso la creazione di un archivio di immagini digitali di fonti primarie, a vantaggio degli studiosi che risparmiavano tempo e fatica nel repe- rimento dei corpora dei classici antichi e moderni. Ancora oggi in molti diparti- menti universitari e centri di ricerca, all’informatica è assegnata la funzione an- cillare di facilitare la ricerca umanistica tradizionale, quando non si tratta di una mera tecnica per il trasferimento della nostra memoria culturale dai supporti analogici a quelli digitali: ciò forse spiega perché difficilmente un’opera di uma- nistica digitale sarà citata dai cultori della disciplina pertinente se non come ri- sorsa accessoria al proprio lavoro ermeneutico. Ma, come ci ricorda Jeffrey Schnapp, un uso del digitale che si limiti ad immagazzinare e conservare il pa- trimonio culturale «è ormai insufficiente», perché «in fin dei conti la sua impor- tanza riguarda principalmente il mondo analogico, ovvero la possibile trasfor- mazione del mondo in cui viviamo»3. Sebbene fin dagli anni Ottanta del secolo scorso l’informatica umanistica più avvertita avesse ben chiare le implicazioni epistemologiche della diffusione del paradigma computazionale nell’ecosistema culturale caratterizzato dalla co- stante interazione uomo-macchina, la riflessione critica sui recenti sviluppi delle Digital Humanities stenta a decollare. Come a chi ha in mano un martello ogni cosa sembrerà un chiodo, così chi usa il computer finirà col vedere dappertutto una serie discreta di elementi computabili. Questa affermazione, che parrebbe evocare il diavoletto luddista sempre in agguato, è piuttosto un invito a riflettere sul fatto che il computer, a differenza di altre tecnologie, non è una macchina che si limita a lavorare per noi, bensì è un dispositivo personale per la mente, che orienta interessi e dirige il lavoro intellettuale verso un universo info-centrico dove sempre più spesso il «mezzo» diventa il «fine». Anche se non ce ne accorgiamo, lo schermo del computer riflette una visione del mondo: ad esempio l’ordine di presentazione dei risultati di un motore di ricerca condiziona la nostra percezione e valuta- zione della loro effettiva rilevanza, che, d’altro canto, cambierà in relazione alle nostre queries, fagocitate da criteri di ranking costantemente modificati e mante- nuti segreti dalle aziende unicamente impegnate a tutelare gli interessi econo- mici degli azionisti. Il dibattito attuale sul presente e sul futuro delle discipline umanistiche si in- treccia con quello sulle Digital Humanities e presenta molte sfaccettature. 3 J. Schnapp, Digital Humanities, Milano, Egea, 2015, p. 61. 
Roberto Mazzola 6 Vorrei affrontare la questione dal punto di vista di quanti nel nostro paese fan- no la scelta di esercitare le discipline umanistiche come professione, dando, ovviamente, per scontate le competenze digitali dei futuri umanisti. Nell’ultimo decennio, passata la paura dell’espulsione dell’informatica uma- nistica dalle università auspicata dalla ministra Moratti, grazie ad una nuova le- va di ricercatori formatisi nei corsi di laurea e master di informatica umanistica, non solo abbiamo assistito alla costante migrazione in rete delle forme tradi- zionali della circolazione del sapere umanistico, ma anche all’emergenza di una nuova figura professionale, l’umanista digitale, che lentamente e inesorabilmen- te sta modificando il «mestiere» dell’umanista. Un mestiere che è stato in passato ed è ancora, non si sa per quanto tempo, un insieme di tecniche e metodi di ricerca, costantemente perfezionati e rinno- vati, di lenta rielaborazione personale delle conoscenze accumulate nel tempo; un mestiere che, al di là degli specialismi, è adesione a regole di condotta non scritte, a indirizzi culturali condivisi da comunità di studiosi, piccole o grandi che siano. È soprattutto spirito critico, esercitato nel rispetto della pluralità di metodi di indagine, di analisi e argomentazione in vista del fine comune di pro- durre nuove conoscenze. Se le attività che siamo soliti associare alla pratica umanistica cadranno in di- scredito, quanti in futuro saranno ancora disposti a spendere tempo e fatica per un apparato di note a piè di pagina ben fatto, o per redigere una bibliografia ragionata di libri effettivamente letti? Per quanto tempo ancora la monografia costituirà la modalità principale di scrittura accademica? Come sottolineava Max Weber, il Beruf, il mestiere, di chi esercita la scienza come professione è fatto di competenza alimentata da passione disinteressata per la conoscenza4. Ed è a questo mestiere, nella duplice accezione weberiana di professione e vocazione, che all’umanista è chiesto di rinunciare quando è chiamato a fornire i cosiddetti contenuti spendibili sul mercato delle applicazioni degli ultimi ri- trovati tecnologici di realtà aumentata o virtuale, dove cultura e svago si fon- dono e si confondono o che, nel peggiore dei casi, sono miseramente destinati a rimanere alla fase alpha o a finire in demo di software che non vedranno mai la luce. La dimensione per così dire artigianale degli studi umanistici già oggi non rappresenta più la tipica esperienza formativa delle nuove generazioni, e ancora meno lo sarà in futuro, quando le tecnologie digitali orienteranno le strategie didattiche e i processi educativi come auspicato, ad esempio, dal progetto mini- steriale «Piano Scuola Digitale» o dalla sperimentazione nelle scuole primarie di «Smart School» sponsorizzata da Samsung Italia. Anche se siamo ancora lontani dagli eccessi nordamericani, le recenti colla- borazioni attivate dalle università di Bologna, Napoli e Venezia con Google, Apple e Samsung vanno nella direzione di un sempre più invasivo intervento 4 M. Weber, La scienza come professione. La politica come professione, Torino, Einaudi, 2004. Il futuro degli studi umanistici 7 dei colossi del web anche nella formazione delle future generazioni di umanisti digitali inseriti e integrati nell’universo multimediale del web, sempre e dovun- que disponibile grazie all’interoperabilità dei diversi dispositivi di connessione, resi sempre più friendly, o per dirla politically incorrect «a prova d’idiota». 
A fronte degli impetuosi sviluppi tecnologici, ripensare il ruolo della cultura umanistica senza cadere nella trappola di combattere un’anacronistica battaglia di retroguardia ammantata di retorica «resistenziale», che in realtà finisce per accettare acriticamente l’ineluttabilità dell’esistente, significa, tra l’altro, situare la riflessione critica nel luogo strategico della diffusione dei dispositivi del sape- re/potere della cultura digitale. Mi riferisco ai finanziamenti delle discipline umanistiche che, com’è noto, non solo in Europa, sono irrilevanti se paragona- ti alle risorse destinate alle scienze. Non si tratta di un fenomeno nuovo, ma, a differenza di quanto accadeva ancora in un recente passato, sempre più spesso i responsabili delle decisioni politiche non solo stabiliscono quanto, ma anche come spendere. Il risultato è sotto gli occhi di tutti: basti considerare l’aumento esponenziale dei fondi europei destinati a progetti, meglio se di breve periodo e transnazionali, finalizzati a promuovere l’accesso alla cultura attraverso mezzi digitali, rispetto ai fondi assegnati alla ricerca di base senza ricadute economi- che o a progetti individuali di lungo periodo i cui risultati non possono essere previsti. Nelle università e nei centri di ricerca la già difficile lotta per la sopravviven- za degli umanisti nella giungla del «pubblica o muori» è resa ancora più aspra dall’imperativo del «digitalizza o muori». Se il fine di un singolo o di un gruppo che ottiene un finanziamento è quello di aumentare le chances di ottenerne altri in futuro, e uno dei modi più semplici per continuare a ottenere fondi è quello di partire non dai propri interessi ma dalla tecnologia, può capitare così di portare avanti progetti eterodiretti da inte- ressi politici ed economici. Nel prossimo futuro, possiamo esserne certi, le pos- sibilità di finanziamento aumenteranno ulteriormente se alla parola chiave Digi- tal si aggiungerà quella di Big Data, indispensabile alle Big Humanities, che come ogni Big Science che si rispetti richiede un alto livello di infrastrutture tecnologi- che e risorse economiche significative. La disperata ricerca di fondi pone le Digital Humanities in una condizione di difficile equilibrio tra l’adattare i progetti alla dimensione imprenditoriale ri- chiesta dai partner istituzionali e salvaguardare l’autonoma sperimentazione di forme di produzione, comunicazione e circolazione della cultura umanistica, che riutilizzino il materiale presente in rete, il cui costo è praticamente pari a zero5. All’inizio abbiamo visto che i Big Data rendono velocemente fruibile l’enor- me volume di informazioni provenienti dalle più svariate fonti, utilizzabili per gli scopi più disparati. Vediamo ora alcuni esempi concreti del loro utilizzo in ambito umanistico. 5 Vedi ad esempio . http://www.bibliotecanapoletana.it/ Roberto Mazzola 8 Con velocità sorprendente, se pensiamo ad esempio ai magri risultati fino ad oggi conseguiti dal ventennale sforzo di dar vita alla Biblioteca Digitale Italiana, dal 2004 Google ha digitalizzato decine di milioni di libri e, poiché la digitaliz- zazione di per sé non fornisce dati, grazie all’OCR omni-font i testi sono stati resi indicizzabili. La miniera di informazioni così ottenuta ha permesso al- l’azienda non solo di migliorare i servizi di controllo ortografico e di traduzione automatica, ma anche di sperimentare nuove forme di analisi testuale automati- ca. 
A first taste of the potential of the quantitative-probabilistic approach to books was offered by two scholars who, with Google's financial support, used software that has «read» about five million books. The version of the program, Ngram Viewer6, freely accessible, displays graphs of the prominence of single words or phrases across the centuries and across the various linguistic areas: a method of analysis which, according to its authors, would represent a Copernican revolution in the human sciences, because the newborn science, dubbed «culturomics»7, will map the entire cultural heritage of humanity as has been done with the genetic one. The advocates of distant reading are always careful to specify that the algorithmic analysis of literary texts enriches and does not replace traditional skills and methods. This is certainly true in the case of Franco Moretti8, but not for those who champion the theory of the end of theories.

Google, driven by the ambitious project that Siva Vaidhyanathan has called the googlization of knowledge9, did not stop at books, and in 2011 it announced the birth of the Google Cultural Institute, which prefigures the global cultural infrastructure of the twenty-first century.

As with the digitization of books, the declared objective is to make the artistic and cultural heritage of humanity accessible to all. A generosity which, if we want to be benevolent, we may consider an updated version of the patronage widespread among the American plutocrats of the last century or, thinking ill, a shrewd way of entering the rich business of the culture industry, in particular by supplying new products and services to the flourishing market of cultural tourism. After all, the Internet magnates devote themselves to philanthropy in a new way. Unlike the Robber Barons of the past century, such as Andrew Carnegie, John D. Rockefeller and Andrew Mellon, who gave money for scholarships and to build hospitals, schools, libraries and so on, with the aim of mitigating the inequalities of American society, the techno-philanthropists of Silicon Valley operate on a planetary scale, promoting initiatives over which they retain full control thanks to the ambiguous legal status of their foundations, such as those of the Gateses, the Bezoses or the Zuckerbergs, which blurs the boundary between non-profit and for-profit.

6 In the terminology of computational linguistics, n-grams are the occurrences of a word or phrase measured over a certain period of time.
7 E. Aiden - J.B. Michel, Uncharted. Big Data as a Lens on Human Culture, New York, Riverhead Books, 2013.
8 F. Moretti, La letteratura vista da lontano, con un saggio di A. Piazza, Torino, Einaudi, 2005; Id., Distant Reading, London-New York, Verso Books, 2013. As evidence of Franco Moretti's boundless reading, see his recent exercise in refined literary criticism: Il borghese. Tra storia e letteratura, traduzione di Giovanna Scocchera, Torino, Einaudi, 2017 (orig. ed.: The Bourgeois: Between History and Literature, 2013).
9 S. Vaidhyanathan, La grande G. Come Google domina il mondo e perché dovremmo preoccuparci, Milano, Rizzoli, 2012, pp. 175-203.
Unlike Google Books, which at its outset met with considerable resistance, including on the legal front10, the birth of the Google Cultural Institute was greeted by a unanimous chorus of enthusiastic approval, triggering a veritable hunt for partnerships by public and private bodies and institutions, while art critics and exhibition curators «applaud the axe that will behead them»11. Naturally, particular attention is devoted to the Bel Paese, the object of Elisa Bonacini's careful survey12.

The organization of the platform into three sections (Art Project, Historic Moments and World Wonders) mirrors that of the theme park, a sort of Disneyland of the collective cultural imaginary. It is neither a «place» nor a «non-place», in the sense indicated by Marc Augé, but rather a neo-place that allows remote visits to archaeological sites and virtual museums, as well as offering photographs and documentary material supplied by institutions or individual users.

The technical solutions adopted to set up a virtual museum shape the way it is used, determining the kind of aesthetic and cultural experience the user has. If, for example, we want to visit the Hermitage museum in St Petersburg, housed in what was the Winter Palace of the Tsars, while comfortably seated on our sofa at home, we can do so in the standardized narrative modes offered by Google, made of very-high-definition images, immersive tours and more. But is this the only way to get to know the Hermitage at a distance? Not if we combine a passion for art with one for cinema and watch Russian Ark. The film, directed in 2002 by Aleksandr Sokurov and shot in digital, strikes cinephiles because it is made without cuts and without editing. The choice of the sequence shot is no mannerist virtuosity, because the uninterrupted flow of images for an hour and a half takes us on a journey through time that brings back to life not only the history of the art collections, but also the stories of the figures who assembled those works.

Since Google does not want to live only on the web, in 2013 it chose a city of high symbolic value, Paris, as the European seat of the Institute, which also hosts the Lab, a campus incubating new forms of collaboration between art and technology. Numerous code artists, so called to mark their distance from the computer artists of the recent past, are hosted there to accomplish an undertaking that would have discouraged even the most optimistic of the French encyclopédistes of the Age of Enlightenment: making sense of seven million digitized items. The task entrusted to them is to collaborate in making «creative» the most advanced machine-learning software at Google's disposal. The results are presented in the section devoted to the experiments conducted in the Lab. Among the experiments, «x-degree-of-separation» stands out, which takes up the idea of the six degrees of separation linked to the experiments conducted in the 1960s by Stanley Milgram.

10 R. Mazzola, Google Books e le scienze (post)umane, in «Laboratorio dell'ISPF», XII, 2015, DOI: 10.12862/ISPF15L405 <http://dx.medra.org/10.12862/ispf15L405>.
11 N. Wiener, Introduzione alla cibernetica. L'uso umano degli esseri umani, Torino, Bollati Boringhieri, 2012, p. 176.
12 E. Bonacini, Google e il patrimonio culturale italiano, in «SCIRES-IT», IV, 2014, 1, pp. 25-40 <http://caspur-ciberpublishing.it/index.php/scires-it/article/view/10911/10116>.
Google's engineers and code artists have applied machine learning to discover models capable of finding paths between two images of any artefacts chosen by the user, which are linked to one another through a chain of similarities of shape and colour. Since the gallery of images proposed may cause some bewilderment, the company advises using the product as a true serendipity machine. Playing along, since we are talking about finding what one is not looking for, I cannot help reporting the short circuit triggered in me by the animation of The Fall of the Rebel Angels by Bruegel the Elder, the fruit of the nostalgia of one of Timothy Leary's grandchildren for the artificial paradises evoked by the altered states of consciousness of the good old days, as people used to say, though no one says it any more.

Inside the Paris institute, visitors find what they need to build their own virtual-reality viewer for smartphones, the Cardboard, which I believe would horrify Jaron Lanier, who hoped that «in the future, people will use virtual reality collectively in order to socialize»13.

The augmented- and virtual-reality experiences offered by Google are, in my view, rather a return to the forms of optical spectacle widespread before the advent of cinema, which offered simulated journeys into the variegated world of images of the various Panoramas, Dioramas, Cosmoramas and Sensoramas still present at the universal expositions between the nineteenth and twentieth centuries, re-actualized in futuristic guise at the 1939 New York World's Fair, devoted to The World of Tomorrow, and then in the Tomorrowland attraction at Disneyland, inaugurated in 1955.

With another short circuit, Google's viewer reminded me of the optician Dippold sung by Fabrizio De André, who promises to give his customers the light that turns the world into a toy:

Colour-blind, long-sighted, beggars of sight,
the merchant of light, your optician,
now wants only special clients
who don't know what to do with normal eyes.
No longer an optician but a dealer in lenses
for improvising contented eyes,
so that pupils accustomed to copying
may invent the worlds to look upon.
Follow with me these eyes as they dream,
fleeing their orbit, never wanting to return.

Bringing the art of the past to life is no easy undertaking, and if the traditional museum has been accused, with some reason, of being the tomb of art, the risk for the virtual museum is that of turning into its cenotaph.

I would like to conclude with an invitation to optimism coming from a source that is unexpected to say the least. Recently the machine-learning expert Pedro Domingos, overturning the commonplace according to which the humanities have «entered a spiral that will lead them to death», declared himself convinced that the long-term prospects of scientists do not look the rosiest. Indeed, he stresses, «in the future the only scientists to survive may be computer scientists». So that, when computers and robots can do everything better than we can, the value of the humanists' contribution will grow, their field of action being «everything that cannot be understood unless one is a human being»14. Good luck with the work.

13 Interview of 8 March 1998, available at <http://www.mediamente.rai.it/home/bibliote/intervis/l/lanier.htm>.
14 P. Domingos, L'algoritmo definitivo. La macchina che impara da sola e il futuro del nostro mondo, Torino, Bollati Boringhieri, 2016, p. 319.
Laboratorio dell'ISPF, ISSN 1824-9817, www.ispf-lab.cnr.it

Roberto Mazzola, ISPF-CNR, Napoli, mazzola@ispf.cnr.it – Il futuro degli studi umanistici al tempo dei Big Data

Citation standard: MAZZOLA Roberto. Il futuro degli studi umanistici al tempo dei Big Data. Laboratorio dell'ISPF. 2018, vol. XV (11). DOI: 10.12862/Lab18MZR. Online first: 15.06.2018. Full issue online: 21.12.2018.

ABSTRACT. The future of the Humanities in the era of Big Data. This article proposes a critical enquiry into the impact of the new wave of digital technologies on the humanistic disciplines. Today humanists are called to accept the challenge of redefining their role in the age of the globalization of culture. In the last twenty years, the widespread diffusion of the world wide web, search engines and social networks has definitively fostered a crisis of confidence in the idea of a purely instrumental use of the computer to facilitate traditional humanistic research. More recently, the introduction of Big Data, remote reading and machine learning resources, products and tools has begun a process of radical transformation of the very practices of the Digital Humanities. In an increasingly web-based digital environment, humanistic research requires a high level of specialization, scientific expertise and technological infrastructure, as well as massive funding.

KEYWORDS: Digital Humanities; Big Data; Distant Reading; Google Cultural Institute

work_erliyypy4rbpll5utntv2c3rhu ----

3D Virtual Environment System Applied to Aging Study – Biomechanical and Anthropometric Approach

Procedia Manufacturing 3 (2015) 5551-5556. DOI: 10.1016/j.promfg.2015.07.728. © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license. Peer-review under responsibility of AHFE Conference.

6th International Conference on Applied Human Factors and Ergonomics (AHFE 2015) and the Affiliated Conferences, AHFE 2015

3D virtual environment system applied to aging study - Biomechanical and anthropometric approach

Carla Guimaraes*, Vitor Balbio, Gloria Cid, Maria Cristina Zamberlan, Flavia Pastura, Laryssa Paixao

National Institute of Technology, Av. Venezuela, 82, Anexo 4, Rio de Janeiro, 20081-312, Brazil
Abstract

The percentage of people over 60 years of age in the Brazilian population increased from 8.6% in 2000 to 10.8% in 2010. In 78 Brazilian cities this portion of citizens already represents 20% of the total population. In this context, falling is one of the most serious problems of the aging process, and it is being recognized as an important public health issue because it is a major cause of disability in older adults. It is a matter of concern for older people because it can lead to physical handicap and loss of independence. The purpose of this paper is to present a 3D digital interactive environment for working with 3D digital human models applied to aging studies. The 3D interactive platform framework involves: first step - scanning older adults and caregivers with a 3D whole-body scanner and capturing caregivers' motions using 17 inertial sensors from XSENS Technology; second step - 3D modeling and simulation - scan and motion data are incorporated into the virtual environment; third step - study reports and e-book. The conclusions are: the simulation will assure more democratic visualization and improve the information available to the stakeholders involved, such as designers, architects and health personnel, to the benefit of the senior population; the 1D and 3D anthropometric measurement database of caregivers and older adults will be a tool that can help designers and health carers in the future to improve design and care services for senior people, in order to improve safety and quality of life. The 3D digital interactive environment is still under development. This system could also be used to interact with caregivers as a training game to improve daily care service tasks.

Keywords: Aging study; Caregivers; Biomechanics; 1D and 3D anthropometry

* Corresponding author. Tel.: +55 21 2123 1058; fax: +55 21 2123 1001. E-mail address: carla.guimaraes@int.gov.br

1. Introduction

The proportion of elderly people in the Brazilian population has grown overall in the last 10 years [4]. The percentage of people over 60 years increased from 8.6% in 2000 to 10.8% in 2010. In 78 cities of Brazil this portion of citizens already represents 20% of the total population. In this context, falling is one of the most serious consequences of the aging process and is recognized as an important public health problem due to its incidence, its health complications and its high assistance cost; in other words, falls are a major source of disability in older people and are highly associated with postural instability and the home environment. This is a matter of concern for the elderly, for it can lead to physical handicap and loss of independence.
These health and social problems have increased the concern of government health institutions regarding care services and product development for this population, in order to improve quality of life and safety.

A Digital Human Model (DHM) is a digital human representation in 3D space that can be moved and manipulated to simulate real and accurate movements of people [2]. Digital human modelling is a fast-growing area that bridges computer-aided engineering, design, human factors, applied ergonomics and training [5]. Digital human modelling and simulation play an important role in product design, prototyping, manufacturing, health services and many other areas [3].

The term "serious games" describes software/video games designed specifically for training and education (in terms of learning and practice) [1,7]. A subset of educational serious gaming focuses on training, where users need to acquire a specific competence or build up a particular set of skills. Serious games are designed to solve real-life problems through environment visualization and simulation [8,9]. Technological innovations have frequently been implemented in attempts to enhance the learning experience. Technologies such as inertial sensors, magnetometers, GPS and wireless technologies, or a combination of such devices, can provide detailed activity information, occupational biomechanics and performance measures in order to enrich training and technique evaluation [5].

The purpose of this paper is to present a 3D digital interactive environment for working with 3D digital human models applied to aging studies, with a focus on caregivers' biomechanical analysis and on 1D and 3D anthropometric measurements of caregivers and older adults.

2. 3D interactive platform framework

The Ergonomics Laboratory team of the National Institute of Technology has developed "serious games" platforms and simulation environments applied to ergonomic work analysis and new ergonomic design. The goal of these simulations has been to help designers and employees understand and implement ergonomic concepts in work environment design. Based on these experiences, we are developing the present 3D interactive system. The system consists of a basic system and the modular tools described below (see Fig. 1):

- Analysis: allows recording of the "bone" and joint data graphs and of the diagram of movement in 3D (a minimal sketch of this kind of joint-data processing is given right after this section);
- E-Book: consists of text and images from the analysis of caregivers' working movements, readable on computers or other electronic devices;
- Reports and Exports: returns reports with graphs and diagrams for printing or saving, and allows exporting the raw data to XML or another exchange data format.

The caregivers and older adults invited to participate in this study are part of the Center of Research and Study of the Elderly of Rio de Janeiro State (CEPE). The older people group will be selected by the CEPE health care team. A formal consent form is signed by the caregivers and older people who agree to participate in the study.

Fig. 1. Flow chart
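To make the Analysis module concrete, here is a minimal sketch, not the platform's actual code, of how an exported joint-angle time series could be loaded and summarized. The file name, the CSV layout (one column of time plus one column per joint) and the joint names are purely illustrative assumptions of ours.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical export of one motion-capture session: one row per frame,
# first column time (s), remaining columns joint angles (degrees).
data = np.genfromtxt("caregiver_task01.csv", delimiter=",", names=True)
time = data["time"]

for joint in ("shoulder_flexion", "elbow_flexion", "trunk_flexion"):
    angle = data[joint]
    # Range of motion over the task: a simple summary for the reports.
    rom = angle.max() - angle.min()
    print("%s: range of motion = %.1f deg" % (joint, rom))
    plt.plot(time, angle, label=joint)

plt.xlabel("time [s]")
plt.ylabel("joint angle [deg]")
plt.legend()
plt.savefig("kinematic_graphs.png")  # graph handed to the Reports module
```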
3. Data acquisition

Data acquisition comprises three steps before data input into the 3D digital interactive environment. In the first step, the caregivers and older people group from CEPE are invited to be scanned in a Cyberware WBX 3D whole-body scanner at the Ergonomics Lab of the National Institute of Technology, and 1D and 3D anthropometric data are also taken. The 1D and 3D anthropometric methods applied in the study follow the CAESAR research protocols [6] (see Fig. 2a, b). After the scanning data are acquired, each scan is treated using a retopology process.

In the second step, the skilled caregivers were asked to select and define the working movements involved in handling elderly people that they consider the most difficult and the most physically demanding. Their motions were then captured using a suit with 17 inertial sensors from XSENS Technology (MOCAP) (see Fig. 3).

Finally, the data captured from MOCAP and the 3D DHM data from the scanning process are incorporated into the virtual platform, which is being developed as interactive 3D software using Unity 3D and other game tools. The visual representation of the 3D DHM in the platform is generated from the mocap data, following the position and dimension of the bone segments of the virtual body. That makes the visual representation an accurate copy of the original bone positions and of the specific actor's movements being captured.

Fig. 2. (a) 1D traditional anthropometry applied to the elderly; (b) 3D anthropometry applied to the elderly

Fig. 3. Caregiver task motion captured using MOCAP inertial sensors

4. Data analysis

The platform is being developed considering the need to analyze data from different caregivers' movements repeated at different moments (kinematic analysis) and performed with different skill levels (see Fig. 4 and Fig. 5). These analyses are done in the Visual 3D software and will be incorporated into the 3D platform. The 1D and 3D anthropometric measurement data from caregivers and older people will be analyzed with Matlab statistical software and the Cloud Compare software, and the database with these measures will be incorporated into the platform.

The data analysis allows the kinematic data to be visualized by means of graphics. This kind of visualization makes it easy to analyze the data and apply it to training (see Fig. 4).

Fig. 4. Kinematic data graphic interface

5. E-Book and reports

These modules complete the interactive platform. The book features an entry corresponding to the movement currently under analysis and shows anthropometric data, while the Report module exports the analysis results to an exchangeable and readable data format. The overall organization of the modules of the platform can be seen in the spreadsheet (see Fig. 5).

Fig. 5. 3D interactive platform structure

6. Conclusions

The 3D digital interactive environment is still under development. Its analyses will make it possible to study and improve caregivers' performance through training and to prevent work-related musculoskeletal problems. The simulation will assure more democratic visualization and improve the information available to the stakeholders involved, such as designers, architects and health personnel, to the benefit of the senior population. The 1D and 3D anthropometric database of caregivers and older people will be an information tool that can help designers and health care stakeholders in the future to improve health care services, safety and quality of life for the elderly population.

Acknowledgements

The Ergonomics Laboratory researchers would like to thank the sponsor: FAPERJ/CEPE – Edital Pró-idoso.

References
[1] Annetta, L. A. A framework for serious educational game design. Review of General Psychology, Vol. 14(3), Sep, 250, 2010.
[2] Guimarães, C. P.; Ribeiro, F.; Cid, G. L.; Streit, P.; Oliveira, J.; Zamberlan, M. C.; Paranhos, A. G.; Pastura, F. 3D digital platform development to analyze 3D digital human models: a case study of Jiu-Jitsu combat sport. Proceedings of the 2nd International Digital Human Modeling Symposium, Ann Arbor, Michigan, 2013.
[3] Guimarães, C.; Pastura, F.; Pavard, B.; Pallamin, N.; Cid, G.; Santos, V.; Zamberlan, M. C. Ergonomics design tools based on human work activities, 3D human models and social interaction simulation. IHC Congress, Miami, 2010.
[4] IBGE, Pesquisa Nacional por Amostra de Domicílios 1999/2009, 2010.
[5] James, D. A.; Thiel, D. E.; Allen, K. J.; Abell, A.; Kilbreath, S.; Davis, G. M.; Rowlands, D.; Thiel, D. V. Technology and health: physical activity monitoring in the free living environment. Procedia Engineering 34, 367-372, 2012.
[6] Robinette, K. M.; Blackwell, S.; Daanen, H. A. M.; Fleming, S.; Boehmer, M.; Brill, T.; Hoeferlin, D.; Burnsides, D. Civilian American and European Surface Anthropometry Resource (CAESAR), Final Report, Summary, vol. I. AFRL-HE-WP-TR-2002-0169, Human Effectiveness Directorate, Crew System Interface Division, Wright-Patterson AFB, OH, and SAE International, Warrendale, PA, 2002.
[7] Naumann, A.; Rötting, M. Digital human modeling for design and evaluation of human-machine systems. In: Israel, J. H. & Naumann, A. (eds.), MMI Interaktiv - Human: Vol. 1, No. 12, pp. 27-35, 2007.
[8] Senevirathne, S. G.; Kodagoda, M.; Kadle, V.; Haake, S.; Senior, T.; Heller, B. Application of serious games to sport, health and exercise. In: Proceedings of the 6th SLIIT Research Symposium, Sri Lanka, 27 January, 2011.
[9] Steinmetz, R.; Göbel, S. Challenges in serious gaming as emerging multimedia technology for education, training, sports and health. Advances in Multimedia Modeling, Lecture Notes in Computer Science, Vol. 7131, p. 3, 2012.

work_esyo6plndnejhnptsxiuw3tahy ----

A human-like learning control for digital human models in a physics-based virtual environment

HAL Id: hal-01171441, https://hal.archives-ouvertes.fr/hal-01171441. Submitted on 3 Jul 2015.

Giovanni De Magistris, Alain Micaelli, Paul Evrard, Jonathan Savin. A human-like learning control for digital human models in a physics-based virtual environment. Journal of Visualization and Computer Animation, John Wiley & Sons, 2015, pp. 423-440. DOI: 10.1007/s00371-014-0939-0.
A human-like learning control for digital human models in a physics-based virtual environment

Giovanni De Magistris · Alain Micaelli · Paul Evrard · Jonathan Savin

Abstract This paper presents a new learning control framework for digital human models in a physics-based virtual environment. The novelty of our controller is that it combines multi-objective control based on human properties (a combined feedforward and feedback controller) with a learning technique based on human learning properties (the human being's ability to learn novel task dynamics through the minimization of instability, error and effort). This controller performs multiple tasks simultaneously (balance, non-sliding contacts, manipulation) in real time and adapts feedforward force as well as impedance to counter environmental disturbances. It is very useful for dealing with unstable manipulations, such as tool-use tasks, and for compensating for perturbations. An interesting property of our controller is that it is implemented in cartesian space with joint stiffness, damping and torque learning in a multi-objective control framework. The relevance of the proposed control method to model human motor adaptation has been demonstrated by various simulations.

G. De Magistris, A. Micaelli, P. Evrard: CEA, LIST, LSI, rue de Noetzlin, Gif-sur-Yvette, F-91190 France. E-mail: giovanni.de-magistris@cea.fr
J. Savin: Institut national de recherche et de sécurité (INRS), rue du Morvan, CS 60027, Vandœuvre-lès-Nancy, F-54519 France

Keywords Digital Human Model · Motion Control · Bio-Inspired Motor Control · Virtual Reality

1 Introduction

The digital human model (DHM) technique is rapidly emerging as an enabling technology and a unique line of research for the verification of human factors issues in industry, which is the general purpose of our work.

In order to evaluate the physical (biomechanical) aspects of working conditions, several software packages have been developed to facilitate ergonomic assessment, such as SAMMIE [55], JACK [4], Ergoman [59] and SANTOSHuman [69,68]. Simulations computed with these software packages usually rely on kinematic animation frameworks. Such frameworks use either pre-recorded motions obtained by a tracking system and motion capture, or interactive manual positioning of the DHM body through a mouse, menus and keyboard.

In the first case, simulations are realistic but they require extensive instrumentation of a full-scale mock-up of the future workstation or of a similar existing one. They are extremely time-consuming because of motion capture data processing [6]. Furthermore, their ability to predict complex human postures and movements for various sizes and dimensions in a timely and realistic manner is strictly dependent on the accuracy of the motion database. In the second case, simulations are clearly subjective (the designer, possibly with no specific skill in ergonomics, arbitrarily chooses a posture or trajectory). Again, they are time-consuming (built up like a cartoon) and usually appear unnatural [13], even though these digital manikins possess semi-automatic controls provided by a set of behaviours, such as gazing, reaching, walking and grasping. These issues do not encourage designers to consider alternative scenarios, which would be beneficial for a comprehensive assessment of the future work situation.
Fig. 1: Adaptive and learning controller

Moreover, such software packages are subject to numerous limitations: since they are restricted to static models and calculations, they neglect dynamic aspects. Neither do they consider contact forces between the DHM and objects (at best, the designer has to arbitrarily set both contact force magnitude and direction manually). For these reasons, assessment of biomechanical risk factors based on simulations of industrial or experimental situations may underestimate real stress by up to 40-50% [43].

A challenging aim therefore consists in developing a DHM capable of performing tasks as an artificial human being, through dynamically consistent motions, behaviours and internal characteristics (positions, velocities, accelerations and torques), based on a simple description of the future work task, in order to achieve realistic ergonomics assessments of various work task scenarios at an early stage of the design process.

2 Human behaviours

To achieve this goal, a multi-objective DHM controller based on human behaviours using simulated physics is presented in this article. In our simulation framework, the entire motion of the human model in the virtual environment is ruled by real-world Newtonian physical and mechanical simulation, along with automatic control of applied forces and torques. To develop this controller, we chose to take into account the following important behaviours of human motor control:

1. Spring-like behaviour: Won and Hogan [71] noted that muscle elastic properties and reflexes produce a restoring force towards an undisturbed trajectory when the hand is slightly perturbed, like a spring between the hand and the planned trajectory. The mechanical impedance (the strength of these spring-like properties) increases with endpoint force [25] or muscle activation [39], and it is adapted to counter environmental disturbances [48]. This behaviour is implemented in the feedback part of our controller.

2. Anticipatory capabilities: When a multibody system gets in touch with an object, it is important to make the limb more compliant to avoid "contact instability" [31]. An important conclusion, which consistently emerges from the theoretical analysis, is that mechanics needs a feedforward control.

A number of studies have shown that the nervous system uses internal representations to anticipate the consequences of dynamic interaction forces. In particular, Lackner and Dizio [40] demonstrated that the central nervous system (CNS) is able to predict centripetal and Coriolis forces; Gribble and Ostry [26] demonstrated the compensation of interaction torques during multijoint limb movement. These studies suggest that the nervous system has sophisticated anticipatory capabilities. We therefore need to design accurate internal models of body dynamics and contacts.

Generally, a feedforward control model is based on the anticipatory computation of the forces that will be needed to carry out a desired motion plan, without sensory information. The CNS therefore needs an internal representation, or an inverse model, of the human model and environment. This control technique is fast and does not carry an instability risk, but it has an obvious drawback: sensitivity to unexpected disturbances.
The feedforward control is not able to compensate for perturbations. If these disturbances can be measured, we can make on-the-fly corrections of the movement. This method corresponds to the feedback control of our controller.

3. Motion error minimization: Shadmehr and Mussa-Ivaldi [61] demonstrated that, trial after trial, the CNS reduces motion error through the compensation of environmental forces and the adaptation of the feedforward control. An illustrative example is Kawato's feedback error learning model [37], based on cooperation between two control mechanisms: a feedback loop, which operates in an initial training phase, and a feedforward model, which subsequently emerges. In this model, the feedback error is used as the learning signal for the feedforward model, which gradually compensates for any dynamic disturbances, and thereby learns an internal model of the body dynamics. This learning control model does not converge in unstable situations [52], while the controller described in this paper is better adapted to unstable interaction (see [72] and Sect. 9).

4. Metabolic cost minimization: the CNS optimizes arm impedance to achieve a desired margin of stability while minimizing metabolic cost [10].

Following these human motor control behaviours, we developed a new whole-body control based on feedforward and feedback mechanisms (Fig. 1), inspired by the human ability to adapt force and impedance to deal with stable or unstable situations and to compensate for perturbations [72,24].

3 Overview of adaptive and learning control

Adaptive and learning controls found in the literature can be divided into four groups:

1. Classical adaptive
   – Gain scheduling [2]
   – Model Reference Adaptive Control (MRAC) [35]
   – Self-tuning regulator [3]
   – Self-Oscillating Adaptive Systems (SOAS) [33]
2. Periodic adaptive/learning
   – Iterative Learning Control (ILC) [5]
   – Repetitive Control (RC) [41]
   – Run-to-Run control (R2R) [12]
3. Machine learning
   – Reinforcement learning [9,8]
4. Non-symbolic learning tools
   – Artificial neural networks [32]
   – Fuzzy logic [16]
   – Genetic algorithms [62]

In our study, we wanted to develop an algorithm adapted to unstable interactions, which are inevitable in our context (namely, verification of human factors in industrial work task design). In particular, our case study dealt with the task of clipping small metal parts onto a plastic instrument panel of a vehicle [19]. In this work task, we observed subjects performing the same task repeatedly. When we tried to simulate this task with a DHM, one way to compensate for the repetitive part of the error was to use periodic adaptive/learning control. With this type of controller, a robot performs the same task for numerous iterations, reducing the periodic error at each following trial (a generic sketch of such an iteration-to-iteration update is given below).

If a task has reproducible dynamics or a fixed environment, impedance control is used to impose a desired dynamic behaviour on the interaction between the robot end-effector and the environment [30,14]. Common impedance control techniques require reproducible dynamics (the target impedance model is fixed). For this reason, they are not adapted when the environment changes (the interaction may become unstable). One possible solution for performing unstable tasks is to increase impedance in order to deal with the incorrect force arising from unknown dynamics. Yet, while higher impedance may increase stability in a movement task, it may also lead to instability during interactions with a stiff environment.
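To fix ideas, the following is a minimal sketch of a generic first-order iterative learning control update of the kind referred to above. It is not the controller developed in this paper (the paper's learning laws are Eqs. 13-16 below); the toy plant and the gain value are illustrative choices of ours.

```python
import numpy as np

def ilc_update(u_k, e_k, L_gain):
    # Generic first-order ILC: u_{k+1}(t) = u_k(t) + L * e_k(t)
    return u_k + L_gain * e_k

# Toy repetitive "plant": the achieved trajectory is the command minus
# an unknown but repeatable disturbance d(t), the same at every trial.
T = 100
t = np.linspace(0.0, 1.0, T)
reference = np.sin(np.pi * t)            # desired trajectory over one trial
disturbance = 0.3 * np.cos(2 * np.pi * t)

u = np.zeros(T)                          # feedforward command, trial 0
for k in range(10):
    achieved = u - disturbance           # one trial rollout
    error = reference - achieved         # error recorded during trial k
    u = ilc_update(u, error, L_gain=0.5)
    print("trial %d, max |error| = %.4f" % (k, np.abs(error).max()))
```

Because the disturbance repeats exactly, the error shrinks at each trial; as the text notes, this is precisely the assumption that breaks down in unstable interactions.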
Common periodic adaptive control learns only force from the feedback error. Thus, it is inefficient in unstable situations because the force will be different in each trial due to noise or external perturbations [70]. In addition, common ILC algorithms do not require a low mechanical impedance in order to obtain safety and energy minimization [28]. The algorithm developed below is better adapted to unstable interactions than common ILC algorithms, because it allows the force to change in each trial [11] and achieves low impedance. Learning the optimal force and impedance appropriate for different tasks can help the robot achieve them with minimum error and the least amount of energy (as humans do [23]).

Fig. 2: DHM with skinning and collision geometry (left). Right hand model with skinning and collision geometry (right)

4 Digital human model using simulated physics

4.1 Model of human body and dynamics

In our study, the human body is kinematically modelled as a set of articulated rigid bodies (Fig. 2) organized into a redundant tree structure, which is characterized by its degrees of freedom (dof). Each articulation can be modelled as a number of revolute joints depending on the function of the corresponding human segment. Our DHM therefore comprises 39 joint dof and 6 root dof, with 8 dof for each leg and 7 for each arm. The root is not controlled. For validation purposes, several DHMs have been dimensioned based on each subject's anthropometry [29].

The dynamics of the DHM are described as a second-order system (a numerical sketch follows the notation lists below):

$M\dot{T} + NT + G = L\tau + \sum_j J_{c_j}^T W_{c_j} + \sum_k J_{end_k}^T W_{end_k}^i$  (1)

where $M$ is the generalized inertia matrix; $\dot{T}$ is the acceleration in generalized coordinates; $NT$ represents the centrifugal and Coriolis forces; $G$ is the gravity force in generalized coordinates; $L$ is the matrix that selects the actuated degrees of freedom ($L = [0\;I]^T$, with $0$ the zero matrix and $I$ the identity matrix); $\tau$ is the set of joint torques; $J$ is the Jacobian matrix; and $W$ is the wrench applied by the digital human model on the environment ($W = [\Gamma^T\;F^T]^T$, with $\Gamma$ the moment in cartesian space and $F$ the force in cartesian space).

In the notation of this paper, frames are denoted by subscripts as follows:

– com: center of mass frame
– c: non-sliding contacts at known fixed locations, such as the contact points between the feet and the ground
– end: end-effector frame
– q: joint space
– ρ: ρ-space
– K, B, τ: the learning rate of stiffness, damping or torque

Moreover, the following superscripts are used:

– min: joint stiffness, damping and feedforward torque required to maintain posture stability and to reduce the systematic deviation caused by the interaction with the environment
– d: "desired" values
– l: the learned torque, stiffness or damping
– ini: the initial torque, stiffness or damping
– i: wrench derived from unknown contacts with the environment (interaction wrench)
– ff: feedforward
– fb: feedback
– ob: object
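Purely as an illustration of Eq. (1), the sketch below evaluates the generalized-coordinate dynamics to recover the actuated torques from a known motion (inverse dynamics); all arrays are placeholder stand-ins of made-up size, not quantities exported by the XDE module.

```python
import numpy as np

def actuation_torques(M, NT, G, L, Jc, Wc, Jend, Wend_i, Tdot):
    # Rearranged Eq. (1): L tau = M Tdot + N T + G - Jc^T Wc - Jend^T Wend^i
    rhs = M @ Tdot + NT + G - Jc.T @ Wc - Jend.T @ Wend_i
    # L = [0 I]^T selects the actuated dof, so L^T extracts their rows;
    # the remaining (root) rows must be balanced by the contact wrenches.
    return L.T @ rhs

# Stand-in dimensions: 45 generalized dof (6 unactuated root + 39 joints),
# 8 contact force components, 6 end-effector wrench components.
n, na = 45, 39
L = np.vstack([np.zeros((n - na, na)), np.eye(na)])
tau = actuation_torques(M=np.eye(n), NT=np.zeros(n), G=np.zeros(n), L=L,
                        Jc=np.zeros((8, n)), Wc=np.zeros(8),
                        Jend=np.zeros((6, n)), Wend_i=np.zeros(6),
                        Tdot=np.zeros(n))
```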
4.2 Contacts model

Simulations were based on the XDE physics simulation module developed at CEA-LIST (http://www.kalisteo.fr/lsi/en/aucune/a-propos-de-xde). This module manages the whole physics simulation in real time, including accurate and robust constraint-based methods for contact and collision resolution [45,46].

Friction effects were modelled in compliance with Coulomb's friction law, which can be formulated as:

$\|f_{xy}\| \leq \mu \|f_z\|$  (2)

with $\|f_{xy}\|$ the tangential contact force, $\mu$ the dry friction factor and $\|f_z\|$ the normal contact force.

4.3 Hand model

The hand model, illustrated in Fig. 2, has 20 dof. To control the joint positions $\theta$, we use a simple proportional-derivative (PD) controller. The desired joint positions are a set of positions $\theta^d$ corresponding to different preset grasps.

5 Adaptive controller based on human behaviours

Corresponding to the above analysis of human motion control, we propose a human-like learning controller (Fig. 1) composed of feedforward and feedback controls, both of which are adapted during movements. This controller is inspired by the works of Yang et al. [72] and Ganesh et al. [24].

The proposed controller can deal with both stable and unstable conditions. The learned stiffness, damping and feedforward torque compensate for external perturbations. This behaviour is similar to human adaptation [65].

5.1 Cartesian controller with joint stiffness, damping and torque learning

We describe a cartesian controller that, given a target in cartesian space, learns joint space parameters. An interesting property of this controller is that although it is a cartesian controller, its impedance is learned and distributed according to the limbs' dynamics. As demonstrated in [44], to control limb stiffness and stability, the CNS must increase joint stiffness when an external force is applied to the hand. This result is obtained with our controller in Sect. 9.

The desired cartesian space impedance is:

$K_{end} = J^{\dagger T}_{end,\rho}\left(K_\rho - \dfrac{\partial J^T_{end,\rho}}{\partial \rho} W^i_{end}\right) J^{\dagger}_{end,\rho}$,  $B_{end} = J^{\dagger T}_{end,\rho} B_\rho J^{\dagger}_{end,\rho}$  (3)

where $K$ is the stiffness matrix; $B$ is the damping matrix; $J^{\dagger} = M^{-1}J^T(JM^{-1}J^T)^{-1}$ is the dynamic pseudoinverse matrix, with $J$ a full-rank matrix; and $\rho = Sq$. $S$ is a matrix that selects a part of the actuated degrees of freedom ($S = [I\;0]$) to obtain, in Eq. (1), a dynamic model independent of the non-sliding contact forces at known fixed locations, such as the contacts between the feet and the ground (see Appendix A).

5.2 Overall cost function

As explained in Sect. 2, the CNS minimizes the motion error cost $ME(t)$ (Eq. 5) and the metabolic cost $MC(t)$ [10] (to learn impedance and feedforward torque, a human does not spend extra effort, Eq. 6). We therefore set our overall cost function $C(t)$ as:

$C(t) = ME(t) + MC(t)$  (4)

with:

$ME(t) = \frac{1}{2}\,\epsilon^T(t)\,[J^{\dagger T}_{end,\rho} M_\rho J^{\dagger}_{end,\rho}]\,\epsilon(t)$  (5)

and:

$MC(t) = \frac{1}{2}\int_{t-D}^{t} \tilde{\Phi}^T(\sigma)\, Q^{-1}\, \tilde{\Phi}(\sigma)\, d\sigma$  (6)

where $M_\rho$ is the inertia matrix (see Appendix A) and $Q = \mathrm{diag}(I \otimes Q_K,\; I \otimes Q_B,\; Q_\tau)$. $\epsilon$ is the tracking error commonly used in robotics [63], defined as:

$\epsilon = \delta(V^d, V^r) + b\,\delta(H^d, H^r) \in se(3)$  (7)

with $H^r \in SE(3)$, $H^d \in SE(3)$, $V^r \in se(3)$ and $V^d \in se(3)$, where $SE(3)$ is the special Euclidean group and $se(3)$ is the Lie algebra of $SE(3)$. $\delta(H^d, H^r)$ denotes the displacement (position and orientation) error between the desired and current state; $\delta(V^d, V^r)$ denotes the velocity (linear and angular velocity) error between the desired and current state.

$\tilde{\Phi}(t) = \Phi(t) - \Phi^d(t) = [\mathrm{vec}(K^l_\rho(t))^T, \mathrm{vec}(B^l_\rho(t))^T, (\tau^l_\rho(t))^T]^T - [\mathrm{vec}(K^{min}_\rho(t))^T, \mathrm{vec}(B^{min}_\rho(t))^T, (\tau^{min}_\rho(t))^T]^T = [\mathrm{vec}(\tilde{K}(t))^T, \mathrm{vec}(\tilde{B}(t))^T, \tilde{\tau}(t)^T]^T$  (8)

where $\mathrm{vec}(\cdot)$ is the column vectorization operator, $\tilde{K} = K^l_\rho(t) - K^{min}_\rho(t)$, $\tilde{B} = B^l_\rho(t) - B^{min}_\rho(t)$ and $\tilde{\tau} = \tau^l_\rho(t) - \tau^{min}_\rho(t)$. $K^{min}_\rho$, $B^{min}_\rho$ and $\tau^{min}_\rho$ are the joint stiffness, damping and feedforward torque required to maintain posture stability and to reduce the systematic deviation caused by the interaction with the environment (see Appendix B). In Eq. (8), the function $\Phi(t)$ that adapts stiffness, damping and feedforward torque tends to the minimal value $\Phi^d(t)$ while minimizing metabolic cost [10].
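The following is a minimal numerical sketch, under our own simplifying assumptions, of how the instantaneous motion error cost of Eq. (5) and a discretized version of the metabolic cost of Eq. (6) could be evaluated; the pseudoinverse follows the definition of $J^{\dagger}$ given after Eq. (3), and all inputs are placeholder arrays.

```python
import numpy as np

def dyn_pseudoinverse(J, M):
    # J_dagger = M^-1 J^T (J M^-1 J^T)^-1, as defined after Eq. (3)
    Minv_JT = np.linalg.solve(M, J.T)
    return Minv_JT @ np.linalg.inv(J @ Minv_JT)

def motion_error_cost(eps, J, M_rho):
    # Eq. (5): ME = 1/2 eps^T [Jd^T M_rho Jd] eps, with Jd = J_dagger
    Jd = dyn_pseudoinverse(J, M_rho)
    return 0.5 * eps @ (Jd.T @ M_rho @ Jd) @ eps

def metabolic_cost(phi_tilde_window, Q, dt):
    # Discretized Eq. (6): 1/2 * sum_k phi~_k^T Q^-1 phi~_k * dt
    # over the sliding window [t - D, t]
    Qinv = np.linalg.inv(Q)
    return 0.5 * dt * sum(p @ Qinv @ p for p in phi_tilde_window)
```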
To measure stability, we use the motion error cost $ME$ of Eq. (5). If there exists $\delta > 0$ such that

$\int_t^{t_1} \dot{ME}(\sigma)\, d\sigma < \delta$,  (9)

then the human interaction with the environment is stable in $[t, t_1]$ [36].

5.3 DHM torques

Following the important behaviours of human motor control listed in Sect. 2, we propose a DHM controller composed of feedforward and feedback parts that are adapted during trials:

$\tau_\rho = S\tau^{ff} + S\tau^{fb} - \tau^l_\rho$  (10)

where $\tau^{ff}$ is the torque compensating for the DHM dynamics (the feedforward part of our controller, Sect. 8.1); $\tau^l_\rho$ (Eq. 16) is the learned feedforward torque, which depends on the feedback error; and $\tau^{fb} = -L^T(J^T_{com}F_{com} + J^T_{end}W^d_{end} + J^T_c \Delta f_c)$ is the torque compensating for trajectory errors (the feedback part of our controller, Sect. 8.2). $\Delta f_c$ denotes the contact forces. $W^d_{end}$ is the desired task wrench of Eq. (11), which adapts the stiffness and damping of Eq. (8).

The desired task wrench $W^d_{end}$ is computed by using an adaptive proportional-derivative (PD) feedback control law:

$W^d_{end} = K^l_{end}\,\delta(H^d, H^r) + B^l_{end}\,\delta(V^d, V^r) + B^{ini}_{end}\,\epsilon = (K^l_{end} + bB^{ini}_{end})\,\delta(H^d, H^r) + (B^l_{end} + B^{ini}_{end})\,\delta(V^d, V^r)$  (11)

where $K_{end}$ and $B_{end}$ denote the cartesian stiffness and damping matrices respectively. As explained in Sect. 5.1, our controller learns joint space parameters using Eqs. (13) and (14). To pass from joint to cartesian stiffness and damping, we use Eq. (3). It is important to remember that joint space and ρ-space are related by the relationship $\rho = Sq$.

$B^{ini}_{end}$ is chosen according to:

$B^{ini}_{end} = J^{\dagger T}_{end,\rho} B^{ini}_\rho J^{\dagger}_{end,\rho}$  (12)

with $B^{ini}_\rho$ a symmetric positive definite matrix with minimal eigenvalue $\lambda_{min}(B^{ini}_\rho) \geq \lambda_B > 0$. This minimal feedback matrix ensures stable and compliant motion control. It corresponds to the mechanical properties of the passive muscles of the relaxed human arm [54].

5.4 Learning laws

In order to vary the mechanical control of a limb over time, the cerebellum plays an important role in the human motor learning process, forming and storing associated muscle activation patterns. According to Smith [64], stiffness varies throughout the movement. Based on the human properties detailed in Sect. 2, the stiffness $K^l_\rho(t)$ and damping $B^l_\rho(t)$ are adapted as follows:

$K^l_\rho(t, k+1) = K^l_\rho(t, k) + Q_K\{J^{\dagger}_{end,\rho}[\epsilon(t, k)\,\delta(H^d, H^r)^T]\,J^{\dagger T}_{end,\rho} - \gamma(t)\,K^l_\rho(t, k)\}$  (13)

$B^l_\rho(t, k+1) = B^l_\rho(t, k) + Q_B\{J^{\dagger}_{end,\rho}[\epsilon(t, k)\,\delta(V^d, V^r)^T]\,J^{\dagger T}_{end,\rho} - \gamma(t)\,B^l_\rho(t, k)\}$  (14)

with $K^l_\rho(t, k=0) = 0_{[n_\rho,n_\rho]}$ and $B^l_\rho(t, k=0) = 0_{[n_\rho,n_\rho]}$, $t \in [0, D)$, where $Q_K$ and $Q_B$ are symmetric positive definite constant gain matrices.

The forgetting factor of learning $\gamma$ is defined by:

$\gamma(t) = \dfrac{p}{1 + u\,\|\epsilon(t)\|^2}$  (15)

with positive values $p$ and $u$. To obtain convergence, we need $\gamma(t) > 0$ (see Appendix B). The learning response speed can be tuned through the choice of $p$ and $u$. If $\gamma(t)$ is large, torque and impedance learning will be slow; if $\gamma(t)$ is small, we will obtain slow torque and impedance unlearning.

Unlike the constant value of $\gamma$ in [24], the time-varying definition of $\gamma$ in Eq. (15) has the following advantage: when $\epsilon(t)$ is large, $\gamma(t)$ is small and vice versa. For this reason, we have a controller that quickly increases torque and impedance during bad tracking performance and quickly decreases torque and impedance during good tracking performance.

The learned feedforward torque is adapted through:

$\tau^l_\rho(t, k+1) = \tau^l_\rho(t, k) + Q_\tau[J^{\dagger}_{end,\rho}\,\epsilon(t, k) - \gamma(t, k)\,\tau^l_\rho(t, k)]$  (16)

with $\tau^l_\rho(t, k=0) = 0_{[n_\rho,1]}$, $t \in [0, D]$, and $Q_\tau$ a symmetric positive definite constant matrix (a numerical sketch of these updates follows).
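A minimal numerical sketch of one trial-to-trial iteration of the adaptation laws of Eqs. (13), (14) and (16) follows; it assumes precomputed error signals and the dynamic pseudoinverse of Eq. (3), and uses scalar learning gains standing in for the matrices $Q_K$, $Q_B$, $Q_\tau$ (the scalar values echo those reported in Sect. 9).

```python
import numpy as np

def forgetting_factor(eps, p=1.0, u=5.0):
    # Eq. (15): gamma(t) = p / (1 + u ||eps||^2)
    return p / (1.0 + u * np.dot(eps, eps))

def adapt(K_l, B_l, tau_l, Jd, eps, dH, dV, QK=8.0, QB=0.8, Qtau=1.0):
    # One update of learned stiffness, damping and feedforward torque.
    # Jd is J_dagger_{end,rho}; eps, dH, dV are the 6-vectors of Eq. (7).
    gamma = forgetting_factor(eps)
    K_l = K_l + QK * (Jd @ np.outer(eps, dH) @ Jd.T - gamma * K_l)    # Eq. (13)
    B_l = B_l + QB * (Jd @ np.outer(eps, dV) @ Jd.T - gamma * B_l)    # Eq. (14)
    tau_l = tau_l + Qtau * (Jd @ eps - gamma * tau_l)                 # Eq. (16)
    return K_l, B_l, tau_l

# Initialization as in the paper: zero learned terms at trial k = 0.
n_rho = 7
K_l = np.zeros((n_rho, n_rho))
B_l = np.zeros((n_rho, n_rho))
tau_l = np.zeros(n_rho)
```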
The diagonal learning rate matrices $Q_K$, $Q_B$ and $Q_\tau$ are chosen empirically. In particular, we choose $Q_K > Q_\tau$ because human stiffness increases faster than feedforward torque [10], and $Q_B = Q_K/b$ according to Eq. (11).

6 Trajectory planner based on human psychophysical principles

A movement can be characterized, independently of the end-effector, by:

– the initial and final points of the trajectory (position and orientation)
– obstacle positions (via-points of the trajectory)
– duration

Experimental study of human movements has shown that voluntary movements obey the following three major psychophysical principles:

– Hick-Hyman's law: the average reaction time $TR_{ave}$ required to choose among $n$ probable choices depends on their logarithm [34]:

$TR_{ave} = d \log_2(n + 1)$  (17)

– Fitts' law: the movement time depends on the logarithm of the relative accuracy (the ratio between movement amplitude and target dimension) [21]:

$D = g + z \log_2(2\Upsilon/P)$  (18)

where $D$ is the duration time, $\Upsilon$ is the amplitude, $P$ is the accuracy, and $g$ and $z$ are empirically determined constants.

– Kinematics invariance: hand movements have a bell-shaped speed profile in straight reaching movements [50]. The speed profile is independent of the movement direction and amplitude. For more complex trajectories (e.g. handwriting), the same principle predicts a correlation between speed and curvature [51], described by the 2/3 power law:

$\dot{s}(t) = Z_s R^{1-\frac{2}{3}}$  (19)

where $\dot{s}(t)$ is the tangential velocity, $R$ is the radius of curvature and $Z_s$ is a proportionality constant, also termed the "velocity gain factor". For this reason, more complex trajectories can be divided into overlapping basic trajectories similar to reaching movements. Such spatio-temporal invariant features of normal movements can be explained by a variety of criteria of maximum smoothness, such as the minimum jerk criterion [22] or the minimum torque-change criterion [67].

We implemented a modified minimum jerk criterion with via-points to calculate trajectories and avoid obstacles (a sketch of the basic point-to-point profile is given below). The original minimum-jerk model in [22] may fail to predict the hand path and can only be applied to average data, because it predicts a single optimum movement for given via-points. Unlike the original minimum jerk model, the 2/3 power law can be applied to all movements; the main problem with this method is that the formula predicts speed from paths. In this study, we therefore chose Todorov's model [66], which combines the original minimum-jerk model and the 2/3 power law model and uses a path observed in a specific trial to predict the speed profile. Todorov's model substitutes a smoothness constraint for the 2/3 power law (see Appendix C.2). This model is validated and compared to the 2/3 power law in [66] for four tasks with a specified path.
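As a reference for the smoothness criterion just discussed, here is a minimal sketch of the classic point-to-point minimum-jerk profile with rest-to-rest boundary conditions (the closed-form 5th-order polynomial of [22]); it illustrates the bell-shaped speed profile only, not the via-point algorithm of Todorov's model used in the paper. The numerical values are illustrative, with the 1.3 s duration borrowed from Sect. 9.1.

```python
import numpy as np

def min_jerk(x0, xf, D, t):
    # Rest-to-rest minimum-jerk position and velocity at time t in [0, D]
    s = np.clip(t / D, 0.0, 1.0)
    shape = 10 * s**3 - 15 * s**4 + 6 * s**5           # position profile
    dshape = (30 * s**2 - 60 * s**3 + 30 * s**4) / D   # its time derivative
    x = x0 + (xf - x0) * shape
    v = (xf - x0) * dshape
    return x, v

# Bell-shaped speed profile of a 1.3 s reach, peaking at mid-movement.
x0 = np.array([0.0, 0.0, 0.0])
xf = np.array([0.4, 0.1, 0.2])
for t in np.linspace(0.0, 1.3, 7):
    x, v = min_jerk(x0, xf, 1.3, t)
    print("t=%.2f  speed=%.3f m/s" % (t, np.linalg.norm(v)))
```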
For a given hand path in space, Todorov's model [66] assumes that the speed profile is the one that minimizes the third derivative of position (also named "jerk"):

$\mathrm{Jerk} = \int_0^D \left\| \dfrac{d^3}{dt^3}\, r[s(t)] \right\|^2 dt$  (20)

with $r(s) = [x(s), y(s), z(s)]$ a 3D hand path and $s$ the curvilinear coordinate. According to this approach, minimization is performed only over the speed profiles because the path is specified. The formal definition of the term inside the integral of Eq. (20) is given in Appendix C.1.

In the original minimum jerk model [22], the minimum jerk trajectory is a 5th-order polynomial in $t$. Using the end-point constraints, we can compute the coefficients of this polynomial. The trajectory and speed are found for a given set of via-points, and thus the hand is constrained to pass through the via-points at definite times. To calculate the minimum jerk trajectory, it is necessary to give passage times $T_P$, positions $x$, velocities $v$ and accelerations $a$. In Todorov's model, the passage times $T_P$ are not defined a priori, but are determined by the algorithm explained below.

To find the optimal jerk for any given passage times $T_P$ and intermediate points $x$, Todorov's model minimizes the jerk with respect to $v$ and $a$ by setting the gradient to zero and solving the resulting system of linear equations. To find the intermediate times $T_P$, the method uses a nonlinear simplex method to minimize the optimal jerk over all possible passage times.

In the same way as for translations, the speed profile of a rotation is the one that minimizes the third derivative of orientation (or "jerk"), with a 3D rotation path $r(s) = [\alpha(s), \beta(s), \gamma(s)]$.

In brief, to calculate the minimum jerk trajectory for the rotations and the translations, we need to provide the positions $X$, the initial and final velocities $V$ and the initial and final accelerations $A$. An illustrative example of a minimum jerk trajectory simulation is given in [19], and a comparison between real human data and simulations is given in [20].

7 Duration time based on human laws

Duration times are chosen a priori following the 3D Fitts' law proposed in [27] for a pointing task. The reach and position states are similar to a pointing task at trivariate targets, and we therefore use the equation in [27] to calculate the movement time $D$ (a sketch of this computation is given at the end of this section):

$D \approx 56 + 5208 \log_2\left(\sqrt{f_w(\theta)\left(\dfrac{\Upsilon}{w}\right)^2 + \dfrac{1}{9.2}\left(\dfrac{\Upsilon}{h}\right)^2 + f_d(\theta)\left(\dfrac{\Upsilon}{d}\right)^2} + 1\right)$  (21)

with $f_w(0°) = 0.211$, $f_w(45°) = 0.242$, $f_w(90°) = 0.717$, $f_d(0°) = 0.194$, $f_d(45°) = 0.147$ and $f_d(90°) = 0.312$. $\Upsilon$ is the distance (or amplitude), $\theta$ is the movement angle (the human user's axis of movement), $w$ is the width measured along the movement axis, $h$ is the height measured along the Z-axis, and $d$ is perpendicular to both (see Fig. 3).

Fig. 3: w, h and d measurements for the 3D Fitts' law
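A direct transcription of Eq. (21) follows as a minimal sketch; the interpolation of $f_w$ and $f_d$ between the three tabulated angles is our own assumption (the paper only gives the three values), and the result is presumably in milliseconds, as the constants suggest.

```python
import numpy as np

# Tabulated coefficients from Eq. (21)
ANGLES = np.array([0.0, 45.0, 90.0])
FW = np.array([0.211, 0.242, 0.717])
FD = np.array([0.194, 0.147, 0.312])

def movement_time(amp, w, h, d, theta_deg):
    # 3D Fitts' law of Eq. (21); amp, w, h, d in the same length unit.
    fw = np.interp(theta_deg, ANGLES, FW)  # assumption: linear interpolation
    fd = np.interp(theta_deg, ANGLES, FD)  # assumption: linear interpolation
    index = np.sqrt(fw * (amp / w)**2 + (amp / h)**2 / 9.2
                    + fd * (amp / d)**2)
    return 56 + 5208 * np.log2(index + 1)

# Example: a 0.4 m reach towards a 5 x 5 x 5 cm target at 45 degrees.
print("D = %.0f ms" % movement_time(0.4, 0.05, 0.05, 0.05, 45.0))
```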
8 Feedforward and feedback control

The optimization framework (see Fig. 4) is based on a combined anticipatory feedforward and feedback control system built on underlying notions of the acceleration-based control method [1,15] and a Jacobian-transpose (JT) control method [57,42,18].

These controllers are formulated as two successive Quadratic Programming (QP) controllers (Fig. 4), each of them dealing with a great number of dof and solving all constraint equations simultaneously. The controller computes joint torques that achieve different objectives and satisfy different constraints. In our multi-objective control, a task means that a certain frame on the DHM body should be transferred from an initial state to a desired state.

8.1 Feedforward

During the feedforward phase, the objectives are:

1. Objective based on acceleration control. This feedforward action compensates for the low-frequency, rigid body behaviour of the DHM dynamics. The goal is to minimize the difference between the actual acceleration $A$ and the desired acceleration $A^d$ found by the minimum jerk trajectory planner. $A$ is expressed in terms of the unknowns of the system $\dot{T}$ as:

$V = JT$,  $A = J\dot{T} + \dot{J}T$  (22)

with $J$ the Jacobian matrix expressed in its own frame.

2. Regularization of the QP problem: to regularize the QP problem, we set the desired torque $\tau^d$, the desired contact force $f^d_c$ and the desired acceleration $\dot{T}^d$ to zero.

During the feedforward phase, the constraints are:

1. Dynamic equation. As explained in Sect. 2, the CNS is able to predict dynamics. We therefore set the DHM dynamics of Eq. (1) as a feedforward constraint.

2. Contact point accelerations. To help maintain contacts, the contact acceleration must be null:

$A_c = J_c\dot{T} + \dot{J}_c T = 0$  (23)

3. Non-sliding contacts. The non-sliding contacts are expressed as a set of inequality constraints. Contact constraints are imposed at the contact points between the feet and the ground. The contact force $f_c$ should remain within the friction cone. The linearized Coulomb friction model [1] is applied, in which the friction cone of each contact is approximated by a four-faced polyhedral convex cone. The contact constraints are formulated as:

$E_{c_i} f_{c_i} + d_{c_i} \geq 0$  (24)

where $E_{c_i}$ is the approximated friction cone and $d_{c_i}$ is a user-defined margin vector, so that the projection of $f_{c_i}$ on the normal vector of each facet of the friction cone is kept larger than $d_{c_i}$ (the inequality is written here with the same sign convention as Eqs. 26 and 29).

Fig. 4: Block diagram of the cartesian control framework

We summarize the feedforward phase as:

$\hat{O} = \arg\min_{\tau^{ff}, \dot{T}, f_c} \frac{1}{2} \left\| \begin{bmatrix} \tau^{ff} \\ \dot{T} \\ f_c \end{bmatrix} - \begin{bmatrix} \tau^{ff,d} \\ \dot{T}^d \\ f^d_c \end{bmatrix} \right\|^2_Q$  (25)

subject to:

$M\dot{T} + NT + G = L\tau + J^T_c f_c$,  $E_c f_c + d_c \geq 0$,  $J_c\dot{T} + \dot{J}_c T = 0$  (26)

The optimization objective is the same for each task: to minimize the error between each variable and its desired value. The objectives are combined in the diagonal weight matrix $Q$, whose values are chosen according to the priorities of the different objectives. With this optimization, we obtain $\tau^{ff}$, $f_c$ and $\dot{T}$ (a sketch of this QP in a declarative solver is given below).
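To illustrate the structure of Eqs. (25)-(26), here is a minimal declarative sketch using the cvxpy modelling library, a choice of ours (the paper does not specify its QP solver); all matrices are placeholders of made-up size, and the decision variables mirror $[\tau^{ff}; \dot{T}; f_c]$.

```python
import numpy as np
import cvxpy as cp

n, na, nc = 45, 39, 8          # generalized dof, actuated dof, contact forces
tau = cp.Variable(na)          # feedforward torques
Tdot = cp.Variable(n)          # generalized accelerations
fc = cp.Variable(nc)           # contact forces

# Placeholder model data (in the real controller these would come from
# the physics engine at each simulation step).
M = np.eye(n); NT_G = np.zeros(n)
L = np.vstack([np.zeros((n - na, na)), np.eye(na)])
Jc = np.zeros((nc, n)); dJcT = np.zeros(nc)
Ec = np.eye(nc); dc = np.zeros(nc)
w_tau, w_acc, w_fc = 1.0, 1.0, 1.0      # diagonal weights of Q

# Eq. (25): weighted distance to the (here zero) desired values.
objective = cp.Minimize(0.5 * (w_tau * cp.sum_squares(tau)
                               + w_acc * cp.sum_squares(Tdot)
                               + w_fc * cp.sum_squares(fc)))
# Eq. (26): dynamics, linearized friction cones, null contact acceleration.
constraints = [M @ Tdot + NT_G == L @ tau + Jc.T @ fc,
               Ec @ fc + dc >= 0,
               Jc @ Tdot + dJcT == 0]
cp.Problem(objective, constraints).solve()
```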
8.2 Feedback

In the feedback part, for each task, we imagine that a virtual wrench is applied at a certain frame on the DHM body to guide its motion towards a given target. These virtual wrenches are computed by solving an optimization problem. In the feedback phase the objectives are:

1. COM position. The dynamic controller maintains the DHM balance by imposing that the horizontal-plane projection of the COM lie within a convex support region [7]. For this COM-tracking objective, we consider only the force component, and $F_{com}^d$ is obtained by a PD control in $\mathbb{R}^3$ on the error between the actual and desired COM positions:

$$F_{com}^d = K_{com}(x_{com}^d - x_{com}^r) + B_{com}(v_{com}^d - v_{com}^r) \quad (27)$$

where $K_{com}$ and $B_{com}$ are the proportional and derivative gain matrices respectively.

2. End-effector. The end-effector task is used for performing specific motions. In this paper the objective is to realize point-to-point movement with the human-based learning controller in Eq. 11.

3. Minimize the difference between the actual contact force and the feedforward contact force: $\Delta f_c^d = 0_{(3n_{f_c},1)}$ with $Q_{\Delta f_c} = w_{\Delta f_c}I_{3n_{f_c}}$.

In the feedback phase the constraints are:

1. Static equilibrium. The wrenches are constrained by the static equilibrium of the DHM:

$$L\tau^{fb} = -J_{com}^TF_{com} - J_{end}^TW_{end}^d - \sum_i J_{c_i}^T\Delta f_{c_i} \quad (28)$$

2. Non-sliding contacts:

$$E_c(f_c + \Delta f_c) + d_c \ge 0 \quad (29)$$

We summarize the feedback phase as:

$$\hat{O} = \arg\min_{F_{com},\,\tau^{fb},\,\Delta f_c} \frac{1}{2}\left\| \begin{bmatrix} F_{com} \\ W_{end} \\ \Delta f_c \end{bmatrix} - \begin{bmatrix} F_{com}^d \\ W_{end}^d \\ \Delta f_c^d \end{bmatrix} \right\|^2_Q \quad (30)$$

subject to:

$$\begin{cases} L\tau^{fb} = -J_{com}^TF_{com} - J_{end}^TW_{end}^d - J_c^T\Delta f_c \\ E_c(f_c + \Delta f_c) + d_c \ge 0 \end{cases} \quad (31)$$

The optimization objective is the same for each task: to minimize the error between the variable and its desired value. The objectives are combined in the diagonal weight matrix $Q$, whose values are chosen according to the priorities of the different objectives. With this optimization, we obtain $F_{com}$, $W_{end}$ and $\Delta f_c$. The feedback joint torque is:

$$\tau^{fb} = -L^T(J_{com}^TF_{com} + J_{end}^TW_{end}^d + J_c^T\Delta f_c) \quad (32)$$

9 Results

Our simulation framework requires a PC running a Python 2.7 environment with XDE modules. With a simulation step of 0.01 s, the joint torques are calculated in quasi-real-time (the computation takes 1.5 times the simulated duration) on a PC equipped with an Intel Xeon E5630 (12 MB cache, 2.53 GHz, 24 GB of RAM).

Several simulations were run using our new joint stiffness, damping and torque-learning cartesian controller. A first case study dealt with a fictional hand task; a second case study dealt with an experimental assembly task. All simulations consisted of controlling a 45-dof DHM, with 6 dof for the root position and orientation, using actuators/muscles producing joint torques $\tau$ in a 6-dimensional Cartesian task space characterized by an interaction external wrench $W_{end}^i$, while tracking the minimum-jerk task reference trajectory detailed in Sect. 6. The wrench is derived from the contacts or given by an imposed wrench field. There are four contact points on each foot.

During the experimental task, we observed that torso orientation varied very little. We therefore add an objective to maintain the desired torso orientation equal to its initial orientation.

The optimization weights for the different objectives are: $10^4$ for the COM, $5\cdot10^3$ for the right hand task, $10^1$ for the posture, $10^2$ for the head, $10^2$ for the torso, $10^0$ for the contact task and $10^2$ for the gravity compensation. These weights are chosen empirically based on the estimated importance and priorities of the different objectives.

The learning rate matrices $Q_K$, $Q_B$ and $Q_\tau$ in [72] have been changed for different applications; they are chosen empirically based on the importance and priorities of the different objectives. We choose $Q_K > Q_\tau$ because human stiffness increases faster than feedforward torque [10]. The controller parameters are selected as $Q_K = \mathrm{diag}[8.]_{(n_\rho,n_\rho)}$, $Q_B = \mathrm{diag}[0.8]_{(n_\rho,n_\rho)}$, $Q_\tau = \mathrm{diag}[1.]_{(n_\rho,n_\rho)}$, $a = 0.2$, $u = 5$, $b = 10$ for all simulations.

In [20], we used this controller to simulate an experimental insert-clipping activity in quasi-real-time and applied the simulated postures, times and exertions to an OCRA index-based ergonomic assessment [53]. Given only scant information on the scenario (typically initial and final operator positions and clipping force), the simulated ergonomic evaluations fell in the same risk area as those based on experimental human data. In addition, the DHM trajectories are similar to the real trajectories.
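The qualitative behaviour of this kind of learning (stiffness, damping and feedforward effort grow with the tracking error and decay through a forgetting term, cf. Eq. 64 in Appendix B) can be illustrated with a deliberately simplified 1-DOF toy. Everything below is our own stand-in: a scalar unit-mass plant, scalar gains, a constant perturbation (where the paper uses richer interaction wrenches), and an assumed error sign convention. It is not the 45-dof controller above.

```python
import numpy as np

dt, steps = 0.01, 200                       # one iteration = 2 s of simulated time
b_ini = 5.0                                 # fixed intrinsic damping (role of B_end^ini)
b = 10.0                                    # b as in the combined tracking error
qk, qb, qtau, gamma = 8.0, 0.8, 1.0, 0.05   # toy learning rates and forgetting factor
K = B = tau_ff = 0.0                        # learned stiffness, damping, feedforward

for it in range(225):
    w_ext = 3.0 if 75 <= it < 149 else 0.0  # perturbation during iterations 76-149
    x, v = 0.1, 0.0                         # reset to the start point each iteration
    for _ in range(steps):                  # target: x_d = 0, v_d = 0
        e, edot = -x, -v                    # errors as desired minus actual
        eps = edot + b * e                  # combined tracking error
        u = tau_ff + K * e + (b_ini + B) * edot
        v += (u + w_ext) * dt               # unit mass: a = u + w_ext
        x += v * dt
        # error-driven growth minus forgetting (the shape of Eq. 64),
        # clamped at zero for robustness of this toy:
        K = max(0.0, K + dt * qk * (eps * e - gamma * K))
        B = max(0.0, B + dt * qb * (eps * edot - gamma * B))
        tau_ff += dt * qtau * (eps - gamma * tau_ff)
    if it in (74, 148, 224):                # end of each phase
        print(f'iteration {it + 1:3d}: K = {K:6.2f}, B = {B:5.2f}, tau_ff = {tau_ff:6.2f}')
```

Run as-is, the learned terms rise while the perturbation is applied and decay back towards small values once it is removed, mirroring the trends reported for Figs. 5-7.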
9.1 Free hand movement

The first case studied is a point-to-point movement: the right hand goes from the initial right hand position to the insert position. At the start of the simulation, the insert is placed on the table (Fig. 8a), and the DHM body is upright with its arms along the body. This reproduces the grasping action of the experimental task in [19].

The movement duration $D = 1.3$ s is chosen according to the 3D Fitts' law proposed in Sect. 7 for a pointing task. A constant interaction external wrench $W_{end}^{ext} = [0\ \mathrm{N\cdot m}, 0\ \mathrm{N\cdot m}, 0\ \mathrm{N\cdot m}, 3\ \mathrm{N}, 3\ \mathrm{N}, 3\ \mathrm{N}]^T$ is applied to the right hand during the motion. Adaptation is simulated for 225 iterations. At the end of each iteration the joint position is reset to the start point and the joint velocity and acceleration are reset to zero.

In the first phase (iterations 1-75) the interaction wrench is absent. In the second phase (iterations 76-149) a constant perturbation wrench $W_{end}^i = W_{end}^{ext}$ is applied to the right hand. In the third phase (iterations 150-225) the interaction wrench is again absent.

As demonstrated in Fig. 5, the DHM increases its joint stiffness in order to maintain limb stability in the presence of external forces applied at the hand [10,44]. We note that the error decreases (see Fig. 6): initially divergent trajectories become convergent and successful after learning. We observe a pattern of stiffness and feedforward torque (Fig. 7) similar to the experiments in [10]. This derives from stiffness and damping adaptation compensating for the unstable interaction, without a large modification of the feedforward torque. We note that the limb stiffness converges to small values when no forces are applied at the hand. The lower-magnitude joint stiffness is typical of a human subject acting in a zero force field [44].

9.2 Simulation of an insertion with a virtual object

In the second case study, we simulate the insertion task of [19]. The interaction external wrench is derived from the contacts between the insert and the virtual object represented in Fig. 8a. The digital mock-up (DMU) scenario is represented in Fig. 8a. It reproduces the experimental environment of [19] by ensuring geometric similarity. The inputs used to build the DMU scenario are the workplace spatial organization (x, y and z dimensions), the insert and tool descriptions (x, y, z positions and weight) and the DHM position.

In Figs. 9 and 8b we show the results of the simulation when the right hand goes from the virtual object center $x_{ob}$ to the position $x = x_{ob} + [0\ \mathrm{m}, 0.03\ \mathrm{m}, -0.02\ \mathrm{m}]$ (the reference frame is represented in Fig. 8a). Adaptation is simulated for 50 iterations. We note in Fig. 8b that the asymptotic force slightly decreases. We demonstrated this human behavior in human subject experiments [17].

10 Conclusion and future works

In this paper, we have described a multi-objective control of digital human models based on the human ability to learn novel task dynamics through the minimization of instability, error and effort. Our controller has been validated with a 45-dof DHM. For this paper, we applied our algorithm to a rather simple case study, of limited impact relative to the complexity of actual work gestures. In order to confirm the encouraging results and to give the desired genericity to our controller and DHM, we plan to do additional theoretical and practical work.
One improvement will be to enrich the prehension simulation. For our case study, we explicitly specified the type of grasp (palmar, pinch, full-handed) and the orientation of the object in the operator's hand, according to the final orientation (the object is attached to the hand). In the near future, we plan to introduce prehension functions in our kinematic model. In order not to make our kinematic model heavier (20 segments and 28 additional dof per hand [49]), we propose to replace the wrist and fingers by a dedicated end-effector whose characteristics (number of joints, types, rotational and translational ranges) would mimic the dof observed for each type of grasp [47]. For example, this effector would have more dof in pinch mode than in full-handed grasp mode.

Another improvement will involve the parametrization of the controller. Currently, our controller needs several parameters to be set, for instance the task weights. In our case study, we set those parameters empirically. To improve the genericity of our algorithm, the task weights could be calculated automatically without requiring any manual tuning [58].

Fig. 5: Learned joint stiffness and damping (mean over one period): (a) joint stiffness; (b) joint damping
Fig. 6: (a) Cartesian position [m] and orientation [rad] error (mean over one period); (b) Cartesian linear [m/s] and angular [rad/s] velocity error (mean over one period)
Fig. 7: Learned force (mean over one period)
Fig. 8: (a) DMU scenario and virtual object used to model insertion, with the stiffness K_obj of the object equal to 1000 N/m in the four directions; (b) average interaction force during insertion
Fig. 9: (a) Learned joint stiffness during insertion (mean over one period); (b) Cartesian position [m] and orientation [rad] error during insertion (mean over one period)

References

1. Abe, Y., Silva, M.D., Popović, J.: Multiobjective control with frictional contacts. In: Proc. ACM SIGGRAPH/EG Symposium on Computer Animation, pp. 249-258. Aire-la-Ville, Switzerland (2007)
2. Andreiev, N.: A process controller that adapts to signal and process conditions. Control Engineering 38 (1977)
3. Astrom, K., Borrison, U., Wittenmark, B.: Theory and application of self-tuning regulators. Automatica 13, 457-476 (1977)
4. Badler, N.: Virtual humans for animation, ergonomics, and simulation. In: Proceedings of the IEEE workshop on non-rigid and articulated motion, pp. 28-36 (1997)
5. Bien, Z., Xu, J.: Iterative learning control: analysis, design, integration and applications. Kluwer Academic Publishers, Norwell, MA, USA (1998)
6. Bradwell, B., Li, B.: A tutorial on motion capture driven character animation. In: Eighth IASTED International Conference on Visualization, Imaging, and Image Processing. Palma de Mallorca (2008)
7. Bretl, T., Lall, S.: Testing static equilibrium for legged robots. IEEE Transactions on Robotics 24, 794-807 (2008)
8. Buchli, J., Stulp, F., Theodorou, E., Schaal, S.: Learning variable impedance control. The International Journal of Robotics Research 30, 820-833 (2011)
9. Buchli, J., Theodorou, E., Schaal, S.: Reinforcement learning of full-body humanoid motor skills. In: 10th IEEE-RAS International Conference on Humanoid Robots (Humanoids), pp. 405-410 (2010)
10. Burdet, E., Osu, R., Franklin, D., Milner, T., Kawato, M.: The central nervous system stabilizes unstable dynamics by learning optimal impedance.
Nature 414, 446-449 (2001)
11. Burdet, E., Tee, K.P., Mareels, I., Milner, T.E., Chew, C., Franklin, D.W., Osu, R., Kawato, M.: Stability and motor adaptation in human arm movements. Biological Cybernetics 94, 20-32 (1998)
12. Castillo, E.: Run-to-run process control: literature review and extensions. J. Qual. Technol. 29, 184-196 (1997)
13. Chaffin, D.: Human motion simulation for vehicle and workplace design. Human Factors and Ergonomics in Manufacturing 17, 475-484 (2007)
14. Cheah, C., Wang, D.: Learning impedance control for robotic manipulators. IEEE Transactions on Robotics and Automation 14, 452-465 (1998)
15. Colette, C., Micaelli, A., Andriot, C., Lemerle, P.: Robust balance optimization control of humanoid robots with multiple non-coplanar grasps and frictional contacts. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 3187-3193. Pasadena, USA (2008)
16. Commuri, S., Lewis, F.: Adaptive-fuzzy logic control of robot manipulators. In: IEEE International Conference on Robotics and Automation, vol. 3, pp. 2604-2609. Minneapolis, MN (1996)
17. De Magistris, G.: Dynamic digital human model control design for the assessment of the workstation ergonomics. PhD Thesis, Pierre and Marie Curie University (2013)
18. De Magistris, G., Micaelli, A., Andriot, C., Savin, J., Marsot, J.: Dynamic virtual manikin control design for the assessment of the workstation ergonomy. In: First International Symposium on Digital Human Modeling. Lyon (2011)
19. De Magistris, G., Micaelli, A., Evrard, P., Andriot, C., Savin, J., Gaudez, C., Marsot, J.: Dynamic control of DHM for ergonomic assessments. International Journal of Industrial Ergonomics 43, 170-180 (2013)
20. De Magistris, G., Micaelli, A., Savin, J., Gaudez, C., Marsot, J.: Dynamic digital human model for ergonomic assessment based on human-like behaviour and requiring a reduced set of data for a simulation. In: Second International Digital Human Model Symposium 2013. Ann Arbor, USA (2013)
21. Fitts, P.: The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology 47, 381-391 (1954)
22. Flash, T., Hogan, N.: The coordination of arm movements: an experimentally confirmed mathematical model. Journal of Neuroscience 7, 1688-1703 (1985)
23. Franklin, D., Burdet, E., Osu, R., Tee, K., Chew, C., Milner, T., Kawato, M.: CNS learns stable, accurate, and efficient movements using a simple algorithm. Journal of Neuroscience 28, 11165-11173 (2008)
24. Ganesh, G., Albu-Schaeffer, A., Haruno, M., Kawato, M., Burdet, E.: Biomimetic motor behavior for simultaneous adaptation of force, impedance and trajectory in interaction tasks. In: IEEE International Conference on Robotics and Automation. Anchorage, Alaska, USA (2010)
25. Gomi, H., Osu, R.: Task-dependent viscoelasticity of human multijoint arm and its spatial characteristics for interaction with environments. Journal of Neuroscience 18, 8965-8978 (1998)
26. Gribble, P., Ostry, D.: Compensation for interaction torques during single and multijoint limb movement. Journal of Neurophysiology 82, 2310-2326 (1999)
27. Grossman, T., Balakrishnan, R.: Pointing at trivariate targets in 3D environments. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 447-454. New York (2004)
28. Haddadin, S., Albu-Schäffer, A., Hirzinger, G.: Requirements for safe robots: measurements, analysis & new insights.
International Journal on Robotics Research 28, 1507-1527 (2009)
29. Hanavan, E.: A mathematical model of the human body. Wright-Patterson Air Force Base, Report No. AMRL-TR-64-102 (1964)
30. Hogan, N.: Impedance control: an approach to manipulation. Part I: Theory; Part II: Implementation; Part III: Applications. Transactions ASME J. Dynamic Systems, Measurement and Control 107, 11-24 (1985)
31. Hogan, N.: Mechanical impedance of single- and multi-articular systems. In: J.M. Winters & S.L. Woo (eds.), Multiple Muscle Systems: Biomechanics and Movement Organization. Springer-Verlag (1990)
32. Hovland, G., Sikka, P., McCarragher, B.: Skill acquisition from human demonstration using a hidden Markov model. In: IEEE International Conference on Robotics and Automation, vol. 3, pp. 2706-2711. Minneapolis, MN (1996)
33. Hsu, L.: Self-oscillating adaptive systems (SOAS) without limit-cycles. In: Proceedings of the American Control Conference, vol. 13. Albuquerque, New Mexico (1997)
34. Hyman, R.: Stimulus information as a determinant of reaction time. Journal of Experimental Psychology 45, 188-196 (1953)
35. Ioannou, P., Kokotović, P.: Adaptive systems with reduced models. Lecture Notes in Control and Information Sciences 47 (1983)
36. Jagannathan, S.: Neural Network Control of Nonlinear Discrete-Time Systems. CRC Press Taylor & Francis Group, FL (2006)
37. Kawato, M., Gomi, H.: A computational model of four regions of the cerebellum based on feedback error learning. Biological Cybernetics 69, 95-103 (1992)
38. Khatib, O., Sentis, L., Park, J., Warren, J.: Whole-body dynamic behavior and control of human-like robots. International Journal of Humanoid Robotics 1, 29-43 (2004)
39. Kirsch, R., Boskov, D., Rymer, W.: Muscle stiffness during transient and continuous movements of cat muscle: perturbation characteristics and physiological relevance. IEEE Transactions on Biomedical Engineering 41, 758-770 (1994)
40. Lackner, J., Dizio, P.: Rapid adaptation to Coriolis force perturbations of arm trajectory. Journal of Neurophysiology 72, 299-313 (1994)
41. Li, C., Zhang, D., Zhuang, X.: A survey of repetitive control. In: Proceedings of 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1160-1166. Sendai, Japan (2004)
42. Liu, M., Micaelli, A., Evrard, P., Escande, A., Andriot, C.: Interactive dynamics and balance of a virtual character during manipulation tasks. In: IEEE International Conference on Robotics and Automation, pp. 1676-1682. Shanghai, China (2011)
43. Lämkull, D., Hanson, L., Örtengren, R.: A comparative study of digital human modelling simulation results and their outcomes in reality: a case study within manual assembly of automobiles. International Journal of Industrial Ergonomics 39, 428-441 (2009)
44. McIntyre, J., Mussa-Ivaldi, F., Bizzi, E.: The control of stable arm postures in the multi-joint arm. Experimental Brain Research, pp. 248-264 (1996)
45. Merlhiot, X.: A robust, efficient and time-stepping compatible collision detection method for non-smooth contact between rigid bodies of arbitrary shape. In: Proceedings of the Multibody Dynamics ECCOMAS Thematic Conference (2007)
46. Merlhiot, X.: Extension of a time-stepping compatible contact determination method between rigid bodies to deformable models. In: Proceedings of the Multibody Dynamics ECCOMAS Thematic Conference (2009)
47. Miller, A., Knoop, S., Christensen, H., Allen, P.: Automatic grasp planning using shape primitives.
In: IEEE International Conference on Robotics and Automation, vol. 2, pp. 1824-1829 (2003)
48. Milner, T., Cloutier, C.: Compensation for mechanically unstable loading in voluntary wrist movement. Experimental Brain Research 94, 522-532 (1993)
49. Miyata, N., Kouki, M., Mochimaru, M., Kawachi, K., Kurihara, T.: Hand link modelling and motion generation from motion capture data based on 3D joint kinematics. In: Proceedings of SAE International, Iowa (2005)
50. Morasso, P.: Spatial control of arm movements. Experimental Brain Research 42, 223-227 (1981)
51. Morasso, P., Mussa-Ivaldi, F.: Trajectory formation and handwriting: a computational model. Biological Cybernetics 45, 131-142 (1982)
52. Morasso, P., Sanguineti, V.: Ankle stiffness alone cannot stabilize upright standing. Journal of Neurophysiology 88, 2157-2162 (2002)
53. Occhipinti, E.: OCRA, a concise index for the assessment of exposure to repetitive movements of the upper limbs. Ergonomics 41, 1290-1311 (1998)
54. Perreault, E., Kirsch, R., Crago, P.: Multijoint dynamics and postural stability of the human arm. Experimental Brain Research 157, 507-517 (2004)
55. Porter, J.M., Case, K., Marshall, R., Gyi, D., Sims, R.: Beyond Jack and Jill: designing for individuals using HADRIAN. International Journal of Industrial Ergonomics 33, 249-264 (2004)
56. Prakash, N.: Differential Geometry, an Integrated Approach. Tata McGraw-Hill Publishing Company Limited (1981)
57. Pratt, J., Torres, A., Dilworth, P., Pratt, G.: Virtual actuator control. In: IEEE International Conference on Intelligent Robots and Systems, pp. 1219-1226 (1996)
58. Salini, J.: Dynamic control for the task/posture coordination of humanoids: toward synthesis of complex activities. Ph.D. thesis, University of Pierre and Marie Curie (2012)
59. Schaub, K., Landau, K., Menges, R., Grossmann, K.: A computer-aided tool for ergonomic workplace design and preventive health care. Human Factors and Ergonomics in Manufacturing 7, 269-304 (1997)
60. Sciavicco, L., Siciliano, B.: Modelling and Control of Robot Manipulators. Springer, London (2000)
61. Shadmehr, R., Mussa-Ivaldi, F.: Adaptive representation of dynamics during learning of a motor task. Journal of Neuroscience 14, 3208-3224 (1997)
62. Si, J., Zhang, N., Tang, R.: Modified fuzzy associative memory scheme using genetic algorithm. In: Proceedings of the 1999 Congress on Evolutionary Computation (CEC), vol. 3, pp. 2002-2006 (1999)
63. Slotine, J., Li, W.: Applied Nonlinear Control. Prentice-Hall, Englewood Cliffs, NJ (1991)
64. Smith, A.: Does the cerebellum learn strategies for the optimal time-varying control of joint stiffness? Behavioral and Brain Sciences 19, 399-410 (1996)
65. Tee, K., Franklin, D., Kawato, M., Milner, T., Burdet, E.: Concurrent adaptation of force and impedance in the redundant muscle system. Biological Cybernetics 102, 31-44 (2010)
66. Todorov, E., Jordan, M.: Smoothness maximization along a predefined path accurately predicts the speed profiles of complex arm movements. Journal of Neurophysiology 80, 697-714 (1998)
67. Uno, Y., Kawato, M., Suzuki, R.: Formation and control of optimal trajectory in human multijoint arm movement: minimum torque-change model. Biological Cybernetics 61, 89-101 (1989)
68. Vignes, R.: Modeling Muscle Fatigue in Digital Humans. Center for Computer-Aided Design, The University of Iowa, Tech. rep. (2004)
69. VSR Research Group: Technical report for project virtual soldier research. Center for Computer-Aided Design, The University of Iowa, Tech. rep. (2004)
70. Wolpert, D., Miall, C., Kawato, M.: Internal models in the cerebellum. Trends in Cognitive Sciences 2, 338-347 (1998)
71. Won, J., Hogan, N.: Stability properties of human reaching movements. Experimental Brain Research 107, 125-136 (1995)
72. Yang, C., Ganesh, G., Haddadin, S., Parusel, S., Albu-Schaeffer, A., Burdet, E.: Human-like adaptation of force and impedance in stable and unstable interactions. IEEE Transactions on Robotics 27, 918-930 (2011)

A Relation between cartesian space and joint space

Using Eq. 1, the interaction dynamics is:

$$M\dot{T} + NT + G = L\tau + J_c^TW_c + J_{end}^TW_{end}^i \quad (33)$$

given an interaction wrench $W_{end}^i$. In this paper, we treat the DHM control where the floating base is the foot. We consider cases with the foot fixed to the ground. In this way, we obtain a completely actuated DHM with the characteristics of a fixed-base robot. The dynamic model of the DHM is:

$$M_q\ddot{q} + N_q\dot{q} + G_q = \tau + J_{c,q}^TW_c + J_{end,q}^TW_{end}^i \quad (34)$$

with $M_q = L^TML$, $N_q = L^TNL$, $G_q = L^TG$, $J_{c,q}^T = L^TJ_c^T$ and $J_{end,q}^T = L^TJ_{end}^T$. When the only contact with the ground is the foot and it is the root, we obtain $J_cL = 0$. Since $\rho = Sq$, where $S$ is a matrix selecting a part of the actuated degrees of freedom ($S = [I\ 0]$) so as to obtain a dynamic model independent of the non-sliding contact forces at known fixed locations in Eq. 1 (such as the contacts between the feet and the ground), we can write the system as:

$$M_\rho\ddot{\rho} + N_\rho\dot{\rho} + G_\rho = \tau_\rho + J_{end,\rho}^TW_{end}^i \quad (35)$$

with $M_\rho = SM_qS^T$, $N_\rho = SN_qS^T$, $G_\rho = SG_q$ and $J_{end,\rho}^T = SJ_{end,q}^T$.

From Eq. 35, and since $\delta W_{end}^i = K_{end}\,\mathrm{vec}(H_{end}^{-1}\delta H_{end}) = K_{end}J_{end,q}\delta q = K_{end}J_{end,q}\delta(S^T\rho) = K_{end}J_{end,\rho}\delta\rho$, we obtain:

$$\delta\tau_\rho + \delta(J_{end,\rho}^TW_{end}^i) = \delta\tau_\rho + (\delta J_{end,\rho}^T)W_{end}^i + J_{end,\rho}^T\delta W_{end}^i = \delta\tau_\rho + (\delta J_{end,\rho}^T)W_{end}^i + J_{end,\rho}^TK_{end}J_{end,\rho}\delta\rho = 0 \quad (36)$$

Since $\delta\tau_\rho = -K_\rho\delta\rho$, from Eq. 36 we obtain:

$$K_\rho = -\frac{\delta\tau_\rho}{\delta\rho} = J_{end,\rho}^TK_{end}J_{end,\rho} + \frac{\partial J_{end,\rho}^T}{\partial\rho}W_{end}^i \quad (37)$$

Finally, the cartesian impedance is:

$$K_{end} = J_{end,\rho}^{\dagger T}\left(K_\rho - \frac{\partial J_{end,\rho}^T}{\partial\rho}W_{end}^i\right)J_{end,\rho}^\dagger \quad (38)$$

with $J_{end,\rho}^\dagger$ the dynamic pseudo-inverse [38] defined as:

$$J_{end,\rho}^\dagger = M_\rho^{-1}J_{end,\rho}^T(J_{end,\rho}M_\rho^{-1}J_{end,\rho}^T)^{-1} \quad (39)$$

$B_{end} = J_{end,\rho}^{\dagger T}B_\rho J_{end,\rho}^\dagger$ can be obtained similarly.

B Convergence Analysis

B.1 Motion error cost function

The first derivative of $M_E$ (Eq. 5) can be calculated as follows:

$$\dot{M}_E = \frac{1}{2}\frac{d}{dt}[\epsilon^T(J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger)\epsilon] = \frac{1}{2}[\dot{\epsilon}^T(J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger)\epsilon + \epsilon^T(\dot{J}_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger + J_{end,\rho}^{\dagger T}\dot{M}_\rho J_{end,\rho}^\dagger + J_{end,\rho}^{\dagger T}M_\rho\dot{J}_{end,\rho}^\dagger)\epsilon + \epsilon^T(J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger)\dot{\epsilon}] \quad (40)$$

with

$$\epsilon = V - V^*, \qquad \dot{\epsilon} = A - A^* \quad (41)$$

and $V^* = V^d - b\,\delta(H^d, H^r)$. $V^d$ is the velocity obtained by the minimum jerk planner; $A^*$ is the derivative of $V^*$. The matrix $M_\rho$ is symmetric, and we therefore obtain:

$$\dot{M}_E = [\epsilon^T(J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger)\dot{\epsilon}] + \frac{1}{2}[\epsilon^T(J_{end,\rho}^{\dagger T}\dot{M}_\rho J_{end,\rho}^\dagger)\epsilon] + [\epsilon^T(J_{end,\rho}^{\dagger T}M_\rho\dot{J}_{end,\rho}^\dagger)\epsilon] \quad (42)$$

The relationship between the $\rho$ velocity and the Cartesian-space velocity can be expressed as:

$$V = J_{end,\rho}\dot{\rho} \;\Rightarrow\; \dot{\rho} = J_{end,\rho}^\dagger V \quad (43)$$

Differentiating Eq. 43, the Cartesian acceleration is:

$$A = J_{end,\rho}\ddot{\rho} + \dot{J}_{end,\rho}\dot{\rho} \quad (44)$$

so that the equation of robot motion in joint space can also be represented in Cartesian-space coordinates through:

$$\ddot{\rho} = J_{end,\rho}^\dagger(A - \dot{J}_{end,\rho}\dot{\rho}) = J_{end,\rho}^\dagger(A - \dot{J}_{end,\rho}J_{end,\rho}^\dagger V) \quad (45)$$
Substituting Eqs. 45 and 43 into Eq. 35 yields:

$$M_\rho J_{end,\rho}^\dagger[A - \dot{J}_{end,\rho}J_{end,\rho}^\dagger V] + N_\rho J_{end,\rho}^\dagger V + G_\rho = \tau_\rho + J_{end,\rho}^TW_{end}^i \quad (46)$$

Multiplying both sides by $J_{end,\rho}^{\dagger T}$, we obtain:

$$(J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger)A = [-J_{end,\rho}^{\dagger T}N_\rho J_{end,\rho}^\dagger + J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger\dot{J}_{end,\rho}J_{end,\rho}^\dagger]V - J_{end,\rho}^{\dagger T}G_\rho + J_{end,\rho}^{\dagger T}\tau_\rho + W_{end}^i \quad (47)$$

Using Eq. 10, we obtain:

$$(J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger)A = [-J_{end,\rho}^{\dagger T}N_\rho J_{end,\rho}^\dagger + J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger\dot{J}_{end,\rho}J_{end,\rho}^\dagger]V - J_{end,\rho}^{\dagger T}G_\rho + J_{end,\rho}^{\dagger T}\tau_\rho^{ff} - W_{end}^d - J_{end,\rho}^{\dagger T}\tau_\rho^l + W_{end}^i \quad (48)$$

where $\tau_\rho^{ff}$ is the torque compensating for the DHM dynamics. By definition, it can be written as:

$$\tau_\rho^{ff} \equiv M_\rho\ddot{\rho}^* + N_\rho\dot{\rho}^* + G_\rho \equiv M_\rho J_{end,\rho}^\dagger A^* + [N_\rho J_{end,\rho}^\dagger - M_\rho J_{end,\rho}^\dagger\dot{J}_{end,\rho}J_{end,\rho}^\dagger]V^* + G_\rho \quad (49)$$

Using Eq. 41 and substituting Eq. 49 into Eq. 48 yields:

$$(J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger)\dot{\epsilon} = [-J_{end,\rho}^{\dagger T}N_\rho J_{end,\rho}^\dagger + J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger\dot{J}_{end,\rho}J_{end,\rho}^\dagger]\epsilon + W_{end}^i - J_{end,\rho}^{\dagger T}\tau_\rho^l - W_{end}^d \quad (50)$$

Substituting Eq. 50 into Eq. 42 yields:

$$\dot{M}_E = \epsilon^T[(-J_{end,\rho}^{\dagger T}N_\rho J_{end,\rho}^\dagger + J_{end,\rho}^{\dagger T}M_\rho\dot{J}_{end,\rho}^\dagger + J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger\dot{J}_{end,\rho}J_{end,\rho}^\dagger)\epsilon + W_{end}^i - J_{end,\rho}^{\dagger T}\tau_\rho^l - W_{end}^d] + \frac{1}{2}[\epsilon^T(J_{end,\rho}^{\dagger T}\dot{M}_\rho J_{end,\rho}^\dagger)\epsilon] + [\epsilon^T(J_{end,\rho}^{\dagger T}M_\rho\dot{J}_{end,\rho}^\dagger)\epsilon] = \frac{1}{2}\epsilon^T[J_{end,\rho}^{\dagger T}(\dot{M}_\rho - 2N_\rho)J_{end,\rho}^\dagger]\epsilon + \epsilon^T[W_{end}^i - J_{end,\rho}^{\dagger T}\tau_\rho^l - W_{end}^d] + \epsilon^T[J_{end,\rho}^{\dagger T}M_\rho\dot{J}_{end,\rho}^\dagger + J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger\dot{J}_{end,\rho}J_{end,\rho}^\dagger]\epsilon \quad (51)$$

The matrix $\dot{M}_\rho - 2N_\rho$ is skew-symmetric [60], and for this reason we have:

$$\epsilon^T(J_{end,\rho}^{\dagger T}(\dot{M}_\rho - 2N_\rho)J_{end,\rho}^\dagger)\epsilon = 0 \quad (52)$$

Let us now analyze the third term of Eq. 51. Using Eq. 39, and since $J_{end,\rho}J_{end,\rho}^\dagger = I$ and $\dot{J}_{end,\rho}J_{end,\rho}^\dagger + J_{end,\rho}\dot{J}_{end,\rho}^\dagger = 0$, we obtain:

$$J_{end,\rho}^{\dagger T}M_\rho\dot{J}_{end,\rho}^\dagger = (J_{end,\rho}M_\rho^{-1}J_{end,\rho}^T)^{-1}J_{end,\rho}\dot{J}_{end,\rho}^\dagger, \qquad J_{end,\rho}^{\dagger T}M_\rho J_{end,\rho}^\dagger\dot{J}_{end,\rho}J_{end,\rho}^\dagger = -(J_{end,\rho}M_\rho^{-1}J_{end,\rho}^T)^{-1}J_{end,\rho}\dot{J}_{end,\rho}^\dagger \quad (53)$$

Substituting Eqs. 52 and 53 into Eq. 51, we obtain:

$$\dot{M}_E = \epsilon^T[W_{end}^i - J_{end,\rho}^{\dagger T}\tau_\rho^l - W_{end}^d] \quad (54)$$

Using Eqs. 54, 11 and 38, we have:

$$\dot{M}_E = -\epsilon^TB_{end}^{ini}\epsilon - \epsilon^TK_{end}^l\delta(H^d,H^r) - \epsilon^TB_{end}^l\delta(V^d,V^r) - \epsilon^TJ_{end,\rho}^{\dagger T}\tau_\rho^l + \epsilon^TW_{end}^i = -\epsilon^TB_{end}^{ini}\epsilon - \epsilon^T\left[J_{end,\rho}^{\dagger T}\left(K_\rho - \frac{\partial J_{end,\rho}^T}{\partial\rho}W_{end}^i\right)J_{end,\rho}^\dagger\right]\delta(H^d,H^r) - \epsilon^T(J_{end,\rho}^{\dagger T}B_\rho^lJ_{end,\rho}^\dagger)\delta(V^d,V^r) - \epsilon^TJ_{end,\rho}^{\dagger T}\tau_\rho^l + \epsilon^TW_{end}^i \quad (55)$$

We can derive $\delta M_E(t) = M_E(t) - M_E(t-D)$ from Eqs. 55 and 8 as:

$$\delta M_E(t) = \int_{t-D}^t\{-\epsilon^T(\sigma)B_{end}^{ini}(\sigma)\epsilon(\sigma) - \epsilon^T(\sigma)[J_{end,\rho}^{\dagger T}\tilde{K}J_{end,\rho}^\dagger](\sigma)\delta(H^d,H^r)(\sigma) - \epsilon^T(\sigma)[J_{end,\rho}^{\dagger T}\tilde{B}J_{end,\rho}^\dagger](\sigma)\delta(V^d,V^r)(\sigma) - \epsilon^T(\sigma)[J_{end,\rho}^{\dagger T}\tilde{\tau}](\sigma) - \epsilon^T(\sigma)[J_{end,\rho}^{\dagger T}K_\rho^{min}J_{end,\rho}^\dagger](\sigma)\delta(H^d,H^r)(\sigma) - \epsilon^T(\sigma)[J_{end,\rho}^{\dagger T}B_\rho^{min}J_{end,\rho}^\dagger](\sigma)\delta(V^d,V^r)(\sigma) - \epsilon^T(\sigma)[J_{end,\rho}^{\dagger T}\tau_\rho^{min}](\sigma) + \epsilon^T(\sigma)W_{end}^i(\sigma)\}\,d\sigma \quad (56)$$

Any smooth interaction force can be approximated by the linear terms of its Taylor expansion along the reference trajectory as follows:

$$W_{end}^i(t) = W_{end}^{i,0}(t) + [J_{end,\rho}^{\dagger T}K_\rho^iJ_{end,\rho}^\dagger](t)\delta(H^d,H^r) + [J_{end,\rho}^{\dagger T}B_\rho^iJ_{end,\rho}^\dagger](t)\delta(V^d,V^r) \quad (57)$$

where $W_{end}^{i,0}$ is the zero-order term, compensated by $J_{end,\rho}^{\dagger T}\tau^{min}$; $[J_{end,\rho}^{\dagger T}K_\rho^iJ_{end,\rho}^\dagger]$ and $[J_{end,\rho}^{\dagger T}B_\rho^iJ_{end,\rho}^\dagger]$ are the first-order coefficients. From Eqs. 57 and 41, we can obtain the values of $K_\rho^{min}(t)$, $B_\rho^{min}(t)$ and $\tau_\rho^{min}(t)$ that guarantee stability (Eq. 58). Different $W_{end}^i$ will yield different values of $K_\rho^{min}(t)$, $B_\rho^{min}(t)$ and $\tau_\rho^{min}(t)$, and when $W_{end}^i$ is zero or is assisting the tracking task, $\|\epsilon(t)\| \to 0$ and $K_\rho^{min}(t)$, $B_\rho^{min}(t)$ and $\tau_\rho^{min}(t)$ will be 0.
$K_\rho^{min}(t)$, $B_\rho^{min}(t)$ and $\tau_\rho^{min}(t)$ represent the minimal stiffness, damping and feedforward effort required to guarantee

$$\int_{t-D}^t\{-\epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}K_\rho^{min}J_{end,\rho}^\dagger)(\sigma)\delta(H^d,H^r)(\sigma) - \epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}B_\rho^{min}J_{end,\rho}^\dagger)(\sigma)\delta(V^d,V^r)(\sigma) - \epsilon^T(\sigma)J_{end,\rho}^{\dagger T}\tau_\rho^{min}(\sigma) + \epsilon^T(\sigma)W_{end}^i(\sigma)\}\,d\sigma \le 0 \quad (58)$$

so that from Eq. 55 we have $\int_{t-D}^t\dot{M}_E(\sigma)\,d\sigma \le 0$. From Eqs. 56 and 58, we can write:

$$\delta M_E(t) \le \int_{t-D}^t\{-\epsilon^T(\sigma)B_{end}^{ini}(\sigma)\epsilon(\sigma) - \epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{K}J_{end,\rho}^\dagger)(\sigma)\delta(H^d,H^r)(\sigma) - \epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{B}J_{end,\rho}^\dagger)(\sigma)\delta(V^d,V^r)(\sigma) - \epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{\tau})(\sigma)\}\,d\sigma \quad (59)$$

B.2 Metabolic cost function

The metabolic cost function is:

$$M_C(t) = \frac{1}{2}\int_{t-D}^t\tilde{\Phi}^T(\sigma)Q^{-1}\tilde{\Phi}(\sigma)\,d\sigma \quad (60)$$

According to the definition of $\Phi(t)$ and $Q$, using the following properties of the $\mathrm{vec}(\cdot)$, $\otimes$ and $\mathrm{tr}(\cdot)$ operators:

$$\mathrm{vec}(\Omega YU) = (U^T\otimes\Omega)\mathrm{vec}(Y), \quad \mathrm{tr}(\Omega Y) = \mathrm{vec}(\Omega^T)^T\mathrm{vec}(Y), \quad \mathrm{tr}(\Omega Y) = \mathrm{tr}(Y\Omega) \quad (61)$$

and the symmetry of $Q_K^{-1}$, we obtain:

$$\mathrm{vec}(\tilde{K}^T)^T(I\otimes Q_K)^{-1}\mathrm{vec}(\tilde{K}^T) = \mathrm{vec}(\tilde{K}^T)^T((Q_K^{-1})^T\otimes I)\mathrm{vec}(\tilde{K}^T) = \mathrm{vec}(\tilde{K}^T)^T\mathrm{vec}(\tilde{K}^TQ_K^{-1}) = \mathrm{tr}\{\tilde{K}\tilde{K}^TQ_K^{-1}\} = \mathrm{tr}\{\tilde{K}^TQ_K^{-1}\tilde{K}\} \quad (62)$$

The terms corresponding to $\tilde{B}$ and $\tilde{\tau}$ can be found in the same way. For these reasons, we can define $\delta M_C(t) = M_C(t) - M_C(t-D)$ as:

$$\delta M_C(t) = \frac{1}{2}\int_{t-D}^t\{\mathrm{tr}\{[\tilde{K}^T(\sigma)Q_K^{-1}\tilde{K}(\sigma)] - [\tilde{K}^T(\sigma-D)Q_K^{-1}\tilde{K}(\sigma-D)]\} + \mathrm{tr}\{[\tilde{B}^T(\sigma)Q_B^{-1}\tilde{B}(\sigma)] - [\tilde{B}^T(\sigma-D)Q_B^{-1}\tilde{B}(\sigma-D)]\} + [\tilde{\tau}^T(\sigma)Q_\tau^{-1}\tilde{\tau}(\sigma)] - [\tilde{\tau}^T(\sigma-D)Q_\tau^{-1}\tilde{\tau}(\sigma-D)]\}\,d\sigma \quad (63)$$

From Eqs. 13, 14 and 16, we obtain:

$$\begin{aligned} \delta K &= Q_K\{J_{end,\rho}^\dagger[\epsilon(t)\delta(H^d,H^r)^T]J_{end,\rho}^{\dagger T} - \gamma(t)K_\rho^l(t)\} \\ \delta B &= Q_B\{J_{end,\rho}^\dagger[\epsilon(t)\delta(V^d,V^r)^T]J_{end,\rho}^{\dagger T} - \gamma(t)B_\rho^l(t)\} \\ \delta\tau &= Q_\tau\{J_{end,\rho}^\dagger\epsilon(t) - \gamma(t)\tau_\rho^l(t)\} \end{aligned} \quad (64)$$

Using the symmetry of $Q_K^{-1}$, $\tilde{K}(\sigma) - \tilde{K}(\sigma-D) = \delta K(\sigma)$ and Eq. 64, the first term in the integrand of Eq. 63 can be written as:

$$\mathrm{tr}\{[\tilde{K}^T(\sigma)Q_K^{-1}\tilde{K}(\sigma)] - [\tilde{K}^T(\sigma-D)Q_K^{-1}\tilde{K}(\sigma-D)]\} = \mathrm{tr}\{[\tilde{K}^T(\sigma) - \tilde{K}^T(\sigma-D)]^TQ_K^{-1}[2\tilde{K}^T(\sigma) - \tilde{K}^T(\sigma) + \tilde{K}^T(\sigma-D)]\} = \mathrm{tr}\{\delta K^T(\sigma)Q_K^{-1}[2\tilde{K}(\sigma) - \delta K(\sigma)]\} = -\mathrm{tr}\{\delta K^T(\sigma)Q_K^{-1}\delta K(\sigma)\} + 2\epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{K}J_{end,\rho}^\dagger)(\sigma)\delta(H^d,H^r)(\sigma) - 2\gamma(\sigma)\mathrm{tr}\{(K_\rho^l)^T(\sigma)\tilde{K}(\sigma)\} \quad (65)$$

Similarly, the second term in the integrand of Eq. 63 is:

$$\mathrm{tr}\{\tilde{B}^T(\sigma)Q_B^{-1}\tilde{B}(\sigma) - \tilde{B}^T(\sigma-D)Q_B^{-1}\tilde{B}(\sigma-D)\} = -\mathrm{tr}\{\delta B^T(\sigma)Q_B^{-1}\delta B(\sigma)\} + 2\epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{B}J_{end,\rho}^\dagger)(\sigma)\delta(V^d,V^r)(\sigma) - 2\gamma(\sigma)\mathrm{tr}\{(B_\rho^l)^T(\sigma)\tilde{B}(\sigma)\} \quad (66)$$

and the third term is:

$$[\tilde{\tau}^T(\sigma)Q_\tau^{-1}\tilde{\tau}(\sigma)] - [\tilde{\tau}^T(\sigma-D)Q_\tau^{-1}\tilde{\tau}(\sigma-D)] = -[\delta\tau^T(\sigma)Q_\tau^{-1}\delta\tau(\sigma)] + 2\epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{\tau})(\sigma) - 2\gamma(\sigma)(\tau_\rho^l)^T(\sigma)\tilde{\tau}(\sigma) \quad (67)$$

Replacing Eqs. 65, 66 and 67 into Eq. 63, we obtain:

$$\delta M_C(t) = -\frac{1}{2}\int_{t-D}^t[\delta\tilde{\Phi}^T(\sigma)Q^{-1}\delta\tilde{\Phi}(\sigma)]\,d\sigma - \int_{t-D}^t[\gamma(\sigma)\tilde{\Phi}^T(\sigma)\Phi(\sigma)]\,d\sigma + \int_{t-D}^t[\epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{K}J_{end,\rho}^\dagger)(\sigma)\delta(H^d,H^r)(\sigma) + \epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{B}J_{end,\rho}^\dagger)(\sigma)\delta(V^d,V^r)(\sigma) + \epsilon^T(\sigma)(J_{end,\rho}^{\dagger T}\tilde{\tau})(\sigma)]\,d\sigma \quad (68)$$

Combining Eqs. 59 and 68, we obtain the first difference of the cost function:

$$\delta C(t) = C(t) - C(t-D) = \delta M_E(t) + \delta M_C(t) \le -\frac{1}{2}\int_{t-D}^t[\delta\tilde{\Phi}^T(\sigma)Q^{-1}\delta\tilde{\Phi}(\sigma)]\,d\sigma - \int_{t-D}^t[\gamma(\sigma)\tilde{\Phi}^T(\sigma)\tilde{\Phi}(\sigma) + \gamma(\sigma)\tilde{\Phi}^T(\sigma)\Phi^d(\sigma) + \epsilon^T(\sigma)B_{end}^{ini}(\sigma)\epsilon(\sigma)]\,d\sigma \quad (69)$$

To obtain $\delta C(t) \le 0$, a sufficient condition is:

$$\epsilon^TB_{end}^{ini}\epsilon + \gamma\tilde{\Phi}^T\tilde{\Phi} + \gamma\tilde{\Phi}^T\Phi^d \ge \lambda_B\|\epsilon\|^2 + \gamma\|\tilde{\Phi}\|^2 - \gamma\|\tilde{\Phi}\|\|\Phi^d\| \ge 0 \quad (70)$$

where $\lambda_B$ is the infimum of the smallest eigenvalue of $B_{end}^{ini}$. Replacing $\gamma(t) = \frac{p}{1+u\|\epsilon(t)\|^2}$ into Eq. 70, we obtain:

$$\lambda_Bu\|\epsilon\|^4 + \lambda_B\|\epsilon\|^2 + p\|\tilde{\Phi}\|^2 - p\|\tilde{\Phi}\|\|\Phi^d\| \ge 0 \quad (71)$$

To find the regions of points $(\|\epsilon\|^2, \|\tilde{\Phi}\|)$ for which Eq. 71 holds, we first need to determine the set of points that satisfies:

$$\lambda_Bu\|\epsilon\|^4 + \lambda_B\|\epsilon\|^2 + p\|\tilde{\Phi}\|^2 - p\|\tilde{\Phi}\|\|\Phi^d\| = 0 \quad (72)$$
Eq. 72 is an ellipse passing through the points $(\|\epsilon\|^2 = 0, \|\tilde{\Phi}\| = 0)$ and $(\|\epsilon\|^2 = 0, \|\tilde{\Phi}\| = \|\Phi^d\|)$. To find the canonical equation of this ellipse, we only need to complete the squares, obtaining:

$$\frac{\lambda_Bu\left(\|\epsilon\|^2 + \frac{1}{2u}\right)^2 + p\left(\|\tilde{\Phi}\| - \frac{\|\Phi^d\|}{2}\right)^2}{\frac{\lambda_B}{4u} + \frac{p\|\Phi^d\|^2}{4}} = 1 \quad (73)$$

By the Krasovskii-LaSalle principle, $\|\epsilon\|^2$ and $\|\tilde{\Phi}\|$ will converge to an invariant set $\Omega_s \subseteq \Omega$ on which $\delta C(t) = 0$, where $\Omega$ is the bounding set defined as:

$$\Omega \equiv \left\{(\|\epsilon\|^2, \|\tilde{\Phi}\|) : \frac{\lambda_Bu(\|\epsilon\|^2 + \frac{1}{2u})^2 + p(\|\tilde{\Phi}\| - \|\Phi^d\|/2)^2}{\frac{\lambda_B}{4u} + \frac{p\|\Phi^d\|^2}{4}} \le 1\right\} \quad (74)$$

If the parameter $\gamma$ is constant [24], the bounding set is:

$$\left\{(\|\epsilon\|^2, \|\tilde{\Phi}\|) : \frac{4\lambda_B\|\epsilon\|^2 + 4\gamma(\|\tilde{\Phi}\| - \|\Phi^d\|/2)^2}{\gamma\|\Phi^d\|^2} \le 1\right\} \quad (75)$$

$\gamma$ does not affect convergence, but it does affect the convergence speed and the size of the convergence set.

C Minimum jerk

C.1 Formal definition

Using Eq. 20, we can write the integrand as:

$$P = \left\|\frac{d^3}{dt^3}r(s)\right\|^2 = \left\|\frac{d^2}{dt^2}r'(s)\dot{s}\right\|^2 = \left\|\frac{d}{dt}(r''(s)\dot{s}^2 + r'(s)\ddot{s})\right\|^2 = \left\|r'''(s)\dot{s}^3 + 3r''(s)\dot{s}\ddot{s} + r'(s)\dddot{s}\right\|^2 \quad (76)$$

To make explicit the invariance of the minimization problem in Eq. 76 with respect to rotations and translations, we can define a 3D curve uniquely [56] by its curvature $R(s)$ and its torsion $\eta(s)$. The path $r$ satisfies Frenet's formulas:

$$t' = Rn, \quad n' = \eta b - Rt, \quad b' = -\eta n \quad (77)$$

From geometry, we know that:

$$r' = t, \quad r'' = Rn, \quad r''' = R'n + R(\eta b - Rt) \quad (78)$$

Replacing Eq. 78 in Eq. 76, we obtain:

$$P = \left\|n(R'\dot{s}^3 + 3R\dot{s}\ddot{s}) + t(\dddot{s} - R^2\dot{s}^3) + b(\dot{s}^3R\eta)\right\|^2 \quad (79)$$

$n$, $t$ and $b$ are orthogonal, and thus we obtain:

$$P = (R'\dot{s}^3 + 3R\dot{s}\ddot{s})^2 + (\dddot{s} - R^2\dot{s}^3)^2 + (\dot{s}^3R\eta)^2 \quad (80)$$

C.2 Relation to the 2/3 power law

We want to find the relation of Eq. 80 to the 2/3 power law. To this end, we define the function:

$$Z_s = \dot{s}^3R(s) \quad (81)$$

$Z_s$ corresponds to the term multiplying the torsion $\eta$ in Eq. 80. Differentiating Eq. 81 with respect to time, we obtain:

$$R'(s)\dot{s}^4 + 3\dot{s}^2\ddot{s}R(s) = Z_s'\dot{s} \;\Rightarrow\; R'(s)\dot{s}^3 + 3\dot{s}\ddot{s}R(s) = Z_s' \quad (82)$$

The term $R'(s)\dot{s}^3 + 3\dot{s}\ddot{s}R(s)$ is equal to the term multiplying $n$ in Eq. 79. We now substitute Eq. 82 into Eq. 79:

$$P = \left\|n\,Z_s' + t(\dddot{s} - Z_sR) + b\,Z_s\eta\right\|^2 = Z_s'^2 + (\dddot{s} - Z_sR)^2 + Z_s^2\eta^2 \quad (83)$$

From Eq. 81, we have:

$$\dot{s}(t) = Z_s^{1/3}R^{-1/3} \quad (84)$$

In the 2/3 power law $Z_s^{1/3} = \mathrm{const}$ and $Z_s' = 0$, which is equivalent to setting the coefficient of $n$ of the instantaneous jerk to zero, and the coefficient of $b$ proportional to the coefficient of $t$. To demonstrate this, we analyze the 2D power law:

$$(x'^2 + y'^2)^{1/2} = \mathrm{const}\left(\frac{\sqrt{(x'y'' - y'x'')^2}}{(x'^2 + y'^2)^{3/2}}\right)^{-1/3} \;\Rightarrow\; x'y'' - y'x'' = \mathrm{const} \quad (85)$$

Taking derivatives, we obtain:

$$\frac{x'}{y'} = \frac{x'''}{y'''}, \quad r' \parallel r''' \quad (86)$$

The jerk vector is thus orthogonal to $n$ and aligned with $t$. Thus, the jerk along $n$ is zero.
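As a quick algebraic sanity check of the completion of squares in Appendix B (Eqs. 72-73), the following sympy snippet (our own addition, not part of the paper) expands the canonical ellipse form, with the denominator cleared, and confirms that it reduces exactly to Eq. 72.

```python
import sympy as sp

# x stands for ||eps||^2 and y for ||Phi_tilde||.
x, y = sp.symbols('x y', nonnegative=True)
lB, u, p, Pd = sp.symbols('lambda_B u p Phi_d', positive=True)

eq72 = lB * u * x**2 + lB * x + p * y**2 - p * y * Pd
# Eq. 73 rewritten as (numerator - denominator) = 0:
canonical = (lB * u * (x + 1 / (2 * u))**2 + p * (y - Pd / 2)**2
             - (lB / (4 * u) + p * Pd**2 / 4))

print(sp.simplify(sp.expand(canonical - eq72)))  # prints 0: the two forms agree
```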
work_etoxniuo4nbctp6l3xefoqkff4 ---- Bio CRM: A Data Model for Representing Biographical Data for Prosopographical Research

Jouni Tuominen1,2, Eero Hyvönen1,2, and Petri Leskinen1
1 Semantic Computing Research Group (SeCo), Aalto University, Finland, and
2 HELDIG – Helsinki Centre for Digital Humanities, University of Helsinki, Finland
http://seco.cs.aalto.fi, http://heldig.fi
firstname.lastname@aalto.fi

Keywords: Linked Data, Data models, Biographical representation, Event-based modeling, Role-centric modeling, Prosopography
Type of submission: original unpublished work

Biographies make a promising application case of Linked Data: they can be used, e.g., as a basis for Digital Humanities research in prosopography and as a key data and linking resource in semantic Cultural Heritage portals. In both use cases, a semantic data model for harmonizing and interlinking heterogeneous data from different sources is needed. We present such a data model, Bio CRM [1], with the following key ideas: 1) The model is a domain-specific extension of CIDOC CRM, making it applicable not only to biographical data but also to other Cultural Heritage data. 2) The model makes a distinction between enduring unary roles of actors, their enduring binary relationships, and perduring events, where the participants can take different roles modeled as a role concept hierarchy. 3) The model can be used as a basis for semantic data validation and enrichment by reasoning. 4) The enriched data conforming to Bio CRM is targeted to be used by SPARQL queries in flexible ways using a hierarchy of roles in which participants can be involved in events.

Bio CRM provides the general data model for biographical datasets. The individual datasets concerning different cultures, time periods, or collected by different researchers may introduce extensions for defining additional event and role types. The Linked Data approach enables connecting the biographies to contextualizing information, such as the space and time of biographical events, related persons, historical events, publications, and paintings. Use cases for data represented using Bio CRM include prosopographical information retrieval, network analysis, knowledge discovery, and dynamic analysis.

The development of Bio CRM was started in the EU COST project "Reassembling the Republic of Letters" [2] and it is being piloted in the case of enriching and publishing the printed register of over 10 000 alumni of the Finnish Norssi high school as Linked Data [3].
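To make the event/role pattern of key idea 2) concrete, here is a small rdflib sketch of one biographical event with role-typed participation. The bioc: class and property names below are hypothetical stand-ins (the published Bio CRM vocabulary is defined at the project site [1]), so this illustrates the modeling pattern, not the actual schema.

```python
from rdflib import Graph, Namespace, Literal, RDF

# Hypothetical namespaces, not the published Bio CRM URIs.
BIOC = Namespace('http://example.org/biocrm/')
EX = Namespace('http://example.org/data/')

g = Graph()
g.bind('bioc', BIOC)
g.bind('ex', EX)

# A perduring event (a matriculation) placed in time.
g.add((EX.matriculation1, RDF.type, BIOC.Event))
g.add((EX.matriculation1, BIOC.hasTime, Literal('1897')))

# A reified participation node, so the person's role in this event can be
# typed against a role concept hierarchy (here a hypothetical bioc:Student).
g.add((EX.participation1, RDF.type, BIOC.EventParticipation))
g.add((EX.participation1, BIOC.inEvent, EX.matriculation1))
g.add((EX.participation1, BIOC.agent, EX.person1))
g.add((EX.participation1, BIOC.roleType, BIOC.Student))

print(g.serialize(format='turtle'))
```

Because the role is attached to the participation node rather than directly to the person, SPARQL queries can select participants by role, or by any superclass of that role in the hierarchy.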
[1] http://seco.cs.aalto.fi/projects/biographies/
[2] http://www.republicofletters.net
[3] Eero Hyvönen, Petri Leskinen, Erkki Heino, Jouni Tuominen and Laura Sirola: Reassembling and Enriching the Life Stories in Printed Biographical Registers: Norssi High School Alumni on the Semantic Web. Proceedings, Language, Technology and Knowledge 2017, June 19-20, Galway, Ireland. Springer-Verlag, 2017.
We use the representation of the results of a newly developed collation tool, ‘HyperCollate’, as a use case to address the more general issue of using data visualisations as a means of advancing textual and literary research. The underly- ing data structure of HyperCollate is a hypergraph (hence the name), which means that it can store and process more information than string-based collation programs. Ac- cordingly, HyperCollate’s output contains a wealth of detailed information about the variation between texts, both on a linguistic/semantic level and a structural level. It is a veritable challenge to visualise the entire collation hypergraph in any meaningful way, but the question is, really, do we want to? In particular, therefore, we investigate which representation(s) of automated collation results best clear the way for advanced research into textual variance. The article is structured as follows. After a brief introduction of automated collation immediately below, we define a list of textual properties relevant for any study into the nature of text. We then consider the strengths and weaknesses of the prevailing representations of collation output, which allows us to define a number of requirements for a collation visualisation. Subsequently, the article explores the question of visual literacy in relation to using a collation tool. Since visualisations function simultaneous- ly as instruments of study and as means of communication, it is vital they are understood and used correctly. In line with the idea of visual literacy, we conclude with a number of recommendations to evaluate the visualisations of collation output. The implications of creating and using visualisations to study textual variance are discussed in the final parts of the article. Before we go on, it is important to note that we define 'textual variance' in the broadest sense: it comprises any differences between two or more text versions, but also the revisions and other interventions within one version. Indeed, we do not make the traditional distinction between 'accidentals' and 'substan- tives'. This critical distinction is the editor's to make, for instance by interpreting the output of a collation software program. 2 Automated collation Collation at its most basic level can be defined as the comparison of two or more texts to find (dis)similarities between or among them. Texts are collated for different reasons, but in general, collation is used to track the (historical) transmission of a text, to establish a critical text, or to examine an author’s creative writing process. Traditionally, collation has been considered an auxiliary task: it was an elementary part of preparing the textual material in order to arrive at a critically established text and not necessarily a part of the hermeneutics of textual criticism. The reader was presented with the end-result of this endeavour (a critical text), and the variant readings were stored in appendices or footnotes, the kind of repositories that would get so few visitors that they have been bleakly referred to as cemeteries (Vanhoutte 1999; De Bruijn 2002, 114). In the environment of a digital edition, however, users can manipulate transcriptions which are prepared and annotated by editors. Many digital editions have a functionality to compare text versions and, accordingly, collation has become a scholarly primitive, like searching and annotating text. 
The digital representation of the result of the comparison thus brings textual variants to the forefront instead of (respectfully) entombing them.

3 Properties of text

It's important to note that offering users the opportunity to explore textual variance in a digital environment is an argument an sich: it stresses that text is a fluid and intrinsically unstable object. And, as anyone who has worked with historical documents knows, these fluid textual objects often have complex properties, such as discontinuity, simultaneity, non-linearity, and multiple levels of revision.1 The dynamic and temporal nature of textual objects means that they can be interpreted in more than one way, but existing markup systems like TEI/XML can never fully express the range of textual and critical interpretations.2 Nevertheless, the benefits of 'making explicit what was so often implicit … outweighed the liabilities' of the tree structure (Drucker 2012), and as it happens, the textual scholarship community has embraced TEI/XML as a means of encoding literary texts. Expressing the multidimensional textual object within a tree data structure (the prevalent model for texts) requires a number of workarounds and results in an encoded XML transcription which contains neither fully ordered nor unordered information (Bleeker et al. 2018, 82). This kind of partially ordered data is challenging to process. As a result, XML files are often collated as strings of characters, inevitably leaving out aspects of the textual dynamics such as deletions, additions or substitutions. The conversion from XML to plain text implies that the multidimensional features of the text expressed by <add> and <del> tags are removed; the text is consequently flattened into a linear sequence of words. Only in the visualisation stage of the collation workflow do features like additions or deletions occur again (Fig. 1).

Footnote 1: See Haentjens Dekker and Birnbaum (2017) for an exhaustive overview of textual features and the extent to which these can be represented in a computational model.
Footnote 2: The TEI Guidelines offer the <certainty> element to indicate the degree of certainty associated with some aspect of the text markup, but as Wout Dillen points out, this requires an elaborate encoding practice that is not always worth the effort (2015, 90) and furthermore the ambiguity is not always translatable to the qualifiers 'low', 'medium', and 'high'.

Fig. 1: Example of an alignment table visualisation of a collation of four versions of Samuel Beckett's Krapp's Last Tape which visualises the deleted words as strike-through. The collation was performed by CollateX.

Although these versions of Krapp's Last Tape are compared on the level of plain text only, the alignment table in Fig. 1 also shows the in-text variation of witnesses 07 and 10, thus neatly illustrating the informational role of visualisations. The main objective for the development of the collation engine HyperCollate was to include textual properties like in-text variation in the alignment in order to perform a more inclusive collation and to facilitate a deeper exploration of textual variation. A look at the drafts of Virginia Woolf's Time Passes3 offers a good illustration of some textual features we'd like to include in the automated collation. For reasons of clarity, we limit the collation input to two small fragments: the initial holograph
draft 'IHD-155' (witness 1) and the typescript 'TS-4' (witness 2). Both fragments are manually transcribed in TEI/XML. The transcriptions below are simplified for reasons of legibility.

Footnote 3: Woolf, Virginia. Time Passes. The genetic edition of the manuscripts is edited by Peter Shillingsburg and available at www.woolfonline.com (last accessed on 2018, April 27). Excerpts from Woolf's manuscripts are reused in this contribution with special acknowledgments to the Society of Authors as the Literary Representative of the Estate of Virginia Woolf.

A quick look at these fragments reveals that they contain linguistic variation between tokens with the same meaning as well as structural variation indicated by the markup. Here, the ampersand mark '&' in witness 1 and the word token 'and' in witness 2 constitute linguistic variation: two different tokens with the same meaning. Furthermore witness 1 presents a case of in-text or intradocumentary variation: variation within a witness' text (see also Schäuble and Gabler 2016; Bleeker 2017, 63). If we look at the revision site that is highlighted in the XML transcription of witness 1, we see several orders in which we can read the text: including or excluding the added text; including or excluding the deleted text. In other words, there are multiple 'paths' through the text: the textual stream diverges at the point where revision occurs, indicated by the <del> element and the <add> element. When the text is parsed, the textual content of these different paths should be considered as being on the same level: they represent multiple, co-existing readings of the text. Intradocumentary variation can become highly complex, for instance in the case of a deletion inside a deletion inside a deletion, etc.

The structural variation in this example becomes manifest if we compare the two witnesses: the excerpt in witness 1 is contained by one <s> element, while the phrase in witness 2 is contained by two <s> elements. However structural variation does not only occur across documents: when an author indicates the start of a new chapter or paragraph by inserting a metamark of some sorts, this is arguably a form of structural intradocumentary variation.

To summarise, we can distinguish different forms of textual variance. Variation can occur on the level of the text characters (linguistic or semantic variation) and on the structure of the text (sentences, paragraphs, etc.). Furthermore, we distinguish between intradocumentary variation (within one witness) and interdocumentary variation (across witnesses). Arguably all forms are relevant for textual scholarship, but taking them into account when processing and comparing texts has both technical and conceptual consequences. These consequences have been discussed extensively elsewhere (Bleeker et al. 2018) and will be briefly repeated in section 5 below. The main goal of the present article is to focus on the question of visualisation. Assuming we have a software program that compares texts in great detail, including structural information and in-witness revisions, how can we best visualise its output? First and foremost, the additional information (structural and linguistic, intradocumentary and interdocumentary) needs to be visualised in an understandable way.
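The notion of multiple co-existing 'paths' through a revised witness can be made concrete with a short Python sketch (our own addition). The miniature XML below is an invented stand-in, since the Woolf transcriptions are simplified and not reproduced here in full; the code enumerates every reading that results from keeping or rejecting each revision.

```python
import xml.etree.ElementTree as ET
from itertools import product

# A made-up miniature witness with one revision site (not Woolf's actual text).
XML = '<s>the lighthouse <subst><del>shone</del><add>was shining</add></subst> far off</s>'

def alternatives(elem):
    """Flatten a (simplified) TEI fragment into segments; each segment is a
    list of alternative strings, one per path through the revision site."""
    segs = []
    if elem.text and elem.text.strip():
        segs.append([elem.text.strip()])
    for child in elem:
        if child.tag == 'subst':
            # coupled deletion/addition: one reading per child
            segs.append([(c.text or '').strip() for c in child])
        elif child.tag in ('del', 'add'):
            # uncoupled revision: a reading with the text and one without it
            segs.append([(child.text or '').strip(), ''])
        else:
            segs.extend(alternatives(child))
        if child.tail and child.tail.strip():
            segs.append([child.tail.strip()])
    return segs

root = ET.fromstring(XML)
for choice in product(*alternatives(root)):
    print(' '.join(token for token in choice if token))
```

Run as-is, this prints the two readings of the sentence, which is exactly the set of paths a collation engine such as HyperCollate must keep on the same level.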
The visualisations can be useful for a wide range of research objectives, such as (1) finding a change in markup indicating structural revision like sentence division, (2) presenting the different paths through one witness and the possible matches between tokens from any path, (3) complex revisions, like a deletion within a deletion within an addition, and (4) studying patterns of revision, and so on. This begs the question: is it even possible or desirable to decide on one visualisation? Is there one ultimate visualisation that reflects the dynamic, temporal nature of the textual object(s) by demonstrating both structural and linguistic variation on an intradocumentary and interdocumentary level? The existing field of Information Visualisation can certainly offer inspiration, but simply adopting its methods and techniques will not suffice, since it deals primarily with objects which are 'self-identical, self-evident, ahistorical, and autonomous' (Drucker 2012), adjectives which could hardly be applied to literary texts.

4 Existing visualisations of collation results

Let us consider the various existing visualisations of collation output and explore to what extent they address the conditions outlined above. We can distinguish roughly five types of visualisation: alignment tables, parallel segmentation, synoptic viewers, variant graphs, and phylogenetic trees or 'stemmata'. A smaller example of a collation of two fragments from Woolf's A Sketch of the Past (holograph MS-VW-SoP and typescript TS1-VW-SoP) serves as illustration of the effect of the visualisations:

Witness 1 (MS-VW-SoP): with the boat train arriving, people talking loudly, chains being dropped, and the screws the beginning, and the steamer suddenly hooting

Witness 2 (TS1-VW-SoP): with the boat train arriving; with people quarrelling outside the door; chains clanking; and the steamer giving those sudden stertorous snorts

These two small fragments are transcribed in plain text format and subsequently collated with the software program CollateX. Unless indicated otherwise, the result from this collation forms the basis for the visualisation examples below.

4.1 Alignment table

An alignment table presents the text of the witnesses in linear sequence (either horizontally or vertically), making it well-suited to a study of the relationships between witnesses on a detailed level, but less so to acquiring an overview of patterns in revision. Note that 'aligned tokens' are not necessarily the same as 'matching tokens': two tokens may be placed above each other because they are at the same relative position between two matches, even though they do not constitute a match. For this reason, alignment tables often have additional markup (e.g. colours) to differentiate between matches and aligned tokens. The arrangement of the tokens is also one of the advantages of an alignment table: it shows at first glance the variation between tokens at the same relative position. In other words, this representation indicates tokens which match on a semantic level, such as synonyms or fragments with similar meanings, such as 'talking loudly' and 'quarrelling outside the door' (Fig. 2).

Fig. 2: Example of alignment table visualisation of 'MS1-VW-SoP' (W1) collated against 'TS1-VW-SoP' (W2) which, again, shows how synonyms which do not match are aligned anyway because of the matching tokens which surround them. Table generated by CollateX.

Ongoing research into the potential of an alignment table visualisation to explore intradocumentary variation (see Bleeker et al. 2017, visualisations created by Vincent Neyt) focuses on increasing the amount of information in an alignment table by incorporating intradocumentary variation in the cells.
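For readers who want to reproduce this step, here is a minimal sketch using the collatex Python package (assuming it is installed; the exact API may differ between versions) to collate the two fragments quoted above and print an ASCII alignment table.

```python
from collatex import Collation, collate

collation = Collation()
collation.add_plain_witness(
    'W1', 'with the boat train arriving, people talking loudly, chains being '
          'dropped, and the screws the beginning, and the steamer suddenly hooting')
collation.add_plain_witness(
    'W2', 'with the boat train arriving; with people quarrelling outside the door; '
          'chains clanking; and the steamer giving those sudden stertorous snorts')

# segmentation=False keeps one token per cell instead of merged phrases.
print(collate(collation, output='table', segmentation=False))
```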
The alignment table in Fig. 3 shows that witness 1 (Wit1) contains several paths; matching tokens are displayed in red.

Fig. 3: Alignment table visualisation showing intradocumentary variation in witness 1. The colour red is used to draw attention to the matching tokens, which is especially useful in the case of more or longer witnesses.

4.2 Synoptic viewers

A synoptic edition contains a visual representation of the collation results from the perspective of one witness, where the variants are indicated by means of a system of signs or diacritical marks. In contrast to an alignment table, a synoptic overview is more suitable for an overview examination of the patterns of variation. The following paragraphs discuss two ways of presenting textual variation synoptically: parallel segmentation and an inline apparatus. It may be clear that both are skeuomorphic in character, in the sense that they mimic the analogue examination and presentation of textual variants. This characteristic should not necessarily be considered negative, however, precisely because it is a tried and tested instrument for textual research.

4.2.1 Parallel segmentation

The term 'parallel segmentation' may be confusing, as it is also the name of the (TEI) encoding for a critical apparatus. In this context, parallel segmentation is used to describe the visualisation of textual variation in a side-by-side manner, often with the corresponding segments linked to one another. The quantity of online, open source tools for a parallel segmentation visualisation suggests that it is a popular way of studying textual variation (e.g. the Versioning Machine,4 the Edition Visualisation Technology – EVT – project,5 and the visualisation of Juxta Commons6). As Fig. 4 shows, parallel segmentation entails presentation of the witnesses as reading texts in separate panels which can be read vertically (per witness) or horizontally (interdocumentary variation across witnesses). Colours indicate the matching and non-matching segments.

Footnote 4: See http://v-machine.org/ (last accessed 2018, March 30).
Footnote 5: Downloadable at https://sourceforge.net/projects/evt-project/files/latest/download (last accessed 2018, March 30).
Footnote 6: See http://www.juxtasoftware.org/juxta-commons/ (last accessed 2018, March 30).

Fig. 4: Screen capture of the parallel segmentation visualisation of the Versioning Machine output of three versions of Walden (Henry David Thoreau): the base text of the Princeton edition, manuscript Version A, and manuscript Version B. The witnesses are displayed side-by-side, with cancelled text in witness Version A represented by strikethrough, added text by green, and matching text by highlight. In this example, the collation has been carried out manually and transcribed according to the TEI Parallel Segmentation method (Schacht 2016, 'Introduction').

To be clear: this parallel segmentation visualisation concerns the presentation of variance; it is not a collation method in and of itself. The segments are encoded by the editor, for instance using the TEI <app>/<rdg> construction to link matching segments. In contrast to the inline apparatus presentation (see 4.2.2 below), which uses a base text, parallel segmentation presents the witnesses as variations on one another. Most tools allow for an interactive visualisation in the sense that clicking on a segment in one witness highlights the corresponding segments in the other witness(es).
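The idea of linked corresponding segments can be illustrated with a tiny Python sketch (our own addition). The manual segmentation of the two Woolf fragments below is hypothetical; segments sharing an id are corresponding readings, which is what a tool highlights when the user clicks one of them.

```python
# Toy stand-in for TEI-style parallel segmentation data.
SEGMENTS = {
    'W1': [('s1', 'with the boat train arriving,'),
           ('s2', 'people talking loudly,'),
           ('s3', 'and the steamer suddenly hooting')],
    'W2': [('s1', 'with the boat train arriving;'),
           ('s2', 'with people quarrelling outside the door;'),
           ('s3', 'and the steamer giving those sudden stertorous snorts')],
}

def corresponding(link_id):
    """All readings that share a link id, across witnesses."""
    return {w: next(text for i, text in segs if i == link_id)
            for w, segs in SEGMENTS.items()}

# Side-by-side view, one row per linked segment pair:
for (i1, t1), (_, t2) in zip(SEGMENTS['W1'], SEGMENTS['W2']):
    print(f'{i1}  {t1:44} | {t2}')

print(corresponding('s2'))  # what a click on segment s2 would highlight
```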
4.2.2 Critical or inline apparatus

Conventionally, an apparatus accompanies a critically established text which figures as a base text. The apparatus is made up of a set of notes containing variant readings, often recorded in some shorthand using diacritical signs, witness sigla, and some context. Variant readings encoded according to the TEI guidelines can be generated as such footnotes, or the reader can select certain readings to be displayed or ignored. Alternatively, an inline apparatus entails a synoptic visualisation of the variant readings in the form of diacritical marks inside a reading text. This kind of synoptic overview can draw the reader's attention to the places in the text that underwent heavy revisions. A classic example of a synoptic visualisation is found in the Ulysses edition (Joyce 1984–1986), a presentation format which Hans Walter Gabler and Joshua Schäuble recently repeated digitally with the Diachronic Slider (Schäuble and Gabler 2016; Fig. 5). The clear advantage of a digital synoptic edition is that the diacritical signs can be replaced with visual indications which have a lower readability threshold than diacritical marks, such as different colours or a darker shade behind the tokens that vary in other witnesses (cf. the Faust edition).

Fig. 5 Visualisation of the inline apparatus of the Diachronic Slider of 'MS1-VW-SoP' collated against 'TS1-VW-SoP'. The text from 'MS1-VW-SoP' is visualised in red; the green text is of 'TS1-VW-SoP'. The coloured visualisation replaces the traditional diacritical signs

4.3 Variant graph

A variant graph is a collection of nodes and edges. It is to be read from left to right, top to bottom, following the arrows. This reading order makes it a directed acyclic graph (DAG): it can be read in one order only, without 'looping' back. In some visualisations, the text tokens are placed on the edges (e.g. Schmidt and Colomb 2009); in others, they are placed in the nodes (e.g. CollateX; Fig. 6). In contrast to the alignment table, there is no 'visual alignment' in the variant graph: matching tokens are merged. Only the variant text tokens are made explicit; witness sigla indicate which tokens belong to which witness. By following a path over nodes and edges, users can read the text of a specific witness and see where it corresponds with and diverges from other witnesses. One of the main advantages of a variant graph is that it does not impose one single order: in the visualisation, no path through the text is preferred over the other. The variant graph thus facilitates recording and structuring non-linear structures in manuscript texts, making it easier to visualise layers of writing without preferring one over the other. Because the variant graph is capable of including more information than, for instance, an alignment table, it is a useful visualisation with which to analyse the collation outcome in detail.

Fig. 6 Vertical variant graph visualisation of the comparison between 'MS1-VW-SoP' (W1) and 'TS-VW-SoP' (W2). Graph generated with CollateX
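The structure just described can be sketched as a small directed acyclic graph. The snippet below, using networkx with invented node identifiers and a simplified token selection from the Woolf fragments, is a conceptual model rather than CollateX's internal data structure.

```python
import networkx as nx

g = nx.DiGraph()
g.add_node("start"); g.add_node("end")
g.add_node("n1", token="with the boat train arriving")  # matching tokens merged
g.add_node("n2", token="people talking loudly")         # variant: W1 only
g.add_node("n3", token="with people quarrelling outside the door")  # W2 only
g.add_node("n4", token="and the steamer")               # merged again

g.add_edge("start", "n1", witnesses={"W1", "W2"})
g.add_edge("n1", "n2", witnesses={"W1"})
g.add_edge("n1", "n3", witnesses={"W2"})
g.add_edge("n2", "n4", witnesses={"W1"})
g.add_edge("n3", "n4", witnesses={"W2"})
g.add_edge("n4", "end", witnesses={"W1", "W2"})

def read_witness(g, siglum):
    """Read one witness by following only the edges carrying its siglum."""
    node, tokens = "start", []
    while node != "end":
        node = next(v for _, v, d in g.out_edges(node, data=True)
                    if siglum in d["witnesses"])
        if node != "end":
            tokens.append(g.nodes[node]["token"])
    return " ".join(tokens)

print(read_witness(g, "W2"))
```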
The vertical or horizontal direction of the variant graph depends on the tool or the preference of the user. Horizontally oriented variant graphs imitate to some extent the Western reading orientation (from left to right), while vertically oriented variant graphs appear to anticipate the reading habits of 'homo digitalis' (from top to bottom). In both cases, longer witnesses result in endless scrolling and a loss of overview. This was the reason for the TRAViz project to insert line breaks, based on the assumption that online readers prefer vertical scrolling but also like to be reminded that the text in the variant graph derives from a codex format (Jänicke et al. 2014; Fig. 7).

Fig. 7 Screen capture of the TRAViz variant graph visualization of a collation of Genesis 1:4. The size of the text indicates its presence in the witnesses

The variant graphs of CollateX in the figures directly above are non-interactive by design (since they are visual renderings of a collation output). However, the usefulness of interactive visualisations has been positively noted in several contributions (e.g. Andrews and Van Zundert 2013) and projects. TRAViz, for instance, lets users interact with the graph and adjust it to match their needs and interests, and the variant graphs generated by the Stemmaweb tool set7 allow for their nodes to be connected, input to be adjusted, and edges to be annotated with additional information about the type of variance. Such features emphasise the visualisation's double function as a means of communication and a scholarly instrument: on the one hand, it allows the user to clarify and communicate her argument about textual variation; on the other, the possibility of adjusting the visualisation, and thus the representation of variation, foregrounds the idea that the output of a tool is open to interpretation.

7 Stemmaweb brings together several tools for stemmatology: https://stemmaweb.net/ (last accessed on 2018, April 27).
4.4 Phylogenetic trees or stemmata

One final type of visualisation is the phylogenetic tree (also known as 'stemma codicum' or 'stemmata'). Stemmata are not a collation method: they are created by the scholar or generated based on collation output like alignment tables or variant graphs. For that reason, stemmata do not directly concern the visualisation of collation output, primarily because the phylogenetic tree is used to store and explore the relationships between witnesses (and not between tokens). Nevertheless, this kind of tree provides a valuable perspective on visualising textual variation on a macro level: even at first glance, the tree conveys a good deal of information. The arrangement of the nodes within a stemma is meaningful; nodes close together in the stemma imply a high similarity between the witnesses. Each node in a tree represents a witness, and the edges which connect the nodes represent the process of copying one witness to another (a process sensitive to mistakes and thus variation). Stemmata are traditionally rooted, the witness represented as root being the 'archetype', which implies that all witnesses derive from one and the same manuscript (Fig. 8). More recently, unrooted trees have been introduced that do not assume one 'ancestor' or archetype witness and simply represent relationships between witnesses (Fig. 9).8

Fig. 8 A complex stemma in the form of a rooted directed acyclic graph (DAG), with the α in the top right corner representing the archetype witness from which other witnesses may derive (source: Andrews and Mace 2012)

Fig. 9 Example of an unrooted phylogenetic tree (source: Roos and Heikkilä 2009)

A visualisation method similar to (and probably inspired by) stemmata or phylogenetic trees is the genetic graph, in which the genetic relationships between documents related to a work are modelled (see Burnard et al. 2010, §4.2; Fig. 10). Nodes represent documents; the edges may be typed to indicate the exact relationship between documents (e.g. 'influence'), and they are usually directed so as to convey the chronology of the text's development. A genetic graph is also not a direct visualisation of collation output, but a visual representation of the editor's argument about the text's development and her construction of the genetic dossier. With this overview representation, the editor may point to the existence of textual fragments like paralipomena, which were previously ignored or delegated to footnotes, critical apparatuses, or separate publications. The kind of macrolevel visualisations provided by stemmata or genetic graphs present the necessary overview and invite more rigorous exploration. Diagrams, graphs, or coloured squares add new perspectives to the various ways in which we look at text.

Fig. 10 Possible genetic graph visualisation proposed by the TEI Workgroup on Genetic Editions (Burnard et al. 2010), with the nodes A to Z representing different documents in the genetic dossier of a hypothetical work

8 The Stemmaweb toolset allows users to root and reroot their stemmata to explore different outcomes, see https://stemmaweb.net/?p=27 (last accessed 2018, March 25).

5 HyperCollate

HyperCollate, a collation tool newly developed at the R&D department of the Humanities Cluster of the Dutch Royal Academy of Science, examines textual variation in an inclusive way using a hypergraph model for textual variation. HyperCollate is an implementation of TAG, the data model also developed at the R&D department (Haentjens Dekker and Birnbaum 2017). A discussion of the collation tool's technical specifications is not within the scope of the present article (see Bleeker et al. 2018); for now, it suffices to know that a hypergraph differs from traditional graphs, whose edges can connect only two nodes with each other, because the edges in a hypergraph can connect more than two nodes with one another. These 'hyperedges' connect an arbitrary set of nodes, and the nodes in turn can have multiple hyperedges. Conceptually, then, the hyperedges in the TAG model can be considered as multiple layers of markup or information on a text.
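The difference between ordinary edges and hyperedges can be made concrete with a deliberately naive sketch, which assumes nothing about HyperCollate's actual implementation: each markup layer is modelled as a named set of token nodes, and one token may sit under several layers at once.

```python
# Illustrative tokens only; not a transcription from the project.
tokens = {1: "The", 2: "sun", 3: "had", 4: "not", 5: "yet", 6: "risen"}

# An ordinary graph edge joins exactly two nodes; a hyperedge may span any
# subset of nodes, so markup layers of arbitrary extent can coexist.
hyperedges = {
    "s":   {1, 2, 3, 4, 5, 6},  # a sentence layer over all six tokens
    "del": {4, 5},              # a revision layer over 'not yet'
}

def layers_of(node_id):
    """Return the names of all markup layers covering one token node."""
    return {name for name, nodes in hyperedges.items() if node_id in nodes}

print(layers_of(4))  # -> {'s', 'del'}: two simultaneous layers on one token
```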
The hypergraph for variation used by HyperCollate is an evolved model based on the variant graph. By treating texts as a network, HyperCollate is able to process intradocumentary variation and store multiple hierarchies in an idiomatic manner. In other words, because HyperCollate does not require TEI/XML transcriptions to be transformed into plain text files, TEI tags indicating revision like <add> and <del> can be used to improve the collation result. HyperCollate accordingly uses the valuable intelligence of the editor expressed in markup to improve the alignment of witnesses. Since the internal data model of HyperCollate is a hypergraph, the input text can be an XML file and does not need to be transformed into plain text. The comparison of two data-centric XML files is relatively simple, and it is even a built-in feature of the oXygen XML editor, but, as explained above, a typical TEI-XML transcription of a literary text with intradocumentary variation constitutes partially ordered information. In order to process this kind of information, HyperCollate first transforms the TEI-XML witnesses into separate hypergraphs and then collates the hypergraphs. Graph-to-graph collation ensures that the input text can be processed taking into account both the textual content and the structure of the text. For each witness, HyperCollate looks at the witness' text, the different paths through the witness' text, and the structure of the witness, and subsequently compares the witnesses on all these levels.

Accordingly, the output of HyperCollate contains a plethora of information. Similar to CollateX,9 a widely used text collation tool, the output of HyperCollate could be visualised in different ways (e.g., an alignment table or a variant graph). By default, HyperCollate's output is visualised as a variant graph, primarily because a variant graph does not have a single order, so it is relatively straightforward to represent the different orders of the tokens as individual paths. The question is how (and where) to include the additional information in the visualisations. A variant graph may be more flexible regarding the token order, but the nodes and edges can only contain so much extra information, as Fig. 12 below shows. A favourable consequence of HyperCollate is that, in the case of intradocumentary variation, each path through a witness is considered equally important. This feature is in stark contrast with current approaches to intradocumentary variation, which usually entail a manual selection of one revision stage per witness (see Bleeker 2017, 110–113).

By means of illustration, let us take a look at another collation of two small fragments from Woolf's Time Passes containing intradocumentary structural variation. The fragments are manually transcribed in TEI/XML and simplified for reasons of clarity. The XML files form the input of HyperCollate. Witness 1 contains an interesting addition (highlighted): Woolf added a metamark and the number '2' in the margin. The transcriber interpreted the added number as an indication that the running text should be split up and a new chapter should be started, so she tagged the number with the <head> element.10 This means that the tokens of this witness can be ordered in two ways: excluding the addition and including the addition. Furthermore, the <head> element in witness 1 is at the same relative position as the <head> element in witness 2, so that the two headers are a match (even though their content is not).
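Why an <add> yields two token orders can be shown with a short, hypothetical stand-in for the witness-1 transcription (the metamark and all other markup are omitted, and the wording is invented rather than Woolf's):

```python
import xml.etree.ElementTree as ET

w1 = "<s>slept. <add><head>2</head></add> Then the wind blew</s>"
root = ET.fromstring(w1)

def tokens(elem, include_additions):
    """Walk the tree, either skipping or keeping <add> subtrees."""
    out = []
    if elem.text:
        out += elem.text.split()
    for child in elem:
        if child.tag != "add" or include_additions:
            out += tokens(child, include_additions)
        if child.tail:
            out += child.tail.split()
    return out

print(tokens(root, include_additions=False))  # path 1: text without revision
print(tokens(root, include_additions=True))   # path 2: text with the addition
```

Both orderings are legitimate readings of the same document, which is why HyperCollate treats them as two equally weighted paths through the witness.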
Figure 11 shows the variant graph visualisation of the output. Note that the paths through the witnesses can be read by following the witness sigla on the edges (w1, w1:add, w2); the markup is represented as a 'hyperedge'11 on the text nodes. An alternative way of representing HyperCollate's output in a variant graph is by enclosing both linguistic and structural information within the text nodes (Fig. 12).

Fig. 11 Alternative, black-and-white visualisation of HyperCollate output, with the <head> markup represented as a hyperedge on the nodes. Other markup is not visualised

Fig. 12 Alternative visualisation of HyperCollate output, with each node containing the XPath-like information about the place of the text in the XML tree (e.g. the path /TEI/text/div/p/s/ indicates that the ancestors of a text node are, bottom up, an <s> element, a <p> element, a <div> element, the <text> element and the <TEI> element)

The visualisations of the collation hypergraph in Figs. 11 and 12 represent the collation output of two small and simplified witnesses. It may be clear that collating two larger TEI/XML transcriptions of a literary text, each containing several stages of revision and multiple layers of markup, results in a collation hypergraph that, in its entirety, cannot be visualised in any meaningful way. At the same time, the various types of information contained by the collation hypergraph are of instrumental value to a deeper study of the textual objects. For that reason, HyperCollate offers not one specific type but rather lets the user select from a wide variety of visualisations, ranging from alignment tables to variant graphs. In selecting the output visualisation, the user decides which information she prefers to see and which information can be ignored. She may consider an alignment table if she is primarily interested in the relationships between witnesses on a microlevel, or a variant graph if an insightful overview of the various token orders is more relevant to her research. Furthermore, she may decide which markup layers she wants to see: arguably, knowing that every token is part of the root element 'text' is of less concern than detecting changes in the structure of sentences. Making such decisions does require the user to have a basic knowledge of the underlying dataset and a clear idea of what she is looking for.

9 Haentjens Dekker, Ronald and Gregor Middell. CollateX. https://collatex.net/.
10 Arguably the transcriber could have added a <div>, but the TEI Guidelines do not allow for a <div> to be placed within an <add>. Nevertheless, contrasting the structure of witness 1 with the structure of witness 2 already alerts the reader to structural revisions and invites a closer inspection.
11 The edges in a hypergraph are called hyperedges. In contrast to edges in a DAG, hyperedges can connect a set of nodes.
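The XPath-like node labels of Fig. 12 can be computed directly from a transcription. The sketch below uses lxml's getpath() helper on an illustrative, namespace-free TEI-shaped fragment.

```python
from lxml import etree

doc = etree.fromstring(
    "<TEI><text><div><p><s>with the boat train arriving</s></p></div></text></TEI>")
tree = etree.ElementTree(doc)

# getpath() returns the path of ancestors for an element, root first.
for s in doc.iter("s"):
    print(tree.getpath(s), "->", s.text)
# -> /TEI/text/div/p/s -> with the boat train arriving
```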
6 Requirements for visualising textual variance

This overview allows us to draw a number of conclusions regarding the visualisation of textual variation and the extent to which each visualisation considers the various dimensions of the textual object. We have seen that intradocumentary variation is as yet not represented by default; the editor is required to make certain adjustments to the visualisation. Alignment tables and parallel segmentation can be extended to some extent, for instance by using colours and visualising deletions and additions. Regular variant graphs may include intradocumentary variation if the different paths through the texts are collated as separate witnesses;12 only HyperCollate's variant graph output includes both intra- and interdocumentary variation. Structural variation is currently only taken into account by HyperCollate and consequently only visualised in HyperCollate's variant graph. While the added value of studying this type of variation may be clear, it remains a challenge to visualise both linguistic/semantic and structural variation in an informative and clear manner. Fig. 11 may clearly convey the structural difference between witness 1 and witness 2 (i.e., the <head> element), but the raw collation output contains much more information which, if included, would probably overburden the user. A promising feature of visualisations intended to further explorations of textual variation is interactivity. One can imagine, for instance, the added value of discovering promising sites of revision through a graph representation, zooming in, and annotating the relationships between the witness nodes.

Acknowledging the various strengths and shortcomings of existing visualisations, we propose that there is not one, all-encompassing visualisation that pays heed to all properties of text. Instead, each visualisation highlights a different aspect of textual variance or provides another perspective on text. Each perspective puts another textual characteristic before the footlights, while (ideally) making users aware of the fact that there is much more happening behind the familiar scenes. As Tanya Clement argues, focusing on one aspect can be instrumental in our understanding of text, helping the user 'get a better look at a small part of the text to learn something about the workings of the whole' (Clement 2013, §3). Indeed, it seems that multiple and interactive representations (cf. Andrews and Van Zundert 2013; Jänicke et al. 2014; Sinclair et al. 2013) are a promising direction.

12 This practice leads to some problematic issues in the case of complex revisions, see De Bruijn et al. 2007; Bleeker 2017, 111–114.

7 Visual literacy and code criticism

The process of visualising data is a scholarly activity in line with the process of modelling; hence the resulting visualisation influences the ways in which a text can be studied. Collation output can be visualised in different ways, which raises essential questions regarding the assessment and evaluation of visualisations. The function of a digital visualisation is two-fold: on the one hand, it serves as a means of communication, and on the other hand it provides an instrument of research. The communicative aspect implies that visualisation is first and foremost an affair of the scholar(s) creating the visualisations. The diversity of visualisations, each of which highlights different aspects of the text, reflects the hermeneutic aspect inherent to humanist textual research. Thus, by using visualisation to foreground textual variation, editors are able to better represent the multifocal nature of text. In order to choose an appropriate representation of collation output, then, scholars need to know what argument they want to make about their data set, and how the visualisation can support that argument by presenting and omitting certain information. Accordingly, they can estimate the value of a visualisation for a specific scholarly task and expose the inevitable bias embedded in technology.

When a visualisation is used as an instrument of study and exploration, it is vital to be critical about its workings and its (implicit) bias. This includes an awareness of which elements the visualisation highlights and, just as important, which elements are ignored. As Martyn Jessop has pointed out, humanist education often overlooks training in 'visual literacy', which can be defined as the effective use of images to explore and communicate ideas (Jessop 2008, 282). Visual literacy, then, denotes an understanding of the fact that a visualisation represents a scholarly argument. Jessop identifies four principles that facilitate the understanding of a visualisation: aims and methods, sources, transparency requirements, and documentation (Jessop 2008, 290). The documentation of a visualisation of collation output, then, could describe what research objective(s) it aims to achieve, on what witnesses it is based, and how these witnesses have been transcribed, tokenized, and aligned.13 Another suitable rationale for critically evaluating the visualisation process is offered by the domains of 'tool criticism' or 'code criticism' (Traub and van Ossenbruggen 2015; Van Zundert and Dekker 2017, 125).

13 Although the value of documenting a tool's operations is uncontested, making use of documentation is not yet part of digital humanities' best practice. In that respect, it is worthwhile to keep in mind the RTFM-mantra of software development ('Read the F-ing Manual').
Tool criticism assumes that the code base of scholarly tools reflects certain scholarly decisions and assumptions, and it raises critical questions in order to further awareness of the relationships between code and scholarly intentions. Questions include (but are not limited to) 'is documentation on the precision, recall, biases and pitfalls of the tool available?', or 'is provenance data available on the way the tool manipulates the data set?' (Traub and van Ossenbruggen 2015). Indeed, when it comes to evaluating the visualisation of automated collation results, one may well ask to what extent these witnesses, and the ways in which they have been processed by the collation tool, are subject to bias and interpretation. Like transcription (and any operation on text, for that matter), collation is not a neutral process: it is subject to the influence of the editor. This becomes clear if we look at the different steps in the collation workflow as identified by the Gothenburg model (GM; 2009). The GM consists of five steps: tokenisation, normalisation, alignment, analysis, and visualisation. For each step, the editor is required to make decisions, e.g. 'what constitutes a token?', 'do I normalise the tokens and, if so, do I present the original and the normalised tokens?', or 'what is my definition of a match and how do I want to align the tokens?'
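The editorial character of these decisions is easy to demonstrate. In the schematic sketch below, each of the first Gothenburg stages is isolated in its own function, and every concrete choice shown (whitespace tokenisation, lowercasing, stripping punctuation, exact-match equality) is an assumption that could legitimately be made differently.

```python
def tokenise(text):                  # decision: what constitutes a token?
    return text.split()

def normalise(token):                # decision: whether and how to normalise
    return token.lower().strip(",;.")

def is_match(a, b):                  # decision: what counts as a match?
    return normalise(a) == normalise(b)

w1 = tokenise("with the boat train arriving, people talking loudly")
w2 = tokenise("with the boat train arriving; with people quarrelling")

# Candidate matches handed on to the alignment stage.
matches = [(i, j) for i, a in enumerate(w1) for j, b in enumerate(w2)
           if is_match(a, b)]
print(matches)
```

Changing any one of these functions changes the candidate matches, and therefore the alignment and the final visualisation.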
As Joris Van Zundert and Ronald Haentjens Dekker emphasise, not all decisions made by collation software are easily accessible to the user, simply because they are the result of 'incredibly complex heuristics and algorithms' (Van Zundert and Dekker 2017, 123). To illustrate this, we can look at the decision tree used by HyperCollate to calculate the alignment of two simple sentences. The graphs in Figs. 13 and 14 are complementary and show all possible decisions the alignment algorithm of HyperCollate can take in order to align the tokens of witness A and witness B, and the likely outcomes of each decision. An evident downside of such trees is that they become very large very quickly. For that reason, we see them as primarily useful for editors keen to find out more about the alignment of their complex text.

Fig. 13 The collation of witness B against witness A, with potential matches indicated in red

Fig. 14 The decision tree for collating witness B against witness A. Chosen matches indicated in bold, discarded matches rendered as strike-through; others are potential matches. Arrow numbers indicate the number of matches discarded since the root node (this number should be as low as possible). Red leaf nodes indicate a dead end, orange leaf nodes a 'sub-optimal' match, and green leaf nodes indicate an optimal set of matches

The GM pipeline is not strictly chronological or linear. Although automated collation does start with tokenization, not every user insists on normalising the tokens, and a step can be revisited if the outcome is considered unsatisfactory or not in line with the user's expectations. Though visualisation comes last in the GM model, this article has argued that it is surely not an afterthought to collation. In fact, the visual representation of textual variance entails an additional form of information modelling: editors are compelled to give physical form to an abstract idea of textual variation which exists at that point only in the transcription and (partly) in the collation result. Using the markup to obtain a more optimal alignment, as HyperCollate does, only emphasises this point: marking up texts entails making explicit the knowledge and assumptions that would otherwise have been left implicit. Visualising the markup elements, then, implies that these assumptions, and thus a particular scholarly orientation to text, are foregrounded.

8 Conclusions

The present article investigated several methods of representing textual variation: alignment tables, synoptic viewers, and graphs. Two small textual fragments containing in-text variation and structural variation formed the example input for the alignment table and the variant graph visualisation. The fragments were transcribed in TEI/XML and subsequently collated with CollateX and HyperCollate respectively. In addition, we looked at existing visualisations of the Versioning Machine and the Diachronic Slider. These visualisations were judged on their potential to represent different types of variance in addition to the regular interdocumentary variation: intradocumentary, linguistic, and structural. Visualising these aspects of text paves the way for a deeper, more thorough, and more inclusive study of the text's dimensions.

We concluded that there is currently no ideal visualisation, and that the focus should not be on creating an ideal visualisation. Instead, we propose appreciating the multitude of possible visualisations which, individually, amplify a different textual property. This requires us to appreciate what a visualisation can do for our research goals and, furthermore, to evaluate its effectiveness. To this end, methods from code criticism and visual literacy can be of aid in furthering an understanding of the digital representations of collation output as rhetorical devices. We propose evaluating the usefulness of a visualisation on the basis of the following principles:

1) Interactivity. This may range from annotating the edges of a graph, adjusting the alignment by (re)moving nodes, to alternating between macro- and microlevel explorations of variance.

2) Readability and scalability. Especially in a case of many and/or long witnesses, alignment tables and variant graphs become too intricate to read: their function becomes primarily to indicate complex revision sites.

3) Transparency of the textual model. The visualisation not only represents textual variance, but simultaneously makes clear what scholarly model is intrinsic to the collation. It needs to be clear which scholarly perspective serves as a model for transcription and representation.

4) Transparency of the code. Visualisations represent the outcome of an internal collation process which is usually not available to the general user audience. A clear, step-by-step documentation of the algorithmic process helps users understand what scholarly assumptions are present in the code, what decisions have been made, what parameters have been used, and how these assumptions, decisions, and parameters may have influenced the outcome. Decision trees may be of additional use. This applies particularly to interactive visualisations: if it is possible to adjust parameters or filters, these adjustments need to be made explicit.
Digital visualisation is sometimes regarded as an afterthought in humanities research, or even considered with a certain degree of suspicion. Some consider it a mere technical undertaking, an irksome habit of some digital humanists who recently learned to work with a flashy tool. Yet if used correctly, these flashy tools may also function as instruments of study and research, which means they should be evaluated accordingly. Within the framework of visualising collation output, visual literacy is key. Having a critical understanding of the research potential of visualisations facilitates our research into textual variance. After all, these representational systems produce an object which we use for research purposes; we need to take seriously the ways in which they do this. In addition to communicating a scholarly argument, digital visualisations of collation output foreground textual variation. The collation tool HyperCollate facilitates the examination of a text from multiple perspectives (some unfamiliar, some inspiring, some contrasting, but all of them highlighting a particular element of interest). This freedom of choice invites scholars to reappraise prevalent notions and continue exploring the dynamic nature of text in dialogue with other disciplines. Digital visualisations, then, give us a means to take variants out of the graveyard and into an environment in which they can be fully appreciated.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

Andrews, T., & Mace, C. (2012). Trees of texts: Models and methods for an updated theory of medieval text stemmatology. Paper presented at the Digital Humanities Conference, 2012, July 16–20, University of Hamburg. Abstract available at http://www.dh2012.uni-hamburg.de/conference/programme/abstracts/trees-of-texts-models-and-methods-for-an-updated-theory-of-medieval-text-stemmatology.1.html. Accessed 23 Dec 2018.

Andrews, T., & Van Zundert, J. (2013). An interactive interface for text variant graph models. Paper presented at the Digital Humanities Conference, 2013, July 16–19, University of Nebraska-Lincoln. Abstract available at http://dh2013.unl.edu/abstracts/ab-379.html. Accessed 23 Dec 2018.

Bleeker, E. (2017). Mapping invention in writing: Digital infrastructure and the role of the genetic editor. Ph.D. dissertation, University of Antwerp.

Bleeker, E., Buitendijk, B., Dekker, R. H., Neyt, V., & van Hulle, D. (2017). The challenges of automated collation of manuscripts. In Advances in digital scholarly editing (pp. 241–249). Leiden: Sidestone Press.

Bleeker, E., Buitendijk, B., Dekker, R. H., & Kulsdom, A. (2018). Including XML markup in the automated collation of literary texts. In Proceedings of the XML Prague conference 2018, February 9–11, pp. 77–95.

Burnard, L., Jannidis, F., Middell, G., Pierazzo, E., & Rehbein, M. (2010). An encoding model for genetic editions. Available at http://www.tei-c.org/Activities/Council/Working/tcw19.html (last accessed 2018, March 30).

Clement, T. (2013). Text analysis, data mining, and visualizations in literary scholarship. In Literary studies in the digital age: An evolving anthology. https://doi.org/10.1632/lsda.2013.0.
De Bruijn, P. (2002). Dancing around the grave. A history of historical-critical editing in the Netherlands. In Plachta, B., & Van Vliet, H. T. M. (Eds.), Perspectives of scholarly editing / Perspektiven der Textedition (pp. 113–124). Berlin: Weidler Buchverlag.

Dillen, W. (2015). Digital scholarly editing for the genetic orientation: The making of a genetic edition of Samuel Beckett's works. Ph.D. thesis, University of Antwerp.

Drucker, J. (2012). Humanistic theory and digital scholarship. In M. Gold (Ed.), Debates in the digital humanities (pp. 85–96). Minneapolis: University of Minnesota Press.

Haentjens Dekker, R., & Birnbaum, D. J. (2017). It's more than just overlap: Text as graph. Presented at Balisage: The Markup Conference 2017, Washington, DC, August 1–4, 2017. In Proceedings of Balisage: The Markup Conference 2017. Balisage Series on Markup Technologies, vol. 19. https://doi.org/10.4242/BalisageVol19.Dekker01.

Jänicke, S., Gessner, A., Büchler, M., & Scheuermann, G. (2014). Design rules for visualizing text variant graphs. In Proceedings of the Digital Humanities 2014, edited by Clare Mills, Michael Pidd and Jessica Williams.

Joyce, J. (1984–1986). Ulysses: A critical and synoptic edition, prepared by Hans Walter Gabler with Wolfhard Steppe and Claus Melchior, 3 vols. New York & London: Garland Publishing Inc.

Jessop, M. (2008). Digital visualization as a scholarly activity. Literary and Linguistic Computing, 23(3), 281–293.

Roos, T., & Heikkilä, T. (2009). Evaluating methods for computer-assisted stemmatology using artificial benchmark data sets. Literary and Linguistic Computing, 24(4), 417–433.

Schacht, P. (2016). 'Introduction', in: Thoreau, Henry David. Walden: A fluid-text edition. Digital Thoreau. http://digitalthoreau.org/fluid-text-toc. Accessed 27 May 2019.

Schäuble, J., & Gabler, H. W. (2016). Visualising processes of text composition and revision across document borders. Paper presented at the symposium Digital Scholarly Editions as Interfaces, Graz, Austria, September 22–23.

Schmidt, D., & Colomb, R. (2009). A data structure for representing multi-version texts online. International Journal of Human-Computer Studies, 67(6), 497–514.

Sinclair, S., Ruecker, S., & Radzikowska, M. (2013). Information visualization for humanities scholars. In Literary studies in the digital age: An evolving anthology, edited by Kenneth Price and Ray Siemens. Available at https://dlsanthology.mla.hcommons.org/information-visualization-for-humanities-scholars. Accessed 23 Dec 2018.

Traub, M., & van Ossenbruggen, J. (2015). Workshop on tool criticism in the digital humanities. CWI Techreport, July 1, 2015. Available at https://pdfs.semanticscholar.org/d337/ce558c2fd1d8be793786c9cfc3fab6512dea.pdf. Accessed 27 May 2019.

Vanhoutte, E. (1999). Where is the editor? Human IT, 3(1), 197–214.

Van Zundert, J., & Dekker, R. H. (2017). Code, scholarship, and criticism: When is code scholarship and when is it not? Digital Scholarship in the Humanities, 32, 121–133.
work_exnpbfjwuzdrnogl2pkv3q6n7e ---- TMWG_posterFweb

THE BACKBONE THESAURUS USER STORIES

BBT ID

The Backbone Thesaurus (BBT) is the research outcome of work undertaken by the Thesaurus Maintenance WG in an effort to design and establish a coherent overarching meta-thesaurus for the Humanities [1]. It is a faceted classification scheme that favors a loose integration of multiple thesauri, by offering a small set of top-level concepts (facets and hierarchies) for specialist thesauri terms to map to.

Curation

The BBT [2] is systematically curated by a cross-disciplinary team of editors coming from organisations participating in the TMWG (AA, FORTH, DAI, FRANTIQ/CNRS), through BBTalk, an online editing and communication tool designed to support collaborative, interdisciplinary development and extension of thesauri.

Partner vocabularies and thesauri

The controlled vocabularies/thesauri that have been mapped to the BBT to this day are: the iDAI.welt Thesaurus [3], the DYAS Humanities Thesaurus [4], the Parthenos Vocabularies [5] and the Language of Binding Thesaurus [6]. Members of the working group are working towards integrating the PACTOLS [7], the Taxonomy of Digital Research Activities in the Humanities (TADiRAH) [8] and the Arts and Architecture Thesaurus [9] with the BBT.

Why adopt the BBT

- BBT has a logical and easily accessible structure
- It makes use of a small number of top-level concepts
- It allows the subsumption of any local thesaurus: scholars are not required to quit using their terms of preference
- It promotes objectivity and interdisciplinarity: it allows integration of terms from different scientific fields and enables cross-disciplinary resource discovery
- It is a community-driven initiative that offers peer scientific support
- It can also serve as a basis for thesaurus building (and restructuring)

Benefits from joining the Thesaurus Federation

H. Goulis, Academy of Athens; E. Tsouloucha, ICS-FORTH

Current Main Editors: Martin Doerr, Helen Katsiadakis, Helen Goulis, Eleni Tsouloucha, Chrysoula Bekiari, Lida Charami, Gerasimos Chrysovitsanos, Camilla Colombi, Patricia Kalafata, Annika Kirscheneder, Blandine Nouvel, Evelyne Sinigaglia, Yorgos Tzedopoulos

References: Thesaurus Maintenance Working Group (2019). DARIAH Backbone Thesaurus (BBT): Definition of a model for sustainable interoperable thesauri maintenance, Version 1.2.2. Greece: May 2019. https://www.backbonethesaurus.eu/sites/default/files/DARIAH_BBT%20v%201.2.2%20draft%20v1.pdf; Georgis, Ch., Bruseker, G., & Tsouloucha, E. (2019).
BBTalk: An online service for collaborative and transparent thesaurus curation. ERCIM News 116, Special theme: Transparency in Algorithmic Decision Making. https://ercim-news.ercim.eu/en116/r-i/bbtalk-an-online-service-for-collaborative-and-transparent-thesaurus-curation; Daskalaki, M., & Charami, L. (2017). A Back Bone Thesaurus for Digital Humanities. ERCIM News 111, October 2017, Special theme: Digital Humanities. https://ercim-news.ercim.eu/en111/special/a-back-bone-thesaurus-for-digital-humanities; Daskalaki, M., & Doerr, M. (2017). Philosophical background assumptions in digitized knowledge representation systems. Dia-noesis: A Journal of Philosophy, Issue 3, pp. 17-28. https://www.backbonethesaurus.eu/sites/default/files/Philosophical%20background%20assumptions.pdf; Thesaurus Maintenance Working Group (2015). Thesaurus Maintenance Methodological Outline. Greece, 2015. https://www.backbonethesaurus.eu/sites/default/files/workingpaperonthesaurusmaintenance29_05_2015.pdf.

This work is licensed under a Creative Commons Attribution 4.0 International Licence.

Scan or go to https://youtu.be/QdDEGN-jiRY to watch BBT User stories.

"In the reality of the data BBT is one of our major goals and will bring together 190 years of cataloguing and tagging of knowledge and creating knowledge systems." Reinhard Förtsch (DAI)

"We had an old fashioned thesaurus, we realised we have to change its structure and make a step toward a conceptual framework." "The BBT improved the quality of our work." "We flagged up a number of redundancies in our thesaurus which could not be resolved before adopting BBT." Blandine Nouvel (FRANTIQ-CNRS)

"When the DAI is doing world archaeology nowadays we try to become interoperable. With the BBT we had a target. And that was the great thing. The greatest learning curve for people entering the digital world is that you need clean schemes." "In the future the BBT will have a very central role in the way DAI presents its data on the web with the FAIR implementation. BBT is one of the hallmarks of this kind of thinking."

Links

[1] https://www.backbonethesaurus.eu/
[2] https://vocabs.dariah.eu/backbone_thesaurus/en/
[3] http://thesauri.dainst.org/_fe65f286
[4] https://humanitiesthesaurus.academyofathens.gr
[5] https://isl.ics.forth.gr/parthenos_vocabularies/
[6] https://www.ligatus.org.uk/lob/
[7] https://pactols.frantiq.fr/opentheso/
[8] http://tadirah.dariah.eu/vocab/
[9] https://www.getty.edu/research/tools/vocabularies/aat/

DARIAH-EU Virtual Annual Event 2020: Scholarly Primitives, November 10-13, 2020

Evelyne Sinigaglia (FRANTIQ-CNRS)

work_ez62sq6rm5bixbwx6qakugt6f4 ----

Core or Periphery? Digital Humanities from an Archaeological Perspective

Jeremy Huggett (Archaeology, School of Humanities, University of Glasgow)

Abstract

The relationship between Digital Humanities and individual humanities disciplines is difficult to define given the uncertainties surrounding the definition of Digital Humanities itself. An examination of coverage within Digital Humanities journals narrows the range but at the same time emphasises that, while the focus of Digital Humanities might be textual, not all textually-oriented disciplines are equally represented. Trending terms also seem to suggest that Digital Humanities is more of a label of convenience, even for those disciplines most closely associated with Digital Humanities.
From an archaeological perspective, a relationship between Digital Archaeology and Digital Humanities is largely absent, and the evidence suggests that each is peripheral with respect to the other. Reasons for this situation are discussed, and the spatial expertise of Digital Archaeology is reviewed in relation to Digital Humanities concerns regarding the use of GIS. The conclusion is that a closer relationship is possible, and indeed desirable, but that a direct conversation between Digital Humanities, Digital Archaeology and humanities geographers needs to be established.

Determining scope

From a traditional humanities perspective, it can often seem as if Digital Humanities (DH) is not only the new kid on the block but also the monster that is garnering all the attention and sucking up available research funding. DH is seen as being better placed to respond to the kind of large-scale collaborative research programmes increasingly favoured by funding bodies (for example, Barker et al 2012, 189). So, from an archaeological perspective, what is the scope of DH? And what is the nature of its relationship with the individual humanities disciplines served by DH?

Determining the scope of DH is immediately made difficult by the lack of a clear-cut definition of what DH actually is. The annual Day of Digital Humanities, with its now traditional request for definitions of the digital humanities, rather underlines this situation, as does the equally traditional range of responses producing almost as many different definitions as there are scholars who responded. With perhaps one exception, none of the definitions offered in 2012 identified which fields or humanities disciplines came under the DH banner: the majority are content to leave the 'humanities' part of DH undefined, with plenty of references to broad interdisciplinarity, big tents, and traditional humanities. One contributor - Lisa McAulay - suggests that DH relates to a cluster of subject areas - literature, languages, linguistics, history, classics, anthropology, and archaeology. None in the list are surprising, although the absence of philosophy and the performing arts might be noted.

Evaluating coverage

The relative importance of individual humanities disciplines within the Digital Humanities can be estimated by looking at the appearance of each disciplinary term within a range of DH journals. This is admittedly a crude analysis, based on the number of papers within which a term occurs rather than the disciplinary focus of each paper, but it serves to provide an impression of the coverage of each journal.
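A sketch of the kind of counting involved is given below, under stated assumptions: the corpus is a hypothetical list of paper texts for one journal, and a 'hit' is a paper in which a disciplinary term occurs at least once.

```python
TERMS = ["literature", "linguistics", "history", "classics",
         "philosophy", "archaeology"]

def coverage(paper_texts):
    """Percentage of total hits per term, for one journal's papers."""
    hits = {term: sum(1 for text in paper_texts if term in text.lower())
            for term in TERMS}
    total = sum(hits.values()) or 1
    return {term: round(100 * n / total, 1) for term, n in hits.items()}

# e.g. coverage(dhq_papers) -> figures comparable across journals,
# because each journal's hits are normalised to 100%.
```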
Figure 1: Distribution of papers within Digital Humanities journals, expressed as a percentage of total hits. (IJHAC does not include occurrences in its predecessor History and Computing; PMLA only considers papers published since 2002.)

Some of the results in Figure 1 are surprising: for instance, 87% of the hits in Computers and the Humanities (published 1966-2004) related to literature and linguistics, almost exactly mirrored in its successor publication, Language Resources and Evaluation, whereas Literary and Linguistic Computing displays a rather more balanced set of results. The International Journal of Humanities and Arts Computing, perhaps reflecting its origins in the journal History and Computing, leans towards history and literature, but also had the highest proportion of references to archaeology (7%) - double that of the next highest ranked for archaeology (Digital Humanities Quarterly). Digital Humanities Quarterly probably displays the strongest representation across the subjects, but still retains a significant leaning towards literature and history. This underlines the close association of DH with literature, linguistics, and history, and suggests a rather different relationship with other humanities subjects, if there is one at all.

So what lies behind this apparent focus on literature, linguistics and history? Does the lack of reference to other humanities disciplines represent a lack of interest in, or relevance of, digital methods in those areas? Is the disciplinary scope of DH much smaller than might have been expected? External perceptions of DH tend to view it as a text-based subject, and various DH scholars have pointed to the privileged position of text within the field of DH. For example, Pilsch suggests that "Digital humanities is, ultimately, a way of doing textual criticism. In fact ... we can suggest that digital humanities is a specialized set of assumptions about how texts work and what makes them interesting" (2012, 5). Liu defines DH broadly as combining 'humanities computing' or 'text-based' digital humanities and new media studies (2012, 10). Barker, Hardwick and Ridge suggest that "The means by which many humanists first, or only, experience the digital humanities are the tools that are being developed to assist in philological research" (Barker et al 2012, 187). Particularly relevant in this context, Hockey notes that "applications involving textual sources have taken center stage within the development of humanities computing as defined by its major publications" (Hockey 2004, 1). While the definitions from the Day of Digital Humanities 2012 may not emphasise disciplinary areas, several reference a focus on text, ranging from seeking patterns within texts to representing and interacting with texts.

This textual emphasis would seem to support the literature, linguistics, and history focus identified in DH journals; however, other text-heavy disciplines such as classics and philosophy are not strongly represented. A strong emphasis on text, perceived or real, makes it difficult for humanities subjects which do not share that same emphasis to see the DH agenda as relevant to their own disciplines. Consequently, Svensson's proposition that the strong textual focus within DH affects the scope and penetration of humanities computing (2009, 51) would appear to find support here. However, it does not explain the apparent under-representation of subjects such as philosophy and classics. Although philosophy is closely related to computing (for example, Ess 2004), there seems to be a much more limited relationship with DH. For example, Bradley notes that while there are philosophers developing digital content or using information technology to further philosophical research, and there are a number of notable philosophers thinking about the interface between technology and ourselves, there are not numerous examples of philosophers using DH techniques in the pursuit of philosophy (Bradley 2012, 104). The multidisciplinary nature of classics means that digital aspects may be subsumed under the headings of history, archaeology, or linguistics - or, from a classics point of view, classicists including archaeologists, ancient historians and philologists may employ digital methods and technologies (Mahony and Bodard 2010, 1).
There is some dispute about the status of digital classics: for example, Crane (2004) talks of classicists aggressively integrating computerised tools into the discipline but at the same time argues that the needs of classicists are not so distinctive as to warrant a separate "classical informatics". Both Terras (2010, 187) and Rabinowitz (2011) see digital classics as more of an emergent field still in its early stages, while Cayless (2011) describes it as an underground movement, with some very high-profile projects and practitioners operating within a more generally hostile attitude towards digital ways of knowing.

Trending disciplines

Trending terms may also be revealing. For example, Google's nGram viewer can display the frequency of phrases within a sample of over 5.2 million books scanned by Google up to 2009, normalising the results by the number of books published each year. Since the ngram term must occur in at least 40 books, several phrases which might have been expected (for example, digital philosophy, digital classics) returned null results, which could in itself be seen as significant.

Figure 2: Google nGram results: 'traditional' labels (top) and 'digital' labels (below)

Some interesting patterns are apparent in Figure 2. References to literary computing peak either side of 1980, while linguistic computing peaks as literary computing declines in the mid 1990s. Historical computing and archaeological computing peak in the late 1980s-early 1990s before declining. Classical computing underlines the limitations of this tool, as its steady growth is associated with an increasing profile of publications on classical computing devices rather than computing in the classics. Humanities computing peaks latest and rises highest, but like all the terms, it now appears to be in decline. Not unexpectedly, the decline of the more traditional terms for computing in the humanities is matched by the rise in use of their 'digital' equivalents (the very early showing for digital history in the 1960s relates to publications on digital signalling rather than history). Perhaps unexpectedly, DH is last on the scene: digital literature references appear from 1975, digital history from 1980, and digital archaeology from 1988, while DH first appears around 1993. Furthermore, DH has not overtaken the other terms and remains the least common of those shown.

Leaving aside the vagaries of context-free text searching, these results seem to demonstrate a shift in emphasis towards the 'digital', with most of the traditional terms being overtaken by their digital equivalents by 2005. However, the results also suggest that individual disciplines maintained their disciplinary identity in the move to 'digital', with DH essentially acting as an umbrella term of convenience or, alternatively, representing the gradual development of a new disciplinary focus.
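The normalisation step matters more than it might appear. The toy calculation below, with invented counts, shows how a rising raw count can still be a falling relative frequency once yearly corpus size is taken into account.

```python
matches = {1985: 12, 1995: 31, 2005: 64}                   # hypothetical hits
corpus_size = {1985: 40_000, 1995: 90_000, 2005: 260_000}  # hypothetical totals

relative = {year: matches[year] / corpus_size[year] for year in matches}
print(relative)  # 2005 has the most matches but the lowest relative frequency
```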
In the end, the disciplinary scope of DH remains unclear. On one hand, it might be expected to represent the broad church of the humanities, but in reality it seems to consist of a much smaller and more restricted group of humanities fields, with some of its major constituents drifting in and out as it suits them. In that light, it would be worth examining the extent to which digital literature, digital linguistics, and digital history publications appear in more mainstream disciplinary journals, or whether their predominance in DH journals represents a choice or need to publish outside their disciplinary journals. The same question could apply to other humanities subjects - do their digital publications appear in DH journals rather than in their disciplinary outlets? Does this account for the poor showing of digital classics and digital philosophy? In archaeology, for example, there is only one computing-based journal (Archeologia e Calcolatori), and archaeology has a low profile within DH journals; instead, archaeological computing papers tend to appear in mainstream archaeology journals and, to a lesser extent, in disciplinary journals outside the field (such as geography). This highlights the way in which digital archaeologists participate in the discipline of archaeology more generally, whereas it has been suggested that DH scholarship is often not highly regarded, in citation terms at least, within their broader fields (Juola 2008, 73-75).

Digital Archaeology and Digital Humanities

So where does this leave archaeology and its relationship with DH? It evidently does not figure strongly in DH journals, and DH barely figures within archaeological publications. The impression from the disciplinary discussion above is that archaeology remains largely distinct - some might say aloof - from DH. Dunn has recently commented that the relationship between archaeology and DH is curiously lacking (Dunn 2012) and suggests that the reasons for this are nuanced and complex. There are certainly strong parallels between DH and Digital Archaeology (DA) - both share similar concerns with interdisciplinarity, technology and digital methods. Indeed, the characterisation of DA and DH is not so different. For example, Dunn characterised archaeology as "a disciplinary mash-up, needing support from a range of technological infrastructures, at all levels of scale and complexity" (Dunn 2011, 98), and Daly and Evans (2006, 3) defined digital archaeology as "not so much a specialism, nor a theoretical school, but an approach - a way of better utilizing computers based on an understanding of the strengths and limits of computers and information technology as a whole". Both definitions might equally be applied to DH. It is perhaps this very similarity that, paradoxically, separates the two disciplines.

As a field, DA is well established. Probably the earliest use of electronic data processing in European archaeology was by Peter Ihm and Jean-Claude Gardin in 1958/1959, and in the USA by James Deetz in 1960 (Cowgill 1967, 17). Since then, activity in archaeological computing has grown substantially, especially since the first personal computer revolution in the 1980s, and the annual Computer Applications and Quantitative Methods in Archaeology (CAA) conference has been meeting since 1973, with 500 delegates meeting in Southampton in March 2012. Like DH, DA has spawned a number of different centres (for example, Digitale Archäologie, based in Freiburg; the Center for Digital Archaeology (CoDA) at the University of California, Berkeley; the Laboratorio di Archaeologia Digitale at the University of Foggia; and the Digital Archaeology Research Lab (DigAR) at the University of Washington, Seattle) and a range of undergraduate modules and specialised postgraduate degrees. There are also a number of tenured positions and support posts in university archaeology departments, as well as a larger number of computing posts in commercial archaeology organisations (43 in the UK at the last count (Jeffrey and Aitchison 2008)).
Given this existing infrastructure, it is not unreasonable to propose that DA does not 'need' DH for legitimacy or support, although it is evident that archaeologists are happy to capitalise on digital humanities programmes if they can see the benefits for archaeology. Equally, digital humanities scholars not infrequently draw on archaeological examples in their publications (for instance, Bodenhamer 2007, 2010; Anderson et al 2010), often in the context of demonstrating technologies such as Geographical Information Systems (GIS).

Methodological commons?

Like archaeology, DH is frequently defined in terms of practice rather than a particular category of data (text) or a historical period (for example, Scholes and Wulfman 2008, 65; Anderson et al 2010, 3782). Indeed, McCarty and Short's classic diagram mapping DH emphasises this, with its central zone highlighting the methodological commons shared by the various disciplines (McCarty and Short 2002). While its authors make it clear that the map is a work in progress, it notably omits archaeology from either the set of disciplines (although 'material culture' is included) or from the 'clouds of knowing' which represent areas of learning which bear upon the field. Later updates (for example, McCarty 2005, 119) add anthropology to the cloud, which could include archaeology if its American definition is adopted. The absence of archaeological contributions to recent collaborative volumes on DH (for example, Berry 2012, Gold 2012) is matched by corresponding recent collections of DA which make only passing reference to DH (for example, Kansa et al 2011, Chrysanthi et al 2012). This serves to underline the lack of relationship between the two disciplines in either direction - digital humanists are not queuing up to access DA and digital archaeologists are not knocking on the door of DH.

This apparent peripheral status of DA and DH with respect to each other could support the contention that while both disciplines are concerned with methods, their focus is rather different, with archaeology focused on the study of past material culture whereas DH has a broader, primarily textual outlook (for example, Dunn 2012). Two propositions arise from this situation: that

- the image of archaeology as dealing with primarily long-past pre-literate societies means it fits poorly within a logo-centric DH, and
- the practices that underpin the methodologies of both DH and DA are drawn from elsewhere, not from each other, or have developed independently.

One of the problems here is that the characterisation of archaeology, at least in DH terms, is frequently flawed. While there is no doubt that archaeology deals with prehistoric societies, to define it in these terms alone is to ignore the several millennia of literate societies which are equally the subject of archaeological study. Ultimately texts are forms of material culture just as much as potsherds and flint flakes, and hence grist to archaeology's mill. Indeed, David Clarke's famous definition of archaeology as "the discipline with the theory and practice for the recovery of unobservable hominid behaviour patterns from indirect traces in bad samples" (Clarke 1973, 17) challenges rather than places limits on the subject. Furthermore, the scope and reach of archaeology - and DA - is wider than is often appreciated.
As part of the archaeology of modernity (see Harrison and Schofield 2010, Schofield 2009), new areas of study such as digital forensic data recovery (for example, Ross and Gow 1999) and the investigation of digital media (for example, Huhtamo and Parikka 2011b), as well as the disciplinary implications of new information technologies (for example, Huggett 2012a, 2012b), the study of 'non-places' (transit areas and travel spaces) and virtual worlds (Harrison and Schofield 2010, 249ff), together with contemporary conflict, human rights and disaster archaeology, are all part of archaeology as practised in the twenty-first century. Some would argue that archaeology is over-reaching itself in some of these areas - for example, Huhtamo and Parikka make it clear that they see media archaeology as quite distinct from the more typical understanding of archaeology (2011a, 3), although Liu's characterisation of media archaeology as the study of old media (2012, 16) leaves the door open. Others might argue that archaeology's moves into such areas are a response to tactical and political rather than disciplinary demands. However, the fact remains that archaeology has extended its interest and involvement into these fields, and several are also of interest to - and, in the case of digital media studies, considered to be a part of (Liu 2012, 10) - DH. At the very least, therefore, this re-presentation of archaeology offers the potential for greater interactions in future between DH and DA than there has been to date, and in the process may help to address the foreshortened, presentist focus of DH identified by Liu (2012, 15) by combining contemporary and historical objects of study. If the character of archaeology should not present an obstacle to establishing a greater relationship with DH, the question of shared practice is perhaps more problematic. At one level, neither discipline has need of the other when it comes to the basic analysis of their data. On the other hand, both DA and DH are moving into areas in which the other already has expertise, so one might expect a productive relationship to be established at least in these contexts. In terms of DA there is a dramatic increase in interest in handling text, largely associated with the Semantic Web or Web 3.0: for instance, text mining grey literature reports and journals to extract temporal and spatial data together with associated contextual attributes (for example, Richards et al. 2011, Byrne and Klein 2010), as sketched below. However, the relationships established by DA in relation to projects such as these are primarily with computing science, not DH, despite the long history of text processing in DH.
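To make the kind of extraction involved concrete, here is a minimal, illustrative sketch only: the report snippet, the patterns, and the vocabulary lists are invented for this example, and real projects such as Archaeotools (Richards et al. 2011) rely on full natural language processing pipelines and faceted classification rather than bare regular expressions.

```python
import re

# Invented grey-literature snippet; real inputs would be OCRed report text.
report = ("Excavations at Mill Mound in 1994 recorded a ditch of "
          "probable Iron Age date at NGR SU 4823 0947, sealed by "
          "medieval plough soil.")

# Hypothetical patterns: years, period terms, and OS grid references.
year_pat   = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
period_pat = re.compile(r"\b(Iron Age|Bronze Age|Roman|medieval|post-medieval)\b",
                        re.IGNORECASE)
ngr_pat    = re.compile(r"\b[A-Z]{2}\s?\d{4}\s?\d{4}\b")

print(year_pat.findall(report))    # ['1994']
print(period_pat.findall(report))  # ['Iron Age', 'medieval']
print(ngr_pat.findall(report))     # ['SU 4823 0947']
```

Even this toy version shows why the contextual attributes matter: the temporal term ('Iron Age') and the spatial reference (the grid reference) only become archaeological data once they are associated with the feature ('a ditch') they describe, which is precisely where NLP replaces pattern matching.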
If DA seems to be bypassing DH in relation to text, DH appears to be looking beyond DA in relation to GIS. For example, although a recent volume on Spatial Humanities includes a contribution from an archaeologist (Lock 2010), the 'Suggestions for Further Reading' section contains no reference to archaeological work in GIS (Bodenhamer et al 2010, 177-189). Reference to archaeology appears only in relation to theoretical work on space despite archaeology being recognised elsewhere in the same volume as the first amongst the humanities to adopt GIS (Bodenhamer 2010, 21). Instead the main focus of recommended works is geography and, to a lesser extent, historical GIS. In some respects, this situation is not surprising - rather than pursue a set of complex technological methodologies mediated through another humanities discipline, is it not sensible to go straight to the discipline which is most closely associated with the development of those techniques? However, mediation through an allied humanities discipline may offer considerable benefits in terms of complementarity of theory and method, time saved through lessons learned, and so on. That said, it might appear that historical GIS performs this mediating role within DH, but if so, it is less well developed than in DA and the kinds of issues raised by, for example, Bodenhamer (2010), Boonstra (2009), Jessop (2008), and Suri (2011), are the same as those raised within DA more than fifteen years ago (for example, Gaffney et al 1995), which have been addressed to a varying extent since then.

Spatial differences

Perhaps as a consequence of this lack of relationship with DA, DH applications of GIS can seem very limited, even simplistic, to archaeological eyes in that they often seem to focus on interactive hypermedia visualisation with little use of GIS analytical tools (for example, Hypercities (Presner 2010), Litmap (Hui 2010) and GapVis (Barker et al 2012)), although the user interfaces of projects such as these can disguise very complex data manipulation involved in the generation of the underlying spatial data in the first place. Examples of the successful use of humanities GIS cited by Bodenhamer (2007, 2010) are, from an archaeological perspective, a combination of 3D virtual worlds and multimedia databases rather than GIS as such. As if to emphasise this, as a way of bringing together GIS and the humanities Bodenhamer describes 'deep maps of memory', in which each artefact from a place (a letter, memoir, photograph, painting, oral account, video etc.) constitutes a separate layer that can be arranged sequentially through time (Bodenhamer 2007, 105; 2010, 27-28). This concept has been taken up by Fishkin (2011) among others, who proposes the creation of 'Digital Palimpsest Mapping Projects'. However, there is no sense in which the 'knowledge' of the layers is being utilised beyond the spatial and temporal layering inherent in the GIS, and these models are operating on what is essentially a multimedia methodology. In part, of course, this represents a difference between data exploration and data analysis - the analysis, such as it is, remains in the eye of the beholder. This underlines the need within the DH for the kind of spatial literacy and spatial thinking identified by Suri (2011, 182) and the specialist training referred to by Boonstra (2009, 5). A range of specific problems with applying GIS within a DH context have been identified, and lie behind a perceived reluctance to use these tools.
For example, Bodenhamer (2010, 23-24) identifies several issues:
- The complexity of the technology and the level of time and effort required to learn the techniques
- GIS favour structured data
- Ambiguity, uncertainty, nuance, and uniqueness are not readily routinised
- Managing time is problematic - GIS typically represent time as an attribute of space
- GIS rely heavily on visualisation, which is difficult for a logo-centric scholarship which does not generally think in terms of geographical space or framing spatial queries
- GIS require collaboration between technical and domain experts, putting the lone humanities scholar at a disadvantage
- GIS appear reductionist in the way data are categorised, space is defined, and complexity is handled.
These strongly reflect the conflict between positivist technology and humanist traditions also highlighted by, amongst others, Boonstra (2009, 6), Gregory and Hardie (2011, 299), Harris et al (2010, 168), Jessop (2008, 44), and Suri (2011, 163). The contrasts between the accuracy, precision, structure, and reductionism inherent in GIS and the humanistic emphases on uncertainty, imprecision and ambiguity are often presented as part of a critical assessment of the application and use of GIS. In a trenchant response to the archaeological critics of GIS who have raised much the same issues in the past, Cripps et al point to the advent of fuzzy approaches which mean that certainty is no longer required; they argue that GIS do not foster generalisation and standardisation (or at least, no more so than the book, article or presentation, and we are well-accustomed to problematise these); and that far from being reductionist, GIS facilitate complex analyses of time, human agency and perception, and the semantics and linguistics of space (Cripps et al 2006, 27-28). In other words, methods to deal with these issues have been investigated and continue to be developed and, far from representing a purely pragmatic response, they are embedded in critical theory. The danger is that preconceptions concerning GIS applications remain unchallenged through a lack of engagement with the tools and a reluctance to develop them in the search for answers to what are perceived to be the more humanistic questions. For instance, space within GIS is frequently conceived as rectilinear, isotropic (independent of direction), gridded, and framed, and consequently it establishes the conditions for distanced and dispassionate observation - the so-called 'scientific gaze' (Thomas 2004, 199) which is problematic for the humanities. However, this characterisation is not uncontested and GIS are capable of modelling alternative conceptions of space at a human scale which are not predicated on Western, post-Enlightenment perceptions. For example, during the debates surrounding the Indian Land Claims Commission (established in 1946) Western 'common-sense' notions of homogenous, bounded, stable territorial units had to be set aside for aboriginal forms of territoriality in which the spatial unit consisted of aggregates of 'tenures' held at different times (Zedeño 1997). To the Hopi, these could be places, landmarks, natural resources (herds, stands of trees, mineral outcrops), and the material record of human use of the land and its resources (burial grounds, villages, encampments, trails, shrines etc.) (Zedeño 1997, 71).
Crucially, as Zedeño emphasises, this concept of space and territoriality is in stark contrast to the kind of landscape in which space is contiguous and can be comprehended at a glance (Zedeño 1997, 73). Nevertheless, it is possible to represent the richness of such a landscape within a GIS along with the human encounters, movement, perceptions, interrelationships and memories that constitute it (for example, Llobera 2007). Such a representation is never anything more than a model of reality, just as the text describing it is no more than an attempt to abstract an impression of the Hopi conceptual world. The visual emphasis of GIS "with its reductionist allure and wondrous images" (Harris et al 2010, 170) is undoubtedly a highly seductive aspect of the tools. The power of the visual image is not unfamiliar to humanists - what perhaps makes GIS so powerful is that, while traditional maps can be a potent means of capturing large amounts of information, that information remains locked within the image, whereas GIS maps are generated on the fly from underlying spatial information and its associated attributes. Consequently GIS facilitate a much higher degree of flexibility: new information can be added, new data can be created through manipulating information within the existing map, and data can be removed. Of greater significance, however, is the seduction of the tool itself - the ease with which images can be generated at the push of a button and the way in which the software can be seen as protecting the user from, and hence disguises, the underlying complexities through inserting layers of opacity (Huggett 2004, 83-84), while the very use of the tool can heighten perceived authority - but all these issues emphasise the need for a properly critical approach. It may be true that the dependence of archaeologists and geographers on maps and plans makes the application of GIS easier (Bodenhamer 2010, 21), but visualising DH data need not be a barrier despite its textual focus. As several DH scholars have shown, the extraction of spatial information from texts makes visualisation possible (for example, Gregory and Hardie 2011, Gregory and Cooper 2009), while archaeologists and geographers have demonstrated the potential of more qualitative approaches (see the contributions in Daniels et al 2011 and Dear et al 2011 for example). The need to represent ambiguity and uncertainty is well-established and arguably inherent to some extent in GIS if a raster rather than vector representation is used thoughtfully. For example, vector polygons present clear unambiguous boundaries to regions when what is required is imprecisely delimited, indeterminate boundaries. Boundaries might be malleable (in the sense that the boundary shifts, expands, and contracts depending on circumstances) and permeable (recognising that things may cross from one domain to the other to varying extents, again depending on circumstances) (Kooyman 2006, 425). This is nevertheless capable of being modelled using rasters to represent the degrees of uncertainty or ambiguity. Similarly, uncertainty of location is poorly represented as vector point data. For example, archaeological sites may be recorded using a mixture of resolutions from 1m to 100m or more for a variety of reasons but are frequently represented in absolute locations, although they may be coloured according to their resolution of location. However, within the approximate area within which such a site falls, it is possible to know where the site is not going to be (in a river, on a cliff, for instance), enabling an estimation of the probability that a site is located in some areas rather than others, which can again be represented using graduated rasters, as the sketch below illustrates.
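As a rough illustration of such a graduated raster, the following sketch builds a probability surface around a recorded point and zeroes out the cells where the site cannot be. The grid size, recorded location, fall-off function, and exclusion mask are all invented assumptions for the example; a real implementation would work against actual GIS layers rather than a bare numpy array.

```python
import numpy as np

# 100 x 100 grid of 10 m cells; a site recorded to +/-100 m resolution.
ny, nx = 100, 100
cell = 10.0
rec_x, rec_y, resolution = 500.0, 500.0, 100.0

# Cell-centre coordinates.
xs = (np.arange(nx) + 0.5) * cell
ys = (np.arange(ny) + 0.5) * cell
gx, gy = np.meshgrid(xs, ys)

# Gaussian fall-off around the recorded point, scaled by the resolution.
d2 = (gx - rec_x) ** 2 + (gy - rec_y) ** 2
prob = np.exp(-d2 / (2 * resolution ** 2))

# Exclusion mask: cells where the site cannot be (river, cliff, etc.).
impossible = np.zeros((ny, nx), dtype=bool)
impossible[:, 45:55] = True          # a hypothetical river running north-south
prob[impossible] = 0.0

prob /= prob.sum()                   # graduated raster now sums to 1
print(prob.max(), prob.sum())
```

The same pattern - a plausibility surface reshaped by known constraints - generalises to the malleable and permeable boundaries discussed above, with one raster per circumstance.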
At a more human level, many conceive of the world in terms of their immediate surroundings, with a great deal of knowledge of space and relationships. Beyond that familiar world, things become more hazy and indistinct - scale becomes less precise, and proximity and distance become more a case of 'near', 'further away', 'a long way away', for example. Again, these can be generalised to a series of rasters to enable this ambiguity to be incorporated within the model. Time is undoubtedly problematic, but this is essentially in terms of its visualisation, rather than its underlying representation. For the most part, presentations of time within GIS are essentially static: snapshots representing single moments in time which can then be stitched together into sequences sampling what is a dynamic phenomenon (for example, Johnson (2002) and Gregory (2008)). An advantage of this approach is that it is recognisable and interpretable, whereas more complex three-dimensional representations of time as space-time paths, space-time prisms and potential path areas result in unfamiliar images which are difficult to assimilate (for example, Shaw et al 2008, Neutens et al 2011) as well as being very much more complex to generate. Nevertheless, the representation of time intervals (using Allen relations (Allen 1991) for instance) within the underlying GIS database can model complex temporal relationships with appropriately fuzzy components ('during', 'before', 'overlaps' and so on) which can then be retrieved as a sequence of contemporary snapshots.
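A minimal sketch of how a handful of Allen's interval relations might be coded over simple (start, end) year spans follows; the phase names and dates are invented for the example, and a production GIS database would store and index such intervals rather than compute them ad hoc over Python tuples.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end), e.g. calendar years

# A few of Allen's (1991) interval relations.
def before(a: Interval, b: Interval) -> bool:
    return a[1] < b[0]

def overlaps(a: Interval, b: Interval) -> bool:
    return a[0] < b[0] < a[1] < b[1]

def during(a: Interval, b: Interval) -> bool:
    return b[0] < a[0] and a[1] < b[1]

def contemporary_with(a: Interval, b: Interval) -> bool:
    # Any shared time at all - used to pull out a 'snapshot' of coexisting phases.
    return a[0] < b[1] and b[0] < a[1]

phases = {"rampart": (-800, -400), "roundhouse": (-600, -350),
          "field system": (-300, 100)}
query = (-500, -400)
print([name for name, span in phases.items()
       if contemporary_with(span, query)])   # ['rampart', 'roundhouse']
```

Fuzzy variants would replace the crisp endpoints with graded membership, but the retrieval pattern - query an interval, return the phases that stand in a given relation to it - stays the same.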
It would be misplaced to assume that GIS practitioners are unaware and uncritical of the tools they use and the ways those tools affect the representation of information, but it does underline the requirement for knowledgeable users (as emphasised by Boonstra 2009, 5). This might indeed be achieved through collaboration between technical and domain experts, as Bodenhamer (2010, 24) suggests, which fits with a multiple-member interdisciplinary team model for DH research, but it is not a requirement. Alternatively the lone DH scholar may be trained in the techniques: a model essentially adopted within archaeology where archaeological GIS projects are largely undertaken by archaeologists practised in the use of GIS. The archaeological experience would suggest the need for suitable humanities-focused courses to be created in order to communicate the complexities of spatial concepts within an appropriate and meaningful context.

Building relationships?

In many respects, the adoption of GIS within DH is caught up in a series of anxiety or identity discourses within DH, DA, and also geography, which may account for many of the doubts, uncertainties, and criticisms which are voiced. Anxiety discourses tend to be associated with fields which meet their disciplinary challenges by drawing down concepts and methodologies from external subjects, and which have an intellectual centre primarily focused on praxis, with theory being derived from outside (for example, Lyytinen and King 2004, 222). This seems equally appropriate as a description of DH and DA with each seeking justification, validation, and status as part of a process of discipline-building, rather than being perceived as providing little more than low-prestige technical support for their broader communities. In the process, however, it would seem sensible and strategically appropriate to ensure that the respective discourses contribute to, rather than are at the expense of, each other. For example, DH scholars frequently appear suspicious of what has been labelled as 'common denominator' systems (Hunt et al 2011, 218). These are categories of digital tools which, despite being broad-based, have been developed to accommodate scientists and engineers, with humanists being seen very much as an afterthought: "academics in the HASS [humanities, arts, and social sciences] have learned to content themselves with the few beneficial bits (or bytes) that fall their way from the technological table; nonetheless, common denominator systems are insufficient by themselves to meet the specialised needs of HASS scholars." (Hunt et al 2011, 218). This has also been a feature of the DA discourse in the past, where it has long been recognised that few of the digital tools used by archaeologists have been created by archaeologists specifically for archaeological use. However, this is essentially reductio ad absurdum: there are many tools, digital or otherwise, that have not been specifically created for DH, or DA, and yet are fundamental to each. In fact, one of the advantages of GIS is that, despite being essentially very simple, they are capable of extension, adaptation, and modification in order to better represent the complexities of the application area. The issue is therefore not the rejection of these broad-based digital tools, but the question of their development and application into new areas. Of course, this may be precisely the kind of pragmatism that Meeks (2012) is concerned about. While he points to archaeologists as having more experience with adapting digital tools to their work than digital humanists (Meeks 2012, 95), he sees archaeology's pragmatic approach as not offering solutions to the perception that humanities needs software tools embedded with humanities rather than engineering principles. By this argument, GIS, as broad-based digital tools, and archaeologists, who are pragmatic - and by inference, uncritical - enough to turn them to use, are equally problematic in terms of DH applications. While the kinds of approaches outlined above to handling uncertainty, time, and so on may be open to the accusation of pragmatism, this would assume that the results they generate represent reality or truth in some way rather than being what they are: abstract conceptual models of virtual spaces built out of theory. In many respects, this argument is closely related to the discussions within DH about the place of building things as a scholarly activity (for example, Ramsay and Rockwell 2012; Ramsay 2011a, 2011b). Digital archaeologists, whatever the digital tools they adopt and use, are well-accustomed to the idea of creating, coding, and modifying these tools in order to facilitate research - indeed, the ability to do so can be seen as a significant factor in the consideration of a suitable tool. However, the process of construction or modification is an integral component of research and arises out of theory, rather than being seen as an end in itself.
At the same time as DH and DA are, to some extent at least, manoeuvring around each other with respect to textual and spatial issues, geography has also been positioning itself in relation to the humanities more generally. In the same way as part of archaeology's discourse has been to question whether it is a science, social science, or humanities subject, geography has situated itself in recent years on the boundaries of the social sciences and humanities (for example, Cosgrove 2011, xxiv, Dear 2011, 311-312). Indeed, Cosgrove argues that connections between geography and humanities have been strongest during periods of cultural inquisitiveness, "when imagination encounters the resistance of material reality" (Cosgrove 2011, xxiii), a characterisation that seems especially pertinent in the context of the 'digital' worlds each is seeking to create. Furthermore, both archaeology and geography with their science/social science profiles have experience of Byerley's recent warning concerning DH: if DH is seen as a response to a scenario of broader humanities budget cuts, it may end up with a series of eggs in a more expensive basket, which will be especially problematic if the humanities are seen as being as 'irrelevant' as ever (Byerley 2012, 3).

The humanist turn?

In such circumstances of budgetary crisis, disciplinary anxiety, and the search for relevance, it would seem that DH, DA and humanities geographers would be stronger together and weaker apart, to employ a hackneyed phrase. However, in order to define and build such a relationship between the three fields, a direct conversation is required. Dear points to an absence of such a conversation between geography and the humanities, recognising that "textual propinquity is not sufficient to produce a community of enquiry" (Dear 2011, 304) and there has likewise been no equivalent conversation between DA and DH to date. Over recent years our disciplines have experienced, to varying extents and at varying times, a 'computational turn', a 'digital turn', and a 'spatial turn': as Lock has observed, the time may have arrived for spatial technologies to develop the 'humanist turn' (Lock 2010, 103), presenting at once an opportunity and a challenge for DH in its relationship with the spatial disciplines.

References

Allen, James F. 1991. Time and Time Again: The Many Ways to Represent Time. International Journal of Intelligent Systems 6 (4): 341-355.
Anderson, Sheila, Tobias Blanke, and Stuart Dunn. 2010. Methodological commons: arts and humanities e-Science fundamentals. Philosophical Transactions of the Royal Society A 368: 3779-3796.
Barker, Elton, Chris Bissell, Lorna Hardwick, Allan Jones, Mia Ridge, and John Wolffe. 2012. Colloquium: Digital technologies. Help or hindrance for the humanities? Arts and Humanities in Higher Education 11 (1-2): 185-200.
Barker, Elton, Kate Byrne, Leif Isaksen, Eric Kansa, and Nick Rabinowitz. 2012. Google Ancient Places. http://googleancientplaces.wordpress.com/2012/02/25/the-story-continues/ (accessed April 13, 2012).
Berry, David M. (ed.). 2012. Understanding Digital Humanities. Basingstoke: Palgrave Macmillan.
Bodenhamer, David J. 2007. Creating a landscape of memory: the potential of humanities GIS. International Journal of Humanities and Arts Computing 1 (2): 97-110.
Bodenhamer, David J. 2010. The Potential of Spatial Humanities. In The Spatial Humanities. GIS and the Future of Humanities Scholarship, ed. David J. Bodenhamer, John Corrigan and Trevor M. Harris, 14-30. Bloomington: Indiana University Press.
Bodenhamer, David J., John Corrigan, and Trevor M. Harris (eds.). 2010. The Spatial Humanities. GIS and the Future of Humanities Scholarship. Bloomington: Indiana University Press.
Boonstra, Onno. 2009. Barriers between historical GIS and historical scholarship. International Journal of Humanities and Arts Computing 3 (1-2): 3-7.
Bradley, Peter. 2012. Where are the Philosophers? Thoughts from THATCamp Pedagogy. Journal of Digital Humanities 1: 104-106. http://journalofdigitalhumanities.org/
Byerley, Alison. 2012. Everything Old is New Again: The Digital Past and the Humanistic Future. Paper presented at the Modern Language Association (MLA) Conference, Seattle, January 2012. http://www.duke.edu/~ves4/mla2012/ (accessed April 13, 2012).
Byrne, Kate and Ewan Klein. 2010. Automatic extraction of archaeological events from text. In Making History Interactive: Computer Applications and Quantitative Methods in Archaeology 2009, ed. Bernard Frischer, Jane Crawford and David Koller, 48-56. Oxford: Archaeopress.
Cayless, Hugh. 2011. Building Digital Classics. #alt-academy ('Making Room' cluster). http://mediacommons.futureofthebook.org/alt-ac/pieces/building-digital-classics (accessed April 13, 2012).
Chrysanthi, Angeliki, Patricia Murrieta Flores, and Constantinos Papadopoulos (eds.). 2012. Thinking Beyond the Tool. Archaeological Computing and the Interpretative Process. Oxford: Archaeopress.
Clarke, David L. 1973. Archaeology: The Loss of Innocence. Antiquity 47 (185): 6-18.
Cosgrove, Denis. 2011. Prologue: Geography within the humanities. In Envisioning Landscapes, Making Worlds. Geography and the Humanities, ed. Stephen Daniels, Dydia DeLyser, J. Nicholas Entrikin, and Douglas Richardson, xxii-xxv. Abingdon: Routledge.
Cowgill, George. 1967. Computer Applications in Archaeology. Computers and the Humanities 2 (1): 17-23.
Crane, Greg. 2004. Classics and the Computer: An End of the History. In A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens and John Unsworth, 46-55. Oxford: Blackwell. http://www.digitalhumanities.org/companion/
Cripps, Paul, Graeme Earl, and David Wheatley. 2006. A Dwelling Place in Bits. Journal of Iberian Archaeology 8: 25-39.
Daly, Patrick and Thomas L. Evans. 2006. Introduction: archaeological theory and digital pasts. In Digital Archaeology: bridging method and theory, ed. Thomas L. Evans and Patrick Daly, 3-9. Abingdon: Routledge.
Daniels, Stephen, Dydia DeLyser, J. Nicholas Entrikin, and Douglas Richardson (eds.). 2011. Envisioning Landscapes, Making Worlds. Geography and the Humanities. Abingdon: Routledge.
Dear, Michael. 2011. Historical moments in the rise of the geohumanities. In GeoHumanities. Art, history, text at the edge of place, ed. Michael Dear, Jim Ketchum, Sarah Luria, and Douglas Richardson, 309-314. Abingdon: Routledge.
Dear, Michael, Jim Ketchum, Sarah Luria, and Douglas Richardson (eds.). 2011. GeoHumanities. Art, history, text at the edge of place. Abingdon: Routledge.
Dunn, Stuart. 2012. CAA1 - The Digital Humanities and Archaeology Venn Diagram. http://stuartdunn.wordpress.com/2012/04/01/caa1-the-digital-humanities-and-archaeology-venn-diagram/ (accessed April 13, 2012).
Dunn, Stuart. 2011. Poor relatives or favourite uncles? Cyberinfrastructure and Web 2.0: a critical comparison for archaeological research. In Archaeology 2.0: New Approaches to Communication and Collaboration, ed. Eric C. Kansa, Sarah Witcher Kansa, and Ethan Watrall, 95-118. Los Angeles: UCLA Cotsen Institute of Archaeology Press.
http://escholarship.org/uc/item/1r6137tb
Ess, Charles. 2004. "Revolution? What Revolution?" Successes and Limits of Computing Technologies in Philosophy and Religion. In A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens and John Unsworth, 132-142. Oxford: Blackwell. http://www.digitalhumanities.org/companion/
Fishkin, Shelley Fisher. 2011. "Deep Maps": A Brief for Digital Palimpsest Mapping Projects. Journal of Transnational American Studies 3 (2). http://escholarship.org/uc/item/92v100t0
Gaffney, Vincent, Zoran Stančič, and H. Watson. 1995. The Impact of GIS on Archaeology: a Personal Perspective. In Archaeology and Geographical Information Systems: a European Perspective, ed. Gary Lock and Zoran Stančič, 211-229. London: Taylor and Francis.
Gregory, Ian N. 2008. Different Places, Different Stories: Infant Mortality Decline in England and Wales 1851-1911. Annals of the Association of American Geographers 98 (4): 773-794.
Gregory, Ian N. and Andrew Hardie. 2011. Visual GISting: bringing together corpus linguistics and Geographical Information Systems. Literary and Linguistic Computing 26 (3): 297-314.
Gregory, Ian N. and David Cooper. 2009. Thomas Gray, Samuel Taylor Coleridge and Geographical Information Systems: A Literary GIS of Two Lake District Tours. International Journal of Humanities and Arts Computing 3 (1-2): 61-84.
Gold, Matthew K. (ed.). 2012. Debates in the Digital Humanities. Minneapolis: University of Minnesota Press.
Harris, Trevor M., John Corrigan, and David J. Bodenhamer. 2010. Challenges for the Spatial Humanities: towards a research agenda. In The Spatial Humanities. GIS and the Future of Humanities Scholarship, ed. David J. Bodenhamer, John Corrigan and Trevor M. Harris, 167-176. Bloomington: Indiana University Press.
Harrison, Rodney and John Schofield. 2010. After Modernity. Archaeological Approaches to the Contemporary Past. Oxford: Oxford University Press.
Hockey, Susan. 2004. The History of Humanities Computing. In A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens and John Unsworth, 3-19. Oxford: Blackwell. http://www.digitalhumanities.org/companion/
Huggett, Jeremy. 2012a. What Lies Beneath: Lifting the Lid on Archaeological Computing. In Thinking Beyond the Tool. Archaeological Computing and the Interpretative Process, ed. Angeliki Chrysanthi, Patricia Murrieta Flores, and Constantinos Papadopoulos, 204-214. Oxford: Archaeopress.
Huggett, Jeremy. 2012b. Disciplinary Issues: the research and practice of computer applications in archaeology. Keynote presentation: Computer Applications and Quantitative Methods in Archaeology, Southampton, March 2012.
Huggett, Jeremy. 2004. Archaeology and the New Technological Fetishism. Archeologia e Calcolatori 15: 81-92.
Huhtamo, Erkki and Jussi Parikka. 2011a. Introduction: an archaeology of media archaeology. In Media Archaeology: approaches, applications, and implications, ed. Erkki Huhtamo and Jussi Parikka, 1-21. Berkeley: University of California Press.
Huhtamo, Erkki and Jussi Parikka (eds.). 2011b. Media Archaeology: approaches, applications, and implications. Berkeley: University of California Press.
Hui, Barbara. 2010. Litmap. http://barbarahui.net/the-litmap-project/ (accessed April 13, 2012).
Hunt, Leta, Marilyn Lundberg, and Bruce Zuckerman. 2011. Getting beyond the common denominator. Literary and Linguistic Computing 26 (2): 217-231.
Jeffrey, Stuart and Kenny Aitchison. 2008. Who works in digital archaeology? ADS Online 22 (Winter).
http://ads.ahds.ac.uk/newsletter/issue22/jeffery.html (accessed April 13, 2012).
Jessop, Martyn. 2008. The Inhibition of Geographical Information in Digital Humanities Scholarship. Literary and Linguistic Computing 23 (1): 39-50.
Johnson, Ian. 2002. Contextualising Archaeological Information through Interactive Maps. Internet Archaeology 12. http://intarch.ac.uk/journal/issue12/johnson_index.html
Juola, Patrick. 2008. Killer applications in Digital Humanities. Literary and Linguistic Computing 23 (1): 73-83.
Kansa, Eric C., Sarah Witcher Kansa, and Ethan Watrall (eds.). 2011. Archaeology 2.0: New Approaches to Communication and Collaboration. Los Angeles: UCLA Cotsen Institute of Archaeology Press. http://escholarship.org/uc/item/1r6137tb
Kooyman, Brian. 2006. Boundary Theory as a Means to Understanding Social Space in Archaeological Sites. Journal of Anthropological Archaeology 25: 424-435.
Liu, Alan. 2012. The state of the digital humanities. A report and critique. Arts and Humanities in Higher Education 11 (1-2): 8-41.
Llobera, Marcos. 2007. Reconstructing visual landscapes. World Archaeology 39 (1): 51-69.
Lock, Gary. 2010. Representations of Space and Place in the Humanities. In The Spatial Humanities. GIS and the Future of Humanities Scholarship, ed. David J. Bodenhamer, John Corrigan and Trevor M. Harris, 89-108. Bloomington: Indiana University Press.
Lyytinen, Kalle and John Leslie King. 2004. Nothing at the Center? Academic legitimacy in the Information Systems field. Journal of the Association for Information Systems 5 (6): 220-245.
Mahony, Simon and Gabriel Bodard. 2010. Introduction. In Digital Research in the Study of Classical Antiquity, ed. Gabriel Bodard and Simon Mahony, 1-11. Farnham: Ashgate.
McCarty, Willard. 2005. Humanities Computing. Basingstoke: Palgrave Macmillan.
McCarty, Willard and Harold Short. 2002. Mapping the field. Report of ALLC meeting held in Pisa, April 2002. http://www.allc.org/node/188 (accessed April 13, 2012).
Meeks, Elijah. 2012. Digital Humanities as Thunderdome. Journal of Digital Humanities 1: 94-96. http://journalofdigitalhumanities.org/
Neutens, Tijs, Tim Schwanen, and Frank Witlox. 2011. The Prism of Everyday Life: Towards a New Research Agenda for Time Geography. Transport Reviews 31 (1): 25-47.
Pilsch, Andrew. 2012. As Study or as Paradigm? Humanities and the Uptake of Emerging Technologies. Paper presented at the Modern Language Association (MLA) Conference, Seattle, January 2012. http://www.duke.edu/~ves4/mla2012/ (accessed April 13, 2012).
Presner, Todd. 2010. HyperCities: A Case Study for the Future of Scholarly Publishing. In The Shape of Things to Come, ed. Jerome McGann, 251-271. Houston: Rice University Press.
Rabinowitz, Adam. 2011. Review of 'Digital Research in the Study of Classical Antiquity'. Internet Archaeology 30. http://intarch.ac.uk/journal/issue30/rabinowitz.html
Ramsay, Stephen. 2011a. Who's In and Who's Out. Paper presented at the Modern Language Association (MLA) Conference, Los Angeles, January 2011. http://lenz.unl.edu/papers/2011/01/08/whos-in-and-whos-out.html (accessed April 13, 2012).
Ramsay, Stephen. 2011b. On Building. http://lenz.unl.edu/papers/2011/01/11/on-building.html (accessed April 13, 2012).
Ramsay, Stephen and Geoffrey Rockwell. 2012. Developing Things: Notes toward an Epistemology of Building in the Digital Humanities. In Debates in the Digital Humanities, ed. Matthew Gold, 75-84. Minneapolis: University of Minnesota Press.
Richards, Julian D., Stuart Jeffrey, Stewart Waller, Fabio Ciravegna, Sam Chapman, and Ziqi Zhang. 2011. The Archaeology Data Service and the Archaeotools Project: Faceted Classification and Natural Language Processing. In Archaeology 2.0: New Approaches to Communication and Collaboration, ed. Eric C. Kansa, Sarah Witcher Kansa and Ethan Watrall, 31-56. Los Angeles: UCLA Cotsen Institute of Archaeology Press. http://escholarship.org/uc/item/1r6137tb
Ross, Seamus and Ann Gow. 1999. Digital Archaeology: Rescuing Neglected and Damaged Data Resources. JISC/NPO Study within the eLib Programme on the Preservation of Electronic Materials. London: Library and Information Technology Centre. http://eprints.erpanet.org/47/01/rosgowrt.pdf
Scholes, Robert and Clifford Wulfman. 2008. Humanities Computing and Digital Humanities. South Atlantic Review 73 (4): 50-66.
Schofield, John (ed.). 2009. Defining Moments: Dramatic Archaeologies of the Twentieth-Century. Oxford: Archaeopress.
Shaw, Shih-Lung, Hongbo Yu, and Leonard S. Bombom. 2008. A Space-Time GIS Approach to Exploring Large Individual-based Spatiotemporal Datasets. Transactions in GIS 12 (4): 425-441.
Suri, Venkata Ratnadeep. 2011. The assimilation and use of GIS by historians: a sociotechnical interaction networks (STIN) analysis. International Journal of Humanities and Arts Computing 5 (2): 159-88.
Svensson, Patrik. 2009. Humanities computing as digital humanities. Digital Humanities Quarterly 3 (3). http://www.digitalhumanities.org/dhq/vol/3/3/000065/000065.html
Terras, Melissa. 2010. The Digital Classicist: disciplinary focus and interdisciplinary vision. In Digital Research in the Study of Classical Antiquity, ed. Gabriel Bodard and Simon Mahony, 171-189. Farnham: Ashgate.
Thomas, Julian. 2004. Archaeology and Modernity. Abingdon: Routledge.
Thomas, William G. 2012. What We Think We Will Build and What We Build in Digital Humanities. Journal of Digital Humanities 1: 77-81. http://journalofdigitalhumanities.org/
Zedeño, María Nieves. 1997. Landscapes, Land Use and the History of Territory Formation: an example from the Puebloan Southwest. Journal of Archaeological Method and Theory 4 (1): 67-103.

work_f5jg5mje5neyjo5h3iki2jkclm ----
Crowd simulation: A video observation and agent-based modelling approach
Shahrol Mohamaddan and Keith Case. Journal contribution posted on 03.11.2016.
Human movement in a crowd can be considered as complex and unpredictable, and accordingly large scale video observation studies based on a conceptual behaviour framework were used to characterise individual movements and behaviours. The conceptual behaviours were Free Movement (Moving Through and Move-Stop-Move), Same Direction Movement (Queuing and Competitive) and Opposite Direction Movement (Avoiding and Passing Through). Movement in crowds was modelled and simulated using an agent-based method using the gaming software Dark BASIC Professional. The agents (individuals) were given parameters of personal objective, visual perception, speed of movement, personal space and avoidance angle or distance within different crowd densities. Two case studies including a multi-mode transportation system layout and a bottleneck / non-bottleneck evacuation are presented.
Categories: Mechanical Engineering not elsewhere classified. Keywords: Agent-based modelling; Crowd simulation; Observational study.
School: Mechanical, Electrical and Manufacturing Engineering. Published in: International Journal of the Digital Human, Volume 1, Issue 3, Pages 229-247.
Citation: MOHAMADDAN, S. and CASE, K., 2016. Crowd simulation: A video observation and agent-based modelling approach. International Journal of the Digital Human, 1(3), pp. 229-247. Publisher: © Inderscience. Version: AM (Accepted Manuscript), made available under a CC BY-NC-ND 4.0 licence; the definitive published version is available at http://dx.doi.org/10.1504/IJDH.2016.10000735
Acceptance date: 03/03/2016. Publication date: 2016-10-18. DOI: https://doi.org/10.1504/IJDH.2016.10000735. ISSN: 2046-3375. Language: en.

work_fanwgwngu5d65lyl2rdbpkfbze ----
Illuminating Systemic Functional Grammatics (Theory) as a Viable Tool of Digital Humanities
How to Cite: Dalamu, Taofeek. 2019. "Illuminating Systemic Functional Grammatics (Theory) as a Viable Tool of Digital Humanities." Digital Studies/Le champ numérique 9(1): 8, pp. 1-50. DOI: https://doi.org/10.16995/dscn.287
Published: 23 April 2019. Peer Review: This is a peer-reviewed article in Digital Studies/Le champ numérique, a journal published by the Open Library of Humanities. Copyright: © 2019 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0). Digital Preservation: The Open Library of Humanities and all its journals are digitally preserved in the CLOCKSS scholarly archive service.

RESEARCH
Illuminating Systemic Functional Grammatics (Theory) as a Viable Tool of Digital Humanities
Taofeek Dalamu, Anchor University, Lagos, NG. lifegaters@yahoo.com

The focus of the study is the application of Systemic Functional Grammatics (SFG) to text as a facility of meaning-making. Having provided a wide room for technological devices to read and account for elements of a text, it portrays the exercise within the scope of Digital Humanities (DH).
The theory, championed by Halliday, describes a text from its systemic configurations to chain structures and social relationship frameworks. To explain the weight of SFG as an interface between text and technology, the author chose a poem, 'Area Boy', in which three perspectives of the mood system, thematic system, and transitivity system are instrumental to expose its nuances. The approach was followed by correlating the three systems together as a comparative analysis. The study reveals that 'Area Boy' operates in declarative clauses with heavy utilization of Subject and Finite. These are organized in marked themes. The contents of the text are represented in material processes (e.g. spent) with supports from both mental (e.g. remember) and verbal (e.g. said) processes. Some of the processes along with circumstances (e.g. Of washing …, Now that …) recur as repetitions for emphatic and enhancement purposes. On the one hand, the article concludes that SFG can assist in interpreting textual elements to generate meaning potential. On the other hand, through the SFG's metafunctional applications to 'Area Boy', one can suggest that the society should give a helping hand to the less privileged. Such a behavior can eradicate vices experienced through the 'Area Boys' from the society.
Keywords: 'Area Boy'; Digital Humanities; Mood System; Systemic Functional Grammatics; Thematic System; Transitivity System
Cette étude se focalise sur l'application de la grammaire fonctionnelle systémique (GFS) à des textes comme moyen de facilitation de la création de sens. Ayant pourvu une large marge pour des dispositifs technologiques pour lire et justifier des éléments d'un texte, cette étude présente la mise en pratique dans le cadre des Humanités numériques (HN). Cette théorie, promue par Halliday, décrit un texte, de ses configurations systémiques à ses structures de la chaîne et structures des relations sociales. Pour expliquer la signification de GFS comme interface entre texte et technologie, cet auteur se sert du poème « Area Boy », où trois perspectives du système de mode, du système thématique et du système de transitivité ont des rôles déterminants dans l'exposition des nuances de GFS. Nous avons ensuite fait une analyse comparative en corrélant les trois systèmes ensemble. Cette étude révèle que « Area Boy » fonctionne en des propositions déclaratives avec une utilisation intensive du Sujet et du Fini. Elles sont organisées en thèmes indiqués. Les contextes du texte sont représentés par des processus matériaux (par exemple, spent) avec des soutiens des processus mentaux (par exemple, remember) et verbaux (par exemple, said). Certains des processus, ainsi que des circonstances (par exemple, Of washing…, Now that…), se reproduisent en tant que répétitions pour des raisons emphatiques et appuyées. D'un côté, cet article affirme que GFS peut aider à l'interprétation des éléments textuels pour produire du potentiel de signification. De l'autre côté, à travers des applications métafonctionnelles de GFS à « Area Boy », on peut suggérer que la société doit donner un coup de main aux moins privilégiés. Un tel comportement peut éradiquer les vices venant de la société et ceux vécus par les « Area Boys ».
Mots-clés: « Area Boy »; Humanités numériques; système de mode; Grammaire fonctionnelle systémique; système thématique; système de transitivité

Introduction

In the light of development, digitization has become an inevitable phenomenon in human social affairs. As one feels its preoccupation in the physical and social sciences, digitization has also dynamically glided into the humanities, especially the literary world. As a result, scholars have resolved to employ technological and scientific devices to explore literary items to benefit readers (Burrows 2004; Craig 2004). For instance, Robinson and Saklofske (2017) interconnect computer software, mobile applications, etc. with narratives. The utilization of mobile apps, computer software, as modular systems, assists in synchronizing networks on the perception of narratives. Such effort encourages distance readings and algorithmic appreciations. It is on that critical plane that this study suggests the application of Systemic Functional Linguistics (SFL) as a reliable tool that can facilitate the meaning potential of literary texts. This is because SFL possesses capabilities to provide a type of analysis that can be utilized for close structural readings with semantic implications. The digitization of literature attracts theoretical terms as Warwick (2016) particularly emphasizes. The deployment of conceptual terminologies to promote digital humanities (DH) situates SFL very viably in this arena. Consequently, the appropriate applications of theoretical techniques, as interpretive strategies, forge lucid and direct partnerships, and cement strong relationships between research components and meanings derived from the materials (Schreibman, Siemens, and Unsworth 2004, xxv). The abilities of the theory to process texts into linguistic devices make scientific instruments like tables and graphs effective in accounting for grammatical and semantic frequencies in the form of Jockers and Underwood's (2016) quantitative methods. Also, SFL works well by supporting the computation of clause elements in their complex forms within the framework and methodology as discussed by Bradley (2004) and Drucker (2016).
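As an illustration of the kind of computation gestured at here, the sketch below tallies metafunctional labels across clause records. The clause texts and labels are invented for the example (loosely echoing the abstract's 'spent', 'remember', 'said'); they are not taken from the actual 'Area Boy' analysis, and a real study would derive them from the full manual annotation.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Clause:
    text: str
    mood: str        # e.g. declarative, interrogative, imperative
    theme: str       # e.g. marked, unmarked
    process: str     # e.g. material, mental, verbal, relational

# Hypothetical annotations of the kind an SFL analysis produces.
clauses = [
    Clause("I spent the day washing plates", "declarative", "unmarked", "material"),
    Clause("Now that I remember it", "declarative", "marked", "mental"),
    Clause("he said nothing", "declarative", "unmarked", "verbal"),
]

# Frequency tables per system - the raw material for charts and graphs.
for system in ("mood", "theme", "process"):
    print(system, Counter(getattr(c, system) for c in clauses))
```

The point of the sketch is the division of labour: the linguist supplies the metafunctional labels, and the machine turns them into the frequency tables and graphs that make the comparative analysis possible.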
As stated earlier, the need for a critical inquiry (Warwick 2016) stimulates the introduction of SFL as a reliable lens to manifest the nitty-gritty of a literary item (e.g. poem). This grammatics (Halliday 2013, 29) addresses this operation by characterizing the clauses of a specific poem into both structural labels and contextual situations. Apart from that, SFL considers language in the form of structures within the purview of socio-cultural manifestations (Kress 2010; Bartlett 2013; Dalamu 2017e, 2017h). This could be a reason for drawing a text into two separate planes. In Halliday and Hasan's (1985, 5) sense, "there is text and there is other text that accompanies it: text thus is with, namely the con-text." This notion of elements, associated with the text, pinpoints the production environment of the text. The socio-cultural norms, as Halliday and Hasan (1985) underscore, offer the text much meaningful detail. This is on the grounds that the context meshes with the text and its immediate indices as the unified element of communication. In every language production, Halliday and Hasan assert, there are two texts. The first text is the internal chains that bind the product of a text together as an indivisible entity of meaning (1985, 24–26). The menus of the structural elements are connected through cohesive ties (Eggins 2004; Dalamu 2018). The second, as characterized, is the context of the language of interaction (Halliday and Hasan 1985). This is the totality of the elements in the setting in which the language is applied. One can argue that there is nothing fascinating in analyzing a text for the purpose of its structural components. It is rather captivating when an analyst considers the constituents of a text within the profile of its socio-cultural plane (Ravelli 2000, 29). That suggestion is a probable projector of the text in the domains of cohesion and coherence. Cohesion describes the structure of the text while coherence realizes its context (Thompson 2004). Figure 1, below, adds flavor to the text and context abstractions of a piece of language in use.

Figure 1: Text and context expressed through coherence and cohesion.

The convention of coherence and cohesion, "merry-go-rounding" the text, ends up at the table of three metafunctions as shown in Figure 1, above. This explains the idea that the three metafunctions dominate and remain the focus of SFL. Both the meeting and melting point of coherence and cohesion are the three metafunctions (Halliday 1985; Matthiessen 1993). Through that synthesis, meaning is generated in text. Having said that, there are numerous conceptual frameworks in the theory that can assist in explaining texts. These are very possible without recourse to the celebrated three metafunctions. For instance, part of the grammatical metaphor has capabilities to explore a text independently (Thompson 2004, 220–224). Furthermore, within the domains of SFL, analysts can consider a text from the spheres of contextual, socio-semiotic, and multimodal perspectives (Hodge and Kress 1988; O'Toole 1994; Kress and Van Leeuwen 2003; Kress 2010; Dalamu 2017g). These are some of the incentives that propel the writer to suggest that SFL contains reliable resources useful in DH. One can feel the waves of DH in Riguet and Mpouli's (2017) characterization of dialogism of French discourse on literary criticism. As a result, Riguet and Mpouli
Other scholars model language from the perspective of discourse studies in relation to DH (Rodilla and Gonzalez-Perez 2017); expound the concept of “big data” (Castro 2017); reveal unusual non-count nominals in Modern English (Svensson 1998); and elucidate historical discourses of race in literary elements (Lee et al 2018). Of importance is Kreniske and Kipp’s (2014) insight on the influence of DH on the documentation of social values of South African San nationality. This study, as a contribution to earlier analyses, explicates the application of technology that relies on the result of the application of SFL concepts to the text. As a practice, the analyst has applied SFL to Adesanmi’s “Area Boy” (Adesanmi 2010, 308). This exercise displays the influence that SFL can have on a text in terms of the writer’s style and corpus development. In other words, the paper discusses how technology has assisted the analyst to do a reading of a specific poem (“Area Boy”) within a framework of functional grammar. It is the hope of the author that this will trigger further research efforts, channeled in a similar direction of this course. As a fundamental textually- focused theory, SFL exercises its vitality on the grammar of a language, considering the clause, as the center of analysis. Grammar refers to the structural system of wordings of a language (Yule 1985, 69; Dalamu 2017a, 268) that can be viewed from below, from around, and from above. The functions and analysis of such a language are also carried out in the way that a language communicates (Burke 1969; Quirk and Greenbaum 1973; McGregor 1992; Halliday and Matthiessen 2004). Dalamu: Illuminating Systemic Functional Grammatics (Theory) as a Viable Tool of Digital Humanities Art. 8, page 6 of 50 It is pertinent to argue that the analysis of a quantum of grammatical elements cannot be done haphazardly because grammar itself is an organized event. By implication, a consideration for making meaning from the grammar of a language must not be operationalized chaotically. Rather, its organization must be in sequences. Thus, it is also demanded that the theoretical application on grammatical structures must begin from somewhere, that is, its constituted ordering. The concern drives SFL to start the analysis of text from the clause, its nerve, as applied later. This shows that the examination of every grammatical unit and function(s) has a connection with the clause. Thus, it is obligatory for every user of SFL to get acquainted with the clause and varieties of building blocks attached to it in either simple or complex forms. In corollary, Ravelli (2000, 29) points out that the key to beginning a systemic analysis is to identify a clause, which is the hub of grammar. Following Ravelli (2000), the clause is similar in concept to a sentence, except that a sentence pertains to written language, whereas a clause applies also to spoken language. In a specific sense, a clause represents a state of affairs. X-raying systemic functional linguistics Unlike so many ideas within schools of linguistics, SFL comes along with many linguistic tickets, as means of constructing and illuminating the thoughts of the exponents. The major exponential ancestors are fundamentally Saussure on Syntagmatic and Paradigmatic (De Beaugrande 1991), Bühler on the three functional models of language (Innis 1987), and Malinowski on Context of Situation (Malinowski 1935; Bailey 1985). 
The link of Hejelmslev to the theory is on Theme that taps its currency from the Prague School (Halliday 1994). Firth is always remembered for the concept of System – a system of systems or being polysystemic (Firth 1957; Butler 1985), while Hasan is notably the propagator of Context of Culture (Halliday and Hasan 1985), and Halliday on the Three Metafunctions (Halliday 1973). However, the configuration, harmonization, and development (to an extent) of the contributions of the intellectual progenitors of SFL’s identities rest on Michael Alexander Kirkwood Halliday. Actually, Halliday conceives the bright idea of SFL from the confluence of the thoughts of earlier scholars (e.g. Firth). The convergence occurs through the scrutiny and careful selection of useful materials as platforms for constructing SFL. Dalamu: Illuminating Systemic Functional Grammatics (Theory) as a Viable Tool of Digital Humanities Art. 8, page 7 of 50 That insight elevates Halliday’s pedigree as synonymous with SFL (Dalamu 2017c). This is because Halliday does not only make choices from scholarly resourceful materials; the sage also champions the compatibility of the raw materials; and moreover, injects invaluable terms to the subjects that SFL accommodates. This study, in that regard, considers Halliday as the architect as well as the mason of the theory. The centrality of the clause to grammar, as mentioned earlier, cannot be undermined. The fragmentation of the clause produces phrases and words; the elaboration leads to the formation of clause complexes. The writer points out that every statement deployed by an interactant either in the form of the spoken or written language has its origin negotiated in the clause. Such place of occupancy encourages systemicists to make the clause kernel in analyses rather than the sentence. The significance of the veins of the clause operations on the text can be demonstrated, as in Table 1, below. In one way or another, SFL is functionally-cyclical, most especially, in the dominance of the clause in all operations. Besides, point (v), (vi), and (vii), in Table 1, link the clause again to the three metafunctions. Beginning the construction of meaning of a text (e.g. poem) from below the clause (e.g. word) to a full-fledged clause and ending the exercise around the clause (e.g. discourse) is, perhaps, a sign Table 1: Domains of the clause in SFL. Numbers Systemic Terms Grammatical Elements Examples I Below the clause Words, phrases Search, a system code Ii Around the clause Cohesion, texture, discourse Engineered differently Iii Above the clause Clause complexes I sing yet dance in church Iv Beyond the clause Metaphorical interactions Statistical comparisons elevate research V Clause as exchange Mood system As in Figure 8, below Vi Clause as message Thematic system As in Figure 9, below Vii Clause as representation Transitivity system As in Figure 10, below Dalamu: Illuminating Systemic Functional Grammatics (Theory) as a Viable Tool of Digital Humanities Art. 8, page 8 of 50 of building up meaning from the scratch to a broad meaning derivative. It is on that ground that SFL serves as an interface between a poem (e.g. “Area Boy”) and technological devices (e.g. graphs) in order to position “Area Boy”, as an entity of DH. Digital Humanities: Historical developments In the historical development of DH, the name of Roberto Busa is estimable. Roberto Busa was a Jesuit priest who picked interest in building a concordance for the works of Thomas Aquinas in 1949. 
That assiduous effort charted a pioneering course for what is known as DH today (Ess 2004, 133). Though a very tedious journey, the objective was to record the words of Aquinas' writings in what Busa referred to as an index verborum (Busa 1980). The effect of that singular act remains a point of reference to this day. In Crane's observation, Busa's attempts transcended any other struggle in the hemisphere of lexemic accountability (Crane 2004, 47). The difficulty experienced in doing manual operations influenced Busa to seek help from IBM to accelerate the counting and ensure accuracy (Burton 1981a, 1981b).

Hockey's perspectives on DH history

Hockey's account of the history of DH takes the effort of Father Busa as its point of departure (2004, 4). In this classification, 1949 to the early 1970s marks the beginnings, where the index verborum and cum hypertextibus are reiterated. The years of consolidation fall between 1970 and the mid-1980s, witnessing the journal Computers and the Humanities, conferences, the writing of computer programs, along with the establishment of computer centers. Personal computers that foster innovation, as Hockey (2004) reports, became necessary for scholarship to snowball the development of DH in the 1980s and early 1990s. Of significance in that era is the long-standing impact of the publication of the Humanities Computing Yearbook for the storage or archiving and promotion of scholarly projects, software, and publications (ibid.). The herald of the World Wide Web (WWW), as the "culmination" of the 1990s, became the irresistible boost that welcomed researchers to first-class information in, perhaps, any subject. Moreover, the WWW gives license to anyone to publish. This promotes scholarly works as much as it eliminates the constraint of printing pages of books. There is no page limitation, and every publication can be reviewed from time to time. Another great merit of the WWW/URL is that any publication can be accessed from any part of the globe as long as it is not password-protected. Hockey (2004, 17) submits that, "now that the Internet is such a dominant feature of everyday life, the opportunity exists for humanities computing to reach out further than has hitherto been possible" (also in Svensson 2010; Jørgensen 2016).

Digital Humanities: Definitions and domains

Perhaps scholars have been wisely and systematically soft-pedaling on defining the term Digital Humanities (e.g. Burdick et al. 2012). This is because DH is a subject likely to expand beyond human imagination among its contending sub-disciplines (Kirschenbaum 2010, 58). The discipline also addresses many of the research challenges on methodological paradigms (Schreibman, Siemens, and Unsworth 2004, xxx). However, attempts have been made to describe the contents of the fast-growing and developing DH. Thus, Busa (2004) claims that DH is precisely the automation of every possible analysis of human expression (therefore, it is exquisitely a "humanistic" activity), in the widest sense of the word, from music to the theater, from design and painting to phonetics, but whose nucleus remains the discourse of written texts (Busa 2004, xvi). This perspective is very broad. It is coherent, Busa explains, with all possible human social endeavors. The pointer in the description is the text. Again, at this point, the relevance of SFL to text can be referred to.
As the nucleus of DH is the text, the same text is the hub of SFL as well as the wheel of language. SFL seems the chair of DH and language because of its theoretical underpinning in both textual claims and social connections (Wodak and Meyer 2001, 3–9). As such, the domains of SFL are contextually expressed, as publicized earlier in Figure 1, in the terminology of cohesion and coherence. The study sees a joint venture between DH and SFL. DH seems to embrace two customary but academic lifestyles by creating a robust and intertwining relationship between the digits (1, 2, 3, etc.) and the alphabet (a, b, c, etc.). The partnership extends to signs and figures of various pluralistic annotations, such as scientific representations of symbols in different forms, capacities, and functions. This is where SFL can create an interconnection between the two entities by positioning every element of a clause in the appropriate place. This advances the examination of the events of the "humanities" to produce meanings. This opportunity can be a compelling reason for Busa (2004, xxx) to elucidate the evolving but permanent association of the humanities and computer utilization as signaling "The finger of God." Burdick et al. (2012, vii) recognize the communicative divine signature by validating that, unlike in the past, researchers in the humanities today live and function in "rare moments of opportunity" with the potential to play a vastly expanded creative role in public life. Computerization seems to have aided such transformation. The testimony of Burdick et al. (2012) places a wide gulf between the knowledge of the precursors of the humanities and present-day humanists. The current information age (or golden age) negotiates workable and lasting relationships between human expressions/lifestyles and computer applications, unlike past generations. However, as the arts construct enduring relationships with computerization, the disciplines are not in any way retreating from the long-standing tenets of their founding fathers. DH is a foremost development of "the purview of the humanities, precisely because it brings the values, representational and interpretive practices, meaning-making strategies, complexities, and ambiguities of being human into every realm of experience and knowledge of the world" (ibid.). This suggests that a major contribution of DH is the creation of additional value in the arts through the application of computerized interpretive equipment. As such, technological tools are capable of advancing and enhancing the meaning-making of human activities, where SFL serves an intermediary function. Furthermore, the Digital Humanities Manifesto articulates DH as an array of convergent practices in two senses. One, it explains that "print is no longer the exclusive or the normative medium in which knowledge is produced and/or disseminated; instead, print finds itself absorbed into new, multimedia configurations" (Digital Humanities Manifesto n.d.). The usual manner of exercises in print has changed to an advanced level. Two, "digital tools, techniques, and the media have altered the production and dissemination of the knowledge of arts, human, and social sciences" (ibid.).
It is not contentious that DH has taken the arts from their deep-seated artistry to a sustainable "scientific" level of functionality. Perhaps, sooner or later, all disciplines in the ivory towers, rather than awarding B.A. degree titles in the humanities, will transit to awarding a B.Sc. to every bachelor of a university. This projection depends on applications of computational infrastructures to the humanities, which have the capacity to realize the dream proposed (Edmond 2016; Montfort 2016; O'Donnell, Walter, Gil, and Fraistat 2016). The responsibilities of DH dominate all fields where human beings operate (Butler-Kisber 2013). This is because the applications of computer facilities are limitless, most especially when one correlates every action with the renowned slogans of IBM, "Everything you need to build anything you want" and "THINK" (IBM 2017; Creative Block Inc. 2017). This is made possible and effective because writers of computer programs receive instructions that assist them to produce a program parallel to a particular operational need and demand (Peirson, Damerow, and Laubichler 2016). The more the scope of human endeavor widens, the broader the areas of DH's occupancy. Among others, DH is applied to linguistics, literary studies, music, graphic arts, and archaeology (Schreibman, Siemens, and Unsworth 2004).

Remarkable suggestion of John Unsworth

Father Busa, perhaps the most distinguished pioneer of the well-known DH echelons (Schreibman, Siemens, and Unsworth 2004, ix), did not label the subject DH. Before 2001/2002, Busa and his contemporaneous scholars had been tagging the remarkable activities on the new idea of textual accountability as Index Thomisticus, Lessico Tomistico Biculturale, concordance, humanities computing, etc. The construction of a universally acceptable title for what Busa started in 1949 rests on John Unsworth (Unsworth 2002), the same way that the construct of Discourse Analysis resides with Zellig Harris, and Context of Situation rests on Bronislaw Malinowski (Malmkjaer 2004). The big idea, according to Unsworth, came to him while negotiating the title of the book, A Companion to Digital Humanities, with the representative of the Blackwell publishing company (Unsworth 2010; also in Kirschenbaum 2010, 56–57). Although the labeling rests on Unsworth, DH is a child of circumstance born by chance. However, it is pertinent to think back to Busa's assertion on "Digitus Dei est hic! i.e. The finger of God is here!" Busa perceives the phenomenon as an outstanding activity involving human beings, yet charged and influenced by God. That is the rationale for Busa to add that "it is just like a satellite map of the points to which the wind of the ingenuity of the sons of God moves and develops the contents of computational linguistics, i.e., the computer in the humanities" (Busa 2004, xvi). Very salient in Unsworth's (2010) construct is the adjective "digital." The coinage, in Unsworth's standpoint, appears in order to move away from the simple digitization of lexemes. "Digital" as a modifier signals a form of "sporadic" shift from the counting of words into all manifestations of humanistic operations. The "randomization" of the affiliation of computer technological applications to various humanistic domains is a probable factor that has prevented the discipline of DH from having a single face-value definition.
SFL: The interface between "Area Boy" and technology

Although the three metafunctions of SFL – interpersonal, textual, and experiential – are the theoretical concepts of the study, it is significant to demonstrate the function of SFL in the study as manifested in Figure 2, below. The portion in blue (identified as A) is the poem, "Area Boy", while the portion in green (identified as C) is the technology. On the one hand, "Area Boy" is a piece of literature that contains textual elements with embedded meaning potential. On the other hand, the green portion is the facility useful for calibration. Because it is seemingly difficult for the technological device to approach "Area Boy" directly in order to generate systemic meaning, SFL (identified as B) bridges the lacuna. Its application processes "Area Boy" structures into countable values that the technology can accommodate.

Figure 2: Relationship between SFL and technology.

The systemic operations permit the technological facility to act on "Area Boy." In simple terms, the outcomes of the application turn the whole exercise on "Area Boy" into semiotic slots of SFL, and SFL into computerization devices, in order to operate as DH. The theoretical application of SFL is the wheel that turns "Area Boy" into an entity of DH. Besides the current application, as mentioned earlier, SFL, with the use of any of its concepts (substitution, ellipsis, grammatical metaphor, coherence, context of situation, etc.), can be applied to texts for meaning-making.

Theoretical breadth

Significantly, a demonstration of SFL as a very resourceful tool of DH inspires the author to adopt the three metafunctions as the relevant conceptual entities. That being said, the three metafunctions, as mentioned earlier, are the core concepts of SFL. The application of the triadic terms to a text provides the target audience with structural, paradigmatic, and contextual meanings (Eggins 2004). Table 2, below, shows the operational slots of the three metafunctions. The grammatical spheres of the metafunctions shown in the analyses of Figures 8, 9, and 10, below, make it very possible to earmark semantic slots to the structural organs of the clauses of "Area Boy." The system networks in Figures 3, 4, and 5, below, are indicators of the metafunctions, operating from below, from around, and from above. However, some of these functions are basically intrinsic. Besides, a system network represents the choice that a language user makes out of the numerous ones available to the individual. Contextual implications of the interpersonal, textual, and experiential metafunctions are accommodated discursively.

Table 2: Three metafunctions' operational slots.

Terminology                | Grammatical Sphere | Paradigmatic   | Context
Interpersonal metafunction | Mood               | System network | Tenor of discourse
Textual metafunction       | Theme              | System network | Mode of discourse
Experiential metafunction  | Transitivity       | System network | Field of discourse

Mood system

Thompson (2004) probes the interpersonal metafunction as a device that fulfils the "performative" roles of every addresser to the addressee. The concept reveals either constitutive functions or ancillary functions.
The speech roles, Thompson (2004, 46–47) emphasizes, permit questions (interrogatives), commands (imperatives), statements (declaratives), and offers (modulated interrogatives) to be realizable in discussions (also in Dalamu 2017b, 190–193). However, Halliday and Matthiessen (2004, 111–132) characterize the main structural organs of the mood as the Subject and the Finite respectively. The Predicator, Complement, and Adjunct, in Bloor and Bloor's (2013) conceptualization, are components of the residue. Figure 3, below, further explains the system network of the mood choices. Apart from exclamation marks and "sets" in English, the choices of the indicative and the imperative are clearly open in communicative activities.

Figure 3: Mood system network (Thompson 2014).

Thematic system

Theme and rheme fall into the organizational ideas of a text. A user of a language determines the componential arrangements of the communication as desired (Halliday 1994, 34–67). Apart from that, the function that the language is deployed to achieve has a great implication on the background details of a discourse. Rashidi (1992, 192) illuminates the theme as the starting point of the message. That is, the constituent that begins moving the encoder towards the essence of the communication. It is the essential ideational jumping-off point directing the decoder's attention to the ultimate goal of the communication. The theme, in Rashidi's approach, begins a clause irrespective of the linguistic device experienced at the start-up. In other words, it gives a track to text productions. It is that operational condition that further influences Rashidi (1992, 197) to describe the rheme as the nub of the message of a clause, despite the obligatory appearance of the theme in any construct (e.g. NG, Prep G, VG, Adj G or Adv G). This manifests the essential position of the theme in structures, except in situations of elliptical lexical amenities. Themes of a text operate in different ways. An unmarked theme occurs when the topical theme functions as the subject of the clause. A marked theme occurs when the theme of the clause is not the subject. A topical theme operates whenever a participant, process, or circumstance realizes the theme. Thematic themes arise before the topical theme. Exclusive discussions of the theme are in Halliday and Matthiessen (2004, 64–87), where the point of departure, orientation, and location connecting the social reality realize the theme (Ellis 1987, 113–121; Dalamu 2017f). Figure 4, below, reveals the system network of the thematic system of the clause, exhibiting textual, interpersonal, and experiential/ideational elements as the configuration of multiple themes. The system in Figure 4, below, shows the theme and rheme as two separate tools of interpretation.

Transitivity system

Bloor and Bloor (2004) argue that the experiential metafunction encodes the speaker's experience by allowing language to play a critical role that accommodates the goings-on and the participants involved in the activities. In consonance with that perspective, Halliday and Matthiessen (2004, 168–259) describe the content and participants, sometimes encompassed with circumstantials, as crucial in disseminating information through the experiential (McGregor 1997). Valuable materials are in Martin (1992) and Halliday and Matthiessen (2014).
Figure 5, below, elucidates the experiential metafunction, showing the processes and participants.

Figure 4: Thematic system network (Halliday and Matthiessen 2014).

Figure 5: Transitivity system network (Dalamu 2017i).

Figure 5, above, reveals six processes functioning in the English language. Material, mental, and relational processes are major, while behavioral, verbal, and existential processes are minor. These systemic facilities are minor because the behavioral, verbal, and existential occur at the peripheries of the major processes (Halliday and Matthiessen 2014; Dalamu 2017d). The occurrence of the processes in a disc-like format is another operational figure that throws more light on the cyclical nature of SFL (Halliday 1994, 108). The linear sequence of Figure 5, above, informs the introduction of the second material label as a caricature. Figure 6, below, illustrates the compatibility of the interpersonal, textual, and experiential metafunctions on a clause, indicating their unbroken relationships. The partnership pinpoints the way that the meaning potential of a text is realizable in three different systemic forms in order to generate meanings (Bloor and Bloor 2013; Fontaine 2013; Dalamu 2017c).

Figure 6: Three metafunctions composite system network (Dalamu 2017i).

This study demonstrates the treaty of the three metafunctions in Figure 6, above, in order to serve the good purpose of understanding their usefulness in DH through technological appreciations.

Methodology

The author has chosen the poem "Area Boy", written by Pius Adesanmi, out of the 231 poems in Lagos of the Poets, which addresses issues concerning Lagos State, Nigeria (Adesanmi 2010). Consequently, the book has huge implications concerning Lagos and Nigerian society at large. It is also a means of creating a global awareness of the meaning of "Area Boy" in this part of the world. Above all, the choice of the poem allows the study to exhibit the resultant effects of SFL on a literary text.

Design

"Area Boy" has been divided into clauses with the systemic traditional style of slash demarcations, that is, "///" and "//". "///" signifies the beginning and the end of a stanza, while "//" serves as a simple clause separator. These are the reasons for observing slashes in the data presentation, below. The analysis of "Area Boy" has undergone three different spheres of the mood, thematic, and transitivity systems in order to reveal the application of each of the three metafunctional instruments in clear functional terms. In the mood system analysis, S = Subject, F = Finite, P = Predicator, C = Complement, A = Adjunct, and Mod Adj = Modal Adjunct. The study also uses Circ for Circumstance, Pro for Process, Loc for Location, and Ident for Identifying.
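By way of illustration only, the slash demarcation described above can also be operationalized mechanically. The short sketch below is hypothetical (the sample string is abridged from the poem, and the function name is invented); it simply splits a marked-up text into stanzas and clauses on the "///" and "//" conventions.

```python
# Minimal sketch: splitting a text marked with the systemic
# demarcation conventions described above ("///" bounds a stanza,
# "//" separates clauses). The sample string is hypothetical.

def split_clauses(marked_text):
    """Return a list of stanzas, each a list of clause strings."""
    stanzas = []
    # "///" marks the beginning and the end of a stanza.
    for stanza in marked_text.split("///"):
        # "//" is a simple clause separator within a stanza.
        clauses = [c.strip() for c in stanza.split("//") if c.strip()]
        if clauses:
            stanzas.append(clauses)
    return stanzas

sample = "///At times you still remember Those agonizing years// you spent As a cheap labourer///"
print(split_clauses(sample))
# [['At times you still remember Those agonizing years',
#   'you spent As a cheap labourer']]
```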
Measures

After the systemic analysis, the researcher exploited AntConc, a text-computing technology (Laurence Anthony's software), to account for the processes in "Area Boy." The first step was to identify and write down the processes on a piece of paper, after which the entire "Area Boy" file was loaded into AntConc by selecting the "Open Files" icon in the "Navigation Menu" and "Concordance" in the "Tool Tabs". Once the "Area Boy" file had appeared in the "Corpus File" window, each process term (e.g. remember) was entered into the dialogue box on the left-hand side of the "Control Panel" of the AntConc window. AntConc displayed the frequency in two forms after "Start" was clicked. The computing instrument showed the word recurrence in the "KWIC Results Window" and the digit in "Concordance Hits", as shown in Figure 7, below.

Figure 7: A sample screen of AntConc.

Details about AntConc are in Anthony (2018). That exercise was conducted to ensure the recurrence accuracy of the texts. Besides, AntConc supported the investigation by harvesting the frequency of other linguistic elements such as at times, you, your, and nobody. Thereafter, the researcher utilized a Microsoft Excel worksheet (e.g. Figure 8, as publicized later) to further support SFL in processing the clauses in "Area Boy." The Excel worksheet became a fundamental tool for achieving accurate classifications of the structures that the metafunctional components have realized. As AntConc does not have the capacity for systemic appreciations, manual counting of the grammatical constituents in the semiotic slots became inevitable. To this end, quantitative operations, following Jockers and Underwood (2016), Drucker (2016), and Dalamu (2017i), allow tables to compute the grammatical elements in the semiotic slots into appropriate values. Each table is further schematized into a simple graph for prompt examination of the operational facilities of the systemic elements. The scientific interpretation can assist the reader with easy accessibility to the functional domains of the poem. The graphs of the mood, thematic, and transitivity systems expressed in Figure 14, later below, are cumulated into a single piece to reveal the relationships of the three metafunctions.
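For readers without AntConc, the "Concordance Hits" figure that it reports for a search term can be approximated, for plain frequency counting, with a few lines of general-purpose code. The sketch below is a stand-in illustration only, not AntConc itself and not the author's procedure; the file name area_boy.txt is an assumption, and the term list follows the items mentioned above.

```python
# Sketch of the frequency-counting step: tally how often each
# pre-identified term occurs in the poem, roughly what the
# "Concordance Hits" figure in AntConc reports for a search term.
import re
from collections import Counter

# Hypothetical file name for a plain-text copy of the poem.
with open("area_boy.txt", encoding="utf-8") as f:
    text = f.read().lower()

# Tokenize on letters and apostrophes, then tally every token.
tokens = re.findall(r"[a-z']+", text)
counts = Counter(tokens)

for term in ["remember", "you", "your", "nobody"]:
    print(term, counts[term])

# A multi-word item such as "at times" needs a phrase search instead.
print("at times", len(re.findall(r"\bat times\b", text)))
```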
Procedure

The analytical as well as reading processes in Figures 9 to 14, as later illustrated below, inform the patterns of the discussion. However, the discussion gives preference to the transitivity system because that grammatical term provides expressions for the contents of the clause. Besides, as the transitivity shows concern for the narrator's experience (inner and outer), the terminology also communicates universal relations of subcomponents of logical items (Butler 1985; Olivares 2013).

Data presentation

The items below are the data of "Area Boy", written in paragraphs and poetic lines.

Area Boy

///At times you still remember
Those agonizing years// you spent
As a cheap labourer in the General's farm
Tilling, toiling and sweating in the sun
For the pittance// they flung at you once in a month//
Yet nobody said anything then.//

///At times you still remember
The painful years// you spent
As a reluctant houseboy in Ikoyi//
Oga's callousness still haunts your steps//
Madam's overbearing attitude you cannot forget//
Yet nobody said anything then.///

///At times you still remember
The psychological oppression
Of watching their scions spray dollars in parties
Of their limousines splashing water on you in the streets
Of your wondering// where you went wrong//
//Yet nobody said anything then.//

Now that something in you has snapped//
Now that you can no longer stomach it//
Now that you're fighting back in the streets//
//Lashing out at the system///

///Stinging the molochs// who operate it//
And cowards who tolerate it//
It is time for them to call you names:
Tout! Vagrant! Vandal! Area boy!///

///Brother, your being an Area Boy is now the issue//
Nobody will ever bother to excavate
The fossils of disenchantment
Buried deep down in your soul.//

Data analysis

Figures 8, 9, and 10, below, display the application of the mood, thematic, and transitivity systems to the poem, "Area Boy." The investigation further exhibits the frequencies of the grammatical constituents of "Area Boy", based on SFL's applications in Figures 8, 9, and 10, in tables and graphs, as expressed in the results section.

Results

Mood system of the "Area Boy" analysis

Table 3, below, displays the values of the semiotic slots in the "Area Boy" mood analysis in Figure 8, below.

Table 3: "Area Boy" mood system recurring values (per-clause values for CL1–CL21, followed by the total).

S | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 | 19
F | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 | 19
P | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 | 19
C | 1 0 0 1 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 | 16
A | 2 2 2 1 1 2 2 0 1 5 1 1 2 1 1 0 0 0 1 1 1 | 27

Figure 8: "Area Boy" mood analysis.

Figure 9: "Area Boy" thematic analysis.

Figure 10: "Area Boy" transitivity analysis.

Figure 11, below, is the cumulative of the values computed in the mood system in Table 3, above.

Figure 11: "Area Boy" mood system calibration.

Figure 11, above, indicates the Adjunct as the priority because it is the most functional element in the "Area Boy" text. The Subject, Finite, and Predicator are next, with the Complement being the least functional device. The figure shows that the text is constructed in declarative clauses, issuing statements to the target audience in order to show the feelings of the speaker.
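For readers who want to re-derive this calibration, the totals in Table 3 and a bar chart in the manner of Figure 11 can be recomputed from the per-clause counts. The sketch below transcribes the Table 3 rows; matplotlib stands in here for the Excel charting described in the Measures section, so the plotting library is an assumption of this sketch, not the author's tool.

```python
import matplotlib.pyplot as plt

# Per-clause counts transcribed from Table 3 (CL1..CL21).
mood = {
    "S": [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1],
    "F": [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1],
    "P": [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1],
    "C": [1,0,0,1,1,0,1,0,1,1,1,0,1,1,1,1,1,1,1,1,1],
    "A": [2,2,2,1,1,2,2,0,1,5,1,1,2,1,1,0,0,0,1,1,1],
}

# Recompute the Total column of Table 3.
totals = {slot: sum(values) for slot, values in mood.items()}
print(totals)  # {'S': 19, 'F': 19, 'P': 19, 'C': 16, 'A': 27}

# Draw a simple calibration bar chart in the manner of Figure 11.
plt.bar(list(totals), list(totals.values()))
plt.title('"Area Boy" mood system calibration')
plt.show()
```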
Thematic system of the "Area Boy" analysis

Table 4, below, reveals the values of the semiotic slots in the "Area Boy" thematic analysis in Figure 9, shown earlier, above. Figure 12, below, is the cumulative of the values specified in the thematic system in Table 4.

Table 4: "Area Boy" thematic system recurring values (per-clause values for CL1–CL21, followed by the total).

Theme 1 | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 | 19
Theme 2 | 1 0 0 1 1 0 0 1 1 1 1 1 1 1 0 0 0 1 0 1 0 | 12
Theme 3 | 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 | 4
Rheme   | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | 21

Figure 12: "Area Boy" thematic system calibration.

The rheme is the most prominent in Figure 12, above. This is the core of the message of "Area Boy." Besides, Theme 1 recurs in almost all the clauses. This signals that the organizations of the clauses are hardly elliptical. The structures are complete statements that sometimes have Theme 2 as a support for the clauses' points of departure. Theme 3 is available only in clauses 12, 13, 14, and 18. That points out the rarity of Theme 3 in the textual operations.

Transitivity system of the "Area Boy" analysis

Table 5, below, shows the values of the semiotic slots in the "Area Boy" transitivity analysis in Figure 10, above. Figure 13, below, is the cumulative of the values manifested in the transitivity system in Table 5.

Table 5: "Area Boy" transitivity system recurring values (per-clause values for CL1–CL21, followed by the total).

Material     | 0 1 1 0 0 1 1 0 0 0 0 1 0 1 0 1 1 0 0 0 1 | 9
Mental       | 1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 | 5
Relational   | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 | 2
Behavioral   | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 | 2
Verbal       | 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 | 3
Existential  | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 0
Circumstance | 2 2 1 0 2 2 0 0 0 5 0 1 1 2 0 0 0 0 1 1 0 | 20

Figure 13: "Area Boy" transitivity calibration.

Material processes record the highest value in Figure 13, above. This is in alignment with the claim of Halliday and Matthiessen (2004) that material processes are the most deployed in language usage. Apart from mental processes, which operate at a frequency of five, other processes, such as the relational and behavioral, operate at minimal levels of two points each. It is surprising that verbal processes, with three points, function in a relatively similar category to the other minor processes. This seems to happen because the narrator makes a sort of reported speech from time to time.

Three metafunctions of the "Area Boy" analysis

Figure 14, below, is the cumulative of the values exhibited in the mood, thematic, and transitivity systems, as displayed earlier in Figures 8, 9, and 10, above.
Figure 14: "Area Boy" metafunctions' relationship calibration.

Figure 14, above, demonstrates the Adjunct of the mood system, the rheme of the thematic system, and the material processes of the transitivity system as the highest in functional values, followed by mental processes. By implication, SFL illustrates the Adjunct, rheme, and material processes as the strongest areas of domination in the "Area Boy" text. These are followed by the Subject, Finite, and Predicator of the mood system, and Theme 1 of the thematic system. The computing outcomes of the SFL analysis of "Area Boy" indicate analytical skills that can augment the cross-fertilization of ideas across disciplines. The graphical appearance of textual elements creates a sort of communicative interaction for the audience in an easy way.

Discussion

There are five stanzas in the poem "Area Boy." The segments explain the concern of the narrator about an "Area Boy" named George in the epigraph, which is not actually integral to the stanzas. It is striking to read from the epigraph that the poem is for George, the "Area Boy", who opened up a bitter heart to me at Ojuelegba, Lagos. This revelation specifies the source of the poet's influence as well as the focus. Ojuelegba is an important part of Yaba, Lagos (not far away from the renowned University of Lagos), where influential and highly respected people live. As a heartbeat of Yaba, the mention of Ojuelegba anywhere in Nigeria signifies something remarkably different. It is a signpost to a very small portion of land with a flyover. Underneath the flyover are motor parks, petty trading activities, and prostitutes. On top of these, Ojuelegba is a domain for miscreants twenty-four hours a day. In all these, Ojuelegba points to a place where prostitutes transact business. However, the Governor of Lagos State between 2007 and 2015, Babatunde Raji Fashola, cleansed Ojuelegba of prostitutes and miscreants. Fashola positioned the place as a worthwhile environment during his reign, and that drive has been sustained till today. Fashola is now the Minister of Power, Works, and Housing in the current Buhari administration. Despite the thorough cleansing, the negative nuances attached to Ojuelegba have become very difficult to remove from people's mindsets. The displacement of the "Area Boy" from Ojuelegba might have given rise to the heart-rending poem "Area Boy."

The author approaches the discussion from the broad views of the systemic organization of the clauses in the stanzas and the semantic values attached to the clauses, most especially from the goings-on. The poem opens with a circumstantial element of time, At times, to indicate a consistent feeling of the "Area Boy" concerning the issues of life that he has undergone. This is expressed through a mental process, remember. Remember illustrates the trauma in the cognitive capacity of the "Area Boy." The recurrence of you projects the poet as a voice for the "Area Boy" because the Actor, you, refers to what has happened to an individual, as the experience of the past that connects to the "Area Boy's" present condition. In every stanza, except the fifth, you functions at least three times consecutively. The applications of the Actor, you, present the poem as containing mostly declarative clauses. As statements, Subjects and Finites operate well in propagating the interactive nature of the clauses (Thompson 2004).
All the clauses are declarative except CL15 and CL16, which operate without mood elements. Apart from CL1, where the Subject, You, takes the Finite, remember, in the present tense, the Subjects You, They, and nobody in CL2, CL3, and CL4 take their Finites, spent, flung, and said, in the past. Out of the past elements, spend, fling, and say are systemically deduced as Predicators. This operation reveals SFL as a viable tool for separating a fused verbal group structure into two distinct systemic forms that function in the domains of the Finite and the Predicator (Halliday and Matthiessen 2014; Dalamu 2017b).

The Participant, Those agonizing years, analyzed earlier as Phenomenon in Figure 10, CL1, is a painful expression that demonstrates how the "Area Boy" has been subjected to modern-day slavery in the General's farm, where the individual works and receives a meager salary. The circumstantial communicative device, in the General's farm, seems to refer to an individual who was an Army General and, after his service years, retired to establish a farm to generate money. In that course, the "Area Boy" becomes a useful-cum-precious tool in the farm. This is because the average Nigerian graduate detests tilling the land. Most elites are in search of, and doing, white-collar jobs. Perhaps that attitude has contributed to the importation of food from most parts of the world into the country. Even those who read agricultural (or related) sciences in universities may not be ready to practice farming, whether in livestock or crop production. It is in that light that the "Area Boy" becomes a resourceful personality for the General, as expressed in CL2. That exploitation encourages the speaker to conclude that Yet nobody said anything then. The clause, with a verbal process, indicts every onlooker at the manhandling of the "Area Boy." The poet expects that all concerned should have raised their voices against the abuse of the rights of the "Area Boy" in the General's farm. Individuals have all been unresponsive simply because the "Area Boy" is not a family member of successful persons. Such unwillingness reveals the nature of the relationship between the rich and the poor. The people have forgotten that the "Area Boy" is a member of Nigerian society who has rights equivalent to the General's. The Adjunct, then, serves the purpose of reminding the society of the situation of the "Area Boy" when he needs help and nobody is observant of his plight. Then links the past agony of the "Area Boy" to his present status of being a vagabond. The structure below shows the thematic functions of stanza one.

The point of departure of CL1 and CL4 is the same, each having two themes, whereas CL2 and CL3 have one theme each. The second stanza is comparatively parallel to the first stanza because it begins with a flashback into the past, remembering the painful years … spent as a houseboy in Ikoyi. The "Area Boy" is connected to Ojuelegba, while the master lives in Ikoyi. The implication is that Ikoyi is on Lagos Island, while Ojuelegba is on the mainland, representing two different worlds or perspectives. Foreigners and well-meaning Nigerians reside in Ikoyi.
This can indicate that if the "Area Boy" is to have any access at all to Ikoyi, it can only be made possible through rendering services to the master. The poet constructs one domain for the poor and the other for the affluent. The "Area Boy" is neither a foolish person nor a senseless individual. It is that he needs help from the society, and no one appreciates the humble cries of the helpless human being for assistance. Such support, if rendered, could give the individual breakthroughs in order to showcase his talents and skills in resourceful ways.

The mood system in stanza two is similar in structure to stanza one. The study locates the differences in CL7 and CL8, where the Subjects, Oga's callousness and Madam's overbearing attitude/you, attract different Finites. The two Finites, haunts and can no longer, operate in present forms. It is salient to have no longer in the verbal group. It is a negative polarity that compels the boy to continue to flash back to his experience with the wife of his master, that is, the Madam's imperious characteristic. The Participant, The painful years, and the circumstantial element, as a reluctant houseboy, support the remark on Madam's dictatorial capacity. The "Area Boy" understands that he passes through pains in the master's house; nonetheless, because there is no one to help, the indigent resigns himself to fate. According to the narrator, the "Area Boy" discharges his responsibilities sluggishly. The houseboy's experience sensitizes the readers in two forms: the master's characteristic, expressed as Oga's callousness, and the madam's attitude, described as Madam's overbearing attitude. The Oga (i.e. master) is emotionally hardened. That heartlessness has made the boss careless of the sufferings of the concerned, which has turned the boy into a restless individual. Perhaps that has led to the persistent complaints that the writer observes from the "poetic narrative." The other approach is that the woman in the house does not help matters. Madam makes the situation possibly worse. The madam's domineering role overwhelms the "Area Boy" into being forcefully dedicated to his responsibilities despite the initial reluctance. Indirectly, the destitute engages in a sort of forced labor in the house of the rich and, probably, in the presence of the children, Yet nobody said anything then. It is painful that nobody comes to his aid, this being a reason for the lamentation. The structure below illustrates the thematic organization of stanza two.

CL5, CL8, and CL9 portray similar thematic choices of dual-theme operations, whereas CL6 and CL7 organize a single theme each. The third outpouring of a hurting heart shows in a psychological form in the third stanza. The poet calls it The psychological oppression, which is the Phenomenon of the mental process, remember. The first concern positions the "Area Boy" as a laborer in the farm. The second challenge is the nagging of the master and his wife at the houseboy. The experience here plays out as a kind of feeling and not an exercise of personal strength in order to achieve a mission. One can argue that the concern of the "Area Boy", this time, does not have a solid logical foundation. This is because the grievance is not objective.
The circumstantial devices of Of watching their scions spray dollars in parties/Of their limousines splashing water on you … are pains that fall into the terrain of personal feelings, expressed in the form of envy. The sentiments of jealousy classified as oppression can lead to the perpetration of evil. Moreover, the resentment is a thought that has the potency to persuade the "Area Boy" to look for money at all costs. In the same spirit of self-appraisal of sensationalism, the individual raises a complaint Of your wondering where you went wrong. Actually, it is a good thing to be comfortable in life. Nevertheless, developing a spirit of rivalry against one's neighbor is not acceptable in any ramification of social norms. The "Area Boy" has forgotten that fingers are not equal and can never be equal. The throbbing heart needs to transcend the mundane activities that he witnesses and complains of in diverse forms in order to focus on how to survive socially and economically. To fully register the grievance, four circumstantial facilities with the markers of of (three times) and in are employed. These demonstrate the degree of the "Area Boy's" annoyance against the family of his master. The length of the clause supports the claim above. As if that were not enough, the playing of a blame game emanates to project the individual as someone who has sometimes missed opportunities. Probably, that validation influences the speaker to begin to query where the "Area Boy" has gone wrong. In my argument, the "Area Boy" needs to dig deep in terms of his past, his parents, and perhaps personal disobedience to instructions from his guardian. The stubbornness, ignorance, and lackadaisical qualities of the complainant might have caused his present situation of indigence.

The declarative, Yet nobody said anything, motivated with a verbal process, is striking. This is because the content recurs three times, in stanzas one, two, and three – CL4, CL9, and CL11. The implication of the repetitive statement is that it strongly expresses the wish of the "Area Boy." The verbs "to be" and "to have", as well as the auxiliary "can no longer", exhibited in negative polarity, occupy the Finite positions of CL14, CL12, and CL13. The other mood elements are repetitions of the Subjects discussed earlier in stanzas one and two. CL15 does not have a mood at all, but a residue. The "tormented Area Boy", as a laborer, a houseboy, and a psychologically oppressed individual, expects members of the Lagos community to provide him succor from the anguish experienced.

Before anyone casts blame on Lagos society, it is important to point out that Lagos is a very busy city where the concept of individualism dominates virtually all activities (Gustavsson 2008). The blame must first go to the parents and second to the government. If the parents have failed in their responsibilities of properly raising the child, the government is supposed to bear the burden of caring for the citizens, including the less privileged ones. The overwhelming responsibilities of other parents may have prevented them from taking care of the "Area Boy" and others in a similar condition.
So, the plight of the boy is a lesson to all parents; people must give birth only to the children that they can cater for, because nobody will say anything while their untrained children roam. Parents must wonder while their disobedient children wander. The structure below demonstrates the point of departure of the clauses in stanza three.

The organization of the clauses in stanza three reveals a different communicative background when one makes a dialectical comparison with the earlier dissected stanzas one and two. Three organizational structures unfold here. CL10 and CL11 have two themes each. CL12, CL13, and CL14 operate with three themes each, whereas CL15 has neither the thematic system nor the mood system. The poet restricts the function of CL15 to rhematic elements, corresponding to the interpersonal devices of the Predicator, Lashing out at, and the Complement, the system.

After the past that stanzas one, two, and three favor, stanzas four and five operate in present events. The narrator describes the phenomenon using the circumstantial mechanism of Now, which refers to the engagement of time. One observes the emphasis of Now in clauses 12, 13, 14, and 20 as well. The past can be categorized as the premature stage, while the mature stage connotes the present. The past unveils the "Area Boy" as being in servitude to influential people, who violate social standards to take advantage of the boy's weakness and cheat him. The present displays the "Area Boy" as an individual with freedom. Thus, he troubles the society that, in the boy's opinion, has not been kind to the sore-hearted person. Those who renege on ministering to the needy usually pay an astronomical price for their negligence. Perhaps some people may not dream of perpetrating evil; the circumstances surrounding them may incite their behaviors toward social vices. The "Area Boy" reveals that Now that something in you has snapped, one can retaliate, which has a connection with the experience of the past. The author can recall the utilization of the mental process remember on three different occasions in the verses. The lexeme, remember, is a pointer to how the servitude experience bothers the pained fellow and negatively influences his mental capabilities. The experience has made the boy so unpleasant as to pour out his annoyance to the audience. The gathered knowledge stimulates the boy to confess that one can no longer stomach it. It is a frame of mind, expressed in the mental process of can no longer stomach, that is very difficult to erase from one's cognitive storage. It is an indicator that those who are rich in the society must treat the poor fairly well; else, as the time is fast approaching, in no time the less privileged will fight back and, perhaps, become terroristic. The poet takes cognizance of this, commenting that Now that you're fighting back in the street … stinging the molochs it is time to call you names: Tout! … The "Area Boy" understands the trouble that he causes the society. Besides, it is the society that labels him "Area Boy" and other synonymous appellations, such as tout, vagrant, and vandal. The names seem to signify the punishment that the boy inflicts on the society.
In a precise way, "Area Boys" are those irresponsible children, most of them boys (because there are no "Area Girls" in Lagos), who pick pockets, steal, and later turn to armed robbery. Possibly, some socially affected individuals might not be armed robbers but beggars and miscreants, who wander about to beseech people for money in order to sustain their lives under the bridge. It cannot be totally ruled out that one who wanders can turn into a thief, because an aphorism stipulates that an idle hand is the devil's workshop.

There are four processes in stanza four. These are stinging (material), operate (material), tolerate (behavioral), and is (relational). Sting represents a poisonous spirit, while operate refers to the workable mechanism of the social system and structure. Tolerate describes the attitude of the entire actors of society who accept the evil that the powerful perpetrate on the less privileged (see Halliday and Matthiessen 2004, 179–238). At this juncture, the expectation of the narrator is that the society ought to intervene by protesting against the General's inhumanity, Oga's callousness, and Madam's overbearing qualities. Instead of the necessary interpositions, the people seem to add to the traumatic experience of the boy. The poem draws on those behaviors to create a relational process as an attribute, as well as a pointer to the current perspective of the society on the helpless individual, who has perhaps turned into a criminal. CL16 expresses the residue as the Predicator, stinging, and the Complement, the molochs. The Subjects in CL17 and CL18 are relative markers, who, that attract the different Finites, operate and tolerate, in the present tense respectively. CL19's Subject is it, which takes is as its Finite. The explanation below characterizes the thematic choices of stanza four.

The four clauses in stanza four appear in different parameters, except that CL17 and CL19 have a similar pattern of theme/rheme structures. CL16 operates only in the rheme, with an empty set of themes. The stanza experiences a full stretch of thematic configuration in CL18, with three themes at a go.

The poet fraternizes with the "Area Boy" by calling him brother as the point of departure of CL20. The association becomes a necessity because George, the "Area Boy", provides the poet with pieces of information about the environment and personal feelings. The poem, "Area Boy", seems to honor the citizens who are victims of being miscreants; the poet is a probable voice for "Area Boys." Besides, the poet, being an intellectual, subsumes himself into a similar situation in order to sensitize the government to rise to the plight of "Area Boys", who cause headaches for the larger society. In respect of that, the poet perceives the challenge of the notion of the "Area Boy" as a concern that rocks the social boat of Lagos and Nigeria at large. Even if the government should rescue "Area Boys" after they have been socially battered, what about the damage that the misfortune has created in their subconscious souls? The Subjects, your being an Area Boy and Nobody, in CL20 and CL21, take on the Finites, is and will ever, respectively, to reflect the mood system of the interaction. The structure below indicates the themes of the stanza.
CL20 demonstrates a marked multiple structure of themes, while CL21 shows an unmarked thematic organ. The poet concludes with the declarative that Nobody will ever bother to excavate/The fossils of disenchantment/Buried deep down in your soul. Perhaps, irrespective of the aid given to the "Area Boy", the past experience might prevent the agonized destitute from adopting the full status and responsibility of a good citizen. In that case, it becomes imprudent to allow citizens to degenerate socially before the society rescues them from their plights. Such delay could be very costly in relation to the loss of lives and property.

Conclusion

The study shows that SFL is an instrument of DH by allowing scientific facilities to process the structural values of the poem, as illustrated earlier in Figures 8, 9, and 10; Tables 3, 4, and 5; and Figures 11, 12, 13, and 14. The results of the analysis of the poem, "Area Boy", to which SFL is applied, display the tenor of discourse as explicating the experience of the painful heart in declarative statements functioning with Subject and Finite. These semiotic values are highly supported with Adjuncts. The interactions reveal how the society creates bitterness in the soul of the "Area Boy." The organization of the clauses operates on themes, which are sometimes marked multiple themes, as means of expressing markedness. The mode of discourse exhibits meaning in the rhemes, which are in alignment with most processes. The experience that the text shares utilizes material processes of doing and happening (e.g. spent, fighting back, lashing out at, stinging, and operate) in order to explain the situation of the "Area Boy" in the past and in the present. The study also observes the field of discourse oscillating between mental processes (e.g. remember, can no longer forget, and can no longer stomach) and verbal processes (e.g. said). These are indicators of the traumatic experience of the painful soul and his expectations of the society.

Examining the poem from the transitivity systemic approach, the author observes some repetitive devices in the investigation. There are about five of these communicative facilities, which function as processes, circumstances, and full-fledged clauses. These are: At times you still remember (declarative clause); Yet nobody said anything (declarative clause); spent (material process); Now (circumstantial element of time); and Of (circumstantial element of manner). There are two divisions in the poem, that is, the environment of the past and that of the present. The past experience displays the accumulated thoughts of the "Area Boy" as a laborer as well as a houseboy. Apart from that, personal feelings, crowded with sentiments, disturb the victim. The latter can be accepted as the fault of the society. However, one is also compelled to indict the boy and to negate his grievances. The boy does not need to blame others for his shortcomings, failings, and challenges. Instead of groaning, the individual ought to chart a new course of survival in a legally acceptable way. The envy of the master's family berates social norms. Nevertheless, the present, the poet alerts, is a fighting back – a time of retaliation. This represents a period when the concerned individual feels frustrated by the society.
The disappointment might have permitted the "Area Boy" to become a burden to the society, because the government has somehow abdicated its social responsibility of caring for its own, that is, all citizens. The selfishness of the society contributes as well to making a nuisance out of the helpless individual. Instead of assisting the "Area Boy" to have access to the good things of life, the penniless one is taken advantage of in order to slavishly serve the haughty.

From a theoretical perspective, the study suggests that SFL has the potency to provide socio-cultural meaning potential to texts in their literary forms. It also deduces from "Area Boy" that the government and private individuals should endeavor to consider the less privileged, who have equal rights to survive as citizens of the nation. Apart from the corpus that can be achieved, SFL textual interpretations have the capacity to stimulate computer experts to construct simulations of poetic devices that the audience can easily observe from computers. It is the hope of the study that researchers will make use of SFL conceptual frameworks to analyze literature for desired meaning potential.

Furthermore, and in retrospect, the constraints experienced in the manual counting of the systemic constituents of "Area Boy" inspire the following suggestions. To the best of my knowledge, some of the available software assisting in DH (e.g. AntConc) cannot vividly cater for systemic appreciations of texts. On that ground, one could recommend the need for computer experts (or programmers) to produce software that can take care of SFL analyses and positions on lexemic investigations. Such a technological facility must have the potency to identify and compute a corpus of systemic processes, circumstantial devices, continuatives, vocatives, and the like. If a project of this magnitude, involving systemicists and software experts, is conducted, one is seemingly sure that such cross-fertilization of ideas will yield some merits. First, the software will replace the manual counting, as done in this paper, with the automation of the systemic accounting of communicative facilities, whether in a Microsoft Excel sheet, Microsoft Word, or any other computerization concept. Second, it might attract researchers to participate in the development of SFL for the betterment of humanity at large. Third, the software could promote SFL as learner- and user-friendly. Fourth, it can aid the easy generation of the meaning potential of texts to reveal "what a composer of a text means" structurally and contextually. A minimal sketch of what the counting core of such a tool might look like appears below.
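By way of illustration only, the counting step of the proposed tool could start from clause-level systemic annotations and tabulate semiotic-slot frequencies mechanically, in the manner of Tables 3, 4, and 5. The annotation format, labels, and function below are invented for this sketch; no existing SFL software is implied.

```python
# Hypothetical sketch of automated systemic accounting: given one
# list of semiotic-slot labels per clause, tally the totals that
# Tables 3-5 compute by hand. The annotations here are invented.
from collections import Counter

annotated_clauses = [
    ["A", "S", "F", "P", "C", "A"],  # e.g. a clause like CL1
    ["S", "F", "P", "C", "A", "A"],  # e.g. a clause like CL2
]

def tabulate(clauses):
    """Return per-slot totals across all annotated clauses."""
    totals = Counter()
    for clause in clauses:
        totals.update(clause)
    return totals

print(tabulate(annotated_clauses))
# Counter({'A': 4, 'S': 2, 'F': 2, 'P': 2, 'C': 2})
```

The totals could then be exported to a spreadsheet or charted directly, which would remove the manual counting stage described above.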
Adesanmi, Pius. 2010. “Area Boy.” In Lagos of the Poets, edited by Odia Ofeimun, 308. Lagos: Hornbill House.
Anthony, Laurence. 2018. AntConc: A Freeware Corpus Analysis Toolkit for Concordancing and Text Analysis. Accessed June 13, 2018. http://www.laurenceanthony.net/software/antconc/.
Bailey, Richard W. 1985. “Negotiating Meaning: Revisiting the Context of Situation.” In Systemic Perspectives on Discourse 2, edited by James D. Benson and William S. Greaves, XVI: 1–16. Norwood, New Jersey: Ablex Publishing Corporation.
Bartlett, Tom. 2013. “I’ll Manage the Context: Context, Environment and the Potential for Institutional Change.” In Systemic Functional Linguistics: Exploring Choice, edited by Lise Fontaine, Tom Bartlett, and Gerard O’Grady, 342–364. Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781139583077.021
Binotti, Lucia, and Carmen Urioste-Azcorra. 2017. “Digital Humanities and the Common Good. The Case of Entiéndelo.” Revista de Humanidades Digitales 1: 207–222. Accessed May 21, 2018. http://revistas.uned.es/index.php/RHD.
Bloor, Thomas, and Meriel Bloor. 2004. The Functional Analysis of English. Great Britain: Hodder. DOI: https://doi.org/10.4324/9780203774854
Bloor, Thomas, and Meriel Bloor. 2013. The Functional Analysis of English. Abingdon, Oxon: Routledge.
Bogna, Alice. 2017. “From Ancient Texts to Maps (and Back Again) in the Digital World. The Digiliblt Project.” Revista de Humanidades Digitales 1: 297–313. Accessed May 21, 2018. http://revistas.uned.es/index.php/RHD. DOI: https://doi.org/10.5944/rhd.vol.1.2017.16784
Bradley, John. 2004. “Text Tools.” In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 487–503. Oxford: Blackwell. DOI: https://doi.org/10.1002/9780470999875.ch33
Burdick, Anne, Johanna Drucker, Peter Lunenfeld, Todd Presner, and Jeffrey Schnapp. 2012. Digital_Humanities. USA: Massachusetts Institute of Technology Press.
Burke, Kenneth. 1969. A Grammar of Motives. California: University of California Press.
Burrows, John. 2004. “Textual Analysis.” In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 316–342. Oxford: Blackwell. DOI: https://doi.org/10.1002/9780470999875.ch23
Burton, Dolores M. 1981a. “Automated Concordances and Word Indexes: The Fifties.” Computers and the Humanities 15: 1–14. DOI: https://doi.org/10.1007/BF02404370
Burton, Dolores M. 1981b. “Automated Concordances and Word Indexes: The Early Sixties and the Early Centers.” Computers and the Humanities 15: 83–100. DOI: https://doi.org/10.1007/BF02404202
Busa, Roberto A. 2004. “Foreword: Perspectives on the Digital Humanities.” In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, xvi–xxi. Oxford: Blackwell.
Butler, Christopher S. 1985. Systemic Linguistics: Theory and Applications. London: Batsford Academic and Educational.
Butler-Kisber, Lynn. (ed.) 2013. Teaching and Learning in the Digital World: Possibilities and Challenges 6(2): 1–423. Accessed January 12, 2016. https://www.learninglandscapes.ca/index.php/learnland/issue/view/Teaching-and-Learning-in-the-Digital-World-Possibilities-and-Challenges.
Castro, Rojas A. 2017. “Big Data in the Digital Humanities. New Conversations in the Global Academic Context.” Humanities Commons, Digital Culture Annual Report, 62–71. Accessed April 23, 2018. https://hcommons.org/deposits/item/hc:11759/.
Ciula, Arianna. 2017. “Digital Palaeography: What Is Digital about It?” Digital Scholarship in the Humanities 32(suppl. 2): ii89–ii105. Accessed December 17, 2018. DOI: https://doi.org/10.1093/llc/fqx042
Craig, Hugh. 2004. “Stylistic Analysis and Authorship Studies.” In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 271–285. Oxford: Blackwell. DOI: https://doi.org/10.1002/9780470999875.ch20
Crane, Greg. 2004. “Classics and the Computer: An End of the History.” In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 46–55. Oxford: Blackwell. DOI: https://doi.org/10.1002/9780470999875.ch4
Creative Block Inc. 2017. IBM’s New Tagline: Think 3.0. Accessed January 27, 2018. http://blog.thecreativeblock.marketing/ibms-new-tagline-think-3.0.
Dalamu, Taofeek O. 2017a. “Institution’s Title and Shibboleth: A Construction of Grammatical Relationship in Advertising Plates.” Journal of Language and Linguistic Studies 13(1): 260–282.
Dalamu, Taofeek O. 2017b. “Systemic Functional Theory: A Pickax of Textual Investigation.” International Journal of Applied Linguistics and English Literature 6(3): 187–198. DOI: https://doi.org/10.7575/aiac.ijalel.v.6n.3p.187
Dalamu, Taofeek O. 2017c. “A Preliminary Exposé of Systemic Functional Theory Fundamentals.” Ethical Lingua 4(2): 98–108. DOI: https://doi.org/10.30605/ethicallingua.v4i2.414
Dalamu, Taofeek O. 2017d. “Nigerian Children Specimens as Resonance of Print Media Advertising: What for?” Communicatio 11(2): 79–111.
Dalamu, Taofeek O. 2017e. “Narrative in Advertising: Persuading the Nigerian Audience within the Schemata of Storyline.” Anu. Filol. Lleng. Lit. Mod. 7: 19–45.
Dalamu, Taofeek O. 2017f. “Periodicity: Interpreting Waves of Information in Osundare’s Harvestcall.” Buckingham Journal of Language and Linguistics 10: 42–70.
Dalamu, Taofeek O. 2017g. “Maternal Ideology in an MTN® Advertisement: Analysing Socio-Semiotic Reality as a Campaign for Peace.” Journal of Language and Education 3(4): 16–26. DOI: https://doi.org/10.17323/2411-7390-2017-3-4-16-26
Dalamu, Taofeek O. 2017h. “Yuletide Ideology as Advertising Ideology: An Historical Illumination from Saint Nicholas to the Present Day.” Facta Universitatis, Series: Linguistics and Literature 15(2): 143–161.
Dalamu, Taofeek O. 2017i. “A Discourse Analysis of Language Choice in MTN® and Etisalat® Advertisements in Nigeria.” PhD Thesis. Yaba, Lagos: University of Lagos, School of Postgraduate Studies.
Dalamu, Taofeek. 2018. “Exploring Advertising Text in Nigeria within the Framework of Cohesive Influence.” Styles of Communication 10(1): 75–97.
De Beaugrande, Robert. 1991. Linguistic Theory: The Discourse of Fundamental Works. London and New York: Longman.
Digital Humanities Manifesto. n.d. A Manifesto on Manifestos. Accessed March 14, 2017. http://manifesto.humanities.ucla.edu/2009/05/29/the-digital-humanities-manifesto-20/.
Drucker, Johanna. 2016. “Graphical Approaches to the Digital Humanities.” In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 238–250. West Sussex, UK: Wiley Blackwell.
Edmond, Jennifer. 2016. “Collaboration and Infrastructure.” In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 54–66. West Sussex, UK: Wiley Blackwell.
Eggins, Suzanne. 2004. Introduction to Systemic Functional Linguistics. London: Continuum.
Ellis, Jeffrey. 1987. “The Logic and Textual Functions.” In New Developments in Systemic Linguistics: Theory and Description, edited by Michael A. K. Halliday and Rupert P. Fawcett, 1: 107–129. London: Frances Pinter.
Erlin, Matt. 2016. “Digital Humanities Masterplots.” Digital Literary Studies 1(1): 1–11.
Ess, Charles. 2004. “Revolution? What Revolution? Successes and Limits of Computing Technologies in Philosophy and Religion.” In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 132–144. Oxford: Blackwell. DOI: https://doi.org/10.1002/9780470999875.ch12
Firth, John R. 1957. Papers in Linguistics, 1934–1951. London: Oxford University Press.
Fontaine, Lise. 2013. Analyzing English Grammar: A Systemic Functional Introduction. Cambridge: Cambridge University Press.
Gustavsson, Gina. 2008. “What Individualism Is and Is Not.” Workshop paper presented at the NOPSA Conference 2008, Tromsö, 1–25. http://www.diva-portal.org/smash/get/diva2:54576/FULLTEXT01.pdf.
Halliday, Michael A. K. 1973. Explorations in the Functions of Language. London: Edward Arnold.
Halliday, Michael A. K. 1985. “Systemic Background.” In Systemic Perspectives on Discourse XV, edited by James Benson and Williams Greaves, 1–15. Norwood, New Jersey: Ablex Publishing Corporation.
Halliday, Michael A. K. 1994. An Introduction to Functional Grammar. Great Britain: Arnold.
Halliday, Michael A. K. 2013. “Meaning as Choice.” In Systemic Functional Linguistics: Exploring Choice, edited by Lise Fontaine, Tom Bartlett, and Gerard O’Grady, 15–36. Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781139583077.003
Halliday, Michael A. K., and Christian M. I. M. Matthiessen. 2004. An Introduction to Functional Grammar. Great Britain: Hodder Arnold.
Halliday, Michael A. K., and Christian M. I. M. Matthiessen. 2014. Halliday’s Introduction to Functional Grammar. Abingdon, Oxon: Routledge.
Halliday, Michael A. K., and Ruqaiya Hasan. 1985. Language, Context, and Text: Aspects of Language in a Socio-Semiotic Perspective. Geelong: Deakin University Press.
Hockey, Susan. 2004. “The History of Humanities Computing.” In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 3–19. Oxford: Blackwell. DOI: https://doi.org/10.1002/9780470999875.ch1
Hodge, Robert, and Gunther Kress. 1988. Social Semiotics. Cambridge: Polity.
IBM. 2017. Enter the Cognitive Era. Accessed June 17, 2018. https://www.ibm.com/us-en/.
Innis, Robert E. 1987. “Entry for Bühler, Karl.” In Thinkers of the Twentieth Century, edited by Roland Turner. London: Saint James Press.
Jockers, Matthew L., and Ted Underwood. 2016. “[Text] Mining the Humanities.” In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 291–306. West Sussex, UK: Wiley Blackwell.
Jørgensen, Finn Arne. 2016. “Summary: The Internet of Things.” In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 42–53. West Sussex, UK: Wiley Blackwell.
Kirschenbaum, Matthew G. 2010. “What Is Digital Humanities and What’s It Doing in English Departments?” ADE Bulletin 150: 55–61. DOI: https://doi.org/10.1632/ade.150.55
Kreniske, Philip, and Jesse Kipp. 2014. “How the San of Southern Africa Used Digital Media as Educational and Political Tools.” The Journal of Interactive Technology and Pedagogy. Accessed February 15, 2016. https://jitp.commons.gc.cuny.edu/how-the-san-of-southern-africa-used-digital-media-as-educational-and-political-tools/.
Kress, Gunther. 2010. Multimodality: A Social Semiotic Approach to Contemporary Communication. New York: Routledge.
Kress, Gunther, and Theo van Leeuwen. 2003. Reading Images: The Grammar of Visual Design. London and New York: Routledge.
Lee, James, Blaine Greteman, Jason Lee, and David Eichmann. 2018. Linked Reading: Digital Historicism and Early Modern Discourses of Race around Shakespeare’s Othello. Accessed November 23, 2018. https://osf.io/preprints/socarxiv/tg23u/.
Malinowski, Bronislaw. 1935. Coral Gardens and Their Magic, 2: The Language of Magic and Gardening. New York: American Book Company.
Malmkjaer, Kirsten. (ed.) 2004. The Linguistics Encyclopedia. London: Routledge. DOI: https://doi.org/10.4324/9780203644645
Martin, James R. 1992. English Text: System and Structure. Philadelphia: John Benjamins. DOI: https://doi.org/10.1075/z.59
Matthiessen, Christian. 1993. “Register in the Round: Diversity in a Unified Theory of Register Analysis.” In Register Analysis: Theory and Practice, edited by Mohsen Ghadessy, 221–392. London and New York: Pinter Publishers.
McGregor, William B. 1992. “The Place of Circumstantials in Systemic-Functional Grammar.” In Advances in Systemic Linguistics: Recent Theory and Practice, edited by Martin Davies and Louise Ravelli, 136–145. London: Pinter Publishers.
McGregor, William B. 1997. Semiotic Grammar. London and New York: Oxford University Press.
Montfort, Nick. 2016. “Exploratory Programming in Digital Humanities Pedagogy and Research.” In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 98–107. West Sussex, UK: Wiley Blackwell.
Muzny, Grace, Mark Algee-Hewitt, and Dan Jurafsky. 2017. “Dialogism in the Novel: A Computational Model of the Dialogic Nature of Narration and Quotations.” Digital Scholarship in the Humanities 32(suppl. 2): ii31–ii52. DOI: https://doi.org/10.1093/llc/fqx031
O’Donnell, Daniel P., Katherine L. Walter, Alex Gil, and Neil Fraistat. 2016. “Only Connect: The Globalization of the Digital Humanities.” In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 493–510. West Sussex, UK: Wiley Blackwell.
Olivares, Beatriz E. Q. 2013. “The Interpersonal and Experiential Grammar of Chilean Spanish: Towards a Principled Systemic-Functional Description Based on Axial Argumentation.” PhD thesis. Accessed February 17, 2018. www.isfla.org/Systemics/Print/Theses/BQuiroz_2013.pdf.
O’Toole, Michael. 1994. The Language of Displayed Art. London: Pinter.
Peirson, Erick, Julia Damerow, and Manfred Laubichler. 2016. “Software Development & Trans-Disciplinary Training at the Interface of Digital Humanities and Computer Science.” Digital Studies/Le champ numérique, 1–15. Accessed June 22, 2018. https://www.digitalstudies.org/articles/10.16995/dscn.17/.
Quirk, Randolph, and Sidney Greenbaum. 1973. University Grammar of English. Essex, England: Longman.
Rashidi, Linda S. 1992. “Towards an Understanding of the Notion of Theme: An Example from Dari.” In Advances in Systemic Linguistics: Recent Theory and Practice, edited by Martin Davies and Louise Ravelli, 189–204. London: Pinter Publishers.
Ravelli, Louise. 2000. “Getting Started with Functional Analysis of Texts.” In Researching Language in Schools and Communities, edited by Len Unsworth, 27–63. London and Washington: Cassell.
Riguet, Marine, and Suzanne Mpouli. 2017. “At the Crossroads Between the Scientific and the Literary Discourse: Comparison as a Figure of Dialogism.” Digital Scholarship in the Humanities 32(suppl. 2): ii60–ii77. Accessed October 15, 2018. DOI: https://doi.org/10.1093/llc/fqx026
Robinson, Amy, and Jon Saklofske. 2017. “Connecting the Dots: Integrating Modular Networks and Narrativity in Digital Scholarship.” Digital Studies/Le champ numérique 9. Accessed October 15, 2018. https://www.digitalstudies.org/articles/10.16995/dscn.266/. DOI: https://doi.org/10.16995/dscn.266
Rodilla, Patricia M., and César Gonzalez-Perez. 2017. “A Modelling Language for Discourse Analysis in Humanities: Definition, Design, Validation and First Experiences.” Revista de Humanidades Digitales 1: 368–378. Accessed March 16, 2018. DOI: https://doi.org/10.5944/rhd.vol.1.2017.16133
Schreibman, Susan, Ray Siemens, and John Unsworth. (eds.) 2004. A Companion to Digital Humanities. Oxford: Blackwell.
Svensson, Patrik. 1998. Number and Countability in English Nouns: An Embodied Model. Uppsala: Swedish Science Press.
Svensson, Patrik. 2010. “The Landscape of Digital Humanities.” Digital Humanities Quarterly 4(1): 1–35. Accessed March 19, 2017. http://digitalhumanities.org/dhq/vol/4/1/000080/000080.html.
Thompson, Geoff. 2004. Introducing Functional Grammar. Great Britain: Hodder Arnold.
Thompson, Geoff. 2014. Introducing Functional Grammar. Abingdon, Oxon: Routledge. DOI: https://doi.org/10.4324/9780203785270
Unsworth, John. 2002. “What Is Humanities Computing and What Is Not?” Graduate School of Library and Information Sciences, Illinois Informatics Institute, University of Illinois, Urbana.
Unsworth, John. 2010. “Message to the Author.” E-mail to Matthew Kirschenbaum.
Warwick, Claire. 2016. “Building Theories or Theories of Building? A Tension at the Heart of Digital Humanities.” In A New Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 538–552. West Sussex, UK: Wiley Blackwell.
Wodak, Ruth, and Michael Meyer. (eds.) 2001. Methods of Critical Discourse Analysis. London: SAGE. DOI: https://doi.org/10.4135/9780857028020
Yule, George. 1985. The Study of Language. Cambridge: Cambridge University Press.

How to cite this article: Dalamu, Taofeek. 2019. “Illuminating Systemic Functional Grammatics (Theory) as a Viable Tool of Digital Humanities.” Digital Studies/Le champ numérique 9(1): 8, pp. 1–50.
DOI: https://doi.org/10.16995/dscn.287
Submitted: 04 November 2017 Accepted: 18 October 2018 Published: 23 April 2019
Copyright: © 2019 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.
Digital Studies/Le champ numérique is a peer-reviewed open access journal published by Open Library of Humanities.

work_ffhzfogh2vc6bpes4zeewwyv4m ----

informatics Article

Conceptualization and Non-Relational Implementation of Ontological and Epistemic Vagueness of Information in Digital Humanities †

Patricia Martin-Rodilla 1,* and Cesar Gonzalez-Perez 2
1 CiTIUS, University of Santiago de Compostela, Jenaro de la Fuente Domínguez, s/n, 15782 Santiago de Compostela, Spain
2 Institute of Heritage Sciences (Incipit), Spanish National Research Council (CSIC), Avda. Vigo, s/n, 15705 Santiago de Compostela, Spain; cesar.gonzalez-perez@incipit.csic.es
* Correspondence: patricia.martin.rodilla@usc.es
† This paper is an extended version of our paper published in TEEM’18, Salamanca, Spain, 24–26 October 2018.

Received: 22 March 2019; Accepted: 30 April 2019; Published: 6 May 2019

Abstract: Research in the digital humanities often involves vague information, either because our objects of study lack clearly defined boundaries, or because our knowledge about them is incomplete or hypothetical, which is especially true in disciplines about our past (such as history, archaeology, and classical studies). Most techniques used to represent data vagueness emerged from natural sciences, and lack the expressiveness that would be ideal for humanistic contexts. Building on previous work, we present here a conceptual framework based on the ConML modelling language for the expression of information vagueness in digital humanities. In addition, we propose an implementation on non-relational data stores, which are becoming popular within the digital humanities. Having clear implementation guidelines allows us to employ search engines or big data systems (commonly implemented using non-relational approaches) to handle the vague aspects of information. The proposed implementation guidelines have been validated in practice, and show how we can query a vagueness-aware system without a large penalty in analytical and processing power.

Keywords: vagueness; non-relational databases; conceptual modelling; imprecision; uncertainty; knowledge representation; digital humanities; ConML

1. Introduction

We generate knowledge from raw data through different mechanisms, such as observation, perception, theorization, and deduction [1], thus producing information models that constitute the starting point of any knowledge generation process. These information models have a significant impact on the quality and type of knowledge that we are able to generate. When working in the humanities, we also create information models that reflect not only the data that we have but also the possible hypotheses from them in order to fill the knowledge gap. This model-building process is especially relevant when working with information about our past, in which this gap is usually larger. For these reasons, several authors have recently pointed out how relevant models are in the humanities, and identified improvement and evaluation research needs [2,3].
Thus, conceptual modelling techniques have emerged as a theoretically valid and practical way to represent humanistic knowledge. Conceptual models have been successfully used in humanities projects such as Europeana [4], ARIADNE [5], and DARIAH [6].

Conceptual models describe the world in terms of concepts, their properties, and the relationships amongst them. The main advantage of conceptual modelling, as opposed to other approaches, is its focus on the knowledge-level representation of the domain of discourse, which allows us to obtain simplified and manageable proxies of a relevant scope [3]. Conceptual modelling has been mostly developed under the umbrella of software engineering, and due to this disciplinary heritage, current conceptual modelling techniques lack the necessary mechanisms to represent different subjective opinions or hypotheses [3], and to address the ontological or epistemic vagueness that is often part of the world being studied [3]. This is unfortunate, because vagueness plays a crucial role in humanistic models. This is so, firstly, because humanistic studies often deal with our past, which is often described through incomplete and partially unknown information sources and/or fragmented data, and, secondly, because many research practices in the humanities imply a significant degree of vagueness due to their ethnographic and narrative methodologies.

Developing conceptual models that are capable of managing vagueness is difficult, mainly because modelling involves making decisions about the nature, degree, and characteristics of the reality modelled. This difficulty only increases when we try to implement these models as software systems to organize, query, annotate, or search data and assist in the generation of new knowledge. The technologies that we usually employ to do this, either relational or non-relational, are significantly unaware of information vagueness, which only compounds the problem. In this context, the ConML conceptual modelling language [7] was developed as a simple and affordable tool that can be used by specialists in the humanities without much experience in information technologies, and with special attention towards the implementation of conceptual models as computer artefacts and databases.

In this paper, we present the modelling mechanisms in ConML that explicitly address the representation of vagueness in the humanities. Then, we elaborate by proposing some implementation mechanisms that we can use to carry this improvement over to computer systems, and in particular non-relational store systems. We also provide a complete validation using a real-world humanities project. The paper is organized as follows: the rest of this section presents a review of existing modelling approaches to vagueness, describing what problems have been found in relation to humanistic information. Section 2 presents the proposed conceptual framework. Section 3 illustrates the proposed approach through its application to a real project in digital humanities, which includes an implementation on a non-relational environment and some examples of data queries involving vagueness resolution.
Section 4 discusses the results obtained. Section 5 critically analyses the work and its future possibilities.

1.1. Uncertain Information in Humanities Fields

Data and information modelling applied to the humanities is a sub-discipline that has experienced decades of development, due to the need to create models representing humanities data in daily research practices. This need increases exponentially with the recognition of digital humanities as a discipline, and the use of information software systems for storing, indexing, searching, and reasoning about humanistic data. Within this context, there is a large number of works on modelling information in humanities fields [8–10], organized into two underlying categories. On the one hand, humanistic information modelling studies are derived from curation and archives studies, whose practitioners have considerable experience in storing and processing information. These studies have been joined by so-called Linked Open Data approaches [11], which advocate information models that are subsequently shared on the web, converting it into a common database. In all these approaches, the underlying conceptual models usually have a first layer based on an entity–relationship model [12] or similar models, and later add layers for the interconnection of models using technologies such as RDF [13]. Common implementation solutions described here are XML technologies [14], which record how the information was obtained or who obtained it (the so-called metadata), or useful annotations for further study of the information contained in the models through information encoding paradigms such as TEI [15]. These conceptual and technological ecosystems for information modelling in the humanities are very common as a basis for important documentation projects in the field, such as DARIAH [6] or Pelagios [16,17].

Regarding the support for expressing uncertain and imprecise information, neither the TEI specification nor existing Linked Open Data metamodels explicitly support vagueness (ontological or epistemic). This lack incapacitates these ecosystems regarding the true generation of knowledge in their application domains [18]. In practice, users who need to build software systems based on these models have identified problems with vagueness representation, creating some ad hoc implementations using XML technologies and TEI mechanisms for the representation of vagueness in the metadata part. For instance, some TEI annotation resources have been used (like the TEI Note tag) for representing the certainty degree of some data (adding a possible uncertainty value to the tag) [14], or XML tags have been used to represent probabilistic aspects [19]. However, these solutions only solve the problem laterally, not modelling the uncertainty as something intrinsic and transversal to the whole model, and forcing users to employ modelling mechanisms, such as annotation tags, which are not specifically designed for this purpose. Consequently, software searching and indexing systems are unaware of these “custom” uses and will not be able to index and search while taking vagueness properties into account. On the other hand, we can find approaches more aligned with the theoretical framework presented in this paper: not those that use metadata approaches, but those that model the entities and characteristics of the information itself.
One of the most well-known works here for digital humanities is CIDOC-CRM (the conceptual reference model promoted by the International Council of Museums) [20], an ISO standard generally applied to the cultural sector that has traditionally been used in archaeological and museum environments, although it has extensions for other humanities uses. The need for modelling aspects of uncertain information has been identified as intrinsic to archaeological practice [21,22] and has also been detected in conceptual analyses carried out on CIDOC-CRM [22,23], although CIDOC-CRM does not support it in its specification [20]. Recently, some authors have started working on an extension of CIDOC CRM to support uncertainty [24], although only covering the uncertainty introduced when different users present different points of view or discourses about the information. This approach mixes subjective modelling approaches, and only models some epistemic vagueness scenarios. In addition, we can find other specifications using a thesaurus, the ad hoc creation of ontologies and folksonomies [25], and similar approaches for covering digital humanities’ needs in terms of vagueness modelling, but again without any explicit support at a metamodel level. All these works, and recent international initiatives such as PROgressive VIsual DEcision-Making in Digital Humanities (PROVIDEDH) [26], reveal the need to represent vagueness semantics in humanities models as part of the intrinsic specification of the modelling mechanisms, avoiding ad hoc solutions. Both large groups of modelling approaches in the digital humanities discussed previously are lacking in this respect. Finally, there are some initiatives for using well-known software engineering modelling technologies to apply uncertainty modelling patterns to the humanities, but these are still a work in progress. For instance, we can find some isolated examples of using UML [27] to represent information in the humanities, identifying but not addressing the vagueness topic [28]; UML approaches, independent of the application domain, are discussed in the next section. In summary, although vagueness modelling for humanities information is a need that has been detected for decades in many of these works, the existing techniques do not incorporate mechanisms for this within their specifications and are limited to its ad hoc treatment in special cases.

1.2. Existing Approaches Outside Humanities

Modelling aspects of information vagueness represents a field of interest for numerous fields and projects outside humanities disciplines, with different approaches. To facilitate the process of reviewing these approaches for our purposes, we have divided them into three large groups: statistical approaches, strongly mathematical approaches, and software engineering approaches, although some of the reviewed works can be considered hybrids. All these approaches model vagueness explicitly, and some of them have developed techniques and tools that allow for the explicit treatment of both types of vagueness, which makes them a starting point from which to analyse their possible application to humanities fields. First of all, statistics is a particularly relevant discipline in vagueness modelling. Both for ontological and epistemic vagueness, we can find statistical approaches that generally associate probability functions with especially vague attributes of the information that we are modelling.
The probability functions could be indicators of the precision (as used in inferential statistics) or of the degree of certainty of the values of the attributes (i.e., error measurements for a given value). These solutions, while explicitly modelling both types of vagueness, treat vagueness as a margin-of-error function, contradicting our premise of treating uncertain information in the humanities as an intrinsic characteristic (one that enriches the information) and not as something to mitigate. Thus, we can draw on these approaches as inspiration to explicitly model aspects of vagueness, but without giving them error semantics.

Regarding strongly mathematical approaches, some start from paradigms similar to the previous ones (based on margins of error), such as interval predictor models [29], which estimate regions of uncertainty of the contained information. A less error-focused approach corresponds to the fuzzy logic subdiscipline [30,31], which has developed specific techniques (e.g., fuzzy sets and probability degrees, rule bases, linguistic summaries as fuzzy descriptions of variables or fuzzy quantifiers, and similarity measures) [31–34] for the modelling of vague aspects of information. All these techniques contemplate the richness that both types of vagueness bring to information models and their software applications [32].

Finally, approaches from software engineering maintain the differentiation between imprecision and uncertainty that we have detailed in our theoretical framework. In the case of ontological vagueness (imprecision), they try to expressly model the probability and possibility of the existence of entities in the data and information models. In the case of epistemic vagueness (uncertainty), they try to identify modellable characteristics such as set membership, interval membership, incompleteness, and other vague aspects. These works are still in progress (the OMG standardization group for vagueness in UML is still working, and its first ideas date from 2016) [35,36], although some UML modelling solutions based on stereotypes [37] can already be found. In any case, UML does not currently include support for modelling vagueness in its official specification [27].

The three groups of approaches have been applied to represent information and implement software systems in several application domains (genetics and medicine [38], e-government and infrastructures [39], energy resources, etc.), being less common in models for representing humanistic information. Their treatment of vagueness, closely linked to the concept of error, and their large mathematical base make their direct application to humanities fields difficult, where the definition of a probability function or the assumption of an a priori distribution of the data is complex. With the idea of providing a solution for the explicit modelling of uncertain information in the humanities that is (1) far from this notion of error and (2) simple and intuitive for humanities researchers [40], the ConML modelling language has incorporated specific modelling mechanisms for both types of vagueness. The following sections explain in detail the conceptual framework and the mechanisms proposed.

In order to define, characterize, and implement vagueness mechanisms as part of any conceptual model and the subsequent software systems based on it, it is necessary to make some decisions about the specific treatment of vagueness we adopt and what modelling language is adequate for expressing the models.
The following sections introduce both of them.

1.3. Theoretical Framework

Many terms have been used in the literature to refer to the fact that data, or information, is not clear or perfectly defined: imprecision, vagueness, uncertainty, imperfection, etc. A complete conceptual characterisation of what is meant by these terms is rarely provided, so confusion ensues. To avoid this, we provide here a small theoretical framework that hopefully will clarify things and establish the basis for further developments such as the solution proposed in Section 3. To start with, we acknowledge that many aspects of the world are unclear, imprecise, or not well defined, and when we try to represent them in a model, we are often confronted with the need to either remove or explicitly manage this vagueness. Vagueness comes in two forms:

• Ontological vagueness, or imprecision, which refers to things in the world that are not clear-cut, such as the boundaries of a hill;
• Epistemic vagueness, or uncertainty, which refers to situations where our knowledge about something is unclear or incomplete.

We say that imprecision is ontological because it is an inherent property of some things in the world. For example, a hill is an entity that any of us can conceptualise and reason about, but it lacks clear-cut boundaries, so that it is impossible to determine a line marking the hill’s boundary. This fact is independent of the knowledge that we may or may not have about the hill. Contrarily, we say that uncertainty is epistemic because it relates to how much we know about something. For example, I may know the name of this particular hill, or I may ignore it, or I may be roughly certain but not sure about it. This is a subjective phenomenon and definitely not inherent to the hill. Vagueness, in turn, jointly refers to imprecision and uncertainty. A deeper and complete treatment of vagueness as a knowledge representation concern, including imprecision and uncertainty, can be found in [3] (Chapter 14).

Imprecision, being inherent to the things of the world and independent of our knowledge, depends on what properties we look at. Some properties, such as the names of people or cities, or the height of buildings or people, are not imprecise, as they are clearly established for any particular entity we may consider. For example, I have a clear name and height, regardless of whether you know them or not. This means that a modelling approach that aims to support the expression of imprecision must provide a mechanism to identify which properties or things being represented are subject to this kind of vagueness. On the contrary, anything may be subject to uncertainty, because uncertainty depends on our knowledge about something, regardless of what that something is. As anyone may possibly be more or less knowledgeable about anything, every property of everything is, in principle, equally subject to uncertainty.

Finally, it is worth mentioning the concept of accuracy. Whereas precision refers to how much detail an expression contains (such as 15.25 being more precise than 15.2), accuracy refers to how well an expression represents something; e.g., if I have 15.25 euros in my pocket, the expression 15 is imprecise but quite accurate, whereas the expression 37.123 is much more precise but far less accurate. Note that precision is a property of expressions alone, regardless of how well they represent anything; contrarily, accuracy is a property of the representational power of an expression.
In this regard, accuracy is a useful tool to fight uncertainty. For example, imagine that we are required to express the distance between two places in kilometers. If we believe the distance is around 650 km but are unsure of it, we can refrain from attempting to be accurate in order to gain certainty by saying that the distance is between 500 and 900 km. This is certainly not very accurate, but we are probably right, as the actual distance falls inside the given interval.

1.4. ConML

ConML is a conceptual modelling language designed for the humanities and social sciences. Using ConML, we can represent the entities in the world as well as their characteristics and the connections among them. We can also represent the relevant categories that we employ to classify these entities, together with the relationships between them. ConML is based on the object-oriented paradigm, as are many other popular modelling languages such as UML [27], but it is much simpler, so that non-experts in software systems can learn it and use it in under 30 h [18,41].

At the category (type) level, the basic constructs of ConML are class, which represents a category in the world, and feature, which represents a characteristic of a category. There are two kinds of features: attributes, which correspond to atomic characteristics expressed through simple values (such as someone’s age or the name of a place), and semi-associations, which correspond to complex characteristics expressed through references to other things, such as a house’s owner (which is a person) or a person’s birth place (which is a town). In addition, inverse pairs of semi-associations are combined into associations; in this regard, each semi-association of an association corresponds to the association as seen from the point of view of each of the participant classes. We can thus say that, in ConML, classes have features, which can be either attributes or semi-associations, and classes are related to each other through associations, each of which is composed of a pair of inverse semi-associations. For example, we may have a ConML model representing the fact that buildings have an address and a height, and are located in cities, which have a name. Here, Building and City are two classes. The Building class has two attributes, Address and Height, whereas the City class has one attribute, Name. Furthermore, Building and City are related by the association Is Located In.

Attributes in ConML have a data type, which specifies what kind of data may be stored by their instances. Only five simple data types exist in ConML: Boolean, number, time, text, and data. In addition, ConML supports enumerated data types. An enumerated type consists of a list of pre-defined named items, and a value of this type can only hold an existing item. For example, a model may define a Styles enumerated type containing the items Romanesque, Gothic, and Neoclassical. An attribute such as BuildingStyle, defined as having type Styles, could only take one of these items as a value. Interestingly, the items in an enumerated type do not need to be arranged as a linear list but can be hierarchically organized to represent subsumption or aggregation, so that every item may have a “parent” or super-item and may have a number of “child” sub-items. For example, we could add Decorated Gothic and Flamboyant Gothic under Gothic in the Styles enumerated type to reflect the fact that there are two subkinds of the gothic style.
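To ground these type-level constructs, a minimal sketch follows, assuming a plain Python encoding that is not part of the ConML specification; all class, attribute, and item names come from the examples above.

```python
# Minimal sketch (hypothetical encoding, not part of ConML): the Building/City
# example and the hierarchical Styles enumerated type as Python data structures.

from dataclasses import dataclass
from typing import Optional

@dataclass
class EnumeratedItem:
    name: str
    parent: Optional["EnumeratedItem"] = None  # super-item, if any

# Styles enumerated type; Decorated/Flamboyant Gothic are sub-items of Gothic.
romanesque = EnumeratedItem("Romanesque")
gothic = EnumeratedItem("Gothic")
neoclassical = EnumeratedItem("Neoclassical")
decorated_gothic = EnumeratedItem("Decorated Gothic", parent=gothic)
flamboyant_gothic = EnumeratedItem("Flamboyant Gothic", parent=gothic)

@dataclass
class City:
    name: str                # attribute of simple type text

@dataclass
class Building:
    address: str             # attribute of simple type text
    height: float            # attribute of simple type number (metres)
    style: EnumeratedItem    # attribute of enumerated type Styles
    located_in: City         # semi-association towards City ("Is Located In")
```

The inverse semi-association (from City back to its buildings) is omitted here for brevity.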
At the entity (instance) level, the basic constructs of ConML are object, which represents a specific entity in the world as an instance of a class; value, which represents a characteristic of an entity as an instance of an attribute; and link, which represents a connection between two entities as an instance of an association. We can say that, in ConML, objects have values and are connected to each other by links. For example, we may have a ConML model representing the fact that the cathedral in Santiago de Compostela is 32 m high. Here, Cathedral and Santiago de Compostela refer to objects instance of Building and City, respectively; 32 m is a value instance of Height, and “in” refers to a link between these two objects. A comprehensive description of ConML is outside the scope of this article but can be found in [3,7].

2. Materials and Methods

This section presents the ConML mechanisms proposed for expressing vagueness as part of digital humanities conceptual models.

2.1. Expressing Imprecision and Uncertainty with ConML

ConML features several mechanisms that support imprecision and vagueness. These mechanisms are distinct, but they are often used in combination to express complex facts. In general, imprecision is difficult to treat through cross-cutting mechanisms, as its semantics depend largely on the nature of each imprecise characteristic. On the contrary, uncertainty can be satisfactorily treated through cross-cutting mechanisms in the language, as it is independent of the characteristics being described. The following sections describe each of these mechanisms in turn.

2.1.1. Null and Unknown Semantics

Most modelling or software-oriented languages, as well as most database management systems and languages, provide a null keyword, or equivalent, to express that a piece of data is not available. However, this is ambiguous, because data unavailability may be due to ontological or epistemic reasons. For example, if we read that p.Name = null where p is a person, we should interpret null as meaning epistemic absence, i.e., we do not know p’s name. However, if we encounter something like b.ProtectionLevel = null, where b is a building, we may interpret this as epistemic or ontological absence, i.e., we do not know what protection level applies to b, or b has no protection level whatsoever. To avoid ambiguity, ConML offers two different keywords:

• Null, which indicates ontological absence; b.ProtectionLevel = null means that no protection level has been established for b;
• Unknown, which indicates epistemic absence; b.ProtectionLevel = unknown means that a protection level has been established for b, but we do not know what it is.

In this manner, unknown provides a simple but powerful mechanism to express ignorance of a fact, which is an extreme case of uncertainty. Null semantics may be applied only to those features that have a minimum cardinality of zero. For example, if the Person.Name attribute in our previous example is defined as having a cardinality of 1 in a class model, then it may not take null values in an instance model, in order to maintain type conformance. However, unknown semantics may be applied to any feature, as anything is susceptible of not being known.
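A minimal sketch of how this distinction could be preserved in code follows, assuming two ad hoc Python sentinels; the names NULL and UNKNOWN are illustrative and not prescribed by ConML.

```python
# Minimal sketch (an assumed convention, not ConML's normative mapping):
# distinct sentinel objects keep ontological absence (null) and epistemic
# absence (unknown) apart from each other and from ordinary values.

class _Null:
    """Ontological absence: the feature holds no value at all."""
    def __repr__(self):
        return "null"

class _Unknown:
    """Epistemic absence: a value exists, but we do not know it."""
    def __repr__(self):
        return "unknown"

NULL = _Null()
UNKNOWN = _Unknown()

building_b = {"protection_level": NULL}  # b has no protection level whatsoever
person_p = {"name": UNKNOWN}             # p has a name, but we do not know it

# Crucially, neither sentinel collides with 0, "" or Python's ambiguous None.
assert building_b["protection_level"] is NULL
assert person_p["name"] is UNKNOWN
```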
2.1.2. Certainty Qualifiers

To cater for finer degrees of uncertainty, ConML incorporates certainty qualifiers. These are labels that may be attached to instances of classes or features to express how certain a statement is, following an exclusive order relation between them. Note that ConML does not define the qualifiers at a quantitative level (e.g., by assigning a percentage of certainty to each qualifier), because this assignment could vary between application domains or even between implementation solutions, and it can be carried out in later phases of the model implementation. There are five pre-defined degrees of certainty in ConML:

• Certain. The expressed fact is known to be true. This is indicated by an asterisk * sign;
• Probable. The expressed fact is probably true. This is indicated by a plus + sign;
• Possible. The expressed fact is possibly true. This is indicated by a tilde ~ sign;
• Improbable. The expressed fact is probably not true. This is indicated by a minus − sign;
• Impossible. The expressed fact is known to be not true. This is indicated by an exclamation ! sign.

Certainty qualifiers can be applied to describe existence or predication. When used for existence, they are attached to an instance of a class in order to express how certain we are of the existence of such an entity. For example, we may label building b in our previous example as (+) to indicate that the building represented by b probably exists. Similarly, certainty qualifiers can be applied to instances of features to express how certain we are of the associated predication. For example, we may state that b.Height = 34 (*) to indicate that we are sure that the building represented by b is 34 m high.

2.1.3. Abstract Enumerated Items

In previous sections, we described the fact that items in an enumerated type can be hierarchically organized to represent subsumption or aggregation between items and sub-items. We can use this varying abstraction level of enumerated items to represent different degrees of vagueness, both ontological (imprecision) and epistemic (uncertainty). Let us imagine that we have a WorldRegions enumerated type having root items Europe and Asia, and then items France, Germany, and Spain under Europe. Imagine now that we wanted to express where the prehistoric bell-beaker culture took place. We know that it happened in Europe, but its boundaries are naturally (i.e., ontologically) vague; for this reason, the best thing we can do is use Europe, as France, Germany, or Spain would be too restrictive. The ontologically vague Europe is an acceptable representation of the fact we want to convey, namely, that the bell-beaker culture happened all over Europe but without clear-cut boundaries. Imagine now that we need to indicate where someone was born, and that we know that it was somewhere in Europe but we are not sure in what country. Again, we should use Europe to capture this fact. By doing this, we would be purposefully injecting some inaccuracy to gain certainty, as explained in previous sections.

As illustrated by the examples, using an abstract enumerated item such as Europe may entail ambiguity, as statements such as PlaceOfOccurrence = Europe may mean two different things: the place of occurrence is all of Europe (imprecision), or the place of occurrence is some particular spot in Europe, which we are not sure of (uncertainty). Despite this, the semantics of the expressions are usually sufficient to resolve the ambiguity; for example, PlaceOfBirth = Europe should be interpreted as an uncertain (rather than imprecise) expression, as we know that people are born in a specific spot rather than in a whole continent.
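The sketch below (again a hypothetical Python encoding rather than part of ConML) keeps the five qualifiers as an ordered enumeration and shows how an abstract item such as Europe can subsume values recorded at a finer abstraction level.

```python
# Minimal sketch (assumed encoding): ordered certainty qualifiers plus an
# ancestor test over a hierarchical enumerated type, so a query for "Europe"
# also matches finer values such as "France".

from enum import IntEnum

class Certainty(IntEnum):   # exclusive order relation between qualifiers
    IMPOSSIBLE = 0          # !
    IMPROBABLE = 1          # -
    POSSIBLE = 2            # ~
    PROBABLE = 3            # +
    CERTAIN = 4             # *

# Hierarchical items as (name -> parent name) pairs; None marks a root item.
WORLD_REGIONS = {"Europe": None, "Asia": None,
                 "France": "Europe", "Germany": "Europe", "Spain": "Europe"}

def covered_by(item, ancestor, hierarchy=WORLD_REGIONS):
    """True if item equals ancestor or lies under it in the item hierarchy."""
    while item is not None:
        if item == ancestor:
            return True
        item = hierarchy.get(item)
    return False

# A vague statement plus its qualifier; "Europe" trades accuracy for certainty.
place_of_birth = ("Europe", Certainty.CERTAIN)
assert covered_by("France", "Europe")           # a finer value still matches
assert place_of_birth[1] >= Certainty.PROBABLE  # filter: keep likely facts only
```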
2.1.4. Arbitrary Time Resolution

The time data type introduced in previous sections corresponds to expressions of points along the arrow of time. However, as opposed to other modelling languages, ConML allows expressions of the time data type to contain arbitrary resolution. This means that time points do not necessarily follow the usual pattern of day, month, year, hour, minute, and second, but can be as “thick” or “thin” as needed. Some sample time values in ConML are 8 June 1996 20:45, September 1845, late 20th century, or early Neolithic. All these expressions represent “points” in time of different “thickness”. In a similar way as we did with abstract enumerated types, we can use “thick” time points to express imprecision or uncertainty. Furthermore, like in the previous case, the ensuing ambiguity must be resolved by looking at the semantics of each individual expression. For example, a statement such as Moment = 1936 may mean that something was ongoing throughout the complete year 1936 (imprecision), or that it happened at a particular time in this year but we are not sure when (uncertainty). A statement such as DateOfBirth = 1936, however, is clearly uncertain rather than imprecise, as we know that people are born on a specific day and time rather than throughout a full year.
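A minimal sketch of one possible encoding of such variable-resolution time points follows; the interval representation is an assumption for illustration (ConML does not prescribe it), and only years are used for brevity.

```python
# Minimal sketch (assumed interval encoding): a time "point" whose width
# conveys its resolution, so "thick" and "thin" points coexist in one type.

from dataclasses import dataclass

@dataclass
class TimePoint:
    earliest: int  # start of the span (years only, for brevity)
    latest: int    # end of the span; earliest == latest is a "thin" point

    def thickness(self) -> int:
        return self.latest - self.earliest

moment = TimePoint(1936, 1936)             # the whole year 1936
late_20th_century = TimePoint(1967, 2000)  # a much "thicker" point
assert late_20th_century.thickness() > moment.thickness()
```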
The four mechanisms presented cover most of the needs found in terms of humanities information modelling, although it would be possible to define other mechanisms to support imprecision and vagueness as part of ConML (e.g., methods for defining ranges), which we are considering for future revisions. The next section presents a proposal for an implementation of these mechanisms on non-relational data structures, validating the ConML mechanisms in a project with real data and showing how the software system manages data queries involving vagueness resolution.

3. Results

3.1. Case Study and Resultant Models

This section describes the application of the solution proposed in previous sections to a real scenario in digital humanities. This scenario occurred within a research project carried out at the Institute for Medieval and Renaissance Studies and Digital Humanities (Instituto de Estudios Medievales y Renacentistas y de Humanidades Digitales, IEMYRhd) [42], University of Salamanca, Spain. The research project, named DICTOMAGRED [43], analyses historical sources (including oral testimonies, legal documents, literature, etc.), most of them in Arabic, which contain geographical references describing routes through different areas in the Maghreb, their place names, their topography, and other related issues. The main goal of the project is “to provide a software tool for humanities specialists to retrieve information about the location of toponyms in North Africa as they appear in historical sources of medieval and modern times” [43]. Due to the heterogeneous nature of these historical sources, both in type and chronology, multiple needs appeared in relation to the representation of vagueness. In addition, and as in most cases in digital humanities research, vagueness not only helps researchers to better represent the area of study; it also provides additional knowledge about it. For this project in particular, needs included the specification of the degree of certainty of sources in relation to place names, the description of population estimates of the different geographical areas, and the indication of whether these places are now inhabited or not, among others [44].

Figure 1 shows an excerpt of the class model created for the project, focusing on toponyms (i.e., place names) and relations between them, the related geographical areas, and the historical sources that were employed:

• Toponym: proper name referring to a geographical place. No vagueness is involved;
• ToponymDistance: relative distance between two toponyms. This class also holds information related to the reliability of the distance estimation as a separate attribute;
• GeographicArea: location of the place referred to by a toponym. If a toponym is still in use, the corresponding geographic area is epistemologically vague but known; if not, the geographic area may be estimated from the historical sources;
• HistoricalSource: any manifestation of a testimony, either textual (such as letters, publications, and bibliographical references) or oral (formal or informal), that allows the reconstruction, analysis, and interpretation of historical events.

Figure 1. ConML class model for toponym studies in the DICTOMAGRED project.

ConML allowed us to make decisions about the treatment of vagueness very early in the project while working at the conceptual level, and thus avoid bringing technological dependencies or other implementation decisions into the conceptual model. Thus, the class diagram in Figure 1 lays the foundation for expressing vagueness when taking instances. To illustrate this, we take some instances of the classes in Figure 1, as depicted in Figure 2. Firstly, Toponym was instantiated as objects top1, top2, and top3 in order to represent three toponyms of interest: Sijilmasa, Aghmat Ourika, and Tamdalt. According to two historical sources (instances of TextualHistoricalSource that are not presented in the following diagrams for space reasons), Sijilmasa was an important human settlement founded in 757 A.D. These historical sources place it within the limits of Tiaret, close to a rich gold mine that existed between Sudan and Zawila, on a difficult route. This was a medieval Moroccan center of commerce in the far north of the Sahara in Morocco. The history of the city was marked by several successive invasions of Berber dynasties. Due to its strategic importance, its distances to other important cities and the related routes have been studied for decades. Another important extinct city is Tamdalt, whose records date from the 2nd century B.C.; from Tāmdalt to Siŷilmāsa there are 11 marhalas (stages). Near Tamdalt is the Ansara river, which rises in the mountain that is ten miles from it, in the Maghreb, where there is a silver mine. Currently, the name Tamdalt is not in use. Finally, Aghmat Ourika was a city located eight days from Siŷilmāsa and three days from Dar’a.
From the localities of the Sūs to this city it takes six days to walk, and many villages of Berber tribes are crossed. The city’s apogee lay in the Middle Ages. Currently, the known archaeological site of Journâa Aghmat occupies an enclave on the Moroccan Ourika road. All this information is described with vagueness mechanisms in the object-oriented diagram in Figure 2.

Figure 2. ConML model for the Sijilmasa, Tamdalt, and Aghmat Ourika toponym information in the DICTOMAGRED project. In grey, objects created to instantiate the class model, representing imprecise and uncertain information regarding Toponym, ToponymDistance, and GeographicArea.

Vagueness is expressed throughout the model in Figure 2 as follows. Three objects (top1, top2, and top3) represent the three toponyms involved in this scenario. For each object, arbitrary time resolution is used to express when each toponym was initially used. In addition, certainty qualifiers are employed to describe how certain we are about these datings: for Sijilmasa and Tamdalt, the asterisk in parentheses at the end of the UsedIn attribute line indicates that we are sure that the toponym was in use on these dates according to the reliable historical sources; for Aghmat Ourika, we use a tilde sign to indicate that we are not sure that it was used in the Middle Ages.
In addition, the CurrentName for Aghmat Ourika is Journâa Aghmat, the current name of the archaeological site, with a certain qualifier sign, as no place name exists today in references to the other archaeological sites; Sijilmasa and Tamdalt, on the contrary, maintain their original names but with a minus sign, because it is false that the old toponyms are now in use.

Parallel objects ga1, ga2, and ga3 represent the geographical areas where we currently place each toponym. These objects also employ certainty qualifiers for the values of the XCoord and YCoord attributes in order to express the certainty of the coordinates. Abstract enumerated items are also employed with the Region attribute. In the case of Sijilmasa and Aghmat Ourika, since they are well-known archaeological places in the center of Morocco, we can safely state that they are in Morocco. In the case of Tamdalt, it is an uninhabited archaeological site near present-day frontiers, so the level of certainty about the region is low, and therefore the very general Maghreb value is chosen since we cannot be more specific.
Regarding both topDis1 and topDis2 objects, vagueness is explicitly treated through the ReliabilityLevel enumerated type, which allows us to state that a distance in "marhalas" (a stage or period in different Arabic languages and dialects) presents low reliability, whereas one in "walking days" presents medium reliability. Additionally, we cannot specify a distance in km, so unknown is used as the value for KmDistance.
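To make the notation above more tangible, the following minimal sketch (ours, not DICTOMAGRED code) renders some of the Figure 2 instances as plain Python data. The attribute names mirror the model described in the text, while the coordinate numbers and the qualifier symbols ("*", "~", "-") are assumptions based on the preceding description.

    # Sketch only: Figure 2 instances as plain Python dicts. Qualifier symbols
    # follow the text: "*" = certain, "~" = uncertain, "-" = certainly false;
    # "unknown" marks absence of information (not absence of fact).
    top2 = {  # Tamdalt
        "Name": "Tamdalt",
        "UsedIn": {"value": "2nd century B.C.", "qualifier": "*"},
        "CurrentName": {"value": "Tamdalt", "qualifier": "-"},  # not in use today
    }

    ga2 = {  # geographical area currently associated with Tamdalt
        "XCoord": {"value": 29.7, "qualifier": "~"},  # hypothetical coordinates
        "YCoord": {"value": -8.7, "qualifier": "~"},
        "Region": "Maghreb",  # most abstract level: low regional certainty
    }

    topDis1 = {  # distance between Tamdalt (top2) and Sijilmasa (top1)
        "Origin": "top2",
        "Destination": "top1",
        "Marhalas": 11,
        "KmDistance": "unknown",    # we cannot specify a distance in km
        "ReliabilityLevel": "low",  # marhalas are a vague unit
    }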
As we can see in the DICTOMAGRED conceptualization [44], the use of explicit vagueness modelling mechanisms (both ontological and epistemic) allows us to capture relevant information needs in digital humanities research. In addition, it allows us to develop a software system that takes these specificities of the information into account.

3.2. Implementation

The final aim of including vagueness in the DICTOMAGRED project is the development of indexing and searching mechanisms that work across different levels of information uncertainty: for example, searching only toponyms in current usage, or accessing those whose distances are estimated in camel-day journeys or marhalas with high confidence by the historical sources. A non-relational storage structure has been chosen for the software system, since it allows us to maintain acceptable rates for indexing and searching information.

Non-relational databases present particularities that we need to manage when implementing the vagueness mechanisms. In order to define this implementation proposal as universally as possible, we have decided to work with key-value structures for the expression of information, since they are the simplest and most commonly employed structure in all non-relational databases. Additionally, the key-value principle is used as the basis for document-based structures, which are also common non-relational schemas in which the data entities are grouped in documents as objects composed of keys (properties) and values. These documents are usually formatted following JSON syntax [45,46]. For complete information about the non-relational terminology used here for describing implementation mechanisms, please consult [47]. In addition, it should be pointed out that non-relational databases are currently the most widely used structure for application development, due to their indexing and searching performance, real-time data management, and connectivity (for example, for mobile or distributed applications). Digital humanities software systems also require these indexing and searching capabilities.
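As a minimal illustration of the key-value and document-based structures just described, one toponym entity might be serialized as the JSON-style document below. The keys are illustrative assumptions, not the exact DICTOMAGRED schema.

    # Sketch of the key-value/document principle: an entity is a group of
    # keys (properties) and values; nested dicts model child nodes and travel
    # as JSON between services. Field names here are assumptions.
    import json

    toponym_doc = {
        "id": "top1",
        "Name": "Sijilmasa",
        "Region": {"Maghreb": True, "Morocco": True},  # see mechanism 3 below
        "UsedIn": "757 B.C.",
        "CurrentName": "Sijilmasa",
    }

    print(json.dumps(toponym_doc, indent=2))  # the JSON form of the document [45,46]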
Next, we detail the non-relational implementation designed for each vagueness mechanism defined:

1. Null and unknown semantics. Most non-relational systems do not allow one to create specific reserved words that could implement the null and unknown semantics needed for expressing vagueness. Some systems use numeric values such as zero, negative values, or empty strings to represent null and/or unknown values. Other values are sometimes used as "magic" values for these semantics. However, these practices often introduce ambiguity and confusion, as zero and empty strings may constitute acceptable values for the associated attributes. It is also common practice to create specific informational objects in the database structure for null or unknown semantics. This is a possible solution in systems where the object structure is still supported, such as MongoDB [48]. However, this solution is not possible in all non-relational systems. As we need specific semantic elements for representing absence of facts and absence of information universally, we have defined a node in our non-relational structure for each of them, encapsulating the required semantics in specific references in the non-relational software system. Figure 3 shows the non-relational node and the key-value structures defined for null and unknown semantics and their use in a specific toponym information description in DICTOMAGRED.

Figure 3. Firebase console showing the data node for defining null and unknown semantics.
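A sketch of this idea follows, with the Firebase tree modelled as a plain Python dict: a dedicated node holds the two special semantics once, and attributes point at it by path instead of carrying ambiguous "magic" values. The node and path names are our assumptions.

    # Sketch (assumed node names): dedicated nodes for the two special
    # semantics, referenced by path wherever a value is absent.
    database = {
        "semantics": {
            "null": {"meaning": "absence of fact"},
            "unknown": {"meaning": "absence of information"},
        },
        "toponyms": {
            "top2": {
                "Name": "Tamdalt",
                # a reference to the shared node, not an empty string or 0:
                "KmDistanceToSijilmasa": "/semantics/unknown",
            },
        },
    }

    def is_unknown(value):
        # True when the stored value references the 'unknown' semantic node.
        return value == "/semantics/unknown"

    print(is_unknown(database["toponyms"]["top2"]["KmDistanceToSijilmasa"]))  # True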
2. Certainty qualifiers. As we previously detailed, a certainty qualifier offers some "extra" information about a specific value of an attribute defined in the conceptual model (i.e., in b.Height = 34 (~), "34" is the value and the certainty qualifier indicates extra information: we are not very sure about the given height value). Thus, it is necessary first to define the certainty qualifiers in the non-relational structure as specific references that we can add to any key-value previously defined. A node with all possible certainty qualifiers is defined as part of the non-relational structure, separated from any other information node. With this solution, it is possible to correlate another key-value structure with the value "34" itself (following the example), indicating the certainty qualifier. Figure 4 shows the nodes added and their use in a specific toponym information description in DICTOMAGRED.

Figure 4. Firebase console showing the data node for defining certainty qualifiers in the DICTOMAGRED implementation.
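The companion key-value idea can be sketched as follows (again with assumed names): the qualifier vocabulary lives once in its own node, and each qualified attribute stores the value together with a correlated qualifier key.

    # Sketch: qualifier vocabulary in a separate node; a qualified attribute
    # correlates a value with a qualifier key, mirroring b.Height = 34 (~).
    database = {
        "qualifiers": {
            "*": "certain",
            "~": "uncertain",
            "-": "certainly false",
            "improbable": "probably false",
        },
        "buildings": {
            "b": {"Height": {"value": 34, "qualifier": "~"}},
        },
    }

    height = database["buildings"]["b"]["Height"]
    print(height["value"], "->", database["qualifiers"][height["qualifier"]])
    # prints: 34 -> uncertain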
3. Abstract enumerated items. Some systems use numeric values for representing levels of abstraction in a hierarchical structure of items. Other values are sometimes used as ad hoc formatted values for these semantics, such as chains of strings separated by special characters like "." or "/" representing the entire path of the enumerated item value (Region = Maghreb.Morocco). However, these practices often introduce ambiguity and confusion in the information, as they may constitute acceptable values for the associated attributes or respond to arbitrary implementation decisions. It is also common practice to implement abstract enumerated items as in the previous certainty qualifiers mechanism, defining a hierarchical node in the non-relational structure and storing the most concrete value of the hierarchy (Region = Morocco). Then, the software system iterates over this node in order to determine at what level of abstraction the value is described. The final possibility is to define the hierarchical node but store as Boolean values of the attribute all the levels involved (Maghreb = true; Morocco = true). Both of the latter solutions follow a non-relational structure and are operational for implementing abstract enumerated items. However, iterating over the node each time we want to resolve the abstraction information is inefficient in non-relational environments, so we finally chose the Boolean values structure. Figure 5 shows the non-relational node defined for the regions enumerated type and its items, and their use in a specific toponym information description in DICTOMAGRED.

Figure 5. Firebase console showing the regions data node implementing the abstract enumerated items mechanism.
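A sketch of the chosen Boolean-values option, under assumed item names: every hierarchy level that applies to a value is stored as true, so membership can be resolved at any abstraction level without iterating over the hierarchy node.

    # Sketch: the hierarchy itself is stored once, in its own node ...
    region_hierarchy = {"Maghreb": ["Morocco", "Algeria", "Tunisia"]}  # assumed items

    # ... while each attribute stores Booleans for all applicable levels.
    sijilmasa_region = {"Maghreb": True, "Morocco": True}  # concrete level known
    tamdalt_region = {"Maghreb": True}                     # only the abstract level is safe

    def located_in(region_attr, region):
        # O(1) membership test at any level of abstraction, no iteration needed.
        return region_attr.get(region, False)

    print(located_in(sijilmasa_region, "Maghreb"))  # True: abstract level also holds
    print(located_in(tamdalt_region, "Morocco"))    # False: we cannot be that specific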
4. Arbitrary time resolution. Most non-relational systems use the timestamp mechanism to represent temporal values (the number of milliseconds after 1st January 1970). The need to represent earlier dates at any granularity level in digital humanities makes timestamps unusable for humanities information. There are some non-relational systems, such as MongoDB [48], that present specific data types for dates, but with a very rigid format guided by the ISO 8601 standard, which also presents other problems for humanities information, such as the absence of support for the Julian calendar, or problems in data conversion from other date systems, such as the Hegira calendar (used in the DICTOMAGRED project), the Chinese calendar, etc. These limitations encouraged us to implement a class library supporting the arbitrary resolution inherent to the Time data type in ConML, which allows for some of the most usual forms of time representation, including simple and incomplete dates (and times), years, decades, and centuries. We have now implemented part of the functionality of this class library in the non-relational environment for DICTOMAGRED. Similar to the certainty qualifiers implementation, we have defined a node in the non-relational structure with a hierarchical conceptualization of the vagueness points in a timeline that we want to manage (years, decades, centuries, time eras, etc.). Then, we included a key-value structure referring to the specific point in time used to resolve a given value. For instance, UsedIn = middle ages contains a key-value structure indicating that the value "middle ages" needs to be interpreted at the "Age" level of granularity in time. Figure 6 shows the non-relational node defined for the arbitrary time resolution, and its use in a specific toponym information description in DICTOMAGRED.

Figure 6. Firebase console showing the UsedIn attribute implementation according to the arbitrary time resolution mechanism.

Note that, although we have explained the implementation proposal for each vagueness mechanism separately, it is possible (and desirable) to combine the mechanisms, exploiting the expressiveness of the ConML vagueness mechanisms and the potential of the non-relational structure. Thus, it is possible to express in a non-relational structure that one specific toponym was used in the second century (S.II B.C.) with high confidence (using certainty qualifiers) while another was used in the Middle Ages with a lower confidence.
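The following sketch (granularity labels assumed) shows a temporal value carrying its own resolution level instead of being forced into a millisecond timestamp, and how it combines with a certainty qualifier as in the example just given.

    # Sketch: a temporal value annotated with its level of granularity, and
    # combined with mechanism 2 (certainty qualifiers). Labels are assumptions.
    time_granularities = ["Year", "Decade", "Century", "Age"]  # hierarchical node

    tamdalt_used_in = {
        "value": "S.II B.C.",
        "granularity": "Century",
        "qualifier": "*",  # high confidence from the historical sources
    }
    aghmat_used_in = {
        "value": "middle ages",
        "granularity": "Age",  # interpret the value at the "Age" level
        "qualifier": "~",      # lower confidence
    }

    for used_in in (tamdalt_used_in, aghmat_used_in):
        print(used_in["value"], "@", used_in["granularity"], used_in["qualifier"])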
All the implementation details in the non-relational structure shown above are implemented in DICTOMAGRED, including vague measurements for distances and vague locations (see Figures 2 and 7). The project uses a web-based environment with a non-relational real-time database provided by Firebase services [49]. Firebase is a mobile and web application development platform run by Google since 2014 that allows us to personalize the non-relational database implementation with integrated indexing and searching services, as well as other functionalities (real-time maintenance, cloud services, etc.). It is important to highlight that the implementation proposal presented here is defined in terms of the conceptual model previously described and follows a non-relational data structure, but is independent of the specific non-relational environment chosen. Thus, as well as on Firebase, the implementation could also be adopted in any other well-known non-relational environment based on key-value or document-based structures, such as MongoDB, Amazon DynamoDB, Couchbase, Oracle NoSQL, etc. [47,50]. Following this premise, the specific modelling and implementation decisions made during this work present some homogeneity across all mechanisms, in order to ensure that the implementation proposal defined here is as universally applicable as possible for non-relational contexts with needs for expressing informational vagueness, both ontological and epistemic. In addition, we employed a search service provided by Algolia [51] via a RESTful JSON API for implementing the non-relational queries, although Firebase supports the main programming languages (including JavaScript, PHP, and Python, among others), which will allow us to integrate the DICTOMAGRED system via the web. The following subsection shows the experiments carried out within the DICTOMAGRED project, defining specific queries that include aspects of vagueness and illustrating how the DICTOMAGRED software system manages vagueness in its query results.

Figure 7. Firebase console showing final implementation details. At right, the values marhalas or parasangs (an Iranian historical unit of distance) as vague measurement units for distance in the DICTOMAGRED data model. At left, the final values for the specific Tamdalt toponym supporting vague information.

3.3. Query-Based Vagueness Resolution Results

Three queries have been defined according to the specific vagueness needs of the case study shown in Figure 2 from DICTOMAGRED, expressed first in natural language and subsequently executed in the Algolia search system accessing the Firebase-defined structure:

• QUERY A: Searching for all DICTOMAGRED toponyms located in the Maghreb region whose CurrentName is improbable. This means that the toponym is probably not in use according to current maps of populations and cities. QUERY A involves two vagueness mechanisms: abstract enumerated items to resolve the hierarchical levels of the information in the Region attribute, and certainty qualifiers to evaluate which values of the current name present an improbable qualifier.
• QUERY B: Searching for all DICTOMAGRED toponyms whose distance from Sijilmasa is unknown. This means that the system evaluates the instances of ToponymDistance where KmDistance is unknown and shows the corresponding toponyms involved in these instances as origins or destinations. This query allows us to test the resolution of unknown references.
• QUERY C: Searching for all toponyms used in the Middle Ages or in the second century B.C. This means that the software system has to query the UsedIn attribute value at two levels of abstraction, solving the query through arbitrary time resolution (note that both points in time present different levels of granularity, and neither of them fits the classic timestamp and date formats employed in the ISO 8601 standard or similar references).

Note that all queries require at least the use of one vagueness mechanism, or even combined versions of them, in order to offer to the DICTOMAGRED users (mainly researchers on the Arabic language; Maghreb topography, history, and/or archaeological remains; etc.) responses to their research questions (Figures 8–13). Figures 8, 10, and 12 show how these queries are executed, and Figures 9, 11, and 13 show the corresponding results of consulting our Firebase non-relational database using the Algolia search engine. Note that, for executing a query in the Algolia dashboard, it is necessary to define as filters or facets [51] the parameters that the query requires: in our case, region as Maghreb and CurrentName certainty as improbable in query A (Figure 8), KmDistance as unknown in query B (Figure 10), and UsedIn as middle ages or second century B.C. in query C (Figure 12).
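The same facet conditions can also be expressed programmatically. The sketch below phrases query A with the v2-style algoliasearch Python client; the index name and facet attribute names are assumptions mirroring the structures above, and the dashboard facets described in the text play exactly this role.

    # Sketch of query A (assumed index and attribute names; placeholder
    # credentials). Top-level entries of facetFilters are combined with AND,
    # as the query requires both conditions to hold.
    from algoliasearch.search_client import SearchClient

    client = SearchClient.create("APP_ID", "SEARCH_API_KEY")
    index = client.init_index("dictomagred_toponyms")

    results = index.search("", {
        "facetFilters": [
            "Region.Maghreb:true",              # abstract enumerated item (mechanism 3)
            "CurrentNameQualifier:improbable",  # certainty qualifier (mechanism 2)
        ],
    })
    for hit in results["hits"]:
        print(hit["Name"])  # expected: Sijilmasa, Tamdalt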
Figure 8. Query A execution through the Algolia search engine. We have added two facets with the two requirements of the query, about the region and the certainty in the current name use of the toponyms.

In the first case, the query A results offered all toponyms situated specifically in the Maghreb whose current name certainty is improbable. The system recovers two toponyms meeting these conditions: Sijilmasa and Tamdalt (see in Figure 2 the corresponding values of these toponyms according to the query requirements). Figure 9 shows the results for the query, showing the data for the Sijilmasa toponym.

Figure 9. Results for query A.

Figure 10. Query B execution using the Algolia search engine. We have added a custom expression on the Algolia console referring to the Sijilmasa internal code as the reference point for recovering distances to it.

Regarding the query B results, the system recovers four toponyms (two of them are part of our example) whose distance in kilometers from Sijilmasa is unknown.

Figure 11. Results for query B.

Figure 12. Query C execution through the Algolia search engine. We have added a custom expression on the Algolia console with an OR expression for executing it.

Finally, query C involved the execution of two combined searching structures, due to the fact that we have to manage toponyms used in the Middle Ages or used in the second century B.C. Logical operators are common in relational database structures, but less supported in non-relational systems. Algolia allows us to use the OR logical operator thanks to the custom search console included in its dashboard. The query C results recover 20 toponyms used in these periods of time, including two presented in our case: Aghmat Ourika and Tamdalt.
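In the client sketch started above, the same disjunction can be written as a nested facetFilters list, which Algolia interprets as OR; the attribute names remain assumptions, and the setup lines are repeated so that the snippet stands alone.

    # Sketch of query C's OR condition: a nested list inside facetFilters is a
    # disjunction, matching toponyms used in either period.
    from algoliasearch.search_client import SearchClient

    client = SearchClient.create("APP_ID", "SEARCH_API_KEY")
    index = client.init_index("dictomagred_toponyms")

    results = index.search("", {
        "facetFilters": [[
            "UsedIn:middle ages",  # "Age"-level granularity
            "UsedIn:S.II B.C.",    # "Century"-level granularity
        ]],
    })
    print(len(results["hits"]))  # expected: 20 toponyms, incl. Aghmat Ourika and Tamdalt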
Figure 13. Results for query C.

In summary, the previous implementation of the four ConML mechanisms for expressing vagueness in the Firebase non-relational database allowed us to define searches that include vagueness references in their specification, taking advantage of the capabilities of non-relational systems.

4. Discussion

The results obtained for queries A, B, and C and the Firebase-based software system created for the presented implementation show that a non-relational implementation of the vagueness mechanisms is possible, with vagueness resolution in the query system.
Note that, apart from the specific example that we wanted to show in this paper (represented in Figure 2), the software system manages all toponyms, retrieving those that meet the established vagueness criteria. It would also be possible to narrow the results to our case study alone, using the filtering mechanisms on the original results. This filtering service is provided by Firebase (and by almost all non-relational software systems) and would allow us to analyse only the case of Sijilmasa and related toponyms.

Because DICTOMAGRED [43] has a manageable number of nodes in its non-relational structure (currently DICTOMAGRED manages 53 toponyms with five hierarchical levels of information in its tree-shaped non-relational structure, which constitutes around 300 nodes of information), we could validate with the project researchers that the coverage of the implementation is total; that is, the conceptual model and the vagueness mechanisms created represent both the research needs of the project and the data source, obtaining accurate results (data that meet the conditions of indexing and searching) for queries A, B, and C. This type of expert-guided validation is only possible with a manageable number of nodes, which are easily verifiable by humans. In other contexts, a solution based on automatically monitoring the coverage of the algorithm will be necessary.

Finally, it is important to highlight the need to conceive of vagueness support from the first stages of design of each project or concrete application.
As can be seen, the queries that we designed already take into account the possibilities for expressing vagueness in the software system, since they arise from the conceptual model previously created. Without this prior conceptual design, the queries designed would probably not follow the vagueness logic of the model.

We believe that the presented implementation constitutes an important advance for the support of vagueness in digital humanities at a conceptual level. Especially, and going back to the motivation of this work, the proposal explicitly addresses the value added by vague information in the humanities, providing mechanisms for future projects with the same needs to deal with vagueness in their implementations, instead of adapting solutions without vagueness support. In addition, due to the performance advantages of non-relational systems, there are currently more applications and projects in digital humanities that choose non-relational structures to manage their data. This implementation can serve as a relevant reference for these types of projects and applications with clear vagueness management needs.

5. Conclusions

Imprecise and uncertain information constitutes an intrinsic characteristic of digital humanities research practice and, when properly modelled and expressed, may comprise a valuable asset. This paper has reviewed the most well-known approaches to the modelling of vagueness, and presented a theoretical framework and specific modelling mechanisms in ConML for the expression of ontological and epistemic vagueness in the digital humanities. As illustrated by an application to a real project, these mechanisms allow researchers to express imprecision and uncertainty in their own models. In addition, the implementation proposal presented allows them to fulfill their vagueness needs without a large penalty in analytical and processing power, thanks to the non-relational structures.

As far as we know, this is the first implementation proposal for vagueness in digital humanities that offers a software solution for vagueness from the conceptual model design to the implementation in a real digital humanities project, dealing with specific examples of vagueness needs. Due to this innovative component, critical analysis is also needed. Some suggestions for improvements are identified as part of our future roadmap. The first aspect is that, in contrast to the data coverage validation that we already mentioned in the previous section, we do not have data about the performance of the software system (time for solving a query, etc.). It has not been considered necessary to measure this because, below a certain volume of information (such as the DICTOMAGRED volume), it is difficult to obtain reliable performance measures. As a future plan, it is necessary to evaluate the presented implementation with a greater volume of nodes, where the performance of some searches could be compromised. This is especially relevant in queries that involve the arbitrary time resolution vagueness mechanism or that involve more than one vagueness mechanism at the same time.
In addition, we plan to compare the performance results obtained with implementations in relational structures, in order to establish some criteria or guidelines that will help engineers and digital humanities project managers in making decisions about their implementation data structure based on the informational needs of each project or application.

Secondly, some of the defined vagueness mechanisms are closely related to new implementation techniques from fuzzy logic. For instance, certainty qualifiers could be seen as fuzzy characterizations of information. For these reasons, we are also considering fuzzy sets and levels of set membership [32–34] or similar rule-based logic mechanisms [32] for improving specific details of the implementation of vagueness.

Finally, the application of the proposal presented here, both at the conceptual level and at the implementation level, to heterogeneous projects or applications in digital humanities will allow us to test the expressiveness of the vagueness mechanisms and the implementation in a variety of humanistic contexts and realities. These future works will allow us to improve applications and specific implementations of the present proposal for cases with greater demands for vagueness in the digital humanities.

Author Contributions: Conceptualization, C.G.-P.; Data Curation and Software Implementation, P.M.-R.; Methodology, Validation, Formal Analysis and Writing, C.G.-P. and P.M.-R.

Funding: This research was partially funded by the Spanish Ministry of Economy, Industry, and Competitiveness under its Competitive Juan de la Cierva Postdoctoral Research Programme, grant FJCI-2016-28032.

Conflicts of Interest: The authors declare no conflict of interest.

References
1. Ackoff, R.L. From data to wisdom. J. Appl. Sys. Anal. 1988, 16, 3–9.
2. Ciula, A.; Eide, Ø. Modelling in digital humanities: Signs in context. Digit. Scholarsh. Humanit. 2016, 32 (Suppl. 1), i33–i46. [CrossRef]
3. Gonzalez-Perez, C. Information Modelling for Archaeology and Anthropology: Software Engineering Principles for Cultural Heritage; Springer International Publishing: Berlin, Germany, 2018. [CrossRef]
4. Europeana. Europeana Project 2008–2015. Available online: http://www.europeana.eu/ (accessed on 22 March 2019).
5. ARIADNE. ARIADNE Project 2013. Available online: http://ariadne-infrastructure.eu/ (accessed on 22 March 2019).
6. DARIAH-EU. Digital Research Infrastructure for the Arts and Humanities (DARIAH) 2007–2015. Available online: https://dariah.eu/ (accessed on 22 March 2019).
7. Incipit. ConML Technical Specification. ConML 1.4.4 2015. Available online: http://www.conml.org/Resources_TechSpec.aspx (accessed on 22 March 2019).
8. Flanders, J.; Jannidis, F. Data modeling. In A New Companion to Digital Humanities; Schreibman, S., Siemens, R., Unsworth, J., Eds.; Wiley: Hoboken, NJ, USA, 2015.
9. Flanders, J.; Jannidis, F. Knowledge Organization and Data Modeling in the Humanities. Available online: https://www.wwp.northeastern.edu/outreach/conference/kodm2012/flanders_jannidis_datamodeling.pdf (accessed on 22 March 2019).
10. Hedges, M. Grid-enabling humanities datasets. Digit. Humanit. Q. 2009, 3, 4.
11. Linked Data. Available online: http://linkeddata.org/ (accessed on 22 March 2019).
12. Chen, P.P.-S. The entity-relationship model: Toward a unified view of data. In Readings in Artificial Intelligence and Databases; Elsevier: Amsterdam, The Netherlands, 1988; pp. 98–111.
13. W3C. RDF Schema 1.1.
W3C Recommendation 25 February 2014. Available online: https://www.w3.org/TR/rdf-schema/ (accessed on 22 March 2019).
14. Hunter, A.; Liu, W. Representing and Merging Uncertain Information in XML: A Short Survey. Available online: http://www0.cs.ucl.ac.uk/staff/A.Hunter/papers/saj.pdf (accessed on 22 March 2019).
15. TEI Consortium. Text Encoding Initiative (TEI) 2016. Available online: http://www.tei-c.org/index.xml (accessed on 22 March 2019).
16. Isaksen, L.; Simon, R.; Barker, E.T.E.; de Soto Cañamares, P. Pelagios and the emerging graph of ancient world data. In Proceedings of the 2014 ACM Conference on Web Science, Bloomington, IN, USA, 23–26 June 2014; pp. 197–201.
17. Pelagios Commons. Pelagios Commons Website (Pelagios 6 Project). Available online: http://commons.pelagios.org/ (accessed on 22 March 2019).
18. Gonzalez-Perez, C.; Martín-Rodilla, P. Teaching Conceptual Modelling in Humanities and Social Sciences. Digit. Humanit. Mag. 2017, 1, 408–416.
19. Chirico, R.D.; Frenkel, M.; Diky, V.V.; Marsh, K.N.; Wilhoit, R.C. ThermoML—An XML-Based Approach for Storage and Exchange of Experimental and Critically Evaluated Thermophysical and Thermochemical Property Data. 2. Uncertainties. J. Chem. Eng. Data 2003, 48, 1344–1359. [CrossRef]
20. ISO. ISO 21127:2006 Information and Documentation—A Reference Ontology for the Interchange of Cultural Heritage Information 2006. Available online: https://www.iso.org/standard/34424.html (accessed on 22 March 2019).
21. De Runz, C.; Desjardin, E.; Piantoni, F.; Herbin, M. Using Fuzzy Logic to Manage Uncertain Multi-modal Data in an Archaeological GIS. Available online: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.7063 (accessed on 22 March 2019).
22. Tolle, K.; Wigg-Wolf, D. Uncertainty...? ECFN Meeting 2014—Basel Goethe University 2014. Available online: http://ecfn.fundmuenzen.eu/images/Tolle_Wigg-Wolf_Uncertainty.pdf (accessed on 6 May 2019).
23. Christensen-Dalsgaard, B.; Castelli, D.; Jurik, B.A.; Lippincott, J. Research and Advanced Technology for Digital Libraries. In Proceedings of the 12th European Conference, ECDL 2008, Aarhus, Denmark, 14–19 September 2008.
24. Van Ruymbeke, M.; Hallot, P.; Billen, R. Enhancing CIDOC-CRM and compatible models with the concept of multiple interpretation. Remote Sens. Spat. Inf. Sci. 2017, 4, 287. [CrossRef]
25. Ore, C.-E.; Eide, Ø. TEI and cultural heritage ontologies: Exchange of information? Lit. Linguist. Comput. 2009, 24, 161–172. [CrossRef]
26. PROVIDEDH. PROgressive VIsual DEcision-Making in Digital Humanities (PROVIDEDH) Project 2019. Available online: https://providedh.eu (accessed on 22 March 2019).
27. ISO/IEC.
Information Technology—Object Management Group Unified Modeling Language (OMG UML) Part 1: Infrastructure. ISO/IEC 19505-1:2012. Available online: https://www.iso.org/standard/32624.html (accessed on 22 March 2019).
28. Malta, M.C.; González-Blanco, E.; Cantón, C.M.; Del Rio, G. A Common Conceptual Model for the Study of Poetry in the Digital Humanities. Available online: https://dh2017.adho.org/abstracts/148/148.pdf (accessed on 22 March 2019).
29. Lacerda, M.J.; Crespo, L.G. Interval predictor models for data with measurement uncertainty. In Proceedings of the 2017 American Control Conference (ACC), Seattle, WA, USA, 24–26 May 2017.
30. Zadeh, L.A. Fuzzy logic = computing with words. IEEE Trans. Fuzzy Syst. 1996, 4, 103–111. [CrossRef]
31. Zadeh, L.A. A Summary and Update of "Fuzzy Logic". In Proceedings of the 2010 IEEE International Conference on Granular Computing, San Jose, CA, USA, 14–16 August 2010.
32. Bouchon-Meunier, B. Strengths of Fuzzy Techniques in Data Science. Available online: https://hal.sorbonne-universite.fr/hal-01676195/document (accessed on 22 March 2019).
33. Zhou, H.; Wang, J.-Q.; Zhang, H.-Y. Multi-criteria decision-making approaches based on distance measures for linguistic hesitant fuzzy sets. J. Oper. Res. Soc. 2018, 69, 661–675. [CrossRef]
34. Faizi, S.; Rashid, T.; Sałabun, W.; Zafar, S.; Wątróbski, J. Decision making with uncertainty using hesitant fuzzy sets. Int. J. Fuzzy Syst. 2018, 20, 93–103. [CrossRef]
35. OMG. Project Portal for OMG® Uncertainty Modeling (UM) 2017. Available online: http://www.omgwiki.org/uncertainty/doku.php?id=Home (accessed on 22 March 2019).
36. Yue, T.; Ali, S.; Selic, B. Standardizing Uncertainty Modeling at OMG. Available online: http://www.cister.isep.ipp.pt/ae2016/presentations/utest2.pdf (accessed on 22 March 2019).
37. Xiao, J.; Pinel, P.; Pi, L.; Aranega, V.; Baron, C. Modeling uncertain and imprecise information in process modeling with UML. In Proceedings of the Fourteenth International Conference on Management of Data (COMAD), Mumbai, India, 17–19 December 2008.
38. Jackson, C.H.; Bojke, L.; Thompson, S.G.; Claxton, K.; Sharples, L.D. A framework for addressing structural uncertainty in decision models. Med. Decis. Mak. 2011, 31, 662–674. [CrossRef] [PubMed]
39. Ottomanelli, M.; Wong, C.K. Modelling uncertainty in traffic and transportation systems. Transportmetrica 2011, 7, 1–3. [CrossRef]
40. Sarma, A.D.; Benjelloun, O.; Halevy, A.; Nabar, S.; Widom, J. Representing uncertain data: Models, properties, and algorithms. VLDB 2009, 18, 989–1019. [CrossRef]
41. Martín-Rodilla, P.; Gonzalez-Perez, C. Assessing the learning curve in archaeological information modelling: Educational experiences with the Mind Maps and Object-Oriented paradigms. In Proceedings of the 45th Computer Applications and Quantitative Methods in Archaeology (CAA 2017), Atlanta, GA, USA, 13–16 March 2017.
42. IEMYR. Instituto de Estudios Medievales y Renacentistas y de Humanidades Digitales IEMYRhd 2018. Available online: http://iemyr.usal.es/ (accessed on 22 March 2019).
43. Dictomagred. DICTOMAGRED: Diccionario de Toponimia Magrebí 2018. Available online: https://dictomagred.usal.es/ (accessed on 22 March 2019).
44. Rodríguez, M.A.M. Paisajes, espacios y objetos de devoción en el Islam. Available online: https://dialnet.unirioja.es/servlet/libro?codigo=708334 (accessed on 22 March 2019).
45. Sharp, J.; McMurtry, D.; Oakley, A.; Subramanian, M.; Zhang, H.
Data Access for Highly-Scalable Solutions: Using SQL, NoSQL, and Polyglot Persistence; Microsoft Patterns & Practices: Redmond, WA, USA, 2013.
46. De Freitas, M.C.; Souza, D.Y.; Salgado, A.C. Conceptual Mappings to Convert Relational into NoSQL Databases. In Proceedings of the 18th International Conference on Enterprise Information Systems, Rome, Italy, 25–28 April 2016.
47. What are NoSQL Databases? Available online: https://aws.amazon.com/nosql/ (accessed on 22 March 2019).
48. MongoDB. Available online: https://www.mongodb.com/ (accessed on 22 March 2019).
49. Google Inc. Firebase 2019. Available online: https://firebase.google.com/ (accessed on 22 March 2019).
50. Abramova, V.; Bernardino, J. NoSQL databases: MongoDB vs Cassandra. In Proceedings of the International C* Conference on Computer Science and Software Engineering, Porto, Portugal, 10–12 July 2013.
51. Algolia. Algolia Website 2019. Available online: https://www.algolia.com/ (accessed on 22 March 2019).

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

work_fnntlamvnnhzfm53c6ydj4oeoa ---- [PDF] Discovering relationships from imperial court documents of Qing China | Semantic Scholar DOI:10.3366/ijhac.2012.0036 Corpus ID: 9673344 Discovering relationships from imperial court documents of Qing China @article{Hsiang2012DiscoveringRF, title={Discovering relationships from imperial court documents of Qing China}, author={J.
Hsiang and Shih-Pei Chen and Hou Ieong Ho and H. Tu}, journal={Int. J. Humanit. Arts Comput.}, year={2012}, volume={6}, pages={22-41} } J. Hsiang, Shih-Pei Chen, Hou Ieong Ho, H. Tu. Published 2012. History, Computer Science. Int. J. Humanit. Arts Comput. The Qing Imperial Court documents are a major source of primary research material for studying the Qing era China since they provide the most direct and first-hand details of how national affairs were handled. However, the way Qing archived these documents has made it cumbersome to collect documents covering the same event and rebuild their original contexts. In this paper, we describe some information technology that we have developed to discover two important and useful relations among these… View via Publisher: thdl.ntu.edu.tw. Figures, Tables, and Topics from this paper: 18 figures and tables. Topics: Text mining, Digital library, Historical document, Diagram, Archive. 5 Citations (Background Citations: 1, Methods Citations: 1): Discovering land transaction relations from land deeds of Taiwan (Shih-Pei Chen, Yu-Ming Huang, J. Hsiang, H. Tu, Hou Ieong Ho, Ping-Yen Chen; Lit. Linguistic Comput. 2011); A Chinese ancient book digital humanities research platform to support digital humanities research (Chih-Ming Chen, C. Chang; Electron. Libr. 2019); Application of Taiwan's Human Rights-Themed Cultural Assets and Spatial Information (Shuhui Lin; Complex. 2020); Visuality in a Cross-disciplinary Battleground: Analysis of Inscriptions in Digital Humanities Journal Publications (Rongqian Ma, Kai Li; 2021); A Bibliographic Analysis of Scholarly Publication in the Emerging Field of Digital Humanities in Taiwan (K. Chen, Muh-Chyun Tang; 2019). References: Methods for Identifying Versioned and Plagiarized Documents (T. C. Hoad, J. Zobel; J. Assoc. Inf. Sci. Technol. 2003); Collection statistics for fast duplicate document detection (Abdur Chowdhury, O. Frieder, D. Grossman, M. McCabe; TOIS 2002).
work_fooxxved5zgsddcx7iawm44dhi ---- Journal of e-Media Studies Volume 3, Issue 1, 2013 Dartmouth College
Can Digital Humanities Mean Transformative Critique?
Alexis Lothian and Amanda Phillips
We need new hybrid practitioners: artist-theorists, programming humanists, activist-scholars; theoretical archivists, critical race coders. We need new forms of graduate and undergraduate education that hone both critical and digital literacies. We have to shake ourselves out of our small, field-based boxes so that we might take seriously the possibility that our own knowledge practices are normalized, modular, and black boxed in much the same way as the code we study in our work.
––Tara McPherson, “Why is the Digital Humanities So White?” (154)
We were invited to this issue of the Journal of e-Media Studies because we gave something a name. We are two participants in a group of early-career queer, feminist, and ethnic studies scholars of media, literature, and culture who are interested in digital scholarship, who kept meeting at conferences and wondering why the critical frameworks and politicized histories of our activist inquiry were so rarely part of the conversations we were having about scholarly technology. The series of academic conference events that led us to converge as a collective have by now been hashed and rehashed many times: there was an idea at THATCamp SoCal in response to anxiety at MLA 2011; then a small but productive panel at ASA (American Studies Association) 2011; some blog posts on HASTAC (Humanities, Arts, Sciences, and Technology Advanced Collaboratory) and elsewhere, a Tumblr; and the birth of a hashtag that finally caught the attention of the digital humanities (DH) Twittersphere. Somebody made a Google Doc, some bodies attended a panel, and some buddies were in the collective hoping that people would take over the hashtag and submit to the Tumblr and blog about why #transformDH was cute but vague and ultimately misguided. But, ultimately, the project’s goal was to put a name to a feeling and see who else was thinking the same thing. That there are now names out there, records of attendance, email trails, and other evidence for the future tenure files that might take such endeavors into account, was a side effect that has taught us much about the power of naming––you might even say of branding––when you want to get an idea into circulation.
What was the idea? In short, #transformDH is an aggregated statement of the obvious. First of all: the emergent methods and practices we call digital humanities are not only for traditional work. Years of DH criticism might point to the banality of this sentiment; the changing shapes of communication and technology alter the terms of scholarship, and keeping afloat in the coming century will require mastery over new tools and methods. The revolution of DH is in full swing, with the force of multicampus institutions, internet portals, and federal funding at its back.
The histories that DH as a discipline traces back through practices of humanities computing have indeed done transformational work on the structures of scholarship and the bureaucracies that shape our careers. Yet the bright lights and marching bands of the so-called big tent outshine less marketable histories of engagement with technology that have emerged from standpoints that critique the privileging of certain gendered, racialized, classed, able-bodied, Western-centric productions of knowledge. In a recent blog entry, filmmaker, feminist, and academic Alex Juhasz describes why she does not affiliate herself wholeheartedly with digital humanities:
The “field” does the amazing potentially radicalizing work of asking humanities professors (and students) to take account for their audiences, commitments, forms, and the uses of their work. But this was always there to take account of, being obscured by the transparent protocols of publishing and pedagogy that have been revealed because of the force of the digital. However, this turn is occurring, for the most part, as if plenty of fields, and professors, and artists, and students, and humanists hadn’t already been doing this for years (and therefore without turning to these necessarily radical traditions of political scholars, theoretical artists, and humanities activists).
#TransformDH was our attempt to turn the digital humanities toward these radical traditions, as well as toward the bodies of critical work in new media studies by Wendy Chun, Lisa Nakamura, Anna Everett, Tara McPherson, and many others, that unpack the politics inherent in the force of the digital, the powers that shape the hardware and software that in turn shape our scholarly work. We wanted to think about the institutions that were forming in this ever more amorphous thing called digital humanities. We didn’t want the ways of engaging knowledge that were important to us to be left out. We felt it would be too easy to say that we were doing something other than DH, whether that be new media studies or critical cultural studies with a focus on the digital; instead, we wanted to bring what Juhasz calls “necessarily radical traditions,” which have nourished us, into the DH field in which we also felt at home. If humanities scholars in critical media and cultural studies, queer studies, ethnic studies, disability studies, and related areas are doing work in and with the digital, we should lay claim to our place within digital humanities. We should explicitly occupy that space and assert––as McPherson and Jamie “Skye” Bianco, among others, have recently done––that the honorable history of humanities computing is not the only one that matters for whatever it is we mean when we talk about the field.
Inclusivity is important to DH practitioners in the humanities computing tradition. We share that goal, but it is not the heart of our project. In “Whose Revolution? Towards a More Equitable Digital Humanities,” Matthew K. Gold’s MLA 2012 talk reflecting on his book Debates in the Digital Humanities, Gold raises the question of which hierarchies, uneven distributions of labor, and value systems DH might preserve even as it seeks to change the way academic work is done. His important discussion focuses on the vital and often overlooked power of institutional resources to shape what scholarly work gets done.
Yet the metaphor that comes after his set of concrete and useful suggestions for diversifying DH is interesting: “as any software engineer can tell you, the more eyes you have on a problem, the more likely you are to find and fix bugs in the system.” If the system of DH were to run smoothly, Gold implies, it would not perpetuate hierarchies or inequalities. Gender, race, sexuality, ability, and class––and the marked bodies on which they become most visible––can be content that would fit within the forms already being established and funded for digital work: the on-campus centers, the annotated archives. But what we know about the academy, from its constitutive imbrications with nationalism and empire to the structures of race and gender that still shape its labor practices, suggests otherwise. Content and form are not so separable; truly accounting for one will unavoidably change the other.
So instead of smoothing out the bugs in the digital academy, we wonder how digital practices and projects might participate in more radical processes of transformation––might rattle the poles of the big tent rather than slip seamlessly into it. To that end, we are interested in digital scholarship that takes aim at the more deeply rooted traditions of the academy: its commitment to the works of white men, living and dead; its overvaluation of Western and colonial perspectives on (and in) culture; its reproduction of heteropatriarchal generational structures. Perhaps we should inhabit, rather than eradicate, the status of bugs––even of viruses—in the system. Perhaps there are different systems and anti-systems to be found: DIY projects, projects that don’t only belong to the academy, projects that still matter even if they aren’t funded, even if they fail. What would digital scholarship and the humanities disciplines be like if they centered around processes and possibilities of social and cultural transformation as well as institutional preservation? If they centered around questions of labor, race, gender, and justice at personal, local, and global scales? If their practitioners considered not only how the academy might reach out to underserved communities, but also how the kinds of knowledge production nurtured elsewhere could transform the academy itself?
These questions are not hypothetical. These digital humanities already exist. Here we offer a curated list of projects, people, and collaborations that suggest the possibilities of a transformative digital humanities: one where neither the digital nor the humanities will be terms taken for granted. The transformative digital humanities will not be found only among the members of our ad hoc collective. Nor will it be found only where the funding is, where the easily recognized and intensively supported DH projects are. We’ve gathered a selection of projects, ranging from institutionally sponsored archives of less-than-traditional materials to networks that purposefully have no direct connection to the academy as such. None belongs to a core member of our collective, because we are becoming a little alarmed at the publicity our act of naming has begun to generate. All the projects put the questions of decades of feminist, queer, and critical race theory (all of which share significant temporal nodes with the politicized computing movements at the heart of much DH philosophy) at the center of their work, leveraging the affordances and methodologies for social justice.
Here one can find collaboration pushed to collectivity, interdisciplinarity that reaches outside of the ivory tower, and art that builds its own theory. These are only beginnings, suggestions; you may disagree that these are projects worth gathering, or you may wish to suggest other projects for consideration. Your feedback, critiques, and additions will help us to build a transformative digital humanities together.
Curation
Transformative Archives
Archives may be the most legible form of digital humanities production, as digital tools have been developed to preserve, gather, and share historical documents. Digital humanities practitioners have increasingly been theorizing the power structures and silences of the archive, as well as drawing on materials less often granted the legitimacy of academic preservation.
Adeline Koh: Digitizing “Chinese Englishmen”: Representations of Race and Empire in the Nineteenth Century
Adeline Koh’s online Digitizing “Chinese Englishmen” project is an early step in the direction of decolonizing the archive, offering a forum for collaborative annotation and novel social media intervention on texts that expand the Victorian Anglophone repertoire beyond its current “narrow geographical boundaries.” Koh’s project carves out a space for the postcolonial archive:
The website is meant to be both a “decentralized” and a “postcolonial” archive. By a “decentralized” archive, it refers to one which provides modes for democratic access and exchange. On first glance, the term “postcolonial” nineteenth century archive may appear anachronistic, as no colonies were in fact “postcolonial” in this time period. My use of the term “postcolonial,” however, derives more from the type of postcolonial literary criticism and postcolonial theory commonly associated with Edward Said and the Subaltern Studies Collective than with movements towards decolonization before and after the Second World War. In this definition, a “postcolonial” archive is one which examines and questions the creation of imperialist ideology within the structure of the archive. Additionally, it aims to assemble a previously unrepresented collection of subaltern artifacts. (“Addressing Archival Silence on 19th Century Colonialism – Part 2”)
Straits Chinese Magazine, the project’s source text, offers readers a complicated, alternative view of what it meant to be both an Englishman and a Chinese gentleman in the 19th century. Koh’s archive makes no effort to resolve or simplify the complicated identity practices of the Chinese Englishmen, hoping instead to offer a platform to evaluate them without the colonial impulse to reduce these Victorians to paragons of false consciousness or imitations of “real” British gentlemanliness. Digitizing “Chinese Englishmen” expands the archive beyond colonial representations of nonwhite peoples in the 19th century, leveraging the reach of the digital to transform the face of 19th-century studies.
Women Who Rock: Making Scenes, Building Communities at the University of Washington
Women Who Rock is an oral history archive at the University of Washington, built from the ground up on the principles of women of color feminism: collaboration across difference, intersectional critique, and accountability to communities outside the academy. Participation in the project provides training for women’s and ethnic studies graduate students in the digital skills that suit their research interests, from web design to video production.
Headed by Michelle Habell-Pallán, this is one of the few well-established, institutionally supported DH projects that are rooted in critical feminist media theory and praxis.
Women Who Rock Research Project (WWRRP) supports, develops, and circulates cultural production, conversations and scholarship by cultural producers and faculty, graduate students, and undergraduates across disciplines, both within and outside the University, who examine the politics of gender, race, class, and sexuality generated by popular music. Our goal is to generate dialogue and provide a focal point from which to build and strengthen relationships between local musicians and their communities, and educational institutions. (Women Who Rock Project: Making Scenes, Building Communities)
[Video by Angelica Macklin: http://vimeo.com/24484214]
Oral histories such as this are committed to the production of knowledge from below, bringing people and practices who have traditionally been excluded from academic spheres––or simply not taken seriously there––into the frameworks of institutional preservation. In the case of Women Who Rock, the preservation of popular music’s communities and histories is also aimed at a transformation of the institutional archive itself, bringing down barriers between the university and the knowledge worlds that lie outside its walls.
Transformative Artistic Production
Definitions of the digital humanities do not often include digital artistic production. But why not? The borders of artistic practice, software design, political activism, and critical knowledge production are porous.
Micha Cárdenas: Transreal Politics
A queer performance artist currently working toward a PhD in the University of Southern California (USC)’s Interactive Media Arts and Practice program, Micha Cárdenas uses art, theory, and technology to encourage social justice thinking, which results in a unique brand of art-theory that pushes each of the fields in which it engages. Cárdenas develops new software applications, designs and builds electronic gadgets that challenge hegemonic regimes, and infuses each performance with theoretical writing. Cárdenas’s collaborative work has resulted in two theoretical texts so far: Trans Desire/Affective Cyborgs, coauthored by Barbara Fornssler and Wolfgang Shirmacher, and The Transreal: Political Aesthetics of Crossing Realities, coauthored with Zach Blas, Elle Mehrmand, and Amy Sara Carroll. Cárdenas’s work takes trans- to its fullest extent, crossing realities, genders, theoretical perspectives, and technical design.
The video featured here, “Becoming Transreal,” a performance in collaboration with Elle Mehrmand and Chris Head, focuses the attention of the digital back on the material body and its entanglement with global capital, reminding us, through the pain of transgender experience braided with a dystopic science fiction narrative, that technology is of concern to bodies (and corporations) most of all. From the video’s description:
What if you could become anything? What happens after species change surgery becomes a reality? becoming transreal speculates on a future in which the promises of bionanotechnology have become realized, and yet as capitalism has continued to fail, both the interiors of our bodies and the virtual world have become totally commodified. you can become anything, but to finance your whims of identity transformation, the same nanohormones that transform your body are also producing drugs for others.
becoming transreal looks at transgender experience through a lens of slipstream science fiction poetry about bio-nano drug piracy. The performance uses motion capture to interface with Second Life avatars [http://en.wikipedia.org/wiki/Second_Life] and 3D stereoscopic imagery to immerse the audience in this transreal world.
Cárdenas operates in the tradition of mixed-reality performance, which Steve Benford and Gabriella Giannachi define broadly as a subset of performance art, including augmented reality and pervasive gaming, that combine “many real, virtual, augmented reality, and augmented virtuality environments into complex hybrid and distributed performance stages” (3). Although many mixed-reality works, such as Blast Theory’s Uncle Roy Everywhere or 42 Entertainment’s I Love Bees, focus on direct user participation and mobile technologies, Cárdenas invites the audience to enter the world of the performance through indirect means such as audience props and immersive presentation technologies. Using large-scale projection equipment and biometric sensors keyed to the performers’ bodies, Cárdenas’s transreal performances bridge a physical installation space with the virtual world of Second Life, (dis)embodying their own content through form. Cárdenas creates a performance space and temporality layered with autobiography and speculative fiction, physical bodies and digital avatars.
[Video by Micha Cárdenas and Elle Mehrmand, “Becoming Transreal”: http://vimeo.com/16869351]
Zach Blas: Queer Technologies
Zach Blas’s Queer Technologies project invites viewers to rethink the role of critical theory by bringing it out of academic language and into the realm of product design. Blas’s art reimagines queer theory as a high-design brand, building objects that we can imagine as desirable accessories for the discerning plugged-in activist, and challenging us to pay attention to the commodification of art and ideas.
Part manifesto, part news report, part critical essay, Queer Technologies’ suite of instructional videos takes digital production as both theory and praxis. Each video documents a queer weapon of resistance that responds to, yet participates in, the methods of the technological tools of empire. Blas’s playful, speculative products ironically reproduce the signifiers of global capital while offering queer possibilities for undermining them, as indicated by the promotional speech embedded in each video:
Queer Technologies is an organization that develops applications for queer technological agency, interventions, and social formation. We use technology to make queer weapons of resistance. These include: transCoder, a queer programming anti-language software development kit; ENgendering Gender Changers, a solution to gender adapters’ male/female binary; Gay Bombs, a technical manual manifesto that outlines a how-to of queer networked activism; and GRID, a mapping application that tracks dissemination of queer technologies and maps the battle plans to more thoroughly infect networks of global capital. You can find our products at the Disingenuous Bar, a center for political support for technical problems, or in various consumer electronics stores, such as Best Buy, Radio Shack, and Target.
This sarcastic PR spin calls into question the Apple products and slick gadgetry on which media-inclined academics depend; indeed, Queer Technologies asks us to consider not only the ends to which we apply our digital tools, but also the troubling legacies and potential applications of cutting-edge developments in science and technology. The video “Fag Face, or How to Escape Your Face” responds to biometric technologies that enlist the face in governmental control systems, whose applications range from commercial digital camera software to surveillance technologies used by local law enforcement. Responding to legacies of homophobia and neoliberal governance with Deleuze, Guattari, and gay pornography, “Fag Face” offers a new way to think about and produce critical theory.
[Video by Zach Blas, “Fag Face”: http://vimeo.com/26638452]
From the Center
Scholarship and activism, academy and community, theory and pedagogy are often considered to be separate. By including this project, in which researchers and technology educators work with incarcerated women of color using digital storytelling techniques, we hope to challenge readers to think about what it might mean to allow our ideas about scholarship and political commitment to be transformed from the ground up. Digital scholar, poet, and University of California–Berkeley graduate student Margaret Rhee serves as project co-lead and conceptualist. At the 2011 HASTAC Conference, Rhee spoke of this collaborative activist work as “counterintuitive to the logics and rewards of the academy”––yet absolutely necessary.
As feminists in our new media age, we believe women should be the authors, directors and storytellers of our own lives. We re-imagine how new media technologies can provide a vital intervention for all women, even those whose voices are subsumed in larger hegemonic discourse. Oftentimes, incarcerated women and issues of race, class and sexuality are unacknowledged even in interdisciplinary areas such as Ethnic, Women and Queer Studies and in larger conversations and decisions of HIV/AIDS prevention education, policy and new media technologies. “From the Center” derives from intersectional issues, domains and disciplines. We hope to bridge seemingly disparate subjects: feminist praxis, HIV/AIDS education, digital storytelling, the prison industrial complex, Women’s Studies, Ethnic Studies and New Media Studies. Thus, we question, hope and urge a re-articulation of women’s identity, HIV/AIDS education and the digital divide by centering the issues and concerns of incarcerated women. (From the Center)
The field of digital humanities has become well known for its willingness to challenge academic conventions on one level: the idea that a PhD constitutes professional training that should lead invariably to a tenure-track university teaching position. Yet the vision of From the Center, and Rhee’s insistence that her work should be considered part of a scholarly project, highlights the limits of the academic transformations suggested by the increasingly celebrated alt-ac narrative (which encourages PhDs to seek careers in non-teaching roles in the university). From the Center is a far more radical vision of what alternative scholarly knowledge projects and professional practices could be. It is not uncommon for scholars with particular political commitments to use their skills for activist projects in addition to their university work of teaching, research, and (in the age of DH) digital projects.
But what would it mean to slip the bounds of the neoliberal academy, even for a moment, and imagine this work as the center of scholarly activity?
[Digital story, “Miracle”: http://vimeo.com/26096719]
Because I want to help women know that it is okay to go through things like that, this life. Because I have someone in my family who has HIV. And I learned from her how to have safe sex and get tested.
From the evocative intensity of the video to the straightforward statements that highlight a reality too rarely acknowledged within scholarly spaces, knowledge is being produced and transmitted here. When From the Center’s team travels to conferences, its presenters include formerly incarcerated participants as well as academics and professional activists. Their presence suggests that the privileged sphere of digital scholarship need not remain hermetically sealed from those who “go through things like that, this life.”
Transformative Networked Pedagogies
Connections and support networks among those engaged in knowledge production are central to the growth of the digital humanities sphere. Much unacknowledged work of consolidation, mentorship, and intellectual framing takes place in and through digitally mediated social networks. Here we highlight two examples that make the work of theory/practice explicit and conscious, building collaborative spheres on feminist principles and connecting transformative praxes inside and outside the academy.
Fembot Collective: Feminism, New Media, Science and Technology
The Fembot Collective consists of faculty, graduate students, and librarians who created a portal for feminist scholarship about technology. Committed to the ideals of open source, Fembot hosts an online journal, Ada: Journal of Gender, New Media and Technology, with an open peer editorial process, an expanded notion of what “article” means, and a built-in system to help contributors bolster promotion and tenure portfolios:
Fembot has developed a framework for a two-level review process that includes an open editorial peer review and a community level of review for works in progress. Valuing both the scholarly works and participation in the community of review, Fembot will provide metrics on article views/downloads and the usefulness of comments. These metrics will be aggregated into a portfolio, which is conducive to forming an incentive to participate in the community and support an argument for value toward promotion.
In addition to its transformation of scholarly publishing, Fembot contributes pedagogical tools on the undergraduate and graduate levels, hosting blog posts in the site’s Laundry Day section that outline short, teachable moments in feminist technology scholarship, and providing tenure policies and dissertation prospectuses for use in professionalization training. Most recently, Fembot acts as the portal for FemTechNet, a feminist technology teaching network that hopes to launch a course taught worldwide, Dialogues on Feminism and Technology, in 2013. Billed as a “Distributed Online Collaborative Course,” FemTechNet is an attempt at developing a viable model for transdisciplinary, transnational, transmedial collaborative pedagogies, and a feminist intervention on the MOOC (Massive Open Online Course) model that is prevalent and controversial in current digital humanities discourse.
In the future, Fembot will host peer-evaluated readings, videos, bibliographies, and other teaching resources to aid participants in tailoring local instances of the course to its networked goals. Experiments such as FemTechNet and Ada position the Fembot Collective as an innovator in scholarly communicative possibilities.
Crunk Feminist Collective
“Mission Statement”
In its mission statement, the Crunk Feminist Collective throws off “hegemonic ways of being” in favor of reveling in and sharing the intoxicating effects of women of color feminisms with its readership and commenting community. This blogging community provides a space for women of color to commune, critique, and call out hegemonic culture in ways that reach across the divide separating academia from the popular.
Beat-driven and bass-laden, Crunk music blends Hip Hop culture and Southern Black culture in ways that are sometimes seamless, but more often dissonant. Its location as part of Southern Black culture references the South both as the location that brought many of us together and as the place where many of us still do vibrant and important intellectual and political work. The term “Crunk” was initially coined from a contraction of “crazy” or “chronic” (weed) and “drunk” and was used to describe a state of uber-intoxication, where a person is “crazy drunk,” out of their right mind, and under the influence. But where merely getting crunk signaled that you were out of your mind, a crunk feminist mode of resistance will help you get your mind right, as they say in the South.
Casting off stilted academic speech for lyrical manifestos, insisting on the utility of affect for deep and considered arguments, and refusing to disconnect deeply personal stories from the project of scholarship, the Crunk Feminists’ commentary is more timely than journal production and more effective in enlisting the passion and drive of reader-students for social justice purposes.
The collective’s interventions in internet and popular culture have included critiquing mainstream media for its coverage of Olympians Gabby Douglas and Claressa Fields, covering the triumphs and missteps of the popular The Misadventures of Awkward Black Girl web series, and offering film and television reviews that range from Love and Hip Hop to Pariah. The Crunk Feminists also offer practical career advice for young academics and swap experiences and strategies for the unique struggles of the black feminist running a university-level class. As the blog’s large community of regular readers and commenters attest, the tactics and philosophies of Crunk Feminism reach into academia and beyond, educating and transforming their corner of the web.
Conclusion
As the tools and methods of the digital humanities take up their new positions of prominence, we can only hope that they will begin to take on the mutations and instabilities represented by the practitioners and projects featured here, rather than settle into the creaky machine of the corporate university. Whatever its future, DH has already proved its power to unsettle the old guard, inducing anxious and skeptical blog posts from high-profile critics and me-too conference panels spreading the word to far-off disciplines. The spirit of #transformDH is not to arrest this momentum, but to channel it in truly transformative directions—to avoid trading whiteness for more whiteness, heteropatriarchy for more heteropatriarchy, one imperialist hierarchy for another.
We hope the community at large will continue to find and go viral with the social justice-minded hybrid practices, identities, and collaborations elaborated in McPherson’s epigraph to this work of curation and analysis—the antiracist archives, the queer art-theories, the collaborative feminist pedagogies, the crunk academic activisms, the critical race coders. #TransformDH is a convenient means to do so, but in the spirit of transformative work, we hope it will be supplanted by something else soon.
About the Authors
Alexis Lothian is assistant professor of English at Indiana University of Pennsylvania, where she researches and teaches at the intersections of cultural studies, digital media, speculative fiction, and queer theory. She is the editor of an upcoming special issue of Ada: Journal of Gender, New Media and Technology on feminist science fiction, a coeditor of a Social Text Periscope dossier on Speculative Life, and a founding member of the editorial team for the journal Transformative Works and Cultures. Her work has been published in International Journal of Cultural Studies, Cinema Journal, Camera Obscura, and Journal of Digital Humanities.
Amanda Phillips is a PhD candidate in the Department of English with an emphasis in feminist studies at the University of California–Santa Barbara. Her dissertation takes a vertical slice of the video games industry to look at how difference is produced and policed on multiple levels of the gamic system. Her interests more broadly are in queer, feminist, and race-conscious discourses in and around technoculture, popular media, and the digital humanities. In addition to participating in the Humanities Gaming Institute 2010, sponsored by the National Endowment for the Humanities (NEH), Amanda has been a HASTAC Scholar since 2009; she has also hosted, in conjunction with Margaret Rhee, an online HASTAC forum on Queer and Feminist New Media Spaces, the organization’s most commented-on forum to date. She has presented at the conferences for UCLA Queer Studies, the American Studies Association, the Modern Language Association, the Popular Culture Association, and the Conference on College Composition and Communication, and has participated in unconferences such as HASTAC’s Peer-to-Peer Pedagogy Workshop, THATCamp SoCal, and the Transcriptions Research Slam. Most recently, she has been involved with the #transformDH Collective’s efforts to encourage and highlight critical cultural studies work in digital humanities projects.
Bibliography
Benford, Steve, and Gabriella Giannachi. Performing Mixed Reality. Cambridge, MA: MIT Press, 2011.
Blas, Zach. “Fag Face, or How to Escape Your Face.” Vimeo. 2012. Accessed May 9, 2012. http://vimeo.com/26638452.
———. “Queer Technologies: Automating Perverse Possibilities.” Queer Technologies. 2012. Accessed May 9, 2012. http://www.zachblas.info/projects/queer-technologies/.
Cárdenas, Micha. Transreal.org. 2012. Accessed May 9, 2012. http://transreal.org/.
Cárdenas, Micha, and Elle Mehrmand. “Becoming Transreal.” UCLA Freud Playhouse, Los Angeles, CA. Performed Nov. 3, 2010. Vimeo. Accessed May 9, 2012.
The Crunk Feminist Collective. The Crunk Feminist Collective. 2010–present. Accessed May 9, 2012. http://crunkfeministcollective.wordpress.com/.
———. “Mission Statement.” The Crunk Feminist Collective (blog). Mar. 6, 2010. Accessed May 9, 2012. http://crunkfeministcollective.wordpress.com/about/.
The Fembot Collective. Fembot: Feminism, New Media, Science and Technology. 2012. Accessed May 9, 2012.
http://fembotcollective.org/.
Gold, Matthew K. “Whose Revolution? Towards a More Equitable Digital Humanities.” The Lapland Chronicles (blog). Jan. 10, 2012. Accessed May 9, 2012. http://mkgold.net/blog/2012/01/10/whose-revolution-toward-a-more-equitable-digital-humanities/.
González, Isela, Margaret Rhee, Allyse Gray, and Kate Monico Klein. From the Center: Facilitating Feminist Digital Theory and Praxis in a Digital Environment (blog). 2012. Accessed May 9, 2012. http://hastac.org/blogs/alexislothian/2011/12/02/hastac2011-center-facilitating-feminist-digital-theory-and-praxis-dig.
Graduates of From the Center. “Miracle.” Vimeo. 2010. Accessed May 9, 2012. http://vimeo.com/26096719.
Juhasz, Alex. “Two Conferences: One Students’/Women’s Media Power.” Media Praxis: Integrating Media Theory, Practice and Politics (blog). Apr. 2, 2012. Accessed May 9, 2012. http://aljean.wordpress.com/2012/04/02/two-conferences-one-studentswomens-media-power/.
Koh, Adeline. “Addressing Archival Silence on 19th Century Colonialism – Part 1: The Power of the Archive.” Adeline Koh (blog). Mar. 4, 2012. Accessed May 9, 2012. http://www.adelinekoh.org/blog/2012/03/04/addressing-archival-silence-on-19th-century-colonialism-part-1-the-power-of-the-archive/.
———. “Addressing Archival Silence on 19th Century Colonialism – Part 2: Creating a Nineteenth Century ‘Postcolonial’ Archive.” Adeline Koh (blog). Mar. 4, 2012. Accessed May 9, 2012. http://www.adelinekoh.org/blog/2012/03/04/addressing-archival-silence-on-19th-century-colonialism-part-2-creating-a-nineteenth-century-postcolonial-archive/.
———. Digitizing “Chinese Englishmen.” 2012. Accessed May 9, 2012. http://chineseenglishmen.adelinekoh.org/.
Macklin, Angelica. “I Saw You On The Radio!” Vimeo. 2011. Accessed May 9, 2012. http://vimeo.com/24484214.
McPherson, Tara. “Why is the Digital Humanities So White?, or, Thinking the Histories of Race and Computation.” In Debates in the Digital Humanities, edited by Matthew K. Gold, 138–160. Minneapolis: Minnesota University Press, 2012.
Women Who Rock Project: Making Scenes, Building Communities. 2012. Accessed May 9, 2012. http://womenwhorockcommunity.org/.
Published by the Dartmouth College Library. http://journals.dartmouth.edu/joems/ Article DOI: 10.1349/PS1.1938-6060.A.425
work_fqxsysypnfaf7cgja5pr7o75lu ---- DOI: 10.12862/Lab16CVN
I manoscritti vichiani della Biblioteca Nazionale di Napoli “Vittorio Emanuele III” – Le “Carte Villarosa”: Sei fascicoli di carte vichiane varie non rilegate (Ms. XIX, 42). Nota editoriale e indici. Laboratorio dell’ISPF, XIII, 2016.
FOREWORD
The so-called Carte Villarosa are an extremely evocative collection for the Vico scholar, and at the same time an utterly indispensable one. It is impossible to work on Vico’s texts without having at least once spent time reading and deciphering the papers of the Villarosa bequest, which is composed of the papers inherited by the marquis Carlantonio De Rosa directly from Giambattista’s son, Gennaro Vico. Fausto Nicolini1 succinctly reconstructs the history of the family, recalling its Abruzzese origin and evoking its founder, the first marquis of Villarosa, Carlantonio (1638-1712), a man of the law in the family tradition, regent of the Collaterale in Naples and a friend of Antonio Vico, Giambattista’s father.
To their friendship Nicolini attributes the father’s decision to direct the young Giambattista toward legal studies, taken precisely on the advice of the marquis, to whom Antonio had confided his worries. From him then descended the Carlantonio who was Vico’s pupil in 1738 and later a lawyer. But it was another Carlantonio (1762-1847), fifth marquis of Villarosa, who first collected and published the Vichian Opuscoli; a passionate bibliophile, he received from the hands of the now elderly Gennaro Vico what little of his father Gennaro had been able to gather, and in addition he began to tour public and private libraries, or to charge friends with doing so elsewhere, in order to put together the extraordinary collection that the Biblioteca Nazionale di Napoli “V. Emanuele III” preserves in manuscript form. The bequest, together with what he had variously recovered, later became printed material and constituted the first collection of Vico’s works, composed of four volumes edited by Villarosa himself.
The most precious Carte Villarosa2, kept in the evocative setting of the Manuscripts Section of the Neapolitan library, are gathered in six fascicles and twelve codices, and represent the principal material on which the text-critical work of Vico’s critical editors is conducted, work which in turn finds its form and place in the volumes of the critical edition of the opera omnia carried out since 1982 by the Istituto per la storia del pensiero filosofico e scientifico moderno of the Consiglio nazionale delle ricerche.
Manuela Sanna
1 B. Croce, Bibliografia vichiana accresciuta e rielaborata da Fausto Nicolini, Naples, Ricciardi, 1947, pp. 135-138.
2 A first description may be found in the Catalogo vichiano napoletano, ed. M. Sanna, supplement to the «Bollettino del Centro di studi vichiani», XVI, 1986, alongside the Catalogo della Mostra bibliografico-documentaria in occasione delle Onoranze a Vico nel II centenario della nascita, ed. G. Guerrieri, Naples, 1968.
EDITORIAL NOTE
The contents of the “Carte Villarosa” (Ms. XIX, 42 of the Biblioteca Nazionale di Napoli) have been classified and catalogued descriptively. In the notes attached to the individual materials – identified by the shelfmark and fascicle number, followed by the numbering resulting from the library ordering found in the margin – reference will be made to the Catalogo vichiano napoletano, ed. M. Sanna, Naples, Bibliopolis, 1986 (pp. 501-505), with the abbreviation “CVN”; any references to B. Croce - F. Nicolini, Bibliografia vichiana, Naples, Ricciardi, 1947, vol. I, and to the catalogue of the Mostra bibliografica documentaria in occasione delle “Onoranze a Vico nel II centenario della nascita”, ed. G. Guerrieri, Naples, L’Arte tipografica, 1968, will be given with the abbreviations “C-N” and “Guerr.” respectively. Only for fascicles I and III, containing respectively Verses and inscriptions and Epistles, which are listed individually, has it been decided to record also the principal printed editions in which the various materials have been published.
The references, given in abbreviated form, correspond to:
- Ultimi onori di letterati amici in morte di Angela Cimini, in Napoli, nella stamperia di Felice Mosca, 1727;
- Opuscoli di Giovanni Battista Vico raccolti e pubblicati da Carlantonio de Rosa marchese di Villarosa, Napoli, presso Porcelli, 1819;
- Opuscoli di Giambattista Vico, nuovamente pubblicati con alcuni scritti inediti da Giuseppe Ferrari, Milano, Società tipografica de’ Classici italiani, 1836;
- Opuscoli vari di Giambattista Vico, cioè Scritti scientifici, orazioni, iscrizioni e poesie, Napoli, Jovene, 1840;
- G. Vico, L’autobiografia, il carteggio e le poesie varie, a cura di B. Croce e F. Nicolini, seconda edizione, Bari, Laterza, 1929;
- G. Vico, Versi d’occasione e scritti di scuola, con appendice e bibliografia generale delle opere a cura di Fausto Nicolini, Bari, Laterza, 1941;
- G. Vico, Scritti vari e pagine sparse, Bari, Laterza, 1941;
- G. Vico, Epistole, con aggiunte le epistole dei suoi corrispondenti, a cura di M. Sanna, Napoli, Morano, 1993;
- G. Vico, Minora. Scritti latini, storici e d’occasione, a cura di G. G. Visconti, Napoli, Guida, 2000.
In fascicle III, Letters by Vico and to Vico or concerning Vico, the epistles lacking an addressee or sender are indicated in angle brackets, and those not addressed to Vico but concerning him in round brackets.
This publication is part of the project of electronic edition of the Vichian manuscripts of the Biblioteca Nazionale di Napoli, curated by the Centro di Umanistica Digitale of the ISPF-CNR on material acquired thanks to POR-FESR Campania 2007-2013. The main collaborators were Roberto Evangelista (fascicles V and VI and revision), Assunta Sansone (fascicles I and III), Roberta Visone (fascicles II and IV), and Ruggero Cerino (technical support). Coordination by Leonardo Pica Ciamarra. Scientific supervision by Manuela Sanna. Thanks are due to Mariolina Rascaglia of the Biblioteca Nazionale di Napoli for her precious advice in preparing the material for reproduction.
INDEX
N.B. Below is the list of all the contents of the collection, divided by fascicle. Clicking on a fascicle’s heading opens it in another window. Within each fascicle, the “Bookmarks” function gives access to an interactive index of the contents. Since the originals differ greatly in size and are reproduced on a uniform horizontal template, the reader is advised to set the most convenient zoom level for each item.
FASCICLE I: Verses and inscriptions by Vico and to Vico (Versi ed iscrizioni del Vico e al Vico)
Ammiravo già un tempo Roma e Atene
Con mano al re quelle gran vie far note
Con sue alte ampie moli, e sterminate
Con voi m’allegro, o figlio alme di Giove
Del fier perduto Mondo i Primi Vati
Divina Rosa d’un eterno Aprile
Due candide Colombe a Dio dilette
In Coppia ricca di valor latino
O Bel Trionfo, a cui vado favore
O Sovrano, Real Lione alato
Pregio Sommo e Sovran del secolo nostro
Sommo Genio Sovran d’eroi famosi
Un Nume io vidi in spoglia di pastore
Vaga Colomba, che con spedit’ali
Venere mentre a le sue Grazie unita
Heheu Dalmarsus summi pars magna Senatus
Questi d’alti immortal Cigni canori
Gran Vico, che tra l’altre avare ingiuste
---
A’ miei sudori il Ciel non temprò ingiuste
Piena di giusto sdegno al mio pensiero
Nestora non laudet non Graeca docta Periclem
Guari non fia che ’l mio vario destino
O divino Uomo, o glorioso, e grande
Quell’ardente desio alto e immortale
Garzon sublime, e pien d’animo grande
O Mastro egregio di più elette Rime
Questo spirto divino alto e immortale
Veggio la Fama tua che ’l Mondo a pieno
Da l’innesto real nato è ’l germoglio
Sommo, e sovran del secolo nostro onore
Mentre obliando sulle usate piume
Desta da Giove, in pria si volse a lui
Contro un meschino il Fato armossi e ’n lui
Né superbo Lavor, né Marmi incisi
Tornò al Ciel la gran donna e saggia e forte
Io, che m’induro incontro a Morte e innaspro
Vico, che per sermone eletto, e saggio
Il cieco insano vulgo estima uom saggio
De mente heroica
Festa dies oritur, discurrant undique laeti
Almae quid facerent, rogo, sorores
Blancardi, mihi amore singulari
Ab Siculis oris ad nostra Fasque, Fidesque
Capassi, socium meorum ocellus
Cyrille, o prope corculum Minervae
Iam redit alma dies, qua errantia Lumina Caeli
Quidnam saeva sedens Martis super arma Hymenaeus
Quid fit, Musae innuptae recinant Hymenaea
Mens facta ad verum, cui plenum pectus honesti
Musa tibi adspirat, Vates, argute, jocisque
AFFETTI DI UN DISPERATO
CANZONE IN MORTE DEL SIGNOR CONTE D. ANTONIO CARAFFA
Canzone di Giambattista Vico nella promozione della santità di Clemente XII
Inscriptions
Inscription with which Vico accompanied a copy of the work De universo Jure sent as a gift to Prince Eugene of Savoy
On the death of Marquis Orazio Rocca
Inscription for the tomb of Cardinal Innico Caracciolo
For the building of the bridge near Ravenna and for the construction of other works on the rivers Ronco and Montone
Inscription made for an arch to be erected for the Most Serene Infante of Spain Don Carlos
For the appointment of Philip of Bourbon the younger as generalissimo of the Spanish expeditionary corps in Italy
On the death of Prince Francesco Caracciolo
On the death of James III Stuart
On the death of Francesco Boncore
Inscription for the new palazzo built by Luigi Molinelli
Two inscriptions on the death of Duke Argento
Four inscriptions for the wedding of Charles of Bourbon and Maria Amalia Walburga
Two inscriptions on the death of Caterina d’Aragona
FASCICLE II: Fragments of various writings by Vico (Frammenti di scritti vari del Vico)
1. Apograph of the Orazione for the departure of the Count of S. Stefano
2. Two apographs of the Parthenopea Conjuratione
3. Autograph of the Emendationes in Historiam Caraphae
4. Loose sheet containing Ad lectores aequanimos, dating back to a first draft of the Diritto universale
5. Autograph translation of Le Clerc’s articles on the Diritto universale
6. Apograph dedication prefixed to the compositions for the wedding of Adriano Carafa and Teresa Borghese
7. Translation of the aforementioned articles by Le Clerc on the Diritto universale
8. Addition to the Autobiography
9. Dedication of the De Aequilibrio corporis animantis to Charles of Bourbon
10. Loose sheet on which is pasted the «dipintura» placed before the Scienza Nuova, 1730 edition, with an autograph note
11. Loose sheet with autograph notes by Vico: instructions for the second edition of the Scienza Nuova
12. Autograph loose sheet: the epistle «Ex Bernardi Tanucci…»
13. Editio princeps of the De mente heroica, Dissertatio habita in regia Academia Neapolitana, Naples, Johannes Franciscus Pacius, regia universitatis typographus, publica auctoritate excudebat, 1732
FASCICLE III: Letters by Vico and to Vico or concerning Vico (Lettere del Vico e al Vico o riguardanti Vico)
From Nicola Galizia
(From Giovanni Crisostomo Damasceno)
From Bernardo Maria Giacco
From Bernardo Maria Giacco
From Biagio Garofalo
From Tommaso Maria Minorelli
From Bernardo Maria Giacco
From Bernardo Maria Giacco
From Jean Leclerc
From Bernardo Maria Giacco
From Cardinal Corsini
From Giovan Artico, Count of Porcia
From Lorenzo Corsini
From Edouard de Vitry
To Edouard de Vitry
From Lorenzo Corsini
From Giuseppe Athias
From Giovan Artico di Porcia
From Antonio Corsini
From Antonio Conti
From Giovan Artico di Porcia
From Francesco Saverio Estevan
From Francesco Saverio Estevan
From Tommaso Russo
To Tommaso Russo
From Domenico Lodovico
From Nicola Gaetani di Laurenzano
From Niccolò Giovo
To Niccolò Giovo
From Niccolò Concina
From Tommaso Maria Alfani
<From Tommaso Maria Alfani>
From Tommaso Maria Alfani
From Daniele Concina
From Joseph Joachim de Montealegre
From Joseph Joachim de Montealegre
From Niccolò Concina
From Muzio Gaeta
(From Isabella Pignone del Carretto)
From Francesco Serao
From Francesco Serao
From Michelangelo Franceschi
FASCICLE IV: Various papers from Vico’s school (Carte varie della scuola del Vico)
1. Apograph: Institutionum Oratoriarum liber unus: exposuit utriusque iuris doctor J. B. a Vico…, 1711
2. Two autograph loose sheets containing Oratiunculae pro adsequenda laurea in utroque iure, according to Villarosa’s definition
3. Apograph notebooks and loose sheets of various sizes, repeatedly bearing the indication «G. B. Vico, 1738»; notes from Vico’s lectures
FASCICLE V: A commissioned work (Un’opera per commissione), autograph manuscript with occasional apograph corrections
Ragionamento primo: L’acquisto delle scienze… tutt’altro necessarissimo ad un giovane nobile
Ragionamento secondo: Per istradare i nobili giovanetti all’acquisto delle cosiddette scienze
FASCICLE VI: Various papers relating to Vico’s life and fortune (Carte varie relative alla vita e alla fortuna del Vico)
1. Brief note of arguments for Don G. B. Vico against the magnificent lady Caterina Tommaselli
2. Apograph of the life of G. B. Vico napolitano written by the lawyer N. Sala
3. Various autograph drafts of inscriptions composed by Gennaro Vico for his father
4. Manuscript copy of Fabbroni’s Vita of Vico
5. Two drafts of the fragment of Gennaro Vico’s report for a planned edition of his father’s works
6. Notes by Francesco Daniele on the manner…
7. An anonymous apology for Vico’s Catholicism
8. Copy of a review of Vico’s Opuscoli published by the Marquis of Villarosa
https://rep.giambattistavico.it:9000/rpc/cat/repository/manoscritti/BNN_Ms_XIX_42_04/index.html
https://rep.giambattistavico.it:9000/rpc/cat/repository/manoscritti/BNN_Ms_XIX_42_05/index.html
https://rep.giambattistavico.it:9000/rpc/cat/repository/manoscritti/BNN_Ms_XIX_42_06/index.html
work_fs52igq465cbtkoirnfkiz766e ---- Analysis of Weight Distribution in Terms of Forces and Torques during Lifting Weight Using Digital Human Modelling
Zafar Ullah* and Shahid Maqsood
University of Engineering and Technology, Peshawar 25000, Pakistan
Correspondence to: Ullah Z, University of Engineering and Technology Peshawar 25000, Pakistan, Tel: +92 03329278262; E-mail: zafarullah631@yahoo.com
Received: October 18, 2018; Accepted: March 19, 2019; Published: March 27, 2019
Citation: Ullah Z, Maqsood S (2019) Analysis of Weight Distribution in Term of Forces and Torques during Lifting Weight using Digital Human. J Ergonomics 9:243. doi:10.35248/2165-7556.19.9.243
Copyright: © 2019 Ullah Z, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
ABSTRACT
Construction activities performed by workers are usually repetitive and physically demanding. Execution of such tasks in awkward postures can strain the body parts and can result in fatigue, back pain or, in severe cases, permanent disabilities. In view of this, Digital Human Modelling (DHM) technology offers ergonomics experts an efficient means of studying the kinematic characteristics of lifting heavy weights in different postures. The objective of this paper is to analyse and calculate the forces and torques on the different body parts during the lifting of weights in four different postures using Digital Human Modelling software. For this purpose, four different lifting postures were analysed and the forces and torques were calculated. It was identified that changing the posture considerably minimizes the redundant stresses on the body muscles.
Keywords: Musculoskeletal disorders; Lifting task; Lower back pain
INTRODUCTION
The International Labour Organization (ILO) estimates that some 2.3 million women and men around the world succumb to work-related accidents or diseases every year; this corresponds to over 6000 deaths every single day. Worldwide, there are around 340 million occupational accidents and 160 million victims of work-related illnesses annually [1]. Over the years, manufacturing companies have taken ergonomics and usability as basic parameters of quality for their products [1]. The design approach has been reviewed, giving extensive consideration to end users' needs, requests, and limitations. For this reason, increasing attention is currently devoted to ergonomics and human factors evaluations even from the early stages of the design process [2-4]. Digital Mock-Ups (DMUs) provided by many computer aided engineering applications enable manufacturers to design a digital prototype of a product in full detail, simulating its functions and predicting interaction among its different components [5-8]. The production of physical prototypes, which is a very time consuming task, is then deferred to the final stages of the design process [9]. In order to take advantage of digital simulations to conduct ergonomic assessments (computer aided ergonomics), digital substitutes of human beings capable of interacting with the DMUs in the simulation environment are required [10,11]. This has given birth to the so-called Digital Human Modelling (DHM), which has led to the development of many software tools [10,12,13]. These tools are mainly used to study human-product and human-process interaction and to conduct ergonomic and biomechanical analyses, as well as manual process simulations, even before the physical prototype is available.
DMUs, together with digital human models, are increasingly used in order to reduce the development time and cost, as well as to facilitate the prediction of performance and/or safety [14]. The ergonomic design methodology relying on digital human models makes the iterative process of design evaluation, diagnosis and review more rapid and economical [15,16]. It also increases quality by minimizing redundant changes and improves the safety of products by eliminating ergonomics-related problems [17,18]. Furthermore, with the rise of the fourth industrial revolution (Industry 4.0), the concept of the virtualization of manufacturing processes has gained greater importance. In this context, human simulation in production activities will certainly play a significant role [19]. These digital humans, provided by many process simulation software packages, are essentially kinematic chains consisting of several segments and joints [20]. In view of this, digital human modelling software helps to construct a human replica within the software, and the analysis is made on the mannequins during the lifting task to calculate the forces and torques.
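Since the paper treats the digital human as a kinematic chain loaded statically by gravity, the kind of per-joint torque reported by an ergonomics tool can be illustrated with a small first-principles calculation. The sketch below is not the HumanCAD/Mannequin Pro implementation (which is proprietary); it is a minimal planar model, with segment lengths, masses, and angles chosen only for illustration, that sums the gravitational moments of the distal segments and a 20 kg hand load about each joint.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

# Hypothetical planar chain: (name, length m, mass kg, absolute angle from
# the horizontal in degrees). Values are illustrative placeholders, not
# anthropometric data taken from the paper.
SEGMENTS = [
    ("upper_arm", 0.30, 2.1, -60.0),
    ("forearm",   0.27, 1.2, -20.0),
    ("hand",      0.10, 0.5,   0.0),
]
LOAD_KG = 20.0  # the concrete block held in the hand, as in the study


def static_joint_torques(segments, load_kg):
    """Return the gravity torque (N*m) each joint must resist.

    Each segment's weight acts at its midpoint; the external load acts at
    the tip of the last segment. The torque about a joint is the sum of
    weight times horizontal moment arm for everything distal to it.
    """
    torques = []
    n = len(segments)
    for i in range(n):
        x = 0.0        # horizontal distance from joint i, walking distally
        torque = 0.0
        for j in range(i, n):
            name, length, mass, ang = segments[j]
            dx = length * math.cos(math.radians(ang))
            torque += mass * G * (x + 0.5 * dx)  # segment COM moment
            x += dx                              # advance to next joint/tip
        torque += load_kg * G * x                # hand load at the chain tip
        torques.append((segments[i][0] + "_joint", torque))
    return torques


if __name__ == "__main__":
    for joint, tq in static_joint_torques(SEGMENTS, LOAD_KG):
        print(f"{joint:>16s}: {tq:7.2f} N*m")
```

Run on these made-up values, the most proximal joint carries the largest moment, which mirrors the paper's observation that, in the whole-body model, stress concentrates in the most proximal load-bearing regions (pelvis and thorax).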
RESULTS OF DIGITAL HUMAN MODELLING The detailed forces and torques are provided in the static biomechanics tables (Tables 1-4). The postures taken are replicas of real-life workers lifting blocks. Four mannequins were created and assigned to pick up a 20 kg concrete block, with the masses acting as weights due to gravity. In HumanCAD the Static Biomechanics Ergo Tool was applied and all the forces and torques are displayed on the window screen. The details of the static biomechanical stresses are given in Tables 1-4. Table 1 shows the static biomechanical stresses on the different body parts: the highest force is applied at the pelvis (359.049 N) and the second most load-bearing region is the thorax (268.708 N). Similarly, the highest positive torque acts on the pelvis (183.927 Nm) and the second highest (167.889 Nm) on the thorax. The line graph in Figure 5 shows that most of the stresses are concentrated in the pelvic region.

Table 1: Static biomechanical forces of posture 1.
Segment | Force (N) | Torque (Nm)
Head | 65.629 | 0
Left Arm | 24.356 | 45.807
Left Foot | 17.682 | 0.475
Left Forearm | 10.518 | 36.547
Left Palm | 7.317 | 9.886
Left Shank | 49.872 | 12.144
Left Thigh | 121.998 | 23.426
Pelvis | 359.049 | 183.927
Right Arm | 25.267 | 38.982
Right Foot | 17.682 | 1.087
Right Forearm | 11.429 | 36.206
Right Palm | 105.317 | 9.584
Right Shank | 49.872 | 4.817
Right Thigh | 121.998 | 30.857
Thorax | 268.708 | 167.889

Table 2 shows the static biomechanical stresses on the different body parts: the highest force is applied at the pelvis (359.049 N) and the second most load-bearing region is the thorax (268.708 N). Similarly, the highest positive torque acts on the pelvis (122.721 Nm) and the second highest (103.175 Nm) on the thorax. The line graph in Figure 6 shows that most of the stresses are concentrated in the pelvic region.

Table 2: Static biomechanical forces of posture 2.
Segment | Force (N) | Torque (Nm)
Head | 65.629 | 0
Left Arm | 24.356 | 51.533
Left Foot | 17.682 | 1.147
Left Forearm | 10.518 | 37.884
Left Palm | 7.317 | 10.983
Left Shank | 49.872 | 3.71
Left Thigh | 121.998 | 32.084
Pelvis | 359.049 | 122.721
Right Arm | 25.267 | 41.744
Right Foot | 17.682 | 0.468
Right Forearm | 11.429 | 31.72
Right Palm | 95.317 | 7.335
Right Shank | 49.872 | 13.93
Right Thigh | 121.998 | 3.682
Thorax | 268.708 | 103.175

Table 3: Static biomechanical forces of posture 3.
Segment | Force (N) | Torque (Nm)
Head | 65.629 | 0
Left Arm | 24.356 | 16.562
Left Foot | 17.682 | 1.145
Left Forearm | 10.518 | 15.321
Left Palm | 7.317 | 6.36
Left Shank | 49.872 | 1.145
Left Thigh | 121.998 | 2.777
Pelvis | 359.049 | 103.136
Right Arm | 25.267 | 15.674
Right Foot | 17.682 | 1.094
Right Forearm | 11.429 | 15.424
Right Palm | 7.317 | 5.605
Right Shank | 49.872 | 1.094
Right Thigh | 121.998 | 2.626
Thorax | 268.708 | 112.915

Table 4: Static biomechanical forces of posture 4.
Segment | Force (N) | Torque (Nm)
Head | 65.629 | 0
Left Arm | 24.356 | 37.216
Left Foot | 17.682 | 1.023
Left Forearm | 10.518 | 26.133
Left Palm | 7.317 | 8.598
Left Shank | 49.872 | 6.64
Left Thigh | 121.998 | 26.871
Pelvis | 359.049 | 160.717
Right Arm | 25.267 | 29.889
Right Foot | 17.682 | 0.965
Right Forearm | 11.429 | 17.848
Right Palm | 105.317 | 7.426
Right Shank | 49.872 | 6.631
Right Thigh | 121.998 | 26.856
Thorax | 268.708 | 156.112

Figure 5: Static biomechanical graph of posture 1. Figure 6: Static biomechanical graph of posture 2. Table 3 shows the static biomechanical stresses on the different body parts: the highest force is applied at the pelvis (359.049 N) and the second most load-bearing region is the thorax (268.708 N).
Similarly, the highest positive torque acts on the thorax (112.915 Nm) and the second highest (103.136 Nm) on the pelvis. The line graph in Figure 7 shows that most of the stresses are concentrated in the pelvic region. Figure 7: Static biomechanical graph of posture 3. Table 4 shows the static biomechanical stresses on the different body parts: the highest force is applied at the pelvis (359.049 N) and the second most load-bearing region is the thorax (268.708 N). Similarly, the highest positive torque acts on the pelvis (160.717 Nm) and the second highest (156.112 Nm) on the thorax. The line graph in Figure 8 shows that most of the stresses are concentrated in the pelvic region. The forces for the four postures are compared in Table 5 below, and the torques in Table 6. Figure 8: Static biomechanical graph of posture 4.

Table 5: Comparison of the forces across the four postures.
Segment | Figure 1 Force (N) | Figure 2 Force (N) | Figure 3 Force (N) | Figure 4 Force (N)
Head | 65.629 | 65.629 | 65.629 | 65.629
Left Arm | 24.356 | 24.356 | 24.356 | 24.356
Left Foot | 17.682 | 17.682 | 17.682 | 17.682
Left Forearm | 10.518 | 10.518 | 10.518 | 10.518
Left Palm | 7.317 | 7.317 | 7.317 | 7.317
Left Shank | 49.872 | 49.872 | 49.872 | 49.872
Left Thigh | 121.998 | 121.998 | 121.998 | 121.998
Pelvis | 359.049 | 359.049 | 359.049 | 359.049
Right Arm | 25.267 | 25.267 | 25.267 | 25.267
Right Foot | 17.682 | 17.682 | 17.682 | 17.682
Right Forearm | 11.429 | 11.429 | 11.429 | 11.429
Right Palm | 105.317 | 95.317 | 7.317 | 105.317
Right Shank | 49.872 | 49.872 | 49.872 | 49.872
Right Thigh | 121.998 | 121.998 | 121.998 | 121.998
Thorax | 268.708 | 268.708 | 268.708 | 268.708

Table 6: Comparison of the torques across the four postures.
Segment | Figure 1 Torque (Nm) | Figure 2 Torque (Nm) | Figure 3 Torque (Nm) | Figure 4 Torque (Nm)
Head | 0 | 0 | 0 | 0
Left Arm | 45.807 | 51.533 | 16.562 | 37.216
Left Foot | 0.475 | 1.147 | 1.145 | 1.023
Left Forearm | 36.547 | 37.884 | 15.321 | 26.133
Left Palm | 9.886 | 10.983 | 6.36 | 8.598
Left Shank | 12.144 | 3.71 | 1.145 | 6.64
Left Thigh | 23.426 | 32.084 | 2.777 | 26.871
Pelvis | 183.927 | 122.721 | 103.136 | 160.717
Right Arm | 38.982 | 41.744 | 15.674 | 29.889
Right Foot | 1.087 | 0.468 | 1.094 | 0.965
Right Forearm | 36.206 | 31.72 | 15.424 | 17.848
Right Palm | 9.584 | 7.335 | 5.605 | 7.426
Right Shank | 4.817 | 13.93 | 1.094 | 6.631
Right Thigh | 30.857 | 3.682 | 2.626 | 26.856
Thorax | 167.889 | 103.175 | 112.915 | 156.112

DISCUSSION Musculoskeletal disorders are noted as a result of the presence of different risk factors, including contact stress, force, vibration, repetition and jobs that put muscles under redundant physical forces. The proposed study shows that changing the posture significantly changes these stresses. As Table 5 shows, the applied segment forces are identical across the four postures except at the right palm, where postures 1 and 4 carry the highest force (105.317 N), posture 2 slightly less (95.317 N) and posture 3 by far the least (7.317 N). The torques in Table 6 vary more markedly: the pelvic torque is highest in posture 1 (Figure 1), followed by posture 4 (Figure 4) and posture 2 (Figure 2), with posture 3 (Figure 3) ergonomically the least stressful. In every posture the torque produced in the body is concentrated in the pelvic region. As is clear from Figures 9 and 10, most of the force and positive torque is concentrated in the pelvis, which is the most sensitive region of the human skeletal system. Figure 9: Static biomechanical graph of the forces. Figure 10: Static biomechanical graph of the torques.
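The posture comparison can also be read directly off the data; a small sketch, using the pelvic torque values taken from Table 6:

    # Rank the four postures by the torque concentrated at the pelvis,
    # the most heavily loaded region in every trial (values from Table 6).
    pelvis_torque_nm = {"posture 1": 183.927, "posture 2": 122.721,
                        "posture 3": 103.136, "posture 4": 160.717}

    for posture, torque in sorted(pelvis_torque_nm.items(), key=lambda kv: kv[1]):
        print(f"{posture}: {torque} Nm")
    # Prints posture 3 first (lowest pelvic torque) and posture 1 last (highest).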
CONCLUSION Through the HumanCAD tool, the static biomechanical stress distributions were calculated. In industrially developing countries like Pakistan, exposure to MSD risks appears to be severe, mainly because of the untrained workforce and the absence of labour law enforcement. The conclusion drawn is that, although many studies have shown a significant relationship between manual labour and MSDs, in industrially developing countries people are put to work without knowing the physical demands of the new job. In this regard, there is a dire need for medical and physical examinations as a prerequisite for new jobs. In addition, workers should be trained in basic ergonomics before they are exposed to manual material handling. REFERENCES 1. Kaulio MA. Customer, consumer and user involvement in product development: A framework and a review of selected methods. Total Quality Management. 1998;9:141-149. 2. Stanton NA, Salmon PM, Rafferty LA, Walker GH, Baber C, Jenkins DP. Human factors methods: A practical guide for engineering and design. CRC Press. 2017. 3. Shackel B. Ergonomics in information technology in Europe: a review. Behav Inf Technol. 1985;4:263-287. 4. Martinsons MG, Chong PK. The influence of human factors and specialist involvement on information systems success. Human Relations. 1999;52:123-152. 5. De Sa AG, Zachmann G. Virtual reality as a tool for verification of assembly and maintenance processes. Computers & Graphics. 1999;23:389-403. 6. Stark R, Krause FL, Kind C, Rothenburg U, Müller P, Hayka H, et al. Competing in engineering design: the role of Virtual Product Creation. CIRP Journal of Manufacturing Science and Technology. 2010;3:175-184. 7. Dolezal WR. Success factors for digital mock-ups (DMU) in complex aerospace product development. Technische Universität München. 2008. 8. Mourtzis D, Papakostas N, Mavrikios D, Makris S, Alexopoulos K. The role of simulation in digital manufacturing: applications and outlook. Int J Comput Integr Manuf. 2015;28:3-24. 9. Whiteside J, Bennett J, Holtzblatt K. Usability engineering: Our experience and evolution. Handbook of human-computer interaction. Elsevier. 1998. 10. Pelliccia L, Klimant F, De Santis A, Di Gironimo G, Lanzotti A, Tarallo A, et al. Task-based motion control of digital humans for industrial applications. Procedia CIRP. 2017;62:535-540. 11. Magistris GD, Micaelli A, Savin J, Gaudez C, Marsot J. Dynamic digital human models for ergonomic analysis based on humanoid robotics techniques. Int J Digital Human. 2015;1:81-109. 12. Di Gironimo G, Pelliccia L, Siciliano B, Tarallo A. Biomechanically-based motion control for a digital human. Int J Interact Des Manuf. 2012;6:1-13. 13. Magistris G, Micaelli A, Evrard P, Andriot C, Savin J, Gaudez C, et al. Dynamic control of DHM for ergonomic assessments. Int J Ind Ergon. 2013;43:170-180. 14. Ma L, Chablat D, Bennis F, Zhang W, Guillaume F. A new muscle fatigue and recovery model and its ergonomics application in human simulation. Virtual Phys Prototyp. 2010;5:123-137. 15. Rasmussen J. Skills, rules, and knowledge; signals, signs, and symbols, and other distinctions in human performance models. IEEE Transactions on Systems, Man, and Cybernetics. 1983;13:257-266. 16. Maguire M. Methods to support human-centred design. Int J Hum Comput Stud. 2001;55:587-634. 17. Demirel HO, Duffy VG. Applications of digital human modeling in industry. International Conference on Digital Human Modeling. Springer. 2007. 18. MacLeod D.
The ergonomics edge: Improving safety, quality, and productivity. John Wiley & Sons, US. 1994. 19. Hai Z. Development of smart industry maturity model. Master's Thesis, University of Twente. 2017. 20. Aggarwal JK, Cai Q. Human motion analysis: A review. Comput Vis Image Underst. 1999;73:428-440. work_ftbmk33innhhdebs3n535qcp6a ---- Realizing Lessons of the Last 20 Years: A Manifesto for Data Provisioning & Aggregation Services for the Digital Humanities (A Position Paper) D-Lib Magazine July/August 2014 Volume 20, Number 7/8 Dominic Oldman, British Museum, London Martin Doerr, FORTH-ICS, Crete Gerald de Jong, Delving BV Barry Norton, British Museum, London Thomas Wikman, Swedish National Archives Point of Contact: Dominic Oldman, doint@oldman.me.uk doi:10.1045/july2014-oldman Abstract The CIDOC Conceptual Reference Model (CIDOC CRM) is a semantically rich ontology that delivers data harmonisation based on empirically analysed contextual relationships rather than relying on a traditional fixed field/value approach, overly generalised relationships or an artificial set of core metadata. It recognises that cultural data is a living, growing resource and cannot be commoditised or squeezed into artificial pre-conceived boxes. Rather, it is diverse and variable, containing perspectives that incorporate different institutional histories, disciplines and objectives. The CIDOC CRM retains these perspectives yet provides the opportunity for computational reasoning across large numbers of heterogeneous sources from different organisations, and creates an environment for engaging and thought-provoking exploration through its network of relationships. The core ontology supports the whole cultural heritage community, including museums, libraries and archives, and provides a growing set of specialist extensions. The increased use of aggregation services and the growing use of the CIDOC CRM have necessitated a new initiative to develop a data provisioning reference model targeted at solving fundamental infrastructure problems ignored by data integration initiatives to date. If data provisioning and aggregation are designed to support the reuse of data in research as well as general end user activities, then any weaknesses in the model that aggregators implement will have profound effects on the future of data-centred digital humanities work. While the CIDOC CRM solves the problem of quality and delivers semantically rich data integration, this achievement can still be undermined by a lack of properly managed processes and working relationships between data providers and aggregators. These relationships hold the key to sustainability and longevity because, done properly, they encourage the provider to align their systems, knowing that the effort will provide long-lasting benefits and value.
Equally, end user projects will be encouraged to cease perpetuating the patchwork of short-life digital resources that can never be aligned and which condemn the digital humanities to a pseudo and predominantly lower-quality discipline. Introduction This paper addresses the complex issues of large-scale cultural heritage data integration, crucial for progressing digital humanities research and essential to establishing a new scholarly and social relevance for cultural heritage institutions often criticised for being "increasingly captive by the economic imperatives of the marketplace and their own internally driven agendas1." It includes a discussion of the essential processes and necessary organisational relationships, data quality issues, and the need for wider tangible benefits. These benefits should extend beyond end user reuse services and include the capability to directly benefit the organisations that provide the data, providing a true test of quality and value. All these components are interdependent and directly affect the ability of any such initiative to provide a long-term and sustainable infrastructure on which evidence producers, information curators and evidence interpreters can rely, invest in and further develop. Many cultural data aggregation projects have failed to address these foundational elements, contributing instead to a landscape that is still fragmented, technology driven and lacking the necessary engagement from humanities scholars and institutions. Therefore this paper proposes a new reference model of data provision and aggregation services2, aiming to foster a more attractive and effective practice of cultural heritage data integration from a technological, managerial and scientific point of view. This paper is based on results and conclusions from recent work of the CIDOC CRM Special Interest Group (CRM SIG), a working group of CIDOC, the International Committee for Documentation of the International Council of Museums (ICOM). The Group has been developing the CIDOC Conceptual Reference Model (Doerr, 2003; Crofts et al., 2011) and has been providing advice on the integration of cultural heritage data over the past 16 years. Over this period the adoption of the CIDOC CRM has significantly increased, supported by enabling technology such as RDF stores3 and systems like Solr™4, with Graph5 databases providing further potential6. These systems have become mature and powerful enough to deal with the real complexity and scale of global cultural data. Consequently, issues of sustainable management of integrated resources are more urgent than ever before, and the Group is therefore calling for a collaborative community effort to develop a new Reference Model of Data Provision and Aggregation, which is based on a completely different epistemological paradigm compared to the well-known OAIS (Open Archival Information Service) Reference Model (Consultative Committee for Space Data Systems, 2009). Therefore, this paper reflects positions developed within the CRM SIG and by others on three important aspects of developing integrated cultural heritage systems and associated data provisioning processes. These positions, in summary, are: First, cultural heritage data provided by different organisations cannot be properly integrated using data models based wholly or partly on a fixed set of data fields and values, and even less so on 'core metadata'.
Additionally, integration based on artificial and/or overly generalised relationships (divorced from local practice and knowledge) simply creates superficial aggregations of data that remain effectively siloed, since all useful meaning is available only from the primary source. This approach creates highly limited resources unable to reveal the significance of the source information, support meaningful harmonisation of data or support more sophisticated use cases. It is restricted to simple query and retrieval by 'finding aids' criteria. Secondly, the same level of quality in data representation is required for public engagement as it is for research and education. The proposition that general audiences do not need the same level of quality and the ability to travel through different datasets using semantic relationships is a fiction and is damaging to the establishment of new and enduring audiences. Thirdly, data provisioning for integrated systems must be based on a distributed system of processes in which data providers are an integral part, and not on a simple and mechanical view of information system aggregation, regardless of the complexity of the chosen data models. This more distributed approach requires a new reference model for the sector. This position contrasts with many past and existing systems that are largely centralised and where the expertise and practice of providers is divorced. The effects of these issues have been clearly demonstrated by numerous projects over the last 20 years and continue to affect the value and sustainability of new aggregation projects, and therefore the projects that reuse the aggregated data. This paper is therefore particularly aimed at new aggregation projects that plan to allocate resources to data provisioning and aggregation and that wish to achieve sustainability and stability, but it also informs existing aggregators interested in enhancing their services. Background During the last two decades many projects have attempted to address a growing requirement for integrated cultural heritage data systems. By integrating and thereby enriching museum, library and archive datasets, quality digital research and education can be supported, making the fullest use of the combined knowledge accumulated and encoded by cultural heritage organisations over the last 30 years — much of this effort having been paid for by the public purse and by other humanities funding organisations. As a result the potential exists to restore the significance and relevance of these institutions in a wider and collaborative context, revitalising the cultural heritage sector in a digital environment. The ability to harmonise cultural heritage data such that individual organisational perspectives and language are retained, yet at the same time allowing these heterogeneous datasets to be computationally 'reasoned' over as a larger integrated resource, is one that has the potential to propel humanities research to a level that would attract more interest and increased investment. Additionally, the realisation of this vision provides the academy with a serious and coherent infrastructural resource that encodes knowledge suitable as a basis for advanced research and, crucially from a Digital Humanities7 perspective, based upon cross-disciplinary practices. As such it would operate to reduce the intellectual gap that has opened up between the academy and the cultural heritage sector.
Even though industrial and enterprise-level information integration has a successful 20-year history (Wiederhold, 1992; Gruber, 1993; Lu et al., 1996; Bayardo et al., 1997; Calvanese, Giacomo, Lenzerini, et al., 1998), to date very few projects, if any, attempting to deliver such an integrated vision for the cultural heritage sector have been able both to preserve meaning and to provide a sustainable infrastructure under which organisations could realistically align their own internal infrastructures and processes. Systems have lacked the necessary benefits and services that would encourage longer-term commitment, failed to develop the correct set of processes needed to support long-term data provisioning relationships, and neglected to align their services with the essential objectives of their providers. The situation reflects a wider problem of a structurally fragmented digital humanities landscape and an ever-widening intellectual gap with cultural heritage institutions, reflected rather than resolved by aggregation initiatives. This is despite the existence of very clear evidence (lessons learnt during the execution of these previous projects) for the reasons behind this failure. The three main lessons that this paper identifies are: 1. The nature of cultural heritage data (museum collection data, archives, bibliographic materials, but also more specialist scientific and research datasets) is such that it cannot be treated in the same way as warehouse data, administrative information or even library catalogues. It contains vast ranges of variability reflecting the different types of historical objects with their different characteristics, and is therefore also influenced by different scholarly disciplines and perspectives. Different institutional objectives, geography and institutional history also affect the data. There are no 'core' data. However, many project managers use these characteristics to claim that such complexity cannot be managed, or conversely that such rich data does not exist, or that it is impossible to create an adequate common data representation. These positions hinder research and development and limit engagement possibilities. 2. There is a frequent lack of understanding that cultural heritage data cannot be carved up into individual products to be shipped around, stored and individually 'consumed', like a sort of emotion stimulant coming to "fruition"8. Heritage data is rather the insight of research about relationships between a past reality and current evidence in the form of curated objects. Therefore it is only meaningful as long as the provenance and the connection to the material evidence in the hands of the curators are preserved. The different entities that exist within it are fundamentally related to each other, providing mutually dependent context. Heritage data is subject to continuous revision by the curators and private and professional researchers, and must reliably be related to previous or alternative opinions in order to be a true source of knowledge. Consequently, the majority of the data in current aggregation services, sourced from cultural heritage organisations, cannot be cited scientifically.
3. Sustaining aggregations using data sources from vastly different owning organisations requires an infrastructure that facilitates relationships with the local experts and evidence keepers to ensure the correct representation of data (such that it is useful to the community but also to the providing organisations themselves), and that also takes into account the changing nature of on-going data provision. Changes can occur at either end of data provisioning relationships, and therefore a system must be able to respond to likely changes, taking into account the levels of resources available to providers to maintain the relationship throughout. Aggregations must also include processes for directly improving the quality of data (i.e. using the enriched integrated resources created by data harmonisation) and feeding this back to institutional experts. These three issues are inextricably linked. Understanding the meaning of cultural heritage data and the practices of the owners of the data or of material evidence is essential in maintaining long-term aggregation services. Longevity requires that data be encoded in a way that provides benefits for all parties: the users of resulting services, the aggregators and the providers. Due to the same functional needs, such principles are self-evident, for instance, for the biodiversity community, expressed as the "principle of typification" for zoology in (ICZN, 1999) and its application in the collaboration of GBIF (Global Biodiversity Information Facility) with natural history museums, but they are not evident for the wider cultural heritage sector. This may be because, in contrast to cultural heritage information, misinterpretation of biodiversity data can have immense economic impact, for instance with pest species. By not addressing these problems, only short-lived projects can result that, in the end, consume far more resources than resolving these issues would require. The Historical Problem Consider this statement from 1995: "Those engaged in the somewhat arcane task of developing data value standards for museums, especially the companies that delivered collections management software, have long had to re-present the data, re-encode it, in order for it to do the jobs that museums want it to perform. It's still essentially impossible to bring data from existing museum automation systems into a common view for use for noncollections management purposes as the experience of the Museum Educational Site Licensing (MESL) and RAMA (Remote Access to Museum Archives) projects have demonstrated. Soon most museums will face the equally important question of how they can afford to re-use their own multimedia data in new products, and they will find that the standards we have promoted in the past are inadequate to the task." (Bearman, 1995) Both the projects cited in the quote are testament to the fact that managing data and operational relationships with cultural heritage organisations in support of collaborative networks is a complicated undertaking and requires a range of different skills. Some of the issues have been clarified and resolved by the passage of time, and we now have a better understanding of what benefits are practically realistic and desirable when forming new data collaborations. Yet new projects seem intent on replicating flawed approaches and repeating the mistakes of the past.9 The Continuing Problem Consider this opinion from the expert panel at the JISC Discovery Summit 2013:
"...developers are impatient and just want to get access to the data and do interesting things, and on the other side of the equation we have curators reasonably enough concerned about how that data is going to be used or misinterpreted or used incorrectly. I think that this is actually a difficult area because the conceptual reference models are generally more popular with the curators than with the developers [...It is not] clear to me ... how we solve the problem of engaging the people who want to do the development...through that mechanism, but nonetheless as this great experiment that we are living through at the moment with opening up data and seeing what can be done [...]unfolds, if we find that the risks are starting to become too great and the value is so poor because the data is being misused or used incorrectly or inappropriately, if that risk is a risk to society in general and not just to the curators...then we are going to have to find those kind of solutions." (JISC, 2013)10 This suggests a continuing problem and that digital representation of cultural heritage information is still determined by those who understand it least. It also suggests that cultural heritage experts have yet to engage with the issue of digital representation and continue to leave it to technologists. Nevertheless, why would a software developer believe that representing the semantics and contextual relationships between data is not as interesting (let alone crucially important) as representing data without them, and why do they determine independently the mode of representation in any event? Given the enormous costs involved in aggregating European data such a risk assessment suggested above, might reasonably be conducted up front, since the infrastructural changes needed to resolve the realisation of this risk would be almost impossible to implement. The delegates of the Summit agreed by a large majority that their number one concern was quality of metadata and contextual metadata, contrary to the view of some of the panel members11 — emphasising the gap between providers, users and aggregators. Ironically, it is those who advocate technology and possess the skills to use computers who seem most reluctant to explore the computer's potential for representing knowledge in more intelligent ways. As computer scientists regularly used to say, 'garbage in, garbage out' (GIGO)12. The value and meaning of the data should not be secondary or be determined by an intellectually divorced technological process.   Position 1 — The Nature of Cultural Heritage Data The RAMA project, funded in the 1990s by the European Commission, serves as a case study demonstrating how squeezing data into fixed models results in systems that ultimately provide no significant progress in advancing cultural heritage or scholarly humanities functions. Yet large amounts of scarce resources are invested in similar initiatives that can only provide additional peripherally useful digital references. The most recent and prominent example is the Europeana project13, which although technologically different to RAMA, retains some of the same underlying philosophy. The RAMA project proceeded on the premise that data integration could only be achieved if experts were prepared to accept a, "world where different contents could be moulded into identical forms", and not if, "one thinks that each system of representation should keep its own characteristics regarding form as well as contents" (Delouis, 1993). 
This is a view that is still widely ingrained in the heads of many cultural heritage technologists. While more recent aggregations, like Europeana, have made some use of knowledge representation principles and event-based concepts, they continue to use them in highly generalised forms and with fixed, core-field modelling. This is clearly wrong from both a scholarly and an educational perspective (as well as for subsequent engagement opportunities) and therefore results in wasteful technical implementations. Yet academics seem unable to deviate from (or fail to understand) this traditional view of data aggregation. The CIDOC CRM, which commenced development in the latter part of the 1990s after the failure of RAMA and MESL, is a direct answer to the "impossible" problem identified by Bearman and others. The answer, realised by several experts, was to stop the technically led preoccupation with fixed values and fields, which inevitably vary both internally and between organisations, and instead think about the relationships between things and the real-world context of the data. This not only places emphasis on the meaning of the data but also places objects, to a certain degree, back into their historical context. "Increasingly it seems that we should have concerned ourselves with the relationships...between the objects." (Bearman, 1995) This fundamentally different approach concentrates on generalisations determined not by committee but by many years of empirical analysis. It is concerned with contextual relationships that are mostly implicit but prominent in various disciplinary forms of digital documentation and associated research questions, and that cultural heritage experts in all fields are able to agree on. From this analysis a notion of context has been emerging which concentrates on interrelated events and activities connected to hierarchies of part-whole relationships of things, events and people, and things subject to chains of derivation and modification. This is radically different from seeking the most prominent values by which people are supposed to search for individual objects. This approach is highly significant for the digital humanities (see Unsworth (2002)) because it inevitably requires a collaborative shift of responsibilities from technologists to the experts who understand the data. It therefore also requires more engagement from museums and cultural heritage experts. However, the widening gap between the Academy and resource-poor memory institutions means that a solution requires clearly identified incentives to encourage this transfer of responsibility. It entails the alignment of different strategies and the ability to provide more relevant and useful services with inbuilt longevity. It must carry an inherent capacity to improve the quality of data and deliver all benefits cost-effectively. Given this, the alignment needs to start at the infrastructure level. The technical reasons why applications of the CIDOC CRM can be much more flexible, individual and closer to reality than traditional integration schemata, and yet allow for effective global access, are as follows. Firstly, the CRM extensively exploits generalization/specialization of relationships. Even though it was clearly demonstrated in the mid 1990s that this distinct feature of knowledge representation models is mandatory for effective information integration (Calvanese, Giacomo, Lenzerini, et al., 1998), it has scarcely been used in other schemata for cultural data integration14.
It ultimately enables querying and access to all data with unlimited schema specializations but by fewer implicit relationships15, and removes the need to mandate fixed data field schemas for aggregation. This is also substantially different from adding 'application profiles' to core fields (e.g. schema.org), where none of the added fields will reveal the fact of a relationship in a more generic query. Secondly, it foresees the expansion of relationships into indirections, frequently implying an intermediate event, and the deduction of direct relationships from such expansions. For instance, the size of an object can be described as a property of the object or as a property of a measurement of the object. The location of an object can be a property of the object or of a transfer of it. The expansion adds temporal precision. The deduction generalizes the question to any time, as a keyword search does. Modern information systems are well equipped to deal consistently with deductions, but no other documented schema for cultural data integration has made use of this capacity (Tzompanaki and Doerr, 2012; Tzompanaki et al., 2013). A sketch of how these two mechanisms look in practice follows at the end of this section. This paradigm shift means that, instead of the limitations imposed by using fixed fields for global access, the common interface for users is defined by an underlying system of reasoning that is invisible to the user (but is explicitly documented) and is crucially detached from the data entry format. It provides seemingly simple and intuitive generalizations of contextual questions. This use of algorithmic reasoning, which makes full use of the precise underlying context and relationships between entities, provides a far better balance and control of recall and precision and can be adjusted to suit different requirements. By representing data using a real-world semantic ontology, reflecting the practice and understanding of scholars and researchers, aggregation projects become more serious resources, and as a result their sustainability will become a more serious concern across the community. The enthusiasm of technologists and internal project teams is not sufficient for long-term sustainability, and corporate-style systems integration techniques are not appropriate for cultural heritage data. Just like the proliferation of data standards, often justified by small variations in requirements, isolated aggregations using the same justifications will also proliferate, affecting overall sustainability and diluting precious resources and financing. In contrast, an aggregation that supports and works with the variability of cultural heritage data and owning organisations, and that services a wider range of uses, stands a far better chance of long-term support. Other schemas, despite using elements of knowledge representation, are still created 'top down', perpetuate a belief in the need for 'core', and are inevitably flawed by a lack of understanding of knowledge and practice. It is far easier and quicker for technologists to make artificial assumptions about data, and mandate a new schema, than it is to develop a 'bottom up' understanding of how cultural heritage data is used in practice. However, the CRM SIG has completed this work, removing the need for further compromise in this field.
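To make the two mechanisms concrete, here is a minimal, self-contained sketch using Python and rdflib. The class and property names follow the published CIDOC CRM RDFS, but the two subPropertyOf statements are a tiny hand-copied fragment standing in for loading the full ontology, and the object, event and actor URIs are invented examples. A generic question, "which events involved this object?", reaches the specific production statement through the property hierarchy, which is the "fewer implicit relationships" point made above.

    from rdflib import Graph, Namespace
    from rdflib.namespace import RDF, RDFS

    CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")
    EX = Namespace("http://example.org/")  # hypothetical provider namespace

    g = Graph()

    # Event-based (indirect) description: maker and place are properties of
    # a production event, not of the object itself.
    g.add((EX.teaspoon, RDF.type, CRM["E22_Man-Made_Object"]))
    g.add((EX.making, RDF.type, CRM["E12_Production"]))
    g.add((EX.making, CRM["P108_has_produced"], EX.teaspoon))
    g.add((EX.making, CRM["P14_carried_out_by"], EX.silversmith))
    g.add((EX.making, CRM["P7_took_place_at"], EX.sheffield))

    # Fragment of the CRM property hierarchy: the specific 'has produced'
    # generalises, via 'brought into existence', to 'occurred in the
    # presence of'.
    g.add((CRM["P108_has_produced"], RDFS.subPropertyOf,
           CRM["P92_brought_into_existence"]))
    g.add((CRM["P92_brought_into_existence"], RDFS.subPropertyOf,
           CRM["P12_occurred_in_the_presence_of"]))

    # One generic query finds the production event without naming P108.
    q = """
    SELECT ?event WHERE {
      ?event ?p ex:teaspoon .
      ?p rdfs:subPropertyOf* crm:P12_occurred_in_the_presence_of .
    }
    """
    for row in g.query(q, initNs={"crm": CRM, "ex": EX, "rdfs": RDFS}):
        print(row.event)  # -> http://example.org/making

In a production system the hierarchy would be loaded from the published CRM RDFS file rather than asserted by hand, and an RDFS-aware triple store would make the subPropertyOf traversal implicit.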
Position 2 — Engagement Needs Real World Context A familiar argument put to the community by technologists is that creating resources using a semantic reference model is complicated and expensive, and that aggregations designed to satisfy a general audience do not need this level of sophistication. Moreover, the requirements of museum curators (see above) and other academics are not the same as those of the public, and the latter should be prioritised when allocating resources to create services. In other words, publishing data, in whatever form, is the most important objective. However, publishing data and communicating understanding are two completely different concepts, and humanities data can be impossible to interpret without meaningful context. This view also misunderstands the role of museums and curators, who are keepers of primary material evidence and hold a primary role in communicating with and engaging general audiences using rich contextual narratives. The only reference model that influences the design of current aggregation systems is the Reference Model for an Open Archival Information System (OAIS). It basically assumes that provider information consists of self-contained units of knowledge, such as scholarly or scientific publications, finished and complete with all necessary references. It assumes that they are finished products that have to be fixed and preserved for future reference. The utterly unfortunate choice of the term 'metadata' for cultural data, assuming that the material cultural object is the real 'data', actually degrades curatorial knowledge to an auxiliary retrieval function without scientific merit, as if one could 'read out' all curatorial knowledge just by contemplating the object, in analogy to reading a book. Consequently, a surface representation with limited resolution (a 3D model) is taken as a sufficient 'surrogate' for the object itself, the assumed real target of submission to the Digital Archive. The absence of a different type of reference model perpetuates this view in implementer and management circles. In reality, 'museum metadata' are the product: not a self-contained book, but rather the equivalent of the paragraphs, illustrations, headings and references of a much larger, 'living' book, the network of current knowledge about the past. The same holds for other fields of knowledge, such as biodiversity or geology data. Museum curators are skilful in representing objects using a range of different approaches, all of them more sophisticated than the presentation of raw object metadata. Their experience and practice has wider value for colleagues in schools, universities and other research environments. The reason why many curators have not engaged with technology is because of the limitations that it apparently presents in conveying the history and context in which objects were produced and used. Museums, by their nature, remove objects from their historical context and replace "history with classification". Curators, almost battling against the forces of their own environment, attempt to return objects to their original time and space, a responsibility difficult to achieve within the "hermetic world" of the museum gallery (Stewart, 1993, p.152), and particularly amongst a largely passive and untargeted mixture of physical 'browsers'.
In the digital world, flat data representations, even if augmented with rich multimedia, do not convey the same quality of message and validity of knowledge that curators attempt to communicate to general audiences every day. The lonely gallery computer with its expensive user experience and empty chair is all too often a feature of 'modern' galleries. Museums also spend vast amounts of money enriching data on their web sites, sometimes with the help of curators, and attempt to add this valuable and engaging context. However, such activity involves the resourcing of intensive handcrafted content, which inevitably limits the level of sophistication and collaboration that can be achieved, as well as the range and depth of topics that can be covered (Doerr and Crofts, 1998, p.1). Far from being driven by purely private and scholarly requirements, curators would see contextual knowledge representation as a way of supporting their core role in engaging and educating the public, but on a scale they could not achieve with traditional methods and current levels of financing. Since semantically harmonised data reveals real-world relationships between things, people, places, events and time, it becomes a more powerful engagement and educational tool for use with wider audiences beyond the walls of the physical museum. In comparison to traditional handcrafted web page development it also represents a highly cost-effective approach. Semantic cultural heritage data using the CIDOC CRM may not equate to the same type of narrative communication as a curator can provide, but it can present a far more engaging and sophisticated experience when compared to traditional forms of data representation. While it can help to answer very specific research questions, it can also support the unsystematic exploration of data. It can facilitate the discovery of hitherto unknown relations and individual stories, supporting more post-modern concerns, but also providing a means to amalgamate specifics and individual items into a larger "totalizing" view of expanding patterns of history16 (Jameson, 1991, p.333). Unsystematic exploration17 (which invariably leads to paths of relationships around particular themes) is extremely useful for general engagement, but it is also seen as increasingly important for scholars (curators, researchers/scientists from research institutions and universities) working with big data, changing the way that scholars might approach research and encouraging new approaches that traditionally have been viewed as more appropriate to the layperson. The CIDOC CRM ontology supports these different approaches, bringing together methodologies that are useful to researchers, experts, enthusiasts and browsers alike, but in a single multi-layered implementation. The opportunity provided by the CIDOC CRM goes further. Just as the lessons identified in the 1990s about cultural heritage integration have been ignored, the research into how museums might shape the representation of cultural knowledge has also been ignored in most digital representations. The pursuit of homogenised views with fixed schemas continues with vigour within digital communities, but the strength of the knowledge held by different museums is in its difference — its glorious heterogeneous nature. Yet again from the 1990s, a quote from a leading academic museologist:
"Although the ordering of material things takes place in each institution within rigidly defined distinctions that order individual subjects, curatorial disciplines, specific storage or display spaces, and artefacts and specimens, these distinctions may vary from one institution to another, being equally firmly fixed in each. The same material object, entering the disciplines of different ensembles of practices, would be differently classified. Thus a silver teaspoon made in the eighteenth century in Sheffield would be classified as 'Industrial Art' in Birmingham City Museum, 'Decorative Art' at Stoke on Trent, 'Silver' at the Victoria and Albert Museum, and 'Industry' at Kelham Island Museum in Sheffield. The other objects also so classified would be different in each case, and the meaning and significance of the teaspoon itself correspondingly modified". (Greenhill, 1992, pp.6-7) While the World Wide Web has undoubtedly revolutionised many aspects of communication, work and engagement, its attractive but still essentially "Gutenberg" publishing model has effectively created an amnesia across the community. While a pre-Web world talked about how computers could push the boundaries of humanities as a subject, a post-Web world seems content with efficient replication of the same activities that previously took place on paper (Renn, 2006; McCarty, 2011, pp.5-6).   Position 3 — The Reality of Cultural Heritage Data Provisioning It is a long-standing failure of aggregators to design and implement a comprehensive set of processes necessary to support long term provider-to-aggregator relationships. The absence of such a reference model is considered to be a major impediment to establishing sustainable integrated cultural heritage systems and therefore, by implication, a significant factor in the inability to fully realize the benefits of the funding and resources directed towards the humanities over the last 20 years. This legacy has contributed to a general fragmentation of humanities computing initiatives as project after project has concentrated on end user functionality without properly considering how they could sustain the relationships that ultimately determined their shelf life — if indeed this was an objective. This, along with the lack of an empirically conceived cultural heritage reference model (discussed above), has impeded the ultimate goal of collaborative data aggregation to support intelligent modelling and reasoning across large heterogeneous datasets, and provide connections between data embedded with different perspectives. Instead, each new and bigger initiative pushes further the patience of funders who are increasingly unhappy with the return to the community of their investment. In contrast to the approach of most aggregators, the responsibilities demanded of such systems are viewed by this paper from a real world perspective, as distributed and collaborative rather than substantially centralized and divorced from providers, as the OAIS Reference Model assumes. In reality the information provider curates his/her resources and provides, at regular intervals, updates. The provider is the one who has access to the resources to verify or falsify statements about the evidence in their hands. Therefore the role of the aggregator includes the responsibility for the homogeneous access to the integrated data and the resolution of co-references (multiple URIs, 'identifiers', for the same thing) across all the contributed data - but not to 'take over' the data like merchandise. 
The latter synopsis of consistency appears to be the genuine knowledge of the aggregator, whereas any inconsistencies should be made known to, and can only be resolved by, the original providers. The process of transforming these information resources to the aggregator's target system requires a level of quality control that is often beyond the means of prospective providers. Therefore a collaborative system that delivers such controls means that the information provider benefits from data improvement and update services that would normally attract a significant cost, and that could not be delivered as effectively on the basis of local knowledge alone. Additionally, if the aggregation is done well, harmonisation should deliver significant wider benefits to the provider (including digital relevance and exposure for organisations regardless of their status, size and location) and to the community and society as a whole. The process of mapping from provider formats to an aggregator's schema needs support from carefully designed tools. All current mapping tools basically fail in one way or another to support industrial-level integration of data from a large number of providers,18 in a large number of different formats and different interpretations of the same formats,19 undergoing continuous data and format changes at the provider side, and undergoing semantic and format changes at the aggregator side20. To consistently maintain integrated cultural data requires a much richer, component- and workflow-based architecture. The proposed architecture includes a new component, a type of knowledge base or 'mapping memory', with at its centre a generic, human-readable 'mapping format' (currently being developed as X3ML)21, designed to support different processes and components and to accommodate organisations with different levels of resourcing. Such an architecture begins to overcome the problem of centralised systems, where mapping instructions are unintelligible and inaccessible to providers22 and hence lack quality control. Equally, it overcomes the problem of decentralised mapping by providers, who often interpret concepts within the aggregator's schema in mutually incompatible ways. It finally overcomes the problem of maintaining mappings after changes to source or target formats and changes of interpretations of target formats or of terminological resources on which mappings are conditional. It should further provide collaborative communication (formal and social) support for the harmonization of mapping interpretations. Since the mapping process depends on clean data and brings data inconsistencies to light, sophisticated feedback processes for data cleaning and identifier consistency between providers must also be built into the design. The ambition of such an architecture exceeds the scope of typical projects, and it can only come to life if generic software components can be brought to maturity by multiple providers. Unfortunately, all current 'generic' mapping tools are too deeply integrated into particular application environments and combine too many functions in one system to contribute to an overall solution. We do not expect any single software provider to be capable of providing such generic components for all the necessary interface protocols. This is borne out by the continued expenditure of many millions of Euros by funding bodies to fund mapping tools and other components in dozens of different projects without ever providing a solution of industrial strength and high quality.
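The central idea of a 'mapping memory', that the mapping is declarative data held apart from the code that executes it, can be illustrated with a deliberately small sketch. This is not the X3ML syntax itself; the source field names, the slash path notation and the record below are invented for illustration only.

    # A minimal sketch of a declarative source-to-CRM mapping: the mapping
    # is data that can be stored, versioned, reviewed by the provider and
    # re-run whenever the source or target changes.
    MAPPING = [
        # (source field, event-based CRM-style path, type of the new node)
        ("maker",      "P108i_was_produced_by/P14_carried_out_by", "E21_Person"),
        ("made_place", "P108i_was_produced_by/P7_took_place_at",   "E53_Place"),
    ]

    def apply_mapping(record, mapping):
        """Expand one flat source record into event-based statements."""
        statements = []
        for field, path, node_type in mapping:
            if record.get(field):
                statements.append((record["id"], path, record[field], node_type))
        return statements

    row = {"id": "obj/42", "maker": "J. Smith", "made_place": "Sheffield"}
    for statement in apply_mapping(row, MAPPING):
        print(statement)

Because the mapping lives outside the code, the same engine can serve many provider-aggregator combinations, which is precisely the interoperability point the proposed architecture aims at.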
Therefore the proposed architecture and reference model (called Synergy), which has already been outlined by the CRM SIG in various parts23, aims at a specification of open source components, well-defined functionality and open interfaces. Implementers may develop and choose between functionally equivalent solutions with different levels of sophistication: for example, table-based or graph-based visualization of mapping instructions, intelligent systems that automatically propose mappings or purely manual definition, etc. They may choose functionally equivalent components from different software providers capable of dealing with particular format and interface protocols, and therefore different provider-aggregator combinations. Only in this way does the community have a chance to realize an effective data aggregation environment in the near future. In the reference model that we propose, the architecture plays a central role and is a kind of proof of feasibility. However, it is justified and complemented by an elaborate model of the individually identified business processes that exist between the partners of cultural data provision and aggregation initiatives, both as a reference and as a means to promote interoperability on an organisational and technical level. The processes enabled by this architecture should also be viewed with an understanding that, as a result of a properly defined end-to-end provisioning system, other more collaborative processes and practices can be initiated that enable organizations to support each other and to exchange experience and practice more easily and to greater effect. The establishment of a system that supports many different organisations promotes greater levels of collaboration between them independently of aggregators, and increases the pool of knowledgeable resources. It is at this level that structural robustness can be practically implemented to enable a reconstruction of the essential alliances between the cultural heritage sector and the wider academic body. These considerations should be treated as a priority for any new aggregation service rather than as a problem to be solved at a later date. Prioritisation of quantity above quality and longevity means that aggregators soon reach a point where they can no longer deal adequately with data sustainability issues. The solution is often to concentrate even more resources on functionality and marketing, in the hope that the underlying problem might be solved externally, and to make the decision to remove funding more difficult. Inevitably, relationships between providers and aggregators start to fail: links become broken and partnerships break down, leading to gradual decline and finally failure. The cultural heritage sector is unable to invest resources in schemes that have unclear longevity and which lack the certainty needed to support institutional planning. Without a high degree of certainty, cultural organisations cannot divert resources into aligning their own systems with those of aggregators, and therefore the scope of those systems becomes so limited that their overall value is marginal. Organisations become ever more cautious towards these new projects. This current state of affairs is also implicitly reflected by the growth of smaller and more discrete projects that seek to aggregate data into smaller, narrower and bespoke models designed as index portals for particular areas of study and particular research communities.
These projects can be interpreted as a direct statement of dissatisfaction with larger, more ambitious aggregation projects that have failed to provide infrastructures onto which these communities could build and develop extensions for particular areas of scholarly investigation (and thereby contribute to an overall effort). However, the reduced and narrow scope of these projects means that they are only of use to those who already have a specialised understanding of the data, can piece together information through their own specialist knowledge, and are content with linear reference resources. They embrace concepts of linked data but forgo the idea that a more comprehensive and contextual collaboration is possible. To more cross-disciplinary researchers and other groups interested in wider and larger questions, the outputs of these projects are of limited interest and represent a fragmented patchwork of resources. They provide little in the way of information that can be easily understood or that integrates easily with other initiatives, because they provide only snippets: data products that have been separated from their natural wider context. In effect, and in contrast to their stated aims, they practically restrict the ability to link data outside their narrow domain. This severely limits the possibilities and use cases for the data and contributes to a fragmented landscape over which wider forms of digital humanities modelling are impossible. Conclusion Knowledge Representation At the beginning of the twenty-first century, much effort was spent trying to both define and predict the trajectory of what had now been termed the digital humanities24. "In some form, the semantic web is our future, and it will require formal representations of the human record. Those representations — ontologies, schemas, knowledge representations, call them what you will — should be produced by people trained in the humanities. Producing them is a discipline that requires training in the humanities, but also in elements of mathematics, logic, engineering, and computer science. Up to now, most of the people who have this mix of skills have been self-made, but as we become serious about making the known world computable, we will need to train such people deliberately." (Unsworth, 2002) The debate about the extent to which humanists must learn new skills still continues today. The reasons why humanists have been slow and reluctant to incorporate these new skills into their work, in perhaps the same way that some other disciplines have done, are too varied and complex to consider here. Whether because of a lack of targeted training, a philosophical position about the extent to which computers can address the complexities of historical interpretation, or a lack of understanding about the areas of scholarship that can be enhanced or transformed by computers, the resulting lack of engagement seriously affects the outcomes and value of digital humanities projects. While there are notable exceptions, the overwhelming body of 'digital humanities' work, while often providing some short-term wonderment and 'cool', has not put the case well enough to persuade many humanists to replace existing traditional practices. This has a direct connection with the reasons why, despite a clear identification of the problem from different sources, flawed approaches still exist and are incorporated, without challenge, into each new technological development.
The cultural heritage linked data movement currently provides new examples of this damaging situation. Far from providing meaningful linking of data, the lack of a properly designed model reflecting the variability of museum and other cultural data, and the inability to provide a robust reference model for collaborative data provisioning, mean that early optimism has not materialised into a coherent and robust vision. While more data is being published to the Internet, it has limited value beyond satisfying relatively simple publishing use cases or providing reference materials for discrete groups. While these resources are useful, the preoccupation with them has seriously impeded more ground-breaking humanities research that might uncover more profound discoveries and demonstrate that humanities research is as important to society as scientific research, and is deserving of more consideration from funders. However, just as in scientific research, the humanities community must learn from previous research to be considered worthy of increased attention. Further fragmentation and unaligned initiatives are unlikely to instil confidence in those organisations that have previously been willing to finance digital humanities projects. We must learn from projects like CLAROS [25] at Oxford University, which are significant because, while they adopt a more semantic and contextual approach, they have evolved from the lessons learnt directly from projects like RAMA. The CLAROS team includes expertise and experience derived through past contributions to the RAMA project, as well as other similar projects. This experience has provided a first-hand understanding of the problems of data aggregation and larger-scale digital humanities research. As a result the CLAROS project positively benefits from the failures of previous research rather than replicating unsuccessful methodologies. New projects like ResearchSpace [26] and CultureBrokers [27] are now building on the work of CLAROS to create interactive semantic systems, together with other types of CRM-based projects.

Research and Engagement
While different audiences have different objectives, all data use cases benefit from the highest quality of representation, the preservation of local meaning and the re-contextualising of knowledge with real-world context. Researchers benefit from the ability to investigate and model semantic relationships and the facility to use context and meaning for co-reference and instance matching. Engagement and education activities benefit from exactly the same semantic properties, which bring data to life and provide more interesting and varied paths for people to follow without the 'dead ends' that would ordinarily confront users of traditional aggregation services. There is no longer an excuse for using 'top down' schemas, because the 'bottom up' empirical and knowledge-based approach is now available (the result of years of considerable effort) and accessible in the form of the CIDOC CRM. The core CRM schema is mature and stable, growing in popularity, and presents no particular technological challenge. It is an object-oriented schema based on real-world concepts and events, implementing data harmonisation based on the relationships between things rather than artificial generalisations and fixed-field schemas. It simplifies complicated cultural heritage data models but in doing so provides a far richer semantic representation, sympathetic to the data and to the different and varied perspectives of the cultural heritage community.
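What this event-based harmonisation looks like in practice can be suggested by a small sketch. The following Python fragment, a minimal illustration only, maps an invented fixed-field record onto CRM-style triples in which an explicit production event links object, actor and time-span. The record, the example.org URIs and the choice of classes and properties are assumptions made for the example, and the class/property identifiers follow one published rendering of the CRM RDFS, which may differ between CRM versions.

```python
# A minimal, illustrative sketch (not part of the Synergy specification):
# mapping an invented fixed-field record onto event-based CIDOC CRM triples
# with rdflib. Identifiers follow one published version of the CRM RDFS.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")
EX = Namespace("http://example.org/collection/")  # invented base URI

# A flat record of the kind an aggregator typically receives.
record = {"id": "obj42", "title": "Amphora", "maker": "Exekias", "date": "c. 540 BC"}

g = Graph()
g.bind("crm", CRM)

obj = EX[record["id"]]
production = EX[record["id"] + "/production"]  # the production event made explicit
maker = EX["actor/" + record["maker"]]
timespan = EX[record["id"] + "/production/timespan"]

g.add((obj, RDF.type, CRM["E22_Man-Made_Object"]))
g.add((obj, RDFS.label, Literal(record["title"])))
g.add((production, RDF.type, CRM["E12_Production"]))
g.add((production, CRM["P108_has_produced"], obj))      # event -> object
g.add((production, CRM["P14_carried_out_by"], maker))   # event -> actor
g.add((maker, RDF.type, CRM["E21_Person"]))
g.add((maker, RDFS.label, Literal(record["maker"])))
g.add((production, CRM["P4_has_time-span"], timespan))  # event -> time-span
g.add((timespan, RDF.type, CRM["E52_Time-Span"]))
g.add((timespan, RDFS.label, Literal(record["date"])))

print(g.serialize(format="turtle"))
```

Because CRM properties are arranged in subsumption hierarchies, a query phrased in terms of a general superproperty can retrieve data recorded with any of its specialisations (the 'superproperty' mechanism referred to in the notes below), so harmonised data remains queryable at whatever level of generality a given audience requires.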
Reversing Fragmentation and Sustaining Collaboration
The history of digital humanities is now littered with hundreds of projects that have made use of and brought together cultural heritage data for a range of different reasons. Yet these projects have failed to build up any sense of a coherent and structural infrastructure that would make them more than "bursts of optimism" (Prescott, 2012). This seems connected with a general problem of the digital humanities clearly identified by Andrew Prescott and Jerome McGann. "... the record of the digital humanities remains unimpressive compared to the great success of media and culture studies. Part of the reason for this failure of the digital humanities is structural. The digital humanities has struggled to escape from what McGann describes as 'a haphazard, inefficient, and often jerry-built arrangement of intramural instruments, free-standing centers, labs, enterprises, and institutes, or special digital groups set up outside the traditional departmental structure of the university'" (Prescott, 2012; McGann, 2010) But while these structural failings exist in the academic world, Prescott also identifies years of exclusion of cultural heritage organisations, keeping them at arm's length and thereby contributing to a widening gap with the organisations that own, understand and digitise (or, increasingly, curate digital material) our material, social and literary history. Prescott, with a background that includes both academic and curatorial experience in universities, museums and libraries, is highly critical of this separation and comments that "my time as a curator and librarian [was] consistently far more intellectually exciting and challenging than being an academic". This experience and expertise, which in the past set the tone and pace for cultural heritage research and discovery, is slowly but surely disappearing, making it increasingly difficult for cultural heritage organisations to claim a position in a "new digital order" [28] — particularly in an environment of ever-increasing financial pressures (Prescott, 2012). In effect they are reduced to simple service providers with little or no stake in the outcomes. [29] Within the vastly diverse and ever-changing nature of digital humanities projects, how can organisations and projects collaborate with each other? How can they spend time and resources effectively in a highly fragmented world that ultimately works against effective collaboration? We believe that one answer is to change the emphasis from the inconsistent 'bursts' and instead focus on the underlying structures that could support more consistent innovation. By establishing the foundational structures that providers, aggregators and users of cultural data all have a common interest in maintaining, a more consistent approach to progress in the digital humanities may be achieved. In such an environment projects are able to build tools and components that can be both diverse and innovative but that contribute to the analysis and management of a growing body of harmonised knowledge capable of supporting computer-based reasoning. The challenge is not about finding the right approach and methodology (these aspects having been understood back in the 1990s) but rather how the ingrained practices of the last 20 years, determined mostly by technologists, can be reversed and a more collaborative, cross-disciplinary and knowledge-led approach achieved.
This is a collaboration based not simply on university department collaboration but on a far wider association of people and groups who play an equally important role in establishing a healthy humanities ecosystem. The CRM SIG has already started elaborating and experimenting with key elements for a new reference model of collaborative data provision and aggregation, in line with the requirements indicated above. Work on this new structure has already commenced in projects like CultureBrokers, a project in Sweden starting to develop some of the essential components described above. We call on other prospective aggregators, existing service providers and end users to pool resources and contribute to the development of this new and sustainable approach. Without such a collaboration the community risks never breaking out of a cycle based on flawed assumptions and restrictive ideas, and therefore never creating the foundational components necessary to take digital humanities to a higher intellectual, and practical, level.

Notes
1 Clough, G. Wayne, Secretary of the Smithsonian Institution, citing Robert Janes (Janes, 2009) in Best of Both Worlds: Museums, Libraries, and Archives in the Digital Age (Kindle Locations 550-551), (2013).
2 Synergy Reference Model of Data Provision and Aggregation. A working draft is available here.
3 Database engine for triple and quad statements in the model of the Resource Description Framework.
4 Solr™ is the fast open source search platform from the Apache Lucene™ project (http://lucene.apache.org/solr/).
5 A database using graph structures, i.e., every element contains a direct pointer to its adjacent elements.
6 The CIDOC CRM is largely agnostic about database technology, relying on a logical knowledge representation structure.
7 By "Digital Humanities" we mean not only philological applications but any support of cultural-historical research using computer science.
8 For the recent use of the term "cultural heritage fruition", see, e.g., "cultural assets protection, valorisation and fruition" in Cultural Heritage Space Identification System, Best Practices for the Fruition and Promotion of Cultural Heritage or Promote cultural fruition.
9 For example, the failed umbrella digital rights management model of MESL.
10 Paul Walk, former Deputy Director, UKOLN — transcribed from the conference video cast.
11 Panel included: Rachel Bruce, JISC; Maura Marx, The Digital Public Library of America; Alister Dunning, Europeana; Neil Wilson, British Library; Paul Walk, UKOLN; and David Baker, Resource Discovery Task Force.
12 Garbage in, garbage out (GIGO) in the field of computer science or information and communications technology refers to the fact that computers, since they operate by logical processes, will unquestioningly process unintended, even nonsensical, input data ("garbage in") and produce undesired, often nonsensical, output ("garbage out"). (Wikipedia)
13 Europeana, a European data aggregation project with a digital portal.
14 Only recently Europeana developed the still experimental EDM model, which adopted the event concept of the CIDOC CRM. Dublin Core application profiles implemented in RDF make use of some very general superproperties, such as dc:relation, dc:date, dc:description, dc:coverage. XML, JSON, Relational and Object-Relational data formats cannot represent superproperties.
15 In terms of RDF/OWL, one would speak of "superproperties", which are known to the schema and the database engine, but need not appear explicitly in the data. (Calvanese, Giacomo, and Lenzerini, 1998) use the terms "relation subsumption" and "query containment".
16 A historiographic concern in respect of the tension between post-modern approaches and the uncovering and exposing of supposed power relations across different political-economic phases. For example, (Jameson, 1991).
17 An approach associated with writers such as W.G. Sebald; for example, see (Sebald, 2002).
18 The German Digital Library envisaged about 30,000 potential providers in Germany.
19 Hardly any tool supports sources as different as RDBMS, XML dialects, RDF, MS Excel, and tables in text formats.
20 See, for instance, the migration of Europeana from ESE to EDM.
21 See delving/x3ml for the X3ML mapping format development.
22 For example, the use of XSLT.
23 The reference model has been named 'Synergy'. A working draft is available here.
24 Formerly, Humanities Computing.
25 CLAROS, a CIDOC CRM-based aggregation of classical datasets from major collections across Europe.
26 ResearchSpace, a project, funded by the Andrew W. Mellon Foundation, to create an interactive research environment based on CIDOC CRM data harmonisation.
27 CultureBroker, a project mainly funded by a consortium led by the Swedish Arts Council, implementing data provisioning using the new reference model for Swedish institutions.
28 Something different from simple information publication and the use of popular social networking facilities.
29 See The Two Art Histories, Haxthausen (2003) and The Museum Time Machine, Lumley (Ed.) (1988) for further evidence.

References
[1] Bayardo, R., et al. (1997) "InfoSleuth: Agent-Based Semantic Integration of Information in Open and Dynamic Environments", in ACM SIGMOD International Conference on Management of Data, pp. 195-206.
[2] Bearman, D. (1995) "Standards for Networked Cultural Heritage". Archives and Museum Informatics, 9 (3), 279-307.
[3] Calvanese, D., Giacomo, G., Lenzerini, M., et al. (1998) "Description Logic Framework for Information Integration", in 6th International Conference on the Principles of Knowledge Representation and Reasoning (KR'98), pp. 2-13.
[4] Calvanese, D., Giacomo, G. & Lenzerini, M. (1998) "On the decidability of query containment under constraints", in Principles of Database Systems, pp. 149-158.
[5] Consultative Committee for Space Data Systems. (2009) "Reference Model for an Open Archival Information System" (OAIS).
[6] Crofts, N. et al. (eds.) (2011) "Definition of the CIDOC Conceptual Reference Model".
[7] Delouis, D. (1993) "TOlOsystixnes France", in International Cultural Heritage Informatics.
[8] Doerr, M. (2003) "The CIDOC conceptual reference module: an ontological approach to semantic interoperability of metadata". AI Magazine, 24 (3), 75.
[9] Doerr, M. & Crofts, N. (1998) "Electronic Esperanto — The Role of the OO CIDOC Reference Model". Citeseer.
[10] Gruber, T. R. (1993) "Toward Principles for the Design of Ontologies Used for Knowledge Sharing". International Journal of Human-Computer Studies, (43), 907-928.
[11] Greenhill, E. H. (1992) Museums and the Shaping of Knowledge. Routledge.
[12] Haxthausen, Charles W. (2003) The Two Art Histories: The Museum and the University. Williamstown, Mass: Yale University Press.
[13] ICZN (1999) International Code of Zoological Nomenclature. 4th edition. The International Trust for Zoological Nomenclature.
[14] Jameson, F. (1991) Postmodernism, or the Cultural Logic of Late Capitalism. Duke University Press.
[15] Janes, Robert R. (2009) Museums in a Troubled World: Renewal, Irrelevance or Collapse? London: Routledge, p. 13.
[16] JISC (2013) "JISC Discovery Summit 2013".
[17] Lu, J. et al. (1996) "Hybrid Knowledge Bases". IEEE Transactions on Knowledge and Data Engineering, 8 (5), pp. 773-785.
[18] Lumley, Robert. (1988) The Museum Time Machine. Routledge.
[19] McCarty, W. (2011) "Beyond Chronology and Profession", in Hidden Histories Symposium. 17 September 2011, University College London.
[20] McGann, J. (2010) "Sustainability: The Elephant in the Room", from a Mellon Foundation conference at the University of Virginia.
[21] Prescott, A. (2012) "An Electric Current of the Imagination." Digital Humanities: Works in Progress.
[22] Renn, J. (2006) "Towards a Web of Culture and Science". Information Services and Use, 26 (2), pp. 73-79.
[23] Sebald, W. G. (2002) The Rings of Saturn. London: Vintage.
[24] Stewart, S. (1993) On Longing: Narratives of the Miniature, the Gigantic, the Souvenir, the Collection. Duke University Press.
[25] Tzompanaki, K., et al. (2013) "Reasoning based on property propagation on CIDOC-CRM and CRMdig based repositories", in Online Proceedings for Scientific Workshops.
[26] Tzompanaki, K. & Doerr, M. (2012) "Fundamental Categories and Relationships for intuitive querying CIDOC-CRM based repositories".
[27] Unsworth, J. (2002) "What is Humanities Computing and What is Not?".
[28] Wiederhold, G. (1992) "Mediators in the Architecture of Future Information Systems". IEEE Computer.

About the Authors
Dominic Oldman is currently the Deputy Head of the British Museum's Information Systems department and specialises in systems integration, knowledge representation and Semantic Web/Linked Open Data technologies. He is a member of the CIDOC Conceptual Reference Model Special Interest Group (CRM SIG) and chairs the Bloomsbury Digital Humanities Group. He is also the Principal Investigator of ResearchSpace, a project funded by the Andrew W. Mellon Foundation using CIDOC CRM to provide an online collaborative research environment. A law graduate, he also holds a postgraduate degree in Digital Humanities from King's College, London.

Martin Doerr is a Research Director at the Information Systems Laboratory and head of the Centre for Cultural Informatics of the Institute of Computer Science, FORTH. Dr. Doerr has been leading the development of systems for knowledge representation and terminology, metadata and content management. He has been leading or participating in a series of national and international projects for cultural information systems. His long-standing interdisciplinary work and collaboration with the International Council of Museums on modeling cultural-historical information has resulted, among other things, in an ISO standard, ISO 21127:2006, a core ontology for the purpose of schema integration across institutions. He is chair of the CRM SIG.

Gerald de Jong has a background in combinatorics and computer science from the University of Waterloo in Ontario, Canada. He has more than a decade of freelance experience in the Netherlands, both coding and training, including being part of the original Europeana technical team. He has a passion for finding simplicity in otherwise complex things, and for multi-agent and Darwinistic approaches to solving gnarly problems. He co-founded Delving BV in 2010 to focus on the bigger information challenges in the domain of cultural heritage. He is a member of the CRM SIG.

Barry Norton is Development Manager of ResearchSpace, a project developing tools for the cultural heritage sector using Linked Data.
He has worked on data-centric applications development since the mid-90s and holds a PhD on Semantic Web and software architecture topics from the University of Sheffield. Before working at the British Museum he worked as a consultant Solutions Architect, following a ten-year academic career at universities in Sheffield, London (Queen Mary), Karlsruhe, Innsbruck and at the Open University.

Thomas Wikman is an experienced manager who has worked on national and European ICT projects and museum collaborations since the mid-90s. He is the Project Manager and Co-ordinator at the Swedish National Archives for the CultureCloud and the CultureBroker projects. CultureBroker is an implementation of the Data Provisioning Reference Model and the CIDOC CRM aggregating archival and museum data. He is a member of the CRM SIG.

Copyright © 2014 Dominic Oldman, Martin Doerr, Gerald de Jong, Barry Norton and Thomas Wikman

Neurocognitive Literary Studies and Digital Humanities
Dr. Valiur Rahaman (Paper Presenter)
Assistant Professor, Department of English, Madhav Institute of Technology & Science, Gwalior, INDIA
Founder President, Indian Society of Digital Humanities (formed 2016)
Principal Investigator of CRS Research Project on Humanities Inspired Technology
Long Presentation at ADHO Conference / Digital Humanities 2020 / Virtual Conference

ABSTRACT
The paper demonstrates how neurocognitive social psychology can be applied to study human behavior through literary character analysis with digital tools, and how digital literary studies in terms of neurocognitive psychology may help develop new models for technology and theories of contemporary science. On the basis of these theses, the paper illustrates the theoretical methodology called "Humanities-Inspired Technology for Society" as an essential sub-branch of Digital Humanities, and its application to two major research studies: to great classics of all times and to the etiology of autism. The paper advocates bringing literary theory and neurocognitive literature into the curricula of science and technology.

Keywords: Humanities-Inspired Technology, Research in Digital Humanities, Neurocriticism, Autism, Literary Studies, Literary Data Modeling, Digital Narrative, Social Media, Transdisciplinary Research

1. Introduction
Psychology, cognitive science and psychoanalysis are often intersectional subjects with literary studies. Digital Humanities strengthens literary studies when its scholarship helps develop models for the advancement of science and technology. To date, very few studies have gone in this direction: how DH scholarship can help technological modeling for challenging social problems and healthcare issues. This paper highlights the conceptual ground of humanities-inspired technology for society (HITS), its applications and functions. It has as a major component 'neurocognitive literary study' through digital tools, and hence the paper establishes a networked rapport of the literary arts with neurocognitive science and digital humanities/studies. At the beginning of the paper, the author defines HITS as an approach to the knowledge system and concludes with its applications.
2. HITS as a Sub-branch of DH: A Study in Digital Humanities for Technological Advancement
Digital Humanities scholarship is utilized to disseminate, preserve, conserve and visually represent the knowledge system, but is seldom used for advancing human technologies for social welfare. The paper explores humanities-inspired technology as a subdiscipline of Digital Humanities which studies how humanities scholarship, intersected with, interpreted through or analyzed by digital technological tools, contributes to modelling for technological development. It deals with practical expositions of literary or language philosophers and critical theorists as impetuses for the modeling of cognitive computational technology. Hence, it strongly establishes an inseparable bridge between practices in technology and humanities epistemology. The function of Humanities-Inspired Technology for Society (HITS) essentially lies in developing models based on digital studies in the philosophy of language and in literary studies in terms of brain, mind and behavior. It coordinates the two different streams of the knowledge system for three reasons: first, to remind; second, to upgrade; and third, to develop. It reminds us of what is missed by the world of technology; suggests upgrading technological tools and devices for their humane utilization without hazardous impacts on the earth and beyond; and develops new models out of scholarly studies in the humanities for technological advances. For instance, no neuro-model-based technology has been developed to date to identify the factors behind sexually deviant criminals, or to detect and control such heinous offenders. Beginning with empathy for the victims, a HITS scholar studies the behavior patterns of such personalities in literature in terms of neurocognitive psychology and social psychology, and may develop a behavior-semiotic model based on the studied patterns and a prepared corpus. Such studies develop industry-based research and development in the field of Digital Humanities, a much-awaited epistemological contention in the arena of humanities departments in India and across the world. For ages, literature has been studied in its own terms: Aristotelian, Longinian, Classicist, Romantic, Modern, Postmodern, Gender, Colonial and Postcolonial. Literary studies seldom go beyond their defined disciplinary territories, and this has been a major reason for their decline across the world. Their boundary is defined for their users, and the users are not allowed to go beyond it; thus, communication with the real world is questioned in literary studies. The influences of Marx, Freud, Nietzsche, Foucault, Lacan and Derrida penetrate human thinking irresistibly, so they could touch the offshoots of literary studies despite the disciplinary resistance of classical rhetoricians. Now, something has happened beyond that: the interference of science and technology in the study of the humanities, in a slow but steady manner, in successive phases resulting in Humanities Computing, Computational Humanities, Digital Humanities, Speculative Digital Humanities (SpecLab) and Public Digital Humanities.

3. Conceptualization, Experimentation, and Invention
The demand for transdisciplinary studies of science and the arts, of aesthetics and technology, is observed in the history of ideas, in contentions of difference and epistemological hybridity. I. A. Richards's collaborative works with C. K. Ogden developed a transdisciplinary approach to poetics called the 'science of criticism' (Green);
C. P. Snow observed two cultures in the "intellectual life of the whole of western society" (Rede Lectures); E. O. Wilson's Consilience: The Unity of Knowledge (Wilson) is the finest exposition of transdisciplinary thought, arguing for "consilience", that is, "the synthesis of knowledge" derived from different specialized fields of human endeavor, to envision a new field of knowledge serving society. "The greatest enterprise of the mind has always been and always will be the attempted linkage of the sciences and humanities." (Wilson; Morris) How is this linkage possible? Let us understand with a few examples: Descartes' painting is a part of popular science, known as a pattern-design of the first experimentation in designing the airplane (Miller). The coordinate system is ingrained in Descartes's philosophy; similarly, Thomas Carlyle's circle is a well-known model in mathematics (DeTemple), as "a certain circle in a coordinate plane associated with a quadratic equation", and many similar studies are yet to be done. The implications of humanities knowledge in these two cases are examples of humanities-inspired technology and science. Such findings of interferences of the humanities in the domains of science and technology are observable, and they establish the idea that science and technology are developed also under the epistemological influences of the humanities (especially linguistics, literature and cultural heritage). HITS never establishes the superiority of one knowledge system over another, a problem in epistemological enquiry demonstrated in Science and Poetry (Midgley).

4. Literature, Neurocognitive Science, and Technology: Substantial Studies in Neurocognitive Digital Humanities
Based on the concept argued above, the paper now presents substantiated research studies on humanities-inspired technology. In these, it is shown how knowledge of the humanities polishes and cherishes the motives for developing technological tools to guarantee the safety and security of human society at large. We conducted two studies together: I theorized a 'neurocognitive literary theory' based on patterns of "activated neurons affecting/effecting the human behavior (ANAEHB)" and applied it to study Hamlet's neurological problems, equating his mental status with existing persons in real society; to study R. N. Tagore's The Post Office in terms of how neurocognitive forces in an author empathetically influence the audiences of the play, resulting in its translation and staging across the world during the World War (Rahaman and Sharma); and to study neurodevelopmental issues reflected through behavior, such as the mental anguish and moral dilemmas of Rodion Raskolnikov in Fyodor Dostoyevsky's Crime and Punishment (1866), the neurocognitive factors in the racially discriminative behavior patterns of Marlow and Kurtz in Joseph Conrad's Heart of Darkness (1899), and the sexually deviant behavior of David Lurie in J. M. Coetzee's Disgrace (1999). These characters illustrate the behavior patterns of the socially disturbed mindset that results in numerous societal problems at large. The specific factors of behaviors disturbing other members of society, and their connection with the CNS, are etiologically studied in reply to the research questions: Can literary reading be intersected with neurological and computational studies? Can reading in the humanities, or knowledge of the humanities, help solve complex problems in the development of AI, neurocomputation, human-nature-inspired computing, and medical computing?
The studies yielded the following findings, observed as outcomes of neurocognitive literary studies:
1. Studying the impulses of human beings through deep reading of literary classics, and comparing them with real-life situations in human society, is feasible.
2. Human impulses can be understood by identifying the neurological causes behind human behavior, and computational modeling can be developed to express the criminal mindset.
3. Based on humanities and knowledge engineering for medicine and technology, a device was developed to protect a woman from unwanted incidents.
4. The possibility of transdisciplinary research in arts and literature, intersected with the cognitive sciences and computational studies, was established.
5. Literature and language were established as reflections of socio-neural behavior, and the mental patterns of the neurological disorders that lead humans to commit rape and murder were identified.
For literary studies, words are the only media for assessing human behaviors, so the software application Atlas.ti is used to analyze the patterns of behavior through the frequencies of words used by the characters of the literary works (a minimal sketch of this kind of frequency analysis is given below).

5. Literary Narratives, Neurodevelopment and Techno-epidemiology
As argued, the transdisciplinary approach always brings novelty to the procedures of experimentation, resulting in prismatic ways of seeing the world. For example, Friedrich Salomon Rothschild (1899-1995), a psychiatrist and colleague of Erich Fromm (1900-1980), developed the theory of biosemiotics. Rothschild was a reader of Charles W. Morris (cited above), who studied engineering and psychology at NU and earned a Ph.D. under the research supervision of the psycho-sociologist George Herbert Mead (1863-1931). His book Signs, Language, and Behavior (1946) elucidates the signs representing human behavior and the specific modes of signifying adequacy, truth and reliability of signs, and defines life as but a semiotic narrative and as the signature of human behavior. Similarly, J. C. Whitehorn and G. K. Zipf collaboratively wrote "Schizophrenic language" (1943); G. K. Zipf edited The Psycho-Biology of Language (1939), "The Unity of Nature, Least-Action, and Natural Social Science" (1942), and "Observations of the Possible Effect of Mental Age Upon the Frequency-Distribution of Words, from the Viewpoint of Dynamic Philology"; these are among the oldest research papers archived in PubMed and remain foundational works on the cognitive-linguistic disorders which basically symptomatize autism. These works are the consequences of inclinations towards what we have called "research consilience", a transdisciplinary approach to knowledge serving humanity and its associated agencies.

6. ZEF Factor of Autism
The prevalence rate of autism in the world itself states the fact that approaches to curing and challenging autism have been less than effective. To do so, it is unavoidably necessary to observe the history of the etiology of autism: from the second decade of the twentieth century through the two World Wars, and up to 2019. The entire history of autism reveals the various factors of autism established by medical practices or special treatment. Keen observation of the etiology of autism shows that the epidemiological historians of autism could not really differentiate between the symptomatology and the etiology of autism. The problem is strongly put forth in "Deconstructing the Etiology of Autism and its Cure through Social Media & Digital Literary Narratives" (Rahaman 2020), which came up with the major finding that autism eventuates during the fertilization period, long before the birth of the child.
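To make the frequency-based method of the preceding section concrete, the following minimal Python sketch counts word frequencies per speaking character, in the spirit of the Atlas.ti analysis mentioned above. It is an illustration only: the dialogue format, the sample lines and the tokenisation rule are invented placeholders, not the study's actual data or workflow.

```python
# Minimal sketch of a character-wise word-frequency count of the kind
# delegated above to Atlas.ti. The dialogue format, sample lines and
# tokenisation rule are invented placeholders, not the study's data.
import re
from collections import Counter, defaultdict

play = """HAMLET: O that this too too solid flesh would melt
HAMLET: To be, or not to be, that is the question
CLAUDIUS: Though yet of Hamlet our dear brother's death"""

freq = defaultdict(Counter)
for line in play.splitlines():
    speaker, _, speech = line.partition(":")
    freq[speaker.strip()].update(re.findall(r"[a-z']+", speech.lower()))

for speaker, counts in freq.items():
    print(speaker, counts.most_common(3))
```

Comparing such per-character distributions across a whole play is what allows behavioral patterns to be read off the characters' language.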
It is an evaluative study of the research pursued on the etiology of ASD and of the possibility of developing a parallel way of treatment by deconstructing the established hardcore medical practices for ASD. We studied and critically evaluated articles published between 1943 and 2019, consulted the World Health Organization reports on the prevalence of ASD in the USA and eight South Asian countries, and developed an additional idea for ASD therapy through "social media" and "literary narratives", differentiating it from technological treatment, together with a model of post-technological autism treatment. The study contributed to the cure procedures for ASD through "social media" and "literary narratives", and to the further requirement of upgrading epidemiological treatment through technological imaging and the development of technology based on the ZEF factors of autism. The other findings establish open possibilities of research in the fields required to design further research and to make policies to resist the prevalence of ASD around the world.

Acknowledgements
The concept of "Humanities Inspired Technology & Science" (HITS) sprang from the readings for the ongoing research project sponsored by the Collaborative Research Scheme under TEQIP-III, National Project Implementation Unit of MHRD, Govt. of India. The aim of the project is to define the potential of the Humanities & Social Sciences to be used for the development of technology and science for the welfare of human beings, with minimum after-effects or side-effects upon common lives or its target groups. The title of the presentation "Cognitive Literary Studies for Computational Cognitive Modeling: A Humanities Inspired Approach to Technological Advancement" is briefed under the short title "Neurocognitive Literary Studies and Digital Humanities". The presentation is one of the outcomes of the collaborative research project funded under the National Project Implementation Unit (NPIU)-AICTE, MHRD-Govt. of India. The membership to ADHO was subscribed by the NPIU. The authors acknowledge the contribution of Dr R. K. Pandit, Director, MITS Gwalior, for his support in the form of rich discussions of the research studies.

Works Cited
DeTemple, Duane W. "Carlyle Circles and the Lemoine Simplicity of Polygon Constructions." The American Mathematical Monthly, vol. 98, no. 2, 1991, pp. 97–108, doi:10.1080/00029890.1991.11995711.
Green, Elspeth. "I. A. Richards Among the Scientists." ELH, vol. 86, no. 3, Fall 2019, pp. 751–77, doi:10.1353/elh.2019.0028.
Midgley, Mary. Science and Poetry. Routledge, London, 2001.
Miller, Leonard G. "Descartes, Mathematics, and God." Philosophical Review, vol. 66, no. 4, 1957, pp. 451–65.
Morris, Charles William. "Symbolism and Reality: A Study in the Nature of Mind." Foundations of Semiotics, no. 15, 1993, pp. xxv, 128.
Rahaman, Valiur, and Sanjiv Sharma. "Reading an Extremist Mind through Literary Language: Approaching Cognitive Literary Hermeneutics to R.N. Tagore's Play The Post Office for Neuro-Computational Predictions." In Cognitive Informatics, Computer Modelling, and Cognitive Science, edited by G. R. Sinha and Jasjit S. Suri, Academic Press, 2020, pp. 197–210, doi:10.1016/B978-0-12-819445-4.00010-2.
Wilson, Edward O. Consilience: The Unity of Knowledge. Vintage Books, Random House, New York, 1999.

Consulted Works
1. Arbib, Michael A., and James J. Bonaiuto. From Neuron to Cognition via Computational Neuroscience. The MIT Press, 2016.
2. Bara, B. G., Ciaramidaro, A., Walter, H., & Adenzato, M. "Intentional minds: A philosophical analysis of intention tested through fMRI experiments involving people with schizophrenia, people with autism, and healthy individuals." Frontiers in Human Neuroscience, 5(7), 111, 2011.
3. Baron-Cohen, S. Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, MA: MIT Press, 1995.
4. Brown, Julie. Writers on the Spectrum: How Autism and Asperger Syndrome have Influenced Literary Writing. Jessica Kingsley Publishers, London, 2010.
5. Cook, Amy. Shakespearean Neuroplay: Reinvigorating the Study of Dramatic Texts and Performance through Cognitive Science. Palgrave Macmillan, 2010.
6. Corbett, B. A., et al. "Treatment Effects in Social Cognition and Behavior following a Theater-based Intervention for Youth with Autism." Cortex, 2019 Jun; 115:15-26. doi:10.1016/j.cortex.2019.01.003. Epub 2019 Jan 22.
7. Dutta, Krishna; Robinson, Andrew, eds. Rabindranath Tagore: An Anthology. Macmillan, 1998.
8. Einstein, A. J., Henzlova, M. J., Rajagopalan, S. "Estimating risk of cancer associated with radiation exposure from 64-slice computed tomography coronary angiography." JAMA, 2007; 298(3):317–323.
9. Emmeche, Claus; Kull, Kalevi. Towards a Semiotic Biology: Life is the Action of Signs. Imperial College Press, 2011.
10. Fitzgerald, Michael. The Genesis of Artistic Creativity: Asperger's Syndrome and the Arts. Jessica Kingsley Publishers, 2005.
11. McKenzie, George, Jackie Powell and Robin Usher, eds. Understanding Social Research: Perspectives on Methodology and Practice. The Falmer Press, London, 1997.
12. Glynn, Dylan. Quantitative Methods in Cognitive Semantics: Corpus-Driven Approaches (Cognitive Linguistic Research). De Gruyter Mouton, 2010.
13. Hickok, Gregory. The Myth of Mirror Neurons: The Real Neuroscience of Communication and Cognition. W. W. Norton, London, 2014.
14. Hickok, Gregory. "Eight Problems for the Mirror Neuron Theory of Action Understanding in Monkeys and Humans" (2009). Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2773693/
15. Korczak, Janusz. Ghetto Diary, with an Introduction by Betty Jean Lifton. Available: https://ia800401.us.archive.org/2/items/GhettoDiary-EnglishJanuszKorczak/ghettodiary.pdf. Retrieved on 1.8.2019.
16. Pandit, R. K., and Rahaman, Valiur. "Critical Pedagogy in Digital Era: Understanding the Importance of Arts & Humanities for Sustainable IT Development" (May 12, 2019). Proceedings of the International Conference on Digital Pedagogies (ICDP) 2019. Available: http://dx.doi.org/10.2139/ssrn.3387020
17. Tepe, Peter. Discourse Studies, Vol. 13, No. 5, Special Issue on Hermeneutics and Discourse Analysis (October 2011), pp. 601-608.
18. Pineda, Jaime A., ed. Mirror Neuron Systems: The Role of Mirroring Processes in Social Cognition. Springer, Humana Press, 2013.
19. Rahaman, Valiur. Introducing Digital Humanities. Yking Books, Jaipur, India, 2016.
20. Rahaman, Valiur. "Epi/Pandemic in Literature: A Study in Medical Humanities for COVID 19 Prevention." Plenary speech, National Webinar on Literature & Epidemics, May 2020, M. K. Bhavnagar University, Gujarat, India. https://sites.google.com/view/webinar-eng-mkbu/plenaries?authuser=0
21. Rahaman, Valiur, and Sanjiv Sharma. "Reading an Extremist Mind through Literary Language: Approaching Cognitive Literary Hermeneutics to R.N. Tagore's Play The Post Office for Neuro-Computational Predictions." In Cognitive Informatics, Computer Modelling, and Cognitive Science, ed. G. R. Sinha and Jasjit Suri, Academic Press, Elsevier, 2020, pp. 197–210. doi:10.1016/B978-0-12-819445-4.00010-2.
22. Ramachandran, V. S., and Blakeslee, Sandra. (1998) Phantoms in the Brain: Probing the Mysteries of the Human Mind. HarperCollins, London, 1999, p. 368.
23. Shakespeare, William. Hamlet. Ed. Burton Raffel and Harold Bloom. Yale University Press, London, 2003.
24. Wolfreys, Julian. Introducing Criticism at the 21st Century. Edinburgh University Press, 2011.
25. Wilson, Matthew W. "Cyborg Geographies: Towards Hybrid Epistemologies." Gender, Place and Culture, 16(5) (2009): 499–515.
26. Yeo, Richard. Defining Science: William Whewell, Natural Knowledge, and Public Debate in Early Victorian Britain. Cambridge University Press, 1993.
27. Gallese, V., M. A. Gernsbacher, C. Heyes, G. Hickok, M. Iacoboni. "Mirror Neuron Forum." Perspectives on Psychological Science, 6(4) (2011): 369-407. https://dx.doi.org/10.1177%2F1745691611413392

Humanist Studies & the Digital Age, 6.1 (2019)
ISSN: 2158-3846 (online) http://journals.oregondigital.org/hsda/
DOI: 10.5399/uo/hsda.6.1.3

The Origins of Humanities Computing and the Digital Humanities Turn [1]
Dino Buzzetti, University of Bologna

[1] This article was published in Italian: Alle origini dell'Informatica Umanistica: Humanities Computing e/o Digital Humanities, in Il museo virtuale dell'informatica archeologica, a cura di Paola Moscati e Tito Orlandi. Atti della "Segnatura" (13 dicembre 2017). «Rendiconti dell'Accademia Nazionale dei Lincei», Classe di Scienze morali, storiche e filologiche, S. ix, 30.1-2 (2019), 71-103.

Abstract: At its beginnings Humanities Computing was characterized by a primary interest in methodological issues and their epistemological background. Subsequently, Humanities Computing practice has been predominantly driven by technological developments, and the main concern has shifted from content processing to the representation in digital form of documentary sources. The Digital Humanities turn has brought artistic and literary practice in direct digital form more to the fore, as opposed to a supposedly commonplace application of computational methods to scholarly research. As an example of a way back to the original motivations of applied computation in the humanities, a formal model of the interpretive process is here proposed, whose implementation may be contrived through the application of data processing procedures typical of the so-called artificial adaptive systems.

1. Introduction
A retrospective overview of the first stages of development of the newly emerged forms of reflection and methodological practices that, in Italian, have been properly named informatica umanistica (humanities computing) can also foster a better understanding of the new current trends. As a matter of fact, the limitations of the technological tools available at the time left more space for the ideation of what could be achieved through the application of computational methods. The rudimentary nature of the available technology focused attention on the vast range of future opportunities enabled by the implementation of computational procedures on the model of the universal Turing machine.
The origins of humanities computing are therefore characterized by a marked attention to the methodological and theoretical implications of research projects based on the application of computational procedures. The ensuing technological developments produced a rather paradoxical drawback. By polarizing the attention of scholars on the functionalities of application programs that occasionally imposed themselves as dominant technologies, they induced a conceptual dependence on the available technology, to the detriment of a well-grounded choice of appropriate methods and alternative solutions. It may therefore be worthwhile to review the successive phases of applied computation characterized by new technological developments, in order to evaluate their impact on the research practices of humanities computing and to reconsider its current orientations in the light of the theoretical discussions of its opening phase.

2. The era of the mainframes
The initial phase of humanities computing was characterized by the employment of large computers available in computer centers, or in public or private institutions that used them for administrative purposes. A "special thanks" for the intertextual analysis of the correspondences between two fundamental legal texts of the Jewish tradition, conducted by Sergio Noja in 1968, was addressed "to the Management of the S. Paolo Banking Institute of Turin," who made "the electronic computer available," which in turn made possible the publication of the essay presenting the results (Noja 1968, 582). The most famous example of this type of facility is that of the 56 printed volumes of the Index Thomisticus prepared by father Roberto Busa (1974-1980) thanks to the support of Thomas J. Watson, founder of IBM, and completed, after thirty years of work, only in the 70s. It has surely been noted that both computational projects conducted by the electronico IBM automato resulted in printed publications. At first glance, in an era characterized by the ever-increasing pervasiveness of the digital, this seems surprising. However, the paradox is only apparent and, if well considered, leads us to conclusions of a different sort. This circumstance leads us to reflect on the working conditions that mainframe technology then allowed. The memories, consisting of punched cards and only later of magnetic tapes, did not allow any form of visualization of the data, and the output resulting from their processing was returned in printed form by the output units of the computers specifically used. Therefore, the purpose of computation could not consist in the reproduction and visualization of digital data, or of the source objects of investigation, but only in the elaboration and analysis of their informational content. In this situation the purpose of the research was primarily directed to computation, that is, to the application of computational procedures to the objects, in our case, of humanistic research. IBM itself is likely to have supported Father Busa's project for the opportunity it offered to extend the application of computation, up to that point almost exclusively aimed at processing numeric data, into the realm of processing textual information. Therefore, researchers did not simply have to make use of computational tools already set up, and choose those most suitable for the purposes of the research, but they had to contribute their own design, focusing attention on the specific aims of their own research.
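A toy example may suggest what such content-oriented processing amounted to in practice. The following Python sketch builds a tiny word index of the kind that the Index Thomisticus compiled on an enormous scale; the sample sentences and the tokenisation rule are invented placeholders, and the lemmatisation that was central to Busa's actual project is deliberately omitted.

```python
# Toy sketch of the word indexing behind a printed concordance; the sample
# sentences and tokenisation rule are placeholders, and the lemmatisation
# central to Busa's actual project is omitted.
import re
from collections import defaultdict

sentences = [
    "ens et essentia sunt idem in Deo",
    "omne ens est bonum",
]

index = defaultdict(list)  # word -> sentence numbers in which it occurs
for n, sentence in enumerate(sentences, start=1):
    for word in re.findall(r"[a-z]+", sentence.lower()):
        index[word].append(n)

for word in sorted(index):
    print(f"{word}: {index[word]}")
```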
Hence the name of humanities computing or, in Italian, informatica umanistica, for the research practices of this first period. The possibility of tackling research problems in the humanities with computational methods involved, at this stage and in the absence of the technological mediation of instruments already available and ready for use, a reflection directly addressed to the fundamentals of computation.

3. A definition of humanities computing
Attention to "the fundamentals of computing theory and science, which today absolutely nobody in Humanities Computing mentions" (Orlandi 2016, 80-81), together with the constant reference to the objectives of the research, led in the first phase of development of humanities computing to a definition expressed in the title of Niklaus Wirth's widely read book, Algorithms + Data Structures = Programs (Wirth 1976). In this regard, a volume that appeared at the conclusion of an investigation on the "impact of new technologies in the humanities" in Europe included a chapter, coordinated by Tito Orlandi, dedicated to the study of formal methods, in which we find this specification – proposed by Manfred Thaller – of humanities computing as it was characterized in previous years and understood fundamentally as "applied computer science" (de Smedt 1999):
we will attempt to define the core in terms of the traditional combination of data structures and algorithms, applied to the requirements of a discipline:
- The methods needed to represent the information within a specific domain of knowledge in such a way that this information can be processed by computational systems result in the data structures required by a specific discipline.
- The methods needed to formulate the research questions and specific procedures of a given domain of knowledge in such a way as to benefit from the application of computational processing result in the algorithms applicable to a given discipline.
Crucial in this definition was the awareness that computation applied to the humanities requires both representation (data structures) and processing (algorithms) of the information contained in the objects of study, a requirement often overlooked in the subsequent phases of development influenced by advances in technology. This essentially theoretical characterization of humanities computing placed formalization in the foreground as a necessary and unavoidable prerequisite of research. From this point of view, the "real beginnings" of humanities computing can be directly traced, together with other initial "experiments", to the seminal works of Jean-Claude Gardin (Orlandi 2016, 79) on the "formalization of aspects of archaeological research connected with the processes of representation and classification of data" (Moscati 2013, 7). As Paola Moscati rightly notes (2013, 10), Gardin stated that
"the interest of method [...] rather comes from its logical implications, and from the consequences it seeks to provoke in the general economy of archaeological research" (Gardin 1960, 5); and later, in 1971, in a letter to René Ginouvès, Gardin reiterated that "[...] the comparative merits of such or such machine model or punch cards since 1955 worried us less than the methods of formalization (mise en forme) of the data and of the archaeological reasoning, in the perspective of a 'mechanization' conceived without referring to any of these cards or these machines in particular" (JCG 205, 12 January 1971). What mattered more than the technology, therefore, was the formal organization of data. From this point of view Jean-Claude Gardin "really is at the source of Humanities Computing" (Orlandi 2016, 82).

4. Representation vs. Data processing
Subsequently, the impressive technological development that occurred within a few years, with the introduction of personal computers, graphical interfaces and then the implementation of the World Wide Web, profoundly transformed the research practices in the domain of humanities computing, and greatly influenced the very relationship between the representation and the elaboration of the content of the data under examination. The new and more advanced opportunities for the practical use of computers, made possible by the progressive advancement of technology, paradoxically caused, if not quite a setback, at least an obvious delay in the theoretical elaboration necessary for the planning of applications specifically designed for specific research purposes. Once more Manfred Thaller, in the aforementioned volume, stated that, for an adequate training in humanities computing, "the study of computational methods themselves" was essential to the development of "new methods" for "the explanation," framed "according to formal principles," of the phenomena studied in the various humanities disciplines (de Smedt 1999). But in the new technological context, the center of the discussion was progressively moving from the investigation of the formalization of research methods and of the applicability of computation to the evaluation of the possibility of using the new technological tools that gradually became available. The theoretical discussion about the design of applications specifically developed for specific research projects, and about their intrinsic methodological implications, thus fell into the background. To give just one example, in the years in which exclusively textual terminals were still in use, the graphic layout of a document was decided by the author himself during the composition of the text, with the insertion of the markup, or print instructions, in declarative or directly procedural form. Subsequently, with the introduction of graphical interfaces, writing programs working in WYSIWYG mode (What You See Is What You Get) automatically inserted the markup, removing from the author the direct control of the layout, which could be carried out exclusively in the ways provided by the program's functionalities. The alleviation of effort in the process of composition was obtained at the cost of renouncing the direct design of the graphic characteristics of the document.
This example, though banal and certainly not relevant in terms of research, is nonetheless useful for pointing out certain consequences, often unnoticed, implicit in the development of technology, and for highlighting some actual reasons for the progressive renunciation of the design of computer applications usable for research purposes, in favor of the passive and uncritical use of the new instruments thrust upon us. The very rapid spread of the Web has also had profound consequences in the evolution of humanities computing. On closer inspection, the specific role of the computer in the practical use of the Web is quite limited, since it does nothing but guarantee remote access to data or documents stored at a distance, to be viewed on the computer screen. But the elaboration of the informational content of the resources displayed remains entrusted to the reader's ability to understand and, from this point of view, nothing changes. About the so-called "liberational effect of electronic technology on texts" Marilyn Deegan and Kathryn Sutherland have acutely observed that "the narrative of redemption from print" – anticipated by McLuhan and repeated emphatically by his followers of the 90s – did not foresee that further, unimagined, developments in electronic technology, like the Google search engine, the brainchild of two college students, would lovingly extend the culture of the book through instant delivery of high-resolution images of the pages of thousands of rare and previously hard-of-access volumes. (Deegan and Sutherland 2009, 10). Therefore, it is possible to assert that "Google Book Search" – as they emphasize – "is not providing electronic text, it is providing books" (147). In other words, this common tool uses electronic technology to archive and instantly return "simulations of print and manuscript documents" (27). That being said, we certainly do not want to underestimate the importance of the availability of multimedia resources. Digital images can, for example, be technically elaborated to improve the readability of deteriorated manuscripts, but the precedence given to the mere visualization of the sources profoundly modified the functionality of the fundamental link between representation and information processing. Attention came to be mainly addressed to the representation of the information transmitted by the objects of study, to the detriment of the elaboration of their contents for purposes of analysis and interpretation. The lack of attention paid to the elaboration of the informational content of textual data can also be seen in the strategic choices of the Text Encoding Initiative (TEI), with which in the 90s the community of scholars of the humanities established "a standard for the representation of texts in digital form" (TEI 2016). The purpose of the TEI was indeed to publish Guidelines for Electronic Text Encoding and Interchange, in order to "define and document a markup language for representing the structural, renditional, and conceptual features of texts," above all "in the humanities and social sciences" (TEI 2015).
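The distinction between representing a text and processing its content can be made concrete with a schematic sketch. In the following Python fragment, a simplified TEI-like snippet (an invented illustration, not Guidelines-conformant markup) declaratively records structural and conceptual features of a text; what those features mean for analysis must be supplied entirely by the program that processes them.

```python
# Schematic illustration: declarative markup represents features of a text,
# while the processing semantics must be supplied by the program. The
# TEI-like fragment is an invented simplification, not Guidelines-conformant.
import xml.etree.ElementTree as ET

fragment = """<text>
  <body>
    <p>Call me <persName>Ishmael</persName>.</p>
    <p>Some years ago I went to sea.</p>
  </body>
</text>"""

tree = ET.fromstring(fragment)

# One possible 'semantics': extract the personal names ...
names = [e.text for e in tree.iter("persName")]
# ... another: count paragraphs.
paragraphs = len(list(tree.iter("p")))

print(names, paragraphs)
```

The markup licenses both of the 'semantics' computed above equally and prescribes neither, which is precisely the neutrality that made SGML-style representation easy to standardize and content processing hard.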
With the introduction of document production systems, a so-called document community was formed among computer scientists, which took care of the automation and of the visualization and printing processes of documents, distinct from the community denominated in a similar way the data processing or database community, dedicated instead to the design of archives for structured data. Now, while for the document community, in the interchange of data between different systems, it was essential to maintain the invariance of the representation of the documents, for the data processing community it was instead fundamental to ensure the invariance of data processing operations. As a result, while the document community "chose to standardize the representation of data," to guarantee its interchangeability, the database community "chose to standardize the semantics of data," by developing "data models that described the logical properties of data, independently of how it was stored," and regardless of the particular format of their representation. To tell the truth, "data semantics was not irrelevant to the document community, but the definition of semantics did seem to be a difficult problem," and the attempts undertaken proved too easily exposed to criticism (Raymond et al. 1996, 27). So, for quite similar reasons, attempts to define semantics in the scholarly community, most notably the Text Encoding Initiative, similarly met with resistance. Thus, the route proposed by SGML was a reasonable one: promote the notion of application and machine independence, and provide a base on which semantics could eventually be developed, but avoid actually specifying a semantics. (28). The technology of document management systems thus affected the choices for the digital representation of the text and led to the adoption of the Standard Generalized Markup Language (SGML) as a standard language for the encoding of textual data. As a language of simple representation and not of data processing – because it lacks a semantics of its own – SGML represented a clear limitation for the processing of textual data for the analysis of contents and the interpretation of texts. In the field of humanities computing, the prevalence assigned to the representation as opposed to the processing of information radically changed, in this phase, the prevailing orientations of the research.

5. Semantic Web and Digital Humanities
In the last and most recent period, the development of technology has had contrasting effects on the research practices of humanities computing. The Semantic Web project has brought back to the fore the fundamental demand for the elaboration of the information accessible online.
The networks and graphs of semantic relations thus obtained (linked data) made it possible to define connections rigorously and to organize certain fields of knowledge according to logically defined structural relationships, such as to allow the application of genuine procedures of formal inference. All this brought attention back – through the tools provided by Semantic Web technologies – to the problem of processing the content of digital resources accessible online. At the same time, and against this trend, the use of the expression digital humanities to designate the customary field of humanities computing has taken hold. The deliberate adoption of this name seems to be due to the preference expressed by the publisher Blackwell for a catchy title for the Companion with which it introduced the discipline (Schreibman et al. 2004). This, however, has favored the tendency to bring under the new label all phenomena in which the digital medium is used to disseminate content related to the humanities. Even a simple e-book, or any mobile application designed for access to multimedia content, thus seems to enter the field of interest of humanities computing. The transition from humanities computing to digital humanities has also been explicitly theorized as a positive evolution of humanities computing. In fact, literary and artistic practice itself increasingly takes place in directly digital form. In a recent interview with the online magazine Il lavoro culturale (Cultural Work), Jeffrey Schnapp, founder and director of the metaLAB at Harvard University, claims to fully share "the point of view according to which a definition of digital humanities that reduced it to the application of a series of IT tools for the study of cultural heritage would be a relatively trivial operation," and argues that

already in the 1990s, when the formula, Digital Humanities, was established in the United States, and we stopped talking about Computational Humanities or Humanistic Computing, we wanted to emphasize two aspects: the emergence of the Network as a public space and the personalization of the computer... The expression Digital Humanities marked precisely this moment of transition, in which the distinction between the world of digital technologies and culture in society no longer existed. This is a moment of unification in which there has certainly been a rethink on what research in the field of human sciences can be.

Consequently, humanities computing should give way to a "new experimental model of the human sciences," to a new social practice of "Knowledge Design," as opposed to the nineteenth-century practice of philology (Capezzuto 2017). In the face of all this, there is no lack of authoritative statements that, in various forms, recommend instead the opportunity of a return to origins. Thus John Unsworth, already in the title of an essay in which he coherently surveys the results achieved in the various phases of the development of humanities computing, urges us to go beyond the simple digital representation of the primary research sources (Unsworth 2004):

we are, I think, on the verge of what seems to me the third major phase in humanities computing, which has moved from tools in the 50s, 60s, and 70s, to primary sources in the 80s and 90s, and now seems to be moving back to tools, in a sequence that suggests an oscillation that may repeat itself in the future.
But whether or not the pattern ultimately repeats, I think we are arriving at a moment when the form of the attention that we pay to primary source materials is shifting from digitizing to analyzing, from artifacts to aggregates, and from representation to abstraction.

The exhortation to proceed beyond the simple digital "representation" of the documents studied, passing to the "analysis" of their content and to the "abstraction" necessary for the formal specification of computational procedures, is here quite evident, while the reference to "tools" should be understood as functional with respect to the formalization of the methods adopted, and not as simple technological devices prepared in advance, regardless of the specific procedures applied, and merely placed "in the hands of researchers" for the computer-assisted practice of the usual activity of examining and annotating documents (Leon s.d.).

6. The "logicism" of Jean-Claude Gardin

Is it then possible to envisage the forms of this desired return to origins in our new context? Indications are not lacking, and many inspiring principles can be drawn precisely from the illuminating anticipations of Jean-Claude Gardin. First of all, a central place in the whole of his theoretical reflection is occupied by his rigorous methodological perspective and by the reference to its necessary epistemological foundation. In a text published in the proceedings of a seminar held on 17 January 1994 at the University of Bologna, presenting the "research program" in which he had been engaged "for thirty years" (Gardin and Borghetti 1995, 17), Gardin states that he is interested less in "extending the field of application of computer science in the human sciences than in the progress and consolidation, with or without the calculator, of methodologies and their epistemological status" (70). On the other hand, without giving priority to theoretical reflection on the possibility of applying computational methods to the humanities, one would inevitably run "the risk of confusion between the means and the ends of research" (70), and humanities computing would lose those "features of an autonomous intellectual project, with its own tools and goals" that actually characterize it (33). Hence Gardin's proposal of the "logicist method" (30 ff.) and "of the inevitable reference to epistemology" (19) that the application of this method necessarily involves. In the "analysis of archaeologists' and historians' texts" (17), or of the human sciences in general, considered "in their entirety" and as "constructs" (1980) or "scientific constructions" (Gardin and Borghetti 1995, 19), Gardin's interest lies not so much, with Wittgenstein, in "erecting a building," but rather "in having the foundations of possible buildings transparently before me" (18; see Wittgenstein 1998, 9). It is therefore necessary to face a problem of method and of "practical epistemology" (Gardin and Borghetti 1995, 19), that is, of a type of epistemological reflection which he considers "an activity whose purpose is to clarify the basic conceptual constructs of the human sciences, as they arise in practice, through the combined study of the symbolic systems that provide the materials, and the chains of operations that govern its architecture"
(Gardin et al. 1987, 29).

The logicist method presented by Gardin consists, therefore, in the "study of the mechanisms and foundations of scientific argumentation" (Gardin and Borghetti 1995, 19) and in applying its principles to the scientific "constructions" of the humanities, defined as follows:

I define 'constructions' the texts elaborated on the model of scientific works, with the following three components: (a) a set of observation facts or data ascertained on any type of foundation; (b) hypotheses or conclusions based on these data and which constitute the end of the construction, its reason for being; lastly, (c) the argument produced to link these two components: data to conclusions or, conversely, hypotheses to the facts, with modalities that can be of different nature: natural reasoning or common sense, mathematics, formal logic, computer science, or any type of conjunction of these instruments that is considered as distinctive of our intellectual procedures in the human sciences. (18)

The application of the logicist method involves the use of "schematizations," which in turn are defined – following "the logician J.-B. Grize" (30) – as "the exercise that aims to isolate the operations called 'natural logic', currently practiced in the argument of ordinary language" (31; cf. Grize 1974, 204); the schematizations "are therefore nothing more than exercises in transferring into logical or, rather, semiological form, specialized texts in a particular discipline or field of research" (31). The assignment of a logical form to the discursive arguments obtained through the schematizations "shows that every construction can be defined through the combination of two elements" (34): the "initial" propositions, which describe the "facts" (35), and the "rewriting operations," or discursive passages, "whose sequence constitutes the reasoning" that leads to the "conclusions" (34), that is, the propositions called "terminal" (35). The rewriting operations constitute real "logical operations that are in reality particularly diversified" (30) and depend on the peculiar principles of inference of the different "modes of reasoning" (93), which in the "discursive practices" (37) of the human sciences can take the most varied forms: "inductions, implications, abductions, inferences, deductions, etc." (30). Now, "the two elements" of the schematization just cited are "found again unchanged in the structure of the knowledge base in the field of artificial intelligence," and "the organizations thus defined constitute" practically "the specific subject of expert systems or 'knowledge-based systems'" (35). The result was "the possibility of conceiving the schematizations as one source of knowledge for the elaboration of expert systems or, conversely, expert systems as a possible development of schematizations" (36); and "the computational paradigm" could become "the main tool" of logicist analysis, that is, of that "rewriting modality which consists in expressing interpretative constructions in the form of chains of propositions that link observed data" to theoretical statements as "in a calculation procedure" (Gardin 1993, 12). The adoption of the computational tool therefore originates from a precise methodological choice and is based on the "homology" between the "architecture" of expert systems and that of the schematizations (Gardin and Borghetti 1995, 36).
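The homology can be made concrete in a few lines of code. The example is entirely invented – the propositions do not come from Gardin's corpora – but it displays the three components of a construction, facts, rewriting operations, and conclusions, in the form a knowledge-based system gives them.

    # A schematization recast as a toy knowledge base (purely illustrative:
    # propositions and rules are invented, not drawn from Gardin's work).
    facts = {"sherd has red slip", "sherd found in stratum III"}

    # Rewriting operations: if all premises hold, assert the conclusion.
    rules = [
        ({"sherd has red slip"}, "sherd is fine ware"),
        ({"sherd is fine ware", "sherd found in stratum III"},
         "sherd dates to period B"),
    ]

    # Forward chaining: apply the rewriting operations until nothing new follows.
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True

    print(facts)  # the 'terminal' propositions now include "sherd dates to period B"

The "initial" propositions are the facts, the chain of rewritings is the rule base, and the "terminal" propositions are whatever the fixpoint adds.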
Now, the most relevant aspect of the move to expert systems "concerns the 'added value'" that is expected "on the epistemological level" on which our scientific arguments are situated. But if, on the one hand, "the mandatory conversion of rewriting operations into reasoning 'rules'" offers the possibility of "applying these rules in an experimental way, through simulations aimed at proving their validity," on the other hand "nothing allows us to affirm that our discursive practices can be assimilated to the true and proper, and rigorously formal, rules of reasoning" (37). The computational methodological option thus requires an equally well-founded epistemological justification.

7. Epistemological reflection and expert systems

In response to the alleged "scientistic infection" of which he would be a victim (Gardin 1993, 15) and to other criticisms of his logicist approach, Gardin presents a thorough picture of the contrasting positions, in an attempt to arrive at an adequate characterization of the method of the human sciences, without, however, refraining from considering "the limits and possibilities of logicism" (19). Among the different positions taken into consideration we can distinguish, on the one hand, those that presuppose an exclusive "dualism" between the expository methods of the human sciences and those of the natural sciences (Gardin and Borghetti 1995, 18-19) and, on the other, the "middle positions" that insist on the intermediate nature of the human and social sciences: "should one characterize them only by negations ('neither this nor that'), or should one opt for an intermingling of categories ('a little of this, a little of that')" (Gardin 1999, 125)? In this debate, the position of logicism "seems to be confused with that of the human sciences themselves, in that 'entre-deux' (Passeron 1991) where they intend to legitimize their location today," unless one puts in question the very definition of "this 'third way' of knowledge which, according to some, would not be that of science or literature; according to others, it would be neither that of the symbolic constructions, separated from logic and 'natural' languages, nor entirely that of the argumentation current in everyday life" (Gardin 1993, 19). Rather than following this debate in detail, it is important to note here that Gardin declares that he "feels uncomfortable in these intermediate spaces where the rules of the discursive game remain obscure" (1999, 125), even admitting, in the end, that the "substance de l'entre-deux, elle, m'échappe" – the substance of the in-between, for its part, escapes me (1991a, 32). Indeed, it is perhaps more important to observe that, today, both epistemological reflection and the most up-to-date computational procedures actually seem to converge in offering a way out of the issue Gardin left unresolved. In retrospect, his difficulty seems to depend upon the state of research in the field of expert systems at that precise moment, and upon the availability of inference engines, which at the time were still too tied to the classical model of hypothetical-deductive reasoning, typical of the natural sciences.
Now, however, a possibility of solution is in sight, in full compliance with the homology recognized by Gardin between the "logical form" (1995, 31) of the scientific constructions of the human sciences and the formalized inference procedures of expert systems; and all this without overturning, from another point of view, the relationship of priority between the adoption of the logicist method and the "computer applications" that accompany it, which – it should be reiterated – "do not constitute its main objective, nor its inevitable extension" (1993, 12). It therefore seems appropriate to pay attention to the possibility of a more detailed analysis of the interpretative and inferential practices applied to texts expressed in natural language, both in general and, for what concerns us more directly, in the field of the human sciences.

8. Adaptive systems and methodological issues

In this regard, in an essay published in Archeologia e Calcolatori we find an affirmation that sounds almost surprising to those who usually rely on the classical deterministic paradigm of computation, but which is of particular importance for our purposes, since it is precisely based on the "homology" already highlighted by Gardin between the "architecture" of expert systems and that of the schematizations of scientific constructions expressed in natural language (Gardin and Borghetti 1995, 36). This essay is dedicated to the epistemological foundations of "adaptive systems," whose theory has developed thanks to contributions from several fields of research, such as biology, the cognitive sciences, and artificial intelligence (cf. Holland 1962, 297). Right at the beginning, Massimo Buscema, the author of the essay, writes expressly: "I shall use an analogy to explain the difference" – or, better, the relationship – "between artificial science and natural language; the computer is to the artificial sciences as writing is to natural language" (2014, 53). In other words, in the artificial sciences the computer is what writing represents for natural language:

the artificial sciences consist of formal algebra for the generation of artificial models (structures and processes), in the same way in which natural languages are made up of semantics, syntax and pragmatics for the generation of texts. (2011, 17)

It follows that in the "artificial adaptive systems," which are part of the vast world of "natural computation" – in its turn a subset of the artificial sciences – the functioning of texts composed in ordinary language is assimilated, in an apparently unexpected way, to the algorithmic operation of the computer. What is illustrated here, in fact, is a homology between the forms of computation and the analysis of cultural phenomena, which amounts to a renewed proposal of the homology already theorized by Jean-Claude Gardin between the schematizations of scientific constructions in the humanities and the architecture of expert systems. In this sense we can also read the definition of artificial science proposed by Buscema: "artificial sciences are those sciences for which an understanding of natural and/or cultural processes is achieved by the recreation of those processes through automatic models" (2013, 17).
One could then almost say that Gardin's logicism also fits, like the adaptive systems studied by Buscema, within the field of so-called "natural computing," which is described as "the computational version of the process of extracting ideas from nature to develop computational systems" (de Castro 2006, 3). In both cases, however, the homology between discursive procedures and automatic systems works in reverse: while in natural computation the rules of the system adapt to the processes from which they are derived, in Gardin's logicist analysis the discursive schematizations are necessarily adapted to the formal rules of the expert system in use. Between the expert systems examined by Gardin and the adaptive systems studied by Buscema there is therefore a crucial difference. What is characteristic of adaptive systems is the presence of "rules that determine the conditions of possibility of other rules"; by their nature, these rules – formed by "constraints (links)" which give the artificial models of natural processes the capacity to generate other rules dynamically – "are similar to the Kantian transcendental rules" and constitute the overarching regulatory principles on which the adaptive functioning of the system depends. In this way,

natural computation does not try to recreate natural and/or cultural processes by analyzing the rules which make them function, and thus formalizing them [statistically] into an artificial model. On the contrary, natural computation tries to recreate natural and/or cultural processes by constructing artificial models able to create local rules dynamically and therefore capable of change in accordance with the process itself. (Buscema 2013, 20)

Based on these considerations, the idea of building an adaptive model of this type – functional with respect to the analysis of texts expressed in natural language, in order to overcome the difficulty encountered by Gardin in assigning a well-defined logical form to that intermediate form of argument that seems plausibly proper to all humanistic disciplines – looks quite legitimate, without thereby legitimizing unclarified discursive rules and a residual mingling of models. Even from a more general epistemological point of view, "this 'third way' of knowledge," still viewed with suspicion by Gardin (1993, 19), has some plausible justifications. It is true that even the field of humanistic research can be subdivided into "two sectors," one more strictly "governed by logic" and the other "governed by what you might call intuition." It is also true that it is "difficult to subject intuition to scrutiny" for validity (Orlandi 2016, 79), an objection that Gardin frequently aims at positions prone to justify "the plurality and accumulation of interpretations" (1999, 119) without defining a precise criterion of validation. However, "a phenomenologically inclined cognitive scientist," reflecting on the origins of cognition, might reason as follows:

We reflect on a world that is not made, but found, and yet it is also our structure that enables us to reflect upon this world. Thus, in reflection we find ourselves in a circle: we are in a world that seems to be there before reflection begins, but that world is not separate from us. For the French philosopher Maurice Merleau-Ponty, the recognition of this circle opened up a space between self and world, between the inner and the outer.
This space was not a gulf or divide; it embraced the distinction between self and world, and yet provided the continuity between them. Its openness revealed a middle way, an entre-deux. (Varela et al. 1991, 3)

The recognition of this entre-deux, of this intermediate path between the self and the world, brings into play the fundamental relationship between the subject and the object of knowledge. Gardin also considers the problem posed by the "incisive formula," frequently cited in epistemological debates, of the retour en force du sujet – the forceful return of the subject (1991b, 99); however – without entering into the discussion of the complex relationship between the model, or subjective representation of phenomena, and objective reality, or between observer and observed – he tends to treat the "subject" from a predominantly objective point of view and to deal above all with the "objective evaluation of the role of the subject in the human sciences" (98). Suffice it to mention that, in the face of this, even in the natural sciences, and especially in physics, the problem has been addressed directly. "When a theory is highly successful and becomes firmly established, the model tends to become identified with 'reality' itself, and the model nature of the theory becomes obscured," writes the theoretical physicist Hugh Everett, who goes on:

once we have granted that any physical theory is essentially only a model for the world of experience, we must renounce all hope of finding anything like "the correct theory." There is nothing which prevents any number of quite distinct models from being in correspondence with experience (i.e., all "correct"). (Everett 1973, 134)

In physics too, therefore, the 'multi-interpretation' considered so problematic by Gardin causes no scandal and, apart from the criterion of empirical conformity, the problem of choice no longer arises. The question then shifts, rather, to the formal reconstruction of the interpretative process in the discursive practices of the 'third way' mainly followed in the human sciences, whose characterizing element seems to be constituted precisely by a form of self-referentiality that includes within itself the role of the observer. Thus one understands the relevance of those processes of redefinition of their own rules which are typical of automatic adaptive systems. The formal analysis of self-referential procedures of internal transformation imposes itself as the primary task of a research program that can be extended – thanks to the analogy previously reported – to the interpretative practices of texts expressed in natural language.

9. Ordinary language: formal model and natural computation

Which formal model can therefore be proposed for the representation and the formal analysis of texts in natural language, which constitute the main product of scientific constructions in the humanities? Inspiration can come only from an analysis of language, and from the perception of the enormous distance that separates the rigid "formalist's motto" – characteristic of one of the crudest formulations of Good Old-Fashioned Artificial Intelligence (GOFAI) –

If you take care of the syntax, the semantics will take care of itself (Haugeland 1985, 106)

from the compelling image of the connection between the text and its meaning offered by Samuel Beckett:

There are many ways in which the thing I am trying in vain to say may be tried in vain to be said.
(1965, 123)

This illuminating sentence dissolves with immediate naturalness the extreme trivialization of the relationship between syntax and semantics expressed in the previous maxim. In the conception of good old-fashioned artificial intelligence, a formalization of the syntax should lead to an alleged one-to-one correspondence between the syntactic structure and the semantic structure of the text, an assumption which persists also in philosophies of language of analytical orientation. As Davidson maintains, "to give the logical form of a sentence is, then, for me, to describe it in terms that bring it within the scope of a semantic theory" (Davidson 1970, 145). The illusory postulation of this cherished one-to-one relation between syntax and semantics is completely debunked by Beckett's iconic representation of the fundamental indeterminacy of the relationship between the many ways of saying the same thing and the many ways of understanding what is said by the same sentence: an identical content can admit different forms of expression, while an identical expression can each time be assigned different meanings. Here we encounter opposite conceptions of the relationship between the "expression" and the "content" of the text (see Hjelmslev 1961, 47-60). Following Saussure, who speaks of a plane of ideas (plan... des idées) and a plane of sounds (celui... des sons) (49), Hjelmslev states that an adequate description of the functioning of language "must analyze content and expression separately," and that each of the two analyses may identify a certain number of entities "which are not necessarily susceptible of one-to-one matching with entities in the opposite plane" (46). On the one hand, logicians, perhaps too conditioned by the symbolic character of formal languages, are led to suppose that a syntactic system has "essentially the same structure as a semiotic" system and to consider it "normative for the concept of a semiotics." On the other hand, for linguists it is language which must be "considered as normative" for the functioning of a syntactic system (110). Accordingly,

the task of the linguistic theoretician is not merely that of describing the actually present expression system, but of calculating what expression systems in general are possible as expression for a given content system, and vice versa. (105)

In fact, "the two planes," the syntactic and the semantic, "cannot be shown to have the same structure throughout," with a "one-to-one relation" between the functioning of the one and the functioning of the other (112). Therefore, while a logician like Carnap proposes "a sign-theory where, in principle, any semiotic is considered as a mere expression system without regard for the content," from a linguistic point of view the "formal" description "is not limited to the expression-form, but sees its object in the interplay between the expression-form and a content-form" (110-111). However, it would be misleading to think that the radical difference between these two conceptions of the relationship between the expression and the content of the text puts into question the possibility of establishing a functional homology between the discursive practices of ordinary language and the most advanced systems of artificial intelligence and natural computation.
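What a system whose rules rewrite other rules might look like can at least be sketched here, in support of that homology. The example below is purely illustrative – every name and threshold is invented – but it follows Buscema's description, quoted above, of rules "that determine the conditions of possibility of other rules": a second-order rule observes the process and rewrites the first-order rule accordingly.

    # Illustrative sketch of an adaptive system: a meta-rule modifies the
    # system's first-order rule in response to the process being modelled.
    # All names and thresholds are invented for the example.

    def make_classifier(threshold):
        # First-order rule: classify a value against the current threshold.
        return lambda x: "high" if x > threshold else "low"

    def meta_rule(observations):
        # Second-order rule: a rule about the rule. It recomputes the
        # threshold so the first-order rule tracks the observed process.
        return sum(observations) / len(observations)

    threshold = 10.0
    classify = make_classifier(threshold)
    for batch in [[2, 4, 6], [20, 30, 40]]:
        print([classify(x) for x in batch])
        threshold = meta_rule(batch)          # the rule set rewrites itself
        classify = make_classifier(threshold)

The first-order rule does not merely apply; it is itself the output of another rule, which is the structural feature distinguishing an adaptive system from a fixed expert system.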
In Jean-Claude Gardin's view, the formalization of the scientific production of the humanistic disciplines consists essentially in the formalization of the discours savant, the scholarly discourse (1974, 57): in fact, the possibility of formalizing textual phenomena in no way requires, as a necessary condition, a one-to-one correspondence between the syntactic structure and the semantic structure of the text. Rather, it is necessary to reflect on other characteristic aspects of the text, and in particular on its diacritical or self-referential forms of expression. In this regard, however, the approach of the logicians and that of the linguists again diverge. As Hjelmslev observes, "the logistic theory of signs finds its starting point in the metamathematics of Hilbert," which considers the system of mathematical symbols only as "a system of expression-figurae with complete disregard of their content," and which treats its "transformation rules" – or rewriting rules, as Gardin would say – "without considering possible interpretations." The same method was then "carried over by the Polish logicians into their 'metalogic'" and eventually "brought to its conclusion by Carnap" (1961, 110). In particular Hjelmslev, who had defined language in general as "a semiotic into which all other semiotics may be translated" (109), argues that

this is the advantage of everyday language, and its mystery. And this is why the Polish logician Tarski (who reached the same conclusion independently of the present author) rightly says that everyday languages are characterized in contrast to other languages by their 'universalism.' (1970a, 104-105)

For Tarski, on the other hand, rather than constituting an advantage,

it is presumably just this universality of everyday language which is the primary source of all semantical antinomies, like the antinomies of the liar or of heterological words. (1956, 164)

In fact, "one does not realize that the language about which we speak must not at all coincide with the language in which we speak," and if the semantics is elaborated in that same language, the analysis of the antinomies shows that "the language which contains its own semantics and within which the logical rules commonly accepted apply must inevitably be inconsistent" (1936, 2). So, while for Hjelmslev, "owing to the universalism of everyday language, an everyday language can be used as metalanguage to describe itself as object language" (1970a, 132), for Tarski, "in contrast to natural languages, the formalized languages do not have universality" (1956, 167). Formal languages are in fact developed as pure symbolic systems, regardless of content, and for this reason, "when we investigate the language of a formalized deductive science, we must always distinguish clearly between the language about which we speak and the language in which we speak" (167), between the "metalanguage" and the "language under investigation" (172). Proceeding in this way, however, the normative relationship between semiotic structure and logical structure is reversed, and what is deemed an obstacle due to the self-referentiality of natural language is avoided through the sharp separation of the "metalanguage," the language "to describe," from the "object language," the "language described" (Hjelmslev 1970a, 132).
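Tarski's point can be made vivid with a deliberately naive sketch (an illustration, not a proof; the names are mine): if a language is allowed to contain its own truth predicate, a sentence can be built that asserts its own falsity, and its evaluation never terminates.

    # Naive sketch: a 'language' that contains its own semantics.
    # Sentences are represented as zero-argument functions; is_true is the
    # truth predicate defined inside the same language.

    def is_true(sentence):
        return sentence()

    # The liar: a sentence asserting its own falsity.
    liar = lambda: not is_true(liar)

    try:
        is_true(liar)
    except RecursionError:
        print("evaluation never terminates: the liar has no stable truth value")

Separating the metalanguage, in which is_true would be defined, from the object language, in which the liar sentence is written, is precisely Tarski's way of blocking the regress.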
But it is precisely the search for the forms of expression with metalinguistic import within natural language that can lead our investigation toward the construction of a formal model of its self-referential semiotic system. To use Hjelmslev's terminology, natural language can actually be described as a "semiotic" that includes its own "metasemiotic," described more specifically as "a semiotic whose content plane is a semiotic" (1961, 114) and itself expressed in natural language.

10. The markup: diacritical function and self-referential cycle

A useful starting point for this research can be found precisely by considering the current model for the digital representation of the text. It is well known that the text, understood from a computational point of view as a data type, that is, exclusively as "information coded as characters or sequences of characters" (Day 1984, 1), fails to represent all the information contained in the text understood in its current meaning. To solve this problem one resorts to markup, whose standard form, accepted by the community of scholars in the humanities, consists in embedding, within the ordered sequence of characters, marks or tags that define the properties of its partial segments or subsets. Now, if the markup represents textual information, a legitimate question arises about the status the markup assumes in relation to the text. Thus, as Allen Renear puts it, one can enquire "about just what markup really is," and in particular "when it is about a text and when it is part of a text," or in other words whether it belongs to the object language or to the metalanguage of the text, without however excluding that "it may sometimes be both" (2000, 419). In trying to arrive at a satisfactory answer, we can examine the case of punctuation. Alluding to the importance of this topic for the interpretation of the text, the title of a book dedicated to punctuation, Eats, Shoots and Leaves (Truss 2003), presents an interesting example: written with the comma, the title means "eats, shoots and leaves" and can allusively describe the rude behavior of a young man invited to dinner by a friend; written without the comma, it means "eats buds and foliage" and may describe the eating habits of a panda. Now, that comma, which completely changes the meaning of the sentence or of single words like shoots and leaves, can be considered, like any other diacritical sign of the text, both as an element of the text, in that it is part of the writing system, and as an indication or metalinguistic rule, in that it prescribes the way in which the text must be interpreted. It can therefore be affirmed that "punctuation is not simply part of our writing system," but that "it is a type of document markup" (Coombs et al. 1987, 935). Along the same lines, the condition of markup in general can be assimilated to that of a diacritical sign which, as such, has a double function: when it is used "to describe a document's structure" (Raymond et al. 1992, 1), it carries out a metalinguistic function; but since it is expressed with "assigned tokens" which denote "specific positions in a text" (4), it itself constitutes the structure. The markup is therefore "simultaneously embedded and separable" from the text, "part of the text, yet distinguishable from it" (3).
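The double status of the comma can be rendered schematically in code. The two readings are spelled out by hand – this is only an illustration, with invented categories – but the point survives the simplification: the same character data receives two different structural descriptions, selected by a diacritical sign.

    # Truss's example rendered schematically: the comma acts as markup that
    # selects between two structural descriptions of the same words.
    # (Illustrative only; the grammatical categories are invented.)

    def read(with_comma):
        if with_comma:
            # "eats, shoots and leaves": three actions of the dinner guest
            return {"verbs": ["eats", "shoots", "leaves"], "objects": []}
        # "eats shoots and leaves": one action of the panda, two objects
        return {"verbs": ["eats"], "objects": ["shoots", "leaves"]}

    print(read(with_comma=True))
    print(read(with_comma=False))

The comma belongs to the character sequence, yet its effect is that of an instruction for interpreting the sequence: exactly the double function the argument turns on.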
Therefore, because it "denotes structure" in the text and at the same time "is structure" itself (Buzzetti 2002, 80), the markup plays with respect to the text – in addition to "a properly diacritical function" – also "a self-reflexive function," and "can be considered, respectively, as an extension of the expression that explains its structure" and the implicit rules that determine it, or "as a form of external reference to its functional and structural aspects" (Buzzetti 2000). In short, "markup is at once representation and representation of a representation" (Buzzetti 2002, 81). Because of its ambivalent nature, every form of diacritical expression generates a cyclic process (the markup loop) within the textual dynamic:

we may say that an act of composition is a sense-constituting operation that brings about the formulation of a text. The resulting expression can be considered as the self-identical value of a sense-enacting operation. By fixing it, we allow for the indetermination of its content. To define the content, we assume the expression as a rule for an interpreting operation. An act of interpretation brings about a content, and we can assume it as its self-identical value. A defined content provides a model for the expression of the text and can be viewed as a rule for its restructuring. A newly added structure mark can in turn be seen as a reformulation of the expression, and so on, in a permanent cycle of compensating actions between determination and indetermination of the expression and the content of the text. (Buzzetti and McGann 2006, 68)

All this can also be appropriately expressed with a diagram (Fig. 1). It is worthwhile to pause and consider in more detail some formal aspects both of the cycle and of the diagram that represents it. The diagram refers in particular to the markers that complete the digital representation of the text, which can be inserted inside it or consist of external elements connected through pointers to specific positions in the linear sequence of the characters. Since there is no direct correspondence between the elements of the syntactic structure and the elements of the semantic structure, the internal (embedded) markup – being part of the sequence of characters and itself forming its structure – diacritically describes syntactic and expressive properties of the text. The external (stand-off) markup, on the other hand, not being bound to the linear structure of the expression of the text, can freely express aspects of the structure of its content that are not necessarily linear. In the multidimensional diagram of the self-referential cycle of the text there is, therefore, a correspondence between the dimension of the expression and that of the internal markup, as well as between the dimension of the content and that of the external markup.

Figure 1. The markup loop (Buzzetti and McGann 2006, 68).

The dual linguistic and metalinguistic function that the markup owes to its diacritical nature means that the same marker is both a self-identical element of the expression of the text and a rule that determines the structure of the content and defines its specific elements, which in turn behave in the same way with respect to the expression.
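The loop lends itself to a schematic restatement in code. In the sketch below the two operations are deliberately empty stubs, since only the alternation itself – expression and content serving in turn as value and as rule – is being modelled.

    # Schematic restatement of the markup loop: expression and content
    # alternately serve as value and as rule. The two operations are stubs;
    # only the structure of the cycle is meant to be accurate.

    def interpret(expression):
        # The expression, taken as a rule, determines a content.
        return f"content({expression})"

    def restructure(content):
        # The content, taken as a rule (a model), restructures the expression.
        return f"expression<{content}>"

    state = "text-v0"
    for step in range(3):
        content = interpret(state)    # determination of the content
        state = restructure(content)  # compensating re-determination of the expression
        print(step, state)

Each pass leaves the expression re-determined, so the closed circle can equally be drawn as the open spiral discussed below.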
Therefore, the structural diacritical elements of expression and content can be considered both as the results of a restructuring operation and as the operations themselves that determine, respectively, the organization of the structure of both the expression and the content of the text. Formally, they can thus be understood as values of a function, or as the functions themselves which formally represent the rules for structuring the text. The relationship between the formal representation of the value of a function and the formal representation of the function, or rule, itself deserves careful consideration from the logical point of view, in order not to run into serious confusion between the linguistic and the metalinguistic levels present in natural language. In his careful analysis of ordinary language use, Gilbert Ryle appropriately cautions against easy "category-mistakes" (1949, 17), which one incurs if one does not pay attention to the "logical type or category" (16) of commonly used expressions. As for what concerns us, Ryle observes that "a 'variable' or 'open' hypothetical statement" (120) – that is, a propositional function that contains variables, and all propositions of this type that express law-statements or a rule – "belong to a different and more sophisticated level of discourse from that, or those, to which belong the statements of the facts that satisfy them" (121). These propositions therefore constitute real rules of inference, for

a law is used as, so to speak, an inference-ticket (a season ticket) which licenses its possessors to move from asserting factual statements to asserting other factual statements. (121)

So the rules of inference, and with them the diacritical expressions we are dealing with, can be considered as statements of a higher order that belong to the logical type of the "inference-licenses" studied by Stephen Toulmin in The Uses of Argument (2003, 91), a work which by its author's own admission "owes much" to Ryle's ideas, already "applied to the physical sciences" in Toulmin's Philosophy of Science (2003, 239). In his review of the latter work, Ernest Nagel (1954) observes that, thanks to the so-called deduction theorem, the principle, now "canonical in modern logical theory" (405), that "a rule of inference can in general be replaced by a premise" holds in the case of our inference-licenses, and that "in the case of material rules of inference," consisting of true non-tautological propositions like the ones we are dealing with, "this can apparently always be done." Nagel tells us, too, that this "maneuver" can also "be introduced in reverse" (406). This means, according to standard logic, that the same sentence can act both as a first-order asserted premise in the object language and as a rule of inference in the metalanguage. One should note that while in logic the object language and the metalanguage are necessarily kept separate – and are made up of statements respectively in the "material" and in the "formal mode of speech," to use the terminology introduced by Carnap (1934), or of statements de re and statements de voce, to use a terminology drawn from the use of medieval logical Latin (Henry 1984), a technicized, but still natural, language – in the case of ordinary language, which contains its own metalanguage, inference rules are expressed by statements.
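In standard notation, the "maneuver" Nagel describes is the deduction theorem together with its converse (a standard result, stated here only to fix the idea):

    \Gamma, A \vdash B \quad \Longleftrightarrow \quad \Gamma \vdash A \rightarrow B

Read from left to right, a rule licensing the passage from A to B is traded for the conditional premise A → B asserted among the object-language statements; read from right to left, the same premise is re-employed as an inference-ticket.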
Such object-language higher-order de re statements are, however, inferentially equivalent to first-order de voce statements expressed in an external metalanguage, separated from the object language. Consequently, in natural language, self-referential object-language diacritical expressions take on a double function: considered as first-order statements, they are used as structural markers both of the expression and of the content of the text; considered as second-order statements, they constitute rules of inference employed as functions of the expression to determine the structure of the content or, conversely, rules of inference employed as functions of the content to determine the structure of the expression.

11. Generalization of the model

Still on the formal level, we can observe that the structure of the markup cycle represented above – which can, however, be generalized to all forms of diacritical expression – corresponds exactly to the "conversational cycle," which according to Frederick Parker-Rhodes represents the actual "speech process" between the speaker and the listener (1978, 16) or, dealing with texts, between the writing and the reading of a text (Fig. 2).

Figure 2. The conversational cycle (Parker-Rhodes 1978, 16).

In this cycle, the "expression" (A) is an operation performed by the speaker "which takes a 'thought' as input (which we must think as formalized in some manner)" and produces a "text" (B). One should note, incidentally, that here by expression we mean an operation, which is a function of the content, and not its result, a fact that proves the ambivalence of the diacritical mark on which it operates. In turn, the "comprehension" (C), or interpretation, is an operation performed by the listener, who receives the text as an "input containing all the information imparted to it by the speaker" and produces "again a thought" (D) as its "output" (17). It is clear, apart from the different terminology, that the structure of this cycle corresponds exactly to that of the markup cycle examined above (Fig. 1). However, an important observation by Parker-Rhodes should not be overlooked. It refers explicitly to the indeterminacy of the interpretation process: the "thought that the speaker had intended to convey," once received and interpreted in the mind of the listener, "could produce the elaboration of a new thought" as a "result" (17). In this case the diagram could take the form of an open spiral, more suitable for representing the case of several possible interpretations (Fig. 3).

Figure 3. The helicoidal cycle (Gardin 1980, 45).

Such a cycle could end at some point, returning to the starting point, or proceed indefinitely, depending on the context in which a given expression of the text is received. Jean-Claude Gardin also recognizes as "self-evident" the "cyclical nature" of the process of scientific construction. Like Parker-Rhodes, however, he believes that the cycle is not necessarily closed, and that it can therefore best be represented by a "helicoidal curve," more suited to retracing "the successive steps of its formation," which are produced through a series of choices depending not only on the data and their organization, but also and above all on the "logico-semantic rules of interpretation" and the different, equally possible "interpretative models" (1980, 145).
12. Epistemological foundations

The analysis of the cyclical nature of discursive practices brings us back again to the question of their epistemological foundation. As Gardin observes, the process of scientific construction can be considered both "from within" and "from without," that is, subjectively, from the author's point of view, and objectively, from the point of view of those who examine it, as an alternative to other constructions, in order to express a judgment of "validation" (145). This allows us to evaluate better the intermediate nature of the methodology of the humanistic disciplines, which many locate in the entre-deux between the predominantly objective character of the methods of the natural sciences and the predominantly subjective character of literary or discursive production in general. In other words, one has to decide whether this entre-deux divides or joins the two points of view, establishing what relationship exists between the subject and the object, or between subjectivity and objectivity absolutely considered. The cyclic and self-referential nature of the discursive process, which in the ordinary-language form of expression jointly includes both the representation of its own object and the representation of the way in which the subject represents it, inclines towards an answer that excludes an absolute separation between the subject and the object, or in other words between the observer and the observed. This is the position embraced, for example, by the theorists of autopoiesis (Varela et al. 1991), who draw inspiration from the epistemological discourse of Maurice Merleau-Ponty and his notion of the "chiasm." In one of his most iconic descriptions, Merleau-Ponty presents the chiasm as

an exchange between me and the world, between the phenomenal body and the "objective" body, between the perceiving and the perceived: what begins as a thing ends as consciousness of the thing, what begins as a "state of consciousness" ends as a thing. (1968, 215)

In his essay La structure du comportement, in order to clarify the connection between the subject and the object, Merleau-Ponty cites (1942, 11) the physiologist Viktor von Weizsäcker, who describes that relation in these terms: "the properties of the subject and the intentions of the subject (...) not only mix with each other, but also constitute a new whole" (1927, 45). This means that the subject and the object must not be conceived as separate, but as constantly connected in a continuous process of "overlapping or encroachment (empiétement)" (Merleau-Ponty 1968, 123), as if each would over and over again take the place of the other. The chiastic interlacement thus consists of a relationship of "activity and passivity coupled" (261), a representing and being represented of the subject and the object both in language and in perception.
Thus, the understanding of the "chiasm," as described by Merleau-Ponty, leads to the conclusion that language, understood as natural language, "is the same" thing that simultaneously represents and is represented – not the same "in the sense of real identity," but rather "the same in the structural sense," that is, in the sense of a unique and self-identical semiosis which also includes the semiosis that represents it (261). The same relationship between the subject that represents and the object that is represented, when conceived as 'the same thing,' that is, as the 'new whole' that they constitute, is found in the notion of the subject proper to "second-order" cybernetics, the cybernetics of "observing systems," in which "the observer enters the system by stipulating his own purpose," as opposed to the cybernetics of "observed systems," or "first-order" cybernetics, in which "the observer enters the system by stipulating the system's purpose" (von Foerster 2003, 285-286). In this context, indeed, one finds this enlightening definition of the subject: "I am the observed relation between myself and observing myself" (257). Here the subject is defined as one and the same thing, a new whole, constituted by the representation of the relationship between the self observing itself and the self observed by itself. Hence the idea that the conception of systemic self-referentiality – which occurs, for example, both in natural language and in its formal model – could constitute a new fundamental scientific paradigm. A new paradigm of this kind necessarily leads one to believe that the nature of the human sciences can be considered intermediate only as long as the natural sciences and the literary or discursive disciplines in general are conceived of as absolutely separate and incompatible with each other. However, the recognition of the unavoidable relationship between the observer and the "observed systems," or the principle of the autonomous organization of the "observing systems," today manifestly extends beyond the field of the disciplines characterized by the interpretive method to the field of the physical and biological sciences. Thus, the paradigm of self-referentiality seems to open a new perspective of convergence between the methods of the natural sciences and the methods of the human sciences, whose median nature would then be based more on the nature of the object of investigation than on the specific nature of the method whereby knowledge is constructed.

13. Subjectivity and objectivity: formalization and implementation

At this point, our diagram of the self-referential cycle of the discursive process can be reconsidered, taking into account the reflexive character of the relationship between the subject and the object. Language, insofar as it is seen as expression, is subjective, because it is the representation of the form of our act of representing; insofar as it is seen as content, however, language is objective, because it is the representation of the form of what it represents. In turn, a form of diacritical expression of the text, subjective in itself, can be considered both from an objective point of view, as an element of the expression identical to itself, and from a subjective point of view, as a function that determines a structural element of the content (Fig. 4).
The same can be said of an element of the content, objective in itself, which can be considered both from an objective point of view, as an element identical to itself, and from a subjective point of view, as a function that determines a structural element of the expression.

Figure 4. Subjectivity and objectivity in the speech process (Parker-Rhodes 1978, 16).

The distinction between something subjective and something objective is therefore a recursive distinction that could continue indefinitely (Fig. 5):

Figure 5. Recursiveness of the subjective/objective distinction.

But this does not happen, for the very reason that language is self-referential, as one can clearly infer from the diagram shown in Fig. 6:

Figure 6. Chiastic self-referentiality of the subjective/objective distinction.

It can reasonably be assumed that this scheme represents a possible formal model of the 'chiasm,' that is, of the relationship between the subject and the object that involves a continuous process of mutual "encroachment, infringement (empiétement, enjambement)" (Merleau-Ponty 1964, 175), of reciprocal displacement, dismissal, and override. The image of a continual oscillation between what is subjective and what is objective, as an uninterrupted process, aptly alludes to the self-referential mobility of the text and to the dynamic nature of the ambivalence of the diacritical structural elements of both the expression and the content of the text. This is an aspect which cannot be described only metaphorically, but which finds formal expression also in rigorous mathematical terms. As David Hestenes writes about the mathematician who introduced the algebras that bear his name,

Clifford may have been the first person to find significance in the fact that two different interpretations of number can be distinguished, the quantitative and the operational. On the first interpretation, number is a measure of "how much" or "how many" of something. On the second, number describes a relation between different quantities. (1999, 60)

In other words, seen from the latter point of view, a number describes the operation that connects two different quantities. The same ambivalence between value and function can be found in the "calculus of indications" introduced by the English mathematician George Spencer Brown (1969, 11), who admits that there can be a "partial identity of operand and operator," since an operand "is merely a conjectured presence or absence of an operator" (88). Granted that a rigorous formalization of such a model is possible, it can be surmised that a computational implementation could be obtained by developing a suitable adaptive system, endowed with functional capabilities like those previously illustrated. If the artificial adaptive systems we have described are built on the basis of a recognized analogy with the operation of natural language – that is, by providing for rules capable of modifying other rules of the system – the same analogy allows us to suppose that a formal model of the discursive processes of natural language could be implemented precisely by means of an adaptive computational system of the same type.
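Clifford's two interpretations of number, like Spencer Brown's partial identity of operand and operator, have an exact counterpart in the standard Church encoding of the natural numbers, where a number simply is an operation; the encoding below is the well-known one, and only its use here as an illustration is mine.

    # Church numerals: a number is at once a value and an operation,
    # namely the operation of applying a function n times.
    zero = lambda f: lambda x: x
    succ = lambda n: (lambda f: lambda x: f(n(f)(x)))

    two = succ(succ(zero))
    three = succ(two)

    # As an operand: convert to an ordinary integer by counting applications.
    as_int = lambda n: n(lambda k: k + 1)(0)
    print(as_int(three))     # 3

    # As an operator: the numeral itself iterates a function.
    double = lambda k: 2 * k
    print(three(double)(1))  # 1 doubled three times -> 8

The same term is operand in the first use and operator in the second, without any change of representation.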
In fact, the ambivalence of precisely and strictly defined mathematical objects (operation and operand, function and value) can constitute the formal expression of the relationship between subject and object that we have described by recalling the epistemological notion of the 'chiasm.' Secondly, it is precisely the indeterminate character of the relationship between syntax and semantics in natural language that gives origin to the self-referential cycle of second-order "rules" "that establish the conditions of possibility of other rules" of the system (Buscema 2013, 20). In this way, the road is open to the possibility of implementing a computational model of the discursive processes proper to scientific constructions in the humanities, consisting in an automatic system of an adaptive type.

14. Conclusions

This concludes our long, extended argument in support of the opportunity of a return to the origins of humanities computing, in order to avoid the risk, of which Jean-Claude Gardin has made us aware, of exchanging the means for the ends of research. The period of the origins, that of so-called humanities computing, was distinguished by an attitude aimed primarily at reflecting on methods and their epistemological foundations, as a preliminary condition for the choice of computational means suitable for the solution of the research problems of a specific disciplinary field. Subsequently, subordination to the rapid technological development of the 1990s, by favoring the importance of the digital medium in artistic and literary production, actually reversed this relationship. The priority given to practices of cultural production in directly digital form and to computer-assisted research activities, although still conducted in altogether traditional forms, produced a veritable mutation of the humanities computing practice of the original period and led to the advent of the so-called digital humanities. Thus, interest has abated in what Jerome McGann considers, in this new digital environment, the urgent and very current philological "imperative" of the "preservation of cultural memory" (2012), in agreement with the famous definition of August Boeckh, die Erkenntnis des Erkannten, the knowledge of what has been known.

In the second part of the essay, I have therefore tried to present, through an example, a form of restoration of the original attitude of humanities computing towards theoretical and methodological issues in the interpretation of texts. Building on the thoughts of Jean-Claude Gardin on the analysis of discursive practices in the human and social sciences – in particular, on the homology between the structure of knowledge-based expert systems and the structure of the schematizations of data and argumentation in scientific texts in the humanities – I have noted a significant convergence, or rather a substantial homology, between the analysis of the self-referential phenomena of natural language and the establishment of data-processing rules in automatic adaptive systems. This correspondence has allowed me to outline a formal model for the analysis of the interpretative practices of texts in ordinary language congruent with the data-processing procedures proper to adaptive systems. A possibly successful implementation of this model would undoubtedly confirm the fecundity, for humanities computing, of a renewed priority of the theoretical and methodological reflection that particularly characterized the period of its origins.
translated by Massimo Lollini

Works Cited

S. BECKETT, Proust and Three Dialogues with Georges Duthuit, London 1965.
R. BUSA S.J. (ed.), Index Thomisticus: Sancti Thomae Aquinatis operum omnium indices et concordantiae in quibus verborum omnium et singulorum formae et lemmata cum suis frequentiis et contextibus variis modis referuntur quaeque, auspice Paulo VI Summo Pontifice, consociata plurium opera atque electronico IBM automata usus digessit Robertus Busa, 56 voll., Stuttgart-Bad Cannstatt 1974-1980.
M. BUSCEMA, Artificial Adaptive Systems: Philosophy, Mathematics and Applications, in M. Buscema, M. Ruggieri (eds.), Advanced Networks, Algorithms and Modeling for Earthquake Prediction, Aalborg 2011.
M. BUSCEMA, The General Philosophy of Artificial Adaptive Systems, in M. Buscema, W.J. Tastle (eds.), Intelligent Data Mining in Law Enforcement Analytics: New Neural Networks Applied to Real Problems, Dordrecht 2013.
M. BUSCEMA, The General Philosophy of Artificial Adaptive Systems (AAS), in M. Ramazzotti (ed.), ARCHEOSEMA: Artificial Adaptive Systems for the Analysis of Complex Phenomena. Collected Papers in Honour of David Leonard Clarke, "Archeologia e Calcolatori," Supplemento 6 (2014), pp. 53-84. http://www.archcalc.cnr.it/indice/Suppl_6/04_Buscema.pdf [11/04/2018].
D. BUZZETTI, Digital Representation and the Text Model, "New Literary History" 33:1 (2002), pp. 61-88. doi:10.1353/nlh.2002.0003.
D. BUZZETTI, Ambiguità diacritica e markup: Note sull'edizione critica digitale, in S. Albonico (ed.), Soluzioni informatiche e telematiche per la filologia, Atti del Seminario di studi (Pavia, 30-31 marzo 2000), Pavia 2000. http://studiumanistici.unipv.it/dipslamm/pubtel/Atti2000/dino_buzzetti.htm [11/04/2018].
D. BUZZETTI, J. McGANN, Critical Editing in a Digital Horizon, in L. Burnard, K. O'Brien O'Keeffe, J. Unsworth (eds.), Electronic Textual Editing, New York 2006, pp. 51-71.
S. CAPEZZUTO, Il design della conoscenza: Intervista a Jeffrey Schnapp, "Il lavoro culturale" (2017). http://www.lavoroculturale.org/intervista-a-jeffrey-schnapp/ [11/04/2018].
R. CARNAP, Logische Syntax der Sprache, Wien 1934.
J.H. COOMBS, A.H. RENEAR, S.J. DEROSE, Markup Systems and the Future of Scholarly Text Processing, "Communications of the ACM" 30:11 (1987), pp. 933-947.
D. DAVIDSON, Action and Reaction, "Inquiry" 13:1-4 (1970), pp. 140-148.
A.C. DAY, Text Processing, Cambridge 1984.
L.N. DE CASTRO, Fundamentals of Natural Computing: An Overview, "Physics of Life Reviews" 4:1 (2007), pp. 1-36.
K. DE SMEDT et al. (eds.), Computing in Humanities Education: A European Perspective, University of Bergen 1999. http://www.hd.uib.no/AcoHum/book/ [11/04/2018].
M. DEEGAN, K. SUTHERLAND, Transferred Illusions: Digital Technology and the Forms of Print, Farnham-Burlington 2009.
H. EVERETT, III, The Theory of the Universal Wave Function, in B.S. de Witt, N. Graham (eds.), The Many-Worlds Interpretation of Quantum Mechanics, Princeton 1973, pp. 3-140.
H. VON FOERSTER, Understanding Understanding: Essays on Cybernetics and Cognition, New York NY 2003.
J.-C. GARDIN, Les applications de la mécanographie dans la documentation archéologique, "Bulletin des Bibliothèques de France" 5:1-3 (1960), pp. 5-16.
J.-C. GARDIN, Les analyses de discours, Neuchâtel 1974.
J.-C. GARDIN, Archaeological Constructs: An Aspect of Theoretical Archaeology, Cambridge 1980.
J.-C. GARDIN, Le calcul et la raison: Essais sur la formalisation du discours savant, Paris 1991 (a).
J.-C. GARDIN, Le rôle du sujet dans les sciences de l'homme: Essais d'évaluation objective, "Revue européenne des sciences sociales" 29:89 (1991), pp. 91-102 (b).
J.-C. GARDIN, Points de vue logicistes sur les méthodologies en sciences sociales, "Sociologie et sociétés" 25:2 (1993), pp. 11-22.
J.-C. GARDIN, Archéologie, formalisation et sciences sociales, "Sociologie et sociétés" 31:1 (1999), pp. 119-127.
J.-C. GARDIN, M.N. BORGHETTI, L'architettura dei testi storiografici: Un'ipotesi, a cura di I. Mattozzi, Bologna 1995.
J.-C. GARDIN, M.-S. LAGRANGE, J.-M. MARTIN, J. MOLINO, J. NATALI-SMIT, La logique du plausible: Essais d'épistémologie pratique en sciences humaines, 2e éd. revue et augmentée, Paris 1987.
[JCG] Fondo Équipe Archéologie de l'Asie Centrale et Jean-Claude Gardin, Archivi della Maison Archéologie & Ethnologie René-Ginouvès, Nanterre.
J.-B. GRIZE, Logique mathématique, logique naturelle et modèles, in Formalisierung in den Geisteswissenschaften / Sciences humaines et formalisation, "Jahresbericht der Schweizerischen Geisteswissenschaftlichen Gesellschaft" (1974), pp. 201-207.
J. HAUGELAND, Artificial Intelligence: The Very Idea, Cambridge MA 1985.
D.P. HENRY, That Most Subtle Question (Quaestio Subtilissima): The Metaphysical Bearing of Medieval and Contemporary Linguistic Disciplines, Manchester 1984.
D. HESTENES, New Foundations for Classical Mechanics (Second Edition), New York 2002.
L. HJELMSLEV, Prolegomena to a Theory of Language, Madison WI 1961.
L. HJELMSLEV, I fondamenti della teoria del linguaggio [1961], introduction and translation by G.C. Lepschy, Torino 1968.
L. HJELMSLEV, Language: An Introduction, Madison WI 1970a.
L. HJELMSLEV, Il linguaggio [1970], a cura di G.C. Lepschy, transl. A. Debenedetti Woolf, Torino 1970b.
J.H. HOLLAND, Outline for a Logical Theory of Adaptive Systems, "Journal of the ACM" 9:3 (1962), pp. 297-314.
S. LEON, Digital Public History, s.d. http://www.6floors.org/dossier/personal-statement/digital-public-history/ [11/04/2018].
J.J. McGANN, Memory Now, posted on "4Humanities," 19 August 2012. https://4humanities.org/2012/08/jerome-j-mcgann-memory-now-2/ [27/04/2018].
M. MERLEAU-PONTY, La structure du comportement (1942), 6e éd., Paris 1967.
M. MERLEAU-PONTY, Le visible et l'invisible, Paris 1964.
M. MERLEAU-PONTY, The Visible and the Invisible: Followed by Working Notes [1964], Evanston IL 1968.
P. MOSCATI, Jean-Claude Gardin (Parigi 1925-2013): Dalla meccanografia all'informatica archeologica, "Archeologia e Calcolatori" 24 (2013), pp. 7-24. http://www.archcalc.cnr.it/indice/PDF24/01_Moscati.pdf [27/04/2018].
E. NAGEL, Review of The Philosophy of Science by S. Toulmin, "Mind" 63:251 (1954), pp. 403-412.
S. NOJA, Saggio di un confronto a mezzo di un elaboratore elettronico tra lo "Šulḫan 'arȗḵ" di Karo e quello del rabbino di Lubavitch, "Atti della Accademia delle Scienze di Torino" 102 (1967-68), pp. 555-582.
T. ORLANDI, Interview, in J. NYHAN, A. FLINN, Computation and the Humanities: Towards an Oral History of Digital Humanities, Cham 2016.
A.F. PARKER-RHODES, Inferential Semantics, Hassocks 1978.
J.-C. PASSERON, Le raisonnement sociologique: L'espace non-poppérien du raisonnement naturel, Paris 1991.
D.R. RAYMOND, F.W. TOMPA, D. WOOD, Markup Reconsidered, paper presented at the First International Workshop on Principles of Document Processing, Washington DC, 22-23 October 1992, pp. 1-25. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.80.9369&rep=rep1&type=pdf [09/04/2018].
D.R. RAYMOND, F.W. TOMPA, D. WOOD, From Data Representation to Data Model: Meta-semantic Issues in the Evolution of SGML, "Computer Standards and Interfaces" 18 (1996), pp. 25-36.
A. RENEAR, The Descriptive/Procedural Distinction is Flawed, "Markup Languages" 2:4 (2000), pp. 411-420.
G. RYLE, The Concept of Mind, London 1949.
S. SCHREIBMAN, R.G. SIEMENS, J. UNSWORTH (eds.), A Companion to Digital Humanities, Malden MA 2004.
G. SPENCER-BROWN, Laws of Form, London 1969.
A. TARSKI, Grundlegung der wissenschaftlichen Semantik, in Actes du Congrès international de philosophie scientifique, Sorbonne, Paris 1935, vol. III: Langage et pseudo-problèmes, Hermann, Paris 1936, pp. 1-8. https://gallica.bnf.fr/ark:/12148/bpt6k383668/f5.image.r=.langFR
A. TARSKI, The Concept of Truth in Formalized Languages, in Logic, Semantics, Metamathematics: Papers from 1923 to 1938, transl. J.H. Woodger, Oxford 1956, pp. 152-278.
TEXT ENCODING INITIATIVE, TEI: Text Encoding Initiative, 2016. http://www.tei-c.org/ [09/04/2018].
TEXT ENCODING INITIATIVE, Guidelines for Electronic Text Encoding and Interchange, 2015. http://www.tei-c.org/Guidelines/ [09/04/2018].
S.E. TOULMIN, The Philosophy of Science: An Introduction, London 1953.
S.E. TOULMIN, The Uses of Argument [1958], updated ed., Cambridge 2003.
L. TRUSS, Eats, Shoots and Leaves: The Zero Tolerance Approach to Punctuation, London 2003.
J. UNSWORTH, Forms of Attention: Digital Humanities Beyond Representation, paper delivered at "The Face of Text: Computer-Assisted Text Analysis in the Humanities," the third conference of the Canadian Symposium on Text Analysis (CaSTA), McMaster University, 19-21 November 2004. http://www.people.virginia.edu/~jmu2m/FOA/ [10/04/2018].
F. VARELA, E. THOMPSON, E. ROSCH, The Embodied Mind, Cambridge MA 1991.
V. von WEIZSÄCKER, Reflexgesetze, in A. Bethe et al. (hrsg.), Handbuch der normalen und pathologischen Physiologie, Bd. 10, Berlin 1927.
N. WIRTH, Algorithms + Data Structures = Programs, Englewood Cliffs NJ 1976.
L. WITTGENSTEIN, Culture and Value: A Selection from the Posthumous Remains, edited by G.H. von Wright in collaboration with H. Nyman, revised edition of the text by A. Pichler, translated by P. Winch, Oxford 1998.

The Origins of Humanities Computing and the Digital Humanities Turn
Dino Buzzetti, University of Bologna
1. Introduction 2. The era of the mainframes 3. A definition of humanities computing 4. Representation vs. Data processing 5. Semantic Web and Digital Humanities 6. The "logicism" of Jean-Claude Gardin 7. Epistemological reflection and expert systems 8. Adaptive systems and methodological issues 9. Ordinary language: formal model and natural computation 10. The markup: diacritical function and self-referential cycle 11. Generalization of the model 12. Epistemological foundations 14. Conclusions Works Cited

work_fw6fzegd7bfwfkfhyo4q4vmmlu ---- Unus pro omnibus! Generic research tool for all Humanities disciplines.
André Kilchenmann (a.kilchenmann@dasch.swiss) and Flavie Laurens (flavie.laurens@dasch.swiss), Data and Service Center for the Humanities DaSCH — November 11, 2020
CFP Paper Abstract | DARIAH Annual Event 2020

The "digital turn" has changed research in the Humanities to a large extent: many new digital tools and methods exist with which you can access and analyze texts, videos, sound and music. However, those tools are most of the time standalone applications, and combining various records across them is difficult. A good illustration of this situation is research projects with moving images as their main (re)source. Scholars record current events and interview contemporary witnesses, as in historical or ethnographic projects. Here, moving images or videos need to be transcribed, which could be a "simple" interview transcription. But in some disciplines, such as sociology or film and media studies, these multimedia objects must be extended, which complicates the process. In those cases, scholars would also like to annotate the source, to describe the composition of the image, the soundtrack, or the movement of the camera: a linkage between various sources and descriptions. The question is: how can we bring them all together?

At the Data and Service Center for the Humanities (hereinafter called DaSCH) in Basel, Switzerland, we have to deal with data sets from all disciplines in the Humanities. The DaSCH is a national research infrastructure which provides data handling services, like data curation, long-term access, and research and analysis tools, to work with qualitative data. We bring a wide variety of data, data models and media (digital representations) from different disciplines together: from archaeology to philosophy; from moving images to books, audio and still images. An important aspect of managing qualitative data in the Digital Humanities is that, in most cases, the preservation of data sets alone makes little sense. We have to store data sets that can be accessed, re-used, connected and annotated. To reach this goal and to provide qualitative data handling services, the DaSCH develops and maintains a software platform called the DaSCH Service Platform (previously "Knora"), consisting of a database based on a Resource Description Framework (RDF) triple store and Application Programming Interfaces (APIs). The DaSCH Service Platform handles data from the database, as well as media files stored on our own IIIF-based media server. Those tools are part of the backend, the server side. Scholars with good IT skills can interact with the APIs and work with their data. For scholars with limited IT knowledge, we need to provide a simple, generic user interface. We are developing an intuitive, easy-to-use web-based application, called "DSP-App", placed on top of the DaSCH Service Platform to directly use its powerful data management functionalities. Data models and data will automatically follow accepted standards and be findable, accessible, interoperable, and re-usable (FAIR principles). With DSP-App, scholars will have a ready-to-use platform to create their own data models, upload data, attach metadata, and perform analyses and data visualization as they could do with a desktop data management tool. Even scholars with small data sets will have access to long-term accessibility at minimal cost and time to keep their research data alive, guaranteeing longevity of the data.

Author Biography

Dr. André Kilchenmann studied cultural anthropology, media studies and computer science at the University of Basel.
During this time, he worked at the museum of cultures in Basel and at the data center of the University. His interests are photography, design and digital work in general. In 2016, he completed his PhD studies at the Digital Humanities Lab in Basel and now works for the Data and Service Center for the Humanities DaSCH.

Flavie Laurens is a front-end designer and web developer. She has a master's degree in "Systematics, Evolution, Paleobiodiversity" with a minor in "Biodiversity Informatics" from Pierre and Marie Curie University (UPMC), Paris. Since June 2018, she has been working on different user interfaces for the Data and Service Center for the Humanities DaSCH.

work_fzdifx2eunfa3cwi4ehblrf6ze ---- Editorial for the Special Issue on "Digital Humanities"

information Editorial
Editorial for the Special Issue on "Digital Humanities"
Cesar Gonzalez-Perez
Institute of Heritage Sciences (Incipit), Spanish National Research Council (CSIC), Avda. Vigo, s/n, 15705 Santiago de Compostela, Spain; cesar.gonzalez-perez@incipit.csic.es
Received: 8 July 2020; Accepted: 8 July 2020; Published: 10 July 2020

Digital humanities are often described in terms of humanistic work being carried out with the aid of digital tools, usually computer-based. Other disciplinary fields, in for example biology or economics, went through a digital turn a few years or decades ago. Now, many areas of the humanities are going the same way. This is especially so of literary studies, linguistics, and archaeology. Many researchers in the humanities regularly carry out their work in information- and computing-intensive settings, employing techniques and tools that so far have been limited to software engineers or computer scientists [1]. However, there is little consensus on what digital humanities actually are, whether they constitute a new discipline or just a passing fad, or how they change the nature of humanistic enquiry. In this setting, the role of information is especially relevant. As with any other field of study, researchers in the humanities produce large amounts of information that is generated, stored, manipulated, communicated, and visualised through digital means. This Special Issue attempts to contribute to a better understanding of digital humanities by focusing on the role that information plays in humanistic research and, specifically, how humanistic knowledge is generated, communicated, used, and institutionalised through information-intensive tools, techniques, and methods. Relevant issues include how things are documented and described; how natural language is incorporated into the research process; how time, space, subjectivity, change, and multilingualism affect reasoning and knowledge production; how computing techniques (such as big data, artificial intelligence, or information visualisation) can help in the humanities; and, finally, any other aspects of humanistic research that are often performed in information-intensive settings.

The articles in this Special Issue cover a wide range of topics related to information in digital humanities. Some address information issues from an ontological point of view. This includes, for example, "Capturing the Silences in Digital Archaeological Knowledge" [2], which explores non-knowledge, or lack of knowledge, as captured in archaeological datasets.
The article "Linking Theories, Past Practices, and Archaeological Remains of Movement through Ontological Reasoning" [3] proposes new approaches to knowledge generation through the construction of ontologies, with a special focus on movement over a territory. Finally, the article "Ontology-Mediated Historical Data Modeling: Theoretical and Practical Tools for an Integrated Construction of the Past" [4] takes a constructionist approach to the whole life cycle, from knowledge modelling to the development of a software tool, to aid in the study of the past. Other articles take a more specialised approach, such as "Exploring West African Folk Narrative Texts Using Machine Learning" [5], which employs a number of natural language processing techniques to process and compare two corpora of West African folk tales. Additionally, the article "One Archaeology: A Manifesto for the Systematic and Effective Use of Mapped Data from Archaeological Fieldwork and Research" [6] proposes a public sector-oriented approach to managing and sharing archaeological geospatial information. The remaining articles in the Special Issue tackle the very relevant aspect of language and its connection to information generation and use. "Measuring Language Distance of Isolated European Languages" [7] employs corpus-based techniques, as opposed to phylogenetic approaches, to obtain distance measurements between isolated languages in Europe, whereas "Software Support for Discourse-Based Textual Information Analysis: A Systematic Literature Review and Software Guidelines in Practice" [8] produces a systematic literature review of software tools for discourse analysis and introduces some guidelines for developing and adopting these tools.

In summary, this Special Issue on information in digital humanities covers aspects of ontological modelling and reasoning, theorising on the past, natural language, geo-spatial information, and software tools, among others. We hope that these articles help us advance in our understanding of the roles that information plays in humanistic research and practice.

Funding: This research received no external funding.
Conflicts of Interest: The author declares no conflict of interest.

References
1. Gonzalez-Perez, C. Information Modelling for Archaeology and Anthropology; Springer: Berlin/Heidelberg, Germany, 2018.
2. Huggett, J. Capturing the Silences in Digital Archaeological Knowledge. Information 2020, 11, 278.
3. Nuninger, L.; Verhagen, P.; Libourel, T.; Opitz, R.; Rodier, X.; Laplaige, C.; Fruchart, C.; Leturcq, S.; Levoguer, N. Linking Theories, Past Practices, and Archaeological Remains of Movement through Ontological Reasoning. Information 2020, 11, 338.
4. Travé Allepuz, E.; del Fresno Bernal, P.; Mauri Martí, A. Ontology-Mediated Historical Data Modeling: Theoretical and Practical Tools for an Integrated Construction of the Past. Information 2020, 11, 182.
5. Lô, G.; de Boer, V.; van Aart, C.J. Exploring West African Folk Narrative Texts Using Machine Learning. Information 2020, 11, 236.
6. McKeague, P.; Corns, A.; Larsson, Å.; Moreau, A.; Posluschny, A.; Van Daele, K.; Evans, T.
One Archaeology: A Manifesto for the Systematic and Effective Use of Mapped Data from Archaeological Fieldwork and Research. Information 2020, 11, 222.
7. Gamallo, P.; Pichel, J.R.; Alegria, I. Measuring Language Distance of Isolated European Languages. Information 2020, 11, 181.
8. Martin-Rodilla, P.; Sánchez, M. Software Support for Discourse-Based Textual Information Analysis: A Systematic Literature Review and Software Guidelines in Practice. Information 2020, 11, 256.

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

work_g267nfgkpjesdnlkfvfg4yqfwa ---- Scheduling algorithm for the picture configuration for secondary tasks of a digital human–computer interface in a nuclear power plant

Research Article
Scheduling algorithm for the picture configuration for secondary tasks of a digital human–computer interface in a nuclear power plant
Gang Zhang1, Xuegang Zhang1, Yu Luan1, Jianjun Jiang2 and Hong Hu2
1 State Key Laboratory of Nuclear Power Safety Monitoring Technology and Equipment, China Nuclear Power Design Company Ltd, Shenzhen, Guangdong Province, China
2 School of Safety and Environment Engineering, Hunan Institute of Technology, HengYang, HuNan Province, China
Corresponding author: Jianjun Jiang, School of Safety and Environment Engineering, Hunan Institute of Technology, HengYang, HuNan Province 421002, China. Emails: jjjhnit@126.com; jiangjianjun310126@126.com; 13807474256@126.com
International Journal of Advanced Robotic Systems, March-April 2020: 1–11. DOI: 10.1177/1729881420911256.

Abstract
Secondary tasks of a digital human–computer interface in a nuclear power plant increase the mental workloads of operators and decrease their accident performance. To reduce the adverse effects of secondary tasks on operators, a picture configuration scheduling algorithm for secondary tasks is proposed. Based on the research background and operator interviews, a scheduling algorithm process is established, and the variables and constraint conditions of the scheduling process are defined. Based on the scheduling process and variable definitions, this article proposes a picture feature extraction method, a method for counting identical keywords, an arrangement method for queues in a buffer pool, and a picture configuration scheduling algorithm for secondary tasks. The results of simulation experiments demonstrate that the algorithm realizes satisfactory performance in terms of the number of replacements, the average waiting time, and the accuracy.

Keywords
Digital human–computer interface, picture configuration scheduling algorithm, buffer pool, constraint conditions

Date received: 27 November 2019; accepted: 16 February 2020
Topic: Robot Manipulation and Control
Topic Editor: Andrey V Savkin
Associate Editor: Bin He

Introduction
An operator must perform not only his or her primary tasks but also secondary tasks of digital human–computer interfaces (HCIs) in a nuclear power plant (Npp) to deal with an accident.1 The secondary tasks are also known as interface management tasks. Interface management tasks mainly include navigation, configuration, arrangement, interrogation, and automation.2 An operator must execute secondary tasks to support primary tasks because many parameters, navigations, and a substantial amount of information must be configured to correctly deal with an accident. An operator's cognitive resources must be distributed when an accident is being addressed.
If the allocated cognitive resources outweigh the support capability of an operator, task performance will decline,3 because the cognitive resources of any operator are limited. Then, if secondary tasks consume additional cognitive resources of an operator, the mental load and work performance of the operator will be affected.

Compared with the traditional operating control platform, a digital HCI provides operators with abundant information and parameters. The information and parameters on any display are not fixed, and charts and graphs are discontinuous, which increases the cognitive load of operators, consumes their attentional resources, and generates keyhole effects.4 Misreading, misjudgment, and misoperation then easily occur, which increases the probability of human-factor accidents.

With the rapid development of science and technology, artificial intelligence technology has made great achievements. Replacing cumbersome human operation with intelligent, mechanized machines has gradually become reality. The flexibility and intelligence of robot control can make up for the security risks and the lack of efficiency and accuracy of manual operation or inspection. If the pictures for secondary tasks of a digital HCI can be intelligently configured by robot technology, the cognitive resources that operators expend and the time needed to deal with an event will be decreased, and accidents caused by human errors will be reduced; automatic picture configuration is therefore necessary. The three core technologies of an intelligent system are robot technology, artificial intelligence, and digital technology, among which robot technology is the key problem. Within robot technology, software control technology is the core of the whole robot control system.
To decrease the disturbance from secondary tasks, this article studies, on the basis of robot software control technology, a scheduling algorithm that can be used for picture configuration for secondary tasks. When an operator must obtain parameter information, if the operator need not configure the related secondary tasks, he can save time and decrease his cognitive load.

The research achievements regarding secondary tasks are few. Most studies focus on human–machine interfaces (HMIs). In 2011,5 a visual strategy was used to design an interface between a human and a computer. The design strategy keeps human beings in mind and rests on the assumption that the HMI should be as simple as possible. To improve highlighting in an HMI, Anuar and Kim6 proposed a systematic method for an automatic system of Npps. Bhatti et al.7 presented a user-centered design strategy that includes operation contexts and relevant interfaces that are suitable for users and standard designs. In 2015,8 a particle swarm optimization method with weights was proposed for optimizing a complex problem. In 2009,9 input performance, user comfort, and interface layout were studied; the study shows that input and comfort performance can be improved by optimizing the interface layout. Later, the topological structure and integrated design of the component layout and the shape of the HMI were studied based on a finite element network and a collision detection algorithm.10,11 Some scholars studied how the HCI design of warehouse orders affects the perceived load, usability, comfort, and operation performance; the experimental data show that graphical user interfaces can reduce the operation time of tasks and human error.12 In the process of industrial operation, an HCI can help operators get familiar with the plant state and deal with unexpected events; therefore, some scholars put forward the idea of ecological interface design and a dynamic interface design model, which have been applied.13 Aiming at the diversity of device interaction processes, some scholars proposed a multi-objective and multi-mode interaction modeling method based on an interface description language, which could improve the usability of HCI end-user interaction.14 For disabled people who have difficulty moving, some scholars studied an HCI based on the gesture interaction mode; the research used a mobile robot platform, a 3D image sensor, an identification system based on the support vector machine, and vehicle positioning equipment.15 Some scholars studied the HMI design of an enterprise online product trading platform; the experimental results show that color plays an important role in arousing customers and that warm and cool colors influence people differently.16

Through simulative experiments, Kantowitz et al. found that interface management tasks reduced the performance of first tasks and had a direct impact on the reliability with which an operator completes first tasks.17 Tijerina et al. tested whether interface management tasks influenced professional operators of heavy vehicles; that is, interface management tasks had a certain influence on the reliability of professional operators.18 To reduce the adverse impact of the interface management task on the operator, Howard and Kerst proposed that interface management should be organized into a physical space model that can be easily recognized by the methods of path tracking, backtracking, status identification, and scope limitation.19 To improve the readability and visibility of the interface management task and reduce the attention resources allocated by operators, Cook and Woods proposed that the characteristics of the interface management task could be moved to the data area using an analog input device, a data control device, and a computer monitoring system.20 One study confirmed that if two tasks are very similar, there is a learning transfer from one secondary task to another secondary task.21 Against the background of secondary tasks, to explore the combined effect of anxiety, cognitive load, and experience, researchers designed experiments with and without secondary tasks; the experimental situation was set as lower anxiety and higher anxiety. Eleven professionals and 10 novices participated in the experiment; the results show that anxiety causes performance degradation for the novice and that secondary tasks increase mental load and reduce the rate of response.22 In a concurrent eye task, some scholars tested whether a manual secondary task could increase the awareness of eye movement error. The experiment found that the difficulty of a task had no
19 To improve the readabil- ity and visibility of the interface management task and reduce the attention resources allocation of operators, Cook and Woods proposed that the characteristics of interface management task could been moved to the data area using analog input device, data control device, and computer monitoring system. 20 The study confirmed that if two tasks are very similar, there is a learning transfer from one sec- ondary task to another secondary task. 21 Under the back- ground of secondary task, to explore the combined effect of anxiety, cognitive load, experience, the researchers designed experiments with secondary tasks, and without secondary tasks, respectively. The experimental situation is set as lower anxiety and higher anxiety. Eleven profes- sionals and 10 novices participated in the experiment; the results show that the anxiety causes performance degrada- tion for the novice and that secondary tasks increase mental load and reduce the rate of response. 22 In a concurrent eye task, some scholars tested whether a manual type secondary task could increase the awareness of eye movement error. The experiment found that the difficulty of a task had no 2 International Journal of Advanced Robotic Systems effect on the awareness of eye movement error, and the participants’ ability to monitor eye movement improved with the increase of interference. 23 In addition to these studies, other achievements regard- ing the design and evaluation methods of HMI have been realized, such as a virtual environment and a constraint genetic algorithm 24,25 and evaluation methods of HMI. 26–29 Naujoks et al. 30 studied the automation of longitudinal and lateral control during an on-road experiment in everyday traffic. The results demonstrated that driving safety with subjectivity or objectivity was not influenced by the degree of automation. A model for determining the likelihood of a driver’s involvement in secondary tasks based on attributes of driving behavior was developed. The model could be applied in crash investigations to resolve legal disputes in traffic accidents. 31 The descriptions above indicate that secondary tasks give interference for an operator, affect the operators’ execution of first task, increase psychological load, and affect the attention resources distribution. To decrease the mental load and distribution of the cognitive resources for operators, based on robot technology, this article proposes a scheduling algorithm for picture configuration of secondary tasks of HCI in an Npp. The article has two main contributions that are listed as follows: (1) the proposed method can be used to automatically configure pictures, which can reduce the time that is spent dealing with an accident and decrease the men- tal stress of operators, so that the incidence of human-factor accidents can be decreased and (2) the method is established under certain conditions including digital system features and constraint conditions, so the proposed method is more in line with the actual situation. Scheduling process and constraint conditions Scheduling process Based on the research background and operator interviews, the process of picture configuration scheduling mainly includes the following: acquiring priority, organizing data, tracking dynamic processes, and using a replacement algo- rithm. Figure 1 illustrates the process of picture configura- tion of secondary tasks. Constraint conditions of the picture configuration process Notations. 
Notations are listed below: Buffer: a buffer pool that is used to save related pic- tures and primary tasks; Task_fi: an implemented object of the ith primary task; K_time_long_task: implemented objects that have been recently visited; task_sij: the jth picture that is associated with the implemented object of the ith primary task; size(cur_task): the size of the current implemented objects for the ith primary task; size(task_sij): the size of the jth picture that is associ- ated with the implemented object of the ith pri- mary task; size(cur_sec_task): the sum of all pictures that are related to currently running objects; Dynamically tracking the Npp current status and running process of regulations Yes No Testing whether the pool size reaches its maximum Yes No Calculating the priorities of pictures of the primary task Priority>threshold value No Yes Putting a picture into the buffer pool to form a multilevel queue Dynamically changing the order of pictures in a queue Picture displays on one of screens Dequeue Replacing a running object of the primary task and pictures Dynamically maintaining the synchronous change in pictures in the buffer pool and the current plant status Information center Extracting the keywords for running objects of the primary task and pictures from feature library Data mapping Determining whether a programmed pool contains implementation tasks and pictures? Figure 1. Process of the picture configuration scheduling algorithm. Zhang et al. 3 cur_sec_task: all pictures that are related to currently running objects; v_time(task_fi): recent visitation time of implement- ing objects that are related to the ith primary task; cur_task: objects that are being implemented in cur- rent primary tasks; cur_f: the current picture; Suff_size: the size of the buffer pool; F_t_sizei: the stored size of the implemented object for the ith primary task; G_inf_sizeij: the stored size of the jth picture that is associated with the implemented object of the ith primary task; M: the number of implemented objects of the primary task in the buffer pool; Nij: the number of the jth picture that is related to the implemented object of the ith primary task; U_sumij: the number of visitations of the jth picture that is related to the implemented object of the ith primary task; S_f_sumi: the number of visitations of the implemen- ted object of ith primary task; Fti: the visitation frequency of the implemented object of the ith primary task; mti: the importance degree of the implemented object of the ith primary task; Fgij: the visitation frequency of the jth picture that is related to the implemented object of the ith pri- mary task; gmij: the importance degree of the jth picture that is related to the implemented object of the ith pri- mary task; pri_wij: the priority of the jth picture that is related to the implemented object of the ith primary task; fpi: the weight of the implemented object of the ith primary task; w_f: the threshold value of the implemented object weight of the ith primary task; k_w_fi: extracted keyword vector space of the imple- mented objects of the ith primary task; s(k_w_fi): the number of extracted keywords of the implemented objects of the ith primary task; k_w_sij: the keyword vector space of the jth picture that is related to the implemented object of the ith primary task; s(k_w_sij): the number of extracted keywords of the jth picture that is related to the implemented object of the ith primary task; sim(k_w_fi, k_w_sij): the similarity degree between the 
implemented object of the ith primary task and the jth picture that is related to the ith primary task; vfik: the extracted kth keyword of the implemented objects of the ith primary task; vsijp: the pth keyword of the jth picture that is related to the implemented object of the ith primary task; pri_f: priority threshold value; f_c: feature library; c_sum: the number of keywords in the feature library; f_s_p_sij: the number of the identical keywords between the implemented object of the ith primary task and the jth picture that is related to the ith primary task; f(cur_inf)ij: the current status of the plant with the jth- newest picture that is related to the implemented object of the ith primary task; t_s_infij: the current data or parameters of the jth pic- ture that is related to the implemented object of the ith primary task; flag: indicator of whether the implemented object of the ith primary task is changed; changeij: indicator of whether the jth picture that is related to the ith primary task is changed in the running process. Constraint conditions. 1. The buffer pool size must be greater than or equal to the sum of the sizes of the implemented objects of the primary task and the related pictures, which can be expressed as follows suf f size � Xm i¼1 f t sizei þ Xm i¼1 Xnij j¼1 g inf sizeij ð1Þ 2. The visitation frequency of the jth picture that is related to the implemented object of the ith primary task is as expressed in equation (2) f gij ¼ u sumij Pnij i¼1 u sumij ð2Þ Similarly, the visitation frequency of the implemented object of the ith primary task is as follows f ti ¼ s f sumiPnij i¼1 s f sumi ð3Þ 3. The weight of the implemented object of the ith primary task is defined as f pi ¼ f ti � mti ð4Þ 4. The buffer pool is initialized to determine which objects of the primary tasks should be added into it. The condition is expressed as follows f pi � w f ð5Þ 5. The similarity degree between the implemented object of the ith primary task and the jth picture that is related to the ith primary task is defined as simðk w f i; k w sijÞ¼ f s p sij sðk w f iÞþ sðk w sijÞ ð6Þ 4 International Journal of Advanced Robotic Systems 6. The priority is calculated via equation (7) pri wij ¼ 0:7 � simðk w f i; k w sijÞþ 0:2f gij þ 0:1gmij ð7Þ 7. The sum of the sizes of the implemented objects, all pictures that will be added into the buffer pool in the immediate future and all pictures that are currently in the buffer pool must be less than or equal to the buffer pool size, which can be expressed as follows sizeðcur taskÞþ sizeðcur sec taskÞ þ Xm i¼1 task f i þ Xm i¼1 XN ij j¼1 sizeðtask sijÞ <¼ suf f size ð8Þ For pictures or tasks in the buffer pool: ffi If pri_wij¼pri_f, Task_fi is added into the ith queue of the buffer pool and the queue is reordered. (9) The values of “flag” are defined as follows: ffi If flag ¼ 1, the current implemented object should be added into the buffer pool and it will be ready for configuring the pictures that are related to the implemented object. ffl If flag ¼ 0, pictures that are related to the imple- mented object continue to be configured. (10) The values of changeij are defined as follows: ffi If changeij ¼ 1, the configured pictures should be timely updated to keep pace with the current plant running status. ffl If changeij ¼ 0, pictures are not updated. 
Picture configuration scheduling algorithm

Scheduling process
The scheduling process, which is illustrated in Figure 1, mainly includes determining the priorities of each relevant picture and dynamically arranging the pictures and tasks in a buffer pool. These steps are described in the following.

Calculating the priority of each picture. According to the constraint conditions above, equation (7) can be used to calculate the priority of each picture. Equation (7) consists of three parts: (1) the similarity degree between the implemented objects of the primary task and the pictures; (2) the visitation frequency of the pictures; and (3) the importance degrees of the pictures. The visitation frequency of the pictures can be calculated via equation (2). The importance degree of the pictures can be obtained via operator interviews and expert judgments. The similarity degree can be obtained via equation (6). For equation (6), two steps must be conducted: (1) extracting the picture information keywords that are associated with the implemented objects of the current primary task from a feature library and (2) calculating the number of identical keywords. A feature library is established and improved by domain experts, supervisors, and advanced operators. Extraction of the keywords and calculation of the number of identical keywords can be conducted by the following two algorithms.

1) Algorithm for extracting keywords from a feature library
(1) Algorithm process
The algorithm steps are as follows:
(a) Successively search for the current primary tasks in the feature library.
(b) If the current primary tasks that are being implemented are identified, their keywords will be extracted; otherwise, return to step (a).
(c) Add the ith primary task keywords into a vector space (k_w_fi).
(d) Successively search for the current jth picture keywords from the ith primary task keyword vector space (k_w_fi).
(e) If the current jth picture is identified, the pth keyword of the jth picture will be extracted; otherwise, return to step (d).
(f) Add the pth keyword into a vector space (k_w_sij).
This algorithm process for extracting keywords is illustrated in Figure 2.
(2) Pseudo code for extracting keywords from a feature library
Feature_extract_algorithm()
Begin
  i := 1; k := 1;
  While (i <= c_sum) do
    If (task_fi = cur_task) then
    Begin
      While (k <= s(k_w_fi)) do
      Begin
        k_w_fi <- vfik;
        k := k + 1;
      End;
      Break;  { stop once the task's keywords have been extracted }
    End
    Else i := i + 1;
  m := 1; p := 1;
  While (m <= c_sum) do
    If (task_sij = cur_f) then
    Begin
      While (p <= s(k_w_sij)) do
      Begin
        k_w_sij <- vsijp;
        p := p + 1;
      End;
      Break;  { stop once the picture's keywords have been extracted }
    End
    Else m := m + 1;
End.
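For readers who prefer an executable form, the lookup can be sketched in Python as follows; the dictionary-backed feature library and the entry names are hypothetical stand-ins for the expert-built library described above:

    # Sketch only: a dict stands in for the feature library f_c; keys are
    # task or picture identifiers, values are their keyword vectors.
    feature_library = {
        "set RCV 046VP on AUTO": ["RCV", "046VP", "AUTO", "charging"],
        "RCV002YCD": ["RCV", "boron", "makeup", "046VP"],
    }

    def extract_keywords(identifier):
        # Mirrors Feature_extract_algorithm(): locate the entry, then copy
        # out all of its keywords; unknown identifiers yield an empty list.
        return list(feature_library.get(identifier, []))

    k_w_fi = extract_keywords("set RCV 046VP on AUTO")   # task keywords
    k_w_sij = extract_keywords("RCV002YCD")              # picture keywords

A hash lookup replaces the linear scan over c_sum entries in the pseudo code, dropping each query from O(c_sum) to roughly O(1), which matters as the feature library grows.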
Two problems must be solved for picture configuration and primary tasks in a buffer pool: (1) arranging them in order and (2) dealing with the dynamic process when the latest pictures and tasks arrive to the buffer pool. The solutions of the two problems are described in the following sections. (1) Arranging the pictures and primary tasks in a buf- fer pool The proposed process for arranging the pictures and primary tasks is as follows: ffi based on corresponding Successively search for the current primary tasks in a feature library task_fi==current task? When k<= s(k_w_fi) K_w_fi←vfik; k=k+1 yes i=i+1 No Successively search for the current jth picture keywords in the ith primary task keyword vector space taskij= current picture? When p<= s(k_w_sij) k_w_sij←vsijp; p=p+1 yes m=m+1 No Figure 2. Algorithm process for extracting keywords. Find k_w_fi Find k_w_sij k<=s(k_w_fi) f_s_p_sij= f_s_p_sij+1 p<=s(k_w_sij) k_w_fi[vfik]==k_w _sij[vsijp] Yes Yes p=p+1 k=k+1 End no no Figure 3. Algorithm process for calculating the number of identical keywords. 6 International Journal of Advanced Robotic Systems accidents, the implemented objects of primary tasks for which the weights are greater than or equal to the threshold values are added into the buffer pool, and objects that were implemented earlier are arranged with higher priority in the queue; ffl all pictures that are related to the implemented objects of the primary tasks are searched; � all relevant pictures are arranged in order of their priorities to build a navigation path; and Ð if the sum of the sizes of all tasks is greater than the buffer pool size, then the pictures that are arranged behind other pictures in the same queue will be removed from the buffer pool. The multilevel queues structure of primary tasks and relevant pictures is illustrated in Figure 4. (2) Dealing with the dynamic process when the latest pictures and tasks arrive to a buffer pool If the sum of the sizes of all implemented objects of primary tasks and pictures in the buffer pool is greater than or equal to the buffer pool size, a few implemented objects and relevant pictures in the buffer pool will be replaced by other objects or related pictures. The replacement process is realized via an algorithm, which has the following algorithm process: ffi before the latest pictures and tasks are added into the buffer pool, the sums of the sizes of the buffer pool and tasks, respec- tively, must be calculated; ffl if equation (8) holds, the pictures or tasks will be directly added to the end of a queue of the buffer pool, where the queue structure is illustrated in Figure (4); � if equation (8) does not hold, before new pictures or tasks are added into the queues in order, a few pictures or tasks must be removed from the queues, namely the pictures or tasks that have been in the queues for the longest will be replaced by pictures or tasks that should be implemented as early as possible. The pseudo code of the replacement algorithm is as follows: Rep_task_algorithm() Begin If Eq. (8) then Those pictures or tasks are added into the queues; else Begin K_time_long_task¼task_f1; For i¼2 to m do If(v_time(task_fi)> K_time_long_task) then Begin K_time_long_task¼task_fi; v¼i; i¼iþ1; end; i¼v; task_fi$cur_task; task_fij$cur_sec_task; order(cur_sec_task); end End Picture configuration scheduling algorithm The process of the picture configuration scheduling algo- rithm is illustrated in Figure 1. 
According to Figure 1, the definitions of the constraint conditions and Scheduling process section, the picture con- figuration scheduling algorithm of the digital HCI in an Npp is defined as follows: Scheduling_Algorithm_picture_configuration() Begin Initialize W_f an initial value; pri_f an initial value; repeat For i ¼ 1 to the total quantity of objects to be executed do Begin mti specify a value; fgij according to Eq. (3); fpi according to Eq. (4); if (Eq. (5))then add task_fi into a buffer; for j¼1 to Nij do begin call Feature_extract_algorithm(), which was proposed in this article; call Calculate_key_sum(), which was proposed in this article; sim(k_w_fi, k_w_sij) according to Eq. (6); pri_wij according to Eq. (7); if (Eq. (1))then continue; else break; end if add task_sij into the buffer to form a navigation path; Task_f1 Task_f2 Task_fm Task_s11 Task_s12 Task_s1j Task_s21 Task_s22 …………………………………………………… ………………………… Task_sm1 Task_sm2 Task_smj …… … … Task_s2j … Figure 4. Queue construction of implemented tasks and pictures. Zhang et al. 7 end if; end; end; until(Eq. (8) is false) (2) Function pseudo codes of the running process Check the current plant status and regulations For i¼1 to m do begin If(cur_task¼task_fi) then For j ¼ 1 to Nij do Begin If(cur_f¼task_sij) then if(pri_wijpri_f then Add cur_f into buffer; re_order(buffer, task_sij); else goto L1; End if; For i ¼ 1 to m do Begin For j¼1 to Nij do Begin If changeij¼1 then Update(task_sij); Mapping(plant_data task_sij); End if; End; End; End Performance analysis Experimental background To evaluate the performance of the picture configuration scheduling algorithm, related experiments are conducted by the authors. A steam generator tube rupture (SGTR) accident in an Npp is used for illustration. As task points are more in SGTR accident, 10 task points were selected for the convenience and standard of experimental procedures, experimental participants mainly deal with these task points and the relevant pictures are obtained from DOS regulations of SGTR accidents. The task points are listed in Table 1. Each picture is represented by a number, as presented in Table 2. Experiment description Participants in the experiment must obtain parameters, evaluate the plant status, decide to how to deal with or restore an accident site, and access branches of accident regulations. To compare the time performance including configuring pictures and manual approach, picture con- figurations are scheduled via the proposed algorithm and participants in the experiment, respectively. Ten stu- dents from Hunan Institute of Technology participated Table 1. Task points. Number Task description 1 Confirm: Confirm RCV 017VP on RCV 0002BA (BY-pass demineralizers RCV) 2 Confirm REA on AUTO makeup the Boron concentration of the primary system 3 The volume of REA Boron tanks 4 Set RCP 404KU X the value of no load Set point (20% of �4 m) 5 Set RCV 046VP on AUTO 6 Reset CIB signal by RPA 284KG and RPB 284KG 7 Reset SI signal by RPA 060KG and RPB 060KG 8 Confirm the reactor trip by RPA 300TO and RPB 300TO 9 Check that all the CIA values are close 10 Confirm that RIS 061VP and 062VP are open Table 2. Picture numbers. Picture number 1 RIC003YCD 2 RCV002YCD 3 REA001YCD 4 ECP002YCD 5 TEP003TCD 6 RCV001YED 7 RCP002YCD 8 EPP002YFU 9 RIS100YFU 10 EAS100YFU 11 RGL001YCD 12 EPP001YFU 13 LHP001YCD 14 LHQ001YCD 15 DOS10AYST 8 International Journal of Advanced Robotic Systems in the simulative experiment; they were divided into five groups and were trained for 2 days. 
The experiment was conducted 10 times. Each group is required to do two trials. In the experiment, some parameters have dynamic values, such as Nij, U_sumij, S_f_sumi, Fti, Fgij, mti, and gmij. The dynamic values may be obtained during the simulative experiment according to related tasks. The initial values of a few parameters must be specified directly. Two initial values are set as w_f ¼ 0.5 and pri_f ¼ 0.5. Most parameter values are obtained or dynamically changed according to the running process of the SGTR accident. The experimental process is based on Figure 1. The simulation platform that is used for the experiment is Windows 7, with an i7-6700 CPU, 8 G RAM, and disk space of 500 GB. The experimental results are the mean values of all experimental data. Performance analysis The performance of the picture configuration scheduling algorithm is analyzed from several perspectives according to the experimental data. (1) The change curves of the numbers of replace- ments, which are plotted in Figure 5. Replacement is viewed as a process, namely ffi lower correlation pictures with current task are removed from buffer pool; ffl more correlation pictures with current run- ning task will get into the buffer pool. According to Fig- ure 5, the numbers of replacements of (a) and (b) are 0 when the number of tasks is 4, which is the optimal case. Fewer replacements correspond to less time being spent on picture configuration. Comparing with the least recently used and least fre- quently used methods, the algorithm proposed in this article conducts few replacements, which indicates the algorithm proposed has good performance on replacements. It is shown in Figure 5 that the number of replacements will increase with the number of tasks, which accords with the actual scenario, as the size of a buffer pool is fixed and the probabilities that relevant pictures are not in the buffer pool increase with the number of tasks. (2) Picture average waiting time, which is plotted in Figure 6. Waiting time is viewed as an interval, namely, it is after picture is get into the buffer pool, until is automatically configured on a screen. According to Figure 6, the picture average waiting time in the experiments with the algorithm that is proposed in 4 6 8 10 12 14 16 18 20 0 2 4 6 8 10 12 Number of tasks The algorithm proposed in this paper Least Recently used(LRU) Least Frequently used(LFU) 4 6 8 10 12 14 16 18 20 0 1 2 3 4 5 6 7 8 9 Number of tasks T h e a ve ra g e r e p la ce m e n t tim e s T h e a ve ra g e r e p la ce m e n t tim e s The algorithm proposed in this paper Least Recently used(LRU) Least Frequently used(LFU) (a) (b) Figure 5. Change curves of the numbers of replacements: (a) replacements when the buffer pool size is 3 and (b) replacements when the buffer pool size is 5. 1 2 3 4 5 6 7 8 9 10 4000 4500 5000 5500 6000 6500 The times of exeriment A ve ra g e w a iti n g t im e o f p ic tu re Algorithm proposed in this paper Short job first, SJF Highest Response Ration Next,HRRN First-come First served,FCFS Figure 6. Picture average waiting time (ms). Zhang et al. 9 this article is approximately 5200 ms. Comparing with the highest response ratio next (HRRN) and first come first served (FCFS) methods, the algorithm that is proposed in this article has a shorter waiting time; however, comparing with the shortest job first (SJF) method, it has a longer waiting time. Shorter waiting time means that time cost of picture configure is less. 
By and large, the algorithm performance on average waiting time is good. (3) Time cost analysis According to Figure 7, the time cost of the scheduling algorithm is far less than the time cost of the manual approach for picture configuration; hence, the scheduling algorithm outperforms the manual approach. If time cost of picture configuration is decreased, then psychology pres- sure of operators is decreased, and then accident safety can be improved. (4) Accuracy of picture configuration, which is plotted in Figure 8. According to Figure 8, the accuracy of the picture con- figuration scheduling algorithm proposed in this article is approximately 85%; hence, it is reliable. Comparing with the SJF, HRRN, and FCFS methods, the algorithm that is proposed in this article is more accurate. Conclusions This article discusses how secondary tasks in a digital HCI increase the mental loads of operators and analyzes the advantages that pictures were intelligently configured by robot technology. In this article, based on robot technology, a picture configuration scheduling algorithm of secondary tasks is obtained. All relevant variables of the scheduling algorithm are defined. Mathematical expressions for sev- eral constraint conditions are established. In addition, sev- eral algorithms for extracting information features, counting identical keywords, and configuring pictures of secondary tasks were proposed. The simulative experiment analysis results demonstrate that the picture configuration scheduling algorithm realizes satisfactory performance. Most of the data obtained via the simulation experiments reflect the algorithms’ performances for picture configura- tion, such as correctness, number of replacements, and waiting time. However, the participants are students; hence, the time that is spent on configuring pictures manu- ally might exhibit small deviations. However, the devia- tions have little effect on the performance of the scheduling algorithm, as the time difference between the manual approach and the scheduling algorithm is very large. Thus, the small deviations have no readily observa- ble effects on the difference in time cost between the man- ual approach and the scheduling algorithm. In the future, the constraint conditions will be further improved accord- ing to feedbacks in application process; the algorithm will be extended to other fields. Declaration of conflicting interests The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. Funding The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported in part by Hunan Provincial Natural Science Foundation of China (2019JJ40066, 2017JJ4019), the Social and Science Fund of Hunan Province of China (XSP18YBZ035), the Scientific Research Foundation of Hunan Institute of Technology of China (HQ19009), The key laboratory of Hunan Provin- ce(2019TP1020), China. Figure 7. Picture configuration times of the manual approach and the scheduling algorithm. 1 2 3 4 5 6 7 8 9 10 0 10 20 30 40 50 60 70 80 90 100 Experiment times A cc u ra cy (% ) Algorithm proposed in this paper Short job first, SJF Highest Response Ration Next,HRRN First-come First served,FCFS Figure 8. Accuracy of the picture configuration scheduling algorithm. 10 International Journal of Advanced Robotic Systems ORCID iD Jianjun Jiang https://orcid.org/0000-0001-5856-2055 References 1. 
work_g2jcebslp5elfmzdsxwrfoxmh4 ---- Umanistica Digitale - ISSN:2532-8816 - n.8, 2020

DOI: http://doi.org/10.6092/issn.2532-8816/9959

The Use of Blockchain for Digital Archives: a comparison between Ethereum and Hyperledger

Angelica Lo Duca, Clara Bacciu, Andrea Marchetti
IIT CNR, Pisa, Italy
angelica.loduca@iit.cnr.it, clara.bacciu@iit.cnr.it, andrea.marchetti@iit.cnr.it

Abstract

In recent years, blockchain technology has been spreading on a progressively larger scale in various research sectors, including Cultural Heritage. Different types of blockchain exist, which can be classified either according to the type of users that can access them or according to the features they offer. This article describes a theoretical study in which two very different blockchains, Ethereum and Hyperledger, are compared in order to determine which of the two is more suitable for storing tangible heritage contained in digital archives. After a brief description of the two technologies, a fairly generic application scenario is described in order to understand which of the two technologies best meets the requirements of the scenario.
The comparison between the two blockchains is therefore carried out on the basis of general issues, architectural requirements and various other considerations. As a result of the comparison, it emerges that Hyperledger Fabric is more suitable in the context of digital archives.

Introduction

Recently, the diffusion of applications based on blockchain technology [21], [25] has been increasing rapidly. The original focus of these technologies was cryptocurrencies (e.g., Bitcoin), but it is shifting to finance and business in general, and blockchain is progressively being extended to a variety of applications in healthcare, government, the Internet of Things, entity and asset management, and eventually Cultural Heritage. In particular, a blockchain could be a good solution to store, protect and preserve over time data about tangible heritage, especially minor tangible heritage, i.e. artistically relevant artworks that are not as famous as masterpieces. For example, in case of disasters (either natural or man-made), the fact that the blockchain is a replicated registry can be exploited to retrieve information that would otherwise be lost forever. In addition, information contained in the blockchain cannot be erased or tampered with, so, in case of theft of the real artwork, the related data will remain available and could be used to recognise the work if someone tries to sell it, and to detect counterfeits. Using blockchain to store digital archives of artworks thus constitutes a promising field of application.

This paper is an extension of the work illustrated in [2]. In particular, it describes the challenges and requirements of storing a digital archive of artworks in a blockchain. In addition, a possible framework based on blockchain is illustrated. The paper then describes a comparison between two blockchains, Ethereum (https://www.ethereum.org/) [24] and Hyperledger Fabric (https://www.hyperledger.org/projects/fabric) [5], used as the main framework for digital archives. A preliminary implementation of a framework for digital archives, based on Ethereum, has already been defined in [6], [3]. On the basis of the described framework, a selection of comparison criteria is made, including general issues related to blockchains, architectural requirements applied to the specific architecture, and other considerations. As a result of the comparison, we can say that Hyperledger Fabric fits the proposed scenario better because it is more configurable.
However, due to its popularity, Ethereum still remains a good solution.

In addition to Ethereum and Hyperledger there are other implementations of blockchain technology. Some of the most important are Bitcoin [18], Corda (https://www.corda.net/) and Quorum (https://www.jpmorgan.com/global/Quorum). Bitcoin was the first blockchain: based on open source code, it implements a decentralized digital cryptocurrency where transactions are validated by miners through a rewarding process. Corda is an open source blockchain platform designed mainly for business applications. Similarly to Corda, Quorum, which is based on Ethereum with added controls for permissions and privacy, is a blockchain envisaged mainly for business applications. A complete comparison among the most important blockchains is done in [15]. The choice of which blockchain should be used to store digital archives depends mainly on two aspects: firstly, the blockchain should be general, i.e., not limited to financial applications; secondly, it should be popular, i.e. technically mature and with a community guaranteeing long-term sustainability. This paper compares only Ethereum and Hyperledger Fabric, mainly because they represent the two main ways of implementing a blockchain: on the one hand, Ethereum is the most representative example of all permissionless blockchains; on the other hand, Hyperledger represents permissioned blockchains. Both Ethereum and Hyperledger can be applied to specific scenarios through the use of smart contracts/chaincodes. Bitcoin is a digital currency, and its programmability is very limited. Corda, instead, is a more recent blockchain, still not as established as Ethereum and Hyperledger Fabric. Quorum is a specific implementation of Ethereum, thus some considerations made for Ethereum are valid also for Quorum.

Related Works

The problem of managing records through a blockchain has been largely investigated during the last few years. In her paper, Lemieux proposes a classification of blockchain applications [16], based on which information is stored in the blockchain: a) mirror type, b) digital record type, c) tokenized type.

Mirror type

In the mirror type, the blockchain serves as a mirror, which stores only records' fingerprints. The complete information of a record is stored in an external repository, and the blockchain is used only to verify the records' integrity. In [9] the authors describe a first implementation of a decentralized database for the storage of descriptive metadata related to digital records, based on the combination of blockchain and IPFS technologies. In their paper, Liang et al. describe ProvChain [17], a system which guarantees data provenance in cloud environments. Vishwa et al. [23] illustrate a blockchain-based framework which guarantees copyright compliance of multimedia objects by means of smart contracts.

Digital record type

In the digital record type, the blockchain is used to store all the records in the form of smart contracts. In [4] the authors illustrate a distributed and tamper-proof framework for media: each medium is represented by a watermark, which is first compressed and then stored in a blockchain.
Approved modifications to media are stored in the blockchain, thus preventing tampering. In [8] the authors describe Archain, a blockchain-based archive system which stores small-sized records. Multiple roles are defined in the system, allowing record creation, approval and removal.

Tokenized type

In the tokenized type, records are stored in the blockchain and linked to a cryptocurrency. Adding, updating or removing a record has a cost. This constitutes an innovative case, where the literature is not yet consolidated. An example of this type of blockchain is represented by the Ubitquity project (http://ubitquity.io/brazil_ubitquity_llc_pilot.html), which records land transactions on behalf of companies and government agencies.

Background

The concept of Digital Archive

A digital archive is a repository of digital records that need long-term or even permanent preservation for their cultural, historical, or evidentiary value. In digital archives, a record can be anything holding a piece of information in the form of a digital object, such as texts, images, pictures, videos and audio. This paper focuses on digital archives which contain collections of minor artworks. Minor artworks are artistically relevant works that are not as well known as famous masterpieces, or that belong to the so-called minor arts, such as books and manuscripts, pottery, lacquerware, furniture, jewellery, or textiles. Examples of minor tangible heritage could be items kept in small libraries or countryside churches, or even in private households.

The creation, management and sustainability of a digital archive is not an easy task, because there is a series of issues that must be taken into consideration [1], [13], [26], [11]. The InterPARES (International Research on Permanent Authentic Records in Electronic Systems) series of projects [22] focused on creating policies and guidelines for making and maintaining digital records, including authenticity requirements for record systems and the long-term preservation of digital records. A digital archive is subject to obsolescence, in the sense that the hardware supports on which it is stored change over time (from the floppy disk to the Internet cloud); thus a digital archive needs long-term preservation, i.e. digital artworks should remain accessible for a long period of time, depending on legal, regulatory, operational, and historical requirements. Secondly, every artwork of the digital archive must be associated with different kinds of metadata (descriptive, structural, administrative), which should be kept up to date by authorized, accountable persons. This means, on the one hand, that all the operations on the digital archive should be documented in an open and verifiable manner (transparency); on the other hand, artworks should be protected against forgery and identified correctly in case of loss and subsequent discovery (anti-counterfeiting). Thirdly, records of the digital archive are stored in different media formats, each defined by its own software and hardware. A digital archive should guarantee the availability of all the formats, i.e. artworks should be efficiently and accurately retrieved. Finally, there are also other aspects that must be considered, such as corruption and loss of information, which require protection, integrity and traceability of artworks.
Integrity ensures that the digital description of the artwork is not subject to unauthorized changes. Protection makes it possible to safeguard the digital description of the artwork in case of natural disasters and/or attacks (it is obviously impossible to protect the physical work with IT tools alone). Traceability makes it possible to trace all movements of individual artworks. Another relevant issue concerns how difficult it can be to find and access repositories, due to inconsistent description practices among different archives. ISAD(G) (General International Standard Archival Description) [7] is a standard that addresses this issue and gives guidelines, to be used in conjunction with existing national standards, for the preparation of archival descriptions that are effective in presenting the content of archival material, so that it is easily identifiable and accessible. This description creates a hierarchy of metadata related to the entire archive, as opposed to metadata related to each record.

An overview of blockchain technology

A blockchain is a particular implementation of a Distributed Ledger (DL). A DL is essentially a database which is shared among different nodes of a network. In practice, all the nodes of the network share the same copy of the database, and any change made on a node is replicated to all the other nodes in a few minutes and, in some cases, even in a few seconds. A DL can be public (as opposed to private) if any node can read the content, and permissionless (as opposed to permissioned) if any node can write content (Table 1). The protocol for the first functioning blockchain was introduced in 2008 to support the digital cash Bitcoin, and it implements the ledger as a chain of blocks. Each block contains data, a timestamp and a cryptographic hash of the previous block. In this way, the integrity of the information stored in the blockchain is protected through a security system based on cryptography. With respect to a standard database, a blockchain is an append-only register: information can only be added to the database, not removed. Modifications to the stored data are made by re-uploading a new version of the data. A distributed consensus algorithm is used to decide which updates to the ledger are to be considered valid. New participants (nodes) can start collaborating in the maintenance of the repository by following this algorithm. There is no need for a central authority or for trust between nodes; the consensus algorithm and cryptography guarantee the correctness of the data even in the presence of some malicious nodes. Each block is made tamper-resistant by adding in its header a cryptographic signature of the data it contains (usually a hash of the content), as well as a link to the previous block of the chain (the cryptographic hash of that block). In this way each block depends on the content of all the previous blocks, making it impossible to modify the data contained in old blocks without rewriting the new ones. Initially designed for financial transactions, blockchain technology can be used to record anything of value. Even executable code can be stored in the blockchain, in the form of so-called smart contracts.
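To make the hash chaining just described concrete, here is a minimal Python sketch of an append-only, hash-linked register. It is a toy illustration under simplified assumptions (no consensus, no signatures, no networking), not the implementation of any real blockchain:

```python
import hashlib, json, time

def block_hash(block):
    # Hash the block's full content, including the link to the previous block.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain, data):
    block = {
        "data": data,
        "timestamp": time.time(),
        "prev_hash": block_hash(chain[-1]) if chain else "0" * 64,
    }
    chain.append(block)
    return block

def is_valid(chain):
    # Each block must point to the hash of the block before it;
    # altering old data breaks every later link.
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
append_block(chain, {"artwork": "id-001", "action": "created"})
append_block(chain, {"artwork": "id-001", "action": "metadata updated"})
print(is_valid(chain))                    # True
chain[0]["data"]["action"] = "tampered"
print(is_valid(chain))                    # False: tampering is detected
```

Because each block embeds the hash of its predecessor, altering an old record invalidates every later link, which is the property the integrity and traceability requirements discussed above rely on.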
A smart contract is not necessarily the transposition of a real contract; it is just code that is executed by all the nodes of the blockchain network, and the result of the computation is stored after a consensus is reached. A transaction carrying the payload of the contract is first broadcast to the network. Its result is the deployment of the payload as code, linked by its public address. Any new transaction can then refer to this address to trigger the execution of the functions inside the contract. The big advantage of a blockchain is that it is an immutable, distributed, always available, secure and publicly accessible repository of data. The main issues with blockchain implementations of distributed ledgers are scalability and efficiency: often, the consensus algorithms used to guarantee consistency are expensive in terms of time and resources. In some cases a certain level of trust among participants can be assumed, and simpler consensus algorithms can then be used.

Key technical choices in blockchain technology include: 1) permission design, i.e., whether permission is needed to access the blockchain; 2) the choice of consensus algorithm, i.e., how a new block is added to the blockchain; 3) whether or not to use smart contracts, i.e., whether to use the blockchain as a virtual machine where programs representing business processes are run; 4) whether or not to use a cryptocurrency, i.e., whether the consensus algorithm and smart contract operations depend on an artificial currency. These technical choices often result from the governance model that has been chosen for the ecosystem of participants.

Table 1: Types of blockchains according to who can access what.
- Can read: if only some nodes can read, the blockchain is private; if all can read, it is public.
- Can write: if only some nodes can write, the blockchain is permissioned; if all can write, it is permissionless.

Ethereum

Ethereum is a public open-source blockchain platform that has the capability of running so-called decentralised applications (dApps). At the moment, the consensus algorithm is based on Proof of Work. Mining nodes generate a cryptocurrency named Ether that is used to pay for transactions. The key characteristic of Ethereum is that it is a programmable blockchain: it provides a virtual machine (the EVM) that can execute user-generated scripts (smart contracts) using the network of nodes. Smart contracts are usually written in the Solidity language (https://solidity.readthedocs.io), though alternatives exist; they are compiled to EVM bytecode and deployed to the blockchain for execution. Contract computation consumes gas, which is paid for by spending Ether. Smart contracts are the foundation of dApps. The diagram in Figure 1 shows the simplified architecture of dApps: there is no central server to which every Web browser has to connect; instead, each one has its own instance of the application. Ethereum functions both as storage for data and code, and as the machine that executes the code.

The Ethereum Network

The Ethereum network is a public distributed network with two types of nodes: full nodes and lightweight nodes. Full nodes contain the whole blockchain, i.e. all the validated transactions. Some full nodes, called miners, are also responsible for transaction validation; miners can also be grouped in pools. Lightweight nodes contain a subset of the blockchain and rely on full nodes for missing information. Examples of lightweight nodes are e-wallets, i.e. electronic devices or apps which make it possible to issue transactions.
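To illustrate how a client application might read chain state from a full node, the following sketch uses the web3.py client library. The endpoint URL is a placeholder, attribute spellings vary slightly between web3.py releases (e.g. isConnected vs. is_connected), and the fee arithmetic anticipates the "Ether and Gas" paragraph below:

```python
from web3 import Web3

# Placeholder endpoint: a local full node or a hosted provider such as Infura.
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))

if w3.is_connected():                     # `isConnected()` in older releases
    latest = w3.eth.get_block("latest")   # read chain state from the full node
    print("Block number :", latest["number"])
    print("Gas used     :", latest["gasUsed"])

    # Fee arithmetic: the fee paid for a transaction is the gas it consumes
    # multiplied by the gas price at which it was mined.
    gas_price_wei = w3.eth.gas_price
    fee_wei = 21_000 * gas_price_wei      # 21,000 gas: cost of a plain transfer
    print("Transfer fee :", w3.from_wei(fee_wei, "ether"), "ETH")
```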
[Figure 1: The architecture diagram of an Ethereum dApp.]

Ether and Gas

As already said, Ethereum is a cryptocurrency-based blockchain, and the cryptocurrency used is called Ether (ETH). The price of 1 ETH is 182.31 USD (updated on October 31, 2019). Together with Ether there is also Gas, which is used to pay for computational resources in the network (the gas fee). The current value of Gas is called the gas price. Every smart contract has an associated gas limit, which is the maximum amount of gas it can consume.

Hyperledger

Hyperledger is an open source effort aimed at advancing cross-industry blockchain technologies. It focuses on developing different blockchain frameworks and modules to support global enterprise solutions. Hyperledger blockchains are generally permissioned blockchains, which means that the parties that want to join the network must be authenticated and authorized. The focus of Hyperledger is to provide a transparent and collaborative approach to blockchain development. Within Hyperledger there are eight different technology code projects, which follow a common set of development principles: five distributed ledger frameworks and three support modules. The Hyperledger frameworks include:

- an append-only distributed ledger;
- a consensus algorithm for agreeing to changes in the ledger;
- privacy of transactions through permissioned access;
- smart contracts to process transaction requests.

In this paper only Hyperledger Fabric is described, because it is the most widespread. The Hyperledger Fabric blockchain is a distributed system consisting of many nodes that communicate with each other. Figure 2 shows the Hyperledger Fabric model.

[Figure 2: The Hyperledger Fabric model. Client A defines a chaincode (contract) through a transaction. Once the transaction is approved, client B can invoke methods contained in the chaincode through another transaction.]
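Real Fabric chaincode is written in Go, Node.js or Java against the Fabric shim APIs; the following Python sketch only mimics the invoke/world-state pattern of Figure 2, with a plain dictionary standing in for the channel's world state:

```python
# Toy imitation of a Fabric-style key-value chaincode: transactions invoke
# named functions that read and write the channel's world state. The `state`
# dict stands in for that world state; real chaincode runs on endorsing peers
# and uses the Fabric shim's getState/putState instead.

class AssetChaincode:
    def __init__(self, state):
        self.state = state

    def invoke(self, fn, *args):
        # Dispatch a transaction to the named chaincode function
        # (Figure 2's "invoke methods contained in the chaincode").
        return getattr(self, fn)(*args)

    def create(self, key, value):
        if key in self.state:
            raise ValueError(f"{key} already exists")
        self.state[key] = value

    def read(self, key):
        return self.state[key]

world_state = {}
cc = AssetChaincode(world_state)
cc.invoke("create", "asset-1", {"owner": "client A"})
print(cc.invoke("read", "asset-1"))   # {'owner': 'client A'}
```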
In order to verify a transaction, each transaction is sent to one trusted validator, which broadcasts it to all the other validators of the network. All the validators reach consensus (using a specific algorithm) on the order to follow to execute all the transactions. Then each validator runs the transactions on its own, following the established order and builds a block with all the executed transactions. Since the execution of transactions is deterministic, all the validators build exactly the same block. Finally, the validators asynchronously notify the client application of the success or failure of the transaction. Clients are notified by each validator. The model of blockchain for digital archives The use of blockchain for digital archives guarantees a mechanism to access, manage and protect cultural heritage on a daily basis and at times of disasters (due for example to climate change or man-made). The blockchain-based framework should be designed both for minor tangible heritage and major tangible and intangible heritage. Thanks to the append-only-register property of the blockchain, the framework provides a layered protection and conservation means for cultural heritage. The framework exploits also some specific advantages of blockchain (integrity, transparency and authenticity of records) to allow the secure storage of minor tangible heritage contained in digital archives. The framework integrates also technologies for a distributed record storage, such as the InterPlanetary File 153 Umanistica Digitale - ISSN:2532-8816 - n.8, 2020 System7 [14], in order to guarantee the digital preservation and transmission of tangible and intangible heritage from generation to generation. The use of these storage technologies permits the development of a sustainable protection and enhancement of values as well as the long-term management of cultural heritage at risk. Thanks to the benefits of blockchain and the distributed technologies for record storage, the framework should improve sustainable access to digital heritage by contributing to the resilience of our societies in terms of: helping users to preserve the memory of cultural heritage in case of destruction of the physical artwork, due to natural disasters or man-made disasters, facilitating the restoration and/or the reconstruction of damaged heritage, thanks to the information contained in the ledger, preventing malicious changes to the ledger, registering temporary movements of movable heritage, for example for exhibits. Requirements The section The concept of Digital Archive describes the general requirements of a digital archive (long-term preservation, transparency, anti-counterfeiting, protection, integrity and traceability). The use of blockchain for digital archives should guarantee also the following architectural requirements: interoperability: this aspect should guarantee that the blockchain can easily interoperate with external modules, such as web interfaces and external storage (i.e. IPFS); customizable infrastructure: the system should guarantee that the underlying infrastructure is customizable, e.g. the number of nodes and costs can be decided independently; roles: the blockchain should define different users roles, according to what specified in the previous section. queries: this aspect refers to the ability to search data in the blockchain, e.g. search an artwork by title or author. 
In addition to these requirements, the following parameters, which affect performance and scalability, should be taken into account [19]:

- block frequency: inversely proportional to the time between two succeeding blocks; it is affected by mining difficulty;
- block size: the number of transactions that fit in a block;
- network size: the number of nodes in the network (increasing the number of nodes does not always improve performance; in fact, communication and consensus costs may increase);
- throughput: the number of transactions per second;
- latency: the time elapsed between the submission of a transaction and its validation;
- finality: the property that once a transaction is completed, there is no way to alter it.

Architecture

Figure 3 describes the architecture of the blockchain-based framework for digital archives. Starting from the bottom of the figure, there is a Data Lake where all the descriptions of artworks are stored. The Data Lake is a distributed storage; artworks in the Data Lake can be accessed through indexes contained in the blockchain. The blockchain also contains other basic information related to artwork descriptions, as well as a record of all the operations performed on each artwork. This means that an external audit of the framework can always verify the status of an artwork and determine whether something is wrong. The blockchain constitutes the backend of the framework, together with the Cache. The Cache service stores basic information about artworks, such as author name and description, in order to make user queries faster; natively, a blockchain is not suitable for fast queries such as those required by a web search engine. The frontend of the framework is composed of the Authentication Service, which manages user access to the system, and three interfaces, one for each type of user: the Search, Publisher, and Admin Interfaces.

Users of the framework play one of the following roles: generic user, publisher, or verifier. A generic user can search for approved artworks in the system. A publisher can publish or update an artwork in the system; when a new artwork is published, its status is set to pending, meaning that the artwork is not yet approved and thus can neither be accessed by third parties nor updated by its author. A verifier is an expert in the field to which the artwork belongs and can vote for the approval of the artwork's description. This mechanism constitutes an algorithm for compliance with the principle of reliability of a record. Complex strategies can be defined to establish how an artwork should be approved.

[Figure 3: The architecture of the blockchain-based framework.]

Due to its intrinsic nature, the blockchain already satisfies the general requirements of a digital archive, except for long-term preservation, which is guaranteed through the Data Lake. Protection is achieved through the fact that the blockchain is replicated on different nodes. Anti-counterfeiting is guaranteed by associating each work with a sort of digital identity card containing all the information related to the work (including physical information). Finally, integrity and traceability are intrinsically guaranteed by the immutability and timestamping properties of the blockchain.
In fact, blockchain security assumptions guarantee that if, at a certain time, a piece of information has been added to a block that reached consensus, it will be impossible to alter that information without altering all the following blocks. Both Ethereum and Hyperledger Fabric satisfy the architectural requirements. However, depending on the type of blockchain, the following additional issues should be taken into account:

- costs: whether or not every transaction has a fee;
- popularity: how well known the blockchain is, i.e. whether there is a supporting community and whether there are skilled programmers able to implement contracts;
- consensus: the distributed process which establishes the validation of transactions.

Discussion

Table 2 illustrates the issues associated with the proposed architecture for digital archives and how the two blockchains address them. The fourth column of the table shows which blockchain better fits the requirements of digital archives. Regarding costs, Hyperledger Fabric fits the proposed architecture better, because the network can be configured without costs on transactions; this means that all categories of users can access the blockchain freely. However, if a business model were defined in the architecture, e.g. pay as you publish/access resources, Ethereum could also be suitable for digital archives. In any case, a private network can always be set up on Ethereum, with the gas price set to zero, thus satisfying the model of the proposed architecture. As for popularity, Ethereum is more popular and better known than Hyperledger Fabric; this means that a technical problem in the implementation of the described architecture could find greater support from the Ethereum community than from the Hyperledger one. Regarding consensus, Ethereum bases it on Proof of Work, while Hyperledger Fabric implements a permissioned, voting-based consensus that implies a level of trust among participants and requires messages to be exchanged between nodes. In this case, Ethereum seems more fitting because of the lack of any need for trust among participants. Summarizing, from the point of view of these issues, Ethereum behaves better than Hyperledger Fabric, except for costs.

Table 2: Considerations about issues in the two blockchains and which one is more suitable for digital archives.
- Costs. Ethereum: every transaction has a cost that depends on the gas price (currently about 4 gwei); in a private blockchain, the gas price can be configured. Hyperledger Fabric: costs can be established when configuring the network. Preferred: Hyperledger Fabric.
- Popularity. Ethereum: a well-established and wide community exists; many programmers are able to write smart contracts. Hyperledger Fabric: a developing niche community exists. Preferred: Ethereum.
- Consensus. Ethereum: based on Proof of Work; the larger the network, the more reliable the consensus. Hyperledger Fabric: a consensus algorithm requiring a level of trust and message overhead; the larger the network, the more time it takes to reach consensus. Preferred: Ethereum.

Table 3 describes the architectural requirements and how the two blockchains satisfy them. As in the previous table, the fourth column specifies the preferred blockchain for each requirement.
Firstly, as for the customizable infrastructure, Hyperledger Fabric is to be preferred, because it permits defining who can access the network (permissioned blockchain). However, Ethereum can be set up as a private blockchain, in the sense that a single organization manages it, and contracts handling user permissions can be implemented. Secondly, looking at interoperability with external storage, Hyperledger Fabric is better than Ethereum, because it has native storage (the data lake) and there is no need to configure external libraries to access it; anyway, for a programmer with good Ethereum skills, the configuration of external libraries to access external storage should not be difficult. The same analysis applies to interoperability with web interfaces. Thirdly, regarding roles, Hyperledger Fabric is more configurable than Ethereum because of its native support for roles; through channels, Hyperledger Fabric can also define more complex access policies. When defining roles, Ethereum incurs an overhead in terms of smart contracts and is thus not indicated for the proposed architecture. Finally, queries are supported natively neither by Ethereum nor by Hyperledger, and this aspect constitutes a limit of all blockchains; thus, an additional mechanism based on caching is defined in the architecture in order to speed up data searches. Summarizing the comparison of the architectural requirements, Hyperledger Fabric is the best solution.

Table 3: Considerations about architectural requirements in the two blockchains and which one is more suitable for digital archives.
- Customizable infrastructure. Ethereum: natively a public network, but a private blockchain can be set up. Hyperledger Fabric: native support for permissioned blockchains. Preferred: Hyperledger Fabric.
- Interoperability with external storage (e.g. IPFS). Ethereum: external libraries exist (e.g. Infura, https://infura.io/). Hyperledger Fabric: no support for external storage, because nodes in Hyperledger Fabric already have local storage. Preferred: Hyperledger Fabric, but Ethereum is a good alternative.
- Interoperability with web interfaces. Ethereum: external libraries exist (e.g. web3.js, https://web3js.readthedocs.io/en/v1.2.2/, and Drizzle, https://www.trufflesuite.com/docs/drizzle/quickstart). Hyperledger Fabric: native support for interoperability with web interfaces (in Angular JS, https://angularjs.org/). Preferred: Hyperledger Fabric, but Ethereum is a good alternative.
- Roles. Ethereum: a smart contract must be defined to manage roles. Hyperledger Fabric: native support for roles through the definition of policies. Preferred: Hyperledger Fabric.
- Queries. No native support in either. Preferred: neither.

A direct comparison between private Ethereum and Hyperledger in terms of performance is difficult, due to the fact that both are highly configurable. In general, how the protocol and the network are configured (starting from the choice of a consensus algorithm) has a big impact on performance. An increase in mining difficulty leads to a decrease in block frequency, a decrease in throughput and a rise in latency.
Block size and block frequency have to be balanced (especially with PoW): increasing the block size improves performance only if the block period is large enough for nodes to be able to create, sign, propagate and execute transactions and to reach consensus. Adding nodes to the network increases computation capability, but if the block frequency is high and the block size is large, some nodes may not have the resources to propagate information on time and keep in sync. Therefore, scaling is limited by the design of the blockchain platform. A study of the performance and scalability of Ethereum private networks can be found in [19], while performance metrics for Hyperledger are studied in [10]. As for the Ethereum public platform, the average block frequency is 10-20 seconds and the average block size is 20-30 KB. Throughput is about 15 transactions per second, with a latency of about 6 minutes. The current network size is 2,717,215 nodes (https://etherscan.io/nodetracker/nodes). Summarizing, although Ethereum is more popular than Hyperledger Fabric, Hyperledger Fabric seems more suitable for storing digital archives, because it is highly configurable and permits defining roles natively, without additional overhead. However, as a first implementation of the proposed architecture, Ethereum is indicated because of its simplicity and popularity [6], [3].

Conclusions and future work

This paper has presented the challenges and requirements of storing digital archives through blockchain. In addition, a possible architecture based on blockchain has been illustrated, as well as a preliminary comparison between Ethereum and Hyperledger Fabric as the underlying blockchains in a framework for storing, protecting and preserving digital archives. The paper has compared the two blockchains at three levels: native issues of the blockchain, architectural requirements, and general considerations. As a result, Hyperledger Fabric is more suitable for storing digital archives because of its high configurability. We are aware that this study is preliminary, but we believe that the effort to define a possible architecture of the framework, as well as to select and analyse which parameters should be considered when comparing two or more blockchains for digital archive storage, is useful in this field. As already said in the introduction, an implemented use case of this architecture can be found in [6], which exploits Ethereum as the underlying blockchain. The further step should be the implementation of the framework in Hyperledger Fabric and then a comparison of the two implemented use cases.

References

[1] ARMA International. 2017. “Generally Accepted Recordkeeping Principles.” https://rim.ucsc.edu/management/images/ThePrinciplesMaturityModel.pdf
[2] Bacciu, Clara, Angelica Lo Duca and Andrea Marchetti. 2019. “The Use of Blockchain for Digital Archives: Challenges and Perspectives.” AIUCD Annual Conference – Pedagogy, Teaching and Research in the Age of Digital Humanities, 78-81.
[3] Basile, Mariano, Gianluca Dini, Andrea Marchetti, Clara Bacciu and Angelica Lo Duca. 2019. “A blockchain-based support to safeguarding the Cultural Heritage.” In EVA Proceedings of the Electronic Imaging & the Visual Arts, edited by V. Cappellini, 64-73. Firenze: FUP.
[4] Bhowmik, Deepayan, and Tian Feng. 2017.
“The multimedia blockchain: A distributed and tamper-proof media transaction framework.” In DSP 2017 Proceedings of the 22nd International Conference on Digital Signal Processing, 1-5. DOI: 10.1109/ICDSP.2017.8096051.
[5] Cachin, Christian. 2016. “Architecture of the Hyperledger blockchain fabric.” In Proceedings of the Workshop on Distributed Cryptocurrencies and Consensus Ledgers. https://www.zurich.ibm.com/dccl/papers/cachin_dccl.pdf
[6] Bacciu, Clara, Angelica Lo Duca, and Andrea Marchetti. 2019. “A Blockchain-based Application to Protect Minor Artworks.” In Proceedings of the 15th International Conference on Web Information Systems and Technologies, edited by A. Bozzon, F. Dominguez Mayo and J. Filipe, 319-325. Setúbal: Scitepress. DOI: 10.5220/0008347903190325.
[7] Duranti, L., and R. Preston. 2008. International Research on Permanent Authentic Records in Electronic Systems (InterPARES) 2: Experiential, Interactive and Dynamic Records. Padova: CLEUP.
[8] Galiev, Albert, Shamil Ishmukhametov, Rustam Latypov, Nikolai Prokopyev, Evgeni Stolov and Ilya Vlasov. 2018. “Archain: a novel blockchain based archival system.” In WorldS4 Proceedings of the Second World Conference on Smart Trends in Systems, Security and Sustainability, 84-89. DOI: 10.1109/WorldS4.2018.8611607.
[9] García-Barriocanal, Elena, Salvador Sánchez-Alonso and Miguel-Angel Sicilia. 2017. “Deploying metadata on blockchain technologies.” In Research Conference on Metadata and Semantics Research, 38-49. https://doi.org/10.1007/978-3-319-70863-8_4
[10] https://doi.org/10.1007/978-3-030-30429-4_
[11] Hyperledger Performance and Scale Working Group. 2018. “Hyperledger Blockchain Performance Metrics.” Whitepaper. https://www.hyperledger.org/wp-content/uploads/2018/10/HL_Whitepaper_Metrics_PDF_V1.01.pdf
[12] International Council on Archives / Conseil international des archives. 2000. ISAD(G): General International Standard Archival Description, Second Edition, Adopted by the Committee on Descriptive Standards, Stockholm, Sweden, 19-22 September 1999. Ottawa.
[13] ISO. 2016. “ISO 15489-1/2:2016 – Information and documentation – Records management.” https://www.iso.org/standard/62542.html
[14] Benet, Juan. 2014. “IPFS – content addressed, versioned, P2P file system.” arXiv preprint arXiv:1407.3561.
[15] Kuny, T. 1998. “The digital dark ages? Challenges in the preservation of electronic information.” International Preservation News 17: 8-13.
[16] Lemieux, Victoria. 2017. “A typology of blockchain record-keeping solutions and some reflections on their implications for the future of archival preservation.” In Big Data 2017 Proceedings of the IEEE International Conference on Big Data, 2271-2278. DOI: 10.1109/BigData.2017.8258180.
[17] Liang, Xueping, Sachin Shetty, Deepak Tosh, Charles Kamhoua, Kevin Kwiat and Laurent Njilla. 2017. “ProvChain: A blockchain-based data provenance architecture in cloud environment with enhanced privacy and availability.” In CCGRID 2017 Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 468-477. DOI: 10.1109/CCGRID.2017.8.
[18] Nakamoto, Satoshi. 2008. “Bitcoin: A peer-to-peer electronic cash system.” Website.
https://bitcoin.org/bitcoin.pdf
[19] Schäffer, Markus, Monica Di Angelo and Gernot Salzer. 2019. “Performance and Scalability of Private Ethereum Blockchains.” In BPM 2019 Proceedings of the Business Process Management: Blockchain and Central and Eastern Europe Forum.
[20] Schwartz, David, Noah Youngs and Arthur Britto. 2014. “The Ripple protocol consensus algorithm.” Ripple Labs Inc. White Paper. https://ripple.com/files/ripple_consensus_whitepaper.pdf
[21] Swan, Melanie. 2015. Blockchain: Blueprint for a New Economy. O’Reilly Media.
[22] Kuo, Tsung-Ting, Hugo Zavaleta Rojas and Lucila Ohno-Machado. 2019. “Comparison of blockchain platforms: a systematic review and healthcare examples.” Journal of the American Medical Informatics Association 26, no. 5: 462-478.
[23] Vishwa, Alka, and Farookh Khadeer Hussain. 2018. “A blockchain based approach for multimedia privacy protection and provenance.” In SSCI 2018 Proceedings of the IEEE Symposium Series on Computational Intelligence, 1941-1945. DOI: 10.1109/ssci.2018.8628636.
[24] Wood, Gavin. 2014. “Ethereum: A secure decentralised generalised transaction ledger.” Ethereum Project Yellow Paper. https://gavwood.com/paper.pdf
[25] Zeilinger, Martin. 2018. “Digital art as ‘monetised graphics’: Enforcing intellectual property on the blockchain.” Philosophy & Technology 31, no. 1: 15-41. DOI: 10.1007/s13347-016-0243-1.
[26] Zheng, Zibin, Shaoan Xie, Hong-Ning Dai, Xiangping Chen and Huaimin Wang. 2018. “Blockchain challenges and opportunities: A survey.” International Journal of Web and Grid Services 14, no. 4: 352-375.

Last access of all URLs: 28th October 2019.

work_g2vkzz5wvzfmreplmjszn7vqo4 ---- White Paper Report

Report ID: 107673
Application Number: HT-50059-12
Project Director: Joseph Scheinfeldt (tom.scheinfeldt@uconn.edu)
Institution: George Mason University
Reporting Period: 9/1/2012-3/31/2015
Report Due: 6/30/2015
Date Submitted: 8/13/2015

Another Week | Another Tool: A Digital Humanities Barn Raising

In 2010, the Roy Rosenzweig Center for History and New Media gathered twelve digital humanists of different stripes – developers, professors, designers, managers – for One Week | One Tool, A Digital Humanities Barn Raising. The goal was to conceive of, produce, and market a new digital humanities tool. The result was Anthologize, a WordPress plugin for publishing WordPress content to PDF, EPUB, and other forms. Following that success, RRCHNM attempted to recreate the experience with twelve different digital humanists, again from many different fields and backgrounds.
Drawing on lessons learned from the first iteration, we put more emphasis on project management strategies and reduced the amount of time devoted to instruction in digital humanities tools and methods. The tool this team produced is a software-as-a-service application, Serendip-o-matic, which allows users to enter free text or their Zotero library and discover unexpectedly similar results in the DPLA, Europeana, and other sources. In what follows, we hope that you will find insights and perspectives about the experience that will provide inspiration for other innovative programs, project management considerations, and digital humanities practices in general.

The Another Week | Another Tool team was:
Brian Croxall. Digital Humanities Specialist and Lecturer in English, Emory University
Jack Dougherty. Associate Professor and Director of Educational Studies, Trinity College
Meghan Frazer. Digital Resources Curator, The Ohio State University
Scott Kleinman. Professor of English, California State University, Northridge
Rebecca Sutton Koeser. Software Engineer, Emory University Libraries
Ray Palin. Teacher and Librarian, Sunapee Middle High School
Amy Papaelias. Assistant Professor of Graphic Design and Foundation, SUNY New Paltz
Mia Ridge. Ph.D. candidate in Digital Humanities, Open University
Eli Rose. Undergraduate student, Oberlin College
Amanda Visconti. Ph.D. candidate in English, University of Maryland
Scott Williams. Collections Database Administrator, University of Pennsylvania Museum of Archaeology and Anthropology
Amrys Williams. Postdoctoral Fellow, National Museum of American History

The second iteration of One Week | One Tool was an extremely successful experience for the participants' professional development, their instruction in digital humanities methods, and their experience of collaborating with a variety of people in different roles. The results differ from those of the first iteration, very much reflecting the changes RRCHNM implemented between the two events. In particular, this iteration's emphasis on a project management team led several participants to cite learning about project management structures as their most important takeaway from the experience. The product, the software-as-a-service application Serendip-o-matic, remains online, though with fairly minimal traffic. The reach of the experience in shaping other digital humanists' thinking has been extraordinary, taking the form of a long-paper session at Digital Humanities 2014 and a post for ACRL's TechConnect series, among other successful formal and informal papers and presentations.

Project Activities, Accomplishments, and Audiences

The One Week | One Tool team coalesced into a group that quickly self-selected into distinct teams with clear leaders: the development team, the outreach team, and the project management team. This division, particularly within the development team, reflects the inroads that project management techniques, particularly Agile, have made into digital humanities. Mia Ridge became the “Scrum Master” of the development team, coordinating the team's activities and priorities. Meghan Frazer and Brian Croxall, the project management team, took on the role of coordinating between the development and outreach teams. At times this was controlled chaos, as within the course of a single day the realities of what could be built and the expectations of the outreach team would sometimes diverge.
This was to be expected in such a condensed product launch timeline, and it provided valuable experience for all participants, some of whom had not worked within anything analogous to the project management structures that developed. Indeed, the structure was noted by more than one participant as an important lesson for their professional lives and development. The adoption of those project management structures, and the experience gained in learning and negotiating them, is perhaps the most important lesson; those management processes proved extremely productive. Interestingly, this appears to have allowed the participants to focus more on the 'playfulness' of both the event and the ultimate product in their publications. Despite occasional frustrations and the somewhat more well-defined structure of the group, playfulness in research and tool-building became a major theme in their later presentations and reflections.

The participants have produced many formal and informal presentations and documents. Many were avid bloggers about their experiences – most of these posts can be found in their Zotero group (https://www.zotero.org/groups/oneweekonetool2013/items). The range of presentations and publications speaks to an extraordinarily wide audience, including the international Digital Humanities 2014 conference; Through Design, a popular design podcast; the Association of College and Research Libraries' TechConnect series; and regional technology and/or humanities conferences (see below for details, and Appendix I for a complete bibliography).

Evaluation

Evaluation of the participants' experiences was conducted via a survey in a Google form. Themes that can be seen in the survey include:
• Collaboration. Many participants cited lessons in team collaboration, particularly as part of project management. The importance of managing communication between all members of the team, and the experience gained in doing so, is described as producing important changes in their professional lives.
• Structure. Closely related to collaboration and communication, a well-defined structure was noted as a factor in the success of the week. Interestingly, more structure, established both before and after the week itself, was suggested as a way of changing One Week | One Tool.
• Time constraints after the week. As discussed below, the most common reason cited for the cessation of development work on Serendip-o-matic is the lack of available time in the year after the experience. This is consistent with the first iteration of One Week | One Tool.

Continuation of the Project

For most practical purposes, development on Serendip-o-matic is at an end, though for the time being the application will be maintained. During our reunion at THATCamp in 2014, we discussed the possibility of continuing active development. The consensus was that this was an unrealistic goal: the lack of available time was consistently given as the primary reason. While enthusiasm for Serendip-o-matic remained high, the reality is that everyone's professional responsibilities left little opportunity to continue development and coordination. This is not surprising, as One Week | One Tool is by design an experience distinct from 'usual' professional life, and participants in the first iteration had much the same reaction.
That said, the more fundamental part of One Week | One Tool – the development of professional skills that will be applied in working life and shared with others – appears destined to have a continuing effect on the participants and their colleagues. Occasional new presentations from the participants will expand the influence of their lessons learned. Hence, RRCHNM will continue to keep Serendip-o-matic up and running for as long as our technical infrastructure can reasonably support it.

Grant Products

Publications and Professional Development

The One Week | One Tool team has been quite prolific in their ongoing professional development work about Serendip-o-matic and the experience of One Week | One Tool. Their Zotero library of blog posts, presentations, papers, and other recognitions contains over seventy items. The most significant product is the long paper at Digital Humanities 2014 presented by Amy Papaelias, Brian Croxall, Mia Ridge, and Scott Kleinman. In the presentation, they reflected on the virtues of playfulness, both in the process of building Serendip-o-matic and in the product itself, and argued for the benefits of incorporating more “playful work” into academic research and scholarship. As current digital humanities work relies on collaborative environments (including hackathons, maker spaces, maker challenges, etc.), opportunities like One Week | One Tool provide a space for playful work to encourage more creative risk-taking and engaging user experiences within the context of digital humanities scholarship and practice. Importantly, they also included a consideration of the challenges of evaluation in their talk. Another notable and informative piece is Meghan Frazer's post in the ACRL's TechConnect series (http://acrl.ala.org/techconnect/?p=3621), which provides an insightful summary of the lessons learned. Please see Appendix I for a full bibliography of resources related to One Week | One Tool.

Serendip-o-matic

The usage of Serendip-o-matic itself has been somewhat limited. The site usage statistics show that, after the initial release, visits declined sharply. That is not to say, however, that it does not continue to bear fruit: it is used as a demonstration tool not only for the One Week | One Tool process itself, but also as an example of using multiple APIs to produce research results. Overall, though, it is important to remember that One Week | One Tool is an exercise in rapid, immersive learning about technologies, tools, development, management, and outreach in digital humanities projects. What the twelve participants achieved, learned, and – most importantly – shared with their colleagues has been a significant success.
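To give a concrete sense of the "multiple APIs" pattern that Serendip-o-matic demonstrates, the sketch below shows, in miniature, the general shape of such a pipeline: pick a few salient keywords from a block of text, then query two of the tool's sources for related items. This is an illustrative sketch, not Serendip-o-matic's actual source code; the endpoint URLs, parameter names, and the naive keyword extraction are assumptions that should be checked against the current DPLA and Europeana API documentation, and both services require (free) API keys.

```python
# Illustrative sketch only, NOT Serendip-o-matic's source code.
# Endpoint URLs and parameter names are assumptions; verify them against
# the current DPLA and Europeana API docs. Requires the third-party
# `requests` package (pip install requests).
import re
from collections import Counter

import requests

STOP_WORDS = {"the", "and", "from", "with", "that", "this", "were", "their"}

def extract_keywords(text, n=5):
    """Naively pick the n most frequent longer non-stop-words as search terms."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 3)
    return [word for word, _ in counts.most_common(n)]

def search_dpla(keywords, api_key):
    """Query the DPLA items endpoint (v2) for the extracted keywords."""
    resp = requests.get(
        "https://api.dp.la/v2/items",
        params={"q": " ".join(keywords), "api_key": api_key, "page_size": 10},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("docs", [])

def search_europeana(keywords, wskey):
    """Query the Europeana Search API for the same keywords."""
    resp = requests.get(
        "https://api.europeana.eu/record/v2/search.json",
        params={"wskey": wskey, "query": " ".join(keywords), "rows": 10},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("items", [])

if __name__ == "__main__":
    sample = ("Letters and photographs from a nineteenth-century Atlantic "
              "whaling voyage, with notes on scrimshaw and shipboard life.")
    terms = extract_keywords(sample)
    print("Searching for:", terms)
    # Uncomment with real keys; the key names here are placeholders.
    # print(search_dpla(terms, "YOUR_DPLA_KEY"))
    # print(search_europeana(terms, "YOUR_EUROPEANA_KEY"))
```

The real application layers more on top of this core (input drawn from a Zotero library, shuffled and playfully presented results), but the federated-search pattern that makes it a useful teaching example is essentially the shape sketched here.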
Appendix I
Bibliography of related presentations and publications
[Generated from the Another Week | Another Tool Zotero Group: https://www.zotero.org/groups/oneweekonetool2013/items]

Andrew, Liam. “I’m Feeling Lucky: Can Algorithms Better Engineer Serendipity in Research — or in Journalism?” Nieman Journalism Lab, July 16, 2014. http://www.niemanlab.org/2014/07/im-feeling-lucky-can-algorithms-better-engineer-serendipity-in-research-or-in-journalism/.
“Another Week | Another Tool Begins.” Roy Rosenzweig Center for History and New Media, July 31, 2013. http://chnm.gmu.edu/news/another-week-another-tool/.
Baker, James. “@HumaBirdProject + #inspiringwomen.” Digital Scholarship Blog, August 7, 2013. http://britishlibrary.typepad.co.uk/digital-scholarship/2013/08/humabirdproject-inspiringwomen.html.
Benatti, Francesca. “Mia Ridge Leads Developement at One Week|One Tool.” Digital Humanities at The Open University, August 7, 2013. http://www.open.ac.uk/blogs/dighum/?p=613.
carlspina. “Spark New Paths of Research with Serendip-O-Matic.” Novel Technology, August 8, 2013. http://carlispina.wordpress.com/2013/08/08/serendip-o-matic/.
Croxall, Brian. “Day 1 of OWOT: Check Your Ego at the Door,” July 30, 2013. http://www.briancroxall.net/2013/07/30/day-1-of-owot-check-your-ego-at-the-door/.
———. “Day 2 of OWOT: Pick Your Poison,” July 31, 2013. http://www.briancroxall.net/2013/07/31/day-2-of-owot-pick-your-poison/.
———. “Day 3 of OWOT: Of Names and Stories and Gophers,” August 1, 2013. http://www.briancroxall.net/2013/08/01/day-3-of-owot-of-names-and-stories-and-gophers/.
———. “Day 4 of OWOT: Stay Gold, Ponyboy,” August 2, 2013. http://www.briancroxall.net/2013/08/02/day-4-of-owot-stay-gold-ponyboy/.
———. “Day 5 of OWOT: We Did It! (Can We Do It Again? Please??),” August 3, 2013. http://www.briancroxall.net/2013/08/03/day-5-of-owot-we-did-it-can-we-do-it-again-please/.
———. “‘If Hippos Be the Dude of Love…’: Serendip-O-Matic at Digital Humanities 2014.” Brian Croxall, July 22, 2014. http://www.briancroxall.net/2014/07/22/if-hippos-be-the-dude-of-love-serendip-o-matic-at-digital-humanities-2014/.
———. “One Week | One Tool: Introducing Serendip-O-Matic.” The Chronicle of Higher Education. ProfHacker, August 5, 2013. http://chronicle.com/blogs/profhacker/one-week-one-tool-introducing-serendip-o-matic/51449.
Dorn, Sherman. “One Week, Better Tools (spoof).” Sherman Dorn, August 2, 2013. http://shermandorn.com/wordpress/?p=6288.
———. “Pictura Invisibilis Collegio Artium Digital.” Sherman Dorn, August 2, 2013. http://shermandorn.com/wordpress/?p=6296.
Dougherty, Jack. “Final Reflections from One Week One Tool: The Blur of Days 4-5,” August 4, 2013. http://commons.trincoll.edu/jackdougherty/2013/08/04/owot-4-5/.
———. “Learning Moments at One Week One Tool 2013, Day 1,” July 30, 2013. http://commons.trincoll.edu/jackdougherty/2013/07/30/owot1/.
———. “Metaphorical Learning Moments at One Week One Tool, Day 3,” August 1, 2013. http://commons.trincoll.edu/jackdougherty/2013/08/01/owot-3/.
———. “My Peggy Olson Learning Moment at One Week One Tool, Day 2,” July 31, 2013. http://commons.trincoll.edu/jackdougherty/2013/07/31/owot-2/.
“DPLA Welcomes Serendip-O-Matic to the App Library.” Digital Public Library of America, August 2, 2013. http://dp.la/info/2013/08/02/welcome-serendip-o-matic/.
“Europeana API Used in One Week | One Tool’s Serendip-O-Matic!” Europeana, August 5, 2013. http://pro.europeana.eu/web/guest;jsessionid=B6A260586E6411EF20C0AFA2DC95D6DB.
Frazer, Meghan. “One Week, One Tool, Many Lessons.” ACRL TechConnect Blog, August 7, 2013. http://acrl.ala.org/techconnect/?p=3621.
Graham, Shawn. “A Quick Run with Serendip-O-Matic.” Electric Archaeology, August 2, 2013. http://electricarchaeology.ca/2013/08/02/a-quick-run-with-serendip-o-matic/.
Grossman, Sara. “How to Build a Digital-Humanities Tool in a Week.” The Chronicle of Higher Education. Wired Campus, August 2, 2013. http://chronicle.com/blogs/wiredcampus/how-to-build-a-digital-humanities-tool-in-a-week/45243.
Heimburger, Franziska. “Vos Sources Vous Surprennent Avec Le Serendip-O-Matic” [Your sources surprise you with the Serendip-o-matic]. La Boite à Outils Des Historiens, August 2, 2013. http://www.boiteaoutils.info/2013/08/vos-sources-vous-surprennent-avec-le.html.
Hocking, Cameron. “Mining the Treasures of Trove.” Bright Ideas, August 8, 2013. http://slav.global2.vic.edu.au/2013/08/08/mining-the-treasures-of-trove/#.UgfcVmRAQls.
Hovious, Amanda. “Serendip-O-Matic | Designer Librarian.” Designer Librarian, August 5, 2013. http://designerlibrarian.wordpress.com/tag/serendip-o-matic/.
Hunt, Ryan. “Serendip-O-Matic as a Potential Model for Open Online Academic Work.” IVRYTWR, September 25, 2013. http://ivrytwr.com/2013/09/25/serendip-o-matic-as-a-potential-model-for-open-online-academic-work/.
Kleinman, Scott. “Introducing Serendip-O-Matic,” August 5, 2013. http://scottkleinman.net/blog/2013/08/05/introducing-serendip-o-matic/.
———. “Play as Process and Product: On Making Serendip-O-Matic | Scottkleinman.net.” Accessed July 29, 2014. http://scottkleinman.net/blog/2014/07/10/play-as-process-and-product-on-making-serendip-o-matic/.
———. “Serendip-O-Matic (and Other Good News).” Digital Humanities - Southern California, August 12, 2013. http://dhsocal.blogspot.com/2013/08/serendip-o-matic-and-other-good-news.html.
Machovec, George. “From Your Managing Editor: Fourteenth Annual Readers’ Choice Awards.” The Charleston Advisor 16, no. 2 (October 1, 2014): 3–10.
Meacham, Rebecca. “‘Dear Lucky One’: The Westing Game Invites Us to Play.” The Ploughshares Blog, August 7, 2013. http://blog.pshares.org/index.php/dear-lucky-one-the-westing-game-invites-us-to-play/.
Moravec, Michelle. “Serendip-O-Matic Seeks to Replicate Thrill of Archival Discovery Online.” History News Network, August 5, 2013. http://hnn.us/articles/serendip-o-matic-seeks-replicate-thrill-archival-discovery-online.
“One Week | One Tool Has Built . . . Serendip-O-Matic.” One Week One Tool, August 2, 2013. http://oneweekonetool.org/.
“One Week | One Tool Team Launches Serendip-O-Matic.” Roy Rosenzweig Center for History and New Media, August 2, 2013. http://chnm.gmu.edu/news/one-week-one-tool-team-launches-serendip-o-matic/.
Palin, Ray. “One Week | One Tool: Bit by Bit,” August 4, 2013. http://raypalin.info/blog/archives/1157.
Peter. “Serendip-O-Matic: Der Automat Für Zufallsfunde...” [Serendip-O-Matic: the machine for serendipitous finds]. Hatori Kibble, August 5, 2013. http://hatorikibble.wordpress.com/2013/08/05/serendip-o-matic-der-automat-fur-zufallsfunde/.
“Presentations, Exhibitions.” News Pulse: Faculty/Staff Newsletter, August 12, 2013. http://newspulse.newpaltz.edu/2013/08/12/presentations-exhibitions-14/.
“Professor Kleinman Helps Develop Search Tool Serendip-O-Matic.” Department of English, California State University-Northridge, August 3, 2013. http://www.csun.edu/engl/news.php?op=story&id=30.
“RECOMMENDED: Serendip-O-Matic, From the One Week | One Tool Team.” Dh+lib, August 6, 2013. http://acrl.ala.org/dh/2013/08/06/recommended-serendip-o-matic-from-the-one-week-one-tool-team/.
Retief, Esther. “Serendip-O-Matic Search Engine - Connects Your Sources to Digital Materials in Libraries, Museums and Archives Around the World.” LIS Trends, October 14, 2014. http://listrends.blogspot.com/2014/10/serendip-o-matic-search-engine-connects.html.
Ridge, Mia. “And so It Begins: Day Two of OWOT.” Open Objects, July 31, 2013. http://openobjects.blogspot.com/2013/07/and-so-it-begins-day-two-of-owot.html.
———. “Conference Paper: Play as Process and Product: On Making Serendip-O-Matic.” Mia Ridge, July 2, 2014. http://www.miaridge.com/conference-paper-play-as-process-and-product-on-making-serendip-o-matic/.
———. “Halfway Through. Day Three of OWOT.” Open Objects, August 1, 2013. http://openobjects.blogspot.com/2013/08/halfway-through-day-three-of-owot.html.
———. “Highs and Lows, Day Four of OWOT.” Open Objects, August 2, 2013. http://openobjects.blogspot.com/2013/08/highs-and-lows-day-4-of-owot.html.
———. “So We Made a Thing. Announcing Serendip-O-Matic at One Week, One Tool.” Open Objects, August 2, 2013. http://openobjects.blogspot.com/2013/08/so-we-made-thing-announcing-serendip-o.html.
———. “Working out What We’re Doing: Day One of One Week, One Tool.” Open Objects, July 30, 2013. http://openobjects.blogspot.com/2013/07/working-out-what-were-doing-day-one-of.html.
Rybak, Chuck. “DH Toe Dip: The Serendip-O-Matic.” Sad Iron, August 28, 2014. http://www.sadiron.com/dh-toe-dip-the-serendip-o-matic/.
“Serendip-O-Matic.” Bamboo DiRT, August 2, 2013. http://dirt.projectbamboo.org/resources/serendip-o-matic.
“Serendip-O-Matic.” Designer Librarian, August 5, 2013. http://designerlibrarian.wordpress.com/2013/08/05/serendip-o-matic/.
“Serendip-O-Matic.” Europeana Labs, July 2014. http://preview.labs.eanadev.org/apps/serendip-o-matic/.
“Serendip-O-Matic - Csodálkozz a Bibliográfiádra” [Serendip-O-Matic: marvel at your bibliography]. Kereső Világ: Keresés, Szövegbányászat, Big Data, August 8, 2013. http://kereses.blog.hu/2013/08/09/serendip-o-matic_csodalkozz_a_bibliografiadra.
“Serendip-O-Matic: It’s Not Search, It’s Serendipity.” Danegeld, August 8, 2013. http://danegeld.dk/2013/08/08/serendip-o-matic-its-not-search-its-serendipity/.
“Serendip-O-Matic Launched.” Mason News, August 9, 2013. http://newsdesk.gmu.edu/2013/08/serendip-o-matic-launched/.
“Serendip-O-Matic: Let’s Your Sources Surprise You.” Digital Meets Culture, August 2013. http://www.digitalmeetsculture.net/article/serendip-o-matic-lets-your-sources-surprise-you/.
“Serendip-O-Matic: Let Your Sources Surprise You.” Stuff You Missed in History Class, August 5, 2013. http://missedinhistory.tumblr.com/post/57428548818/serendip-o-matic-let-your-sources-surprise-you.
Serendip-O-Matic - Post Mortal Songs. Switzerland, 2005. http://www.discogs.com/Serendip-o-matic-Post-Mortal-Songs/release/6984447.
“Serendip-O-Matic Results Using 2012 SPU Library Annual Report.” Keeping Time, August 3, 2013. http://forkeepingtime.tumblr.com/post/57254758403/serendip-o-matic-results-using-2012-spu-library-annual.
Smale, Maura. “New at the DPLA: There’s an App for That.” ACRLog, August 15, 2013. http://acrlog.org/2013/08/15/new-at-the-dpla-theres-an-app-for-that/.
“SMHS Connects with Serendip-O-Matic.” Sunapee School District, August 13, 2013. http://www.sunapeeschools.org/home/announcement/smhsconnectswithserendip-o-matic.
“Stained Glass, Google, Serendip-O-Matic, More: Short Wednesday Buzz.” ResearchBuzz, August 14, 2013. http://researchbuzz.me/2013/08/14/stained-glass-google-serendip-o-matic-more-short-wednesday-buzz-august-14-2013/.
Starr, Julie. “Serendip-O-Matic Led Me to These Gorgeous Images of Early NZ.” Evolving Newsroom, August 12, 2013. http://evolvingnewsroom.co.nz/serendip-o-matic-led-me-to-these-gorgeous-images-of-early-nz/.
“Surprising Results: A Search Engine Designed by and for Digital Humanities.” Explored.tech, August 6, 2013. http://sophia.smith.edu/blog/exploredtech/category/digital-humanities/.
Verhoeven, Deb, and Toby Burrows. “Crowdsourcing for Serendipity.” The Australian: Higher Education, December 10, 2014. http://www.theaustralian.com.au/higher-education/opinion/crowdsourcing-for-serendipity/story-e6frgcko-1227150244558.
Visconti, Amanda. “Digital Projects from Start to Finish: DH Mentorship from One Week One Tool (OWOT).” Literature Geek, July 30, 2013. http://www.literaturegeek.com/owotdayone/.
———. “#OWOT a Week: Introducing Serendip-O-Matic, a Tool for Digital Humanities Discovery and Delight.” Literature Geek, August 2, 2013. http://www.literaturegeek.com/owot-a-week/.
Williams, Amrys O. “One Week, One Tool.” AmShazam., July 28, 2013. http://amrys.wordpress.com/2013/07/28/one-week-one-tool/.
———. “OWOT, Day 2.” AmShazam., July 29, 2013. http://amrys.wordpress.com/2013/07/29/owot-day-2/.
———. “OWOT, Day 3.” AmShazam., July 30, 2013. http://amrys.wordpress.com/2013/07/30/owot-day-3/.
———. “OWOT, Day 4.” AmShazam., July 31, 2013. http://amrys.wordpress.com/2013/07/31/owot-day-4/.
———. “OWOT, Day 5.” AmShazam., August 1, 2013. http://amrys.wordpress.com/2013/08/01/owot-day-5/.
———. “OWOT, Day 6.” AmShazam., August 2, 2013. http://amrys.wordpress.com/2013/08/02/owot-day-6/.
———. “OWOT, Day 7.” AmShazam., August 3, 2013. http://amrys.wordpress.com/2013/08/03/owot-day-7/.
———. “What We Built at OWOT: Serendip-O-Matic.” History of Science, Medicine, and Technology at the University of Wisconsin, August 2, 2013. http://wisconsinhstm.blogspot.com/2013/08/what-we-built-at-owot-serendip-o-matic.html.
“Сервис Serendip-o-matic иллюстрирует тексты картинками из библиотек и музеев” [The Serendip-o-matic service illustrates texts with pictures from libraries and museums]. Edutainme, August 5, 2013. http://www.edutainme.ru/news/servis-serendip-o-matic-illyustriruet-teksty-kartinkami-iz-bibliotek-i-muzeev/.

work_g2z6gcjdengm7ikpgxd3bxgsqu ----
PARTHENOS Foresight - Executive Summary

PARTHENOS is a Horizon 2020 project funded by the European Commission under Grant Agreement n. 654119. The views and opinions expressed in this publication are the sole responsibility of the author and do not necessarily reflect the views of the European Commission.

PARTHENOS Foresight Executive Summary
https://zenodo.org/record/2662490

Introduction

In recent years there has been rapid growth both in the development of digital methods and tools and in their application across a wide range of disciplines within humanities and cultural heritage studies. The future development of this landscape depends on a complex and dynamic ecosystem of interactions between a range of factors: changing scholarly priorities, questions and methods; technological advances and new tool development; and the broader social, cultural and economic contexts within which both scholars and infrastructures are situated. This foresight study investigates how digital research methods, technologies and infrastructures in digital humanities and cultural heritage may develop over the next 5-10 years, and provides some recommendations for future interventions to optimize this development.

Foresight

Foresight research is a key mechanism for the development and implementation of research and innovation policy in the medium to long term, enabling policy-making bodies to set research priorities and influence the progress of research.
Foresight research is not simply ‘future gazing’, nor is it just about forecasting by experts; rather, it is a way of facilitating structured thinking and debate about long-term issues and developments, and of broadening participation in this process by involving different stakeholders, to create a shared understanding about possible futures and to enable them to be shaped or influenced. Engaging a representative range of relevant and informed stakeholders in the dialogue brings several benefits: it extends the breadth and depth of the knowledge base created by the foresight process by drawing on distributed knowledge; it increases the ‘democratic basis and legitimacy’ of the study report by avoiding a top-down, expert-driven analysis; and it helps to spread the message about foresight activities and to embed it within participating organisations, thus improving sustainability. Foresight studies draw upon existing knowledge networks and stimulate new ones – in addition to any reports produced, these embedded networks are an important output of foresight activities, facilitating a longer-term thinking process that extends beyond the period of the study itself.

[Figure: foresight at the intersection of policy-making & planning, participation & networking, and perspective & future.]

PARTHENOS Foresight Methodology

A foresight study may utilize a range of different information gathering methods in the construction of its knowledge base. Specifically, the PARTHENOS foresight study commenced with an initial literature review and landscape scanning, to set the context for the study. This was followed by a series of structured, interactive events that combined expert panels with interactive workshops to obtain input for the study’s foresight knowledge base, by curating multi-polar discussions among both experts from relevant backgrounds and a broader range of actual or potential stakeholders in research infrastructures, including (but not restricted to) users/researchers. These events then fed in turn into a series of interviews with targeted stakeholders. Lastly, the PARTHENOS Hub – a publication and interaction platform created by the project itself – provided a space both to present the methodology and to ask for additional input through a questionnaire. The respective issue can be consulted here: http://www.parthenos-project.eu/portal/the-hub/issue-2.

Within this overall framework, the study followed a thematic approach, structuring its investigations around a two-dimensional matrix of questions (enumerated in the sketch after these lists) that addressed, firstly, the different aspects of the foresight process:
● current trends – what is happening, and what impact is it having?
● potentialities and opportunities – what may happen?
● requirements – what do we want to happen?
● obstacles, constraints, risks and threats – what might prevent this from happening?
● what activities and interventions (e.g. funding programmes, strategic research, service provision) might serve to ‘optimize’ outcomes?
and, secondly, the different contexts to which those aspects relate:
● technology (e.g. new tools or methods);
● scholarly or professional practice (e.g. emerging research areas, changes in career structures);
● the broader ‘environment’ (e.g. social, cultural, economic, political, policy).

[Figure: the three contexts of the matrix – research/scholarship (what are researchers doing, and what do they want to do?), technological (new, evolving, potential technology), and environmental (social, cultural, policy, economic).]
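As a minimal illustration of how this matrix structures the study's guiding questions, the toy sketch below enumerates the aspect-context pairs. The aspect and context labels come from the lists above; the generated question template is our own paraphrase for illustration, not the study's official instrument.

```python
# A toy sketch of the study's two-dimensional question matrix.
# Labels come from the text above; the question wording is an
# illustrative paraphrase, not official PARTHENOS phrasing.
from itertools import product

aspects = [
    "current trends",
    "potentialities and opportunities",
    "requirements",
    "obstacles, constraints, risks and threats",
    "activities and interventions",
]
contexts = [
    "technology",
    "scholarly or professional practice",
    "the broader environment",
]

# Each (aspect, context) cell of the 5x3 matrix yields one guiding question.
for aspect, context in product(aspects, contexts):
    print(f"What are the {aspect} relating to {context}?")
```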
Findings

This study has found a dynamic field with a host of opportunities offered by new technologies, but requiring additional skills and infrastructure if full use is to be made of those opportunities. The main findings of the foresight study are summarized below, grouped according to identified trends, obstacles, potentialities and requirements.

Trends

The adoption of digital research methods is increasingly widespread in the humanities and cultural heritage sector, with the development of new data sources, technologies, and expanding collaborations creating a dynamic and innovative environment. The development of the digital humanities has been characterized by the explosion in data available for analysis: digitized collections, open data, and born-digital content. There are limitations and issues in relation to these, however: there is still a need for further digitization, in particular of collections relating to marginalized groups; significant concerns have emerged about potential infringement of IPR and the GDPR; and big technology companies are raising barriers to access to their data.

There is also a wide range of tools for analysing these data: open source software, and natural language processing, machine learning, and artificial intelligence tools and libraries. Open source software enables the broad adoption of new tools and facilitates sustainability beyond a single project, while the development of software libraries for computational analysis offers the potential for widespread automated analysis (a deliberately small example follows this section). There is an important difference, however, between placing software on GitHub and ensuring it is sustainable in the long term, and there is a risk that artificial intelligence may be seen as a vague panacea for all difficulties, without the community fully understanding the potentials, limitations and biases of the tools.

There has also been an increase in the number and variety of collaborations: interdisciplinary, intersectoral, and international. Collaborations between the humanities and other fields, between universities and other sectors of society, and across national borders are increasingly common and bring new perspectives and ideas to projects and data sets. This may be hindered, however, by humanists who are reluctant to embrace digital methodologies, by suspicion of the commercial sector, and by certain restrictions on international funding.

These trends towards increased data, tools and collaboration are all expected to continue into the near future, albeit with the potential for some restrictions on access to data due to concerns about IPR and the GDPR, and more limitations imposed by the big technology companies. The rate of increased adoption of data, tools and collaboration is liable to be constrained by funding limitations.
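To ground the point about software libraries enabling widespread automated analysis, here is a deliberately small example of the kind of computational reading such libraries make routine; it uses only the Python standard library, and the sample sentence is invented for illustration.

```python
# A minimal illustration of automated text analysis. Standard library
# only; the sample text is invented for illustration.
import re
from collections import Counter

text = (
    "The parish registers record baptisms, marriages and burials; "
    "the burials rise sharply in the plague years."
)

# Lowercase, strip punctuation, and count word frequencies.
words = re.findall(r"[a-z]+", text.lower())
frequencies = Counter(words)

# The same loop, pointed at thousands of digitized volumes, is the basis
# of the large-scale distant-reading work discussed in this report.
for word, count in frequencies.most_common(5):
    print(word, count)
```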
The lack of sufficient funding for the digital humanities and cultural heritage sectors, especially since the financial crisis of 2008 and the growing emphasis on the funding of STEM subjects, has had significant consequences for the capability of the sector to meet the challenges of the twenty-first century: ● Distortion of research interests: Insufficient funds drives researchers to focus on those areas where funding is available, with an accompanying lack of freedom to explore other areas that they consider important. ● Loss of people from the sector: Restricted budgets inevitably lead to a lack of job security, and the loss of team members has ramifications for the sustainability of projects and the loss of vital skills from the sector. The lack of funding also feeds into the digital divide within the digital humanities and cultural sectors. This digital divide can take many forms, including: ● International digital divide: There continues to be significant differences between the research infrastructures available to researchers and research institutes in different countries. ● Interdisciplinary digital divide: There are significant differences between the research infrastructures that are available to the digital humanities compared with STEM disciplines that have been prioritized for funding. This, in turn, has contributed to the digital divide in technical skills. ● Intradisciplinary digital divide: There continues to be a significant and ongoing divide within the humanities between those who embrace the potential of digital methodologies and those who do not. There are also concerns about IPR and the GDPR. The GDPR, in particular, is seen as blocking avenues of research, and preventing humanists researching some of the most important emerging issues affecting the EU, including fake news, populism, and nationalism. Potentialities The potential of digital research methods in the humanities and cultural heritage sectors is reliant not on the emergence of new technologies or discoveries, but rather on the application of existing technologies. The new digital technologies and primary sources offer a host of new possibilities, but a decade of underfunding has left much of the potential unrealized. Particular interest was noted in those technologies that potentially offer a technological solution to overcoming the problem of a lack of growth in the humanities: ● Crowdsourcing: Crowdsourcing offers the opportunity both to outsource certain tasks to the wider community, thus scaling up certain types of activity, and to engage the public more deeply with humanities research. ● Artificial Intelligence: Artificial Intelligence offers the potential to contribute to a wide range of research in the digital humanities, but it is important that humanities researchers are willing to investigate the black box of these technologies more fully. Neither is a panacea to the underfunding of the humanities, however. While they may offer the opportunity to increase the scale of projects, they nonetheless require expert guidance and a fuller understanding on the part of those researchers employing them. New technologies and publication models also offer the potential for greater public impact: ● Augmented Reality, Virtual Reality, and Mobile Applications: The near-ubiquitous mobile smartphone, and the growing potential of augmented reality and virtual reality technologies, offer numerous opportunities for promoting research and collections in new ways. 
Not all such efforts will be successful, however, and there needs to be room for experimentation and failure, which is increasingly difficult given the importance accorded to impact and metrics in research evaluation.
● Open Research: Open research is seen as having potential not only for improving research access and quality, but also for reaching out to the wider public. For this to be achieved, however, there is a need for funding to ensure that open access policies can be followed.

From a technological perspective, the typical view was an expectation of more of the same. However, the impact of these technologies on the structure of the humanities, or the potential of the humanities for culture more broadly, is much less clear.

Requirements

There is a fundamental need for growth in the funding of the humanities and cultural heritage sector to ensure that it can meet the challenges of the twenty-first century and our increasingly technology-mediated society. This is not simply a request for unlimited funds to support blue-sky thinking, but reflects the need for a discussion about the “fundamental questions” and “inspirational goals” that the community has to offer society. It is not just a matter of technologies, but rather of finding the questions.

At a European level there is a need for a stronger European lead, with a more explicit European Commission strategy on cultural heritage, and more visible public institutions offering leadership on research infrastructure and standards. It was suggested that cultural heritage institutes may contribute to the building of a European identity in the same way that 18th and 19th century cultural heritage institutes contributed to nation building. Europe is not a single homogenous region, however, and there is a need for segmentation in future digital humanities strategy, with different regions requiring different answers. This means that there is also an important role for national governments in ensuring sustainable levels of support for the humanities and cultural heritage sector.

There is a need for a suitable information regulation framework that supports rather than hinders humanities research; this framework should distinguish between the work of academic or public sector researchers and that of private corporations, and should recognize that the protection required when handling personal health records differs from the protection required when analysing political commentary that is already in the public arena.

Finally, as more than one contributor noted, there is a need for more projects similar to the PARTHENOS Foresight Study (or indeed a continuation of this study) that engage with professionals in culture and heritage to ask them what they see happening and what their needs and issues are. The digital humanities and cultural heritage sectors form a diverse community without a single voice, and it needs to find that voice if it is to meet some of the challenges of the twenty-first century.

Research Agenda

From the foresight study, five broad themes emerge that should form the basis of a research agenda in the digital humanities: public engagement; research infrastructures; development of the digital commons; artificial intelligence; and impact and evaluation methods and metrics.
Public Engagement

Public engagement is an essential part of ending the underfunding of the humanities and cultural heritage sectors. The contribution of STEM research to society is widely recognised in a way that the contribution of the humanities is not, and there is a need for humanists to make the case for their work more forcefully, with a combined voice. There are many ways that the new technologies can be used by humanists and the cultural heritage sector to ensure research outputs are as widely accessible as possible: open access, open data (following good data practice), social media, augmented reality, virtual reality, and mobile apps. Crowdsourcing platforms can also be used for soliciting contributions from the public. Engagement, however, is not just about the promotion of research or the extraction of free labour, but about engaging with the public to ensure the humanities are meeting the challenges society faces at the beginning of the twenty-first century, whether that is fake news, nationalism, populism, or climate change, and demonstrating the contribution humanities research is making to these grand challenges.

Research infrastructures

The value of recent initiatives in the development of research infrastructures was widely recognized in the foresight study, as they provide a certain amount of sustainability to research projects, and further development of research infrastructures for the humanities and cultural heritage sector was seen as necessary. At a time when projects are often short and the competition for funding is fierce, research infrastructures need to facilitate collaboration and sustainability, establishing communities around the infrastructures that are developed. It is important that research infrastructures do not simply perpetuate or exacerbate existing inequalities but help to bridge the digital divide. New research infrastructures, or enhancements to existing ones, should:
● bring to the fore marginalised collections.
● ensure access and analysis are possible not only for the technologically literate.
Artificial intelligence The potential for artificial intelligence, machine learning, and other large- scale computational methodologies are as prevalent in the humanities and cultural heritage sector as the sciences. It is essential, however, that these technologies are not simply applied in an ad hoc manner, but are applied critically with attention to sustainability and ethical considerations. There is in particular a need to focus on: ● the ethical implications of the application of AI technologies. ● real world applications that are reusable. ● ensuring the technologies are used to help close rather than extend the digital divide. Impact and evaluation Impact and evaluation are important parts of the research process, especially when ensuring that limited funds are used in the best way possible, and it is essential that new methodologies and metrics are developed for measuring impact and evaluation that reflect the specific needs of the humanities and cultural heritage sector. These methodologies and metrics should incentivize innovation, sustainability, and public engagement. They should also recognize a far wider range of outputs and applications, and contribute to the development of standards and best practices in research evaluation. www.parthenos-project.eu work_g47ybedbfbcc3hyaxkzfxtrngi ---- Inderscience Publishers - linking academia, business and industry through research Log in Log in For authors, reviewers, editors and board members Username Remember me Go Forgotten? Help Sitemap Home For Authors For Librarians Orders Inderscience Online News Explore our journals Browse journals by titleAfrican Journal of Accounting, Auditing and FinanceAfrican Journal of Economic and Sustainable DevelopmentAfro-Asian Journal of Finance and AccountingAmerican Journal of Finance and AccountingAsian Journal of Management Science and ApplicationsAtoms for Peace: an International JournalElectronic Government, an International JournalEuroMed Journal of ManagementEuropean Journal of Cross-Cultural Competence and ManagementEuropean Journal of Industrial EngineeringEuropean Journal of International ManagementGlobal Business and Economics ReviewInterdisciplinary Environmental ReviewInternational Journal of Abrasive TechnologyInternational Journal of Accounting and FinanceInternational Journal of Accounting, Auditing and Performance EvaluationInternational Journal of Ad Hoc and Ubiquitous ComputingInternational Journal of Adaptive and Innovative SystemsInternational Journal of Additive and Subtractive Materials ManufacturingInternational Journal of Advanced Intelligence ParadigmsInternational Journal of Advanced Mechatronic SystemsInternational Journal of Advanced Media and CommunicationInternational Journal of Advanced Operations ManagementInternational Journal of AerodynamicsInternational Journal of Aerospace System Science and EngineeringInternational Journal of Agent-Oriented Software EngineeringInternational Journal of Agile and Extreme Software DevelopmentInternational Journal of Agile Systems and ManagementInternational Journal of Agricultural Resources, Governance and EcologyInternational Journal of Agriculture Innovation, Technology and GlobalisationInternational Journal of Alternative PropulsionInternational Journal of Applied CryptographyInternational Journal of Applied Decision SciencesInternational Journal of Applied Management ScienceInternational Journal of Applied Nonlinear ScienceInternational Journal of Applied Pattern RecognitionInternational Journal of Applied Systemic 
International Journal of Applied Systemic Studies
International Journal of Arab Culture, Management and Sustainable Development
International Journal of Artificial Intelligence and Soft Computing
International Journal of Arts and Technology
International Journal of Auditing Technology
International Journal of Automation and Control
International Journal of Automation and Logistics
International Journal of Automotive Composites
International Journal of Automotive Technology and Management
International Journal of Autonomic Computing
International Journal of Autonomous and Adaptive Communications Systems
International Journal of Aviation Management
International Journal of Banking, Accounting and Finance
International Journal of Behavioural Accounting and Finance
International Journal of Behavioural and Healthcare Research
International Journal of Bibliometrics in Business and Management
International Journal of Big Data Intelligence
International Journal of Big Data Management
International Journal of Bioinformatics Research and Applications
International Journal of Bio-Inspired Computation
International Journal of Biomechatronics and Biomedical Robotics
International Journal of Biomedical Engineering and Technology
International Journal of Biomedical Nanoscience and Nanotechnology
International Journal of Biometrics
International Journal of Biotechnology
International Journal of Blockchains and Cryptocurrencies
International Journal of Bonds and Derivatives
International Journal of Business and Data Analytics
International Journal of the Built Environment and Asset Management
International Journal of Business and Emerging Markets
International Journal of Business and Globalisation
International Journal of Business and Systems Research
International Journal of Business Competition and Growth
International Journal of Business Continuity and Risk Management
International Journal of Business Environment
International Journal of Business Excellence
International Journal of Business Forecasting and Marketing Intelligence
International Journal of Business Governance and Ethics
International Journal of Business Information Systems
International Journal of Business Innovation and Research
International Journal of Business Intelligence and Data Mining
International Journal of Business Intelligence and Systems Engineering
International Journal of Business Performance and Supply Chain Modelling
International Journal of Business Performance Management
International Journal of Business Process Integration and Management
International Journal of Chinese Culture and Management
International Journal of Circuits and Architecture Design
International Journal of Cloud Computing
International Journal of Cognitive Biometrics
International Journal of Cognitive Performance Support
International Journal of Collaborative Engineering
International Journal of Collaborative Enterprise
International Journal of Collaborative Intelligence
International Journal of Communication Networks and Distributed Systems
International Journal of Comparative Management
International Journal of Competitiveness
International Journal of Complexity in Applied Science and Technology
International Journal of Complexity in Leadership and Management
International Journal of Computational Biology and Drug Design
International Journal of Computational Complexity and Intelligent Algorithms
International Journal of Computational Economics and Econometrics
International Journal of Computational Intelligence in Bioinformatics and Systems Biology
International Journal of Computational Intelligence Studies
International Journal of Computational Materials Science and Surface Engineering
International Journal of Computational Medicine and Healthcare
International Journal of Computational Microbiology and Medical Ecology
International Journal of Computational Science and Engineering
International Journal of Computational Systems Engineering
International Journal of Computational Vision and Robotics
International Journal of Computer Aided Engineering and Technology
International Journal of Computer Applications in Technology
International Journal of Computers in Healthcare
International Journal of Computing Science and Mathematics
International Journal of Continuing Engineering Education and Life-Long Learning
International Journal of Convergence Computing
International Journal of Corporate Governance
International Journal of Corporate Strategy and Social Responsibility
International Journal of Creative Computing
International Journal of Critical Accounting
International Journal of Critical Computer-Based Systems
International Journal of Critical Infrastructures
International Journal of Cultural Management
International Journal of Cybernetics and Cyber-Physical Systems
International Journal of Data Analysis Techniques and Strategies
International Journal of Data Mining and Bioinformatics
International Journal of Data Mining, Modelling and Management
International Journal of Data Science
International Journal of Decision Sciences, Risk and Management
International Journal of Decision Support Systems
International Journal of Design Engineering
International Journal of Digital Culture and Electronic Tourism
International Journal of Digital Enterprise Technology
International Journal of the Digital Human
International Journal of Digital Signals and Smart Systems
International Journal of Diplomacy and Economy
International Journal of Dynamical Systems and Differential Equations
International Journal of Earthquake and Impact Engineering
International Journal of Ecological Bioscience and Biotechnology
International Journal of Economic Policy in Emerging Economies
International Journal of Economics and Accounting
International Journal of Economics and Business Research
International Journal of Education Economics and Development
International Journal of Electric and Hybrid Vehicles
International Journal of Electronic Banking
International Journal of Electronic Business
International Journal of Electronic Customer Relationship Management
International Journal of Electronic Democracy
International Journal of Electronic Finance
International Journal of Electronic Governance
International Journal of Electronic Healthcare
International Journal of Electronic Marketing and Retailing
International Journal of Electronic Security and Digital Forensics
International Journal of Electronic Trade
International Journal of Electronic Transport
International Journal of Embedded Systems
International Journal of Emergency Management
International Journal of Emerging Computing for Sustainable Agriculture
International Journal of Energy Technology and Policy
International Journal of Engineering Management and Economics
International Journal of Engineering Systems Modelling and Simulation
International Journal of Enterprise Network Management
International Journal of Enterprise Systems Integration and Interoperability
International Journal of Entertainment Technology and Management
International Journal of Entrepreneurial Venturing
International Journal of Entrepreneurship and Innovation Management
International Journal of Entrepreneurship and Small Business
International Journal of Environment and Health
International Journal of Environment and Pollution
International Journal of Environment and Sustainable Development
International Journal of Environment and Waste Management
International Journal of Environment, Workplace and Employment
International Journal of Environmental Engineering
International Journal of Environmental Policy and Decision Making
International Journal of Environmental Technology and Management
International Journal of Exergy
International Journal of Experimental and Computational Biomechanics
International Journal of Experimental Design and Process Optimisation
International Journal of Export Marketing
International Journal of Family Business and Regional Development
International Journal of Financial Engineering and Risk Management
International Journal of Financial Innovation in Banking
International Journal of Financial Markets and Derivatives
International Journal of Financial Services Management
International Journal of Food Safety, Nutrition and Public Health
International Journal of Forensic Engineering
International Journal of Forensic Engineering and Management
International Journal of Forensic Software Engineering
International Journal of Foresight and Innovation Policy
International Journal of Functional Informatics and Personalised Medicine
International Journal of Fuzzy Computation and Modelling
International Journal of Gender Studies in Developing Societies
International Journal of Global Energy Issues
International Journal of Global Environmental Issues
International Journal of Global Warming
International Journal of Globalisation and Small Business
International Journal of Governance and Financial Intermediation
International Journal of Granular Computing, Rough Sets and Intelligent Systems
International Journal of Green Economics
International Journal of Grid and Utility Computing
International Journal of Happiness and Development
International Journal of Healthcare Policy
International Journal of Healthcare Technology and Management
International Journal of Heavy Vehicle Systems
International Journal of High Performance Computing and Networking
International Journal of High Performance Systems Architecture
International Journal of Higher Education and Sustainability
International Journal of Hospitality and Event Management
International Journal of Human Factors and Ergonomics
International Journal of Human Factors Modelling and Simulation
International Journal of Human Resources Development and Management
International Journal of Human Rights and Constitutional Studies
International Journal of Humanitarian Technology
International Journal of Hybrid Intelligence
International Journal of Hydrology Science and Technology
International Journal of Hydromechatronics
International Journal of Image Mining
International Journal of Immunological Studies
International Journal of Indian Culture and Business Management
International Journal of Industrial and Systems Engineering
International Journal of Industrial Electronics and Drives
International Journal of Information and Coding Theory
International Journal of Information and Communication Technology
International Journal of Information and Computer Security
International Journal of Information and Decision Sciences
International Journal of Information and Operations Management Education
International Journal of Information Privacy, Security and Integrity
International Journal of Information Quality
International Journal of Information Systems and Change Management
International Journal of Information Systems and Management
International Journal of Information Technology and Management
International Journal of Information Technology, Communications and Convergence
International Journal of Innovation and Learning
International Journal of Innovation and Regional Development
International Journal of Innovation and Sustainable Development
International Journal of Innovation in Education
International Journal of Innovative Computing and Applications
International Journal of Instrumentation Technology
International Journal of Integrated Supply Management
International Journal of Intellectual Property Management
International Journal of Intelligence and Sustainable Computing
International Journal of Intelligent Defence Support Systems
International Journal of Intelligent Engineering Informatics
International Journal of Intelligent Enterprise
International Journal of Intelligent Information and Database Systems
International Journal of Intelligent Internet of Things Computing
International Journal of Intelligent Machines and Robotics
International Journal of Intelligent Systems Design and Computing
International Journal of Intelligent Systems Technologies and Applications
International Journal of Intercultural Information Management
International Journal of Internet and Enterprise Management
International Journal of Internet Manufacturing and Services
International Journal of Internet Marketing and Advertising
International Journal of Internet of Things and Cyber-Assurance
International Journal of Internet Protocol Technology
International Journal of Internet Technology and Secured Transactions
International Journal of Inventory Research
International Journal of Islamic Marketing and Branding
International Journal of Knowledge and Learning
International Journal of Knowledge and Web Intelligence
International Journal of Knowledge Engineering and Data Mining
International Journal of Knowledge Engineering and Soft Data Paradigms
International Journal of Knowledge Management in Tourism and Hospitality
International Journal of Knowledge Management Studies
International Journal of Knowledge Science and Engineering
International Journal of Knowledge-Based Development
International Journal of Lean Enterprise Research
International Journal of Learning and Change
International Journal of Learning and Intellectual Capital
International Journal of Learning Technology
International Journal of Legal Information Design
International Journal of Leisure and Tourism Marketing
International Journal of Liability and Scientific Enquiry
International Journal of Lifecycle Performance Engineering
International Journal of Logistics Economics and Globalisation
International Journal of Logistics Systems and Management
International Journal of Low Radiation
International Journal of Machine Intelligence and Sensory Signal Processing
International Journal of Machining and Machinability of Materials
International Journal of Management and Decision Making
International Journal of Management and Enterprise Development
International Journal of Management and Network Economics
International Journal of Management Concepts and Philosophy
International Journal of Management Development
International Journal of Management in Education
International Journal of Management Practice
International Journal of Managerial and Financial Accounting
International Journal of Manufacturing Research
International Journal of Manufacturing Technology and Management
International Journal of Masonry Research and Innovation
International Journal of Markets and Business Systems
International Journal of Mass Customisation
TechnologyInternational Journal of Materials and Structural IntegrityInternational Journal of Materials Engineering InnovationInternational Journal of Mathematical Modelling and Numerical OptimisationInternational Journal of Mathematics in Operational ResearchInternational Journal of Mechanisms and Robotic SystemsInternational Journal of Mechatronics and AutomationInternational Journal of Mechatronics and Manufacturing SystemsInternational Journal of Medical Engineering and InformaticsInternational Journal of Metadata, Semantics and OntologiesInternational Journal of MetaheuristicsInternational Journal of Microstructure and Materials PropertiesInternational Journal of Migration and Border StudiesInternational Journal of Migration and Residential MobilityInternational Journal of Mining and Mineral EngineeringInternational Journal of Mobile CommunicationsInternational Journal of Mobile Learning and OrganisationInternational Journal of Mobile Network Design and InnovationInternational Journal of Modelling in Operations ManagementInternational Journal of Modelling, Identification and ControlInternational Journal of Molecular EngineeringInternational Journal of Monetary Economics and FinanceInternational Journal of Multicriteria Decision MakingInternational Journal of Multimedia Intelligence and SecurityInternational Journal of Multinational Corporation StrategyInternational Journal of Multivariate Data AnalysisInternational Journal of Nano and BiomaterialsInternational Journal of NanomanufacturingInternational Journal of NanoparticlesInternational Journal of NanotechnologyInternational Journal of Network ScienceInternational Journal of Networking and SecurityInternational Journal of Networking and Virtual OrganisationsInternational Journal of Nonlinear Dynamics and ControlInternational Journal of Nuclear DesalinationInternational Journal of Nuclear Energy Science and TechnologyInternational Journal of Nuclear Governance, Economy and EcologyInternational Journal of Nuclear Hydrogen Production and ApplicationsInternational Journal of Nuclear Knowledge ManagementInternational Journal of Nuclear LawInternational Journal of Nuclear Safety and SecurityInternational Journal of Ocean Systems ManagementInternational Journal of Oil, Gas and Coal TechnologyInternational Journal of Operational ResearchInternational Journal of Organisational Design and EngineeringInternational Journal of Petroleum EngineeringInternational Journal of Physiotherapy and Life PhysicsInternational Journal of Planning and SchedulingInternational Journal of Pluralism and Economics EducationInternational Journal of Portfolio Analysis and ManagementInternational Journal of Postharvest Technology and InnovationInternational Journal of Power and Energy ConversionInternational Journal of Power ElectronicsInternational Journal of PowertrainsInternational Journal of Precision TechnologyInternational Journal of Private LawInternational Journal of Process Management and BenchmarkingInternational Journal of Process Systems EngineeringInternational Journal of Procurement ManagementInternational Journal of Product DevelopmentInternational Journal of Product Lifecycle ManagementInternational Journal of Product Sound QualityInternational Journal of Productivity and Quality ManagementInternational Journal of Project Organisation and ManagementInternational Journal of Public Law and PolicyInternational Journal of Public PolicyInternational Journal of Public Sector Performance ManagementInternational Journal of Qualitative Information Systems 
ResearchInternational Journal of Qualitative Research in ServicesInternational Journal of Quality and InnovationInternational Journal of Quality Engineering and TechnologyInternational Journal of Quantitative Research in EducationInternational Journal of Radio Frequency Identification Technology and ApplicationsInternational Journal of Rapid ManufacturingInternational Journal of Reasoning-based Intelligent SystemsInternational Journal of Reliability and SafetyInternational Journal of RemanufacturingInternational Journal of Renewable Energy TechnologyInternational Journal of Research, Innovation and CommercialisationInternational Journal of Responsible Management in Emerging EconomiesInternational Journal of Revenue ManagementInternational Journal of Risk Assessment and ManagementInternational Journal of Satellite Communications Policy and ManagementInternational Journal of Security and NetworksInternational Journal of Semantic and Infrastructure ServicesInternational Journal of Sensor NetworksInternational Journal of Service and Computing Oriented ManufacturingInternational Journal of Services and Operations ManagementInternational Journal of Services and StandardsInternational Journal of Services Operations and InformaticsInternational Journal of Services SciencesInternational Journal of Services Technology and ManagementInternational Journal of Services, Economics and ManagementInternational Journal of Shipping and Transport LogisticsInternational Journal of Signal and Imaging Systems EngineeringInternational Journal of Simulation and Process ModellingInternational Journal of Six Sigma and Competitive AdvantageInternational Journal of Smart Grid and Green CommunicationsInternational Journal of Smart Technology and LearningInternational Journal of Social and Humanistic ComputingInternational Journal of Social Computing and Cyber-Physical SystemsInternational Journal of Social Entrepreneurship and InnovationInternational Journal of Social Media and Interactive Learning EnvironmentsInternational Journal of Social Network MiningInternational Journal of Society Systems ScienceInternational Journal of Soft Computing and NetworkingInternational Journal of Software Engineering, Technology and ApplicationsInternational Journal of Space Science and EngineeringInternational Journal of Space-Based and Situated ComputingInternational Journal of Spatial, Temporal and Multimedia Information SystemsInternational Journal of Spatio-Temporal Data ScienceInternational Journal of Sport Management and MarketingInternational Journal of Strategic Business AlliancesInternational Journal of Strategic Change ManagementInternational Journal of Strategic Engineering Asset ManagementInternational Journal of Structural EngineeringInternational Journal of Student Project ReportingInternational Journal of Supply Chain and Inventory ManagementInternational Journal of Supply Chain and Operations ResilienceInternational Journal of Surface Science and EngineeringInternational Journal of Sustainable Agricultural Management and InformaticsInternational Journal of Sustainable AviationInternational Journal of Sustainable DesignInternational Journal of Sustainable DevelopmentInternational Journal of Sustainable EconomyInternational Journal of Sustainable ManufacturingInternational Journal of Sustainable Materials and Structural SystemsInternational Journal of Sustainable Real Estate and Construction EconomicsInternational Journal of Sustainable SocietyInternational Journal of Sustainable Strategic ManagementInternational Journal 
Research picks

Securing telemedicine
Telemedicine is slowly maturing, allowing greater connectivity between patients and healthcare providers through information and communications technology (ICT). One issue that has yet to be addressed fully, however, is security, and hence privacy.
Researchers writing in the International Journal of Ad Hoc and Ubiquitous Computing have turned to cloud computing to help them develop a new, strong authentication protocol for electronic healthcare systems. Prerna Mohit of the Indian Institute of Information Technology Senapati in Manipur, Ruhul Amin of the Dr Shyama Prasad Mukherjee International Institute of Information Technology in Naya Raipur, and G.P. Biswas of the Indian Institute of Technology (ISM) Dhanbad in Jharkhand, India, point out that medical information is personal and sensitive, so it is important that it remains private and confidential. The team's approach uses the flexibility of a mobile device to authenticate, so that a user can securely retrieve pertinent information without a third party having the opportunity to access that information at any point. In a proof of principle, the team carried out a security analysis and demonstrated that the system can resist attacks in which a malicious third party attempts to breach the security protocol. They add that the costs in terms of additional computation and communication resources are lower than those of other security systems reported in the research literature.
Mohit, P., Amin, R. and Biswas, G.P. (2021) 'An e-healthcare authentication protocol employing cloud computing', Int. J. Ad Hoc and Ubiquitous Computing, Vol. 36, No. 3, pp.155–168. DOI: 10.1504/IJAHUC.2021.113873
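The paper defines its protocol formally; purely to illustrate the challenge-response style of authentication that such e-health schemes build on (this is not the authors' actual construction), the sketch below shows a server verifying that a mobile client holds a pre-shared key. The key-distribution step, the message layout, and all names are assumptions made for the example.

```python
# Minimal challenge-response sketch (illustrative only; NOT the protocol
# of Mohit et al. -- the pre-shared key and message format are assumptions).
import hmac
import hashlib
import secrets

SHARED_KEY = secrets.token_bytes(32)  # assumed to be established at registration

def server_issue_challenge() -> bytes:
    """Server sends a fresh random nonce so that responses cannot be replayed."""
    return secrets.token_bytes(16)

def client_respond(nonce: bytes, user_id: str) -> bytes:
    """Client proves knowledge of the shared key without transmitting it."""
    return hmac.new(SHARED_KEY, nonce + user_id.encode(), hashlib.sha256).digest()

def server_verify(nonce: bytes, user_id: str, response: bytes) -> bool:
    """Server recomputes the tag and compares in constant time."""
    expected = hmac.new(SHARED_KEY, nonce + user_id.encode(), hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

nonce = server_issue_challenge()
tag = client_respond(nonce, "patient-42")
assert server_verify(nonce, "patient-42", tag)  # authentication succeeds
```

Because the key itself never crosses the network, an eavesdropper sees only a random nonce and a one-way tag, which is the basic property such protocols aim to scale up with cloud-side verification.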
Anticancer drugs from the monsoon
A small-branched shrub found in India and known locally as Moddu Soppu (Justicia wynaadensis) is used by the inhabitants of Kodagu district in Karnataka to make a sweet dish exclusively during the monsoon season. Research published in the International Journal of Computational Biology and Drug Design has looked at phytochemicals present in extracts from the plant that may have putative anticancer properties. C.D. Vandana and K.N. Shanti of PES University in Bangalore, Karnataka, and Vivek Chandramohan of the Siddaganga Institute of Technology in Tumkur, Karnataka, investigated several phytochemicals that had been reported in the scientific literature as having anticancer activity. They used a computer model to look at how well twelve different compounds "docked" with the relevant enzyme, thymidylate synthase, and compared this activity with that of a reference drug, capecitabine, which targets this enzyme. Thymidylate synthase is involved in making DNA for cell replication, and in cancer, uncontrolled cell replication is the underlying problem. If this enzyme can be blocked, DNA damage will accumulate in the cancer cells and potentially halt the cancer's growth. Two compounds had comparable activity and greater binding to the enzyme than capecitabine. The first, campesterol, is a well-known plant chemical with a structure similar to cholesterol; the second, stigmasterol, is another well-known phytochemical involved in the structural integrity of plant cells. The former proved more stable than the latter and represents a possible lead for further investigation and testing as an anticancer drug, the team reports.
Vandana, C.D., Shanti, K.N., Karunakar, P. and Chandramohan, V. (2020) 'In silico studies of bioactive phytocompounds with anticancer activity from in vivo and in vitro extracts of Justicia wynaadensis (Nees) T. Anderson', Int. J. Computational Biology and Drug Design, Vol. 13, Nos. 5/6, pp.582–601. DOI: 10.1504/IJCBDD.2020.113836
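As a rough illustration of the comparison step described above, ranking candidate compounds against a reference drug by docking score, consider the following sketch. The compound names match those discussed, but the scores are invented placeholders rather than the paper's data; more negative docking energies conventionally indicate stronger predicted binding.

```python
# Illustrative ranking of docking scores (kcal/mol); the values are
# placeholders, NOT the results reported by Vandana et al.
docking_scores = {
    "capecitabine (reference)": -6.1,
    "campesterol": -7.4,
    "stigmasterol": -7.2,
    "compound_x": -5.3,
}

reference = docking_scores["capecitabine (reference)"]
# More negative binding energy = stronger predicted binding to the enzyme.
better_than_reference = {
    name: score
    for name, score in docking_scores.items()
    if score < reference and "reference" not in name
}

for name, score in sorted(better_than_reference.items(), key=lambda kv: kv[1]):
    print(f"{name}: {score} kcal/mol (stronger predicted binding than reference)")
```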
Native reforestation benefits biodiversity
Timber harvest and agriculture have had an enormous impact on biodiversity in many parts of the world over the last two hundred years of the industrial era. One such region is the 20 to 50 kilometre belt of tropical dry evergreen forest that lies inland from the southeastern coast of India. Efforts to regenerate biodiversity there have been more successful where native tropical dry evergreen forest has been reinstated than where non-native Acacia has been planted, according to research published in the Interdisciplinary Environmental Review. Christopher Frignoca and John McCarthy of the Department of Atmospheric Science and Chemistry at Plymouth State University in New Hampshire, USA, Aviram Rozin of Sadhana Forest in Auroville, Tamil Nadu, India, and Leonard Reitsma of the Department of Biological Sciences at Plymouth explain how reforestation can be used to rebuild an ecosystem and increase the population sizes and diversity of flora and fauna. The team looked at efforts to rebuild the ecosystem of Sadhana Forest, where an area of 28 hectares had its water table replenished through intensive soil moisture conservation. The team observed rapid growth of planted native species and germination of two species of dormant Acacia seeds. A standard biological inventory of the area revealed 75 bird, 8 mammal, 12 reptile, and 5 amphibian species, as well as 55 invertebrate species across 22 invertebrate orders. When they looked closely at the data from bird abundance at point-count stations, invertebrate sweep-net captures and leaf-count detections, and visual observations of Odonata and Lepidoptera along fixed-pace transects, they saw far greater diversity in the areas where native plants thrived than in those planted with non-native Acacia. "Sadhana Forest's reforestation demonstrates the potential to restore ecosystems and replenish water tables, vital components to reversing ecosystem degradation, and corroborates reforestation efforts in other regions of the world," the team writes. "Sadhana Forest serves as a model for effective reforestation and ecosystem restoration," the researchers conclude.
Frignoca, C., McCarthy, J., Rozin, A. and Reitsma, L. (2021) 'Greater biodiversity in regenerated native tropical dry evergreen forest compared to non-native Acacia regeneration in Southeastern India', Interdisciplinary Environmental Review, Vol. 21, No. 1, pp.1–18. DOI: 10.1504/IER.2021.113781
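Site-to-site differences of the kind reported here are often quantified with a diversity index computed from raw count data. The sketch below applies the Shannon index to two invented species-count vectors; the numbers are illustrative only and are not the study's survey data.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species proportions."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Invented point-count totals per species at two hypothetical plots.
native_plot = [12, 9, 7, 5, 4, 3, 2, 2, 1, 1]   # more species, more evenly spread
acacia_plot = [25, 6, 2, 1]                      # dominated by a few species

print(f"native plot H' = {shannon_index(native_plot):.2f}")   # higher diversity
print(f"acacia plot H' = {shannon_index(acacia_plot):.2f}")   # lower diversity
```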
Protection from coronavirus and zero-day pathogens
Researchers in India are developing a disinfection chamber that integrates a system that can deactivate coronavirus particles. The team reports the details in the International Journal of Design Engineering. As we enter the second year of the COVID-19 pandemic, there are signs that the causative virus, SARS-CoV-2, and its variants may be with us for many years to come, despite the unprecedented speed with which vaccines against the disease have been developed, tested, and, in some parts of the world, rolled out. Sangam Sahu, Shivam Krishna Pandey, and Atul Mishra of BML Munjal University suggest that we could adapt the screening technology commonly used in security to check whether a person entering an area such as an airport, hospital, or government building is carrying a weapon, explosives, or contraband goods. Such a system might be augmented with a body temperature check for spotting a person with a fever that might be a symptom of COVID-19 or another contagious viral infection. They add that the screening system might also incorporate technology that can kill viruses on surfaces with a quick flash of ultraviolet light or a spray of chemical disinfectant. Airborne microbial diseases represent a significant ongoing challenge to public health around the world. While COVID-19 tops the agenda at the moment, seasonal and pandemic influenza are of perennial concern, as is the emergence of drug-resistant strains of tuberculosis. Moreover, we are likely to see other emergent pathogens, as we have many times in the past, any one of which could lead to an even greater pandemic catastrophe than COVID-19. Screening and disinfecting systems of the kind described by Sahu and colleagues could become commonplace and perhaps act as an obligatory frontline defense against the spread of such emergent pathogens even before they are identified. This approach to unknown threats is well known in the computer industry, where novel malware (so-called zero-day viruses) emerges before antivirus software is updated to recognize it, and so blanket screening and disinfection software is often used.
Sahu, S., Pandey, S.K. and Mishra, A. (2021) 'Disinfectant chamber for killing body germs with integrated FAR-UVC chamber (for COVID-19)', Int. J. Design Engineering, Vol. 10, No. 1, pp.1–9. DOI: 10.1504/IJDE.2021.113247

Wetware data retrieval
A computer hard drive can be a rich source of evidence in a forensic investigation, but only if the device is intact and undamaged; otherwise, many additional steps are needed to retrieve incriminating data from within, and these are not always successful even in the most expert hands. Research published in the International Journal of Electronic Security and Digital Forensics considers the data retrieval problems facing investigators presented with a hard drive that has been submerged in water. Alicia Francois and Alastair Nisbet of the Cybersecurity Research Laboratory at Auckland University of Technology in New Zealand point out that, under pressure, suspects in an investigation may attempt to destroy digital evidence prior to a seizure by the authorities. A common approach is simply to put a hard drive in water in the hope that damage to the circuitry and the storage media within will render the data inaccessible. The team has looked at the impact of water ingress on solid-state and conventional spinning magnetic disc hard drives, the timescale over which irreparable damage occurs, and how this relates to the likelihood of significant data loss from the device. Circuitry and other components begin to corrode rather quickly following water ingress. However, if a device can be retrieved and dried within seven days, there is a reasonable chance of it still working and the data being accessible. "Ultimately, water submersion can damage a drive quickly but with the necessary haste and skills, data may still be recoverable from a water-damaged hard drive," the team writes. If the device has been submerged in saltwater, however, irreparable damage can occur within 30 minutes, and the situation is worse for a solid-state drive, which will essentially be destroyed within a minute of saltwater ingress. The research provides a useful guide for forensic investigators retrieving hard drives that have been submerged in water.
Francois, A. and Nisbet, A. (2021) 'Forensic analysis and data recovery from water-submerged hard drives', Int. J. Electronic Security and Digital Forensics, Vol. 13, No. 2, pp.219–231. DOI: 10.1504/IJESDF.2021.113374
Of alcohol and bootlaces
There is no consensus across medical science as to whether there is a safe lower limit on alcohol consumption, nor whether a small amount of alcohol is beneficial. The picture is complicated by the various congeners, such as polyphenols and other substances, that are present in different concentrations in different types of alcoholic beverage: red and white wine, beers and ales, ciders, and spirits. Moreover, while there has been a decisive classification of alcohol consumption as a cause of cancer, there is strong evidence that small quantities have a protective effect on the cardiovascular system. Now, writing in the International Journal of Web and Grid Services, a team from China, Japan, Taiwan, and the USA has looked at how a feature of our genetic material, DNA, relates to ageing and cancer, and investigated a possible connection with alcohol consumption. The ends of our linear chromosomes are capped by repeated sequences of DNA base units that act as protective ends, almost analogous to the stiff aglets on each end of a bootlace. These protective sections are known as telomeres. With each cell replication, the telomeres on the ends of our chromosomes get shorter. This limits the number of times a cell can replicate before there is insufficient protection for the DNA between the ends, which encodes the proteins that make up the cell. Once the telomeres are damaged beyond repair or gone, the cell will die. This degradative process has been linked to the limited lifespan of the cells in our bodies and to the ageing process itself. Yan Pei of the University of Aizu in Aizuwakamatsu, Japan, and colleagues Jianqiang Li, Yu Guan, and Xi Xu of Beijing University of Technology, China, Jason Hung of the National Taichung University of Science and Technology, Taichung, Taiwan, and Weiliang Qiu of Brigham and Women's Hospital in Boston, USA, have carried out a meta-analysis of the scientific literature. Their analysis suggests that telomere length is associated with alcohol consumption. Given that shorter telomeres, even before they reach the critical length, can lead to genomic instability, this alcohol-associated shortening could offer insight into how cancerous tumour growth might be triggered. Telomere shortening is a natural part of the ageing process, but it is influenced by various factors beyond our control, such as paternal age at birth, ethnicity, gender, age, telomere maintenance genes, and genetic mutations of the telomeres. Telomere length is also affected by inflammation and oxidative stress and by environmental, psychosocial, and behavioural exposures; over some of these factors we have limited control, while over others, such as chronic exposure to large quantities of alcohol, we have greater control.
Li, J., Guan, Y., Xu, X., Pei, Y., Hung, J.C. and Qiu, W. (2021) 'Association between alcohol consumption and telomere length', Int. J. Web and Grid Services, Vol. 17, No. 1, pp.36–59. DOI: 10.1504/IJWGS.2021.113686
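A meta-analysis of the kind mentioned here typically pools per-study effect sizes weighted by their precision. The following is a minimal fixed-effect, inverse-variance sketch with invented numbers; it is not the authors' statistical pipeline.

```python
# Fixed-effect, inverse-variance meta-analysis sketch; the effect sizes and
# standard errors below are invented for illustration, NOT values from Li et al.
studies = [
    (-0.12, 0.05),  # (study effect size, standard error)
    (-0.08, 0.04),
    (-0.15, 0.07),
]

weights = [1.0 / se ** 2 for _, se in studies]          # precision weights
pooled = sum(w * es for (es, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1.0 / sum(weights)) ** 0.5

print(f"pooled effect = {pooled:.3f} (standard error {pooled_se:.3f})")
```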
Quality after the pandemic
Adedeji Badiru of the Air Force Institute of Technology in Dayton, Ohio, USA, discusses the notion of quality insight in the International Journal of Quality Engineering and Technology and how it relates to motivating researchers and developers working on quality certification programs after the COVID-19 pandemic. In the realm of product quality, we depend on certification based on generally accepted standards to ensure high quality. Badiru writes that the ongoing COVID-19 pandemic has caused serious disruption to production facilities and upended normal quality engineering and technology programs. In the aftermath of the pandemic, there will be a pressing need to redress this problem, and its impact on quality management processes may, as with many other areas of normal life, continue to be felt for a long time. Badiru suggests that now is the time to develop new approaches to ensure that we recover pre-COVID quality levels. In the area of quality certification, he suggests, we must look at other methods in this field, perhaps borrowing from other areas of quality oversight. One mature area from which the new normal of certification might borrow is academic accreditation. The work environment has changed beyond recognition through the pandemic and we are unlikely to revert to old approaches entirely. Indeed, the pandemic has already necessitated the urgent application of existing quantitative and qualitative tools and techniques to other areas, such as work design, workforce development, and the form of the curriculum in education. Action now, from the systems perspective in engineering and technology, "will get a company properly prepared for the quality certification of the future, post-COVID-19 pandemic," he writes. This will allow research and development of new products to satisfy the triad of cost, time, and quality requirements as we ultimately emerge from the pandemic.
Badiru, A. (2021) 'Quality insight: product quality certification post COVID-19 using systems framework from academic program accreditation', Int. J. Quality Engineering and Technology, Vol. 8, No. 2, pp.218–227. DOI: 10.1504/IJQET.2021.113728

Spotting and stopping online abuse
Social media has brought huge benefits to many of those around the world with the resources to access its apps and websites. Indeed, billions of people use the popular platforms every month in almost every country of the world. Researchers writing in the International Journal of High Performance Systems Architecture point out that, as with much in life, there are downsides that counter the positives of social media. One such negative facet of social media might be called "cyber violence". Randa Zarnoufi of the FSR, Mohammed V University in Rabat, Morocco, and colleagues suggest that the number of victims of this new form of hostility is growing day by day and is having a strongly detrimental effect on the psychological wellbeing of too many people. A perspective that has been little investigated with regard to reducing the level of cyber violence is to consider the psychological status and the emotional dimension of the perpetrators themselves. A new understanding of what drives people to commit heinous acts against others in the online world may improve our response to it and open up new ways to address the problem at its source, rather than attempting simply to filter, censor, or protect victims directly. The team has analysed social media updates using ensemble machine learning and the Plutchik wheel of basic emotions to extract the character of those updates in the context of cyber violence, bullying, and trolling behaviour. The analysis draws the perhaps obvious, but nevertheless highly meaningful, conclusion that there is a significant association between an individual's emotional state and their propensity to harmful intent in the realm of social media. Importantly, the work shows how this emotional state can be detected, so that the perpetrator of cyber violence might be approached with a view to improving their emotional state and reducing the negative impact their emotions would otherwise have on the people with whom they engage online. This is very much the first step in this approach to addressing the serious and growing problem of cyber violence. The team adds that they will train their system to detect specific issues in social media updates that are associated with harassment with respect to sexuality, appearance, intellectual capacity, and political persuasion.
Zarnoufi, R., Boutbi, M. and Abik, M. (2020) 'AI to prevent cyber-violence: harmful behaviour detection in social media', Int. J. High Performance Systems Architecture, Vol. 9, No. 4, pp.182–191. DOI: 10.1504/IJHPSA.2020.113679
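The ensemble approach mentioned here combines several learners and pools their predictions. As a generic sketch of that idea, and emphatically not the authors' features, corpus, or models, a soft-voting ensemble over bag-of-words text features might look like this (the toy messages and labels are invented):

```python
# Toy ensemble text classifier; data, labels, and model choices are invented
# placeholders, NOT the pipeline of Zarnoufi et al.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import make_pipeline

texts = ["you are wonderful", "I will hurt you", "have a nice day", "everyone hates you"]
labels = [0, 1, 0, 1]  # 0 = benign, 1 = abusive

model = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[("lr", LogisticRegression()), ("nb", MultinomialNB())],
        voting="soft",  # average the two models' predicted probabilities
    ),
)
model.fit(texts, labels)
print(model.predict(["I will hurt you"]))  # a training-like abusive example -> [1]
```

In a real system the interesting work lies in the features (here, emotion categories from a model such as Plutchik's wheel rather than plain word counts) and in a corpus large and varied enough to generalise.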
Me too #metoo
Sexual harassment in the workplace is a serious problem. To address it, we need a systematic, multistage preventive approach, according to researchers writing in the International Journal of Work Organisation and Emotion. One international response to sexual harassment problems across a range of industries, though it initially emerged from the entertainment industry, was the "#metoo" movement. Within this movement, victims of harassment and abuse told their stories through social media and other outlets to raise awareness of this widespread problem and to advocate for new legal protections and societal change. Anna Michalkiewicz and Marzena Syper-Jedrzejak of the University of Lodz, Poland, describe how they have explored perceptions of the #metoo movement with regard to its potential for reducing the incidence of sexual harassment. "Our findings show that #metoo may have had such preventive potential but it got 'diluted' due to various factors, for example, cultural determinants and lack of systemic solutions," the team writes. They suggest that because of these limitations the #metoo movement has yet to reach its full potential. The team's study considered 122 students finishing their master's degrees in management studies and readying themselves to enter the job market. They were surveyed about the categorisation of psychosocial hazards in the workplace, such as sexual harassment, that cause stress and other personal problems, as opposed to the more familiar physical hazards. "Effective prevention of [sexual harassment] requires awareness but also motivation and competence to choose and implement in the organisations adequate measures that would effectively change the organisational culture and work conditions," the team writes. The #metoo movement brought prominence to the issues, but the team suggests that it did not lead to the requisite knowledge and practical competence that would facilitate prevention. They point out that the much-needed social changes cannot come about within a timescale of a few months of campaigning. Cultural changes need more time and a willing media to keep attention focused on the problem and how it might be addressed. There is also a pressing need for changes in the law to be considered to help eradicate sexual harassment in the workplace.
Michałkiewicz, A. and Syper-Jędrzejak, M. (2020) 'Significance of the #metoo movement for the prevention of sexual harassment as perceived by people entering the job market', Int. J. Work Organisation and Emotion, Vol. 11, No. 4, pp.343–361. DOI: 10.1504/IJWOE.2020.113699

Data mining big data news
While the term "big data" has become something of a buzz phrase in recent years, it has a solid foundation in computer science in many contexts and as such has emerged into the public consciousness via the media and even government initiatives in many parts of the world. A North American team has looked at the media and undertaken a mining operation to unearth nuggets of news regarding the term. Murtaza Haider of the Ted Rogers School of Management at Ryerson University in Toronto, Canada, and Amir Gandomi of the Frank G. Zarb School of Business at Hofstra University in Hempstead, New York, USA, explain how big-data-driven analytics emerged as one of the most sought-after business strategies of the decade. They have now used natural language processing (NLP) and text mining algorithms to find the focus and tenor of news coverage surrounding big data. They mined a five-million-word body of news coverage for references to the novelty of big data, showcasing the usual suspects among big data geographies and industries. "The insights gained from the text analysis show that big data news coverage indeed evolved where the initial focus on the promise of big data moderated over time," the team found. Their work also demonstrates how text mining and NLP algorithms are potent tools for news content analysis. The team points out that while academic journals have been the main source of trusted and unbiased advice regarding computing technologies, large databases, and scalable analytics, it is the popular and trade press that serve as the information source for over-stretched executives. It was the popular media that became what the team describes as "the primary channel for spreading awareness about 'big data' as a marketing concept". They add that the news media certainly helped popularise innovative ideas being discussed in the academic literature; indeed, the academic literature has had to play catch-up during the last decade in sharing the news. That said, much of the news coverage during this time has been about the novelty and the promise of big data rather than the proofs of principle that are needed for it to proceed and mature as a discipline. Indeed, there are many big data clichés propagated in an often uncritical popular media, suggesting that big data analytics is some kind of information panacea. In contrast, the more reserved academic literature knows only too well that big data does not represent a cure-all for socio-economic ills, nor does it have unlimited potential.
Haider, M. and Gandomi, A. (2021) 'When big data made the headlines: mining the text of big data coverage in the news media', Int. J. Services Technology and Management, Vol. 27, Nos. 1/2, pp.23–50. DOI: 10.1504/IJSTM.2021.113574
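A minimal version of the term-trend mining described here, counting how often a phrase appears in news coverage per year, might look like the sketch below; the corpus is an invented stand-in for the authors' five-million-word dataset.

```python
from collections import Counter

# Stand-in corpus: (year, article text) pairs; real work would load millions
# of words of news text, not these invented snippets.
articles = [
    (2012, "Big data promises to transform retail analytics."),
    (2012, "The novelty of big data dominates the headlines."),
    (2015, "Firms move from big data hype to proven analytics."),
    (2018, "Analytics matures; big data is now routine infrastructure."),
]

mentions_per_year = Counter(
    year for year, text in articles if "big data" in text.lower()
)
for year in sorted(mentions_per_year):
    print(year, mentions_per_year[year])
```

Tracking such counts over time is the simplest way to see the "initial focus moderating" pattern the authors report; more serious analyses would normalise by corpus size per year and model tone as well as frequency.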
News

New Editor for International Journal of Applied Nonlinear Science (23 March, 2021). Prof. Wen-Feng Wang from the Interscience Institute of Management and Technology in India and the Shanghai Institute of Technology in China has been appointed to take over the editorship of the International Journal of Applied Nonlinear Science.

New Editor for Journal of Design Research (11 March, 2021). Prof. Jouke Verlinden from the University of Antwerp in Belgium has been appointed to take over the editorship of the Journal of Design Research. The journal's former Editor in Chief, Prof. Renee Wever of Linköping University in Sweden, will remain on the board as Editor.

Inderscience Editor in Chief receives Humboldt Research Award (5 March, 2021). Inderscience is pleased to announce that Prof. Nilmini Wickramasinghe, Editor in Chief of the International Journal of Biomedical Engineering and Technology and the International Journal of Networking and Virtual Organisations, has won a Humboldt Research Award. The award is conferred in recognition of the winner's academic record. Prof. Wickramasinghe will be invited to carry out research projects in collaboration with specialists in Germany. Inderscience's Editorial Office extends its warmest congratulations to Prof. Wickramasinghe on her achievement and thanks her for her continuing stellar work on her journals.

Best Reviewer Award announced by International Journal of Environment and Pollution (11 February, 2021). We are pleased to announce that the International Journal of Environment and Pollution has launched a new Best Reviewer Award. The 2020 award goes to Prof. Steven Hanna of the Harvard T.H. Chan School of Public Health in the USA. The senior editorial team thanks Prof. Hanna sincerely for his exemplary efforts.

Inderscience new address (11 February, 2021). As of 1st March 2021, the address of Inderscience in Switzerland will be: Inderscience Enterprises Limited, Rue de Pré-Bois 14, Meyrin, 1216 Geneva, Switzerland.

work_g6nalraddzc7df7eqmjqbnazm4 ---- Tamkang University Institutional Repository (淡江大學機構典藏): Bitstream Authorization Required

Sorry, the bitstream file is not authorized at this time. The constraint is as follows:
Access time constraint: 9999/12/31 ~
Access place constraint:
If you're having problems, or you expected the ID to work, feel free to contact the site administrators.
work_g7k5s3mvk5bnvnyj6ygtd6gv4y ---- Dodgson, N., Patterson, J., and Willis, P. (2010) What's up prof? Current issues in the visual effects & post-production industry. Leonardo: Art Science and Technology, 43 (1). pp. 92-93. ISSN 0024-094X. http://eprints.gla.ac.uk/47904/ Deposited on: 9 January 2012. Enlighten – Research publications by members of the University of Glasgow, http://eprints.gla.ac.uk

LEONARDO, Vol. 43, No. 1, pp. 92–93, 2010. ©2010 ISAST

WHAT'S UP PROF? CURRENT ISSUES IN THE VISUAL EFFECTS & POST-PRODUCTION INDUSTRY

Neil Dodgson, University of Cambridge, Computer Laboratory, CB3 0FD, U.K.
John Patterson, University of Glasgow, Dep't of Computing Science, G12 8QQ, U.K.
Phil Willis, University of Bath, Computer Science Dep't, BA2 7AY, U.K. E-mail: P.J.Willis@bath.ac.uk

Submitted: 25/2/2009

Abstract
We interviewed creative professionals at a number of London visual effects and post-production houses. We report on the key issues raised in those interviews: desirable new technologies, infrastructure challenges, and personnel and process management.

Visual effects companies began to establish themselves in the film industry in the 1980s. The potential of computers became fully apparent during the 1990s, when they began to generate realistic imagery [1]. In the U.K. alone, visual effects and post-production are now worth over a billion U.S. dollars. Today, the industry faces many issues critical to its future. To get a snapshot of current issues, we interviewed a range of creative professionals in London in December 2008. In particular, we elicited how those professionals in the creative industry thought that the universities could best help them.

The Organizations
We visited six organizations [A–F] representing different facets of the industry:
A. A large visual effects company, dealing mostly with movies. The company employs 20 technical staff, 400 artists, plus management.
B. A medium-sized post-production company, working on advertising, television, and movies. The company has over 100 employees, mostly visual effects artists.
C. A software developer with 50 employees, producing software for post-production and visual effects.
D. A systems developer with 70 employees producing combined software and hardware solutions for colour grading.
E. A scanning and recording house, a member of an international group providing full services to the film industry, specializing in converting between digital and analogue media.
F. An independent consultancy specializing in coordinating research projects in this industry.

The Issues
We asked each organization to discuss current problems and desires. We subsequently categorised them three ways:
1. Desirable new technologies.
2. Infrastructure.
3. Managing people and process.

1. Desirable new technologies
a) Human in the loop. There is much good university research on fully-automatic methods for image processing and computer vision. These work well at the low quality end of the market (e.g., segmentation and 3D reconstruction).
However, this work has had little impact on the high quality end, where everything is still done manually. It would be useful to investigate methods that solve particular problems (e.g., optical flow, boundary detection, and object detection) to help a human being either to direct the automated algorithm or to adjust the output of the automated algorithm quickly and efficiently. In either case the semi-automatic method will only be useful if the result is superior to the manual method while taking less time to achieve. [D]

b) Repurposing. Research is needed into effective ways to reuse both footage and 3D models. Models tend to be made anew for each sequel. This is understandable as technology moves on, but it is increasingly expensive. However, we also find that the 3D models used for a movie are not used for the simultaneously-released accompanying game. How can we make better use of existing assets? [C,F]

c) Finding assets. The databases of assets are now so large that we need to develop better ways to catalogue them and to search both images and 3D models. There are usually many different versions of a given asset: it is vital to find the correct version, not just the correct asset. [A,F]

d) 3D reconstruction. Reasonable methods for the reconstruction of 3D objects exist but they work best with frame-synchronised views from binocular cameras. The next challenge is the extraction of data of good enough quality for the reconstruction of a complete 3D scene from multiple movie cameras. Some aspects of this problem remain challenging. Support for 3D (stereoscopic) movie-making has become a priority for the industry following the popularity of recent 3D releases. [2,C]

e) Artistic control of physical simulation. Movie effects need to be visually plausible but the simulations that underlie them do not have to be physically realistic, nor work for longer than the shot. There has been considerable research on producing physically realistic simulations. The industry needs physically plausible simulation that can be directed and modified by the artist [3]. For example, can we build a water simulator where the artist can control where the water goes? Could we make a cloth simulator which is physically plausible but which gives the artist control over specific behaviours? How do we make things that look plausible when they are physically impossible? [A,E]

f) Making convincing digital humans. Human beings are good at recognizing and analysing the appearance and behaviour of other human beings. It is still difficult to make a convincing digital human. There is evidence that a digital human that is not quite convincing is more disturbing to the average viewer than a digital human that is clearly not meant to be realistic ("the uncanny valley" [4]). Compounding this is the problem that it is difficult to capture good face data and difficult to produce plausible animation of face data. Acquisition of human motion on set or on a soundstage is particularly expensive and therefore is only used if it is absolutely necessary. [A]

g) Breaking free from pixels. A non-pixel format (e.g., that in [5]) could be useful to break free from the problem of producing the same material at many different resolutions and needing to ensure that the original material is always shot at the highest resolution that you will need. Such a format would need to be able to handle all the processing that we currently do on pixelised images.
In the long term there would need to be input devices (cameras) and output devices (projectors) that could handle the non-pixel format. [B]

2. Infrastructure
a) Trans-coding media between digital formats. There has been a proliferation of formats. For example, a single work can be required in a dozen different formats, resulting in a lot of CPU time and staff time converting between them. One way in which we could tackle this is to develop a video version of Adobe's Portable Document Format, a single file format that can be converted at need either at the player or at the server when the player requests the file. [B]

b) Backup of large data stores. A post-production or visual effects house produces gigabytes of new data each day. At the small end of the scale, a 2K DPX movie frame requires 12MB, and a 4K frame can require as much as 144MB. At the large end, an advertising poster can be rendered with up to 600 megapixels, requiring 1.8GB. One company uses a 160 TB file store; another mentioned data volumes of several hundred terabytes. One company reported that no vendor of off-site backup was able to cope with the quantity of new data that they produce. Two companies commented that, because of the volume problem, they maintain their backups on site, with the obvious risks. [A,B,D,F]
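Those figures follow directly from resolution, channel count, and bit depth: uncompressed size in bytes is width x height x channels x bits-per-channel / 8. The sketch below reproduces the order of magnitude quoted; the exact resolutions and bit depths are assumptions for illustration, since scanned film formats vary.

```python
def frame_bytes(width, height, channels=3, bits_per_channel=10):
    """Uncompressed image size in bytes (ignores packing and header overhead)."""
    return width * height * channels * bits_per_channel / 8

# Assumed full-aperture scan resolutions; real DPX files vary by format.
print(f"2K (2048x1556, 10-bit RGB):  {frame_bytes(2048, 1556) / 1e6:.0f} MB")          # ~12 MB
print(f"4K (4096x3112, 32-bit RGB):  {frame_bytes(4096, 3112, 3, 32) / 1e6:.0f} MB")   # ~153 MB, near the quoted high end
print(f"600 MP poster (8-bit RGB):   {frame_bytes(30000, 20000, 3, 8) / 1e9:.1f} GB")  # 1.8 GB
# At 24 frames per second, even 12 MB frames imply roughly 1 TB per hour of footage.
```

At such rates it is easy to see why off-site backup vendors struggle to keep pace.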
c) Keeping up with technology. Technology changes rapidly. Companies devote much resource to making best use of new technology to speed up processes and keep ahead of the competition. It is not just a question of optimizing the effects algorithms: one company reported that less than 20% of their code did the actual effects work, with the rest of the code being required for data management. [D]

d) Archiving and cataloguing assets. Archiving everything is problematic. If done, cataloguing is important (see 1(c)). For example, an upcoming feature film has 1700 effects shots, with 4 million assets; variations on those assets produce 10 million identifiable objects. These take up several hundred terabytes. How do we archive something like this? There are many subsidiary questions within this problem: for example, is it sufficient to store the original imagery and models along with a description of the process to get to the final shot? [A,F]

e) Archiving footage in perpetuity. In addition to archiving assets in the short to medium term, there is a desire to archive the finished product forever. All physical media deteriorate, whether physical film, magnetic tape, or optical disc. Film has a life of around 40 years, though this varies considerably with storage conditions [6]. Some film has survived reasonably intact over 70 years [7]. LTO Ultrium (½" digital archive tape) has a predicted life of 15–30 years [8]. Can we develop mechanisms that robustly store digital footage for decades or centuries? If so, can we automatically migrate existing film archives to secure digital media? This is not a small problem: the British Film Institute has an archive of 150,000 movies [9]. The Internet Movie Database [10] reports 14,692 movies released in 2008, the equivalent of a hundred million feet of film per year. [E,F]

f) Healing the 2D/3D divide. There are currently separate workflows for 2D data (images) and 3D data (modelling). It would be useful to join the workflows in some way, especially as stereoscopic movies become more popular. [2,C]

g) Improving digital capture. There are currently no digital capture devices that can compete with film in quality of captured imagery. [E]

3. Managing people and process
a) Managing creative input. A decade ago, visual effects artists were generally aware of the underlying technology and of the entire pipeline from concept to the finished film. Today, young artists, while still skilled creatively, are far less knowledgeable technically. They can thus either fail to use the full power of the technology or fail to understand the implications of their actions for the later stages of the pipeline. [A]

b) Managing workflow. The current methods for visual effects and post-production follow a production line: each step in the process builds on the previous one. Can we break free of this production-line method and provide effective feedback loops between the different links in the production chain? [A]

c) Managing a large workforce. The industry once consisted of small companies, within each of which everyone knew everyone else. Over the last decade, several of the companies have become too large to work in this way. How do we manage this creative, collaborative process when people in different parts of the chain do not know each other and have only a basic understanding of each other's roles? [A]

d) Managing client expectations. Visual effects are now an ordinary part of the production pipeline, rather than anything special. Some movies now have over a thousand effects shots and even non-effects movies employ a lot of digital post-production. For example, a recent live-action movie with no visual effects still had over 900 shots that required CGI post-production, such as changing the sky colour and moving or removing background elements. Much effects work is time-consuming and labour-intensive. Many effects are generated using one-off solutions that are thrown together to get the result wanted by the director. Despite these difficulties, the companies find that their clients have little appreciation of which effects are straightforward to produce and which are extraordinarily expensive. There is a common belief that, if they have seen an effect in some other movie, then it must be straightforward to produce. [A,B]

Implications and Conclusions
With regard to research timescales, the universities and companies differ. The companies need solutions to their current problems, on a timescale of 6 to 24 months. The universities need to work on problems that will become pressing in 5 to 10 years' time or on problems for which no solution is obvious to industry. The latter are those problems to which no company will devote resources but for which a solution would be useful, if one could be found. Computer graphics and image processing researchers are best placed to tackle the development of new technologies in (1). These are also the problems best suited to university timescales. We are working with some of the companies to research certain of these. Our colleagues in networking, information retrieval, databases, and engineering are best placed to tackle research issues in infrastructure (2), particularly how to handle backup and archive of large datasets. The managerial issues (3) demonstrate that some of the biggest problems facing the industry have little to do with technology and everything to do with people.

References and Notes
1. Richard Rickitt, Special Effects: the history & technique (Virgin Books, 2000).
2. Lenny Lipton, "Digital stereoscopic cinema: the 21st century", Proc. SPIE 6803, 2008.
3. Ronen Barzel, "Faking Dynamics of Ropes and Springs", IEEE Computer Graphics & Applications 17(3), pp. 31–39, 1997.
4. F.C. Gee, W.N. Browne, K. Kawamura, "Uncanny valley revisited", IEEE International Workshop on Robot and Human Interactive Communication 2005 (ROMAN 2005), pp. 151–157, 2005.
5. John Patterson, Philip J. Willis, "Image Processing and Vectorisation", International patent application PCT/GB2007/002470, filed 5 July 2007, U.K.
6. James M. Riley, IPI Storage Guide for Acetate Film, Image Permanence Institute, 1993.
7. British Film Institute Mitchell & Kenyon Collection, accessed 26 February 2009.
8. Sun Microsystems LTO Ultrium tape cartridge specifications, accessed 26 February 2009.
9. British Film Institute National Archive, accessed 25 February 2009.
10. Internet Movie Database, accessed 25 February 2009.

work_gbwegerwnngtpd5va6ifmrlnnu ---- On edited archives and archived editions | SpringerLink

Part of a collection: Special Issue on Digital Scholarly Editing. Research Article. Published: 29 April 2019.
On edited archives and archived editions
Wout Dillen
International Journal of Digital Humanities, volume 1, pages 263–277 (2019)

Abstract
Building on a longstanding terminological discussion in the field of textual scholarship, this essay explores the archival and editorial potential of the digital scholarly edition. Following Van Hulle and Eggert, the author argues that in the digital medium these traditionally distinct activities now find the space they need to complement and reinforce one another. By critically examining some of the early and more recent theorists and adaptors of this relatively new medium, the essay aims to shed a clearer light on some of its strengths and pitfalls. To conclude, the essay takes the discussion further by offering a broader reflection on the difficulties of providing a 'definitive' archival base transcription of especially handwritten materials, questioning if this should be something to aspire to for the edition in the first place.

[Figures 1–3 are not reproduced here.]

Notes
1. As Patrick Sahle posited in the second part of his Digitale Editionsformen: 'Das Kennzeichen des gegenwärtigen Medienwandels ist nicht so sehr ein Wechsel des Medien, sondern vielmehr ein Transmedialisierung!' (roughly: 'The hallmark of the current change of media is not so much a switch of medium as a transmedialisation!') (2013: 161; see also 162).
2. In 'Edition, Project, Database, Archive, Thematic Research Collection: What's in a Name?' Price weighed a series of alternatives against one another and makes a case for switching to the concept of 'arsenal' instead (2009).
3. See: http://www.beckettarchive.org/introduction.jsp. Note the use of the word 'series' here, another term to add to the list – and one that is again perhaps more firmly rooted in print culture.
4. Gerrit Brünning, one of the collaborators on the Faust Edition, explained as much at a talk that he gave at the University of Antwerp as part of the Platform Digital Humanities Lecture Series (26 March 2018).
5. More specifically, Eggert mentions the ISO-646 character set. This character set is a successor of ASCII (the American Standard Code for Information Interchange), and the predecessor of today's international standard character set called Unicode.
6. In fact, Shillingsburg's own list of these 'visual elements with semantic force' for manuscripts explicitly includes 'insertions above and below lines and in margins' (2015, 17).
7. In his paper, Shillingsburg foresees two exceptions to this rule: 'a new authoritative witness to the work or the discovery of error in the original work' (2015: 24). But the images that represent the document may need to be updated as well, if the edition wants to conform to newer and higher digital imaging standards. Such an update will invariably have a number of implications for the image-text linking tools that the content management framework uses, but it may also have consequences for the text, if the new image clarifies a textual feature that the old image could not.
8. The CHCA and its multi-version-document (MVD) encoding scheme are discussed in more detail elsewhere in this volume.

References
Beckett Digital Manuscript Project. Retrieved March 30 2018 from: www.beckettarchive.org.
Boot, P., Fischer, F., & Van Hulle, D. (2017). Introduction. In Boot, P., Cappellotto, A., Dillen, W., Fischer, F., Kelly, A., Mertgens, A., Sichani, A., Spadini, E., & Van Hulle, D. (Eds.), Advances in digital scholarly editing. Papers presented at the DiXiT conferences in the Hague, Cologne, and Antwerp (pp. 15–22). Leiden: Sidestone Press.
Brünning, G., Henzel, K., & Pravida, D. (2013). Multiple encoding in genetic editions: The case of Faust. Journal of the Text Encoding Initiative, 4, 1–12. http://jtei.revues.org/697. Accessed 23 April 2019.
Dahlström, M. (2000). Drowning by versions. Human IT, 4(4). http://etjanst.hb.se/bhs/ith/4-00/md.htm. Accessed 23 April 2019.
Dahlström, M. (2009). The Compleat edition. In M. Deegan & K. Sutherland (Eds.), Text editing, print, and the digital world (pp. 27–44). Basingstoke: Ashgate.
Dahlström, M., & Dillen, W. (2017). Review of Litteraturbanken: the Swedish Literature Bank. RIDE, 6. https://doi.org/10.18716/ride.a.6.2.
Eggert, P. (2005). Text-encoding, theories of the text, and the "work-site". Literary and Linguistic Computing, 20(4), 425–435.
Eggert, P. (2017). The archival impulse and the editorial impulse. In P. Boot, A. Cappellotto, W. Dillen, F. Fischer, A. Kelly, A. Mertgens, A.-M. Sichani, E. Spadini, & D. Van Hulle (Eds.), Advances in digital scholarly editing. Papers presented at the DiXiT conferences in the Hague, Cologne, and Antwerp (pp. 121–124). Leiden: Sidestone Press.
Evenson, J. (1999). Electronic Archives: Creating a New Bibliographic Code. Paper presented at the ACH-ALLC conference in Charlottesville, Virginia, USA.
Faust Edition. Retrieved March 30 2018 from: http://beta.faustedition.net.
Henny-Krahmer, U., & Neuber, F. (2017). Editorial: Reviewing digital text collections. RIDE, 6. https://doi.org/10.18716/ride.a.6.0. Accessed 30 March 2018.
Huitfeldt, C., & Sperberg-McQueen, C. M. (2008). What is a transcription? Literary and Linguistic Computing, 23(3), 295–310.
Litteraturbanken. Retrieved March 30 2018 from: https://litteraturbanken.se/start.
Neuber, F., & Henny-Krahmer, U. (2018). Editorial: Digital text collections - take two, Action! RIDE, 8. https://doi.org/10.18716/ride.a.8.0.
NietzscheSource. Retrieved March 30 2018 from: http://www.nietzschesource.org.
Nixon, M., & Van Hulle, D. (2017). Samuel Beckett's library. Cambridge: Cambridge University Press.
Price, K. (2007). Electronic scholarly editions. In S. Schreibman & R. Siemens (Eds.), A companion to digital literary studies (pp. 434–450). Malden: Blackwell Publishing.
Price, K. (2009). Edition, Project, Database, Archive, Thematic Research Collection: What's in a Name? DHQ, 3(3). http://www.digitalhumanities.org/dhq/vol/3/3/000053/000053.html. Accessed 23 April 2019.
Robinson, P. (1996). Is there a text in these variants? In R. Finneran (Ed.), The literary text in the digital age (pp. 99–115). Ann Arbor: University of Michigan Press.
Robinson, P. (2009). What text really is not, and why editors have to learn to swim. Literary and Linguistic Computing, 24(1), 41–52.
Robinson, P. (2016, 6 October). The Revolution is Coming. Paper presented at Digital Scholarly Editing: Theory, Practice, Methods. ESTS 2016 / DiXiT 3. Antwerp, Belgium.
Robinson, P., & Solopova, E. (1993). Guidelines for transcription of the manuscripts of the Wife of Bath's Prologue. In N. Blake & P. Robinson (Eds.), The Canterbury Tales project occasional papers (pp. 19–52). Oxford: Office for Humanities Communication.
Sahle, P. (2005). Digitales Archiv–Digitale Edition. Anmerkungen zur Begriffsklärung. In M. Stolz (Ed.), Literatur und Literaturwissenschaft auf dem Weg zu den neuen Medien. Bern: germanistik.ch. https://www.germanistik.ch/publikation.php?id=Digitales_Archiv_und_digitale_Edition. Accessed 23 April 2019.
Sahle, P. (2013). Digitale Editionsformen. Zum Umgang mit der Überlieferung unter den Bedingungen des Medienwandels. Teil 2: Befunde, Theorie und Methodik. Norderstedt: Books on Demand.
Shillingsburg, P. (1996). Scholarly editing in the computer age. Theory and practice (3rd ed.). Ann Arbor: The University of Michigan Press.
Shillingsburg, P. (2015). Development principles for virtual archives and editions. Variants. The Journal of the European Society for Textual Scholarship, 11, 11–28.
Steinkrüger, P. (2014). Review of Nietzschesource. RIDE, 1. https://doi.org/10.18716/ride.a.1.4.
Van Hulle, D. (1999). Authenticity or Hyperreality in hypertext editions. Human IT, 1, 227–244. http://etjanst.hb.se/bhs/ith/1-99/dvh.htm. Accessed 23 April 2019.
Van Hulle, D. (2009). Editie en/of Archief: modern manuscripten in een digitale architectuur. Verslagen en Mededelingen van de Koninklijke Academie voor Nederlandse Taal- en Letterkunde, 119(2), 163–178.

Author information: Wout Dillen, Centre for Manuscript Genetics, University of Antwerp, Antwerp, Belgium.
Cite this article: Dillen, W. On edited archives and archived editions. Int J Digit Humanities 1, 263–277 (2019).
https://doi.org/10.1007/s42803-019-00018-4. Published: 29 April 2019. Issue date: 04 July 2019.
Keywords: Digital scholarly editing; Textual criticism; Archives; Editions. Part of a collection: Special Issue on Digital Scholarly Editing.

work_ge4v2fbe3naqpivqnw56pl3h4m ----
Review of Oliver Grau, Janina Hoth, & Eveline Wandl-Vogt (Eds.) (2019). Digital Art through the Looking Glass: New strategies for archiving, collecting and preserving in digital humanities. Hamburg/Krems/Vienna: Edition Donau-Universität Krems and Austrian Academy of Sciences. 312 pp. ISBN 9783903150515 (E-Book)
Penesta Dika
Postdigital Science and Education, volume 2, pages 506–510 (2020). Published: 17 January 2020.

Notes
1. See http://www.mediaarthistory.org/retrace.
2. See http://www.virtualart.at/nc/home.html.
3. See https://www.hek.ch/en.html.
4. See https://www.comune.venezia.it/it/content/algorithmic-signs-ernest-edmonds-manfred-mohr-vera-moln-r-frieder-nake-roman-verostko-0 (in Italian) and https://aru.ac.uk/storylab/our-research/algorithmic-signs (in English).

References
Beiguelman, G., & Gonçalves Magalhães, A. (Eds.). (2014). Possible futures: art, museums and digital archives. Sao Paulo: Editora Peiropolis.
Grau, O., Coones, W., & Rühse, V. (Eds.). (2017). Museum and archive on the move. Changing cultural institutions in the digital era. Berlin and Boston: Walter De Gruyter.
Grau, O., Hoth, J., & Wandl-Vogt, E. (Eds.). (2019). Digital art through the looking glass: new strategies for archiving, collecting and preserving in digital humanities. Hamburg/Krems/Vienna: Edition Donau-Universität Krems and Austrian Academy of Sciences.
Hall, G. (2002). Culture in bits: the monstrous future of theory. London: Continuum.
Jandrić, P., Knox, J., Besley, T., Ryberg, T., Suoranta, J., & Hayes, S. (2018). Postdigital science and education. Educational Philosophy and Theory, 50(10), 893–899. https://doi.org/10.1080/00131857.2018.1454000.
Jandrić, P., Ryberg, T., Knox, J., Lacković, N., Hayes, S., Suoranta, J., Smith, M., Steketee, A., Peters, M. A., McLaren, P., Ford, D. R., Asher, G., McGregor, C., Stewart, G., Williamson, B., & Gibbons, A. (2019). Postdigital dialogue. Postdigital Science and Education, 1(1), 163–189. https://doi.org/10.1007/s42438-018-0011-x.
Negroponte, N. (1998). Beyond digital. Wired, (12 January). http://www.wired.com/wired/archive/6.12/negroponte.html.

Author information: Penesta Dika, Kunstuniversität Linz, Linz, Austria, and University of Business and Technology in Prishtina, Prishtina, Kosovo.
Cite this article: Dika, P. Review of Oliver Grau, Janina Hoth, & Eveline Wandl-Vogt (Eds.) (2019). Digital Art through the Looking Glass: New strategies for archiving, collecting and preserving in digital humanities. Postdigit Sci Educ 2, 506–510 (2020). https://doi.org/10.1007/s42438-020-00100-z. Published: 17 January 2020. Issue date: April 2020.
Keywords: Postdigital; Arts; Archive; Collection; Preservation; Digital humanities.

work_gheccvgisbbcvplzywm7tjv5be ----
Microsoft Word - fenlon_RO2019_preprintSubmission.docx

Interactivity, Distributed Workflows, and Thick Provenance: A Review of Challenges confronting Digital Humanities Research Objects
Katrina Fenlon (kfenlon@umd.edu; https://orcid.org/0000-0003-1483-5335)

Introduction
While Research Objects (ROs) are primarily oriented toward scientific research workflows, the RO model and parallel approaches have gained some uptake in the humanities, enough to suggest their potential to undergird sustainable, networked humanities research infrastructures. Digital scholarship in the humanities takes a great variety of forms that range widely beyond traditional publications, and which incorporate narratives, media, datasets and interactive components—any of which may be physically dispersed as well as dynamic and evolving over time. Despite the rapid growth of digital scholarship in the humanities, most existing research infrastructures lack support for the creation, management, sharing, maintenance, and preservation of complex, networked digital objects.
ROs, and the community and tools that are growing around ROs, offer a potential, partial solution. While the concept of the RO has seen significantly more uptake in the humanities than has the formal data model (Bechhofer et al., 2013; Belhajjame et al., 2015), several compelling applications of the concept suggest the time is ripe for considering broader integration of the model into distributed infrastructures. These applications include platforms for data sharing and collaborative scholarship, platforms for digital and semantic publishing, and digital repositories in several domains. This paper reviews existing applications of the RO model to identify challenges confronting the application of ROs to humanities digital scholarship.
This paper builds on Fenlon (2019), which investigated the application of the RO model to digital humanities collections, and which identified three promising strengths of the model for the realm of digital humanities: (1) ROs readily perform the most essential function of a collection: to aggregate related resources in order to support scholarly objectives; (2) ROs have the capacity for explicit, semantic descriptions of interrelationships among components that are often hidden in digital humanities collections (and therefore vulnerable to dissolution); and (3) the RO model accommodates aggregations of linked data, offering researchers the opportunity to create and annotate virtual, fully referential collections. Having identified some strengths and limitations of the RO model for digital humanities collections through one experimental application of the model, this paper builds on that analysis by reviewing the literature on ROs in the humanities and examining a range of applications of the RO and similar models within humanities and cultural heritage domains.
This paper frames the review around three main challenges and their implications for future implementations of ROs to support digital research in the humanities: First, digital humanities scholarship requires specialized interactive use, so realizing the advantages of ROs for the humanities will depend on implementations that create platforms for experimentation and development by communities. Second, the idiosyncratic workflows employed in the construction of networked humanities scholarship mean that workflow-oriented ROs will not gain significant uptake in the humanities unless they can capture distributed, sociotechnical workflows in meaningful ways. Third, humanities ROs will require capturing provenance in ways and at a level of detail that may be unfamiliar from the vantage of the RO's scientific origins; humanities scholarship requires "thick," multilayered, context-rich provenance descriptions that can accommodate conflicting assertions and formalize uncertainty.

Challenge 1. Essential interactivity for specialized use
Much of humanities digital scholarship is essentially interactive. New modes of production and publication in the humanities are intended for user interaction or participation, and for dynamic and responsive representation based on research context. Digital collections and archives, digital editions, maps, models, and simulations, and other modes of digital scholarship all rely on interactive components to express their interpretive contributions, or to enact their scholarly purposes.
The interactive and dynamic components of digital scholarship include things like customized browsing and searching facilities that take advantage of extensive, rich scholarly encodings and annotations; platforms for collaborative annotation; dynamic maps and visualizations; etc. Such components are intended to do multiple things at once: to make arguments, to manifest interpretive stances, to enable knowledge transfer, and simultaneously to serve as active platforms for ongoing interpretation and research (Palmer, 2009; Fenlon, 2017; and others). Prior empirical work on applying the RO model to digital humanities collections found the main limitation of the model for digital humanities collections to be that functional components, designed for ongoing end-user interaction, are not usefully captured in a basic RO model and instead fall to the implementations built on top of research-object management systems (Fenlon, 2019). ROs can, of course, accommodate interactive components as flat code objects, and ROs have been employed for this purpose to support data migration and archiving (e.g., the RO BagIt profile). But the purpose of digital humanities scholarship is to be alive and functional, and making ROs useful in this domain will require implementations that support platforms for flexible, participatory development.
In a conceptual sense, the RO model has demonstrated value for this kind of platform approach in the humanities. The Perseids project offers a platform for sharing and peer review of the transcriptions, annotations, and analyses that constitute research data in the Classics. The Perseids architecture is built around the concept of data publications, which are modeled as collections of related data objects. The Perseids team explicitly relates the data publication model to the RO model (Almas, 2017). Like ROs, Perseids data publications weave in several domain standards (including the TEI EpiDoc schema, W3C Web Annotation, and others) to undergird an infrastructure that supports scholarly requirements specific to the Classics domain: transcription, fine-grained annotation, collaborative editing (with versioning), a research environment that facilitates data-type-specific extensions, and tailored workflows for peer review (Almas, 2017). Similarly, the CERES (Community Enhanced Repository for Engaged Scholarship) toolkit, created by the Northeastern University Libraries Digital Scholarship Group, explicitly draws on the concept of the RO in its system for supporting networked humanities scholarship and publishing. CERES allows digital humanities creators to build custom publications that pull objects from different repositories using APIs (including the Northeastern University Libraries' Digital Repository Service and the Digital Public Library of America) (Sweeney, Flanders & Levesque, 2017). It is unclear how the RO model may fit into the broader, more diversified landscape of linked data and the Semantic Web in cultural institutions and in the humanities, but the conceptual fit within digital scholarship is established. ROs and similar models have substantial potential to underpin systems that support a variety of implementations. Realizing the advantages of ROs for the humanities will depend on implementations that create platforms for experimentation and collaborative development by distributed communities (Fenlon, 2019).
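To make the pull-from-repositories pattern concrete, here is a minimal sketch of how a CERES-style publication might assemble externally hosted objects by reference. It is not CERES's actual implementation: the endpoints, identifiers, and field names are all invented for illustration.

```python
import requests

# Hypothetical repository endpoints; a real system (such as CERES) would
# target specific APIs, e.g. a library's repository service or the DPLA.
REPOSITORIES = {
    "repo_a": "https://repository-a.example.org/api/items/",
    "repo_b": "https://repository-b.example.org/api/items/",
}

def fetch_item(repo, item_id):
    """Retrieve one item's descriptive metadata from a remote repository."""
    response = requests.get(REPOSITORIES[repo] + item_id, timeout=10)
    response.raise_for_status()
    return response.json()

def build_publication(spec):
    """Assemble a simple publication unit from distributed objects.

    `spec` lists (repository, item_id) pairs chosen by the editor. The
    publication stores only references plus cached display metadata,
    leaving the objects themselves in their home repositories.
    """
    publication = {"title": "Example exhibit", "objects": []}
    for repo, item_id in spec:
        record = fetch_item(repo, item_id)
        publication["objects"].append({
            "source": REPOSITORIES[repo] + item_id,    # stable reference
            "label": record.get("title", "untitled"),  # cached for display
            "thumbnail": record.get("thumbnail_url"),
        })
    return publication

exhibit = build_publication([("repo_a", "ms-042"), ("repo_b", "photo-17")])
```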
Such platforms must accommodate dynamic interface-building, to allow scholarly communities with distinctive interests and needs to mobilize ROs in different ways. They must also accommodate participation and co-creation through contributions of linked-data annotations and enrichments, including linking among ROs and the concepts and entities within ROs.

Challenge 2. Distributed and idiosyncratic workflows of networked humanities scholarship
Humanities digital scholarship is increasingly networked: heavily interconnected with and dependent on external resources for functionality and meaning. Many digital humanities publications in various forms—monographs, multimedia productions, exhibits, collections—draw on, reference, embed, and patch together distributed resources called from other collections, often via API. For example, a collection may center on a set of high-resolution images of primary sources, which are called from another digital library's IIIF image server. Some of the longest-running, large-scale cultural heritage digital libraries (including Europeana and the Digital Public Library of America) are aggregations of descriptive surrogates, which link to original content hosted externally. Externally maintained schemas, authorities, and utilities undergird digital editions. Visualization and mapping projects generate content using external services. And with the growth of linked data in cultural collections, projects increasingly leverage external data sources as primary content, to which scholars then add layers of interpretive narrative, annotations, context, and interconnection.
Humanities workflows rarely happen in self-contained or end-to-end research infrastructures, thwarting the possibility of sufficiently rich, automatic workflow capture. Indeed, efforts to build a workflow-oriented, unified cyberinfrastructure for supporting humanities scholarship tend to founder (e.g., Dombrowski, 2014). However, niche, task- or domain-specific infrastructures can capture constrained workflows. For example, in the domain of musicology, Page et al. (2017) observe how digital editions and annotations of encoded works are "manifestations of workflows deployed in musicological scholarship," and offer a compelling framework for representing musical ROs, which include images, text, audio, and encoded music (Page et al., 2017; De Roure et al., 2018). Computational workflows are readily captured within humanities research environments, and ROs have come into play for this purpose. For example, the HathiTrust Research Center Data Capsule environment is moving toward systematic provenance-capture for computational text analysis workflows. These workflows take as inputs worksets (Jett et al., 2017), which are conceptually and technically akin to ROs: aggregate digital objects that implement addressability for and relational expressivity among components using domain ontologies. Unlike ROs, worksets are envisioned as the inputs of workflows in the current model of the HathiTrust Data Capsule environment, rather than encompassing whole research workflows (Murdock et al., 2017). But workflow-oriented ROs will not gain significant uptake in humanities contexts unless they can also capture and make useful more complex, distributed, sociotechnical workflows in meaningful ways. With their capacity for linked data using domain vocabularies, ROs readily accommodate many of the artifacts of networked digital scholarship in the humanities, along with their interrelationships (Fenlon, 2019).
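By way of illustration, the sketch below models such an aggregate with the rdflib library, combining an ORE aggregation with an explicitly typed relationship between two distributed components. All URIs are invented, and the property choices are one plausible encoding rather than a normative RO serialization.

```python
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import DCTERMS, RDF

ORE = Namespace("http://www.openarchives.org/ore/terms/")
EX = Namespace("https://example.org/ro/")  # invented project namespace

g = Graph()
ro = EX["collection-1"]

# The research object is an ORE aggregation of distributed resources.
g.add((ro, RDF.type, ORE.Aggregation))
g.add((ro, DCTERMS.title, Literal("Letters of an example poet: a virtual collection")))

facsimile = URIRef("https://images.example.org/iiif/letter-17/full/full/0/default.jpg")
edition = URIRef("https://editions.example.org/letter-17.tei.xml")

for resource in (facsimile, edition):
    g.add((ro, ORE.aggregates, resource))

# Interrelationships among components are made explicit rather than being
# left implicit in an interface: the encoded edition transcribes the image.
g.add((edition, DCTERMS.relation, facsimile))
g.add((edition, DCTERMS.description, Literal("TEI transcription of the facsimile")))

print(g.serialize(format="turtle"))
```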
But can ROs accommodate humanities workflows in useful ways? In their effort to undergird DARIAH (the pan-European infrastructure for digital arts and humanities research) through the systematic production of humanities ROs, Blanke and Hedges (2013) observed that humanities scholars employ sequential workflows, but "except in relatively specialised cases we rarely encountered workflows that could be automated, shared with and used by others, such as occur in many scientific disciplines." While auto-generated and computer-useable workflows may not apply to most humanities research processes, formally characterized, (semi-)manually captured workflows would be highly useful for review, validation, archiving, reproducibility, reuse, and other purposes. While the RO model has the capacity and flexibility for complex workflow representation, more research is needed to characterize humanities workflows; to identify how such characterizations can be made useful; and to identify model extensions and unique implementation strategies workflows might require in different domains.

Challenge 3. Thick provenance
Drilling down on the problem of workflow capture, digital humanities scholarship places special demands on data provenance—not only on the provenance of digital resources (such as files, compound objects, datasets) or components thereof (such as passages of music, paragraphs of a text, or lines of a poem), but also the provenance of attached, contextual information. Archival artifacts—the evidence of the humanities—often possess simultaneous, multiple and parallel provenances (Gilliland, 2014; Hurley, 2005). Documenting the provenance of the evidence itself can be complicated, but beyond that, the provenance of the provenance must also be documented. Any assertion made about any artifact (in the form of metadata or annotation), or any contextual and secondary information attached to artifacts in the context of digital scholarship, requires provenance. Annotations and metadata are often, in the humanities, products of scholarly, interpretive work. Therefore, each annotation or metadata proposition itself is subject to claims of authorship, competing perspectives, expression of uncertainty, and further annotation—all requiring provenance information. Because provenance is a multilayered thing in humanities scholarship, different humanities disciplines and subdisciplines may require domain-specific provenance schemas and standards, which specialize existing standards for the expression of the provenance of different kinds of resources, ranging from digital media files to annotations. Humanities ROs will require thick, multilayered, context-rich provenance descriptions, which can accommodate conflicting assertions and formalize uncertainty. It is unclear whether existing implementations of the RO model can accommodate this level of description, though the model itself has the capacity.
The ResearchSpace environment (Oldman and Tanase, 2018) offers exemplary support for documentation of thick, multifaceted provenance of humanities ROs. ResearchSpace is an open-source platform created by the British Museum to facilitate scholarly data sharing, formal argumentation, and semantic publishing within communities of researchers. ResearchSpace does not directly employ the RO model, though its architecture does rely on aggregates of linked data, taking advantage of related standards including W3C Web Annotation and Linked Data Platform containers.
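To give a feel for what "provenance of the provenance" can look like in linked data, here is a minimal sketch of a scholarly assertion modelled as a Web Annotation with PROV attribution and an explicitly recorded confidence value. The URIs and the certainty property are invented for illustration; CRMInf, discussed below, offers a far richer, domain-specific treatment of argument and belief.

```python
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import RDF, XSD

OA = Namespace("http://www.w3.org/ns/oa#")
PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("https://example.org/claims/")  # invented namespace

g = Graph()
annotation = EX["anno-1"]
artifact = URIRef("https://archive.example.org/ms/fragment-9")

# A scholarly assertion about the artifact, modelled as a Web Annotation.
g.add((annotation, RDF.type, OA.Annotation))
g.add((annotation, OA.hasTarget, artifact))
g.add((annotation, OA.bodyValue,
       Literal("Probably copied in the 1680s by the second scribe.")))

# Provenance of the assertion itself: who made it, and when.
g.add((annotation, PROV.wasAttributedTo, EX["scholar-jane-doe"]))
g.add((annotation, PROV.generatedAtTime,
       Literal("2019-03-01T12:00:00Z", datatype=XSD.dateTime)))

# An invented property standing in for a formalized certainty judgement;
# a competing annotation by another scholar could target the same artifact.
g.add((annotation, EX.certainty, Literal(0.6, datatype=XSD.decimal)))
```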
In the ResearchSpace environment, provenance and argumentation are expressed using the CIDOC-CRM specialization CRMInf (the Argumentation Model). Scholars can use this vocabulary to build narratives and thick descriptions around digital ROs through annotation and data-linking. These narratives of provenance allow and formalize the expression of uncertainty and competing perspectives, and the environment also serves to document the scholarly work that goes into building these narratives (ResearchSpace Team, 2018). The reasons for highlighting the ResearchSpace approach to provenance in this review of humanities ROs are (1) to exemplify the unique demands of formalizing humanities provenance, and (2) to exemplify the highly distinctive, domain-specific implementation requirements that confront the RO and other domain-independent data models. Describing humanities provenance will require vocabularies to express argument and belief, as Oldman et al. (2015) observe. Beyond the RO model's use of Prov and Web Annotation, humanities provenance will demand domain-specific argumentation extensions such as CRMInf. It is clear that ROs can theoretically accommodate thick provenance description, just as they can theoretically accommodate the representation of highly complex workflows, but can they usefully undergird implementations that are centered in humanities research needs? The ResearchSpace interface is tailored toward knowledge work, toward the collaborative construction of multifaceted provenance descriptions, without requiring users to code or gain expert-level knowledge of domain ontologies. Tools for the authorship of humanities ROs, or tools that implement ROs behind the scenes, may benefit from taking the same approach.

Conclusion
ROs make a great deal of sense for modeling cultural information; skeletons of a similar shape—the simple and powerful combination of aggregation and annotation to represent compound digital objects—already structure large-scale cultural data aggregations, e.g., through the Europeana Data Model and the Digital Public Library of America Metadata Application Profile, which are both founded on ore:aggregations plus oa:annotations. But the challenges confronting widespread application of the RO model to humanities digital scholarship are significant. This review of existing applications has identified three central challenges:
1. Digital humanities scholarship requires specialized interactive use, so realizing the advantages of ROs for the humanities will depend on implementations that create platforms for experimentation and development by communities.
2. The idiosyncratic workflows of networked humanities scholarship mean that workflow-oriented ROs will not gain significant uptake in the humanities unless they can capture distributed, sociotechnical workflows in meaningful ways.
3. Humanities ROs will require thick, multilayered, context-rich provenance descriptions that can accommodate conflicting assertions and formalize uncertainty, along with implementations that support the documentation of such provenance.
In particular, characterizing and formally expressing diverse humanities workflows, along with the provenance of data and contextual information within those workflows, presents the most urgent challenge and the most exciting opportunity for the future of humanities cyberinfrastructure. To many stakeholders in humanities cyberinfrastructure, "workflows are the new content" (Dempsey, 2016; Baynes et al., 2016; Schonfeld and Waters, 2018).
While research on workflows is underway on multiple fronts (including Liu et al., 2017), it is clear already that there will be significant semantic differences between conceptual and technical elements in scientific workflows (and provenance) and those in the humanities; and these differences will affect the implementation of ROs for humanities research. Historically, attempts to implement scientific research infrastructures (including data models like the RO model) to support humanities scholarship have hit an obstacle in the form of semantic gulfs. For example, in the Linking and Querying Ancient Texts (LaQuAT) project, an effort to transfer eScience infrastructure in support of a humanities virtual research environment, Anderson and Blanke observed a fundamental challenge in integrating humanities data from different databases. They located the solution to that problem in humanities research communities: "integrating humanities research material...will require researchers to make the connections themselves, including decisions on how they are expressed and how to understand and explore the data more effectively" (Anderson and Blanke, 2012). Oldman et al. (2015), reviewing the state of linked data in the humanities, observed that basic linked data publication for many kinds of humanities sources can be counterproductive, "unless adapted to reflect specific methods and practices, and integrated into the epistemological processes they genuinely belong to." This caution resonates with the challenges identified for the adoption of the RO model—or indeed for the importation of any data model, even domain-independent data models—into the humanities. The main challenges to implementing ROs for humanities research also present exciting opportunities for a more sustainable cross-disciplinary infrastructure (Fenlon, 2019), but implementation strategies must be centered in scholarly communities, and grow out from the practices, needs, and epistemologies of specific areas of study in the humanities and cultural institutions.

References
Almas, B. (2017). Perseids: Experimenting with Infrastructure for Creating and Sharing Research Data in the Digital Humanities. Data Science Journal, 16(0). https://doi.org/10.5334/dsj-2017-019
Anderson, S., & Blanke, T. (2012). Taking the Long View: From e-Science Humanities to Humanities Digital Ecosystems. Historical Social Research / Historische Sozialforschung, 37(3 (141)), 147–164.
Baynes, M. A., Sommer, D., Melley, D., & Lickiss, T. (2016, April). Workflow is the new content: Expanding the scope of interaction between publishers and researchers. Panel presentation at the Society for Scholarly Publishing. Retrieved from https://www.sspnet.org/events/past-events/workflow-is-the-new-content-expanding-the-scope-of-interaction-between-publishers-and-researchers/
Bechhofer, S., Buchan, I., De Roure, D., Missier, P., Ainsworth, J., Bhagat, J., … Goble, C. (2013). Why linked data is not enough for scientists. Future Generation Computer Systems, 29(2), 599–611. https://doi.org/10.1016/j.future.2011.08.004
Belhajjame, K., Zhao, J., Garijo, D., Gamble, M., Hettne, K., Palma, R., … Goble, C. (2015). Using a suite of ontologies for preserving workflow-centric research objects. Journal of Web Semantics, 32, 16–42. https://doi.org/10.1016/j.websem.2015.01.003
Blanke, T., & Hedges, M. (2013). Scholarly primitives: Building institutional infrastructure for humanities e-Science. Future Generation Computer Systems, 29(2), 654–661. https://doi.org/10.1016/j.future.2011.06.006
De Roure, D., Klyne, G., Page, K., Pybus, J., Weigl, D. M., & Willcox, P. (2018, July). Digital Music Objects: Research Objects for Music. Presented at the Research Object workshop (RO2018) at IEEE eScience Conference 2018. Retrieved from https://zenodo.org/record/1442453#.XB6Chc9KhhE
Dempsey, L. (2016, October). The Library in the Life of the User: Two Collection Directions. Education. Retrieved from https://www.slideshare.net/lisld/the-library-in-the-life-of-the-user-two-collection-directions
Dombrowski, Q. (2014). What Ever Happened to Project Bamboo? Literary and Linguistic Computing, 29(3), 326–339. https://doi.org/10.1093/llc/fqu026
Fenlon, K. (2017). Thematic research collections: Libraries and the evolution of alternative scholarly publishing in the humanities (Doctoral dissertation, University of Illinois at Urbana-Champaign). Retrieved from http://hdl.handle.net/2142/99380
Fenlon, K. (2019). Modeling Digital Humanities Collections as Research Objects. Presented at the ACM/IEEE Joint Conference on Digital Libraries 2019. Retrieved from https://hcommons.org/deposits/item/hc:24889/
Gilliland, A. J. (2014). Conceptualizing 21st-Century Archives. ALA Editions.
Hurley, C. (2005). Parallel provenance [Series of parts]: Part 1: What, if anything, is archival description? [An earlier version of this article was presented at the Archives and Collective Memory: Challenges and Issues in a Pluralised Archival Role seminar (2004: Melbourne).] Archives and Manuscripts, 33(1), 110.
Jett, J., Cole, T. W., & Downie, J. S. (2017). Exploiting graph-based data to realize new functionalities for scholar-built worksets. Proceedings of the Association for Information Science and Technology, 54(1), 716–717. https://doi.org/10.1002/pra2.2017.14505401128
Liu, A., Kleinman, S., Douglass, J., Thomas, L., Champagne, A., & Russell, J. (2017). Open, Shareable, Reproducible Workflows for the Digital Humanities: The Case of the 4Humanities.org "WhatEvery1Says" Project. Presented at Digital Humanities (DH2017). Retrieved from https://dh2017.adho.org/abstracts/034/034.pdf
Murdock, J., Jett, J., Cole, T., Ma, Y., Downie, J. S., & Plale, B. (2017). Towards Publishing Secure Capsule-based Analysis. Proceedings of the 17th ACM/IEEE Joint Conference on Digital Libraries, 261–264. Retrieved from http://dl.acm.org/citation.cfm?id=3200334.3200367
Oldman, D., Doerr, M., & Gradmann, S. (2015). Zen and the Art of Linked Data. In A New Companion to Digital Humanities (pp. 251–273). https://doi.org/10.1002/9781118680605.ch18
Oldman, D., & Tanase, D. (2018). Reshaping the Knowledge Graph by Connecting Researchers, Data and Practices in ResearchSpace. In D. Vrandečić, K. Bontcheva, M. C. Suárez-Figueroa, V. Presutti, I. Celino, M. Sabou, … E. Simperl (Eds.), The Semantic Web – ISWC 2018 (pp. 325–340). Retrieved from https://link.springer.com/chapter/10.1007%2F978-3-030-00668-6_20
Page, K., Lewis, D., & Weigl, D. (2017). Contextual interpretation of digital music notation. Presented at Digital Humanities (DH2017), Montréal, Canada.
Palmer, C. L., Teffeau, L. C., & Pirmann, C. M. (2009). Scholarly Information Practices in the Online Environment: Themes from the Literature and Implications for Library Service Development. Retrieved from the OCLC Research and Programs website: http://www.oclc.org/content/dam/research/publications/library/2009/2009-02.pdf
ResearchSpace Team, British Museum. (2018, December). Moving from Documentation to Knowledge Building: ResearchSpace Principles and Practices. Presented at the Stiftung Preußischer Kulturbesitz (Prussian Cultural Heritage Foundation), Berlin. Retrieved from https://www.researchspace.org/docs/Berlin.pdf
Schonfeld, R. C., & Waters, D. (2018, April). The turn to research workflow and the strategic implications for the academy. Presented at the Coalition for Networked Information (CNI) Spring Membership Meeting, San Diego, CA. Retrieved from https://vimeo.com/271130388
Sweeney, S. J., Flanders, J., & Levesque, A. (2017). Community-Enhanced Repository for Engaged Scholarship: A case study on supporting digital humanities research. College & Undergraduate Libraries, 24(2–4), 322–336. https://doi.org/10.1080/10691316.2017.1336144

work_ghhuhx2om5grtncgc72zosmaz4 ----
DH 2016 Abstracts, http://dh2016.adho.org/abstracts/86

Title: EVI-LINHD. A Virtual Research Environment for the Spanish-speaking Community
Authors: Gimena del Rio Riande, Elena González-Blanco García, Clara Martínez Cantón, Juan José Escribano
Category: Poster
Keywords: Virtual Research Environment, Virtual Research Community, Digital Scholarly Edition, Spanish-speaking Community, DH Center
del Rio Riande, G., González-Blanco García, E., Martínez Cantón, C., Escribano, J. (2016). EVI-LINHD. A Virtual Research Environment for the Spanish-speaking Community. In Digital Humanities 2016: Conference Abstracts. Jagiellonian University & Pedagogical University, Kraków, pp. 776-777.

EVI-LINHD. A Virtual Research Environment for the Spanish-speaking Community
Although Digital Humanities have been defined from a discipline perspective in many ways, it is surely a field still looking for its own objects, practices and methodologies. Their development in the Spanish-speaking countries is no exception to this process and, even if it is complex to trace a unique genealogy to account for the evolving field in Spain and Latin America (Gonzalez-Blanco, 2013; Spence and Gonzalez-Blanco, 2014; Rio Riande 2014a, 2014b), the emergence of various associations in Mexico (RedDH), Spain (HDH) and Argentina (AAHD) that seek a constant dialogue (Galina, González-Blanco and Rio Riande, 2015), and academic lab and DH center initiatives such as LINHD (Spain and Argentina), GRINUGR (Spain), Medialab USAL, LABTEC (Argentina), TadeoLab (Colombia), and Elabora HD (Mexico), among others, make it clear that research has become increasingly "global, multipolar and networked" (Llewellyn Smith, et al., 2011) and that the academic field is looking for global outreach and aims to open spaces of shared virtual work. Virtual Research Communities (VRCs) are a consequence of these changes. Virtual Research Environments (VREs) have become central objects for the digital humanities community, as they help global, interdisciplinary and networked research take advantage of the changes in "data production, curation and (re-)use, by new scientific methods, by changes in technology supply" (Voss and Procter, 2009: 174-90). DH Centers, labs, and less formal structures such as associations benefit from many kinds of VREs, as these offer researchers and users a place to develop, store, share and preserve their work, making it more visible.
The focus and implementation of each of these VREs is different, as Carusi and Reimer (2010) show in their comparative analysis, but there are some common guidelines, philosophies and standards that are generally shared (as an example, see the CenterNet map and the guidelines of TGIR Huma-Num, 2015). This poster presents the structure and design of the VRE of LINHD, the Digital Innovation Lab at UNED (http://linhd.uned.es) and the first Digital Humanities Center in Spain. This VRE focuses on the possibilities of a collaborative environment for (lay or advanced) Spanish-speaking scholarly digital editors. Taking into account the language barrier that English may pose for a Spanish-speaking scholar or student, and the distance they may encounter with the data and the organization of the interface (in terms of computational knowledge) when facing a scholarly digital edition or collection, LINHD's VRE comes as a solution for the VRC interested in scholarly digital work. Moreover, it will make it possible to add and apply tools that help improve Spanish-English applications, or tools developed locally, such as Contawords, by IULA-UPF (http://contawords.iula.upf.edu/executions). Opening such an environment to the Spanish-speaking world will make it possible to reach different kinds of communities, whose profile and training in digital humanities differ from those of the typical users of DH tools and environments. Testing all these tools in this new environment will surely yield interesting project results. In this sense, our project dialogues with and aims to join the landscape of other VREs devoted to digital editing, such as TextGrid, e-laborate, etc., and, in a further stage, to build a complete virtual environment to collect and classify data, tools and projects, work on and publish them, and share the results with the research community.
After having studied the structure and components of other digital virtual environments, our VRE has been designed from a humanist-user-centered perspective, in which interface design, ease of access, and familiarity with tools and standards are key factors. Therefore, the key of our VRE is the combination of different open-source software components that will enable users to complete the whole process of developing a digital editorial project. The environment is, up to now, divided into three parts:
1) A repository of data (projects, tools, etc.) with permanent identifiers, in which the information will be indexed through a semantically structured ontology of metadata and controlled vocabularies (such as Isidore and HuNI, but using the LINDAT software by Clarin.eu).
2) A working space based on the possibilities of eXistDB for working on text encoding, together with Tei-Scribe, a tool developed at LINHD to tag texts in an intuitive way, with storing and querying facilities, plus some publishing tools (pre-defined stylesheets and some other open-source projects, such as SADE or the Versioning Machine); a sketch of how this layer might be queried follows below.
3) A collaborative cloud workspace which integrates a wiki, a file archiving system and a publishing space for each team.
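As a rough illustration of how the eXistDB working space might be consulted programmatically, the sketch below sends an ad-hoc XQuery to an eXist-db REST endpoint. The server address, collection path, document name, and credentials are all invented placeholders, and a production VRE would presumably mediate such queries through its own interface rather than exposing raw REST calls.

```python
import requests

# Invented server and document path; eXist-db exposes stored resources
# under /exist/rest/ and accepts ad-hoc XQuery via the _query parameter.
DOC = "http://localhost:8080/exist/rest/db/evi-linhd/editions/poema.xml"

# Count the <persName> elements in one hypothetical TEI transcription.
xquery = (
    "declare namespace tei='http://www.tei-c.org/ns/1.0'; "
    "count(//tei:persName)"
)

response = requests.get(
    DOC,
    params={"_query": xquery, "_wrap": "no"},  # _wrap=no suppresses the XML wrapper
    auth=("guest", "guest"),                   # placeholder credentials
    timeout=10,
)
response.raise_for_status()
print(response.text)  # e.g. "12": the bare count
```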
Sustainability and long-term preservation are issues which we have contemplated from the beginning, as our group is leading the addition of Spain into DARIAH, and LINHD is also part of a CLARIN Knowledge Centre together with two powerful NLP groups, from U. Pompeu Fabra in Barcelona and IXA in the Basque Country. Our project has been conceived according to DH standards and open-source tools, and its infrastructure is supported by our university, UNED.

Bibliography
1. Candela, L. Virtual Research Environments. GRDI2020. http://www.grdi2020.eu/Repository/FileScaricati/eb0e8fea-c496-45b7-a0c5-831b90fe0045.pdf (accessed 28-10-2015).
2. Carusi, A. and Reimer, T. (2010). Virtual Research Environment Collaborative Landscape Study. A JISC funded project. Oxford e-Research Centre, University of Oxford and Centre for e-Research, King's College London. https://www.jisc.ac.uk/rd/projects/virtual-research-environments (accessed 28-10-2015).
3. Galina, I., González-Blanco García, E. and Rio Riande, G. del (2015). Se habla español. Formando comunidades digitales en el mundo de habla hispana. Abstracts of the HDH 2015 Conference, Madrid, Spain. http://hdh2015.linhd.es/ebook/hdh15-galina.xhtml (accessed 28-10-2015).
4. González-Blanco García, E. (2013). Actualidad de las Humanidades Digitales y un ejemplo de ensamblaje poético en la red: ReMetCa. Cuadernos Hispanoamericanos, 761: 53-67.
5. Llewellyn Smith, C., Borysiewicz, L., Casselton, L., Conway, G., Hassan, M., Leach, M., et al. (2011). Knowledge, Networks and Nations: Global Scientific Collaboration in the 21st Century. London: The Royal Society.
6. Rio Riande, G. del (2014a). ¿De qué hablamos cuando hablamos de Humanidades Digitales? Abstracts of the AAHD Conference "Culturas, Tecnologías, Saberes", Buenos Aires, Argentina. http://www.aacademica.com/jornadasaahd/toc/6?abstracts (accessed 28-10-2015).
7. Rio Riande, G. del (2014b). ¿De qué hablamos cuando hablamos de Humanidades Digitales? http://blogs.unlp.edu.ar/didacticaytic/2015/05/04/de-que-hablamos-cuando-hablamos-de-humanidades-digitales/ (accessed 28-10-2015).
8. Spence, P. and González-Blanco, E. (2014). A historical perspective on the digital humanities in Spain. In: The Status Quo of Digital Humanities in Europe, H-Soz-Kult, 22.10.2014. http://www.hsozkult.de/text/id/texte-2535 (accessed 28-10-2015).
9. TGIR Huma-Num (2011). Le guide des bonnes pratiques numériques. http://www.huma-num.fr/ressources/guide-des-bonnes-pratiques-numeriques (version of 13-1-2015; accessed 28-10-2015).
10.
Voss, A. and Procter, R. (2009). Virtual research environments in scholarly work and communications, Library Hi Tech, 27(2): 174–90. This paper has been developed thanks to the Starting Grant research project: Poetry Standardization and Linked Open Data: POSTDATA (ERC-2015-STG-679528), funded by the European Research Council (ERC) under the European Union´s Horizon 2020 research and innovation programme. http://www.huma-num.fr/ressources/guide-des-bonnes-pratiques-numeriques work_gio25lsxfnawho3jaoh4w6te54 ---- Szmrecsanyi_rerevised.dvi Corpus-based dialectometry: aggregate morphosyntactic variability in British English dialects* Benedikt Szmrecsanyi Freiburg Institute for Advanced Studies bszm@frias.uni-freiburg.de Abstract The research reported in this paper departs from most previous work in dialectometry in several ways. Empirically, it draws on frequency vectors derived from naturalistic corpus data and not on discrete atlas classifi- cations. Linguistically, it is concerned with morphosyntactic (as opposed to lexical or pronunciational) variability. Methodologically, it marries the careful analysis of dialect phenomena in authentic, naturalistic texts to aggregational-dialectometrical techniques. Two research questions guide the investigation: First, on methodological grounds, is corpus-based di- alectometry viable at all? Second, to what extent is morphosyntactic variation in non-standard British dialects patterned geographically? By way of validation, findings will be matched against previous work on the dialect geography of Great Britain. 1 Introduction The overarching aim in this study is to provide a methodological sketch of how to blend philologically responsible corpus-based research with aggregational- dialectometrical analysis techniques. The bulk of previous research in dialec- tometry has focussed on phonology and lexis (however, for work on Dutch dialect syntax see Spruit 2005, 2006, 2008, Spruit et al. t.a.). Moreover, orthodox di- alectometry draws on linguistic atlas classifications as its primary data source. The present study departs from these traditions in several ways. It endeavours, first, to measure aggregate morphosyntactic distances and similarities between traditional dialects in the British Isles. Second, the present study does not rely on atlas data but on frequency information deriving from a careful analysis of language use in authentic, naturalistic texts. This is another way of saying that the aggregate analysis in this paper is frequency-based, an approach that contrasts with atlas-based dialectometry, which essentially relies on categorical input data. Succinctly put, the difference is that atlas-based approaches typi- cally aggregate observations such as of two variants X and Y, variant X is the dominant one in dialect Z, while frequency-based approaches are empirically based on corpus findings along the lines of, say, in dialect Z, variant X is 3.5 times more frequent in actual speech than variant Y. 1 The corpus resource drawn on is fred, the Freiburg English Dialect Corpus, a naturalistic speech corpus sampling interview material from 162 different lo- cations in 38 different counties all over the British Isles, excluding Ireland. The corpus was analyzed to obtain text frequencies of 62 morphosyntactic features, yielding a structured database that provides a 62-dimensional frequency vector per locality. 
The Euclidean distance measure was subsequently applied to compute aggregate morphosyntactic distances, which then served as the input to dialectometrical analysis. Two research questions guide the present study's inquiry: first, on the methodological plane we are interested in whether and how corpus-based (that is, frequency-based) dialectometry is viable. Substantially, we will seek to uncover if and to what extent morphosyntactic variation in non-standard British dialects is patterned along geographic lines. By way of validation, findings will be matched against previous work (dialectological, dialectometrical, and perceptual) on the dialect geography of Great Britain.

2 Previous work on aggregate dialect differences in Great Britain
Let us first turn to the literature in order to eclectically review extant scholarship on dialect differences in Great Britain. Trudgill (1990:20–35) is one of the best-known dialectological accounts of accent differences in traditional British dialects. Trudgill studies eight salient accent features to establish a composite map dividing England into 13 traditional dialect areas. These can be grouped into six macro areas: (1) Scots, (2) northern dialects (Northumberland and the Lower North), (3) western central (Midlands) dialects (Lancashire, Staffordshire), (4) eastern central (Midlands) dialects (South Yorkshire, Lincolnshire, Leicestershire), (5) southwestern dialects (western Southwest, northern Southwest, eastern Southwest), and (6) southeastern dialects (central East and eastern Counties).
In the realm of perceptual dialectology, Inoue (1996) conducted an experiment to study the subjective dialect division in Great Britain. 77 students at several universities in Great Britain were asked, among other things, to draw lines on a blank map 'according to the accents or dialects they perceived' (Inoue 1996:146), based on their experience. The result of this exercise can be summarised as follows: dialects of English in Wales and Scotland are perceived as being very different from English English dialects. Within England, the North is differentiated from the Midlands, and the Midlands are differentiated from the South (Inoue 1996:map 3). This division is quite compatible with Trudgill's (1990) classification, except that in Inoue's (1996) experiment, Lancashire is part of the North, not of the western Midlands, and the northern Southwest (essentially, Shropshire and Herefordshire) patterns with Midland dialects, not southwestern dialects.
As for atlas-based dialectometry, Goebl (2007) draws on the Computer Developed Linguistic Atlas of England (which is based on the Survey of English Dialects) to study aggregate linguistic relationships between 314 sites all over England. The aggregate analysis is based on 597 lexical and morphosyntactic features. Among many other things, Goebl (2007) utilises cluster analysis to partition England into discrete dialect areas (Goebl 2007:maps 17–18). It turns out that there is 'a basic opposition between the North [...] and the South of England' (Goebl 2007:145). The dividing line runs south of Lancashire and South Yorkshire, and thus cuts right across what Trudgill (1990) and Inoue (1996) classify as the Midlands dialect area. In southern English dialects, Goebl (2007) finds a major split between southwestern and other southern dialects.

3 Methods and data
The present study is an exercise in corpus-based dialectometry.
Corpus linguistics is a methodology that draws on principled collections of naturalistic texts to explore authentic language usage. A hallmark of the methodology is the 'extensive use of computers for analysis, using both automatic and interactive techniques' and the reliance 'on both quantitative and qualitative analytical techniques' (Biber et al. 1998:4). This section will discuss the corpus as well as the feature frequency portfolio that will serve as the basis for the subsequent aggregate analysis.

3.1 Data source: the Freiburg English Dialect Corpus (FRED)
This study will tap the Freiburg English Dialect Corpus (henceforth: fred) (see Hernández 2006; Szmrecsanyi and Hernández 2007 for manuals) as its primary data source. fred contains 372 individual texts and spans approximately 2.5 million words of running text, consisting of samples (mainly transcribed so-called 'oral history' material) of dialectal speech from a variety of sources. Most of these samples were recorded between 1970 and 1990; in most cases, a fieldworker interviewed an informant about life, work etc. in former days. The 431 informants sampled in the corpus are typically elderly people with a working-class background (so-called 'non-mobile old rural males'). The interviews were conducted in 162 different locations (that is, villages and towns) in 38 different pre-1974 counties in Great Britain plus the Isle of Man and the Hebrides. The corpus is annotated with longitude/latitude information for each of the locations sampled. From this annotation, county coordinates can be calculated by computing the arithmetic mean of all the location coordinates associated with a particular county. At present, fred is neither part-of-speech annotated nor syntactically parsed.

3.2 Feature selection and extraction
Corpus-based dialectometry is essentially frequency-based dialectometry; thus the approach outlined here bears a certain similarity to the method in Hoppenbrouwers and Hoppenbrouwers (2001) (discussed in Heeringa 2004:16–20). Following a broadly variationist approach in the spirit of, for example, Labov (1966), a catalogue spanning 35 morphosyntactic variables with typically (but not always) two variants each was defined. This catalogue of 35 variables yields a list of p = 62 morphosyntactic target variants (henceforth: features); the Appendix provides a comprehensive list. In an attempt to aggregate as many variables as possible, the features included in the catalogue are the usual suspects in the dialectological, variationist, and corpus-linguistic literature, regardless of whether a geographic distribution has previously been reported for a particular feature or not. To qualify for inclusion, however, a candidate feature had to fulfill the following criteria:
1. For statistical reasons, the feature had to be relatively frequent, specifically: ≥ 1 occurrence per 10,000 words of running text (this criterion rules out interesting but infrequent phenomena such as resumptive relative pronouns or double modals).
2. For practical purposes, the feature had to be extractable subject to a reasonable input of labour resources by a human coder (ruling out, for example, hard-to-retrieve null phenomena such as zero relativisation, or phenomena where semantics enters heavily into consideration, such as gendered pronouns).
Next, the material in fred was coded for the features in the catalogue.
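The quantitative pipeline sketched in the remainder of this section (frequency extraction, normalisation per ten thousand words, log transformation, and conversion into a Euclidean distance matrix) can be illustrated in miniature as follows. This is a simplified Python rendering of what the study implemented with Perl scripts plus manual coding; the single regular-expression 'feature' and the toy texts are invented stand-ins, and log1p is used here merely to sidestep zero counts, since the exact log transform is not spelled out at this point in the text.

```python
import re
import numpy as np

def frequency_distances(texts, patterns):
    """Build a (localities x features) matrix of normalised, log-transformed
    text frequencies and derive a Euclidean distance matrix from it.

    `texts` maps a county to its interview material (one string); `patterns`
    maps a feature to a regular expression. Both are toy stand-ins for the
    38 counties and 62 features of the study.
    """
    counties, features = sorted(texts), sorted(patterns)
    freq = np.zeros((len(counties), len(features)))
    for i, county in enumerate(counties):
        n_words = len(texts[county].split())
        for j, feature in enumerate(features):
            hits = len(re.findall(patterns[feature], texts[county], re.IGNORECASE))
            freq[i, j] = hits * 10_000 / n_words   # frequency per 10,000 words
    freq = np.log1p(freq)                          # de-emphasise large differentials
    # Euclidean distance: square root of the sum of squared frequency differentials.
    diff = freq[:, None, :] - freq[None, :, :]
    return counties, np.sqrt((diff ** 2).sum(axis=-1))

counties, dist = frequency_distances(
    {"Somerset": "he were a-going home when they was young",
     "Kent": "he was going home when they were young"},
    {"nonstandard-were": r"\b(?:he|she|it) were\b"},
)
```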
26 features for which automatic recall was feasible were extracted automatically using Perl (Practical Extraction and Report Language) scripts. 36 features were coded manually after pre-screening the data using Perl scripts, a step which considerably narrowed down the number of phenomena which had to be inspected manually. Even so, the frequency database utilised in the present study is based on 75,124 manual (that is, qualitative) coding decisions. Szmrecsanyi (forthcoming) provides a detailed description of the procedure along with the detailed coding schemes that regimented the coding process. Once coding was complete, another line of Perl scripts was used to extract vectors of p_total = 62 feature frequencies per locality. The feature frequencies were subsequently normalised to frequency per ten thousand words (because textual coverage in fred varies across localities) and log-transformed* to de-emphasise large frequency differentials and to alleviate the effect of frequency outliers. The resulting 38 × 62 table (on the county level – that is, 38 counties characterised by 62 feature frequencies each for the full dataset) yields a Cronbach's α value of .86, indicating satisfactory reliability. Finally, the 38 × 62 table was converted into a 38 × 38 distance matrix using Euclidean distance – the square root of the sum of all squared frequency differentials – as an interval measure. This distance matrix was subsequently analyzed dialectometrically.*

4 Results
We now move on to a discussion of empirical findings. Unless stated otherwise, the level of areal granularity is the county level (N = 38).

4.1 On the explanatory power of geography
Let us first consider the role that geographic distance plays in aggregate morphosyntactic variability. First, how much of this variability can be explained by geography? Second, looking at the morphosyntactic dialect landscape in the British Isles, to what extent are we dealing with a continuum such that transitions are gradual and not abrupt?
As for the first question, a Perl script was run on the Euclidean distance matrix based on all p_total = 62 features and on fred's geographic longitude/latitude annotation to generate a table specifying pairwise morphosyntactic and geographic distances. This yielded an exhaustive list of all N × (N − 1)/2 = 703 possible county pairings, each pairing being annotated for morphosyntactic and geographic distance. On the basis of this list, the scatterplot in Figure 1 illustrates the correlation between morphosyntactic and geographic distance in the database at hand.
[Figures 1 and 2 here]
Figure 1 highlights two facts. First, while the correlation between morphosyntactic and geographic distance is highly significant (p = .00), it is relatively weak (Pearson correlation coefficient: r = .22). In other words, geography explains overall only 4.7 per cent of the morphosyntactic variance (R² = .047). To put this value into perspective, Spruit et al. (to appear: Table 7) – in a study on aggregate linguistic distances in Dutch dialects – report R² values of .47 for the correlation between geography and pronunciation, .33 for lexis, and .45 for syntax. Second, the best curve estimation for the relationship between morphosyntactic and geographic distance in British English dialects is actually linear.* Given Séguy (1971) and much of the atlas-based dialectometry literature that has followed Séguy's seminal study, one would actually expect a sublinear or logarithmic relationship.
Figure 1 highlights two facts. First, while the correlation between morphosyntactic and geographic distance is highly significant (p = .00), it is relatively weak (Pearson correlation coefficient: r = .22). In other words, geography explains overall only 4.7 per cent of the morphosyntactic variance (R² = .047). To put this value into perspective, Spruit et al. (to appear: Table 7) – in a study on aggregate linguistic distances in Dutch dialects – report R² values of .47 for the correlation between geography and pronunciation, .33 for lexis, and .45 for syntax. Second, the best curve estimation for the relationship between morphosyntactic and geographic distance in British English dialects is actually linear.* Given Séguy (1971) and much of the atlas-based dialectometry literature that has followed Séguy’s seminal study, one would actually expect a sublinear or logarithmic relationship. Having said that, we note that Spruit (2008:54-55), in his study of Dutch dialects, finds that the correlation between syntactic and geographic distance is also more linear than logarithmic. Hence, it may simply be the case that (morpho)syntactic variability has a different relationship to geographic distance than lexical or pronunciational variability.

Against this backdrop, it is interesting to note that not all of the 62 features entered into aggregate analysis correlate significantly with geography. In fact, only 23 features do (these are marked with an asterisk in the Appendix).* When the aggregate analysis is based on only those p_geo = 23 features, we obtain the scatterplot in Figure 2. The correlation coefficient between morphosyntactic and geographic distance is now approximately twice as high as in Figure 1 (r = .41), which means that for this particular feature subset geography explains about 16.6 per cent of the morphosyntactic variance (R² = .166).* While these numbers begin to approximate the explanatory potency of geography in atlas-based dialectometry, it still seems that we should base the aggregate analysis on all available data. This is why the subsequent analysis in this paper will be based on the entire feature portfolio (p_total = 62), despite the weaker geographic signal it provides. Still, we observe that feature selection does matter a great deal, and one is left to wonder to what extent compilers of linguistic atlases – the primary data source for those studies that report high coefficients for geography – really draw on all available features, or rather on those features that seem geographically interesting.

[Figure 3 here]

Comparatively weak as the overall correlation between morphosyntactic and geographic distance may be, are we nonetheless dealing with a morphosyntactic dialect continuum? To answer this question, we will now visualise aggregate morphosyntactic variability using cartographic techniques, all relying on Voronoi tessellation (see Goebl 1984) to project linguistic results to geography. Regular multidimensional scaling (henceforth: MDS) (see Kruskal and Wish 1978) was utilised to scale down the original 62-dimensional Euclidean distance matrix to three dimensions; the distances in the three-dimensional MDS solution correlate with the distances in the original distance matrix to a satisfactory degree (r = .82). Subsequently, the three MDS dimensions were mapped to the red–green–blue colour components, giving each of the county polygons in Figure 3 a distinct colour.* In continuum maps such as Figure 3, smooth (as opposed to abrupt) colour transitions implicate the presence of a dialect continuum. As can be seen, the morphosyntactic dialect landscape in the British Isles is overall not exceedingly continuum-like.* While colour transitions in the south of England are fairly smooth (meaning that this is a fairly homogeneous dialect area), the picture is more noisy in the North of England and, especially, in Scotland. To aid interpretation of Figure 3, each of the 62 normalised log-transformed feature frequencies was correlated against each of the three MDS dimensions to determine which of the features correlate most strongly with the red–green–blue colour scheme in Figure 3 (see Wieling et al. 2007 for a similar procedure).
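The MDS-to-colour mapping can be sketched as follows. This is an illustrative reconstruction only: scikit-learn's SMACOF-based MDS stands in for the RuG/L04 implementation the paper actually used, the stand-in distance matrix is synthetic, and the min–max rescaling to RGB is an assumption.

    # Illustrative sketch: project a distance matrix to three MDS dimensions, read as RGB.
    import numpy as np
    from scipy.spatial.distance import pdist, squareform
    from sklearn.manifold import MDS

    rng = np.random.default_rng(0)
    dist = squareform(pdist(rng.uniform(size=(38, 62))))  # stand-in for the 38 x 38 matrix

    mds = MDS(n_components=3, dissimilarity='precomputed', random_state=0)
    dims = mds.fit_transform(dist)

    # Rescale each MDS dimension to [0, 1]; row i is then the RGB colour of county i.
    lo, hi = dims.min(axis=0), dims.max(axis=0)
    rgb = (dims - lo) / (hi - lo)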
It turns out that more reddish colours correlate best with increased frequencies of multiple negation (feature [34]) (r = .79), greenish colours correlate most strongly with higher frequencies of non-standard weak past tense and past participle forms (feature [23]) (r = .63), and bluish colours correlate best with increased frequencies of wh-relativisation (feature [49]) (r = .57).

By way of an interim summary, the research discussed in this section has two principal findings. Firstly, the explanatory potency of geography is comparatively weak in the data at hand and accounts for only between 4.7 and 16.6 per cent of the observable morphosyntactic variance (depending on whether all available features or only those with a significant geographic distribution are studied). Secondly, the morphosyntactic dialect landscape in Great Britain does not have a very continuum-like structure overall, although transitions appear to be more gradual in England than in Scotland.

4.2 Classification and validation

The task before us now is to examine higher-order patterns and groupings among British English dialects. Is it possible to identify dialect areas on morphosyntactic grounds (and on the empirical basis of frequency data)? If so, do these dialect areas conform to those previously identified in the literature (see Section 2)? To answer these questions, hierarchical agglomerative cluster analysis (see Aldenderfer and Blashfield 1984), a data classification technique used to partition observations into discrete groups, was applied to the dataset. Simple clustering can be unstable, hence a procedure known as ‘clustering with noise’ (Nerbonne et al. 2008) was conducted: the original Euclidean distance matrix was clustered repeatedly, adding some random amount of noise in each run. This exercise yielded a cophenetic distance matrix which details consensus (and thus more stable) cophenetic distances between localities, and which is amenable to various cartographic visualisation techniques. This study uses the clustering parameters described in Nerbonne et al. (2008), setting a noise ceiling of c = σ/2 and performing 100 clustering runs. There are many different clustering algorithms; in addition to using the – quite customary – Weighted Pair Group Method using Arithmetic Averages (WPGMA), we also apply Ward’s Minimum Variance Method (Ward), as the two algorithms yield interestingly different clustering outcomes.*

[Figures 4, 5, 6, and 7 here]

The resulting higher-order structures can be visualised, for example, via so-called composite cluster maps (see Nerbonne et al. 2008 for a discussion). These highlight the fuzzy nature of dialect boundaries such that darker borders between localities represent more robust linguistic oppositions (which, thanks to the clustering-with-noise technique utilised, can be considered statistically significant). Figure 4 presents a composite cluster map that visualises the outcome of WPGMA noisy clustering, which is contrasted with the corresponding Ward outcome in Figure 5. An alternative visualisation, which highlights rough group memberships and fuzzy transition areas, can be attained by applying MDS to the cophenetic distance matrix (see, for instance, Alewijnse et al. 2007: section 5.3) and subsequently assigning component colours to each of the three resulting MDS dimensions. Such maps – where similar colourings indicate likely membership in the same dialect area – are displayed in Figure 6 (WPGMA) and Figure 7 (Ward).
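A schematic re-implementation of the clustering-with-noise procedure is given below. It is a sketch only: the paper used the RuG/L04 package, the stand-in distance matrix is synthetic, and the uniform noise distribution is an interpretation of Nerbonne et al.'s (2008) procedure rather than a guaranteed match.

    # Schematic 'clustering with noise' (after Nerbonne et al. 2008); assumptions noted.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage, cophenet

    rng = np.random.default_rng(0)
    d = pdist(rng.uniform(size=(38, 62)))   # stand-in condensed distance matrix
    c = d.std() / 2                         # noise ceiling c = sigma/2, as in the paper

    runs, acc = 100, np.zeros_like(d)
    for _ in range(runs):                   # 100 clustering runs, as in the paper
        noisy = d + rng.uniform(0, c, size=d.shape)   # assumed uniform noise scheme
        Z = linkage(noisy, method='weighted')         # 'weighted' = WPGMA; 'ward' for Ward
        acc += cophenet(Z)                  # accumulate cophenetic distances
    consensus = acc / runs                  # consensus cophenetic distances (condensed)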
Note, in this context, that the distances in the three-dimensional MDS solution correlate very highly with the distances in the cophenetic distance matrix (r = .96 and r = 1.00, respectively).

Figures 4 through 7 can be interpreted as follows. Both the WPGMA and Ward algorithms characterise Scotland as heterogeneous and geographically fairly incoherent (more so according to WPGMA than according to Ward). Both algorithms moreover tend to differentiate between English English dialects and non-English English dialects (Scottish English dialects and northern Welsh dialects, in particular Denbighshire [DEN]). This is consonant with the sharp perceptual split between English English dialects and Welsh/Scottish dialects reported in Inoue (1996). As for divisions among English English dialects, however, the two clustering algorithms generate fairly different classifications:

• WPGMA classifies England as a rather homogeneous dialect area vis-à-vis Scotland and Wales. The only outlier in England is the county Warwickshire (WAR; the brownish polygon in Figure 6), which is more similar to Denbighshire (DEN; Welsh English) and some Scottish dialects than to the other English counties.

• Ward broadly distinguishes between southern English dialects (reddish/pinkish colours in Figure 7) and northern English dialects (brownish/darkish colours). Northumberland (NBL, dark green), Durham (DUR, blue), and Warwickshire (WAR; light blue), albeit English counties, pattern with Scottish dialects. Middlesex (MDX) is grouped with the northern dialects, although the county is located in the geographic Southeast (this fact is responsible for the salient southeastern ‘box’ in Figure 5). In sum, the Ward algorithm finds a rather robust North–South split in England, which is compatible with all three accounts surveyed in Section 2 (?Inoue 1996; Goebl 2007). Figures 5 and 7 can also be seen to reveal a split among northern dialects into Midland dialects (darkish/brownish colours, in particular Leicestershire [LEI], Shropshire [SAL], Lancashire [LAN], Westmorland [WES], and Yorkshire [YKS]) versus northern dialects (Durham [DUR] and Northumberland [NBL]). This opposition would be in accordance with Inoue (1996) as well as ?.

In summary, we have seen in this section that it seems to be possible – despite a good deal of apparent geographical incoherence – to identify rough dialect areas on morphosyntactic grounds, and that these are not incompatible with previous accounts of dialect differences in Great Britain. For one thing, most English English dialects are rather robustly differentiated from non-English English dialects. Second, the Ward algorithm in particular finds a North–South split among English English dialects that appears meaningful given extant scholarship. At the same time, we note that both algorithms fail to identify meaningful and coherent patterns among Scottish dialects. Also, neither algorithm detects a split between the Southwest of England and other southern dialects, as posited by ? and Goebl (2007).

5 Conclusions

This study has demonstrated that frequency vectors derived from naturalistic corpus data – as opposed to, for instance, categorical linguistic atlas classifications – can serve as the empirical basis for aggregate analysis. Focussing on morphosyntactic variability in British English dialects, we have seen that the dataset yields a significant geographic signal which is, however, weak in comparison to previous atlas-based dialectometrical findings.
The analysis has also suggested that overall variability in British English dialects does not seem to have an exceedingly continuum-like structure, and that there is quite a bit of geographical incoherence. Future study will want to investigate whether the comparatively weak explanatory potency of geography is real, or whether it is an artefact of the specific methodology or data type used. Having said that, the results do reveal that British English dialects can be partitioned into rough dialect areas on morphosyntactic grounds. Although the match with the literature is not perfect – as a matter of fact, we should not expect it to be perfect, given that some of the studies cited ‘are based on entirely different things and on not very much at all’, as one reviewer of this paper noted – the classification suggested here is not incompatible with previous work on dialect divisions in Great Britain. This enhances confidence in the method utilised here. A more detailed discussion of the outlier status of counties such as Warwickshire and Middlesex (including the identification of the features that are responsible for this outlier status), and of the extent to which the methodology presented here uncovers hitherto unknown generalisations, is reserved for another occasion.

More generally speaking, though, the present study highlights the fact that a careful and philologically responsible identification and analysis of features occurring in naturalistic, authentic texts (as customary in, for example, variationist sociolinguistics and corpus-based dialectology) lends itself to aggregation and computational analysis. The point is that the qualitative-philological jeweller’s eye perspective and the quantitative-aggregational bird’s eye perspective are not mutually exclusive, but can be fruitfully combined to explore large-scale patterns and generalisations. It should be noted in this connection that the line of aggregate analysis sketched out in this paper could easily be extended to other humanities disciplines that rely on naturalistic texts as their primary data source (for instance, literary studies, historical studies, theology, and so on).

The methodology outlined in the present study can and should be refined in many ways. For one thing, work is under way to utilise Standard English text corpora to determine aggregate morphosyntactic distances between British English dialects, on the one hand, and standard English dialects (British and American) on the other hand. Second, the feature-based frequency information on which the present study rests will be supplemented in the near future by part-of-speech frequency information, on the basis of a coding scheme that distinguishes between 73 different part-of-speech categories. Third, given that geography does not seem to play an exceedingly important role in the dataset analysed here, it will be instructive to draw on network diagrams (in the spirit of, for example, McMahon et al. 2007) as an additional visualisation and interpretation technique.

Notes

*I am grateful to John Nerbonne, Wilbert Heeringa, and Bart Alewijnse for having me over in Groningen in spring 2007 to explain dialectometry to me. I also wish to thank Peter Kleiweg for creating and maintaining the RuG/L04 package.
The audience at the Workshop on ‘Measuring linguistic relations between closely related varieties’ at the Methods XIII conference in Leeds (August 2008) provided very helpful and valuable feedback on an earlier version of this paper, as did four anonymous reviewers. The usual disclaimers apply.

*Zero frequencies were rendered as .0001, which yields a log frequency of -4.

*The analysis was conducted using some custom-made Perl scripts, standard statistical software (SPSS), and Peter Kleiweg’s RuG/L04 package (available online at http://www.let.rug.nl/~kleiweg/L04/) as well as the L04 web interface maintained by Bart Alewijnse (http://l04.knobs-dials.com/).

*R²_linear = .0469, R²_logarithmic = .0439.

*In order to test individual features for significant geographic distributions, dialect distances were also calculated on the basis of individual features (using one-dimensional Euclidean distance as interval measure) and correlated with geographical distance. If the ensuing correlation coefficient was significant, a given feature was classified as having a significant geographic distribution.

*Still, the relationship is more linear (R²_linear = .166) than logarithmic (R²_logarithmic = .134).

*To do justice to FRED’s areal coverage – which is unparalleled in the corpus-linguistic realm, but certainly not perfect – the polygons in Figure 3 have a maximum radius of ca. 40 km. This yields a ‘patchy’ but arguably more realistic geographic projection.

*Having said that, it should be made explicit that the present study is based on an aggregate analysis of features that are known to display variation (though not necessarily geographic variation). As one reviewer noted, the inclusion of more invariable features – say, basic word order or the like – would yield smoother dialect transitions. This is of course true, yet we note that linguistic atlases, and thus atlas-based dialectometry, also of course have a bias towards variable features.

*Notice that given the present study’s dataset, the Unweighted Pair Group Method using Arithmetic Averages (UPGMA), another popular algorithm used in, for instance, Nerbonne et al. (2008), yields almost exactly the same classification as WPGMA.

Appendix: the feature catalogue

Features whose distribution correlates significantly with geography are marked by an asterisk (*).

A. The pronominal system
[1]* vs. [2] non-standard vs. standard reflexives
[3] vs. [4] archaic thee, thou, thy vs. standard you, yours, you

B. The noun phrase
[5]* vs. [6] synthetic vs. analytic adjective comparison
[7] vs. [8] the of-genitive vs. the s-genitive
[9] vs. [10]* preposition stranding vs. preposition/particle frequencies

C. Primary verbs
[11] vs. [12]* the primary verb to do vs. the primary verbs to be/have (note: this includes both main verb and auxiliary verb usages)

D. Tense, mood, and aspect
[13] vs. [14] the future marker be going to vs. will/shall
[15] vs. [16]* would vs. used to as markers of habitual past
[17]* vs. [18] progressive vs. unmarked verb forms
[19]* vs. [20] the present perfect with auxiliary be vs. the present perfect with auxiliary have

E. Verb morphology
[21] vs. [22] a-prefixing on -ing-forms vs. bare -ing-forms
[23] vs. [24] non-standard weak past tense and past participle forms vs. standard strong forms
[25]* vs. [26] non-standard ‘Bybee’ verbs vs. corresponding standard forms (note: ‘Bybee’ verbs (see Anderwald 2009) have a three-way paradigm – e.g. begin/began/begun – in Standard English but can be reduced to a two-way paradigm – e.g.
begin/begun/begun – in dialect speech)
[27] non-standard verbal -s
[28]* vs. [29] non-standard past tense done vs. standard did
[30] vs. [31] non-standard past tense come vs. standard came

F. Negation
[32]* vs. [33] invariant ain’t vs. not/*n’t/*nae-negation
[34]* vs. [35] multiple negation vs. simple negation
[36]* vs. [37] negative contraction vs. auxiliary contraction
[38]* vs. [39]* don’t with 3rd person singular subjects vs. standard agreement
[40] vs. [41] never as a preverbal past tense negator vs. standard negation

G. Agreement
[42] existential/presentational there is vs. was with plural subjects
[43]* vs. [44] deletion of auxiliary be in progressive constructions vs. auxiliary be present
[45]* vs. [46]* non-standard was vs. standard was
[47] vs. [48]* non-standard were vs. standard were

H. Relativisation
[49] wh-relativisation
[50]* relative particle what
[51] relative particle that
[52] relative particle as

I. Complementation
[53]* as what or than what in comparative clauses
[54] vs. [55]* unsplit for to vs. to-infinitives
[56] vs. [57] infinitival vs. gerundial complementation after to begin, to start, to continue, to hate, to love
[58] vs. [59] zero vs. that complementation after to think, to say, and to know

J. Word order phenomena
[60] lack of inversion and/or of auxiliaries in wh-questions and in main clause yes/no-questions
[61]* vs. [62]* prepositional dative vs. double object structures after the verb to give

References

M. S. Aldenderfer and R. K. Blashfield (1984), Cluster Analysis, Quantitative Applications in the Social Sciences (Newbury Park, London, New Delhi).
B. Alewijnse, J. Nerbonne, L. van der Veen, and F. Manni (2007), ‘A Computational Analysis of Gabon Varieties’, in P. Osenova, ed., Proceedings of the RANLP Workshop on Computational Phonology, 3-12.
L. Anderwald (2009), The Morphology of English Dialects (Cambridge).
D. Biber, S. Conrad, and R. Reppen (1998), Corpus Linguistics: Investigating Language Structure and Use (Cambridge).
H. Goebl (1984), Dialektometrische Studien: Anhand italoromanischer, rätoromanischer und galloromanischer Sprachmaterialien aus AIS und ALF (Tübingen).
H. Goebl (2007), ‘A bunch of dialectometric flowers: a brief introduction to dialectometry’, in U. Smit, S. Dollinger, J. Hüttner, G. Kaltenböck, and U. Lutzky, eds, Tracing English through time: Explorations in language variation (Wien), 133-172.
W. Heeringa (2004), Measuring dialect pronunciation differences using Levenshtein distance (Ph.D. thesis, University of Groningen).
N. Hernández (2006), User’s Guide to FRED, http://www.freidok.uni-freiburg.de/volltexte/2489/ (Freiburg).
C. Hoppenbrouwers and G. Hoppenbrouwers (2001), De indeling van de Nederlandse streektalen. Dialecten van 156 steden en dorpen geklasseerd volgens de FFM (Assen).
F. Inoue (1996), ‘Subjective Dialect Division in Great Britain’, American Speech, 71(2), 142-161.
J. B. Kruskal and M. Wish (1978), Multidimensional Scaling, Volume 11 of Quantitative Applications in the Social Sciences (Newbury Park, London, New Delhi).
W. Labov (1966), ‘The linguistic variable as a structural unit’, Washington Linguistics Review, 3, 4-22.
A. McMahon, P. Heggarty, R. McMahon, and W. Maguire (2007), ‘The sound patterns of Englishes: representing phonetic similarity’, English Language and Linguistics, 11(1), 113-142.
J. Nerbonne, P. Kleiweg, and F. Manni (2008), ‘Projecting dialect differences to geography: bootstrapping clustering vs. clustering with noise’, in C. Preisach, L. Schmidt-Thieme, H.
Burkhardt, and R. Decker, eds, Data Analysis, Machine Learning, and Applications. Proceedings of the 31st Annual Meeting of the German Classification Society (Berlin), 647-654.
J. Séguy (1971), ‘La relation entre la distance spatiale et la distance lexicale’, Revue de Linguistique Romane, 35, 335-357.
M. R. Spruit (2005), ‘Classifying Dutch dialects using a syntactic measure: the perceptual Daan and Blok dialect map revisited’, Linguistics in the Netherlands, 22(1), 179-190.
M. R. Spruit (2006), ‘Measuring syntactic variation in Dutch dialects’, Literary and Linguistic Computing, 21(4), 493-506.
M. R. Spruit (2008), Quantitative perspectives on syntactic variation in Dutch dialects (Ph.D. thesis, University of Amsterdam).
M. R. Spruit, W. Heeringa, and J. Nerbonne (to appear), ‘Associations among Linguistic Levels’, Lingua.
B. Szmrecsanyi (forthcoming), Woods, trees, and morphosyntactic distances: traditional British dialects in a corpus-based dialectometrical view.
B. Szmrecsanyi and N. Hernández (2007), Manual of Information to accompany the Freiburg Corpus of English Dialects Sampler (‘FRED-S’), http://www.freidok.uni-freiburg.de/volltexte/2859/ (Freiburg).
M. Wieling, W. Heeringa, and J. Nerbonne (2007), ‘An aggregate analysis of pronunciation in the Goeman-Taeldeman-van Reenen-Project data’, Taal en Tongval, 59(1), 84-116.

Figure 1: Correlating linguistic and geographic distances, county level (N = 38), all features (p_total = 62), r = .22, p = .00. [Scatterplot of morphosyntactic distance against geographic distance (in km); R² linear = .047.]

Figure 2: Correlating linguistic and geographic distances, county level (N = 38), geographically significant features only (p_geo = 23), r = .41, p = .00. [Scatterplot of morphosyntactic distance against geographic distance (in km); R² linear = .166.]

Figure 3: Continuum map: regular MDS on Euclidean distance matrix (county level). Labels are three-letter Chapman county codes (see http://www.genuki.org.uk/big/Regions/Codes.html for a legend). Smooth colour transitions indicate the presence of a dialect continuum. Reddish colours correlate best with increased frequencies of multiple negation, greenish colours correlate best with higher frequencies of non-standard weak past tense and past participle forms, and bluish colours correlate best with increased frequencies of wh-relativisation.

Figure 4: Composite cluster map, county level (N = 38), all features (p_total = 62); input: cophenetic distance matrix (clustering algorithm: WPGMA). Darker borders indicate more robust dialect boundaries.

Figure 5: Composite cluster map, county level (N = 38), all features (p_total = 62); input: cophenetic distance matrix (clustering algorithm: Ward). Darker borders indicate more robust dialect boundaries.

Figure 6: Fuzzy MDS map, county level (N = 38), all features (p_total = 62); input: cophenetic distance matrix (clustering algorithm: WPGMA); felicitousness of the MDS solution: r = .96. Similar colours indicate likely membership in the same dialect area.

Figure 7: Fuzzy MDS map, county level (N = 38), all features (p_total = 62); input: cophenetic distance matrix (clustering algorithm: Ward); felicitousness of the MDS solution: r = 1.00.
Similar colours indicate likely membership in the same dialect area.

work_gnmc5jq4yvbsjj7l2xxzqpev2e ----

GEOPARSING, GIS, AND TEXTUAL ANALYSIS: CURRENT DEVELOPMENTS IN SPATIAL HUMANITIES RESEARCH

Ian Gregory, Christopher Donaldson, Patricia Murrieta-Flores, and Paul Rayson

Introduction

The spatial humanities constitute a rapidly developing research field that has the potential to create a step-change in the ways in which the humanities deal with geography and geographical information. As yet, however, research in the spatial humanities is only just beginning to deliver the applied contributions to knowledge that will prove its significance. Demonstrating the potential of innovations in technical fields is, almost always, a lengthy process, as it takes time to create the required datasets and to design and implement appropriate techniques for engaging with the information those datasets contain. Beyond this, there is the need to define appropriate research questions and to set parameters for interpreting findings, both of which can involve prolonged discussion and debate. The spatial humanities are still in early phases of this process. Accordingly, the purpose of this special issue is to showcase a set of exemplary studies and research projects that not only demonstrate the field’s potential to contribute to knowledge across a range of humanities disciplines, but also suggest pathways for future research. Our ambition is both to demonstrate how the application of exploratory techniques in the spatial humanities offers new insights about the geographies embedded in a diverse range of texts (including letters, works of literature, and official reports) and, at the same time, to encourage other scholars to integrate these techniques in their research.

To date, a standard definition of the spatial humanities has yet to be determined (a fact which, no doubt, has much to do with the pioneering nature of the field); however, Richardson et al’s 2011 description of GeoHumanities as a ‘rapidly growing zone of creative interaction between geography and the humanities’ offers an excellent starting place.1 Although there is nothing explicitly digital about this definition, such ‘creative interaction’ certainly owes much to the widespread application of geographical technologies within the digital humanities.2 In this way, one can begin to trace the relation between GeoHumanities and what Gregory and Geddes describe when referring to the spatial humanities as a field that employs ‘geographical technologies to develop new knowledge about the geographies of human cultures past and present.’3 For Gregory and Geddes, the spatial humanities has its origins in historical geographical information systems (HGIS), from which it has developed both on account of technological advances in geographic information science (GISc) and on account of the increased acceptance of digital technologies as tools for knowledge creation across a range of humanities disciplines. In their seminal collection, The Spatial Humanities, Bodenhamer et al advance a similar conception of the field, presenting the spatial humanities both through critical engagements with specific spatial technologies and through more general evaluations of the benefits these technologies can bring.4 Implicit in each of these accounts is the notion that the spatial humanities has come about through the creative adaptation and application of geographical technologies in humanities research.
Taking this basic definition of the spatial humanities as a starting point, this issue draws attention to important developments that are shaping the field.

1. Digital collections and geographical technologies

Information technology is currently advancing in ways that make both non-quantitative source materials and geographical technologies increasingly available and accessible. The amount of digital textual material available to researchers is proliferating rapidly. Some of the major digital collections of historical books and reports include the Old Bailey Online, Early English Books Online, and the British Library’s Nineteenth-Century Newspaper collection,5 which each comprise many millions or even billions of words. The digitisation of these corpora (as large collections of digital texts are called) was principally driven by the desire to make them available online with keyword-search facilities. With recent advances in text analysis software, however, it is becoming apparent that researchers can gain new insights about these corpora through more complex forms of analysis. In addition, researchers have also begun to recognize the manifest importance of ‘born digital’ material, including corpora compiled from websites, email archives, and social media, which can be analysed in many of the same ways.

Alongside these developments, the last several years have also witnessed unprecedented advances in the design and distribution of geographical technologies. Once these were the preserve of geographical information systems (GIS) software, such as ArcGIS, which, although powerful, were both expensive and highly specialised and therefore only accessible to a small number of specialists.6 Thanks to the increasing number of open-source GIS software packages, however, technologies that were once available only to a select few have begun to reach a mass audience. Moreover, with the ubiquity of satnavs and virtual globes (such as Google Earth), interest in and familiarity with digital mapping has increased exponentially.7 Indeed, as Bodenhamer et al proffer, ‘we are more aware than ever of the power of the map to facilitate commerce, enable knowledge discovery, [and] make geographic information visual and socially relevant.’8 When we draw these threads together – the proliferation of digital corpora, the increasing availability and accessibility of geographical technologies, and the growing awareness of the potential contributions to knowledge that these technologies can make – the potential of the spatial humanities to advance humanities scholarship becomes clear. This is especially so when we consider the recent advances made in the development of automated geoparsing.

2. Georeferencing and automated geoparsing

Creating georeferenced databases (where each item of data is assigned geographic coordinates, allowing it to be mapped and spatially analysed) has long been identified as one of the main challenges in implementing GIS and other geospatial technologies in humanities research.9 Much of the foundational work in quantitative HGIS was completed by projects that built major databases of census and related statistics. These projects entailed time-consuming research, and frequently millions of dollars in funding, to create systems that linked digitised historical administrative boundaries, which changed over time, with databases of statistical tables.10 The challenge facing humanities researchers today is rather different. In general, the sources used by humanities researchers are unstructured texts in which geographical information is present in the form of specific named entities such as place-names. In order to georeference an unstructured text, one needs to identify the place-names it contains and then to assign each place-name to the coordinates that represent its location. The first of these steps can be achieved by implementing Natural Language Processing techniques that are capable of automatically recognizing the place-names within a text. Completing the second step entails pairing these place-names with coordinate data from a gazetteer, such as GeoNames or GNIS.11 This two-step process is known as geoparsing.12
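In code, the two steps can be sketched as follows. This is a minimal illustration only, using spaCy's off-the-shelf named-entity recogniser and a toy two-entry gazetteer with illustrative coordinates; production systems such as the Edinburgh Geoparser discussed later in this issue add sophisticated disambiguation (for instance, choosing between the many places called 'Lancaster') that this sketch omits.

    # Minimal geoparsing sketch: NER (step one) plus gazetteer lookup (step two).
    # Assumes spaCy's small English model is installed; the gazetteer is a toy
    # stand-in for a resource such as GeoNames.
    import spacy

    nlp = spacy.load('en_core_web_sm')
    gazetteer = {'Keswick': (54.60, -3.13), 'Lancaster': (54.05, -2.80)}

    def geoparse(text):
        doc = nlp(text)
        # Step one: keep entities tagged as geopolitical entities or locations.
        places = [e.text for e in doc.ents if e.label_ in ('GPE', 'LOC')]
        # Step two: resolve each recognised place-name against the gazetteer.
        return [(p, gazetteer[p]) for p in places if p in gazetteer]

    print(geoparse('From Lancaster we travelled north towards Keswick.'))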
Geoparsing provides a solution to georeferencing texts; however, the next (and arguably the more important) task is to decide what to do with the fully georeferenced text. The software and the analytic techniques developed for working with georeferenced databases were designed to handle quantitative sources, usually from scientific or social science paradigms. But how does one analyse a georeferenced text in ways that are sensitive to the complex nature of humanities sources? Moreover, what contributions to knowledge can we expect from these approaches? This special issue addresses these questions by presenting a series of studies and research projects that demonstrate not only the opportunities afforded by geoparsing and working with georeferenced texts, but also the challenges this presents and their implications for spatial humanities research.

3. Trends in spatial humanities research

The early days of HGIS were characterised by projects that used quantitative data to study spatial patterns in fields such as historical demography and environmental and economic history. Over the past decade, the potential of these approaches to make contributions to knowledge has become increasingly apparent,13 and scholars in disciplines across the humanities and social sciences have begun to incorporate GIS, and cognate geospatial technologies, into their research. In what follows, we offer a brief survey of these developments, beginning with archaeology and history and then moving on to consider literary studies and, finally, two closely related areas, corpus and computational linguistics.

3.1. Archaeology, history and classics

Archaeology was one of the first humanities disciplines to integrate geospatial technologies in its methods and to apply these technologies in its research. That this was the case is largely because the study of the material past is inherently spatial in nature. In order to make sense of artefacts and sites, archaeologists need to understand their spatial contexts. They need, in other words, to determine how specific objects, features, and structures relate not only to the places where they are found, but also to the wider landscapes those places comprise. It is only by understanding the spaces inhabited by past cultures that archaeologists are able to reconstruct the customs, beliefs, and institutions that defined those cultures. At this rudimentary level, all forms of archaeology can be characterised by a preoccupation with spatial thinking. Given this, it comes as little surprise that the methodologies and theories that have shaped the discipline over the past fifty years have continued to emphasize the spatial dimension of archaeological practice.
The emergence of ‘spatial archaeology’ during the 1970s can be seen as paradigmatic in this respect; for, although not uncontroversial, its influence is apparent not only in the widespread use of spatial analysis on everything from excavated artefacts to landscape compositions, but also in the most recent trends in archaeological computing.14 Archaeologists first began to utilise GIS during the 1980s.15 Since that time, other spatial technologies such as Remote Sensing and GPS have also been incorporated in almost every branch of the discipline.16 By contrast, the spatial study of textual sources in archaeology is a much more recent phenomenon. The reasons for this are varied; but, for the present purposes, it suffices to say that the use of textual sources in archaeology is different than it is in most other humanities disciplines. Whereas texts are the core source for most humanities disciplines, artefacts found in digs are the core of archaeology. This is one of the reasons why digital archaeology has often seemed to have such a different ‘scene’ during the emergence of digital humanities.17 Indeed, pioneering projects have only recently begun to harness the potential of spatial technologies to investigate text corpora relevant to archaeological research. One example of this is using techniques from corpus and computational linguistics to mine grey literature reports, extracting potential spatial and contextual information for archaeological interpretation.18 Although this type of approach provides a foundation for assessing the geographies underlying such texts, the methodologies that can go beyond data exploration and enter the realm of corpus analysis have yet to materialise in archaeology. We think, however, that this will not take long. With the combination of the experience in spatial methods and thinking from archaeology, and the diverse approaches developed in history and in corpus and computational linguistics, we expect to see advanced forms of textual spatial analysis in archaeology in the near future.

In the case of history, the application of geospatial methods is a more recent phenomenon. Here, as was the case in archaeology, the integration of GIS came about largely in response to the need to find methodologies to answer specific research questions in the wider context of the discipline. HGIS, the result of these endeavours, has over the past fifteen years become a diverse and dynamic subfield. Although, as noted above, HGIS projects initially concentrated on the quantitative exploration of economic and political data,19 more recent studies have focused on the application of GIS in the analysis of historical documents.20 Exploratory HGIS scholarship has also recently undertaken experimental research combining spatial and corpus analysis,21 and, in some cases, even integrating GIS with serious gaming engines to facilitate the 3-D virtual modelling of historical places and landscapes.22

Thus, by different routes, GIS has become important to both archaeology and historical geography. For archaeologists this evolved from the need to survey and record sites, for historical geographers from the desire to make better use of quantitative data, but for both subjects it has led to the development of new analytic approaches. In both cases, there is a clear potential to apply these technologies and methods to textual sources.
Work done in classics has also been exploiting the potential benefits of harnessing the spatial information in texts and representing it using enhanced visualisations. Examples include HESTIA23 and Google Ancient Places,24 which have developed Web-based visual resources to facilitate the exploration of places of interest mentioned in ancient literature.25

3.2. Literary studies

A number of ground-breaking research projects have also recently identified the transformative potential of GIS, and related technologies, for the discipline of literary studies. Led by research teams at centres of excellence in Britain, Europe, the United States, and Australia, these projects have proven that GIS and its cognates have the power to revolutionise how we interpret the material, imaginative, and discursive geographies not only of individual novels, poems, and plays, but also of large corpora of literary works. In doing so, these projects have helped reinvigorate both literary geography (the study of the spatiality of literary works) and, more generally, what might be called the geography of literature (the study of the place-bound nature of the acts of writing, publishing, and reading). Even more remarkably, they have also suggested the potential of wholly new modes and practices for literary scholarship. A key development here has been the emergence of digital literary atlas projects, such as ETH Zurich’s Literary Atlas of Europe, the University of Queensland’s Cultural Atlas of Australia, Trinity College Dublin’s Digital Literary Atlas of Ireland, and the New University of Lisbon’s LITSCAPE.PT (which is featured in this collection). Taking their cue from the pioneering work of Franco Moretti, Matthew Jockers, and the Stanford Literary Lab, the creators of these atlases have embraced the idea that literary critics can use maps as ‘analytical tools: that dissect the text in unusual ways, and bring to light relations that would otherwise remain hidden.’26 Underlying this methodological premise is a conviction in the value of the map as a form of abstraction that, in reducing the object of study – the literary text or corpus – to a few particulars, both defamiliarizes it and, in the process, helps to generate new research questions and to guide critical inquiry. Equally operative here is a new, contested paradigm for literary hermeneutics, variously called ‘distant reading’27 or ‘macroanalysis’,28 which has sought to supplement more traditional critical approaches through the aggregate analysis of large literary corpora. These new practices, as Jockers explains, are poised to take advantage of ‘the massive digital-text collections’ available on the World Wide Web and, in the process, to launch literary studies into a new age:

Today, in the age of digital libraries and large-scale book-digitization projects, the nature of the evidence available to us has changed, radically. Which is not to say that we should no longer read books looking for, or noting, random ‘things,’ but rather to emphasize that massive digital corpora offer us unprecedented access to the literary record and invite, even demand, a new type of evidence gathering and meaning making.29

In other words, instead of simply identifying, isolating, and analysing the features of a handful of ‘representative’ texts, literary scholars today should strive to create new knowledge about those features and texts by contextualizing them in relation to the large text corpora that are now available to them.
The creation of digital literary atlases, such as the Literary Atlas of Europe and LITSCAPE.PT, has demonstrated how GIS and spatial analysis can assist in facilitating this sort of contextualization. But, even though these projects are successful in their own terms, they have also stimulated important debates about the merits of such macro-mapping activities for literary scholarship. A key issue here is the widely perceived incongruity between the methodologies of GISc – with their reliance on precise, quantifiable data – and the kinds of equivocal or, what Bushell calls, ‘slippery’ information with which literary scholars typically engage.30 Put bluntly, tools and techniques designed to measure absolute Euclidean space often prove inadequate for modelling the complex, contingent, and, at times, contradictory geographies of literary works of art – a criticism that can also be applied to other humanities sources. Consequently, many literary scholars have dismissed the application of GIS, and other digital tools, in the macro-mapping of literary corpora as problematically instrumentalist and reductive, noting that this approach tends to flatten out and suppress the differences that distinguish literary works from one another. Other scholars have been more conciliatory, expressing interest in the results of such projects whilst advising that the value of mapping as a critical practice depends largely on the nature of the texts being studied. Notably, Hewitt counsels that the analysis of literary works ‘is more revealing when sensitivity is shown to the approaches of individual texts and authors,’ and, accordingly, that ‘mapping … cannot be the first step in a mass hermeneutic process,’ but should come ‘after an exploration of … evidence of a work’s engagement with spatial concerns.’31 Another common criticism of large-scale literary atlas projects is that they problematically conflate the real world and the world of the literary text. For, although acknowledging that ‘the geography of fiction follows its own distinctive rules, since literature can create its own space, without physical restrictions,’ in the end these projects are often simply predicated on the positivist assumption ‘that a large part of fiction indeed refers to the physical/real world.’32 As Bushell reminds us, ‘the points where [a literary map] does not correspond directly to the world of the book may be more interesting than the points where it does.’33

These are major impediments to the integration of spatial humanities approaches in literary studies. For although GIS and distant reading are clearly compatible and can be usefully combined to build spatial models of specific narrative structures, their efficacy in aiding textual study is, as yet, limited. The problem here, as one research team has noted, is both epistemological and technological: a consequence both of differing research cultures and of the limitations of current research tools.34 In order to overcome these deficits, researchers within the field of the digital humanities – including literary scholars, computer scientists, and GISc specialists – need to work together not only to develop new practices and frameworks for interdisciplinary collaboration and creative exchange, but also to produce substantial works of scholarship that demonstrate how those practices and frameworks contribute to close study and analysis of specific literary texts.
Encouragingly, this is precisely the direction in which much work in the field of literary cartography is moving. Notably, research centres such as the Stanford Literary Lab have continued to pioneer innovative interdisciplinary approaches to the study of literary spaces, most recently by modelling the use of digital crowdsourcing to construct an ‘emotional map’ of London based on a corpus of eighteenth- and nineteenth-century novels.35 At Lancaster University, moreover, a team of literary scholars, GIScientists, quantitative historians, and corpus and computational linguists is using data mining in conjunction with GIS to investigate literary representations of and responses to the English Lake District.36 Furthermore, an interdisciplinary team of scholars at the University of Edinburgh is currently using immersive-mapping and mobile-computing technologies to enable users to explore the Edinburgh cityscape through geo-located extracts of literary works from the early modern period to the twentieth century.37 Alongside these projects, other integrative forms of geospatial technology are informing the development of deep mapping, a newly emerging concept that an increasing body of literature discusses in detail.38

3.3 Corpus and computational linguistics

A key requirement underpinning the mapping of texts is the ability to connect GIS techniques with the underlying qualitative texts. As already described, at a minimum, this entails geoparsing the texts to extract place-names and locate these on a map. Additional techniques from the closely related areas of corpus and computational linguistics can enable the spatial humanities researcher to link their distant reading back to close analysis of the underlying text.

Computational linguistics, or Natural Language Processing (NLP), techniques drawn from computer science allow the automatic summarisation of meaning or extraction of a variety of patterns from text. One example NLP technique is named-entity extraction, which can, to a certain level of accuracy, find all mentions of personal names, organisations, place-names, dates and times in a text. As evidenced in this special issue, combining this technique with toponym resolution39 to locate the extracted place-names on a map forms one of these bi-directional links from analysis to text. Another large class of NLP techniques enables the automatic annotation of words or phrases within a text at various levels of linguistic detail. Part-of-speech (POS) annotation enables highly accurate (97-98%) identification of major word classes in text, such as adjectives, nouns, verbs and adverbs. Once these categories are marked in the corpus, they can be searched alongside the word forms. Proper nouns are particularly useful for finding place-names and personal names. Adjectives are a good source when searching, for example, for evaluative descriptions of landscape features. A second level of tagging, called semantic annotation,40 adds meaning or conceptual labels to words and phrases in a text. This enables searching by concepts (e.g. health and disease, finance and money, education) within the corpus.
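The combination of POS tagging and place-name matching can be illustrated with a short sketch. The example below uses NLTK's off-the-shelf tagger to pull out adjectives occurring near a place-name – a crude approximation of the kind of query described above; the sentence, window size, and place list are all invented for illustration.

    # Sketch: adjectives within five words of a place-name mention. Assumes NLTK's
    # tokeniser and tagger data packages are downloaded; the inputs are invented.
    import nltk

    text = 'The bleak and rugged fells above Keswick looked magnificent.'
    tagged = nltk.pos_tag(nltk.word_tokenize(text))  # Penn Treebank tags; 'JJ' = adjective

    places = {'Keswick'}
    for i, (word, tag) in enumerate(tagged):
        if word in places:
            window = tagged[max(0, i - 5):i + 6]
            print([w for w, t in window if t.startswith('JJ')])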
The second family of related techniques stems from corpus linguistics, which is a method or collection of methods for text analysis originating in the discipline of linguistics. With the increase in power and storage capacity of computers in the 1970s and 1980s, a set of methods emerged which enabled large quantities of machine-readable text to be analysed and explored semi-automatically for language description purposes. Similar developments in the digital humanities can be traced back to Roberto Busa, who began working with IBM in 1949 to produce his computer-generated Index Thomisticus of the writings of Thomas Aquinas. In parallel, dictionary publication was revolutionised in the 1980s with the creation of machine-readable corpora such as COBUILD alongside new searching and analysis software.

At least five corpus linguistics methods are worthy of mention here. In combination, they provide a semi-automatic approach to data-driven exploratory analysis which can uncover patterns within the data that are otherwise difficult or impossible to extract by more manual analyses. First, frequency lists show all the different word types in a text and how often they occur, allowing the researcher to focus their efforts on the most represented features in a corpus. Frequency lists can also be extended to show how well a word is dispersed within a corpus, and this is key to understanding the salience of a word in a corpus. Second, concordances show every occurrence of a word in a text with a small amount of context, usually 4-5 words either side. This enables the researcher to look for patterns and meanings by sorting the surrounding text. Third, the keywords method compares two or more frequency lists to identify words which are statistically more represented in one text relative to another sub-corpus or a much larger reference corpus. This can show what a text is about and highlight interesting terms for further analysis. Fourth, n-grams (sometimes called lexical bundles or clusters) show repeated consecutive sequences of words of a given length (n), which extends the single-word frequency lists. Finally, the collocation technique assists in finding which words regularly co-occur in close proximity in texts. In the context of the spatial humanities, the collocation technique has proven useful for discovering which topics are discussed in relation to different places that are mentioned in a corpus. In addition, the combination of the keywords method, semantic analysis and collocation means that it is possible to uncover and visualise the topics associated with particular place-names by connecting GIS databases to their semantic collocates, so-called ‘visual GISting’.41
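As a minimal illustration of the collocation technique, the sketch below counts the words that occur within a window of four tokens either side of a place-name across a corpus. Real corpus tools rank such co-occurrences with association statistics (e.g. log-likelihood or mutual information); the toy corpus and simple frequency count shown here are illustrative assumptions.

    # Sketch of place-name collocation: count words within four tokens of a node
    # word. The toy corpus and raw-frequency ranking are illustrative only.
    from collections import Counter

    corpus = ['the crags above keswick rise steeply from the lake',
              'walkers at keswick admire the crags and the lake']
    node, span = 'keswick', 4

    collocates = Counter()
    for line in corpus:
        tokens = line.split()
        for i, tok in enumerate(tokens):
            if tok == node:
                collocates.update(tokens[max(0, i - span):i] + tokens[i + 1:i + span + 1])

    print(collocates.most_common(5))   # 'crags' surfaces once function words are set aside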
4. The essays in this issue

The above discussion shows that the spatial humanities draw from, and are applied to, a wide range of disciplines. Nevertheless, at its core there are similarities in approach, methods and limitations that draw the spatial humanities together. The essays in this volume have been selected to represent both the diversities and commonalities of the field.

The first essay, by Alex et al, is principally concerned with geoparsing, and with how the automated extraction of geographic information can aid the analysis of large text corpora. The main focus here is the Edinburgh Geoparser, a state-of-the-art Web-based tool, which has been adapted to facilitate the georeferencing of historical texts. In order to illustrate the power and flexibility of the Edinburgh Geoparser, Alex et al present three brief, but contrasting, case studies: one concerned with nineteenth-century trade, another concerned with the ancient world, and a third concerned with a historical gazetteer of English place-names.

The four essays that follow Alex et al move from geoparsing to consider how working with georeferenced corpora can inform humanities research. In the first, Schwartz engages with a field that is near the traditional heartland of HGIS: environmental history. Drawing on georeferenced texts from the British Parliamentary Papers, his chapter examines nineteenth-century reports on fish stocks in British waters. In order to do this, Schwartz combines computer-assisted qualitative data analysis (CAQDAS) methods with GIS to assess changing perceptions about the decline of fish stocks by comparing texts from Royal Commissions in 1863 and 1893. Although Schwartz is keen to stress that CAQDAS and GIS are reductionist approaches, his essay convincingly demonstrates how they can aid more traditional forms of historical analysis by generating research questions and guiding critical inquiry.

Alves and Queiroz’s essay, which focuses on the application of geospatial tools in literary studies, also suggests how GIS-based distant reading can complement more traditional close reading practices. Here the focus is on LITSCAPE.PT, a digital literary atlas of historical and modern Portuguese literature. Rather than using geoparsing software, as Alves and Queiroz explain, LITSCAPE.PT relies on a creative combination of crowdsourcing and relational databases to facilitate the mapping and analysis of excerpts from a wide variety of literary works. Using this mixed-methods approach, the project has successfully catalogued more than 6,000 excerpts, which are being used to study literary representations of mainland Portugal as well as social and environmental history. As examples, Alves and Queiroz present two case studies: a comparative examination of the evolving physical and literary geographies of Lisbon and a transhistorical assessment of the declining presence of wolves in Portuguese literature.

The fourth essay, by Purves and Derungs, also addresses the representation of landscapes in writing; however, they are more concerned with the way that texts represent landscape than with what those representations reveal about the texts themselves. As Purves and Derungs argue, engaging with written representations of space can help human geographers move beyond the GIS-facilitated measurement of Euclidean space towards a more nuanced conception of place. As proof-of-concept, Purves and Derungs focus on two apparently contrasting corpora related to mountain landscapes in Switzerland and Britain: the Text+Berg archive of the Yearbooks of the Swiss Alpine Club and georeferenced photographs from Flickr. Although the Yearbooks (a historical corpus documenting 150 years of Alpine mountaineering) seem like a typical resource to draw on in such an analysis, Flickr (a web 2.0 site that allows users to upload photographs) is a much less obvious choice. However, as Purves and Derungs explain, the metadata added to Flickr photos make them an equally excellent resource for understanding how places are experienced and perceived.

Van den Heuvel’s essay, which concludes this special issue, also emphasizes the need to move beyond a purely Euclidean conception of space, which, he contends, is inadequate for modelling the sorts of networks and systems of knowledge exchange in which scholars in the humanities are often interested.
As his example, van den Heuvel concentrates on the Republic of Letters, suggesting that spatial humanities approaches can aid us in reconstructing the geographical distribution of documents and drawings that defined this epistolary intellectual community. Taking inspiration from the notion of ‘deep maps’, van den Heuvel concludes by positing the idea of ‘deep networks’ as a means for visualising how knowledge was communicated and disseminated in early modern Europe.

Taken together, these essays offer a representative sample of the sorts of projects currently being pursued within the spatial humanities. They call attention to new developments within the field and, moreover, present different perspectives on the challenges and the potentials of research in the field. In doing so, they affirm that the creative adaptation and application of geographical technologies has the potential to revolutionize scholarship across a number of different humanities disciplines. Our hope is that, in modelling the use of innovative methods, they will encourage other scholars to integrate these approaches in their research.

Acknowledgements

This special edition resulted from an expert meeting on ‘Digital Texts and Geographical Technologies in the Digital Humanities’ held at Lancaster University 8-9th July 2013, funded by the European Research Council (ERC) under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC grant ‘Spatial Humanities: Texts, GIS, Places’ (agreement number 283850). This introductory essay also benefited from support under the same grant.

End Notes

1 D. Richardson, S. Luria, J. Ketchum and M. Dear, ‘Introducing the geohumanities’, in M. Dear, J. Ketchum, S. Luria and D. Richardson, eds., GeoHumanities: Art, history, text at the edge of place (Abingdon, 2011), 3-4. Cited here at 3.
2 See, for instance, K. Offen, ‘Historical geography II: Digital imaginations’, Progress in Human Geography, 37, no. 4 (2013), 564-77.
3 I.N. Gregory and A. Geddes, ‘Introduction: From Historical GIS to Spatial Humanities: Deepening scholarship and broadening technology’, in I.N. Gregory and A. Geddes, eds., Towards Spatial Humanities: Historical GIS and Spatial History (Bloomington, IN, 2014), ix-xix. Cited here at xv.
4 See: D.J. Bodenhamer, J. Corrigan, and T.M. Harris, ‘Introduction’, in D.J. Bodenhamer, J. Corrigan and T.M. Harris, eds., The Spatial Humanities: GIS and the future of humanities scholarship (Bloomington, 2010), vii-xv; and D.J. Bodenhamer, ‘The potential of spatial humanities’, in D.J. Bodenhamer, J. Corrigan, and T.M. Harris, eds., The Spatial Humanities: GIS and the future of humanities scholarship (Bloomington, 2010), 14-30.
5 See: The Proceedings of the Old Bailey – London’s Central Criminal Court, 1674 to 1913, http://www.oldbaileyonline.org, last accessed 5 Aug 2014; Early English Books Online, http://eebo.chadwyck.com/home, last accessed 5 Aug 2014; British Newspapers, 1600-1950, http://gale.cengage.co.uk/product-highlights/history/19th-century-british-library-newspapers.aspx, last accessed 5 Aug 2014.
6 See ArcGIS, [accessed 5 Aug 2014].
7 See: Google Earth, [accessed 5 Aug 2014]. Examples of free and open source GIS software include: QGIS: A free and open source geographic information system, http://www.qgis.org, last accessed 5 Aug 2014; and MapWindow, http://www.mapwindow.org, last accessed 5 Aug 2014.
8 Bodenhamer et al (2010), vii.
9 I.N. Gregory and P.S. Ell, Historical GIS: Technologies, methodologies, scholarship (Cambridge, 2007), Chap 3.
10 A.K. Knowles, ed.,
'Reports on National Historical GIS projects', Historical Geography, 33 (2005), 293-314, provides a review.
11 GeoNames, [accessed 5 Aug 2014]; GNIS, [accessed 5 Aug 2014].
12 C. Grover, R. Tobin, K. Byrne, M. Woollard, J. Reid, S. Dunn and J. Ball, 'Use of the Edinburgh geoparser for georeferencing digitized historical collections', Philosophical Transactions of the Royal Society A, 368 (2010), 3875-3889.
13 I.N. Gregory, 'Further reading: From historical GIS to spatial humanities: An evolving literature', in I.N. Gregory and A. Geddes, eds., Towards Spatial Humanities: Historical GIS and Spatial History (Bloomington, IN, 2014), 186-202; A.K. Knowles, ed., Placing History: How maps, spatial data, and GIS are changing historical scholarship (Redlands, CA, 2008).
14 J. Huggett, 'What Lies Beneath: Lifting the Lid on Archaeological Computing', in A. Chrysanthi, P. Murrieta-Flores, and C. Papadopoulos, eds., Thinking Beyond the Tool: Archaeological Computing and the Interpretative Process (Oxford, 2012), 204–214.
15 M. Aldenderfer, H. Maschner, and M. Goodchild, Anthropology, space, and geographic information systems (New York, 1996); A.S. Fotheringham, C. Brunsdon, and M. Charlton, Quantitative geography: perspectives on spatial data analysis (Thousand Oaks, CA, 2000); and D. Wheatley and M. Gillings, Spatial Technology and Archaeology: The Archaeological Applications of GIS (London, 2002).
16 R.N. Parker and E.K. Asencio, GIS and spatial analysis for the social sciences: coding, mapping, and modelling (New York, 2008); and M.F. Goodchild and D.G. Janelle, 'Toward critical spatial thinking in the social sciences and humanities', GeoJournal, 75, no. 1 (2010), 3-13.
17 J. Huggett, 'Core or periphery? Digital Humanities from an archaeological perspective', Historical Social Research, 37, no. 3 (2012), 86-105.
18 J. Richards, S. Jeffrey, S. Waller, F. Ciravegna, S. Chapman, and Z. Zhang, 'The Archaeology Data Service and the Archaeotools Project: Faceted Classification and Natural Language Processing', in E. C. Kansa, S. Whitcher Kansa, and E. Watrall, eds., Archaeology 2.0: New Approaches to Communication and Collaboration (Los Angeles, 2011), 31–56.
19 I.N. Gregory and R.G. Healey, 'Historical GIS: Structuring, mapping and analysing geographies of the past', Progress in Human Geography, 31 (2007), 638-653.
20 B. Donahue, 'Mapping Husbandry in Concord: GIS as a Tool for Environmental History', in A. K. Knowles and A. Hillier, eds., Placing history: how maps, spatial data, and GIS are changing historical scholarship (Redlands, CA, 2008), 151-77; A. K. Knowles, W. Roush, C. Abshere, L. Farrell, A. Feinberg, and T. Humber, 'What could Lee see at Gettysburg', in A. K. Knowles and A. Hillier, eds., (2008), 235-66; R.M. Schwartz, I. Gregory, and J. Marti-Henneberg, 'History and GIS: railways, population change, and agricultural development in late nineteenth century Wales', in M. Dear, et al (2011), 251-266.
21 I. Gregory and A. Hardie, 'Visual GISting: bringing together corpus linguistics and Geographical Information Systems', Literary and Linguistic Computing, 26, no. 3 (2011), 297-314; P. Murrieta-Flores, A. Baron, I. Gregory, A. Hardie, and P. Rayson, 'Automatically analysing large texts in a GIS environment: the Registrar General's reports and cholera in the nineteenth century', Transactions in GIS (2014).
22 See, for example, UCLA RomeLab, [accessed 5 Aug 2014]; Virtual St Paul's Cross Project, [accessed 5 Aug 2014]; and T.M. Harris, L.J. Rouse, and S.
Bergeron, 'Humanities GIS: Adding place, spatial storytelling and immersive visualization into the Humanities', in M. Dear, et al (2011), 226-240.
23 E. Barker, S. Bouzarovski, C. Pelling, and L. Isaksen, 'Mapping an ancient historian in a digital age: the Herodotus Encoded Space-Text-Image Archive (HESTIA)', Leeds International Classical Studies, 9, no. 1 (2010), 1-24.
24 Google Ancient Places, [accessed 5 Aug 2014].
25 E. Barker, K. Byrne, L. Isaksen, E. Kansa, and N. Rabinowitz, Google Ancient Places, [accessed 5 Aug 2014].
26 F. Moretti, Atlas of the European Novel, 1800-1900 (London, 1998), 3.
27 F. Moretti, 'Conjectures on World Literature', New Left Review, 1 (2000), 54–68.
28 M. L. Jockers, Macroanalysis: Digital Methods and Literary History (Illinois, 2013).
29 Jockers (2013), 7-8.
30 S. Bushell, 'The slipperiness of literary maps: Critical cartography and literary cartography', Cartographica, 47, no. 3 (2012), 149-60.
31 R. Hewitt, 'Mapping and Romanticism', Wordsworth Circle, 42, no. 2 (2011), 157-65.
32 B. Piatti and L. Hurni, 'Editorial: The cartographies of fictional worlds', The Cartographic Journal, 48, no. 4 (2011), 218-23: 218-19.
33 Bushell (2012), 154.
34 D.J. Bodenhamer, T.M. Harris, and J. Corrigan, 'Spatial Narratives and Deep Maps: A Special Report', International Journal of Humanities and Arts Computing, 7, nos. 1-2 (2013), 170-75.
35 See Stanford Literary Lab, [accessed 5 Aug 2014].
36 See D. Cooper and I. Gregory, 'Mapping the English Lake District: A Literary GIS', Transactions of the Institute of British Geographers, 36, no. 1 (2011), 89-108; see also Spatial Humanities: Texts, GIS, Places, [accessed 5 Aug 2014].
37 See Palimpsest, [accessed 5 Aug 2014].
38 See D.J. Bodenhamer, T.M. Harris, and J. Corrigan, Deep Maps and Spatial Narratives (Bloomington: Indiana University Press, forthcoming 2014).
39 See J. L. Leidner, 'Toponym resolution in text: annotation, evaluation and applications of spatial grounding', SIGIR Forum, 41, no. 2 (2007), 124-126.
40 P. Rayson, D. Archer, S. L. Piao, and T. McEnery, 'The UCREL semantic analysis system', in Proceedings of the workshop on Beyond Named Entity Recognition: Semantic labelling for NLP tasks, in association with the 4th International Conference on Language Resources and Evaluation (LREC 2004), 25th May 2004, Lisbon, Portugal (2004), 7-12.
41 Gregory and Hardie (2011).
work_gsarfotn5zdibdrpi2gq3nhpoq ---- A data infrastructure for digital cultural heritage: characteristics, requirements and priority services
Antonella FRESA, Technical Coordinator, Central Institute for the Union Catalogue of Italian Libraries
TELDAP 2012 Conference, Taipei, 22 February 2012

Table of content
• The Digital Cultural Heritage sector: characteristics and needs
• The vision towards a DCH data infrastructure
• Two inter-related projects: DC-NET and INDICATE
• Positioning of the DCH sector

Initiatives of the European Member States in the last 10 years
A wide range of activities:
• Building a shared platform of recommendations and guidelines
• Agreement on common data models
• Experimenting and launching innovative online services
• E-infrastructures for the citizens
• E-infrastructures for the research
• International cooperation: in Europe and abroad
• Digitisation within national and regional programmes

[Timeline diagram: from national and regional initiatives (2002), through shared recommendations and guidelines (2005) and a common data model, services and Europeana (2009), towards a digital cultural heritage e-infrastructure built on e-infrastructures for the citizens and for the researchers (2012-2014).]

Digital cultural content characteristics
The amount of digitised material is growing very rapidly:
• National, regional and European programmes support the digitisation of the content of museums, libraries, archives, archaeological sites and audiovisual repositories
• The generation of digital cultural heritage is also accelerated by the impulse of Europeana, which is fostering the European cultural institutions to produce even more digital content
• Digital cultural heritage content is complex and interlinked through many relations

THE VISION
[Diagram: national, regional and European programmes feed digital cultural content into a data continuum of national, regional, thematic and international portals.]

The needs of the DCH sector
1. high-quality information technology management, to ensure trust, availability, reliability, long-term safety of content, security, preservation and sustainability;
2. access facilities for the final users (the researchers), who will search the DCH e-Infrastructure for their research, and for the cultural institutions, which will deliver their data to the DCH e-Infrastructure;
3. interoperation among existing cultural heritage repositories, and of cultural heritage data with research data.

The e-infrastructure for DCH
It is not a "new infrastructure" but a "new approach":
- based on national and regional systems
- valorising existing resources
The keyword is INTEROPERABILITY.
[Diagram: regional, national and thematic systems interconnected.]

Expected impacts
• e-Infrastructures: the adoption of the e-Infrastructures by the digital cultural heritage community will open new scenarios of use and exploitation
• Cultural Heritage: cultural managers will become more aware of the potential that the e-infrastructures can offer to their work: storage, preservation, access services for the cultural institutions, etc.
• Research: a better integration of the cultural sector with the e-Infrastructures will enable the research of new advanced services and applications
• Other sectors: digital cultural content will become more usable and re-usable for education, cultural tourism, life-long learning, non-professional cultural interests, the creative industry, etc.

DCH vs e-Infrastructures
• To focus on the use of existing e-infrastructures as a channel for digital cultural heritage data
• Storage, computing and connectivity, together with the authentication, authorisation and accounting mechanisms offered by the e-infrastructures, can well serve the needs of the sector: the issue here is to establish factual cooperation between two sectors (research and cultural heritage) that are not used to working together

Key players
• Key players from the DCH: Ministries of Culture; cultural institutions (cross-domain: museums, libraries and archives together)
• Key players from the research: Ministries of Research; researchers in the Humanities; researchers in ICT applied to CH
• E-Infrastructure providers

Preparatory actions
• To define priorities among the services to be deployed
• To consult and advocate with stakeholders
• To engage with programme owners
• To improve awareness: standards, who-is-who, …
• To promote trust building, covering different aspects and including organisational, operational and legal issues
• To run experiments: pilots and use case studies
• To open international cooperation
• To establish an e-culture community

Two integrated projects
1. DC-NET: joint activities plan for DCH e-infrastructure implementation (priorities and programming)
2. INDICATE: international cooperation, use case studies, pilots, policy harmonisation (support and demonstration)

DC-NET ERA-NET
A Network for the European Research Area:
• Composed of Programme Owners and Programme Managers in the cultural sector
• To agree common perspectives & priorities across EU Member States
• To establish an operative dialogue between the cultural heritage and e-Infrastructures communities in Europe
• To identify constraints and capabilities in order to establish a plan of joint activities
Started in December 2009, it will last until March 2012. A project funded by EC FP7 e-Infrastructures.

INDICATE
A concrete approach within an international dimension:
– Stimulating the international cooperation of e-Infrastructure providers and cultural heritage users
– Target areas: the Mediterranean region (Egypt, Turkey and Jordan); cooperation with China in liaison with the EPIKH Grid School; exchanges with South America in the frame of experiments for live distributed performances
– Case studies: preservation, virtual exhibitions, GIS
Started in September 2010, it will last until September 2012. A project funded by EC FP7 e-Infrastructures.

• The two projects share the same coordinator and have many partners in common.
• The e-infrastructure programmes identified in DC-NET will be at the basis of the sustainability of the results of INDICATE.
• The two projects represent the same DCH community.
[Timeline diagram: DC-NET runs from 1/12/2009 to 31/05/2012 and INDICATE from 1/9/2010 to 31/8/2012, overlapping from 1/4/2011.]

Research workflow and Service priorities
Priorities for the Digital Cultural Heritage sector have been put together, having in mind the typical workflow of the DCH research.
Typical DCH research workflow
• Find: accessing information
• Process: tools for manipulating information
• Publish: make the results visible online
• Conference: discuss and annotate published information
• Preserve: maintaining access to content over the longer term
• Secure
Plus lower-level "basic digital services" such as email, data storage, web hosting, etc.

Services priorities
On the basis of the typical workflow of the DCH research, services are divided into 3 categories:
1. Services for content providers, i.e. those related to the creation of online data resources for DCH
2. Services for managing and adding value to the content itself
3. Services which enable, support and enhance virtual research communities and the activities of content consumers

Services for content providers and data resource creation: FROM common issues TO common priorities

Services for content providers and data resource creation
Common issues:
• Interoperability of online resources
• Insularity in terms of searching
• Changes in location
• High cost of establishment
• Vulnerability to technical problems
• Limitation on server capacity and processing

Services for content providers and data resource creation
Common priorities:
• Interoperation of systems
• Aggregation of content
• Cross-search
• Semantic search
• Persistent identification of digital objects
• Simplification of set-up services
• Stable platform
• Scalability

Services for managing and adding value to content, e.g.:
• Geo-referencing
• 3D representation
• Virtual reality and immersive interfaces
• Annotation
• Linked data generation

Services for content consumers
The "cafeteria model": a broad range of services to be made available, without the need to actually deliver all of them for every member of the community. E.g.:
• User authentication and access control
• Collaborative environments
• Advanced search
• Visualisation

Services priority ordering
A prioritised list of the most immediately important services has been agreed:
1. Long-term preservation
2. Persistent identifiers
3. Interoperability and Aggregation
4. Advanced search
5. Data resource set-up
6. User authentication and access control
7. IPR and digital rights management

[Diagram: three overlapping circles labelled culture, research and e-infrastructures.] Cooperation and coordination among these three sectors is at the core of the DCH e-infrastructure.

The network of common interest
It combines:
– regional, national and international levels,
– bottom-up (working groups) and top-down (Joint Programming) approaches
Working groups: experts seconded by their cultural, research and infrastructure organisations.
Cooperation with other networks and projects: EPIKH, CHAIN, EUMEDGRID-Support, EUMEDCONNECT2, LINKED HERITAGE, ….

Liaisons with strategic bodies
Factual cooperation is established with:
– e-IRG, e-Infrastructure Reflection Group
– ESFRI, European Strategy Forum on Research Infrastructures (SSH thematic working group)
– EGI, European Grid Initiative
– TERENA, Trans-European Research and Networking Association
– MSEG, Member States Expert Group on digitisation
– ASREN, Arab States Research and Education Network

Position Paper
Open consultation on the Green Paper on the Common Strategic Framework:
1. European Coordination: the role of Member States and the European Commission
2.
Europeana: towards its full deployment
3. Preservation: a task for the Member States
4. Digital Cultural Heritage: the need for a research e-Infrastructure
5. Research and innovation in the digital cultural heritage: an international matter
6. Users involvement: the success factor
7. Coordination and demonstration: a requirement for the DCH sector

Next appointments
8 March 2012, Rome – DC-NET Final Conference
20 April 2012, Catania – INDICATE Technical Conference, to demonstrate the e-Culture Science Gateway and to present the results of the use case studies on long-term preservation, virtual exhibitions and geo-coded cultural content
9-10 July 2012, Cairo – INDICATE Final Conference

The vision
• INDICATE and DC-NET are part of a wider process, which started 10 years ago among cultural institutions
• This process has entered a new phase by joining the research e-infrastructures
• The time is ripe to start working towards an Open Science Infrastructure for Digital Cultural Heritage in 2020 (Joint Programming, support and demonstrations, roadmaps, DCH-RP proposal)

Thank you
Antonella Fresa, DC-NET and INDICATE Technical Coordinator
fresa@promoter.it / antonella.fresa@beniculturali.it
www.dc-net.org / www.indicate-project.org

work_gtrr6eekknan3eo7i2anh4hjya ---- Neurocognitive Literary Studies and Digital Humanities
Dr. Valiur Rahaman (Paper Presenter)
Assistant Professor, Department of English, Madhav Institute of Technology & Science, Gwalior, INDIA
Founder President, Indian Society of Digital Humanities (formed 2016)
Principal Investigator of the CRS Research Project on Humanities-Inspired Technology
Submitted for Long Presentation at the ADHO Conference / Digital Humanities 2020 / Virtual Conference
Keywords: Humanities-Inspired Technology, Research in Digital Humanities, Neurocriticism, Autism, Literary Studies, Literary Data Modeling, Digital Narrative, Social Media, Transdisciplinary Research

Neurocognitive Literary Studies and Digital Humanities

ABSTRACT
The paper demonstrates how neurocognitive social psychology can be applied to study human behavior through literary character analysis with digital tools, and how digital literary studies framed in terms of neurocognitive psychology may help develop new models for technology and theories of contemporary science. On the basis of these theses, the paper illustrates the theoretical methodology called "Humanities-inspired technology for society" as an essential sub-branch of Digital Humanities, and its application to two major research studies: to great classics of all times, and to the etiology of autism. The paper advocates bringing literary theory and neurocognitive literature into the curricula of science and technology.

1. Introduction
Psychology, Cognitive Science and Psychoanalysis are often intersectional subjects with literary studies. Digital Humanities strengthens literary studies when its scholarship helps develop models for the advancement of science and technology.
To date, very few studies have gone in this direction: how DH scholarship can help technological modeling for challenging social problems and healthcare issues. The paper highlights the conceptual ground of humanities-inspired technology for society (HITS), its applications and functions. It has a major component, 'neurocognitive literary study' through digital tools, and hence the paper establishes a networked rapport of literary arts with neurocognitive science and digital humanities/studies. At the beginning of the paper, the author defines HITS as an approach to the knowledge system and concludes with its applications.

2. HITS as Sub-branch of DH: A Study in Digital Humanities to Technological Advancement
Digital Humanities scholarship is used to disseminate, preserve, conserve and visually represent the knowledge system, but it is seldom used for advancing human technologies for social welfare. The paper explores humanities-inspired technology as a subdiscipline of Digital Humanities: it studies how humanities scholarship, intersected, interpreted or analyzed with digital technological tools, contributes models for technological development. It deals with the practical expositions of literary or language philosophers and critical theorists as impetuses for the modeling of cognitive computational technology. Hence, it establishes an inseparable bridge between practices in technology and humanities epistemology. The function of Humanities-inspired technology for society (HITS) essentially lies in developing models based on digital studies in the philosophy of language and in literary studies in terms of brain, mind and behavior. It coordinates the two different streams of the knowledge system for three reasons: first, to remind; second, to upgrade; and third, to develop. It reminds the world of technology of what it has missed; it suggests upgrading technological tools and devices for humane utilization without hazardous impacts on the earth and beyond; and it develops new models out of scholarly studies in the humanities for technological advances. For instance, no neuro-model-based technology has been developed to date to identify the factors behind sexually deviant criminals, or to control or detect such heinous criminals. Beginning with empathy for the victims, a HITS scholar studies the behavior patterns of such personalities in literature in terms of neurocognitive psychology and social psychology, and may develop a behavior-semiotic model based on the studied patterns and a prepared corpus. Such studies develop industry-based research and development in the fields of Digital Humanities, which is a much-awaited epistemological contention in the arena of humanities departments in India and across the world. For ages, literature has been studied in its own terms: Aristotelian, Longinian, Classicist, Romantic, Modern, Postmodern, Gender, Colonial and Postcolonial. Literary studies seldom go beyond their defined disciplinary territories, and this has been a major reason for their decline across the world. The boundary is defined for its users, and the users are not allowed to go beyond it; thus, communication with the real world is questioned in literary studies. The influences of Marx, Freud, Nietzsche, Foucault, Lacan, and Derrida penetrate human thinking irresistibly, so they could touch the offshoots of literary studies despite the disciplinary resistance of classical rhetoricians.
Now, something more than that has happened: science and technology have entered the study of the Humanities in a slow but steady manner, in successive phases resulting in Humanities Computing, Computational Humanities, Digital Humanities, Speculative Digital Humanities (SpecLab), and Public Digital Humanities.

3. Conceptualization, Experimentation, and Invention
The demand for transdisciplinary studies of science and arts, aesthetics and technology is observed in the history of ideas, in contentions of difference and epistemological hybridity. I.A. Richards's collaborative work with C.K. Ogden developed a transdisciplinary approach to poetics called the 'science of criticism' (Green); C.P. Snow observed two cultures in the "intellectual life of the whole of western society" (Rede Lectures); E.O. Wilson's Consilience: The Unity of Knowledge (Wilson) is the finest exposition of transdisciplinary thought, arguing for "consilience", the synthesis of knowledge derived from different specialized fields of human endeavor to envision a new field of knowledge serving society. "The greatest enterprise of the mind has always been and always will be the attempted linkage of the sciences and humanities." (Wilson; Morris) How is this linkage possible? Let us understand with a few examples. Descartes' painting is a piece of popular science known as a pattern-design for the first experimentation in designing the airplane (Miller), and the coordinate system is ingrained in Descartes's philosophy; similarly, Thomas Carlyle's circle is a well-known model in mathematics (DeTemple), "a certain circle in a coordinate plane associated with a quadratic equation", and many similar studies are yet to be done.
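The Carlyle circle can be made concrete with a short derivation; this is standard textbook material, added here only as illustration and not drawn from the cited sources. For the quadratic $x^2 - sx + p = 0$, take the points $A = (0, 1)$ and $B = (s, p)$ and draw the circle with $AB$ as diameter:

    x(x - s) + (y - 1)(y - p) = 0.

Setting $y = 0$ reduces this to

    x^2 - sx + p = 0,

so the circle meets the x-axis exactly at the real roots of the quadratic, which is what makes the construction useful for finding roots with ruler and compass.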
The implications of the humanities knowledge of these two figures are examples of humanities-inspired technology and science. Such interferences of the Humanities in the domains of science and technology are observable, and they establish the idea that science and technology are also developed by the epistemological influences of the Humanities (especially linguistics, literature and cultural heritage). HITS never establishes the superiority of one knowledge system over another, a problem in epistemological enquiry demonstrated in Science and Poetry (Midgley).

4. Literature, Neurocognitive Science, and Technology: Substantial Studies in Neurocognitive Digital Humanities
Based on the concept argued above, the paper now reflects on substantiated research studies in humanities-inspired technology. It is shown how knowledge of the Humanities polishes and cherishes the motives for developing technological tools that guarantee the safety and security of human society at large. We conducted two studies together. I theorized a 'neurocognitive literary theory' based on patterns of "activated neurons affecting/effecting the human behavior (ANAEHB)" and applied it: to study Hamlet's neurological problems, equating his mental status with that of existing persons in real society; to study R.N. Tagore's The Post Office in terms of how neurocognitive forces in an author empathetically influence the audiences of the play, resulting in its translation and staging across the world during the World War (Rahaman and Sharma); and to study neurodevelopmental issues reflected through behavior, such as the mental anguish and moral dilemmas of Rodion Raskolnikov in Fyodor Dostoyevsky's Crime and Punishment (1866), the neurocognitive factors of the racially discriminative behavior patterns of Marlow and Kurtz in Joseph Conrad's Heart of Darkness (1899), and the sexually deviant behavior of David Lurie in J.M. Coetzee's Disgrace (1999). These characters illustrate the behavior patterns of socially disturbed mindsets that result in numerous societal problems at large. The specific factors of behaviors that disturb the other members of society, and their connection with the CNS, are studied etiologically in reply to the research questions: Can literary reading be intersected with neurological and computational studies? Can reading in the Humanities, or knowledge of the humanities, help solve complex problems in the development of AI, neurocomputation, human-nature-inspired computing, and medical computing? The studies yielded the following findings as outcomes of neurocognitive literary study: 1. assessing the impulses of human beings through the deep reading of literary classics, and comparing them with real-life situations in human society, is feasible; 2. human impulses can be understood by identifying the neurological causes behind human behavior, and computational modeling can be developed to express the criminal mindset; 3. based on humanities and knowledge engineering for medicine and technology, a device was developed to protect a woman from an unwanted accident; 4. the possibility of transdisciplinary research in arts and literature intersected with the cognitive sciences and computational studies was established; 5. literature and language were established as reflections of socio-neural behavior, and the mental patterns of the neurological disorders behind rape and murder were identified. For literary studies, words are the only media for assessing human behaviors, so the software application Atlas.ti is used to analyze behavior patterns through the frequencies of the words used by the characters of the literary works; a sketch of this kind of analysis follows.
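Atlas.ti is proprietary software, so none of its code can be shown here; purely as an illustration of the character-wise word-frequency profiling described above, a minimal Python sketch might look as follows. The input format, file name and character names are hypothetical assumptions, not part of the original study.

    import re
    from collections import Counter

    def character_word_frequencies(lines, characters):
        # Profile word frequencies per speaking character in a play script
        # whose lines have the form "SPEAKER: utterance".
        profiles = {name: Counter() for name in characters}
        for line in lines:
            speaker, _, utterance = line.partition(":")
            speaker = speaker.strip().upper()
            if speaker in profiles:
                profiles[speaker].update(re.findall(r"[a-z']+", utterance.lower()))
        return profiles

    # Hypothetical usage on a plain-text transcript of Hamlet:
    with open("hamlet.txt", encoding="utf-8") as f:
        profiles = character_word_frequencies(f, {"HAMLET", "OPHELIA"})
    print(profiles["HAMLET"].most_common(20))

Frequency profiles of this kind can then be compared across characters, or against clinical language corpora, which is the quantitative step underlying the behavioral-pattern claims above.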
5. Literary Narratives, Neurodevelopment and Techno-epidemiology
As argued, the transdisciplinary approach always brings novelty to the procedures of experimentation, resulting in prismatic ways of seeing the world. For example, Friedrich Salomon Rothschild (1899-1995), a psychiatrist and colleague of Erich Fromm (1900-1980), developed the theory of biosemiotics. Rothschild was a reader of Charles W. Morris (cited above), who studied engineering and psychology at Northwestern University and earned a Ph.D. under the research supervision of the psycho-sociologist George Herbert Mead (1863-1931). Morris's book Signs, Language, and Behavior (1946) elucidates the signs representing human behavior and the specific modes of signifying adequacy, truth, and reliability of signs, and defines life as a semiotic narrative, the signature of human behavior. Similarly, J.C. Whitehorn and G.K. Zipf collaboratively wrote "Schizophrenic language" (1943); this paper, along with Zipf's The Psycho-Biology of Language (1935), "The Unity of Nature, Least-Action, and Natural Social Science" (1942), and "Observations of the Possible Effect of Mental Age Upon the Frequency-Distribution of Words, from the Viewpoint of Dynamic Philology", is among the oldest research papers archived in PubMed and remains foundational work on the cognitive-linguistic disorders that most basically symptomatize autism. These works are the consequences of inclinations towards what we call "research consilience", a transdisciplinary approach to knowledge serving humanity and its associated agencies.

6. ZEF Factor of Autism
The prevalence rate of autism in the world itself states the fact that existing approaches to curing and challenging autism are less than effective. To address this, it is necessary to observe the history of the etiology of autism: from the second decade of the twentieth century, through World Wars I and II, and up to 2019. The entire history of autism reveals various factors of autism established by medical practices or special treatment. Keen observation of the etiology of autism shows that the epidemiological historians of autism could not really differentiate between the symptomatology and the etiology of autism. The problem is strongly put forth in "Deconstructing the Etiology of Autism and its Cure through Social Media & Digital Literary Narratives" (Rahaman 2020), which came up with a major finding: autism eventuates during the fertilization period, long before the birth of a child. It is an evaluative study of the research pursued on the etiology of ASD and of the possibility of developing a parallel way of treatment by deconstructing the established hardcore medical practices for ASD. We studied and critically evaluated articles published between 1943 and 2019, consulted the World Health Organization reports on the prevalence of ASD in the USA and eight South Asian countries, developed an additional idea for ASD therapy through "social media" and "literary narratives" as distinct from the technological, and developed a model of post-technological autism treatment. The study contributed to helping the cure procedures for ASD through "social media" and "literary narratives", and to the further requirement of upgrading epidemiological treatment through technological imaging and the development of technology based on the ZEF factors of autism. The other findings establish open possibilities of research in the fields required to design further research and to make policies to resist the prevalence of ASD around the world.

Acknowledgements
The concept of "Humanities Inspired Technology & Science" (HITS) sprang from the readings for the ongoing research project sponsored by the Collaborative Research Scheme under TEQIP-III, National Project Implementation Unit of MHRD, Govt. of India. The aim of the project is to define the potential of the Humanities & Social Sciences to be used for the development of technology and science for the welfare of human beings, with minimum after-effects or side-effects upon common lives or its target groups. Thanks to Dr R.K. Pandit, Director, MITS, for his support and rich discussion of the research studies.

Works Cited
DeTemple, Duane W. "Carlyle Circles and the Lemoine Simplicity of Polygon Constructions." The American Mathematical Monthly, vol. 98, no. 2, 1991, pp. 97–108, doi:10.1080/00029890.1991.11995711.
Green, Elspeth. "I.A. Richards Among the Scientists." ELH, vol. 86, no. 3, Fall 2019, pp. 751–77, doi:10.1353/elh.2019.0028.
Midgley, Mary. Science and Poetry. Routledge, London, 2001.
Miller, Leonard G. "Descartes, Mathematics, and God." Philosophical Review, vol. 66, no. 4, 1957, pp. 451–65.
Morris, Charles William. "Symbolism and Reality: A Study in the Nature of Mind." Foundations of Semiotics, no. 15, 1993, pp. xxv, 128.
Rahaman, Valiur, and Sanjiv Sharma.
"Reading an Extremist Mind through Literary Language: Approaching Cognitive Literary Hermeneutics to R.N. Tagore's Play The Post Office for Neuro-Computational Predictions." In G.R. Sinha and Jasjit S. Suri, eds., Cognitive Informatics, Computer Modelling, and Cognitive Science, Academic Press, 2020, pp. 197–210, doi:10.1016/B978-0-12-819445-4.00010-2.
Wilson, Edward O. Consilience: The Unity of Knowledge. Vintage Books / Random House, New York, 1999.

Consulted Works
1. Arbib, Michael A., and James J. Bonaiuto. From Neuron to Cognition via Computational Neuroscience. The MIT Press, 2016.
2. Bara, B.G., Ciaramidaro, A., Walter, H., and Adenzato, M. "Intentional minds: A philosophical analysis of intention tested through fMRI experiments involving people with schizophrenia, people with autism, and healthy individuals." Frontiers in Human Neuroscience, 5(7), 111, 2011.
3. Baron-Cohen, S. Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, MA: MIT Press, 1995.
4. Brown, Julie. Writers on the Spectrum: How Autism and Asperger Syndrome have Influenced Literary Writing. Jessica Kingsley Publishers, London, 2010.
5. Cook, Amy. Shakespearean Neuroplay: Reinvigorating the Study of Dramatic Texts and Performance through Cognitive Science. Palgrave Macmillan, 2010.
6. Corbett, B.A., et al. "Treatment Effects in Social Cognition and Behavior following a Theater-based Intervention for Youth with Autism." Cortex, 115, June 2019, 15-26, doi:10.1016/j.cortex.2019.01.003. Epub 22 Jan 2019.
7. Dutta, Krishna; Robinson, Andrew, eds. Rabindranath Tagore: An Anthology. Macmillan, 1998.
8. Einstein, A.J., Henzlova, M.J., Rajagopalan, S. "Estimating risk of cancer associated with radiation exposure from 64-slice computed tomography coronary angiography." JAMA, 2007; 298(3): 317–323.
9. Emmeche, Claus; Kull, Kalevi. Towards a Semiotic Biology: Life is the Action of Signs. Imperial College Press, 2011.
10. Fitzgerald, Michael. The Genesis of Artistic Creativity: Asperger's Syndrome and the Arts. Jessica Kingsley Publishers, 2005.
11. McKenzie, George, Jackie Powell and Robin Usher, eds. Understanding Social Research: Perspectives on Methodology and Practice. The Falmer Press, London, 1997.
12. Glynn, Dylan. Quantitative Methods in Cognitive Semantics: Corpus-Driven Approaches (Cognitive Linguistics Research). De Gruyter Mouton, 2010.
13. Hickok, Gregory. The Myth of Mirror Neurons: The Real Neuroscience of Communication and Cognition. W.W. Norton, London, 2014.
14. Hickok, Gregory. "Eight Problems for the Mirror Neuron Theory of Action Understanding in Monkeys and Humans" (2009). Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2773693/
15. Korczak, Janusz. Ghetto Diary, with an introduction by Betty Jean Lifton. Available: https://ia800401.us.archive.org/2/items/GhettoDiary-EnglishJanuszKorczak/ghettodiary.pdf. Retrieved 1.8.2019.
16. Pandit, R.K., and Rahaman, Valiur. "Critical Pedagogy in Digital Era: Understanding the Importance of Arts & Humanities for Sustainable IT Development" (May 12, 2019). Proceedings of the International Conference on Digital Pedagogies (ICDP) 2019. Available at SSRN: https://ssrn.com/abstract=3387020 or http://dx.doi.org/10.2139/ssrn.3387020
17. Tepe, Peter. Discourse Studies, Vol. 13, No. 5, Special Issue on Hermeneutics and Discourse Analysis (October 2011), pp. 601-608.
18. Pineda, Jaime A., ed. Mirror Neuron Systems: The Role of Mirroring Processes in Social Cognition. Humana Press / Springer, 2013.
Rahaman, Valiur. 2020. “Epi/Pandemic in Literature: A Study in Medical Humanities for COVID 19 Prevention Plenary Speaker. National Webinar on Literature & Epidemics. May 2020. MK Bhavnagar University, Gujarat. India.” Bhavnagar: Bhavnagar University India. https://sites.google.com/view/webinar-eng-mkbu/plenaries?authuser=0. 21. Rahaman, Valiur, and Sanjiv Sharma. 2020. “Chapter 10 - Reading an Extremist Mind through Literary Language: Approaching Cognitive Literary Hermeneutics to R.N. Tagore’s Play The Post Office for Neuro-Computational Predictions.” Cognitive Informatics Computer Modelling, and Cognitive Science. Ed. G R Sinha and Jasjit Suri., 197–210. Academic Press. Elsevier. doi:https://doi.org/10.1016/B978-0-12-819445-4.00010-2. 22. Ramchandran V. Blackslee, Sanda. (1998) Phantoms in the Brain: Probing the Mysteries of the Human Mind. HarperCollins. London. 1999. P 368. 23. Shakespeare, William.. Hamlet. Ed. Burton Raffe and Bloom. Yale University Press. London. 2003. doi:10.1017/CBO9781107415324.004. 24. Wolfrey, Julian. 2011. Introducing Criticism at the 21st Century. Introducing Criticism in the 21st Century, Edinburgh University Press. London. 25. Wilson, Matthew W. “Cyborg Geographies: Towards Hybrid Epistemologies”. Gender, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2773693/(2009) https://ia800401.us.archive.org/2/items/GhettoDiary-EnglishJanuszKorczak/ghettodiary.pdf.%20Retrieved%20on%201.8.2019 https://ia800401.us.archive.org/2/items/GhettoDiary-EnglishJanuszKorczak/ghettodiary.pdf.%20Retrieved%20on%201.8.2019 https://ia800401.us.archive.org/2/items/GhettoDiary-EnglishJanuszKorczak/ghettodiary.pdf.%20Retrieved%20on%201.8.2019 https://ssrn.com/abstract%3D3387020 https://dx.doi.org/10.2139/ssrn.3387020 https://sites.google.com/view/webinar-eng-mkbu/plenaries?authuser=0 10 Place and Culture, 16(5) (2009): 499–515. 26. Yeo, Richard. Defining Science, William Whewell, natural knowledge, and public debate in early Victorian Britain. Cambridge University Press. 1993. 27. V. Gallese, M.A. Gernsbacher, C. Heyes, G. Hickok, M. Iacoboni, “Mirror Neuron Forum”, Perspectives on Psychological Science 6 (4) (2011) 369-407. https://dx.doi.org/10.1177%2F1745691611413392 work_gv7vwol2tvfddmns3xm2mj7aje ---- Can forward dynamics simulation with simple model estimate complex phenomena?: Case study on sprinting using running-specific prosthesis Murai et al. Robomech J (2018) 5:10 https://doi.org/10.1186/s40648-018-0108-8 R E S E A R C H A R T I C L E Can forward dynamics simulation with simple model estimate complex phenomena?: Case study on sprinting using running-specific prosthesis Akihiko Murai* , Hiroaki Hobara, Satoru Hashizume, Yoshiyuki Kobayashi and Mitsunori Tada Abstract Surpassing the world record in athletic performance requires extensive use of kinematic and dynamic motion analy- ses to develop novel body usage skills and training methods. Performance beyond the current world record has not been realized or measured; therefore, we need to generate it with dynamics consistency using forward dynamics simulation, although it is technologically difficult because of the complexity of the human structure and its dynamics. This research develops a multilayered kinodynamics simulation that uses a detailed digital human model and a simple motion-representation model to generate the detailed sprinting performances of individuals with lower extremity amputations (ILEAs) aided by carbon-fiber running-specific prostheses (RSPs), which have complex interactions with humans. 
First, we developed a digital human model of an ILEA using an RSP. We analyzed ILEA sprinting based on experimental motion measurements and kinematics/dynamics computations. We modeled the RSP-aided ILEA sprinting using a simple spring-loaded inverted pendulum model, comprising a linear massless spring, damper, and mass, and we identified the relevant parameters from experimentally measured motion data. Finally, we modified the sprint motion by varying the parameters corresponding to the RSP characteristics. Here, the forward dynamics have been utilized to simulate detailed whole-body sprinting with different RSP types (including simulated RSPs not worn by the subject). Our simulations show good correspondence with the experimentally measured data and further indicate that the sprint time can be improved by reducing the RSP viscosity and increasing stiffness. These simulation results are validated by the experimentally measured motion modifications obtained with different types of RSPs. These results show that the multilayered kinodynamics simulation using the detailed digital human model and the simple motion-representation model has the capacity to generate complex phenomena such as RSP-aided ILEA sprinting that contains complex interactions between the human and the RSP. This simulation technique can be applied to RSP design optimization for ILEA sprinting.

Keywords: Digital human technology, Running-specific prosthesis, Motion modification simulation

© The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
*Correspondence: a.murai@aist.go.jp. Digital Human Research Group, National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26, Aomi, Koto-ku, Tokyo 135-0064, Japan.

Introduction
We have measured the kinetic and physiological aspects of human performance using an optical motion capture system, force plate, etc., and applied kinematics and dynamics analyses to compute the joint angles and torques and estimate muscle activities, in the fields of biomechanics and sports science. This method has realized excellent athletic performances and clarified injury mechanisms, although it cannot analyze performances that have not been realized, for instance, a performance that surpasses the current world record. We usually generate the motions of robots, for instance, a grounded manipulator with dynamic consistency, by (1) joint angle or task-based motion generation and (2) forward dynamics simulation. However, applying this technique to human whole-body motion generation is considerably difficult, because (1) humans have many more degrees of freedom and a much more complicated structure compared to robots, and (2) humans are floating systems; therefore, we need to estimate the contact forces, which can easily become unstable, especially during dynamic motions such as sprinting [1–4].
This research solves these problems by developing a multilayered kinodynamics simulation that uses a detailed digital human model and a simple motion-representation model, which parametrically represents human motion mechanisms. Here, kinodynamics denotes the discipline that tries to solve kinematic constraints and dynamic constraints simultaneously, as defined in [5]. In this study, we analyzed and modelled the sprinting performances of individuals with lower extremity amputations (ILEAs) aided by carbon-fiber running-specific prostheses (RSPs), which have complex interactions with humans that form the kinematic and dynamic constraints, with the goal of improving the RSP-aided ILEA sprinting performance. Carbon-fiber RSPs have enabled ILEAs to realize hitherto unachieved degrees of high-level sprinting [6]. While the running mechanics of able-bodied sprinters and ILEAs have been examined previously, these studies were mainly limited to biomechanics [7, 8]. Further, many studies have investigated RSP behavior and performance during sprinting through rigid-body dynamics [9] and finite element analysis [10], but the relationship between the RSP characteristics and sprint performance remains unclear. In particular, RSP-aided ILEA sprinting involves humans and RSPs, as well as the kinematic and dynamic interactions between them, and its kinematics and dynamics analyses are technically complex compared with those of the general rigid body systems that are often analyzed in the robotics field. We generated the RSP-aided ILEA sprinting motion, which contains the complex interactions between humans and RSPs, by developing a multilayered kinodynamics simulation that uses a detailed digital human model and a simple motion-representation model parametrically representing human motion mechanisms. Digital human models have been developed to study body kinematics and perform dynamics analyses. These models have been developed based on the knowledge of human anatomy, and they can estimate and analyze human motion through kinematics and dynamics computations [11–14]. We extended the spring-loaded inverted pendulum (SLIP) model for the simple motion-representation model that parametrically represents human motion mechanisms. The sprinting motion is often simplified using the SLIP model, which models the entire human body as a spring-mass model and describes the spring-like leg movement during sprinting [15]. We applied a unilateral SLIP model with a spring, damper, and mass, similar to the one used in [16], to model the RSP-aided ILEA sprinting motion. Figure 1 shows the concept of the multilayered kinodynamics simulation. This simulation consists of (A) simplification of the sprint motion using the SLIP model comprising a spring, damper, and mass, and identification of the relevant parameters from experimentally measured motion; (B) modification of the sprint motion by varying the SLIP model parameters and simulation of its forward dynamics; and (C) reconstruction of the detailed whole-body sprint motion from the simulated SLIP model motion.

[Fig. 1 Concept chart of multilayered kinodynamics simulation of motion modification. The simulation consists of three steps: (A) simplify the sprinting motion with the SLIP model and identify its parameters using experimentally measured motion; (B) modify the identified model parameters and simulate the forward dynamics of the simple model; (C) reconstruct the detailed whole-body motion, including motion that has not been performed by the subject, from the motion of the simple model, for comparison/evaluation and application to RSP design, sports training, etc.]

Our approach realizes the simulation of the detailed whole-body sprinting of a specific subject using different RSP types and properties (including simulated RSPs not worn by the subject). We evaluated our simulation results by comparing them with the experimentally measured motion, and both result sets showed good correspondence.
This modeling and simulation technique can contribute to the quantitative evaluation and design of RSPs to realize higher levels of RSP-aided ILEA sprinting performance.

Methods
We first modified our anatomographic digital human model [11] by adjusting the surface shape and skeleton, so that the model represented the kinematic and dynamic characteristics of the ILEAs using the RSPs. The able-bodied individual and ILEA models consisted of 18 and 23 bones, respectively. Each bone was represented as a rigid-body linkage with inertial parameters, and the bones were connected to each other via spherical joints; a sketch of this data layout follows.
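The authors do not publish their model code; the following minimal Python sketch, in which all names and values are illustrative assumptions, only indicates the kind of data layout implied by this description: segments with inertial parameters, linked by spherical joints.

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    @dataclass
    class Bone:
        # One rigid-body segment of the digital human model.
        name: str
        mass: float                              # kg
        com_local: Tuple[float, float, float]    # segment-frame COM (m)
        inertia: Tuple[float, float, float]      # principal moments (kg m^2)
        parent: Optional[str] = None             # joint to the parent is spherical
        children: List[str] = field(default_factory=list)

    def add_bone(skeleton: dict, bone: Bone) -> None:
        # Register a bone and link it to its parent segment.
        skeleton[bone.name] = bone
        if bone.parent is not None:
            skeleton[bone.parent].children.append(bone.name)

    skeleton: dict = {}
    add_bone(skeleton, Bone("pelvis", 10.5, (0.0, 0.0, 0.0), (0.10, 0.10, 0.10)))
    add_bone(skeleton, Bone("right_femur", 7.2, (0.0, -0.2, 0.0), (0.12, 0.03, 0.12), parent="pelvis"))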
The SLIP model, which represents the entire human body as a spring- mass model, has been previously applied to describe the spring-like leg movement during locomotion and sprint- ing  [15]. In this study, we applied the unilateral ’spring- damper-mass SLIP model’ to represent RSP-aided ILEA sprinting (Fig.  3). Here, the whole body was modeled as a mass supported by a spring and damper connected in parallel. Next, we identified the relevant parameters of this model using experimentally measured motion data. The natural length of the leg ( Lleg,0 ) and the spring and damper parameters ( Kleg and Dleg , respectively) were identified for the intact limb and the RSP, respectively, by mathematical optimization. This optimization minimized the error between the spring and damper forces and measured the contact force between the ground and the intact limb or the RSP, as given in the following equation: (1) Ef = T∑ t (Fleg(t) − (Kleg(Lleg(t) − Lleg,0) + Dleg ˙Lleg(t))) 2 , able-bodied model RSP (Sprinter 1E90) RSP (Xtreme) Fig. 2 Digital human models of an able-bodied individual and ILEA fitted with an RSP. Left: digital human model of the able-bodied individual, mid- dle: model of the ILEA fitted with Sprinter 1E90, and right: model of the ILEA fitted with Xtreme Page 4 of 8Murai et al. Robomech J (2018) 5:10 where Fleg is the measured contact force, Lleg is the meas- ured length of the leg, and ˙Lleg is its velocity. We also computed the height ( H0 ) of the center of mass (COM) and the contact angle ( θ0 ) at forefoot strike for each case from the experimental data. (B) Modification and simulation of the forward dynamics of RSP‑aided ILEA sprinting The kinematic and dynamic characteristics of an RSP can be modified by changing its shape and material. We varied the SLIP model parameters that correspond to the RSP characteristics, ( Kleg and Dleg ). Here, we remark that RSP-aided ILEA sprinting is the result of complex interactions between the human controller and the RSP characteristics. The identified parameters in "Results and discussion" (Table  1) indicate that all the parameters, except the RSP characteristics ( Kleg and Dleg of the pros- thesis), yield similar values during sprinting with differ- ent RSPs. Therefore, we assumed that this specific subject utilizes the same control strategy for all RSPs, and only the parameters corresponding to the RSP characteris- tics change when the individual wears a different type of RSP. We simulated ILEA sprinting using different types of RSPs by modifying Kleg and Dleg of the prosthesis and computed the forward dynamics. Here, we computed COM acceleration ( ACOM(t) ) in the following process. 1 if state == flight & PCOM,y(t) < H0 2 state = stance 3 PCOP,x = PCOM,x + PCOM,y/tanθ0 4 PCOP,y = 0 5 if state = stance 6 Fleg = Kleg(Lleg(t) − Lleg,0 + Dleg) ˙Lleg(t) 7 if Fleg > 0 8 state = flight 9 if state == flight 10 ACOM,x(t) = 0 11 ACOM,y(t) = −g 12 else 13 ACOM,x(t) = (PCOP,x − PCOM,x(t))Fleg(t)/Lleg(t)m 14 ACOM,y(t) = (PCOP,y − PCOM,y(t))Fleg(t)/Lleg(t)m − g where state represents the phase of sprinting and m rep- resents the total mass of the body. Kleg and Dleg change depending on whether the intact or the prosthetic leg is in contact with the ground. The time integration of ACOM(t) computes the trajectory of COM and COP in this forward dynamics simulation. 
(C) Reconstruction of detailed whole‑body motion from the simulated SLIP model motion We reconstructed the detailed whole-body motion from the simulated simple SLIP model motion for detailed kinematics and dynamics analyses and visualization. The trajectories of all 77 markers, which were experimentally measured, were represented using the quadratic form of the SLIP model status. The parameters of this mapping function from the SLIP model status to the trajectories of all 77 markers were optimized by minimizing the follow- ing function: where Pmar is the measured marker position, i is the marker ID, j ∈ (x, y, z) , and M is the quadratic form of the COM position ( PCOM ) and the position of the center of pressure (COP) ( PCOP ), as given in the following function: We optimized the parameters α and β to minimize the evaluation function Em . The trajectories of the 77 mark- ers were reconstructed from the SLIP model motion in the forward dynamics simulation, the abovementioned parameters, and the kinematics constraints arising from the COP position using this M(PCOM,j, PCOP,j) . This step (2) Em = T∑ t (Pmar(i, j, t) − M(PCOM,j(t), PCOP,j(t))) 2 , (3) M(PCOM,j(t), PCOP,j(t)) = (α(PCOM,j(t) − PCOP,j(t)) + β) 2 . H0 θ0 PCOP(t) PCOM(t), VCOM(t) Fleg(t) flight phase stance phase flight phase x y Kleg Dleg Fig. 3 SLIP model for RSP-aided ’ILEA sprinting’ Table 1 Parameters for  the RSPs and  intact leg in  the SLIP model Prosthetic Intact Sprinter 1E90 Xtreme Sprinter 1E90 Xtreme Lleg,0 (m) 1.05E+00 1.06E+00 1.06E+00 1.06E+00 Kleg (N/m) 1.69E+04 2.04E+04 2.14E+04 2.25E+04 Dleg (Ns/m) 8.77E+00 9.23E+01 2.92E+01 2.01E+01 H0 (m) 9.41E−01 9.81E−01 9.63E−01 9.96E−01 θ0 (rad) 1.34E+00 1.31E+00 1.37E+00 1.36E+00 Page 5 of 8Murai et al. Robomech J (2018) 5:10 significantly contributes to the detailed kinematics and dynamics analyses. The whole-body joint angles were estimated using the inverse kinematics computation per- formed using these marker trajectories and the detailed digital human model. The whole-body joint torques were estimated using the inverse dynamics computation per- formed using these joint angles and the contact forces that were estimated in step (B) using the SLIP model. Results and discussion We can observe the following points from the experi- mental results: 1. Figure   4 shows the analyzed motion of RSP-aided ILEA sprinting (Sprinter 1E90). Our model simu- lates the COM trajectories with an average error of 1.94E+01 mm during the stance phase of sprinting. The kinematics and dynamics of the digital human model compute both the human joint torque and bending torque of the RSP during sprinting using its shape and the external force acting upon it. Figure  5 illustrates the bending moment at each point of the RSPs. With regard to step (A) in "Introduction", Table 1 lists the parameters identified from the stance phases of the Sprinter 1E90 and Xtreme RSPs and the intact leg. The values of the prosthetic parame- ters Kleg and Dleg exhibit apparent differences, which correspond to the RSP characteristics, although the other parameters yield similar values. 2. Our model identifies these parameters with average errors of 1.04E+01 ± 6.50 E+00% and 1.83E+01 ± 1.40E+01% (average ± SD) in the contact forces for Sprinter 1E90 and Xtreme, respectively. In step (B), the COM trajectories during the RSP-aided ILEA sprinting (Sprinter 1E90) was simulated (Fig.  4). Our model simulated the COM trajectories with error of 1.94E+01 ± 5.08E+00 mm (average ± SD) during the stance phase of sprinting. 
In step (C), we reconstructed the whole-body 77-marker positions through steps (A to C) without changing the SLIP model parameters. Our method reconstructed the marker positions with an error of 4.32E+00 ± 1.76E+00 mm (average ± SD), with per-marker maximum errors ranging from 2.20E+00 mm on the marker of the right tragus to 3.02E+01 mm on the marker of the top of the RSP during the stance phase of sprinting.

3. Figure 6 shows the detailed whole-body motion and COM trajectories of ILEA sprinting with different RSP types and properties. Figure 7 shows the hip joint torques at the side of the RSP during the stance phase of ILEA sprinting with different RSP types and properties, which are the results of the kinematics and dynamics analyses of the detailed whole-body motion. Figure 8 shows the 100-m sprint time obtained with different types of RSPs.

Fig. 4 Synthesized motion of ILEA sprinting using RSP (20 fps). Blue: measured center of mass (COM) trajectory; green: simulated COM trajectory

Fig. 5 RSP torques during sprinting. Horizontal axis: time, the origin of which represents the instant of the left forefoot strike, and vertical axis: flexion/extension torque whose positive value represents flexion. Left: Sprinter 1E90 and right: Xtreme. Each line corresponds to a point shown in the corresponding figure at the bottom

These results have three implications:

1. The appropriate digital human model, motion measurements, and kinematics and dynamics computations aid in realizing dynamics analysis. Figure 5 represents the bending moment at each point of the RSPs during ILEA sprinting. The radii of these curvatures fit well with the values listed in [18]. The SLIP model with the spring, damper, and mass suitably represents the kinematic and dynamic characteristics of the experimentally measured ILEA sprinting using an RSP for both the intact limb and the RSP.

2. The forward dynamics simulation with the simple SLIP model realizes the kinematics and dynamics analyses of the motions that were not performed by the subject. Figure 6 shows the COM trajectories of ILEA sprinting with different properties of the RSPs. The RSPs with stiffness values of 75 and 125% were not worn by the subject during the measurements; they were simulated. The simulation results show that both types of RSPs exhibit similar patterns: the subject moves upward when Kleg increases and downward when Kleg decreases. Figure 7 shows the hip joint torques at the sides of the RSPs during the stance phase of the RSP-aided ILEA sprinting with different RSP types and properties, which were the result of the detailed whole-body kinematics and dynamics analyses.

Fig. 6 Simulated whole-body sprint motion and COM trajectory during the stance phase with different RSPs. Top row: Sprinter 1E90, bottom row: Xtreme, left (red): 75% of Kleg, middle (green): 100% of Kleg, and right (blue): 125% of Kleg

The RSPs with stiffness values of 75 and 125% were not worn by the subject during the
measurements; they were simulated. The simulation results show that both types of RSPs exhibit similar patterns: the required hip joint torque increases when the RSP stiffness (Kleg) increases. Here, we note that the relationship between the RSP stiffness and the required hip joint torque is not a simple linear relationship. The complex relationships in the temporal and amplitude directions appear because of the kinematic and dynamic interactions between humans and the RSPs. The multilayered kinodynamics simulation using the detailed digital human model and simple motion-representation model represents these complex interactions and realizes the non-linear complex relationship between the RSP stiffness and the hip joint torque that is necessary for sprinting using the same control strategy.

Fig. 7 Simulated left hip joint torque during the stance phase with different types of RSPs. Left graph: Sprinter 1E90, right graph: Xtreme, red dotted line: 75% of Kleg, green solid line: 100% of Kleg, and blue dashed-dotted line: 125% of Kleg

3. From Fig. 8, we note that the 100-m sprint time is significantly improved with a decrease in Dleg, and that the model falls down (PCOM,y(t) becomes 0) when Dleg increases drastically. An increase in Kleg also contributes to slightly reducing the sprint time. The sprint time was 8.53E+00 s when Kleg = 1.69E+04 N/m and Dleg = 8.77E+00 Ns/m (Sprinter 1E90), and 9.33E+00 s (9.38E+00% slower) when Kleg = 2.04E+04 N/m and Dleg = 9.23E+01 Ns/m (Xtreme). The experimentally measured sprint speeds were 7.11E+00 m/s and 6.60E+00 m/s (7.73E+00% slower) for Sprinter 1E90 and Xtreme, respectively. Here, we first note that the simulated 100-m sprint times were relatively short because of the limitations in our model. One limitation was that a fatigue model was not considered in these simulations. In addition, there were certain dynamic and physical limitations; for instance, the friction parameter and maximum muscle tension have not yet been implemented in our SLIP model. Regardless of the above limitations, our results indicate that the forward dynamics simulation with the simple SLIP model agrees satisfactorily with the measured data, in that the ratio between the simulated 100-m sprint times using RSPs whose Kleg and Dleg correspond to Sprinter 1E90 and Xtreme is close to the ratio between the measured sprint speeds using the corresponding RSPs. The simulation results indicate that an increase in Kleg improves the sprint time; therefore, this principle can be applied to RSP design to improve ILEA sprinting performance. These results, however, are limited to one ILEA sprinter with unilateral transfemoral amputation and using several types of RSPs. Further, we have also assumed that the subject adopts the same control strategy when using different types of RSPs.

Conclusion

In conclusion, our multilayered kinodynamics simulation realized stable forward dynamics simulation of ILEA sprinting with an RSP on a specific subject, and estimated the detailed kinematic and dynamic characteristics of this complex phenomenon. We believe that our approach can contribute to simulating performances that surpass human performances, and particularly to the optimization of RSP design for ILEA sprinting.

Authors' contributions

In this study, AM performed the model development, kinematics/dynamics computations, and data analysis, and participated in acquiring the measurements.
HH, SH, and YK performed the measurements and helped in drafting the manuscript. MT performed the software development. All authors read and approved the final manuscript.

Acknowledgements
This research was supported by a Grant-in-Aid for Young Scientists (A) #17H04700 and Scientific Research (A) #939778.

Competing interests
The authors declare that they have no competing interests.

Ethics approval and consent to participate
Written informed consent was obtained from the patient for the publication of this report and any accompanying images.

Funding
This research was supported by a Grant-in-Aid for Young Scientists (A) #17H04700 and Scientific Research (A) #939778.

Fig. 8 Time taken for 100-m sprint for the SLIP model with varying Kleg and Dleg values. Horizontal axis: Kleg, vertical axis: Dleg of Sprinter 1E90. The 100-m sprint time with the RSP having the corresponding parameters is represented in a color scale ranging from blue to yellow. The color red indicates that the model falls down before crossing 100 m

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 31 January 2018 Accepted: 8 May 2018

References
1. Taylor GW, Hinton GE, Roweis ST (2007) Modeling human motion using binary latent variables. In: NIPS'06 Proceedings of the 19th international conference on neural information processing systems. pp 1345–1352
2. Safonova A, Hodgins JK, Polland NS (2004) Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces. ACM Trans Graph (TOG) 23:514–521
3. Yamane K, Nakamura Y (2008) Dynamics simulation of humanoid robots: forward dynamics, contact, and experiments. In: The 17th CISM-IFToMM symposium on robot design, dynamics, and control
4. Otten E (2003) Inverse and forward dynamics: models of multi-body systems. Philos Trans R Soc B 358:1492–1500
5. Motonaka K, Watanabe K, Maeyama S (2015) Kinodynamic motion planning for an X4-Flyer. In: Habib MK (ed) Handbook of research on advancements in robotics and mechatronics. IGI Global, Hershey, pp 455–474
6. Nolan L (2008) Carbon fiber prostheses and running in amputees: a review. Foot Ankle Surg 14:125–129
7. Grabowski AM, McGowan CP, McDermott WJ, Beale MT, Kram R, Herr HM (2010) Running-specific prostheses limit ground-force during sprinting. Biol Lett 6:201–204
8. Brüggemann GP, Arampatzis A, Emrich F, Potthast W (2009) Biomechanics of double transtibial amputee sprinting using dedicated sprinting prostheses. Sports Technol 1:220–227
9. Dumas R, Cheze L, Frossard L (2009) Loading applied on prosthetic knee of transfemoral amputee: comparison of inverse dynamics and direct measurements. Gait Posture 30:560–562
10. Rigney SM, Simmons A, Kark L (2015) Concurrent multibody and finite element analysis of the lower-limb during amputee running. IEEE EMBS Annu Int Conf 2015:2434–2437
11. Murai A, Endo Y, Tada M (2016) Anatomographic volumetric skin-musculoskeletal model and its kinematic deformation with surface-based SSD. IEEE Robot Autom Lett 1:1–7
12. Delp SL, Anderson FC, Arnold AS, Loan P, Habib A, John CT, Guendelman E, Thelen DG (2007) OpenSim: open-source software to create and analyze dynamic simulations of movement. IEEE Trans Biomed Eng 54:1940–1950
13.
Nakamura Y, Yamane K, Fujita Y, Suzuki I (2005) Somatosensory computation for man-machine interface from motion capture data and musculoskeletal human model. IEEE Trans Robot 21:58–66
14. Rasmussen J, Damsgaard M, Surma E, Christensen S, de Zee M, Vondrak V (2003) AnyBody—a software system for ergonomic optimization. In: Fifth world congress on structural and multidisciplinary optimization
15. Blickhan R (1989) The spring-mass model for running and hopping. J Biomech 22:1217–1227
16. Derrick TR, Caldwell GE, Hamill J (2000) Modeling the stiffness characteristics of the human body while running with various stride lengths. J Appl Biomech 16:36–51
17. Endo Y, Tada M, Mochimaru M (2014) Dhaiba: development of virtual ergonomic assessment system with human models. Digit Hum Model 2014:1–8
18. Funken J, Willwacher S, Böcker J, Müller R, Heinrich K, Potthast W (2014) Blade kinetics of a unilateral prosthetic athlete in curve sprinting. In: 32nd International conference of biomechanics in sports

work_gzap5knjfzfb5lzgkyw43rmece ---- Introduction: Digital Humanities as Dissonant

RESEARCH

Introduction: Digital Humanities as Dissonant

James O'Sullivan
University College Cork, IE
james.osullivan@ucc.ie

How to Cite: O'Sullivan, James. 2018. "Introduction: Digital Humanities as Dissonant." Digital Studies/Le champ numérique 8(1): 3, pp. 1–7, DOI: https://doi.org/10.16995/dscn.286
Published: 23 January 2018

The Digital Humanities Summer Institute gives students and scholars a chance to broaden their knowledge of the Digital Humanities within a feasible timeframe. The DHSI Colloquium was first founded by Diane Jakacki and Cara Leitch to act as a means of supporting graduates who wanted to be a part of such a gathering. The Colloquium has grown in recent years, to the point where it is now seen as an important part of the field's conference calendar for emerging and established scholars alike, but it remains a non-threatening space in which students, scholars, and practitioners can share their ideas.
This issue is testament to that diversity, as well as the strength of the research being presented at the Colloquium. It includes contributions from Scott B. Weingart and Nickoal Eichmann-Kalwara, Mary Borgo, William B. Kurtz, and John Barber. Weingart and Eichmann-Kalwara's "What's Under the Big Tent?: A Study of ADHO Conference Abstracts" portrays the discipline as one which is dominated by specific groups and practices. Using the Victorian Women Writers Project as a case-study, Mary Borgo treats models for the sustainable growth of TEI-based digital resources. William B. Kurtz details his experiences working on a digital initiative, in this instance, Founders Online: Early Access, and engages with the need for such projects to hold broader public appeal. John Barber's "Radio Nouspace: Sound, Radio, Digital Humanities" describes the curation of sound within the context of radio, and how such activity connects to creative digital scholarship. Together, these articles represent the purpose of facilitating a community comprised of divergent interests and perspectives, a community which can often be positively dissonant.

Keywords: DHSI; Digital Humanities Summer Institute; colloquium; colloque

The Digital Humanities Summer Institute (DHSI) offers students and scholars a chance to expand their knowledge of the digital humanities within a feasible timeframe. Diane Jakacki and Cara Leitch established the first DHSI Colloquium to support graduates who wanted to take part in such a gathering. In recent years, the Colloquium has grown to the point of now being considered an important conference on the calendar, not only for emerging scholars but also for established scholars in the field. The Colloquium nevertheless continues to be a non-threatening space where students, scholars, and practitioners can exchange their ideas. This issue is a testament to that diversity and to the quality of the research presented at the Colloquium. It includes the article "What's Under the Big Tent?: A Study of ADHO Conference Abstracts" by Scott B. Weingart and Nickoal Eichmann-Kalwara, which presents the digital humanities as a discipline dominated by specific groups and practices. Using the Victorian Women Writers Project as a case study, Mary Borgo treats models for the sustainable growth of TEI-based digital resources. William B. Kurtz details the experience he gained working on the digital initiative Founders Online: Early Access, as well as the importance of such projects holding appeal for a wider public. John Barber's "Radio Nouspace: Sound, Radio, Digital Humanities" concerns the curation of radio sound and the link between this activity and creative digital scholarship. All of these articles serve the aim of facilitating a community composed of divergent interests and perspectives that can often be truly dissonant.

Keywords: Digital Humanities; DHSI Special Issue; Digital Humanities Summer

Three years ago, Diane Jakacki passed control of the University of Victoria's DHSI Colloquium1 to Mary Galvin and me. Our task was to continue to develop what Diane, alongside Cara Leitch, had started in 2009. Initially, the Colloquium was intended as a means of giving graduates an opportunity to present their research to the burgeoning community of Digital Humanities scholars.
It was an opportunity for students to discuss their research with a large, international, and interdisciplinary audience, and furthermore, it enabled them to take advantage of institutional mechanisms designed to support participation at conferences. At the present phase in the development of the Digital Humanities, there is a marked emphasis on the acquisition of technical skills—emerging and established scholars alike are under intense pressure to develop their expertise in this domain. Here is not the most appropriate venue to discuss the positive and negative consequences of this reality, but it is the reality, one which is largely compelled by the demands of employers, funders, and the broader socio-cultural climates in which our institutes of education reside. Community-driven learning opportunities like the Digital Humanities Summer Institute are vital in such a context, helping us to learn, and further build our community, in a fashion that is suited to the hyper-demands of present-day academia. Truly wonderful is the scholar who can specialise in Medieval Studies while becoming equally adept in French, Python, statistics, and 3D modelling—perhaps I speak for myself, but this isn't most of us. Mastery, of the true kind, comes from a lifetime of repetition, of focusing on that one little thing and questioning it and yourself for decades on end. Hiring committees, promotion boards—they often expect the former, the academic Swiss Army knife2 capable of achieving excellence in disciplinary discord. Through its broad range of foundational and intensive programs, DHSI gives students and scholars a chance to broaden their knowledge within a feasible timeframe. DHSI does not make masters, but it does allow the curious to recognise the ways in which they might re-imagine their intellectual practice. Mastery can always be pursued in the aftermath of Victoria, but we should also be content to progress with a valuable measure of fluency—one doesn't need to be an adept programmer to interact with computer scientists, a certain level of proficiency is sufficient to enable the conversations that make meaning happen. This fluency, and the vibrant community that emerges out of its exchange, is what DHSI offers—the Colloquium was invented as a means of supporting graduates who wanted to be a part of such a gathering. In 2012, the Colloquium's leadership agreed that there was sufficient demand to broaden the scope of the event beyond graduate submissions. Concurrently, DHSI continued to attract an increasing number of students, resulting in significant growth for the Colloquium and its audience—it is not unusual for participants to find themselves addressing an auditorium housing several hundred of their peers. This growth has continued in recent years, and as the Colloquium remains an addendum to the course-based pedagogical mission of DHSI, a measure of invention has been required to satisfy the increased volume of submissions. In addition to more traditional presentations—though the current cap stands at 10 minutes—submissions are now welcome across a number of high-impact formats, such as lightning talks.

1 For more on the Colloquium, see the event's dedicated website, http://dhsicolloquium.org.
2 I am of course referencing last year's opening ceremony, wherein instructors are tasked with describing their courses. In keeping with tradition, offerings are outlined through something of a pun-off.
In 2014, Mary Galvin initiated the Colloquium's first poster session, which has become increasingly popular amongst participants. At DHSI 2016, we were proud to host a joint session with the concurrent Electronic Literature Organization Conference and Festival, while at DHSI 2017, posters and demonstrations were incorporated from the Society for the History of Authorship, Reading and Publishing's annual conference. Developing the Colloquium is about continuing to respond to the needs of the community, finding ways to assist scholars and practitioners at various junctures in their careers to disseminate their research, ideas, and projects. A book of abstracts has been circulated since 2015, while a select number of presentations from DHSI 2014 were transformed into the Colloquium's first special issue, published in Digital Humanities Quarterly.3 At the forthcoming gathering, our hope is to incorporate more audio-visual approaches to the capture of contributions. Such has been the growth of the Colloquium that last year saw a number of registrations from scholars not participating in courses. There was also a need to appoint the first Program Assistant, Lindsey Seatter, who has since succeeded Mary Galvin as co-chair. Mary committed much of her time to the development of this event, and, as with many of our field's instigators, our community is all the better for her efforts. Despite its growth, the ethos of the Colloquium remains consistent: it is a non-threatening space in which students, scholars, and practitioners can share their ideas. To this end, we operate a peer-review policy wherein all reviewers are instructed to offer collegial feedback—constructive criticism is a requirement, not a recommendation. Unlike some other conferences, we have the luxury of accepting submissions if they meet a minimum threshold in terms of scholarly value. Those submissions that are considered to have fallen short of this standard are finessed through reviewer feedback so that they improve to a point where they are ready to be presented. I say this is a luxury because all we have to do as organisers and reviewers is to improve and accept submissions—accommodating the rising number of presentations is a task that falls to Daniel Sondheim, Assistant Director of the Electronic Textual Cultures Lab at the University of Victoria, and Ray Siemens, Director of DHSI. Dan, Ray, and the University of Victoria are yet to deny any of the Colloquium's scheduling requirements, and the product of that facilitation is a diverse and inclusive final program. This issue is testament to that diversity, as well as the strength of the research being presented at the Colloquium. While there are only four papers, they each represent a significant contribution to the field, spanning a range of subjects that includes radio, metadata standards, Victorian women writers, and macro-level explorations of the wider Digital Humanities. One of the peculiarities of our realm's interdisciplinary nature is that community gatherings draw a seemingly discordant group of individuals—is there value in conferences and publications comprised of historians, linguists, programmers, archivists, artists, and statisticians? Is the DH mix simply too broad to have meaning?

3 O'Sullivan, James, Mary Galvin, and Diane Jakacki. 2016. DHSI Colloquium 2014 Special Issue, in Digital Humanities Quarterly 10.1. Web. http://www.digitalhumanities.org/dhq/vol/10/1/index.html
I was disappointed to see Literary and Linguistic Computing become Digital Scholarship in the Humanities for this very reason—I liked having a journal that was entirely focused on my particular interests, and wasn't overly enthused at the prospect of a publication that would meld an array of research on all kinds of everything. But, if the Digital Humanities are truly meant to be disruptive, then disciplinarity—which has a great many merits—should not be isolated from this process of disruption. In 2014, we stopped clustering Colloquium sessions into themes—the argument Mary advanced was that themes divided audiences, and as we aren't forced to schedule parallel sessions, we should follow in the footsteps of the discipline's pioneers and use the opportunity to encourage dissonance. Dissonance is at the very heart of the Digital Humanities, and we should embrace it, because dissonance is what gave us computational approaches to literary criticism, it is what compelled us to try and think beyond the codex, and most importantly, it is what shows us the failings in our techniques and approaches to scholarship. The Colloquium, and this special issue,4 like other journals and gatherings in this field, seeks to embrace dissonance as a valuable means of producing knowledge through the exchange of ideas and expertise that seemingly lack harmony, while simultaneously maintaining the utmost respect for the principles of differing disciplines. Such collaborative principles are what DHSI is founded on, and its Colloquium is merely an opportunity to encourage curiosity, and breed inter- and transdisciplinary creativity. In this respect, it is perhaps fitting that this issue includes Scott B. Weingart's and Nickoal Eichmann-Kalwara's "What's Under the Big Tent?: A Study of ADHO Conference Abstracts." While one can believe in dissonance, diversity, and interdisciplinarity, the reality does not always reflect the mantra. Quantifying submissions to our field's flagship Digital Humanities conference, Weingart and Eichmann-Kalwara portray the discipline as one which is dominated by specific groups and practices. These findings, they argue, are at odds with anecdotal experiences, and they suggest a number of ways through which we might respond to such failings. Using the Victorian Women Writers Project as a case-study, Mary Borgo treats models for the sustainable growth of TEI-based digital resources. Discussing some of the most salient issues in the development of a digital edition—technical barriers, student involvement, ethics—this essay demonstrates the value of the Colloquium through the dissemination of those lessons that have been learned by its author as a consequence of her involvement in this project. William B. Kurtz also details his experiences working on a digital initiative, in this instance, Founders Online: Early Access. Kurtz's examination is more specific to large-scale Digital Humanities work, and engages with the need for such projects to hold broader public appeal.
John Barber's "Radio Nouspace: Sound, Radio, Digital Humanities" is something of a departure from the other contributions, in that it describes the curation of sound within the context of radio, and how such activity connects to creative digital scholarship, reflecting on digital storytelling, sound-based narrative, and practice-based research. In isolation, each of these essays offers insight from which interested readers will benefit—together, they represent the purpose of facilitating a community comprised of divergent interests and perspectives.

4 I would like to thank a number of editors from Digital Studies/Le champ numérique, particularly Daniel O'Donnell, Paul Esau, Vanja Spiric, and Virgil Grandfield for their tireless efforts in bringing this special issue to fruition.

Competing Interests
The author has no competing interests to declare.

How to cite this article: O'Sullivan, James. 2018. "Introduction: Digital Humanities as Dissonant." Digital Studies/Le champ numérique 8(1): 3, pp. 1–7, DOI: https://doi.org/10.16995/dscn.286

Submitted: 04 November 2017 Accepted: 04 November 2017 Published: 23 January 2018

Copyright: © 2018 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.

Digital Studies/Le champ numérique is a peer-reviewed open access journal published by Open Library of Humanities.

work_h372vnzqpjh7zdvdjbzcwakgja ---- Simulation-Based Evaluation of Ease of Wayfinding Using Digital Human and As-Is Environment Models

International Journal of Geo-Information
Article

Simulation-Based Evaluation of Ease of Wayfinding Using Digital Human and As-Is Environment Models

Tsubasa Maruyama 1,*, Satoshi Kanai 2, Hiroaki Date 2 and Mitsunori Tada 1
1 National Institute of Advanced Industrial Science and Technology, Tokyo 135-0064, Japan; m.tada@aist.go.jp
2 Graduate School of Information Science and Technology, Hokkaido University, Sapporo 060-0814, Japan; kanai@ssi.ist.hokudai.ac.jp (S.K.); hdate@ssi.ist.hokudai.ac.jp (H.D.)
* Correspondence: tbs-maruyama@aist.go.jp; Tel.: +81-3-3599-8201

Received: 30 June 2017; Accepted: 24 August 2017; Published: 26 August 2017

Abstract: As recommended by the international standards, ISO 21542, ease of wayfinding must be ensured by installing signage at all key decision points on walkways such as forks because signage greatly influences the way in which people unfamiliar with an environment navigate through it. Therefore, we aimed to develop a new system for evaluating the ease of wayfinding, which could detect spots that cause disorientation, i.e., "disorientation spots", based on simulated three-dimensional (3D) interactions between wayfinding behaviors and signage location, visibility, legibility, noticeability, and continuity. First, an environment model reflecting detailed 3D geometry and textures of the environment, i.e., "as-is environment model", is generated automatically using 3D laser-scanning and structure-from-motion (SfM). Then, a set of signage entities is created by the user.
Thereafter, a 3D wayfinding simulation is performed in the as-is environment model using a digital human model (DHM), and disorientation spots are detected. The proposed system was tested in a virtual maze and a real two-story indoor environment. It was further validated through a comparison of the disorientation spots detected by the simulation with those of six young subjects. The comparison results revealed that the proposed system could detect disorientation spots, where the subjects lost their way, in the test environment.

Keywords: wayfinding; digital human model; signage; laser-scanning; structure-from-motion; accessibility evaluation

1. Introduction

It is increasingly important in our rapidly aging society [1] to perform accessibility evaluations for enhancing the ease and safety of access to indoor and outdoor environments for all people, including the elderly and the disabled. Under international standards [2], "accessibility" is defined as "provision of buildings or parts of buildings for people, regardless of disability, age or gender, to be able to gain access to them, into them, to use them and exit from them." As recommended in the ISO/IEC Guide 71 [3], accessibility must be assessed considering both the physical and cognitive abilities of individuals. From the physical viewpoint, for example, tripping risks in an environment [4] must be assessed to ensure the environment is safe to walk in, as conducted in our previous study [5]. By contrast, from the cognitive aspect, ease of wayfinding [6] must be assessed to enable people to gain access to destinations in unfamiliar environments.

Wayfinding is a basic cognitive response of people trying to find their way to destinations in an unfamiliar environment based on perceived information and their own background knowledge [7]. Visual signage influences the way in which people unfamiliar with an indoor environment navigate through it [8]. As shown in Table 1, visual signage can be classified into positional, directional, routing, and identification signage depending on the type of navigation information on the signage. As recommended in the guidelines [2], these four types of signage must be arranged appropriately at key decision points considering the relationship between the navigation information on signage and the path structure of the environment. In addition, as mentioned in the literature [9], ease of wayfinding must be evaluated considering not only signage continuity, visibility, and legibility but also signage noticeability.

Table 1. Signage type and navigation information.

Signage Type           | Navigation Information
Positional signage     | Next goal position to be reached to arrive at a destination (e.g., map)
Directional signage    | Next walking direction to take to reach a destination (e.g., right or left)
Routing signage        | Walking route to be taken to reach a destination (e.g., route drawn on map or indicated by textual information)
Identification signage | Name of current place

Currently, ease of wayfinding is evaluated using four approaches: real field testing [10], virtual field testing [11,12], CAD model analysis [13], and wayfinding simulation [14–20].
In real field tests [10], a certain number of human subjects are asked to perform experimental wayfinding tasks in a real environment. By contrast, in virtual field tests [11,12], subjects are asked to perform wayfinding tasks in a virtual environment using virtual reality devices. In these real or virtual field tests [10–12], ease of wayfinding is evaluated by analyzing subjects' responses to a questionnaire and their wayfinding results, e.g., walking route, gaze duration, and gaze direction. However, in these tests, prolonged wayfinding experiments involving a variety of wayfinding tasks must be conducted by various human subjects of different ages, genders, body dimensions, and visual capabilities. Thus, field tests are not necessarily efficient and low-cost approaches.

In CAD model analysis [13], signage continuity is evaluated by analyzing the relationships among various pieces of user-specified navigation information indicated by signage. However, this approach cannot evaluate ease of wayfinding in terms of signage visibility, legibility, and noticeability because three-dimensional (3D) interactions between individuals and signage are not considered.

Recently, a variety of wayfinding simulations has been proposed [14–20]. Such simulation-based approaches have made it possible to evaluate the ease of wayfinding by simulating the wayfinding of the pedestrian model. However, these simulations consider only a part of the signage factors such as signage location, continuity, visibility, legibility, and noticeability. In addition, these simulations involve only simplified as-planned environment models that do not model the detailed environmental geometry, including obstacles on the walkway, and realistic environmental textures. For reliable evaluation, an environment model must be created to reflect the as-is situation of the environment because detailed 3D geometry and realistic textures affect the wayfinding of individuals [17,21].

Given the above background, the purpose of this study is to develop a new system for evaluating ease of wayfinding. The system makes it possible to detect spots that cause disorientation, i.e., "disorientation spots", based on simulated 3D interactions among realistic wayfinding behaviors, an as-is environment model, and a realistic signage system. In this study, the as-is environment model represents an environment model that reflects a given environment as-is, i.e., detailed 3D geometry including obstacles and realistic textures. A schematic of the proposed system is shown in Figure 1. To achieve this goal, we draw on the results of our previous studies, in which algorithms of as-is environment modeling [22], walking simulation of a digital human model (DHM) in that environment model [23], and basic wayfinding simulation of the DHM [24] were developed.

As shown in Figure 1, first, the as-is environment model consisting of the walk surface points WS, navigation graph GN, and textured 3D environmental geometry GI is automatically generated from 3D laser-scanned point clouds [22] and a set of photographs of the environment [24]. Then, a set of signage entities is created by the user by manually assigning signage information. Then, a wayfinding simulation scenario is specified manually by the user. Thereafter, the DHM commences its wayfinding in accordance with the navigation information indicated by the arranged signage, while estimating signage visibility, noticeability, and legibility based on imitated visual perception.
As a result, disorientation spots are detected.

Figure 1. Overview of system for evaluating ease of wayfinding.

The proposed system is demonstrated in a virtual maze and a real two-story indoor environment. The system is further validated by comparing the disorientation spots detected by the simulation with those obtained in a test involving six young subjects in the two-story indoor environment. The rest of this paper is organized as follows. Section 2 introduces the related literature and clarifies the contributions of this study. Section 3 presents a brief introduction of the previously developed as-is environment modeling system [22,24]. In Section 4, an overview of signage entity creation is described. In Section 5, the algorithm for the simulation in which the DHM performs wayfinding is introduced. Finally, in Section 6, the system is demonstrated and validated.

2. Related Work

This study is related primarily to wayfinding simulation research. A variety of simulation algorithms aiming to evaluate the ease of wayfinding have been studied. Chen et al. [14] proposed a wayfinding simulation algorithm based on architectural information such as egress width, height, contrast intensity, and room illumination in a 3D as-planned environment model. Furthermore, Morrow et al. [15] proposed an environmental visibility evaluation system using a 3D pedestrian model. In that study, environmental visibilities from pedestrian models were evaluated to assist facility managers in designing architectural layout and signage placement. However, these studies [14,15] are not applicable to the evaluation of ease of wayfinding based on a signage system because the pedestrian models used in them were not modeled to incorporate the surrounding signage in the simulation. Hajibabai et al. [16] proposed a wayfinding simulation using directional signage in an as-planned 2D environment model for emergency evacuation during a fire.
The 2D pedestrian model used in the study could make decisions about its walking route based on perceived signage and fire propagation. However, in that study, signage visibility and legibility were estimated by oversimplified human visual perception, and signage noticeability was not considered. In addition, performing a precise 3D wayfinding simulation using a 3D as-is environment model within their framework is infeasible.

Recently, signage-based 3D wayfinding simulation has been advancing. Brunnhuber et al. [17] and Becker-Asano et al. [18] proposed schemes for wayfinding simulation using directional and identification signage in a 3D as-planned environment model. In these simulations, the next walking direction of the pedestrian models was determined autonomously based on the navigation information on the perceived signage. Signage perception was realized by estimating signage visibility and legibility based on the imitated visual perception of the pedestrian model. However, signage noticeability was not considered in these simulations, although it has a significant effect on the wayfinding of people in unfamiliar environments [9].

More recently, advanced approaches for estimating suitable signage locations have been proposed. Zhang et al. [19] proposed a system for planning the placement of directional signage for evaluation. In their system, a minimum number of signage and appropriate signage locations were determined automatically by simulating interactions between the pedestrian models and the signage system. In addition, Motamedi et al. [20] proposed a system for optimizing the arrangement of directional and identification signage in building information model (BIM)-enabled environments. Their system estimated optimal signage arrangement based on signage visibility and legibility for a 3D pedestrian model walking in a BIM-based environment model. However, as in the case of other previous simulations, signage noticeability was not considered in these studies [19,20]. In addition, the system [19] was validated with an oversimplified environment model imitating a large rectangular space having an egress, and the feasibility of its use in realistic and complex as-is environments was not validated. By contrast, in the system [20], the walking route of the pedestrian model was not changed based on the navigation information indicated by perceived signage, so evaluation based on signage continuity was basically infeasible.

Furthermore, these simulations [16–20] treated only one or two types of signage: directional and/or identification. Thus, these simulations cannot be applied to actual signage systems including all signage types in Table 1. Moreover, with the exception of the simulation proposed by Motamedi et al. [20], simplified as-planned environment models were used in the previous wayfinding simulations. Therefore, to realize a reliable evaluation of ease of wayfinding, simulation users and/or facility managers are urged to create detailed and realistic as-planned environment models, including small obstacles and environmental textures based on measurements of the environment.

Unlike the simulations developed in these previous studies [14–20], the proposed system can evaluate the ease of wayfinding by simulating 3D interactions among realistic wayfinding behaviors, an as-is environment model, and a realistic signage system. Specifically, the contributions of the present study are as follows:
1. The DHM can make a decision based on the surrounding signage perceived by its imitated visual perception in consideration of signage location, continuity, visibility, noticeability, and legibility.
2. The as-is environment model, including detailed environmental geometry and realistic textures, can be generated automatically using 3D laser-scanning and SfM.
3. The proposed system can simulate the wayfinding of the DHM by discriminating among four types of signage, namely, positional, directional, routing, and identification signage.
By contrast, represents a 3D mesh model with high-quality textures, and it is used to estimate signage visibility and noticeability during simulation. can be created automatically using SfM with a set of photographs of the environment [24]. Detailed algorithms and demonstrations are given in our previous studies [22–24]. (a) (b) (c) Figure 2. 3D as-is environment model: (a) Walk surface points ; (b) navigation graph ; (c) textured environmental geometry . 4. Creation of Signage Entity In the proposed scheme, the signage system is modeled as a set of signage entities = { }. Each signage entity = [ , ] consists of a 3D textured mesh model of the signage and a set of signage information entities = { , } ( ∈ [1, ] ), where represents the number of signage information items included in . When modeling the existing signage, is constructed using SfM; otherwise, is created using 3D CAD software. , is created by manually assigning the geometric, navigation, and legibility properties in Table 2. The details are given below. 4.1. Geometric Property The geometric property includes the description region , center position , unit normal vector , width , and transformation matrix T . As shown in Figure 3a, = [ , ] consists of two diagonal points of the rectangular description region on , in which the signage information is written. , , and are estimated from . T represents a transformation matrix from the local coordinate system of , to the coordinate system of , where is defined to satisfy three conditions: (1) the origin of is located on , (2) y-axis of is aligned with , and (3) z-axis of is aligned with the z-axis of . Under this definition, T is calculated automatically from and . Figure 2. 3D as-is environment model: (a) Walk surface points WS; (b) navigation graph GN ; (c) textured environmental geometry GI . 4. Creation of Signage Entity In the proposed scheme, the signage system is modeled as a set of signage entities S = {Si}. Each signage entity Si = [Gi, Ii] consists of a 3D textured mesh model Gi of the signage and a set of signage information entities Ii = {Ii,j} (j ∈ [1, Ni]), where Ni represents the number of signage information items included in Si. When modeling the existing signage, Gi is constructed using SfM; otherwise, Gi is created using 3D CAD software. Ii,j is created by manually assigning the geometric, navigation, and legibility properties in Table 2. The details are given below. 4.1. Geometric Property The geometric property includes the description region Rg, center position pg, unit normal vector ng, width wg, and transformation matrix TG I . As shown in Figure 3a, Rg = [pto p, pbottom] consists of two diagonal points of the rectangular description region on Gi, in which the signage information is written. pg, ng, and wg are estimated from Rg. TG I represents a transformation matrix from the local coordinate system XI of Ii,j to the coordinate system XG of Gi, where XI is defined to satisfy three conditions: (1) the origin of XI is located on pg, (2) y-axis of XI is aligned with ng, and (3) z-axis of XI is aligned with the z-axis of XG . Under this definition, TG I is calculated automatically from pg and ng. ISPRS Int. J. Geo-Inf. 2017, 6, 267 6 of 22 Table 2. Signage information entity. 
Table 2. Signage information entity.

Property            | Attribute                                                                | Assignment Method
Geometric property  | Description region Rg = [ptop, pbottom]                                  | Assigned by user by picking two diagonal points
                    | Center position pg; unit normal vector ng; width wg                     | Estimated from Rg
                    | Transformation matrix TGI                                                | Estimated from pg and ng
Navigation property | Type of signage Tn ∈ {'positional', 'directional', 'routing', 'identification'} | Assigned by user based on the signage design
                    | Name of indicated place Dn; navigation information NI                   | Assigned by user based on the signage design
Legibility property | Maximum viewing distance dl                                              | Measured from human subjects
                    | Center point pl and radius rl of 3D VCA                                  | Estimated from dl

Figure 3. Overview of signage information: (a) Geometric property; (b) navigation property; (c) legibility property.

4.2. Navigation Property

The navigation property includes the type of signage Tn, name of indicated place Dn, and navigation information NI. As listed and shown in Table 3 and Figure 3b, respectively, NI is assigned by the user in accordance with Tn. The user must specify a next goal position pn, a next walking direction dn, and a set of passing points PN for positional, directional, and routing signage, respectively. pn and PN are specified w.r.t. the coordinate system XW of the textured environmental geometry GI. By contrast, dn is specified w.r.t. XI of Ii,j.

Table 3. Assignment of navigation information depending on signage type.

Signage Type           | Navigation Information NI to Achieve a Destination | Referenced Coordinate System
Positional signage     | Next goal position pn                              | XW of GI
Directional signage    | Next walking direction dn                          | XI of Ii,j
Routing signage        | A set of passing points PN = {pk}                  | XW of GI
Identification signage | Name of current place Cn                           | None

4.3. Legibility Property

The legibility property includes the center point pl w.r.t. XI of Ii,j and the radius rl of the 3D visibility catchment area (VCA). As shown in Figure 3c, the 3D VCA of a signage represents a sphere in which people can recognize the information written on the signage. The VCA was defined originally as a 2D circle by Fillipidis et al. [25] and Xie et al. [26]. In this study, the 3D VCA is calculated such that the great circle of the sphere on the horizontal plane corresponds to the 2D VCA circle proposed by Xie et al. [26].
Specifically, pl and rl are calculated using the following equation:

r_l = \frac{w_g}{2 \sin \varphi_l}, \qquad p_l = p_g + n_g \left( \frac{w_g}{2 \tan \varphi_l} \right), \qquad \varphi_l = \tan^{-1}\left( \frac{w_g}{2 d_l} \right), \quad (1)

where dl represents the maximum viewing distance between the signage and a subject standing at a place in which the subject can recognize the information on the signage. By measuring dl from the subjects, the legible space of the signage is calculated as the 3D VCA using Equation (1).

5. System for Evaluation of Ease of Wayfinding

As shown in Figure 1, the wayfinding simulation using the DHM is performed in accordance with the user-specified wayfinding scenario, including DHM properties H = [M, θH, θV, nt], start position ps, initial walking direction dI, name of destination D, and signage locations and orientations Ts = {Ti}, where M, θH, θV, nt, and Ti represent motion-capture (MoCap) data for flat walking obtained from the gait database [27], the horizontal and vertical angles of the view frustum, the threshold value of signage noticeability, and the transformation matrix from XG to XW, respectively. Before the simulation, the locations and orientations of each signage entity Si ∈ S are determined by assigning Ti ∈ Ts. Then, a DHM having the same body dimensions as the subject of M is generated. As shown in Figure 4, the DHM has 41 degrees of freedom and a link mechanism corresponding to that of M. The imitated eye position peye of the DHM is estimated as the midpoint between the top of the head and the neck. Finally, the wayfinding simulation is performed by repeating the algorithms described in the following subsections.
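Looking back at Equation (1) of Section 4.3 before entering the simulation loop, the VCA construction can be written compactly as follows. This is a minimal sketch under the assumption that wg and dl share the same length unit and ng is a unit vector; the function name is ours.

```python
import math

def vca_sphere(p_g, n_g, w_g, d_l):
    """3D visibility catchment area of Eq. (1): returns (p_l, r_l).
    p_g: center of the description region; n_g: its unit normal;
    w_g: signage width; d_l: maximum viewing distance from subjects."""
    phi_l = math.atan2(w_g, 2.0 * d_l)       # phi_l = atan(w_g / (2 d_l))
    r_l = w_g / (2.0 * math.sin(phi_l))       # sphere radius
    offset = w_g / (2.0 * math.tan(phi_l))    # distance from p_g along n_g
    p_l = tuple(p + offset * n for p, n in zip(p_g, n_g))
    return p_l, r_l
```

Note that the offset wg / (2 tan φl) simplifies to dl, so the sphere center lies exactly dl in front of the description region along ng.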
5. System for Evaluation of Ease of Wayfinding

As shown in Figure 1, the wayfinding simulation using the DHM is performed in accordance with the user-specified wayfinding scenario, including the DHM properties H = [M, θH, θV, nt], start position ps, initial walking direction dI, name of destination D, and signage locations and orientations Ts = {Ti}, where M, θH, θV, nt, and Ti represent the motion-capture (MoCap) data for flat walking obtained from the gait database [27], the horizontal and vertical angles of the view frustum, the threshold value of signage noticeability, and the transformation matrix from XG to XW, respectively. Before the simulation, the locations and the orientations of each signage entity Si ∈ S are determined by assigning Ti ∈ Ts. Then, a DHM having the same body dimensions as the subject of M is generated. As shown in Figure 4, the DHM has 41 degrees of freedom and a link mechanism corresponding to that of M. The imitated eye position peye of the DHM is estimated as the midpoint between the top of the head and the neck. Finally, the wayfinding simulation is performed by repeating the algorithms described in the following subsections.

Figure 4. Link mechanism of DHM.

5.1. Signage Perception Based on Imitated Visual Perception

In the proposed system, signage visibility, noticeability, and legibility are estimated to determine whether a signage is found and its information is recognized by the DHM. The details are described in the following subsections.

5.1.1. Signage Visibility Estimation

Signage visibility represents whether a signage is included in the view frustum of the DHM defined by θH and θV. As shown in Figure 5, it is estimated simply by scanning the eyesight of the DHM. First, the eyesight of the DHM is obtained using OpenGL by rendering an image from the camera model located at the DHM eye position peye. At the same time, as shown in the figure, the textured 3D environmental geometry GI and the textured 3D mesh model Gi of each signage Si ∈ S are rendered with a single color instead of their original textures. Finally, if the color of Gi appears in the rendered image, Si is considered "visible" signage and is inserted into a set of visible signage entities Svis = {Sk}.

Figure 5. Signage visibility estimation: (a) View frustum of DHM; (b) image rendered using OpenGL.
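The actual system performs this test on the GPU with OpenGL; the sketch below only illustrates the flat-color (ID-buffer) idea on an already-rendered image, with a numpy array standing in for the frame buffer. The integer-ID encoding and all names are assumptions.

```python
import numpy as np

def visible_signage(id_buffer, sign_ids):
    """Return the IDs of signage whose flat colour appears in the rendered
    image (Section 5.1.1). id_buffer is an H x W integer array in which each
    pixel stores the ID of the object rendered there (0 = environment
    geometry G_I, k > 0 = signage S_k)."""
    present = set(np.unique(id_buffer).tolist())
    return {k for k in sign_ids if k in present}

# Toy frame buffer: signage 2 occupies a small patch of the DHM's eyesight.
buf = np.zeros((120, 160), dtype=np.int32)
buf[40:60, 70:90] = 2
S_vis = visible_signage(buf, sign_ids={1, 2, 3})   # -> {2}
```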
5.1.2. Signage Noticeability Estimation

As people overlook objects in their eyesight, it is not always true that the DHM can find a signage Si when Si is visible (Si ∈ Svis). Therefore, the signage noticeability, representing whether the DHM can notice Si ∈ Svis, must be estimated.

In the proposed system, signage noticeability is estimated using the saliency estimation algorithm proposed by Itti et al. [28], which is based on the visual search mechanism of real humans [29]. In this algorithm, a Gaussian pyramid is first generated from an image rendered by the camera model at peye. Then, feature maps representing contrasts of intensity, color differences, and orientations are obtained from each image. By integrating and normalizing the feature maps, a saliency map Ms = {m(x, y)} is generated, where m(x, y) ∈ [0, 1] represents the degree of saliency at pixel (x, y). In the map, m(x, y) increases at pixels in which the contrasts of intensity, color differences, and orientations are higher than those of other pixels. Finally, as shown in Figure 6, the proposed system estimates the noticeability ni of visible signage Si ∈ Svis using the following equation:

\[
n_i = \max_{(x, y) \in P_i} m(x, y),
\tag{2}
\]

where m(x, y) and Pi represent the degree of saliency at pixel (x, y) in Ms and the set of pixels in which the signage geometry Gi is rendered, respectively. If ni is greater than the noticeability threshold nt of the user-specified wayfinding scenario, Si is considered "found" signage and is inserted into a set of found signage entities Sfound = {Sk} (Sfound ⊆ Svis).

Figure 6. Signage noticeability estimation.
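Given a saliency map Ms from the Itti et al. [28] pipeline (not reproduced here), Equation (2) reduces to a masked maximum followed by a threshold test against nt. The sketch below assumes the map and per-sign pixel masks are available as numpy arrays; the names are illustrative.

```python
import numpy as np

def noticeability(saliency_map, sign_mask):
    """n_i of Equation (2): the maximum saliency m(x, y) over the pixel set
    P_i in which the signage geometry G_i is rendered."""
    return float(saliency_map[sign_mask].max()) if sign_mask.any() else 0.0

def found_signage(S_vis, saliency_map, masks, n_t):
    """Filter S_vis down to S_found by thresholding n_i against n_t."""
    return {k for k in S_vis if noticeability(saliency_map, masks[k]) > n_t}

# Toy example with n_t = 0.3, the value used in the scenario of Table 4.
sal = np.random.default_rng(0).random((120, 160))     # stand-in saliency map
mask2 = np.zeros((120, 160), dtype=bool)
mask2[40:60, 70:90] = True                            # pixels P_2 of signage 2
S_found = found_signage({2}, sal, {2: mask2}, n_t=0.3)
```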
5.1.3. Signage Legibility Estimation

Signage legibility represents whether the DHM can recognize the signage information of found signage Si ∈ Sfound, i.e., whether the DHM can read the textual or graphical information written on the signage. It is estimated using the 3D VCA of the signage information Ii,j of Si. If peye is included in the 3D VCA of Ii,j, Ii,j is considered "recognized" signage information. In the proposed system, it is assumed that the DHM can correctly interpret Ii,j only when Si is found (i.e., Si ∈ Sfound) and Ii,j is recognized. Note that the signage noticeability ni does not influence the signage legibility estimation.

5.2. Wayfinding Decision-Making Based on Signage Perception

Based on the estimated signage visibility, noticeability, and legibility, the wayfinding state of the DHM is changed dynamically in accordance with the state transition chart shown in Figure 7a. As shown in the figure, when the simulation is performed, the DHM is set to start walking in the direction dI (state SW1 in Figure 7a). Then, as shown in Figure 7b, when a signage Si is found by the DHM, i.e., Si is inserted into Sfound, the DHM is set to walk toward the center position pg of Ii,j of Si (state SW2) to read the information on Si. Thereafter, other signage Sj does not influence the state transition until the state is changed to the look-around state (SW3), even if Sj is found by the DHM. When Ii,j of Si becomes legible, the name of the indicated place Dn of Ii,j is compared with the name of the destination D of the wayfinding scenario. If Dn ≠ D, the state is changed to SW3 to find other signage related to D; otherwise, the state is changed in accordance with the type of recognized signage information Tn. If Tn represents positional, directional, or routing signage, the state is changed to the motion planning state (SW4). By contrast, if Tn represents identification signage, the state is changed to the success state (SW5). In this state, the simulation is deemed complete because this state is the final state.

During the wayfinding simulation, the DHM basically repeats the states SW2, SW4, SW6, and SW3. As shown in Figure 7c, when the DHM recognizes Ii,j, it is set to walk toward the temporal destination of the DHM, i.e., the subgoal position psub (SW4 and SW6). Then, as shown in Figure 7d, when the DHM arrives at psub, it is asked to observe the surrounding environment (i.e., look around) by rotating the neck joint horizontally within its range of motion (SW3). When the DHM finds new signage in this state, the state changes back to SW2. By contrast, when the DHM cannot find any signage, the current DHM position is treated as a "disorientation spot" (SW7). The state SW7 is considered the failed state. Note that the state can be changed to SW8 from SW7 only when Tn represents directional signage, as described in Section 5.3.1.

Figure 7. Wayfinding decision-making based on signage perception: (a) Wayfinding state transition; (b) walking toward signage; (c) walking toward subgoal position; (d) look-around.
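The following is a condensed sketch of the SW1–SW7 transitions of Figure 7a; only the decisions stated in the text above are encoded (SW8 relies on the fork queue of Section 5.3.1 and is omitted), and the state and function names are assumptions for illustration.

```python
from enum import Enum, auto

class SW(Enum):
    START_WALKING = auto()    # SW1: walk in the initial direction d_I
    WALK_TO_SIGN = auto()     # SW2: approach the found signage
    LOOK_AROUND = auto()      # SW3: rotate the neck and search for signage
    MOTION_PLANNING = auto()  # SW4: plan a path from the recognised N_I
    SUCCESS = auto()          # SW5: final state, destination identified
    WALK_TO_SUBGOAL = auto()  # SW6: walk toward p_sub
    DISORIENTED = auto()      # SW7: failed state, disorientation spot

def on_information_recognized(T_n, D_n, D):
    """Transition taken when I_{i,j} of the approached sign becomes legible."""
    if D_n != D:
        return SW.LOOK_AROUND          # sign is unrelated to the destination
    if T_n == 'identification':
        return SW.SUCCESS              # the DHM has reached the destination
    return SW.MOTION_PLANNING          # positional / directional / routing

def on_look_around(found_new_signage):
    """Transition taken after observing the surroundings at a subgoal."""
    return SW.WALK_TO_SIGN if found_new_signage else SW.DISORIENTED
```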
5.3. Signage-Based Motion Planning

5.3.1. Updating Subgoal Position of DHM

In the signage-based motion planning state (SW4), first, the subgoal position psub is determined automatically depending on the type of recognized signage information Tn and its navigation information NI. When Tn = 'positional', psub is determined as the next goal position pn of NI, to make the DHM walk toward the location indicated by the recognized signage information Ii,j. When Tn = 'directional', as shown in Figure 8, a queue of fork points F = {pm} is extracted by the following steps:

(1) A graph node vc (vc ∈ V) just under the pelvis position pp of the DHM is extracted from the navigation graph GN. Then, vc is inserted into a set of graph nodes V′P, where V′P represents the graph nodes on a feasible walking path when the DHM walks in accordance with the next walking direction dn indicated by Ii,j.
(2) vc and dn of Ii,j are assigned to the variables vt and dt, respectively.
(3) A graph node vp located in the direction of dt is extracted using the following equation:

\[
p = \operatorname*{arg\,max}_{k \in N_t} d_k \cdot d_t, \qquad
d_k = \frac{t(v_k) - t(v_t)}{\lVert t(v_k) - t(v_t) \rVert},
\tag{3}
\]

where Nt represents the set of indices of graph nodes vk (vk ∉ V′P) connected to vt by a graph edge. Using this equation, vp is determined as the graph node with the minimum angle difference between dt and the graph edge connecting vk and vt.
(4) If Nt ≠ ∅, vp is inserted into V′P, and vp and dk are assigned to vt and dt, respectively.
(5) If |Nt| ≥ 2 ∨ Nt = ∅, t(vp) is pushed into F because t(vp) is considered a center position at a fork or at the terminal of the walkway.
(6) Steps (3)–(5) are repeated until Nt = ∅, i.e., until a graph node representing the terminal of the walkway is found.

When the wayfinding state is changed to SW4 or SW8 in Figure 7a, the first fork point is taken from F and assigned to psub. This algorithm enables the proposed system to detect multiple disorientation spots, i.e., fork points with no visible and noticeable signage, after perceiving directional signage. A code sketch of steps (1)–(6) is given below, after Figure 8.

Figure 8. Extraction of fork points from navigation graph.

When Tn = 'routing', psub is determined as the last element of the set of passing points PN of NI indicated by Ii,j. Then, the walking path VP of the DHM is estimated such that it passes the graph nodes at pk ∈ PN, as described in Section 5.3.2.
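Below is a sketch of steps (1)–(6), assuming the navigation graph GN is an adjacency dictionary and t maps each node to a numpy position. Terminal handling is slightly simplified (the last visited node's position is queued at the end of the walkway), and all names are assumptions.

```python
import numpy as np

def extract_fork_points(G, t, v_c, d_n):
    """Steps (1)-(6) of Section 5.3.1: trace the navigation graph from the
    node v_c under the pelvis along the indicated direction d_n, queuing
    fork and terminal positions. G: adjacency dict {node: set of nodes};
    t: dict node -> np.ndarray position; d_n: unit walking direction."""
    F = []                                  # queue of fork points
    V_P = {v_c}                             # V'_P, nodes on the feasible path
    v_t, d_t = v_c, np.asarray(d_n, dtype=float)
    while True:
        N_t = [v for v in G[v_t] if v not in V_P]      # unvisited neighbours
        if not N_t:                                     # terminal of the walkway
            F.append(t[v_t])
            return F
        # Equation (3): neighbour whose edge direction is closest to d_t.
        dirs = {v: (t[v] - t[v_t]) / np.linalg.norm(t[v] - t[v_t]) for v in N_t}
        v_p = max(N_t, key=lambda v: float(dirs[v] @ d_t))
        if len(N_t) >= 2:
            F.append(t[v_p])                            # centre of a fork
        V_P.add(v_p)
        v_t, d_t = v_p, dirs[v_p]
```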
5.3.2. Walking Path Selection and Walking Trajectory Generation

As shown in Figure 9, after determining the subgoal position psub, the walking path VP = {vi} (vi ∈ V) of the DHM is determined automatically by the following function:

\[
V_P = \mathrm{Path}(p_a, p_b),
\tag{4}
\]

where Path(pa, pb) represents a function to select a set of graph nodes VP between the two nodes located at pa and pb using the Dijkstra method on GN. When the wayfinding state is changed to SW2 with the visible signage Si ∈ Svis, t(vc) and pg are assigned to pa and pb, where vc and pg represent the graph node just under the DHM pelvis position pp and the center position of Ii,j of Si, respectively. By contrast, when the state is changed to SW4, pa and pb are determined depending on the type of recognized signage information Tn. When Tn = 'positional' or Tn = 'directional', t(vc) and psub are assigned to pa and pb, respectively. By contrast, when Tn = 'routing', VP is determined as

\[
V_P = \bigcup_{k=0}^{k<|P_N|} \mathrm{Path}(p_k, p_{k+1}),
\]

where pk ∈ PN is a passing point representing the walking route indicated by NI of Ii,j. After determining VP, the walking trajectory VT = ⟨pi⟩ is generated automatically by our previously developed optimization algorithm [23], where VT represents a sequence of sparsely discretized target pelvis positions of the DHM. This optimization algorithm is designed to make VT more natural and smooth, while avoiding contact with walls. The details are described in [23].

Figure 9. Examples of walking path selection and walking trajectory generation.
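One possible shape for Path(pa, pb) of Equation (4) and the routing-case union is sketched below, using a standard-library Dijkstra over the navigation graph with Euclidean edge weights. The nearest-node lookup and all names are assumptions (the paper does not specify them), the graph is assumed connected, and node labels are assumed orderable.

```python
import heapq
import numpy as np

def nearest_node(t, p):
    """Graph node whose position is closest to the point p."""
    return min(t, key=lambda v: float(np.linalg.norm(t[v] - np.asarray(p))))

def path(G, t, p_a, p_b):
    """V_P = Path(p_a, p_b) of Equation (4): Dijkstra on G_N between the
    nodes nearest to p_a and p_b, with Euclidean edge weights."""
    src, dst = nearest_node(t, p_a), nearest_node(t, p_b)
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, v = heapq.heappop(pq)
        if v == dst:
            break
        if d > dist.get(v, float('inf')):
            continue                      # stale queue entry
        for u in G[v]:
            nd = d + float(np.linalg.norm(t[u] - t[v]))
            if nd < dist.get(u, float('inf')):
                dist[u], prev[u] = nd, v
                heapq.heappush(pq, (nd, u))
    V_P, v = [dst], dst                   # walk the predecessor chain back
    while v != src:
        v = prev[v]
        V_P.append(v)
    return V_P[::-1]

def routing_path(G, t, P_N):
    """Routing case: concatenate Path(p_k, p_{k+1}) over the passing points."""
    V_P = []
    for p_k, p_k1 in zip(P_N, P_N[1:]):
        seg = path(G, t, p_k, p_k1)
        V_P.extend(seg if not V_P else seg[1:])   # drop the duplicated joint
    return V_P
```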
5.4. MoCap-Based Adaptive Walking Motion Generation

Finally, the walking motion of the DHM is generated as it follows VT, using our MoCap-based adaptive walking motion generation algorithm [23]. In the algorithm, realistic articulated walking movements of the DHM are generated based on the MoCap data M for flat walking. The details and demonstrations are introduced in [23].

6. Results and Validations

The proposed system was developed using Visual Studio 2010 Professional edition with C++. The system was applied to a virtual maze and a real two-story indoor environment. In addition, it was validated by comparing the disorientation spots between the simulation and measurements obtained from young subjects. Videos of the as-is environment modeling and wayfinding simulation results, i.e., Figures 10–13, are available in the supplementary video file.
6.1. Evaluation of Ease of Wayfinding in Virtual Maze

The proposed system was first applied to a virtual maze with a set of signage entities S = {S1, S2, S3, S4, S5} to test its basic performance. Figure 10 shows the constructed environment model of the virtual maze. In the figure, the textured environmental geometry GI was constructed manually using CAD software [30], and the set of walk surface points WS and the navigation graph GN were constructed from the set of vertices of GI. Note that the proposed system can operate not only on the as-is environment model but also on a given 3D model of the environment, e.g., CAD data of the environment, by converting the model to dense point clouds. Tables 4 and 5 show the wayfinding scenario and the user-assigned parameters of each signage information Ii,j, respectively. As shown in Table 5, all four types of signage were used.

Figure 10. Environment model of virtual maze: (a) Textured environmental geometry GI (#vertices: 4,241,573, #faces: 8,436,885); (b) walk surface points WS; (c) navigation graph GN; (d) wayfinding scenario. The results are available in the supplementary video file.

Table 4. User-specified wayfinding scenario.

Parameters | Specified Values
MoCap data for flat walking M of H | MoCap data of a young male subject (age: 22 years, height: 1.73 m)
Horizontal angle of view frustum θH of H | 100 deg 1
Vertical angle of view frustum θV of H | 60 deg 1
Noticeability threshold nt ∈ [0, 1] of H | 0.3 2
Start position ps | Shown in Figure 10d
Initial walking direction dI | Shown in Figure 10d
Name of destination D | "Goal"
Signage locations and orientations Ts | Shown in Figure 10d

1 θH and θV were specified based on the handbook [31]. 2 nt was specified as a small value for validation.
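Table 4 maps naturally onto a small configuration object; the sketch below is a hypothetical structure populated with the table's values (the spatial items ps, dI, and Ts, shown only in Figure 10d, are omitted).

```python
from dataclasses import dataclass

@dataclass
class WayfindingScenario:
    """User-specified scenario of Section 5, filled with the Table 4 values."""
    mocap: str          # M: flat-walking MoCap data from the gait database [27]
    theta_h_deg: float  # horizontal angle of the view frustum
    theta_v_deg: float  # vertical angle of the view frustum
    n_t: float          # noticeability threshold, in [0, 1]
    destination: str    # D

maze_scenario = WayfindingScenario(
    mocap='young male subject, age 22 years, height 1.73 m',
    theta_h_deg=100.0,   # from the handbook [31]
    theta_v_deg=60.0,    # from the handbook [31]
    n_t=0.3,             # small value chosen for validation
    destination='Goal',
)
```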
Table 5. User-assigned parameters of signage information.

Parameters | Sign S1 | Sign S2 | Sign S3 | Sign S4 | Sign S5
Type of signage Tn | 'Positional' | 'Directional' | 'Directional' | 'Routing' | 'Identification'
Name of indicated place Dn | "Goal" (all signs)
Navigation information NI | Shown in Figure 10d (S1–S4) | "Goal" (S5)
Maximum viewing distance dl | 4.0 m 1 (S1) | 5.0 m 1 (S2–S4) | 1.74 m 1 (S5)

1 dl was specified as a tentative value without human measurements.

Figure 11 shows the evaluation results of the ease of wayfinding. As shown in Figure 11a, when the simulation was performed, the DHM found and recognized S1 and I1,1, respectively. In consequence, the DHM was set to walk toward the next goal position pn indicated by I1,1. Then, when the DHM arrived at pn, S2 and I2,1 were found and recognized by the DHM (Figure 11b), respectively. A feasible walking path V′P and a set of fork points F of I2,1 were then extracted, and the DHM was set to walk toward the first fork point p1 ∈ F of I2,1. After that, the DHM found and recognized S3 and I3,1 at p1 ∈ F, respectively. Then, as shown in Figure 11c, V′P and F of I3,1 were extracted. At the same time, the DHM was set to walk toward p1 ∈ F of I3,1. However, as shown in Figure 11d, the DHM could not find any new signage when it arrived at p1 ∈ F of I3,1. Therefore, this spot was detected as a disorientation spot. As recommended by international standards [2], a facility manager must provide signage at all key decision points, such as forks. From this standpoint, the detection of this disorientation spot can therefore be considered reasonable.

Figure 11. Evaluation results of ease of wayfinding in virtual maze (red lines: graph edges, blue lines: graph edges on VP, cyan lines: VT, yellow lines: walking trajectory of DHM, purple lines: graph edges on V′P): (a) Wayfinding in accordance with S1; (b) wayfinding in accordance with S2; (c) wayfinding in accordance with S3; (d) detecting disorientation spot; (e) wayfinding in accordance with S4; (f) simulation was completed. The results are available in the supplementary video file.
Thereafter, as shown in Figure 11e, the DHM was set to walk toward p2 ∈ F indicated by I3,1 to evaluate the ease of wayfinding after passing the detected disorientation spot. In consequence, the DHM found and recognized S4 and I4,1 at p2 ∈ F of I3,1, respectively. Then, the DHM was set to walk toward p4 ∈ PN of I4,1, following the VP generated on the passing points pi ∈ PN of I4,1. Finally, as shown in Figure 11f, the DHM found and recognized S5 and I5,1, respectively, where S5 was an identification signage pertaining to the destination D. In consequence, the wayfinding simulation was completed.

Based on the above results, from the standpoint of system performance, the following conclusions were obtained:

• The proposed system could detect disorientation spots resulting from the lack of signage or the poor location of signage in the environment model.
• The proposed system could simulate the wayfinding of the DHM by discriminating among four types of signage, namely, positional, directional, routing, and identification.

6.2. Evaluation Results of Ease of Wayfinding in Real Two-Story Indoor Environment

The proposed system was further applied to a real two-story indoor environment with a set of signage entities S = {S1, S2, S3, S4}. Figure 12 shows the constructed as-is environment model. In Figure 12, the laser-scanned point clouds were acquired from the environment by a terrestrial laser scanner [32]. The textured environmental geometry GI was constructed from 21,143 photos of the environment using the commercial SfM software ContextCapture [33], where the photos were extracted from video data captured using a digital single-lens reflex camera [34]. As shown in Figure 12c, the model contains a few distorted regions, which can be attributed to the performance limitations of the SfM software. However, most of the model could be generated successfully.

In the simulation, the DHM properties H of the wayfinding scenario were identical to those in Table 4. The start position ps, initial walking direction dI, and signage locations and orientations TS are shown in Figure 12d,e. The maximum viewing distance was specified as dl = 4.46 m for each signage information Ii,j, as determined by measuring dl of S1 using six subjects ranging in age from 22 to 26 years. A positional signage S1, two types of directional signage S2 and S3, and an identification signage S4 were arranged in the environment to simulate the situation in which people try to find a conference room using only the signage in an unfamiliar indoor environment.

Figure 13 shows the evaluation results of the ease of wayfinding. As shown in Figure 13a, when the simulation was performed, S1 and I1,1 were found and recognized by the DHM, respectively. Since the next goal position pn indicated by I1,1 was specified at the end of the caracole (spiral staircase) on the second floor, the DHM was set to ascend the caracole.
When the DHM arrived at pn of I1,1, it was asked to observe the surrounding environment to find new signage. However, as shown in Figure 13b, the DHM could not find S2 although S2 was visible. This was because the estimated signage noticeability of S2 at that spot, n2 = 0.27, was less than the user-specified threshold, nt = 0.3. Thus, this spot was detected as a disorientation spot because S2 was overlooked.

Following the above results, in Figure 13c, the signage design of S2, i.e., the texture on Gi, was improved to enhance its noticeability. As a result, the ease of wayfinding was improved, enabling the DHM to find S2 at the detected disorientation spot. This improvement reflects the fact that n2 of S2, seen from the DHM standing at the previously detected disorientation spot, increased to an adequately large value, n2 = 0.68. After the DHM recognized I2,1, the DHM was set to walk toward the first fork point p1 indicated by I2,1. However, as shown in Figure 13c, when the DHM arrived at p1 of I2,1, the wayfinding state fell into SW7, i.e., the DHM got lost, since it could not find any new signage at p1. This was because no signage could be seen by the DHM at p1. Therefore, this spot was also detected as a disorientation spot, owing to the lack of signage.

Figure 12. As-is environment model of two-story indoor environment: (a) Laser-scanned point clouds (#points: 5,980,647); (b) navigation graph GN; (c) textured environmental geometry GI (#vertices: 625,484, #faces: 1,241,049); (d) wayfinding scenario on first floor [35]; (e) wayfinding scenario on second floor [35]. The results are available in the supplementary video file.

Figure 13. Evaluation results of ease of wayfinding in two-story indoor environment (yellow lines: walking trajectory of DHM): (a) Wayfinding simulation on first floor; (b) detection of disorientation spot resulting from overlooking the signage S2; (c) design improvement of S2 and detection of disorientation spot resulting from lack of signage; (d) ease of wayfinding improved completely by changing the design of S2 and adding S5. The results are available in the supplementary video file.

By contrast, in Figure 13d, a new positional signage S5 was arranged around the detected disorientation spot. As a result, as shown in the figure, the wayfinding simulation of the DHM was completed successfully.
As described above, the proposed system enabled the user to validate the ease of wayfinding in the environment interactively by considering the wayfinding of the DHM, the as-is environment model, and the arranged signage system. From the standpoint of system performance, the following conclusions were obtained:

• The proposed system could detect disorientation spots resulting from the lack of signage and from overlooked signage.
• The proposed system could simulate the wayfinding of the DHM even in a realistic and complex as-is environment model.
• The proposed system could quickly re-evaluate rearranged signage based on the simulation.

6.3. Efficiency of Environment Modeling and Simulation

Table 6 shows the elapsed time of the as-is environment modeling and simulation. As shown in the table, the times for 3D environment modeling from laser-scanned point clouds were less than one minute in both environments. By contrast, owing to the performance limitations of the SfM software [33], the construction of the textured environmental geometry GI required approximately one week.

Table 6. Time required for environment modeling and simulation (CPU: Intel(R) Core(TM) i7-6850K 3.60 GHz, RAM: 64 GB, GPU: GeForce GTX 1080).

Process | Virtual Maze | Two-Story Indoor Environment
Automatic construction of WS and GN from laser-scanned point clouds | 2.5 s (#points: 963,691) 1 | 50.0 s (#points: 5,980,647) 1
Automatic construction of GI using SfM software [33] | N/A | Approximately 1 week (#photos: 21,143; resolution: 1920 × 1080)
Signage visibility, legibility, and noticeability estimation | Less than 0.17 s | Less than 0.17 s
Signage-based motion planning | Less than 0.02 s | Less than 0.02 s
One-step walking motion generation with 100-frame interpolation 2 | 0.15 s | 2.5 s

1 Number of downsampled points used for environment modeling. 2 Elapsed time of signage visibility, legibility, and noticeability evaluation was not included.

Furthermore, the time required for signage visibility, legibility, and noticeability estimation was less than 0.17 s. In addition, the times required for one-step walking motion generation were 0.15 s and 2.5 s in the virtual maze and the two-story indoor environment, respectively. Therefore, it was confirmed that the proposed system could simulate the DHM wayfinding efficiently. Note that the time required for walking motion generation in the two-story indoor environment was longer owing to the high computational load of rendering the environment model.

6.4. Experimental Validation of System for Evaluating Ease of Wayfinding

6.4.1. Overview of Wayfinding Experiment

The simulation results on the ease of wayfinding presented in Section 6.2 were validated by a wayfinding experiment using six young subjects. In the validation, two signage systems imitating S = {S1, S2, S3, S4} and S ∪ S5 were arranged in the real environment, where S and S5 represent the set of signage entities used in the simulations in Figure 13a,b and the signage added in the simulation in Figure 13d, respectively. In the wayfinding experiment, first, the name of the destination was revealed to the subjects at the start position ps. Then, the subjects were asked to find their way to the destination using the arranged signage system. During this process, wayfinding events such as finding signage and recognizing signage information were recorded by the thinking-aloud method [36], where the subjects were asked to walk while continuously thinking out loud.
Verbal information from the subjects was recorded by handheld voice recorders. At the same time, videos of the walking trajectories of the subjects were captured by the observer. Finally, when the subjects arrived at the destination, the experiment was deemed complete. Note that all subjects had regularly used the environment, but the locations of the arranged signage and the destination were not revealed to them. In addition, in the simulation results in Section 6.2, the maximum viewing distance dl was specified by measuring dl from those six subjects. In the experiments, first, the wayfinding behaviors of three young subjects (Y1–Y3) were measured using the signage system imitating S. After that, the behaviors of the other three young subjects (Y4–Y6) were measured using the signage system imitating S ∪ S5.

6.4.2. Comparison of Wayfinding Results between DHM and Subjects

Figure 14 shows the comparison of wayfinding results between the DHM and the subjects. As shown in Figure 14a, a disorientation spot was found during the experiment by the three subjects (Y1–Y3), which corresponded to the disorientation spot detected by the simulation. Thus, it was confirmed that the proposed ease-of-wayfinding simulation could detect a disorientation spot where the subjects actually lost their way owing to the lack of signage.

Figure 14. Comparison of wayfinding results between simulation and human measurements: (a) Comparison using S = {S1, S2, S3, S4}; (b) comparison using S ∪ S5.
By contrast, as shown in Figure 14b, two subjects, Y4 and Y5, arrived at the destination when the signage system imitating S ∪ S5 was arranged. However, a disorientation spot was found during the experiment by subject Y6. This was explained by the fact that subject Y6 overlooked the signage imitating S2. As shown in Figure 14a, this disorientation spot was also detected in the simulation because the DHM could not find S2 owing to the low noticeability of S2. Therefore, it was further confirmed that the proposed system could detect disorientation spots where subjects actually lost their way owing to overlooking signage.

7. Conclusions

In this study, we developed a simulation-based system for evaluating the ease of wayfinding using a DHM in an as-is environment model. The proposed system was demonstrated using a virtual maze and a real two-story indoor environment. The following conclusions were drawn from our results:

• Our system makes it possible to evaluate the ease of wayfinding by simulating the 3D interactions among the realistic wayfinding behaviors of a DHM, an as-is environment model, and a realistic signage system.
• Under the user-specified wayfinding scenario, the system simulates the wayfinding of the DHM by evaluating signage locations, continuity, visibility, legibility, and noticeability based on the imitated visual perception of the DHM.
• A realistic signage system, including four types of signage, namely, positional, directional, routing, and identification, can be discriminated in the wayfinding simulation.
• Disorientation spots owing to the lack of signage and to overlooked signage can be identified simply by conducting the simulation.
• Rearranged signage plans can be re-evaluated quickly by carrying out the simulation alone.

Our system was further validated by a comparison of disorientation spots between the simulations and measurements obtained from six young subjects. From this validation, it was confirmed that the proposed system has the possibility of detecting disorientation spots where people lose their way owing to the lack of signage or to overlooking signage. To validate the performance of the proposed system in detail, wayfinding experiments with a greater number of subjects in various as-is environments, including outdoor environments, must be conducted using more complex wayfinding scenarios in future work.

Furthermore, in Sections 6.1 and 6.2, the noticeability threshold nt was specified without reference to measurements of human visual capabilities. However, in practice, nt must be specified as the minimum value estimated for the dominant users of the environment in consideration of their visual capabilities. Therefore, a method for determining a suitable value of nt using a statistical database related to human visual capabilities [37] will be developed in future work.

The textured environmental geometry GI of the two-story indoor environment included a few distorted regions owing to performance limitations of the SfM software and poor textures on the walls. In the proposed system, GI was used for signage noticeability estimation. From the standpoint of evaluating the ease of wayfinding, the system must detect disorientation spots where low signage noticeability is expected. In general, signage noticeability decreases in areas where the wall surfaces around the signage are complex and textural, i.e., where the saliency of the signage design is relatively low compared to its surroundings.
Fortunately, in such areas, GI can be well reconstructed owing to the nature of the SfM algorithm. Therefore, the proposed system can detect disorientation spots resulting from overlooked signage even if a part of GI is distorted. Furthermore, as mentioned in the literature [20], the presence of crowds influences the ease of wayfinding. Thus, crowd simulation technologies must be introduced into the proposed simulation framework. In addition, in the proposed system, the walking trajectory of the DHM was generated using a previously developed optimization algorithm [23]. However, as observed in Figure 14, the walking trajectories of individual human subjects vary. In our future work, such variabilities will be considered by introducing Monte Carlo simulation into the proposed system, i.e., generating a variety of DHM walking trajectories using the algorithm [23] with resampled parameters related to trajectory generation.

Supplementary Materials: The following is available online at www.mdpi.com/2220-9964/6/9/267/s1, Video S1: EvaluationResults.mp4.

Acknowledgments: This work was supported by JSPS KAKENHI Grant No. 15J01552 and a JSPS Grant-in-Aid for Challenging Exploratory Research under Project No. 26560168.

Author Contributions: Tsubasa Maruyama proposed the original idea of this paper; Tsubasa Maruyama developed the entire system and performed the experiments; Satoshi Kanai, Hiroaki Date, and Mitsunori Tada improved the idea of the paper; Tsubasa Maruyama wrote the paper.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. World Health Organization. WHO Global Report on Falls Prevention in Older Age. Available online: http://www.who.int/ageing/publications/Falls_prevention7March.pdf (accessed on 30 June 2017).
2. International Organization for Standardization. ISO 21542: Building Construction—Accessibility and Usability of the Built Environment. Available online: https://www.iso.org/standard/50498.html (accessed on 15 December 2011).
3. International Organization for Standardization/International Electrotechnical Commission. ISO/IEC Guide 71, Second Edition: Guide for Addressing Accessibility in Standards. Available online: http://www.iec.ch/webstore/freepubs/isoiecguide71%7Bed2.0%7Den.pdf (accessed on 1 December 2014).
4. Rubenstein, L.Z. Falls in Older People: Epidemiology, Risk Factors and Strategies for Prevention. Available online: https://www.ncbi.nlm.nih.gov/pubmed/16926202 (accessed on 22 June 2017).
5. Maruyama, T.; Kanai, S.; Date, H. Tripping risk evaluation system based on human behavior simulation in laser-scanned 3D as-is environments. J. Comput. Des. Eng. 2017, under review.
6. Churchill, A.; Dada, E.; de Barros, A.G.; Wirasinghe, S.C. Quantifying and validating measures of airport terminal wayfinding. J. Air Transp. Manag. 2008, 14, 151–158. [CrossRef]
7. Hunt, E.; Waller, D. Orientation and Wayfinding: A Review. Available online: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.5608 (accessed on 30 June 2017).
8. Hölscher, C.; Büchner, S.J.; Brosamle, M.; Meilinger, T.; Strube, G. Signs and maps: Cognitive economy in the use of external aids for indoor navigation. In Proceedings of the 29th Annual Conference of the Cognitive Science Society, Nashville, TN, USA, 1–4 August 2007.
9. Yasufuku, K.; Akizuki, Y.; Hokugo, A.; Takeuchi, Y.; Takashima, A.; Matsui, T.; Suzuki, H.; Pinheiro, A.T.K. Noticeability of illuminated route signs for tsunami evacuation. Fire Saf. J. 2017, in press. [CrossRef]
10. Thora, T.; Bergmann, E.; Konieczny, L. Wayfinding and description strategies in an unfamiliar complex building. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society, Boston, MA, USA, 20–23 July 2011.
11. Vilar, E.; Rebelo, F.; Noriega, P. Indoor human wayfinding performance using vertical and horizontal signage in virtual reality. Hum. Factors Ergon. Manuf. Serv. Ind. 2014, 24, 601–605. [CrossRef]
12. Buechner, S.J.; Wiener, J.; Hölscher, S. Methodological triangulation to assess sign placement. In Proceedings of the Symposium on Eye Tracking Research and Applications, Santa Barbara, CA, USA, 28–30 March 2012.
13. Furubayashi, S.; Yabuki, N.; Fukuda, T. A data model for checking directional signage at railway stations. In Proceedings of the First International Conference on Civil and Building Engineering Informatics, Tokyo, Japan, 7–8 November 2013.
14. Chen, Q.; de Vries, B.; Nivf, M.K. A wayfinding simulation based on architectural features in the virtual built environment. In Proceedings of the 2011 Summer Computer Simulation Conference, The Hague, The Netherlands, 27–30 June 2011.
15. Morrow, E.; Mackenzie, I.; Nema, G.; Park, D. Evaluating three dimensional vision fields in pedestrian microsimulations. Transp. Res. Procedia 2014, 2, 436–441. [CrossRef]
16. Hajibabai, L.; Delavar, M.R.; Malek, M.R.; Frank, A.U. Agent-Based Simulation of Spatial Cognition and Wayfinding in Building Fire Emergency Evacuation. Available online: https://publik.tuwien.ac.at/files/pub-geo_1946.pdf (accessed on 22 June 2017).
17. Brunnhuber, M.; Schrom-Feiertag, H.; Luksch, C.; Matyus, T.; Hesina, G. Bridging the gaps between visual exploration and agent-based pedestrian simulation in a virtual environment. In Proceedings of the 18th ACM Symposium on Virtual Reality Software and Technology, Toronto, ON, Canada, 10–12 December 2012.
18. Becker-Asano, C.; Ruzzoli, F.; Hölscher, C.; Nebel, B. A multi-agent system based on Unity 4 for virtual perception and wayfinding. Transp. Res. Procedia 2014, 2, 452–455. [CrossRef]
19. Zhang, Z.; Jia, L.; Qin, Y. Optimal number and location planning of evacuation signage in public space. Saf. Sci. 2017, 91, 132–147. [CrossRef]
20. Motamedi, A.; Wang, Z.; Yabuki, N.; Fukuda, T.; Michikawa, T. Signage visibility analysis and optimization system using BIM-enabled virtual reality (VR) environments. Adv. Eng. Inf. 2017, 32, 248–262. [CrossRef]
21. Phaholthep, C.; Sawadsri, A.; Bunyasakseri, T. Evidence-based research on barriers and physical limitations in hospital public zones regarding the universal design approach. Asian Soc. Sci. 2017, 13, 133. [CrossRef]
22. Maruyama, T.; Kanai, S.; Date, H. Simulating a Walk of Digital Human Model Directly in Massive 3D Laser-Scanned Point Cloud of Indoor Environments. Available online: https://link.springer.com/chapter/10.1007/978-3-642-39182-8_43 (accessed on 22 June 2017).
23. Maruyama, T.; Kanai, S.; Date, H.; Tada, M. Motion-capture-based walking simulation of digital human adapted to laser-scanned 3D as-is environments for accessibility evaluation. J. Comput. Des. Eng. 2016, 3, 250–265. [CrossRef]
24. Maruyama, T.; Kanai, S.; Date, H. Vision-based wayfinding simulation of digital human model in three dimensional as-is environment models and its application to accessibility evaluation. In Proceedings of the International Design Engineering Technical Conferences & Computers and Information in Engineering Conference, Charlotte, NC, USA, 6–9 August 2016.
25. Filippidis, L.; Galea, E.R.; Gwynne, S.; Lawrence, P.J. Representing the influence of signage on evacuation behavior within an evacuation model. J. Fire Prot. Eng. 2006, 16, 37–73. [CrossRef]
26. Xie, H.; Filippidis, L.; Gwynne, S.; Galea, E.R.; Blackshields, D.; Lawrence, P.J. Signage legibility distances as a function of observation angle. J. Fire Prot. Eng. 2007, 17, 41–64. [CrossRef]
27. Kobayashi, Y.; Mochimaru, M. AIST Gait Database 2013. Available online: https://www.dh.aist.go.jp/database/gait2013/ (accessed on 22 June 2017).
28. Itti, L.; Koch, C.; Niebur, E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 1254–1259. [CrossRef]
29. Treisman, A.M.; Gelade, G. A feature-integration theory of attention. Cogn. Psychol. 1980, 12, 97–136. [CrossRef]
30. FreeCAD: An Open-Source Parametric 3D CAD Modeler. Available online: https://www.freecadweb.org/ (accessed on 22 June 2017).
31. Takashi, T.; Keita, I.B. Physiology. In Handbook of Environmental Design, 2nd ed.; Koichi, I., Ed.; Maruzen Publishing: Tokyo, Japan, 2003.
32. 3D Laser Scanner FARO. Available online: http://www.faro.com/products/3d-surveying/laserscanner-faro-focus/overview (accessed on 22 June 2017).
33. Bentley—Reality Modeling Software. Available online: https://www.bentley.com/en/products/brands/contextcapture (accessed on 22 June 2017).
34. Nikon D3300. Available online: http://www.nikon-image.com/products/slr/lineup/d3300/ (accessed on 22 June 2017).
35. Floor Maps of Graduate School of Information Science and Technology. Available online: http://www.ist.hokudai.ac.jp/facilities/ (accessed on 22 June 2017).
36. O'Neill, M.J. Evaluation of a conceptual model of architectural legibility. Environ. Behav. 1991, 23, 259–284. [CrossRef]
37. Database of Sensory Characteristics of Older Persons with Disabilities. Available online: http://scdb.db.aist.go.jp/ (accessed on 22 June 2017).

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
work_h7kgytdtjfbkbbljthnxgrln4q ---- Drones and Surveillance Cultures in a Global World

Research

How to Cite: Muthyala, John. 2019. "Drones and Surveillance Cultures in a Global World." Digital Studies/Le champ numérique 9(1): 18, pp. 1–51. DOI: https://doi.org/10.16995/dscn.332

Published: 27 September 2019

Peer Review: This is a peer-reviewed article in Digital Studies/Le champ numérique, a journal published by the Open Library of Humanities.

Copyright: © 2019 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.

Open Access: Digital Studies/Le champ numérique is a peer-reviewed open access journal.

Digital Preservation: The Open Library of Humanities and all its journals are digitally preserved in the CLOCKSS scholarly archive service.
RESEARCH
Drones and Surveillance Cultures in a Global World
John Muthyala, University of Southern Maine, US (muthyala@maine.edu)

Digital technologies are essential to establishing new forms of dominance through drones and surveillance systems; these forms have significant effects on individuality, privacy, democracy, and American foreign policy; and popular culture registers how the uses of drone technologies for aesthetic, educational, and governmental purposes raise questions about the exercise of individual, governmental, and social power. By extending computational methodologies in the digital humanities like macroanalysis and distant reading in the context of drones and surveillance, this article demonstrates how drone technologies alter established notions of war and peace, guilt and innocence, privacy and the common good; in doing so, the paper connects postcolonial studies to the digital humanities.

Keywords: Drones; Surveillance; Digital Humanities; Postcolonial studies; Globalisation; Digital cultures

Za Kaom Pa Stargo Stargo Drone Hamla (my gaze is as fatal as a drone attack). Song performed by Sitara Younas, Pashto singer

"How the digital humanities advances, channels, or resists today's great postindustrial, neoliberal, corporate, and global flows of information-cum-capital is thus a question rarely heard in the digital humanities..." --Alan Liu, "Where is Cultural Criticism in the Digital Humanities?"

I make three central arguments in this paper: the use of digital technologies is essential to establishing new forms of dominance through drones (unmanned automated vehicles, UAVs) and surveillance systems; these forms have significant effects on individuality, privacy, democracy, and American foreign policy; and popular culture registers how the use of drone technologies for aesthetic, educational, and governmental purposes raises complex questions about the exercise of individual, governmental, and social power.
In what follows, I first highlight the cultural turn in the digital humanities in order to open up a critical terrain to study the militarized and civilian uses of drones and the surveillance cultures they engender; second, I focus on drones as disruptive technologies that thrive on surveillance regimes; and third, I study the creative appropriations of drone technologies by artists and singers seeking to counter the global reach of digital networks that enable some nation-states to wield power over largely post-colonial societies, and control the social, legal, and political meanings of innocence and guilt, privacy and freedom. Taken together, these approaches help us infuse cultural criticism in the digital humanities and connect postcolonial studies with the digital humanities.

Digital Humanities and the cultural turn

Over the last two decades, digital humanities emerged as a promising field of inquiry in which interdisciplinary collaboration in the sciences and the humanities led to new digital tools, multimodal interfaces, and hybrid methodologies. Early initiatives are often traced back to the electronic concordance of Saint Thomas Aquinas's works, first created by Jesuit priest Father Roberto Busa in the 1950s, by partnering with International Business Machines (IBM). The use of computing in the humanities became the key topic for literary scholars and scientists in seminars offered by IBM, and in 1966, they published Computers and the Humanities (Hindley 2013). In the decades that followed, digital technologies grew so rapidly that they spawned a dizzying array of communication and information tools and systems. Using computational approaches to the humanities, the digital humanities has generally concerned itself with text encoding, text mining, machine learning, database creations, archiving, curating, data visualization, algorithmic criticism, and distant reading. Organizations like the Office of Digital Humanities of the National Endowment for the Humanities, Alliance of Digital Humanities Organizations, Humanities, Arts, Science, and Technology Alliance and Collaboratory, Association for Computers and Humanities, Canadian Society for Digital Humanities, Australian Association for Digital Humanities, Japanese Association for Digital Humanities, European Association for Digital Humanities, and the panels of DH at the Modern Language Association Conference, THAT CAMP, and other conferences, including several journals, blogs, anthologies, university press series, undergraduate and graduate courses and programs, and regional and national grants and fellowships, all show the discipline's growing institutionalization in higher education in America and other parts of the world.

A central debate in the digital humanities concerns computing: one side argues that the digital humanities mark the computational turn in the humanities, whereas the other side acknowledges the turn but broadens its focus to include the social and cultural impact of digital technologies (Berry 2012, 5). Scholars identify three waves or phases in digital humanities.
The first phase focused on digitization, codes, software, and archiving; the second phase emphasized interactivity, making the data malleable, developing multimodal environments, and visualization; the third phase uses "digital toolkits in the service of the Humanities' core methodological strengths: attention to complexity, medium specificity, historical context, analytical depth, critique and interpretation" (Presner, Schnapp, and Lunenfeld 2009). Perhaps (Muthyala 2016), it's the nature of an emerging field to develop concepts and meta-critical acumen about its assumptions and practices, which are themselves emerging (new or realigned developments) and emergent (coming into being in relation to the urgency or need of scholarly or creative occasion). There is also a hackers vs yackers divide: the hackers do the splendid inventions, creations, and euphoric discoveries that bring in millions of dollars and make life worth living, while the yackers ask uncomfortable questions about meaning, context, nuance, policy, purpose, pedagogy, social, political, and economic implications, ethics, the good life, and make the examined life miserable (Pannapacker 2013). Stressing coding as essential to DH, Stephen Ramsay contends, "Personally, I think digital humanities is about building things [...] If you are not making anything, you are not [...] a digital humanist" (Gold 2012a). DH registers a transformation that is about "moving from reading and critiquing to building and making" (Gold 2012a). Write David M. Berry and Anders Fagerjord (2017): "As digital technology has swept over the world, the humanities too have undergone a rapid change in relation to the use and application of digital technologies in scholarship [...] Humanities research has been irrevocably transformed, as indeed have everyday life, our societies, economies, cultures and politics" (1). There is no going back to a pre-digital world; we are in a post-digital era, because "the tendrils of digital technology have in some way touched everyone" (Cascone 2000, 12). The digital is here to stay. What we do with it is what matters.

Tongue-in-cheek yet with insight, Marjorie Burghart (2013) suggests three orders reminiscent of the three Medieval Orders, loosely defined, operating in digital humanities: "Oratores, bellatores, laboratores: those who pray, those who fight, those who work." There are those who work and do things and produce new codes, software, systems, and tools used for scholarship and creativity; there are those who work hard to legitimize this work to non-specialists, the general public, and scholars in other disciplines; they fight the rhetorical battles to gain institutional prestige and academic credibility; and then there are those "non-practicing believers," who are "interested by the DH phenomenon and enthusiastic, but not involved themselves in any practical aspect" (Burghart 2013).

Since the aim here is not to rehearse the task of defining and explaining digital humanities, suffice it to say that these definitions are extended in several works: Susan Schreibman, Ray Siemens, and John Unsworth's (2004) A Companion to Digital Humanities; Columbia University's Round Table on DH (Center for Digital Research and Scholarship 2011) at the Center for Research and Scholarship, "Research Without Borders: Defining the Digital Humanities";
Todd Presner, Jeffrey Schnapp, and Peter Lunenfeld's (2009) Digital Humanities Manifesto 2.0; David M. Berry's (2012) Understanding the Digital Humanities; Anne Burdick et al.'s (2012) Digital_Humanities; Matthew K. Gold's (2012b) Debates in the Digital Humanities; and Melissa Terras, Julianne Nyhan, and Edward Vanhoutte's (2013) Defining Digital Humanities: A Reader.

A pointed criticism about the digital humanities comes from Alan Liu (2012), who argues that cultural criticism is notably absent in the digital turn in the humanities:

While digital humanists develop tools, data, and metadata critically, therefore (e.g., debating the "ordered hierarchy of content objects" principle; disputing whether computation is best used for truth finding or, as Lisa Samuels and Jerome McGann put it, "deformance"; and so on), rarely do they extend their critique to the full register of society, economics, politics, or culture. How the digital humanities advances, channels, or resists today's great postindustrial, neoliberal, corporate, and global flows of information-cum-capital is thus a question rarely heard in the digital humanities associations, conferences, journals, and projects with which I am familiar.

Liu's call for cultural criticism in the digital humanities is noteworthy, because the tendency to define the field primarily as an extension of computational humanities continues to gain purchase in public discourse; to critics like Stanley Fish (2018), digital humanities are deeply suspect: "administrators who pour funds and resources into the digital humanities are complicit in the killing of the humanities." Recently, in criticizing the institutional cachet of digital humanities and what he views as hasty, misguided approaches to use statistical methods for literary analysis, Fish (2019) notes, "At bottom CLS [computational literary studies] or Digital Humanities is a project dedicated to irresponsibility masked by diagrams and massive data mining." Timothy Brennan (2017) asks, "After a decade of investment and hype, what has the field accomplished?" His answer is sharp: "Not much" (Brennan 2017). Adam Kirsch (2014) sounds the alarm, proclaiming that "technology is taking over English departments," which is a "false promise of the digital humanities." Oddly enough, to these critics, the digital humanities begins and ends with computational humanities, a view demonstrating a lack of awareness of the extensive discussions about the field, including whether it can even be called a field or discipline. Fish's blaming administrators who support the digital humanities as being "complicit" in their devaluation is the kind of myopic, hyperbolic rhetoric we often find in political campaigns where, despite evidence to the contrary, candidates blame each other for all the ills of the world--the real, the imagined, the fanciful, the grotesque--and then some. Liu's call to move beyond the computational towards the cultural turn in the digital humanities is, therefore, more urgent than before; his warning to think institutionally and socio-politically about the digital humanities by examining vast systems and networks that facilitate the flow of money, power, and influence by individuals, groups, and nation-states finds resonance in Daniel Allington, Sarah Brouillette, and David Golumbia's (2016) indictment of higher education's growing dependency on neoliberal values and business models.
Arguing that digital humanities "discourse sees technological innovation as an end in itself and equates the development of business models with political progress," they contend, "the unparalleled level of material support that Digital Humanities has received suggests that its most significant contribution to academic politics may lie in its (perhaps unintentional) facilitation of the neoliberal takeover of the university" (Allington, Brouillette, and Golumbia 2016). Likewise, Anne Cong-Huyen (2013) observes that the field has tended to remain insular by focusing heavily on technological expertise, as if without it one cannot become part of the discipline or really understand it:

These digital and electronic technologies are of particular importance because they are often perceived as being neutral, without any intrinsic ethics of their own, when they are the result of material inequalities that play out along racial, gendered, national, and hemispheric lines. Not only are these technologies the result of such inequity, but they also reproduce and reinscribe that inequity through their very proliferation and use, which is dependent upon the perpetuation of global networks of economic and social disparity and exploitation.

Similarly, Tara McPherson (2012) says that "the difficulties we encounter in knitting together our discussions of race (or other modes of difference) with our technological productions within the digital humanities (or in our studies of code) are actually an effect of the very designs of our technological systems, designs that emerged in post-World War II computational culture." The impulse to move beyond race by advocating colour-blindness worked closely like the modular systems that protected the coding logic intact by making it functionally invisible in order to enhance other uses and expectations. Likewise, in "Cultural Politics, Critique, and the Digital Humanities," Tanner Higgin (2010) argues that unless we critique the broader institutional and systemic conditions that have allowed the digital humanities to emerge as they have now, the discipline will replicate inequality, because there are "far more subtle ways technologies reproduce oppressive social relations in everyday life within and without academia." Higgin sees a "potentially techno fetishistic obsession in DH with technological transformation via the creation and use of various digital tools/platforms/networks, etc. as agents of social change. These efforts are often performed under the guiding ethos of collaboration which often becomes an uncritical stand-in for an empty politics of access and equity" (Higgin 2010). Adding yet another critical angle to the debate, Alex Reid (2014) argues that the scientific worldview can also be unexaminedly appropriated by the humanities, including the very distinction between them that the humanities seek to dismantle. The risk is that the human in the humanities loses its central role as a subject and agent of experience, knowledge, and consciousness.
In "Critical Theory and the Mangle of Digital Humanities," Todd Presner (2015) seeks to connect critical theory to digital humanities by not flattening out the differences between doing or building something with digital technologies and the appreciative, interpretive, and contextually analytical impulses of the humanities; he suggests that "the first challenge for digital humanities is to develop both critical and genealogical principles for exposing its own discursive structures and knowledge formations at every level of practice, from the materiality of platforms, the textuality of the code, and the development of content objects to the systems of inclusion and exclusion, truth and falsehood governing its disciplinary rituals, doctrines, and social systems" (60). It is what concerns Adeline Koh (2014), who argues that the discourse of civility, the social contract for participating in liberal society, in digital humanities has two requirements: 1) "the practice of civility, or niceness; and 2) possession of technical knowledge, defined as knowledge of coding or computer programming" (94; italics from original). These two stipulations function as "rules" for "entry to the scholarly field" (Koh 2014, 94). Like Koh, Gary Hall cautions against drawing heavily on science to re-orient the humanities, as if the latter were more in need of re-assessment than the former, which implicitly privileges the one over the other; instead, Hall asks, "Along with a computational turn in the humanities, might we not also benefit from more of a humanities turn in our understanding of the computational and the digital?" (2011, 2). In cautioning practitioners and scholars in digital humanities to avoid relying excessively on the sciences or assuming that scientific methodology in its quantitative modality is fundamentally unlike the unstable interpretive knowledge the humanities offers, Liu, Cong-Huyen, McPherson, Ramsay, Higgin, Allington, Brouillette, Golumbia, Reid, Presner, Koh, and Hall emphasize the need to rethink, not just reposition, the digital humanities in relation to institutional operations, governmental policies, demographic shifts, and cultural orientations that support and legitimize the sciences; in other words, the cultural turn in the digital humanities is necessary and urgent.

Drone warfare and empire in the 21st century

One way to extend these critics' ideas is to examine the rise of two recent phenomena: drones and surveillance. With their bulbous front-ends, the Predator, Reaper, and Global Hawk are the iconic symbols of drones. 27 ft in length and with a wingspan of 55 ft, the Predator can fly for 24 hours at 25,000 ft, and the system costs $20 million. 36 ft in length and with a wingspan of 66 ft, the Reaper can fly for 24 hours at 50,000 ft, and the system costs $26.8 million. 48 ft in length and with a wingspan of 131 ft, the Global Hawk can fly for 28 hours at 60,000 ft and costs $140.9 million (Gertler 2012, 31) (Figures 1 and 2). Other models and platforms, with varied operational histories, include Firescout, Grey Eagle, Hawk, Hunter, Hummingbird, Nano,
Companies producing drones or drone technology include General Atomics, AeroVironment, Raytheon, Boeing, Northrop Grumman, and Lockheed Martin (Benjamin 2013, 34–54). Drones like Switchblade can fire missiles and also plunge towards a target in a suicide mission to kill it. Research is being conducted to produce technology that will enable drones to be Figure 1: Global Hawk. Figure 2: Reaper Drone. Muthyala: Drones and Surveillance Cultures in a Global WorldArt. 18, page 10 of 51 almost fully automatic, requiring little pilot control (Benjamin 2013, 37). On May 14, 2013, a drone, X-47B, took off from an aircraft carrier, setting a precedent for drone warfare, because it makes mobile the infrastructural needs of maintaining, protecting, and launching drones from areas over which the military can establish control. This development sets “the way for the US to launch unmanned aircraft from just about any place in the world” (Vergakis 2013). The efficacy of drone warfare, from a military perspective, is predicated on the range and quality of the military, technological, and political infrastructure necessary to share intelligence, coordinate missions, and execute them successfully. The “military’s secret military,” (Turse 2012, 12) referred to as US Special Operations Command (SOCOM), set up in 1987, today includes the Green Berets, Rangers, Navy Seals, Air Force Air Commandos, and Marine Corp Special Operations Teams. This unit “carries out the United States’ most specialized and secret missions. These include assassinations, counter-terrorist raids, long-range reconnaissance, intelligence analysis, foreign troop training, and weapons of mass destruction counter-proliferation operations” (Turse 2012, 12). Its core cell, SOCOM, acts under the President’s direct supervision. Countries where SOCOM is or was active include Afghanistan, Bahrain, Belize, Brazil, Bulgaria, Burkina Faso, Dominican Republic, Egypt, Germany, Indonesia, Iran, Iraq, Jordan, Kazakhstan, Kuwait, Kyrgyzstan, Lebanon, Mali, Norway, Oman, Pakistan, Panama, Poland, Qatar, Romania, Saudi Arabia, Senegal, South Korea, Syria, Tajikistan, Thailand, Turkmenistan, United Arab Emirates, Uzbekistan, and Yemen (Turse 2012, 15–16). To maintain, manage, and deploy drones, command and control centres with varying degrees of sophisticated infrastructure and technological capabilities have been sent up in 60 bases all over the world, including in Arizona, Florida, Missouri, New Mexico, New York, North Dakota, Ohio, South Dakota, and Texas. The drones, Special Operations Command, and control centres “are the backbone of the new American robotic way of war. They are also the latest development in a long-evolving saga of America power projection abroad; in this case, remote-controlled strikes anywhere on the planet with a minimal foreign ‘footprint’ and little accountability” gain normalcy (Turse 2012, 22), as “bayonet, telegram, and cannon have been replaced by data mining, satellite reconnaissance, Muthyala: Drones and Surveillance Cultures in a Global World Art. 18, page 11 of 51 and long distance strikes by weaponized drones” (Hensley 2018, 228). In short, “drones are power tools with the ability to transform the political and social landscape forever” (Yehya 2015, 3). And when we map the landscape of drone wars, “we jibe against the limits of cartographic and so of geopolitical reason,” which transforms drone wars into “the everywhere war,” observes Derek Gregory (2011, 239). 
This war “transforms the concept the battlefield into a multidimensional ‘battlespace’ where the enemy is fluid and indeterminate,” writes Christine Agius, further adding, “this vertical form of control re-asserts a type of neo-colonial surveillance and ordering that renders contingent any claims to sovereignty, constantly routinizing insecurity in certain spaces” (2017, 372; 380). Drone wars can take place anytime and anywhere; they re-define notions of normalcy and exception, as they generate constant insecurity by waging perpetual war. In drone warfare, it is difficult to ascertain when a country is at war, and when it is not, when conditions of peace prevail, and when they don’t, because the anytime- everywhere matrix enables powerful states to create and manage conditions of emergency on a scale that is trans-territorial and biopolitical. In A Theory of the Drone, Grégorie Chamayou (2015) highlights principles that give institutional character and social power to drones: “persistent surveillance or permanent watch; totalization of perspective or synoptic viewing; creating an archive or film of everyone’s life; data fusion; schematization of forms of life; detection of anomalies and pre-emptive anticipation” (38–42). Unlike traditional war in which the machinery of combat—troops, tanks, weapons, electronic gadgets, munitions, battleships, fighter jets—is assembled, managed, and deployed, and often visible to the eye, this new war is fought in secrecy. It’s a cheap war. It’s an invisible war. It’s a war of stealth and silence. Consider what transpired over the last two decades: in Pakistan, under President George W. Bush, there were 48 drone strikes, 116–137 civilian deaths, and 218–326 militant casualties, and under President Obama, there were 353 strikes, 129–162 civilian deaths, and 1,659–2,683 militant casualties (New America 2019a). In Yemen, Bush authorized 1 strike resulting in zero civilian casualties, and six militants killed, while Obama authorized 184 strikes, leading to 89–101 civilians killed, and 973–1,240 Muthyala: Drones and Surveillance Cultures in a Global WorldArt. 18, page 12 of 51 militants killed (New America 2019b). In his first two years, President Donald Trump continued Obama’s aggressive use of drones, by authorizing 112 strikes in Pakistan and Yemen combined; if this rate continues during his presidency, he will surpass Obama’s drone strike record (Wolfgang 2018). The efficacy of drone warfare rests on the quantity and quality of data collected through surveillance (Drew 2009). As they hover in the air, drones secretly surveil entire towns and villages or zero in on buildings and moving objects, while recording thousands of hours of data and feeding them in live or recorded formats, so that pilots, analysts, operators, generals, and others can engage in data mining, target identification, tracking, and elimination. Analysts working in the Algorithmic Warfare Cross-functional Team, a result of Project Maven to “accelerate DoD’s integration of big data and machine learning,” would then spend time “turning countless hours of aerial surveillance into actionable intelligence” (Weisgerber 2017). In other words, certain methodologies of computational digital humanities—macroanalysis and distant reading—are the sine qua non of drone warfare. In Macroanalysis: Digital Methods and Literary History, Matthew Jockers (2013) argues that working with big data can help literary scholars ask new questions about genre, history, gender, and stylometry. 
As a complement, not substitute, to close reading, he advances macroanalysis to “emphasize that massive digital corpora offer us unprecedented access to the literary record and invite, even demand, a new type of evidence gathering and meaning making” (Jockers 2013, 8). He adds, “[…] the literary researcher must embrace new, and largely computational, ways of gathering evidence […]. More interesting, more exciting, than panning for nuggets in digital archives is the ability to go beyond the pan and exploit the trommel of computation to process, condense, deform, and analyze the deeper strata from which these nuggets were born, to unearth, for the first time, what the corpora really contain” (Jockers 2013, 9–10). Instead of only emphasizing “an examination of seminal works,” we can study the “aggregated ecosystem or ‘economy’ of texts” (Jockers 2013, 32). Along similar lines, Franco Moretti (2013) in Distant Reading opines that we should not rely on single or small text samples to create a historical period or literary canon or detail genres and styles and plots, but engage with large data sets Muthyala: Drones and Surveillance Cultures in a Global World Art. 18, page 13 of 51 of information and learn to mine and interpret them for their nodes, networks, proximities and distances from other nodes and networks. Distant reading, he contends, “allows you to focus on units that are much smaller or much larger than the text: devices, themes, tropes—or genres and systems. And if, between the very small and the very large the text itself disappears, well, it is one of those cases when one can justifiably say, Less is more. If we want to understand the system in its entirety, we must accept losing something” (Moretti 2013, 48–9). Moretti seeks to apprehend literature or history as textual systems and networks by examining or distantly reading, as it were, large corpora containing metadata of thousands of texts and analyzing them across time by visualizing datasets. Within digital humanities as computational literary studies (CLS), these approaches have come under scrutiny, the latest being Nan Z. Da’s (2019) “The Computational Case Against Computational Literary Studies.” In examining several case studies, Da (2019) argues that Data sets with high dimensionality are decompressed using various forms of scalar reduction (typically through word vectorization) whose results are plotted in charts, graphs, and maps using statistical software. (605) She finds problems with how tagging and categorizing word frequencies and associations, pronoun uses and clusters, and finding patterns and inflections in large corpora are used to make arguments about gender, genre, literary history, themes, etc. In some cases, using the scientific model of replicating lab experiments in controlled settings, Da develops her own computational projects using similar or the same data sets, and arrives at different findings, especially when English texts are translated into other languages and non-English texts are used to read them distantly, as it were, or macroanalytically. Reviewing her study and other interventions in computational literary studies, like Ted Underwood’s (2019) Distant Horizons: Digital Evidence and Literary Change, is not my aim here. It is to note that Da uses computational methodology to critique computational literary studies, in order to argue the following: “Quantitative visualization is intended to reduce complex Muthyala: Drones and Surveillance Cultures in a Global WorldArt. 
18, page 14 of 51 data outputs to its essential characteristics. CLS has no ability to capture literature’s complexity” (Da 2019, 643; Critical Inquiry 2019). A good case can be made for the value of CLS to advance systems thinking in literary studies, to generally, provisionally, and visually plot the wide range of datasets gleaned from literary production over time; there is value in moving beyond a small corpus of texts when claims to their representational status are taken for granted or inadequately interrogated. CLS enable us to raise different, new, or recalibrated questions about literary taste, reading habits, genre evolution, and sub-genre transformations, including predictive analytics. However, my aim here is to draw from these debates to make a case for the cultural turn in the digital humanities, so that we do not end up privileging computational literary studies or humanities computing as the primary field for disciplinary valorization and professional identity; moreover, my aim is to use humanities methods of textual analysis, contextual inquiry, historical understanding, and conceptual, theoretical argumentation to study multi-genre and multimodal cultural productions that thematize the digital and technologically embody the digital in the context of drone warfare and the transnational surveillance cultures they generate. I am not saying there is a causal link between DH and drone warfare. What I am saying is that there are similarities in structure and method between them that need urgent scholarly examination. Like its analog precursor, the digital, to extend on Edward Said, is “in the world, and hence worldly,” and is “always enmeshed in circumstance, time, place, and society” (Said 1983, 35). Whatever the vastness of digital corpora, the complexity of coding languages, and the sophistication of algorithmic, robotic logics that compress information in space and time to generate analytics with predictive power, the conception, production, dissemination, and use of the digital are worldly endeavours, a series of innumerable acts and motivations profoundly and inescapably shaped by human interests, local pressures, national trends, and global flows. To engage with the worldliness of the digital is to grasp technological innovation as a social and cultural phenomenon that can rewrite, erase, re-draw, or affirm the histories, cultures, and spaces of many peoples and living Muthyala: Drones and Surveillance Cultures in a Global World Art. 18, page 15 of 51 things in the world; it is to grasp the digital as affording new ways of conceiving of the world and our being in the world. The worldliness of the digital links First World concerns with so-called Third World realities, by foregrounding the enduring legacies of colonialism and the struggle for post-colonial provenance. Put differently, whereas computational literary studies involve digitizing metadata about literary texts and creating algorithms to retrieve data sets and read them for patterns, repetitions, inflections, and shifts in textual systems, drones and surveillance technologies generate and use data about peoples, cities, villages, towns, and terrains to detect patterns, repetitions, inflections, and shifts in human and animal behaviour with one central aim: track, identify, kill. Some methodologies that have become part of the digital humanities, whose lineage extends into computational humanities, are also essential practices in drone warfare and global surveillance. 
These technologies connect vast trans-regional communication networks, command and control centres, video and image feeds, intelligence analyses, military officials, and politicians working in real-time in locations strewn across the world to assess, interpret, and decide whom to kill, where to kill, when to kill. The network of cables, satellites, and screens, the jumble of joy sticks, keyboards, and computers, and the ensemble of bytes, pixels, and video feeds all coalesce to create a global theatre of war; in this theater, the contours and sensory attributes of material reality are looped endlessly in pixels and bytes; they are processed to recreate digital data and knowledge whose power to render the physical world intelligible and controllable and conquerable is of a piece with the sophisticated technology, pragmatic ingenuity, and exceptionalist thinking that characterize American society. Anarchy of global surveillance Kevin Haggerty and Richard Ericson (2007) propose a new paradigm called “surveillant assemblage” to describe surveillance as a process that manages the flow of information and data produced through a surveillance of ideas, things, and people in migration, thus making mobility a crucial dimension of the politics of visibility. They write: Muthyala: Drones and Surveillance Cultures in a Global WorldArt. 18, page 16 of 51 This assemblage operates by abstracting human bodies from their territorial settings and separating them into a series of discrete flows. These flows are then reassembled into distinct ‘data doubles’ which can be scrutinized and targeted for intervention. In the process, we are witnessing a rhizomatic leveling of the hierarchy of surveillance, such that groups which were previously exempt from routine surveillance are now increasingly being monitored. (Haggerty and Ericson 2007, 104) The body here becomes disembodied but does not replace the corporeal body but acts as its “data double” (Haggerty and Ericson 2007, 109). Surveillance, writes Daniel J. Solove (2004) in The Digital Person: Technology and Privacy in the Information Age, leads to the creation of “digital dossiers” that are “collection[s] of detailed data about an individual. […] data is digitized into binary numerical form, which enables computers to store and manipulate it with unprecedented efficiency” (1–2). A prominent theorist of information technology and data management, Roger A. Clarke (1998) in “Information Technology and Dataveillance” coins the term “dataveillance” to characterize a new modality of surveillance enhanced by the growth of digital technologies: “dataveillance is the systematic use of personal data systems in the investigation or monitoring of the actions or communications of one or more persons” (499). Dataveillance in this context is best apprehended as “meticulous rituals of power,” asserts William G. Staples (2003) in Everyday Surveillance, because they are “microtechniques of social monitoring” and “‘small’ procedures and techniques that are precisely and thoroughly exercised”; they are “ritualistic because they are faithfully repeated and are often quickly accepted and routinely practiced with little questions”; and they exude “power because they are intended to discipline people into acting in ways that others have deemed to be lawful or have defined as appropriate or simply ‘normal’” (xii, 3). 
Hence, the Gorgon Stare: with twelve cameras, the MQ-9 Reaper can surveil an area of four kilometers and produce images and video feeds that can be differentially accessed and analyzed by people separated in space and time (Shachtman 2009). A drone with ARGUS-IS (Autonomous Real-Time Ground Ubiquitous Surveillance- Imaging Systems) takes this further: it can cover fifteen square miles and send video Muthyala: Drones and Surveillance Cultures in a Global World Art. 18, page 17 of 51 feed to sixty-five windows, each capable of focusing continuously on a moving target or one location (Hambling 2009). In 2005 during the Bush presidency, the Force Application and Launch Continental United States Program (Falcon) was designed to release remote controlled spacecraft that could fly close to five times faster than the speed of sound, at 100,000 feet, and with 1000 pounds of armaments and supplies. The aim of the program, in the words of John E. Pike, of GlobalSecuirty.org, is to “crush someone anywhere in world [sic] on 30 minutes’ notice with no need for a nearby air base” (Pincus 2005). “Surveillance, a technology of racial sorting and subjugation,” writes Jennifer Rhee, “structures drone technology and its dehumanizing tendencies” (2018, 164). Drone surveillance establishes a “regime of figuration, a way of seeing and, therefore, a modality of thought,” argues Nathan K. Hensley (2018, 229). The Gorgon Stare, ARGUS, and Falcon are designed to bring all things within their scopic purview and enable America to establish global strike capacity. They seek and probe and trace and map the daily activities of several groups of people, including women and children, without their knowledge. In Drone: Remote Control Warfare, Hugh Gusterson observes, “As the drones gaze unblinkingly from above, there can be voyeuristic pleasure in watching the Other. In fact, it is hard to imagine a more voyeuristic technology than the drone” (2016, 62). Some of them would turn out to be terrorists or actively aiding them, but not all. But to catch the few, the Gorgon Stare compels all whom it watches to lose privacy and dignity. To apprehend the few, the Gorgon Stare requires all whom it sees to demonstrate their innocence. The Gorgon Stare is biopolitical in two ways: it moves beyond the individual to surveil people as a totality, a mass of subjects made amenable to the scopic, panoramic gaze of the drone, and it seeks to manage and regularize life. As Michel Foucault explicates, “It is therefore not a matter of taking the individual at the level of individuality but, on the contrary, of using overall mechanisms and acting in such a way as to achieve overall states of equilibrium or regularity; it is, in a word, a matter of taking control over life and the biological process of man-as-species and of ensuring that they are not disciplined, but regularized” (1997, 246–7). Biopower seeks to manage all of life, or bring the multitude of the living under the domain of governmentality—to administer, to take charge, to mange, to sort, https://GlobalSecuirty.org Muthyala: Drones and Surveillance Cultures in a Global WorldArt. 18, page 18 of 51 to distribute, to maintain life. It is this biopolitical impulse that gains incredible computational and surveillant power in the age of drones and the cultures of surveillance they engender. Thus, the drone instantiates a new structure of biopolitical power that seeks invasive domination through constant, secret surveillance of a space, its peoples, its inhabitants. 
It is within the drone’s optic field of operations that guilt is assumed and innocence a burden to be proven. The terror of the drone is not only that it takes life without notice and with blinding speed, or that it comes from nowhere and recedes into nowhere, or that it hums its presence and withdraws into thin air whenever it chooses. It is much more than that—it adjudicates life on a daily basis of surveillance that considers everyone suspicious, leaving little room for innocence to become the norm and guilt an aberration. This is the terrifying nature of the Drone: it is a predator on the prowl not only for those intending to cause harm, but for those who, in some situations, cannot speak, establish, or convey their innocence. A good example of how these risks have become military tactics in drone warfare is the “signature strike,” a strategy for increasing domination through dataveillance where nuances and specificities are subsumed into behavioural types, correlative data doubles, and predictive analyses (De Luce and Paul Mcleary 2016). As one operator says, “the drone program amounts to little more than death by unreliable metadata” (Storm 2014), because, as Alcides Eduardo dos Reis Peron points out, “the practice of constructing an enemy before identifying him, and incriminating all those related to him, is extremely controversial and insufficient to properly clarify those on the ground as enemies” (2014, 91). Moreover, “according to several administration officials,” write Jo Becker and Scott Shane (2012), the policy “in effect counts all military-age males in a strike zone as combatants. […] unless there is explicit intelligence posthumously proving them innocent.” This policy goes beyond surveilling and identifying individual terrorists to targeting groups of people engaged in suspicious activity. Derek Gregory (2014) observes, “Combatants are thus vulnerable to violence not only because they are its vectors but also because they are enrolled in the apparatus that authorizes it: they are killed not as individuals but as the corporate bearers of Muthyala: Drones and Surveillance Cultures in a Global World Art. 18, page 19 of 51 a contingent (because temporary) enmity” (7). Peter Bergen notes (2012), “These are drone attacks based on patterns of merely suspicious activity by a group of men, rather than the identification of a particular individual militant.” When drones are equipped with transceivers or Air Handlers to mimic satellite towers to absorb telephonic communication, which is looped into data feeds for target analysis by intelligence and military personnel, the identity of a suspect becomes predicated on patterns of phone use. In instances where a strike is authorized, it is the SIM card (subscriber identification module) of the phone that leads to the targeting of the person using the phone (Scahill and Greenwald 2014). When a suspected phone is targeted and authorized for elimination, the exigencies of human interaction where different people end up using the targeted phone become redundant, because, in the surveillant assemblage, it’s the metadata that ascertains guilt and rationalizes death, not the individual or individuals using the phone. It is this process of data mining, geo-tagging, and algorithmic analysis that forecloses the possibility of separating suspects from innocents. 
Sheer incidental proximity in the everydayness of human interaction where innocent people end up using a targeted phone only to end up blown to pieces is what Jeremy Scahill and Glenn Greenwald (2014) refer to as “death by metadata […] where they think, or they hope, that the phone that they’re blowing up is in the possession of a person that they’ve identified as a potential terrorist. But in the end, they don’t actually really know. And that’s where the real danger with this program lies.” The surveillant assemblage reduces the need for gathering reliable intelligence based on close, extended observation and evidence in favour of a guilt-by-association logic that dramatically increases the risk of targeting innocent people, or those whose culpability does not deserve the ultimate punishment of death. In September 2011, drone strikes killed Anwar al-Awlaki and Samir Khan, US citizens and terror suspects, in Yemen. A few weeks later, a drone attack killed Abdulrahman, aged sixteen and son of al-Awlaki (Benjamin 2013, 65). In February 2010, US drones mistakenly killed close to two-dozen civilians, including women and children in Afghanistan (Benjamin 2013, 94). Low estimates of casualties in Pakistan, Yemen, and Somalia include 4,228 killed, 522 civilians, and 184 children, according to the Bureau of Investigative Journalism (2019). Muthyala: Drones and Surveillance Cultures in a Global WorldArt. 18, page 20 of 51 The psychosocial impact of drone strikes includes fear and paranoia among helpers and official rescue personnel who retrieve the dead, rescue the living, and care for the injured. Because the blasts from the strikes often burn bodies, dismember them, or sometimes simply incinerate them, the process of identifying victims means gathering whatever body parts can be found and handing them to friends and relatives of the victims. In villages where the Jirga is conducted—public hearings and discussions to resolve disputes by the maliks (local elders) and khassadars (local police forces overseen by maliks)—due to drone strikes that killed dozens of attendees, some of whom were the Taliban who were present at the meeting to resolve local disputes, there is growing fear and anger about drone attacks that target militants but more often than not result in the loss of innocent life (Cavallaro, Sonnenberg, and Knuckey 2012, 23–4). Because of the “double tap” strategy of striking targets twice or more, rescuers often hesitate to rush to aid the injured, fearing becoming targets and losing their lives, thus depriving the injured, especially the innocent, of timely medical attention (Cavallaro, Sonnenberg, and Knuckey 2012, 74). Strikes that destroy places housing targets also sometimes destroy surrounding houses, leaving individuals and families helpless and destitute. Because medical expenses are high, many of the injured do not get adequate care or take loans they simply cannot afford but need if only to stay alive or avoid becoming severely handicapped. It is common for witnesses to drone strikes to exhibit “anticipatory anxiety” caused by the fear of impending strikes anytime and from anywhere (Cavallaro, Sonnenberg, and Knuckey 2012, 81). Terror, anxiety, and fear of becoming victims of drones generate post-traumatic disorders among those living in places hit by drones, or witnesses to the devastating impact of drone missiles. 
In some instances, parents and families are pulling children from school for a while, or refusing to send them, fearing that when groups of children get together, they could easily become drone targets. Similarly, practices of mourning and burying the dead, which happen in public gatherings, are observed with trepidation because they increase the likelihood of drone attacks on groups (Cavallaro, Sonnenberg, and Knuckey 2012, 89).

Sites of drone killings or crashes give visibility to the power, structure, and infrastructural systems that facilitate drone wars. As Lisa Parks (2017) argues, in terms of infrastructure, for instance, using Google Earth, we can discern how drones deal with "geology, physics, energy, and weather" through "earthmoving, importation, construction, installation, and maintenance" to build large air strips and hangars, which become the "staging ground for drone campaigns and vertical maneuvers" (137-9). In terms of the forensic, places where drones kill or crash become material signs that make visible the invisible structure of drone warfare, as the bodies of the killed and the injured vivify the violence inflicted, and the debris reveals the type of drone, materials used in its construction, technological systems, and so on (Parks 2017, 151-2). In terms of the perceptual, drones and the surveillance regimes they establish produce "spectral suspects," whose identities are established not by epidermal and other discernable features, but through infrared contouring of heat-emitting entities (like the human body), which can appear black or white, based on a given set of technological settings. Spectral suspects are "visualizations of temperature data that take on the biophysical contours of the human body while its surface appearance remains invisible and its identity unknown" (Parks 2017, 145). But here, since identities are not known, "seeing according to temperature turns everyone into a potential suspect or target and has the effect of 'normalizing' surveillance since all bodies appear similar beneath its gaze" (Parks 2017, 145). It is why other assessments and verifications of threat and identity come into play, like signature strikes and double tap, including computational approaches like maintaining data repositories, metadata analysis, data dossiers, data doubles, and dataveillance. To grasp human behaviour as part of a network of actions and patterns, drone surveillance facilitates a distant reading of human collectivities, a macroanalysis of information flows to ascertain suspicious activity and spectral suspects in order to contain or eliminate them pre-emptively.

A major reservation about drone warfare, says Greg Kennedy (2013) in "Drones: Legitimacy and Anti-Americanism," is the question of legitimacy, a term often used "in such circumstances interchangeably with concepts such as proportional, moral,
ethical, lawful, appropriate, reasonable, legal, justifiable, righteous, valid, recognized, and logical" (25). There is a tendency, point out Sarah Kreps and John Kaag (2012), to conflate technological sophistication with ethical and legal assessment, because technology is not neutral but used by human beings: "the ability to undertake more precise, targeted strikes should not be confused with the determination of legal or ethical legitimacy," which raises the question of war and justice (17).

Fred Kaplan (2013) underscores a key fact: drone strikes take place outside of war zones. They can happen anywhere the US decides a threat is imminent. He writes, "For when we talk about accidental civilian deaths by drones in Pakistan and Yemen, we are talking about countries where the United States is not officially fighting wars. In other words, these are countries where the people killed--and their embittered friends and relatives--didn't know that they were living in a war zone" (Kaplan 2013). To further complicate matters, sometimes those targeted by drones were "low-level, anonymous suspected militants who were predominantly engaged in insurgent or terrorist operations against their governments, rather than in active international terrorist plots" (Zenko 2013, 10). Such instances lead to drone warfare camouflaging proxy wars fought by a powerful state to help another government, and not necessarily to defend itself against foreign suspects. To the two dimensions of just war theory--the justification for war (jus ad bellum) and the rules of engagement during war (jus in bello)--philosopher Michael Walzer (2004) in Arguing About War adds a third, justice after the war (jus post bellum) (viii). A good argument can be made that in drone warfare, the new dispensation of American empire, all three dimensions are skewed. The ethical conundrum is this: the US is engaged in a global hunt for people posing imminent danger to the country and scours the entire world for them without formal intimation or declarations of war; the US envelops entire regions and populations and subjects everyone, without distinction, to a surveillance regime to ferret out suspects and kill them; the US disposes of its targets without consistently verifying the proportionality of the strikes, because the targets are chosen by macroanalysing big data generated by covert digital surveillance.
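How such targets can be "chosen by macroanalysing big data" is worth making concrete. The Python sketch below is not any agency's actual system; its call records, scoring rule, and threshold are all invented here to show, in miniature, how a guilt-by-association heuristic over phone metadata behaves, and how it inevitably sweeps in whoever happens to share a handset or a contact, the failure mode Scahill and Greenwald call "death by metadata."

```python
# Toy guilt-by-association scoring over invented call metadata.
# Deliberately simplistic: the point is the heuristic's failure mode.
from collections import defaultdict

# Hypothetical call records: (caller_sim, callee_sim).
calls = [
    ("SIM-A", "SIM-X"),  # SIM-X is the watchlisted number
    ("SIM-B", "SIM-X"),  # SIM-B: a cousin borrowing the family phone
    ("SIM-B", "SIM-C"),  # SIM-C: a shopkeeper who has never met SIM-X
    ("SIM-A", "SIM-C"),
]
watchlist = {"SIM-X"}

# Pass 1: score direct contact with a watchlisted SIM.
score = defaultdict(int)
for a, b in calls:
    for sim, other in ((a, b), (b, a)):
        if other in watchlist:
            score[sim] += 2

# Pass 2: contact with a flagged contact also raises the score.
direct = dict(score)
for a, b in calls:
    for sim, other in ((a, b), (b, a)):
        if direct.get(other, 0) >= 2:
            score[sim] += 1

flagged = sorted(sim for sim, s in score.items() if s >= 2)
print(flagged)
# ['SIM-A', 'SIM-B', 'SIM-C', 'SIM-X']: the borrowed family phone and the
# uninvolved shopkeeper are flagged alongside the watchlisted number itself.
```

Nothing in the records distinguishes the phone's owner from the cousin who borrowed it; scaled up to millions of records and tuned by analysts rather than adjudicated in courts, that indistinguishability is what forecloses the separation of suspects from innocents.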
They also seek to resist the power of the “robotic imaginary,” which Jennifer Rhee (2018) describes as the “shifting inscriptions of humanness and dehumanizing erasures evoked by robots” that emerge in “the inextricable entanglement of ‘technology’ and ‘culture’.” She adds, “as a concept, the robotic imaginary offers the capacity to identify both an abiding vision of the human that is held up to be, however provisionally or circumscribed, universal, and the extensive erasures of human experiences that enable this inscription of the human” (2018, 5–6). Drone art and culture foreground the manner in which the human is constructed through a regime of surveillance that generates data repositories, which serve as the basis for algorithmically identifying human targets for threat removal. However, as the next section will show, producing the data and extrapolating the human from the data involves a struggle for the human. Drone art and culture foreground the multifarious dimensions of this struggle, in order not to restore a stable, fixed human entity but to resist digital networks and protocols with the power to adjudicate life and death through invasive biopolitical surveillance. It’s in art, literature, and culture that we see a struggle for the human play out with poignancy (Center for the Study of the Drone 2019). Muthyala: Drones and Surveillance Cultures in a Global WorldArt. 18, page 24 of 51 The struggle for the human in drone wars Operating Predator drones is not an easy task; it requires new skill sets and a new mode of understanding “battlefield,” “enemy,” “emergency,” and “collateral damage.” Just twenty-one years old when he started working as a drone pilot, Brandon Bryant operated from the Ground Control Station at Nellis air force base, close to Las Vegas, Nevada. In discussing his experiences as a remote pilot operating MQ-1B Predators flying over Afghanistan, Bryant notes that his squadron made 1,626 strikes; in dealing with the aftermath of each strike, Bryant eventually sought therapy and was diagnosed with post-traumatic stress disorder. He realized that “the job made him numb: a ‘zombie mode’ he slipped into as easily as his flight suit” (Power 2013). Bryant “sometimes felt himself merging with the technology, imagining himself as a robot, a zombie, a drone itself. Such abstractions don’t possess conscience or consciousness; drones don’t care what they mean, but Bryant most certainly does” (Power 2013). Surveilling targets and their habitations on pixelated screens for days and weeks on end and releasing Hellfire missiles that obliterated them with explosive power and, sometimes, finding out that the target’s identity was uncertain, their guilt not fully established, turned drone piloting into a job where ethics were always at risk of being compromised. Hovering virtually more than two miles above the earth to surreptitiously surveil people’s lives every day on computer screens in cockpits located thousands of miles away in Nevada, the drone pilot can discern a full range of personal and public behaviour of the people subjected to the drone’s watchful gaze. For drones to function as tools to carry out military or police missions, digital tools, software, and networks produce thousands of still and moving images and multimedia feed, which are amassed and assessed as large datasets. 
In tandem with intelligence reports, data is sorted, tagged, distributed, and mined, and made amenable to evaluation and assessment by data and military analysts, so as to identify suspects and launch missiles from remotely controlled armed drones to destroy targets. The role of human agency—an embodied sentient being feeling and thinking and deciding—becomes subordinated to the dynamics of data gathering, surveillance, and decision-making. Between the target and the drone pilot is a semi-autonomous, digitally run system that generates vast gigabytes of surveillance data. As this system multiplies its data and coordinates with a slew of other data structures and robotic systems to manage and pilot drone vehicles, surveillance becomes dataveillance: the pilot and target merge into a vast digital superstructure as nodes whose value and significance are internally assessed in relation to the purpose and viability of a military mission embodied in a global network of surveillance managed by the most powerful country on earth. Ethics becomes immanent to the form and function of dataveillance, a situation in which external points of reference for questioning the decisions and policies justifying drone strikes become harder to find, or redundant. Accidents or mistakes that cost human lives, and strikes in which innocent men, women, and children are wiped out with devastating missile power, are evaluated in terms offered by the digital structure and system: inputs and outputs, transmission protocols, evaluative criteria, collaboration among people reading and assessing a variety of datasets and military intelligence, the readability of still and moving images, and algorithmic machine learning that mines big data to generate patterns and trends, surveil populations, and identify targets. Put differently, human life is adiaphorized, as Zygmunt Bauman puts it. Adiaphorization refers to situations where "systems and processes become split off from any other consideration of morality […] surveillance streamlines the process of doing things at a distance, of separating a person from the consequences of action" (Bauman and Lyon 2013). An action becomes "neither good nor evil, measurable against technical (purpose-oriented or procedural) but not against moral criteria" (Bauman 1993, 125). The military designed software to model a drone strike in order to assess its strike capability and the surrounding damage. When drone pilots release missiles that rip apart or hollow out structures of steel, aluminum, iron, wood, earth, and human bodies, there is a splattering of things, and of blood and tissue; the result of a drone strike is uncannily rendered in the colloquial term given to this software program (now called Fast Assessment Strike Tool), designed to assess strike capability and damage: bugsplat (Cronin 2018, 2). The damage done by a drone attack is akin to bugs splattering on the windshield of a vehicle travelling at high speed. Because humans appear as bugs on pixelated screens, reduced to a visual blob when destroyed, there is human splatter, or bugsplat: hence the "collateral damage estimate methodology" (Department of Defense 2012).
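To make this logic concrete, consider a deliberately toy sketch, in Python, of the kind of metadata-driven scoring such a pipeline might perform. Every field name, weight, and threshold here is invented for illustration and describes no actual military system; the point is only that, once persons become records, ethical deliberation is reduced to tuning parameters.

```python
# Hypothetical illustration of metadata-driven "threat scoring"; all names,
# weights, and thresholds are invented, not drawn from any real system.
from dataclasses import dataclass

@dataclass
class Track:
    track_id: str           # an anonymised node in the network, not a person
    sightings: int          # times flagged in surveillance imagery
    flagged_contacts: int   # communications with already-listed tracks
    near_strike_zone: bool  # proximity to a designated area

def threat_score(t: Track) -> float:
    # Each weight is arbitrary; in the logic of adiaphorization, tuning
    # these numbers stands in for moral deliberation.
    score = 0.4 * min(t.sightings / 10, 1.0)
    score += 0.4 * min(t.flagged_contacts / 5, 1.0)
    score += 0.2 * (1.0 if t.near_strike_zone else 0.0)
    return score

STRIKE_THRESHOLD = 0.7  # a single parameter where jus in bello used to be

for t in [Track("node-017", 9, 4, True), Track("node-332", 2, 1, True)]:
    s = threat_score(t)
    print(t.track_id, round(s, 2), "target" if s >= STRIKE_THRESHOLD else "monitor")
```

Nothing in such a script records who "node-017" is; the question "should this person die?" never appears, only the question of whether a number crosses a line.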
To counter the invisible power of drone warfare, a collective of anonymous artists from America and Pakistan produced giant posters of victims of drone strikes and plastered them where the victims had been killed, in the Khyber-Pakhtunkhwa region of Pakistan. Featuring the photo of an innocent child whose parents were killed in a drone strike, the poster is enlarged enough to allow drone pilots to see not a bug-like pixel on a screen but the face of a human being whose life is impacted by armed drones. Interestingly enough, a photo of this poster was taken by a small camera-equipped drone and posted online at #NotABugSplat (https://notabugsplat.com/). As Rhee (2018) notes, "#NotABugSplat's representation of young drone victims is in tension with drone technology and the drone operator's labo[u]r, which trains them to view those who come into the frame of their drone surveillance as bugs or dehumanized and threatening racial Others" (164–5). In this public art installation, the aim to humanize victims re-orients the drone pilot's field of vision as his/her drone cameras surveil the terrain and send image feeds back to intelligence analysts and military brass. This reorientation of the field of vision is both literal and conceptual. At the literal level, what is remote and bug-like is replaced by its actual representation: the artistic rendition, in a poster photo, of a victim's visage and body. The technology to zoom in on a camera's subject and reveal its details comes up short in drone video feeds, where the subject's human features are pixelated into non-human entities like bugs. Rather than covering the site or hiding it from drone operators, the artists explicitly foreground the killing site with enhanced pictures, so that the literal field of vision of the drone pilot takes in a different terrain, one re-mapped by human actors on the ground. At the conceptual level, this enhancement of the subject, now dead or living through the trauma of being victimized in drone strikes, serves to change the logic of adiaphorization in dataveillance into one of human calculation in daily life: drone warfare is not an autonomous, self-engineered mode of waging battle but one in which human beings use digital technologies to fulfill foreign policy and military objectives. The giant poster thus shortens the literal and conceptual distance: literally, by enlarging the subject's image to make it easier for the drone camera to locate it, and psychologically, by closing the distance between the drone pilot and drone technology, with the hope that the tendency to automatize drone war is undercut by the pilot's empathy for the actual or intended victim of future strikes. The giant poster serves to highlight the past (drone strikes killed innocent people) and foreground the present (local and other human agents register their views of the strike by signaling who was victimized), so that the future will be bereft of such strikes (drone pilots realize the human cost of drone wars and refrain from firing missiles).
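The scale at work here can be made concrete with some back-of-the-envelope arithmetic. In the sketch below, every parameter is an illustrative assumption rather than the specification of any actual sensor; it shows why a person can register as only a few pixels to a camera more than two miles above the earth, while a giant poster cannot be mistaken for a bug.

```python
# Rough ground-resolution arithmetic; altitude, field of view, and sensor
# width are illustrative assumptions, not real drone camera specifications.
import math

def metres_per_pixel(altitude_m: float, fov_deg: float, pixels_across: int) -> float:
    # Width of the ground footprint seen by the camera, divided by pixel count.
    footprint_m = 2 * altitude_m * math.tan(math.radians(fov_deg) / 2)
    return footprint_m / pixels_across

gsd = metres_per_pixel(
    altitude_m=3200,    # roughly "more than two miles above the earth"
    fov_deg=2.0,        # assumed narrow zoom setting
    pixels_across=640,  # assumed sensor width in pixels
)

print(f"ground resolution: {gsd:.2f} m per pixel")
print(f"person (~0.5 m across): ~{0.5 / gsd:.0f} pixels")  # a bug-like blob
print(f"poster (~30 m across): ~{30 / gsd:.0f} pixels")    # an unmistakable face
```

Under these assumptions a human body spans about three pixels while a thirty-metre poster spans well over a hundred, which is precisely the gap the artists exploit.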
In addition, the poster functions as a geo-tagger: it memorializes the victims while documenting history in local topography. Its historical accounting involves a remembrance with geo-spatial and temporal coordinates: time and location, space and place are crisscrossed with the explicit purpose of countering the adiaphorization of drone warfare. By taking pictures of the giant poster with a camera-equipped mini drone and broadcasting them in digital spaces that can be viewed by millions across the world, these dissenters enact an artistic politics of adaptation and subversion: drone technology is used not to kill or maim or surveil but to relocate the drone that kills and maims and surveils within a re-mapped topography that explicitly foregrounds the ethically compromised effects of drone warfare. Where the US military cannot or does not (or does so surreptitiously) keep records of civilian casualties of drone strikes, the artists publicize history by documenting the locations and victims of strikes and exhibiting them for the public and the drone pilot. This artistic creation installs drone war in public memory by subverting the use of drone technologies for ends that directly counter those of the drone pilots and their commanders. The giant photo makes public what the drone operators would prefer remain private; it registers the innocent victims of drone wars where the drone operators see bug splats; it interrupts the drone pilot's field of vision by serving as a constant signifier of the ethical dimension of drone warfare, one over which the drone pilot has little control. The poster has to be obliterated with another drone strike or constantly made part of the surveilled topography, which means practicing studied indifference or wanton forgetfulness; both actions place the onus squarely on the shoulders of drone pilots. Such art reinserts what drone warfare actively seeks to silence: the humanity of drone strike victims.

Drone art and politics

Where #NotABugSplat seeks to reinsert the human into a war whose techniques are virtual but whose results are materially deadly, Pakistani-American artist Mahwish Chishty seeks to change the symbolic meaning of the drone, from an object associated with American empire and postmodern violence effected through virtual means to an object worthy of artistic curiosity. She seeks to abstract the drone from its militarized setting and turn it into a canvas on which local Pakistani truck-art practices can be painted, so that the drone is delinked from foreign state violence and turned into a tool or site for creative experimentation with local culture. However, the delinking is not an act of transposing politics into art, of moving from one medium or modality into another, but of juxtaposing the political and the artistic, or, better still, of showing their imbrication, in order to reveal the contradictory, circumstantial nature of aesthetic production, where national and international interests do not undermine local specificities, while local specificities are not granted a monopolizing power to determine the terms of aesthetic and political engagement. Featured at www.mahachishty.com are more than a dozen gouache paintings on paper, handmade paper, birch plywood, and Masonite boards. Drones are painted in many shapes, with the MQ-9 Reaper, a widely used armed US drone, as the prominent design.
Chishty draws from the folk painting traditions of Pakistan's trucking industry, in which carvings, bright colours, mirrors, calligraphy, and paint are used to adorn trucks, often at considerable cost to their owners. Trucking in Pakistan is a major industry, as its roadways are used more than its waterways, railways, and airways for freight and public transportation. About 60% of its 258,000-kilometre road network is paved, and the Bedford Rocket, an iconic British-based truck brand, now shares popularity with Hino and Nissan models from Japan (Elias 2011, 55, 58). Truck art, observes Jamal E. Elias (2011) in On Wings of Diesel: Trucks, Identity, and Culture in Pakistan, is "a function of visual culture as a window into the structure and politics of contemporary societies" (12) (Figures 3 and 4). Truck drivers are not the sole initiators of truck painting but are usually intermediaries between owners and painters, who often have different intentions for painting: the owners seek to make a business statement and establish uniqueness in the market, and painting also gives them a chance for personal expression, since they can specify subject, theme, and colour to the painters; the painters, for their part, belong to a large circle of locally based small businesses run individually or in groups. Calligraphy in Urdu and English, for instance, signals the owner's familiarity with official or mainstream culture; on roads where top speeds are not feasible, which decreases the likelihood of wear and tear on the vehicle, decorative items like pinwheels are used inside trucks, increasing the longevity of the art décor.

Figure 3: Truck Art, Islamabad, Pakistan.

Figure 4: Truck Art, Karachi, Pakistan.

Elias (2005) further notes: "The motifs on trucks display not just aesthetic considerations, but attempts to depict aspects of the religious, sentimental and emotional worldviews of the individuals employed in the truck industry. And since trucks represent the major means of transporting cargo throughout Pakistan, truck decoration might very well be this society's major form of representational art." He distinguishes among five regional styles: Rawalpindi (stylized cowlings, appliqués of plastic), Swat (wooden door carvings and metal hammered into shapes), Peshawar (a mix of the previous two styles, using carvings, metal, cowlings, and paint), Baluch (chrome cowlings and complex, ornate designs patterned into mosaics), and Karachi (the biggest truck centre, showcasing all styles, with woodcuts and wide colour spectrums). Subjects of decorative art include figures from religious, political, and everyday culture, women, and personal art or objects as talismans (Elias 2005). Chishty uses many of these elements in painting drones, which she renders in a variety of shapes: some are small, sharp triangulations with boomerang shapes akin to the X-47B; some have bulky, oval front-ends akin to Reaper and Predator drones; some are cast as twins joined at the back with two fronts facing opposite sides; some appear like thin butterflies in flight; and others have a burst of colourful missiles falling downwards from a flying drone.
In an interview with Josh Harkinson of Mother Jones magazine, Chishty observes that her aim in painting drones this way was to make them "friendlier looking, instead of such hard-edged, metallic war machines" (Harkinson 2013). When asked if she were viewing militarized weapons idealistically, Chishty replies, "I don't know if I am glorifying it. I just want people to talk about it. At the same time, it has some kind of beauty to it. I am also looking at them as objects, and not as much as war machines" (Harkinson 2013). To her, just as truck drivers decorate their trucks ornately and in distinctive styles, a practice she views primarily as aesthetic expression, painting drones with Pakistani folk art means using local culture to turn an object associated with death and war into an object of aesthetic contemplation. In "By the Moonlight," a gouache painting on birch plywood, Chishty portrays the front underside of a wide-angled drone in green with decorative patterns of white appearing as conjoined shapes; the middle body is yellow and the tail-end is blue, with the wings rendered in darkened peach and around twelve semi-circular shapes, their borders lined in blue and yellow, adorning each wing side. This colourful drone is placed at the centre of what appears to be a modern street etched into the plywood with tea stain. Several electric poles with wires line each side of the street, along with multi-storied buildings. The contrast is sharp but not jarring. While the lack of colour in the scene in which the drone is placed suggests its destructive force, it can also be viewed as an attempt to make the drone appear pleasant, colourful, and worthy of beautiful self-expression à la truck drivers styling their trucks (Figure 5). Put differently, Chishty is not practicing representational art in the general sense of using Pakistani truck art to depict realistic drone strikes or their repercussions on property, land, or humans; she is using local art to express, individually, her desire to counter the dominant perception of drones as objects of violence by turning them into colourful cultural artifacts. Many of them unambiguously titled after formal terms used in military jargon—RQ 170: the Beast of Kandahar, Hovering Reaper, Predator, Black Hawk, X-47B—the paintings evoke truck art in loud, pleasing colours, woodcuts, embroidered cloth, talismans, metal works, calligraphy, and religious and cultural symbols (Figure 6).

Figure 5: "By the Moonlight" by Mahwish Chishty.

Figure 6: "Reaper Drone" by Mahwish Chishty.

Meghan Neal (2013) calls such work a form of cultural repurposing: "Drone art can be seen as a form of reappropriation—taking back something that in the popular consciousness is so often a symbol of death and destruction and making it something beautifully provocative, even hilarious." Along similar lines, Anike (2013), writing in Muslim Media Watch at Patheos.com, points out, "Chishty's drone art is reappropriation; it questions the popular image of the drone as an icon of death and destruction and thus in its own way protests this symbol by choosing to view drones as objects, not just as war machines." However, while many online commenters support Chishty's views expressed in her interview with Josh Harkinson at Mother Jones, others voice strong disagreement with her choice of subject and her artistic work.
One among them, Mariam Sabri, pointedly counters the supportive comments by noting, "I've been having discussions with a few artists, those who are involved with political advocacy through art, and an art teacher in Pakistan about this. We all feel collectively sickened after reading Mahwish Chishty's interview" (Harkinson 2013). Sabri calls such drone art "silly," "insensitive," and "deluded," because "she [Chishty] clearly seems to be depoliticizing drones" (Harkinson 2013). Sabri's criticism is not without merit given Chishty's observations in the interview: "I don't know if I am glorifying it. I just want people to talk about it. At the same time, it has some kind of beauty to it. I am also looking at them as objects, and not as much as war machines" (Harkinson 2013). The key issue here is whether the appreciation of beauty is possible for people who experience the horror of drone strikes and the constant unease of living under drone surveillance. Even if we grant that it is theoretically or experientially possible, the question is, to what extent? In other words, what are the politics of location in cultural production and reception? Does where we are determine how we view art and culture? Evidently, yes. Chishty's strategic move to wrest drone technology out of the discourse and activity of warfare is predicated on the idea that art ought to function in autonomous, or, better yet, depoliticized spaces. Speaking of truck art, Chishty says that truckers "spend so much time on it and they don't get any funding. This is something that they do, just a personal interest. It has no reason whatsoever other than just an aesthetic sense" (Harkinson 2013). But aesthetic work, as Jamal Elias's anthropological analyses of truck art show, moves beyond personal, artistic expression to collective representation of trucking culture: the travails of truck drivers, the sense of home they create and evoke on the road, the geographic differences that influence their choice of themes, and so on. In other words, truck art is woven into Pakistani trucking culture. Chishty's approach draws on contemporary US-Pakistan politics about drones to highlight drones as aesthetic objects, which is a profoundly political act, but it justifies this politics on the grounds of aesthetic autonomy. What needs underscoring is the potential for slippage between intent and interpretation: wanting people to talk about drones might well lead people to talk about drones primarily as works of art or only as tools of war. This contradicts the fact that the very purpose of her drone art is to counter the dominant impression of drones as tools of violence, an impression based not on aesthetic insistence (the US military is not advocating that Pakistanis view drones as art objects even as it launches drone strikes) but on verifiable history (drone strikes have killed and destroyed people and infrastructure) (Figures 7 and 8).

Figure 7: Truck Wheel Art.

Critics who dismiss Chishty's work as insulting to people whose lives were wrecked by drone missiles miss, understandably, the political import of her emphasis on drone aesthetics, which seeks to grasp the drone primarily as technology, a tool built by human beings to accomplish certain ends.
That it is currently used in warfare should not obscure the fact that, as a technology, the drone is amenable to other uses, including creative ones that can bring the social and material impact of drone strikes into broader public spaces, a move that can shed light on the geopolitical imbalances structuring drone warfare. Her focus on individual freedom to pursue creative expression by appropriating a tool that has become a potent weapon of war towards non-military ends can be viewed as an attempt to re-centre the human subject that the drone, by its very nature, seeks to de-centre through data mining, algorithmic calculation, distant reading, and macroanalysis, what Bauman refers to as adiaphorization, as we have earlier seen.

Figure 8: Mahwish Chishty's "Hellfire Missile."

Chishty pushes this view further in the video art "Predator," which can be projected into dark areas for a performative event. The video, available on Vimeo (https://vimeo.com/129010049), runs for 5 minutes and 27 seconds. Centred and taking up the entire screen is a colourful image of a drone, speckled and painted with truck-art colours and images. In the first minute, a hissing sound, almost a screech, builds into a crescendo of Aztec death rattles, the sounds produced when one blows air into the skull-shaped artifacts unearthed by archeologists in Mexico (Watson 2008). The sounds of these skull whistles are nerve-wracking, because they seem to condense a thousand screams, which is why they are also referred to in the vernacular as the "scream of a thousand corpses," ostensibly a reference to the manner in which the Aztecs used the whistles for ceremonial rites and to intimidate enemies or ward off threats. In a minute or so, we see and hear the drone launch a strike, but for almost three minutes the drone simply hovers, closing and opening its eyes; it hovers and hovers; that is, as we have seen, the drone hovers because it is constantly surveilling individuals, groups, and populations. Then, in the last minute of the video, the ominous wailing returns, to end with a drone strike. In video and animation, mixed with painting and sound, Chishty brings aesthetics and politics into open collision—the secret wars of drones are rendered aesthetically, not to displace politics with aesthetics, but to put politics and aesthetics into constant, creative tension. The drone is no longer a depersonalized weapon of war; it is an aesthetic creation that can also be turned into a tool for violence. It is this double-sidedness of creative political expression, which repurposes or reappropriates in order to juxtapose rather than replace, that is the unique feature of Chishty's art and installations.

Drones and surveillance in popular culture

The impulse to use drones aesthetically also finds expression in Pashto culture and literature. In "Impact of War on Terror on Pashto Literature and Art," published in March 2014 by the Federally Administered Tribal Areas (FATA) Research Centre in Islamabad, Pakistan, the impact of war is generally divided between pre-9/11 and post-9/11 periods (FATA 2014). Nature, romance, landscape, individual dreams, love, desire, and friendship are the thematic concerns of the pre-9/11 period; with the start of the war, changes become apparent as poets and artists begin to shift focus to the devastating effects of war on communities small and large, village and semi-urban.
Genres like the ghazal, nazm (Pashto poems), tappa, and jihadi tarana (anthem) all register this shift in focus. Popular and well-regarded artists who have engaged with this shift include Salim ur Rehman Salim, Muqadar Shah Muqadar, Akbar Sayal, Ajmal, Bakht Sher Aseer, Shabab Ranizai, Roshan Bangash, Ata Muhammed Wardag, Rehmat Zalmai, and Syeda Haseena Gul, among many others (FATA 2014). It would be a mistake, however, to romanticize the pre-9/11 period: the Soviet invasion of 1979, and the occupation that lasted more than a decade, had noticeable effects on art and literature among Pashtuns. What makes this periodizing important is the extent to which military themes of war, loss, devastation, enemies, invasion, destruction, and death, and their associated symbols, permeate creative activity. Responses to this war range from extreme anti-Americanism, in which the West becomes the First Cause of war and therefore needs to be countered militarily, politically, and culturally, to broader explorations of how peoples living under the constant threat of military action, or in militarized regions, experience its effects on personal and public psyches. In jihadi taranas, the Manichean dichotomy of the West and Afghani/Pashtun identity is explicit and is generally oriented towards inciting readers to protest and rise up against oppressive foreign powers. The output in this genre, however, is limited, while the political manifestation of this ideology in the Taliban and other such entities is undeniable (FATA 2014). This does not mean that pro-Taliban materials are not read widely. In Mohalla Jangi (Neighbourhood of War), Peshawar, Pakistan, there are 2,000 printing presses, some of which regularly print materials supporting the Taliban, Islamic radicalism, and anti-Americanism (Siddiqui 2012). In art, poems, ghazals, and tappas, artists and writers view the landscape with less thrall because it is pockmarked with the effects of war; there is mourning and sadness in witnessing the changing landscape, which makes habitation increasingly difficult and associates it with police actions and American military presence, on the one hand, and extremist, fundamentalist groups eager to subjugate and control society, on the other. Over the last three years, two songs by Pakistani Pashto singer Sitara Younas have received considerable attention on YouTube and in Pakistani regional popular culture. Her "Khud Kasha Dhamaka Yama" can be translated as "I am a suicide bomber." Part of the lyrics include, "Don't chase me. I am an illusion. I am a suicide blast." Written by Pashto writer Rashid Johar and composed by Pashto musician Shakir Zeb, the song uses the ongoing US-Afghanistan and Pakistan military activities against terror groups as material for songwriting and singing (Ali 2011). Its explicit analogy between one smitten with amorous desire for another and the unexpected, shadowy power of a suicide bomber has drawn public attention, with journalists like Manzoor Ali paraphrasing poet Farooq Firaq, who says that "suicide attacks have left deep imprints on our society and that such songs are a result of overall negativity in society" (Ali 2011). Firaq "proposes establishing a censor board—comprising of actors, writers and elders—to oversee and filter such content" (Ali 2011).
We see here the lasting effects of wars and of police and military missions on people living in these societies. The song is not designed as propaganda to convince young people, especially those disillusioned or frustrated with their lives, to become true believers in radical Islam and glorify the act of killing others through suicide; it is a registering of everyday life and of the complex ways in which some people use the ideas and events they are familiar with to make sense of other aspects of their lives, infusing new symbols and analogies that dramatize the dynamics of young love, romance, heroism, risk, danger, and yearning, to wit, the stuff of which dreams are made in human societies. Younas' second song pushes the envelope further in "Za Kaom Pa Stargo Stargo Drone Hamla," which translates as "My gaze is as fatal as a drone attack." Penned and given melody by Pashto director Maas Khan Wesal, the song was performed by actress Dua Qureshi in an episode of the television film "Da Khkulo Badshahi Da," produced by Khans Productions (Khan 2012). A translation of parts of the song reads thus:

My gaze is as fatal as a drone attack / The touch of my lips sweeten words
Intoxicating wine are my looks / My gaze is as fatal as a drone attack
Coquettish stare is a snare of beauty / Smile fresh as early morning dew
Ensnares lovers with amorous pangs / My gaze is as fatal as a drone attack
O lovers! Go through a lover's agony / A leaping flame and a rose bud
The clink of my bangles leaves one enchanted / My smile rustles desires in many a heart
Tests lovers' courage / My gaze is as fatal as a drone attack
My beauty and body / At its prime
Leaves many going astray / My gaze is as fatal as a drone attack. (Khan 2012)

The singer recognizes the power that she, a woman, wields over a man; she is confident of her attractive looks as she croons that "the clink of my bangles leaves one enchanted" and "my smile rustles desires in many a heart." Her attractive features are so compelling that they heighten the desire of lovers to the point where their commitment to each other is tested, because her "beauty and body at its prime, leaves many going astray." This woman knows she can "sweeten" her utterances and disorient others with her beauty such that they lose their senses. The force of these sentiments is echoed repeatedly in the refrain "My gaze is as fatal as a drone attack." The link between drones and fatality is one of certainty. Drones are deadly weapons of war; they do make mistakes when they kill suspects, targets, and civilians, but what cannot be doubted is a simple certainty—they destroy, they kill. The power of the drone in this song derives less from the drone's technological capacity to unleash missiles from thousands of feet in the air and find targets with accuracy than from its "gaze" that is "fatal." In a neat stroke of lyricism, dance, and sentiment, the song captures the problematic nature of postmodern war: drones and surveillance cultures. Without the ability to subject a people to constant, detailed surveillance, drones lose their power as tools of violence. It is the drone's unique, invisible ability to gaze at the other that makes the other succumb to the drone's missile. Implicit here is the idea that to counter the gaze of this seductive woman, the lover has to resist her at the level of her gaze; he has to turn that gaze around or ensure that he cannot be located in her field of vision.
In other words, he has to contest the power of her surveillance, which registers the disorienting effects she has on him. But that is precisely what he cannot do; hence the deadly accuracy of the woman's power: "my gaze is as fatal as a drone attack." Not surprisingly, such cultural interweaving of death, violence, romance, and love generated strong disapproval, even talk of censoring cultural production. Gul Nazir Mangal, an artist from Waziristan, a region administered by Pakistan, says, "We should not be proud of these attacks, which are being carried out by foreigners on our land. This needs to be condemned instead of making songs and dancing on its tunes," because such songs are "not only harmful to culture and literature, but also create a sense of disunity amongst the people" (Khan 2012). Officials should, suggests Mangal, set up a censor board to check cultural content before it is released to the public. Arshad Ali, another musician, reiterates this, saying, "It's not appropriate to incorporate drone attacks in music as it's a grave issue faced by our country. Each artist has a certain responsibility towards society" (Khan 2012). But what is the nature of this responsibility when it comes to digital technologies, drones, surveillance, and networks? We cannot address these issues unless we frame them within global contexts, as we have seen in this essay. Drones and surveillance are woven into digital networks that not only connect different countries but impact individuals, groups, and entire populations around the world; it is hardly surprising, then, that cultural engagement with drones and their effects, and the vexing issues of authority, representation, intention, and social purpose, have transnational dimensions. For more than a month starting in January 2014, the Ann Arbor Art Center in Michigan held a special gallery exhibition featuring the work of more than forty artists on the subject of drones. The Center explained its choice of subject thus:

Drones are the quintessential object of the 21st century. They are revolutionizing global warfare and domestic and foreign surveillance, galvanizing the creative impulse, and challenging democratic principles and personal values around the globe. They are changing the way we work, play, battle, and live in the 21st century. (Ann Arbor Art Center 2014)

"Galvanizing the creative impulse" aptly characterizes the artistic and cultural activity around drones over the last decade. It is an international phenomenon, with artists in Afghanistan, Pakistan, England, and America boldly and creatively thinking about and using drones: not just armed drones but drones as a new technological artifact with a unique ability to reorient us to space and time. But as we have seen, the artistic impulse about drones moves well beyond this laudable goal even as it stresses its humanizing potential. Drone art has become cultural life: people are painting drones literally and digitally; they are using mixed media to generate new juxtapositions of ideas and symbols; they are singing about them in telefilms in Pashtun Afghani societies; they are making paper or cloth imprints to attract drone operators; they are using drones in live dance performances; they are rewiring them for paint bombing, or graffiti art.
The digital, the arts, and the humanities become entwined in an act of creative exploration that allows suppressed voices to be heard, registers in public discourse the unacknowledged effects of invisible wars, and digitally enables human presence and the quest for dignity to find transnational resonance in a global world. To conclude: when we move beyond computational humanities to study the imbrication of the digital—as technology, tool, ideology, and episteme—in drones and surveillance, we bridge the digital humanities to postcolonial digital humanities by foregrounding a new biopolitical reality in which digital technologies fundamentally alter established notions of war and peace, guilt and innocence, privacy and the common good. Such a bridging involves, as Roopika Risam (2019) aptly puts it, "praxis at the intersection of digital technologies and humanistic inquiry: designing new workflows and building new archives, tools, databases, and other digital objects that actively resist reinscriptions of colonialism and neocolonialism" (4). If we do not move beyond the computational humanities to examine the governmental and military institutions that establish sophisticated, transnationally networked digital regimes to surveil peoples and kill terror suspects while also killing civilians, the threat to liberal democracy will increase, not decrease. We need to infuse not only the digital into the humanities but the humanities into the digital; that is, we need to apply humanities approaches to examine how social and political organizations thrive on constant technological innovation to realize national security goals at the expense of robbing thousands of people of their rights to privacy and dignity. It involves making the digital humanities public by widely disseminating specialized DH research to general, non-academic audiences, and bringing DH tools and humanities methodologies to bear on domestic and foreign policy, military practices, discourses of exceptionalism, imperial worldviews, in short, on matters of public concern; it involves drawing on complex fields of cultural and social production to enrich our understanding of the human in a digital age, shape our scholarly endeavours, and inform our pedagogical practices. By affirming the human dimensions of surveilled subjects and examining the trans-territorial networks of surveillance in post-colonial societies, we can try to nullify, prevent, blunt, or deflect the same logic of national security being applied to us, right here in America, in American towns, counties, and cities. But that can yet happen, unless we rigorously study, question, and publicly engage with, adapt, re-orient, and transform the cultural and political dimensions of digital technologies.

Acknowledgements

I thank the reviewers of this article for giving detailed, helpful suggestions. Special thanks to Mahwish Chishty for permission to use her paintings.

Competing Interests

The author has no competing interests to declare.

References

AeroVironment. 2019. "Media Gallery: Unmanned Aerial Systems." Accessed June 17, 2019. https://www.avinc.com/media_center/unmanned-aircraft-systems.
Agius, Christine. 2017. "Ordering without Bordering: Drones, the Unbordering of Late Modern Warfare and Ontological Insecurity." Postcolonial Studies 20(3): 370–86. DOI: https://doi.org/10.1080/13688790.2017.1378084
Ali, Manzoor. 2011. "Khud Kasha Dhamaka Yama: The Song's a Blast." The Express Tribune. November 26. Accessed June 17, 2019. http://tribune.com.pk/story/298042/khud-kasha-dhamaka-yama-the-songs-a-blast/.
Allington, Daniel, Sarah Brouillette, and David Golumbia. 2016. "Neoliberal Tools (and Archives): A Political History of the Digital Humanities." Los Angeles Review of Books. May 1. Accessed June 17, 2019. https://lareviewofbooks.org/article/neoliberal-tools-archives-political-history-digital-humanities/.
Anike. 2013. "The Colorful Drones of Mahwish Chishty." Patheos. July 1. Accessed June 17, 2019. http://www.patheos.com/blogs/mmw/2013/07/the-colourful-drones-of-mahwish-chishty/.
Ann Arbor Art Center. 2014. Drones 2014. Accessed June 23, 2019. https://www.annarborartcenter.org/drones-2014/.
Bauman, Zygmunt. 1993. Postmodern Ethics. Cambridge, MA: Blackwell.
Bauman, Zygmunt, and David Lyon. 2013. Liquid Surveillance: A Conversation. Malden, MA: Polity Press. Kindle edition.
Becker, Jo, and Scott Shane. 2012. "A Measure of Change: Secret 'Kill List' Proves a Test of Obama's Principles and Will." The New York Times. May 29. Accessed June 17, 2019. http://www.nytimes.com/2012/05/29/world/obamas-leadership-in-war-on-al-qaeda.html?pagewanted=1&_r=2&pagewanted=all&#p[TMATMA].
Benjamin, Medea. 2013. Drone Warfare. New York: Verso.
Bergen, Peter, and Megan Braun. 2012. "Drone is Obama's Weapon of Choice." CNN Opinion. September 19. Accessed June 17, 2019. http://www.cnn.com/2012/09/05/opinion/bergen-obama-drone/index.html.
Berry, David M. 2012. "Introduction: Understanding the Digital Humanities." In Understanding the Digital Humanities, edited by David M. Berry. New York: Palgrave Macmillan. DOI: https://doi.org/10.1057/9780230371934
Berry, David M., and Anders Fagerjord. 2017. Digital Humanities: Knowledge and Critique in a Digital Age. Malden, MA: Polity Press.
Brennan, Timothy. 2017. "The Digital-Humanities Bust." The Chronicle of Higher Education. October 15. Accessed June 17, 2019. https://www.chronicle.com/article/The-Digital-Humanities-Bust/241424.
Burdick, Anne, Johanna Drucker, Peter Lunenfeld, Todd Presner, and Jeffrey Schnapp. 2012. Digital_Humanities. Cambridge: MIT Press.
Burghart, Marjorie. 2013. "The Three Orders or Digital Humanities Imagined #dhiha5." Digital Humanities à l'IHA. April 28. Accessed June 17, 2019. http://dhiha.hypotheses.org/817.
Cascone, Kim. 2000. "The Aesthetics of Failure: 'Post-Digital' Tendencies in Contemporary Computer Music." Computer Music Journal 24(4): 12–8. DOI: https://doi.org/10.1162/014892600559489
Cavallaro, James, Stephan Sonnenberg, and Sarah Knuckey. 2012. Living Under Drones: Death, Injury and Trauma to Civilians from US Drone Practices in Pakistan. Stanford: International Human Rights and Conflict Resolution Clinic at Stanford Law School; New York: Global Justice Clinic at NYU School of Law. Accessed June 17, 2019. https://law.stanford.edu/publications/living-under-drones-death-injury-and-trauma-to-civilians-from-us-drone-practices-in-pakistan/.
Center for Digital Research and Scholarship. 2011. "Research Without Borders: Defining the Digital Humanities." Columbia University. April 6. Accessed June 17, 2019. http://www.youtube.com/watch?v=Xu6Z1SoEZcc.
Center for the Study of the Drone. 2019. "Understanding the Drone Through Art." Bard College. Accessed June 17, 2019. https://dronecenter.bard.edu/multimedia-portals/portal-drone-art/.
Chamayou, Grégoire. 2015. A Theory of the Drone. New York: The New Press.
Clark, Lindsay C. 2018. "Grim Reapers: Ghostly Narratives and Killing in Drone Warfare." International Feminist Journal of Politics 20(3): 602–23. DOI: https://doi.org/10.1080/14616742.2018.1503553
Clarke, Roger A. 1988. "Information Technology and Dataveillance." Communications of the ACM 31(5): 498–512. DOI: https://doi.org/10.1145/42411.42413
Cong-Huyen, Anne. 2013. "#CESA2013: Race in DH – Transformative Asian/American Digital Humanities." September 24. Accessed June 17, 2019. http://anitaconchita.wordpress.com/2013/09/24/cesa2013-race-in-dh-transformative-asianamerican-digital-humanities/.
Critical Inquiry. 2019. "Computational Literary Studies: A Critical Inquiry Online Forum." In the Moment. March 31. Accessed June 17, 2019. https://critinq.wordpress.com/2019/03/31/computational-literary-studies-a-critical-inquiry-online-forum/.
Cronin, Bruce. 2018. Bugsplat: The Politics of Collateral Damage in Western Armed Conflicts. New York: Oxford University Press.
Da, Nan Z. 2019. "The Computational Case Against Computational Literary Studies." Critical Inquiry 45: 601–39. DOI: https://doi.org/10.1086/702594
De Luce, Dan, and Paul Mcleary. 2016. "Obama's Most Dangerous Drone Tactic Is Here to Stay." Foreign Policy. April 5. Accessed June 17, 2019. https://foreignpolicy.com/2016/04/05/obamas-most-dangerous-drone-tactic-is-here-to-stay/.
Department of Defense. 2012. "No-Strike and the Collateral Damage Estimation Methodology." Chairman of Joint Chiefs of Staff Instruction 3160.01A. Washington, DC.
Drew, Christopher. 2009. "Drones are Weapons of Choice in Fighting Qaeda." The New York Times. March 16. Accessed June 17, 2019. https://www.nytimes.com/2009/03/17/business/17uav.html.
Elias, Jamal. 2005. "On Wings of Diesel: The Decorated Trucks of Pakistan." Amherst College Magazine. Accessed June 17, 2019. https://www.amherst.edu/amherst-story/magazine/issues/2005_spring/wings.
———. 2011. On Wings of Diesel: Trucks, Identity, and Culture in Pakistan. Oxford, England: One World Publications.
FATA (Federally Administered Tribal Areas) Research Center. 2014. Impact of War on Terror on Pashto Literature and Art. Islamabad, Pakistan.
Fish, Stanley. 2018. "Stop Trying to Sell the Humanities." The Chronicle of Higher Education. June 17. Accessed June 17, 2019. https://www.chronicle.com/article/Stop-Trying-to-Sell-the/243643.
———. 2019. "Computational Literary Studies: Participant Forum Responses, Day 3." Critical Inquiry. In the Moment. April 3. Accessed June 17, 2019. https://critinq.wordpress.com/2019/04/03/computational-literary-studies-participant-forum-responses-day-3-5.
Foucault, Michel. 1997. Society Must be Defended: Lectures at the Collège de France, 1975–76. New York: Picador.
Gertler, Jeremiah. 2012. "U.S. Unmanned Aerial Systems." Congressional Research Service. U.S. Department of State. January 3. Accessed June 23, 2019. https://pdfs.semanticscholar.org/b1c9/1702837787bb72dbde6affec193996a51ce0.pdf?_ga=2.177205733.1390175729.1561346061-577688779.1561346061.
Gold, Matthew K. 2012a. "The Digital Humanities Moment." In Debates in the Digital Humanities, edited by Matthew K. Gold. Minneapolis: University of Minnesota Press. Accessed June 24, 2019. http://dhdebates.gc.cuny.edu/debates/text/2.
———. 2012b. Debates in the Digital Humanities. Minneapolis: University of Minnesota Press. Accessed June 24, 2019. http://dhdebates.gc.cuny.edu/.
Gregory, Derek. 2011. "The Everywhere War." The Geographical Journal 177(3): 238–50. DOI: https://doi.org/10.1111/j.1475-4959.2011.00426.x
———. 2014. "Drone Geographies." Radical Philosophy 183. Accessed June 17, 2019. https://www.radicalphilosophy.com/article/drone-geographies.
Gusterson, Hugh. 2016. Drone: Remote Control Warfare. Cambridge, MA: MIT Press. DOI: https://doi.org/10.1063/1.5009234
Haggerty, Kevin D., and Richard Ericson. 2007. "The Surveillant Assemblage." In The Surveillance Studies Reader, edited by Sean P. Hier and Joshua Greenberg, 104–16. New York: Open University Press.
Hall, Gary. 2011. "The Digital Humanities Beyond Computing." Culture Machine 12: 1–11.
Hambling, David. 2009. "Special Forces Gigapixel Flying Spy See All." Wired. February 12. Accessed June 23, 2019. http://www.wired.com/dangerroom/2009/02/gigapixel-flyin/.
Harkinson, Josh. 2013. "Friendly Fire: Drones as Folk Art." Mother Jones. June 24. Accessed June 17, 2019. http://www.motherjones.com/media/2013/06/pakistani-drone-art-mahwish-chishty.
Hensley, Nathan K. 2018. "Drone Form: Mediation at the End of Empire." Novel: A Forum on Fiction 51(2): 226–49. DOI: https://doi.org/10.1215/00295132-6846084
Higgin, Tanner. 2010. "Cultural Politics, Critique and the Digital Humanities." May 5. Accessed June 17, 2019. http://www.tannerhiggin.com/cultural-politics-critique-and-the-digital-humanities/.
Hindley, Meredith. 2013. "The Rise of the Machines." Humanities: The Magazine for the National Endowment for the Humanities 34(4). Accessed June 17, 2019. http://www.neh.gov/humanities/2013/julyaugust/feature/the-rise-the-machines.
Jockers, Matthew. 2013. Macroanalysis: Digital Methods and Literary History. Chicago: University of Illinois Press. ProQuest Ebook Central. DOI: https://doi.org/10.5406/illinois/9780252037528.001.0001
Kaplan, Fred. 2013. "The World as Free Fire Zone." MIT Technology Review. June 7. Accessed June 17, 2019. http://www.technologyreview.com/featuredstory/515806/the-world-as-free-fire-zone/page/2/.
Kennedy, Greg. 2013. "Drones: Legitimacy and Anti-Americanism." Parameters 42(4)/43(1): 25–8.
Khan, Hidayat. 2012. "My Gaze is as Fatal as a Drone Attack." The Express Tribune. September 18. Accessed June 17, 2019. http://tribune.com.pk/story/438610/my-gaze-is-as-fatal-as-a-drone-attack/.
Kirsch, Adam. 2014. "Technology is Taking Over English Departments." The New Republic. May 2. Accessed June 17, 2019. https://newrepublic.com/article/117428/limits-digital-humanities-adam-kirsch.
Koh, Adeline. 2014. "Niceness, Building, and Opening the Genealogy of the Digital Humanities: Beyond the Social Contract of Humanities Computing." Differences: A Journal of Feminist Cultural Studies 25(1): 93–106. DOI: https://doi.org/10.1215/10407391-2420015
Kreps, Sarah, and John Kaag. 2012. "The Use of Unmanned Aerial Vehicles in Contemporary Conflict: A Legal and Ethical Analysis." Polity 44: 1–26. DOI: https://doi.org/10.2139/ssrn.2023202
Liu, Alan. 2012. "Where is Cultural Criticism in the Digital Humanities?" In Debates in the Digital Humanities, edited by Matthew K. Gold. Minneapolis: University of Minnesota Press. Open Access Edition. Accessed June 17, 2019. http://dhdebates.gc.cuny.edu/debates/text/20.
McPherson, Tara. 2012. "Why are the Digital Humanities So White? Or Thinking the Histories of Race and Computation." In Debates in the Digital Humanities, edited by Matthew K. Gold. Minneapolis: University of Minnesota Press. Open Access Edition. Accessed June 17, 2019. http://dhdebates.gc.cuny.edu/debates/text/29. DOI: https://doi.org/10.5749/minnesota/9780816677948.003.0017
Moretti, Franco. 2013. Distant Reading. New York: Verso.
Muthyala, John. 2016. "Whither the Digital Humanities?" Hybrid Pedagogy. Accessed June 17, 2019. http://hybridpedagogy.org/whither-the-dh/.
Neal, Meghan. 2013. "Finally: A Drone for Dropping Rhymes, not Bombs." Motherboard. June 25. Accessed June 23, 2019. https://www.vice.com/en_us/article/533xmk/poetry-drone-would-drop-rhymes-not-bombs.
New America. 2019a. "Drone Strikes: Pakistan." Accessed June 17, 2019. https://www.newamerica.org/in-depth/americas-counterterrorism-wars/pakistan/.
———. 2019b. "Drone Strikes: Yemen." Accessed June 17, 2019. https://www.newamerica.org/in-depth/americas-counterterrorism-wars/us-targeted-killing-program-yemen/.
Pannapacker, William A. 2013. "'Hacking' and 'Yacking' About the Digital Humanities." The Chronicle of Higher Education. September 3. Accessed June 17, 2019. https://www.chronicle.com/article/HackingYacking-About/141311.
Parks, Lisa. 2017. "Vertical Mediation." In Life in the Age of Drone Warfare, edited by Lisa Parks and Caren Kaplan, 134–57. Durham: Duke University Press. DOI: https://doi.org/10.1215/9780822372813
Peron, Alcides Eduardo dos Reis. 2014. "The 'Surgical' Legitimacy of Drone Strikes? Issues of Sovereignty and Human Rights in the Use of Unmanned Aerial Systems in Pakistan." Journal of Strategic Security 7(4): 81–93. DOI: https://doi.org/10.5038/1944-0472.7.4.6
Pincus, Walter. 2005. "Pentagon Has Far-Reaching Defense Spacecraft in Works." Washington Post. March 16. Accessed June 17, 2019. http://www.washingtonpost.com/wp-dyn/articles/A38272-2005Mar15.html.
Power, Matthew. 2013. "Confessions of a Drone Warrior." GQ. October 23. Accessed June 17, 2019. https://www.gq.com/story/drone-uav-pilot-assassination.
Presner, Todd. 2015. "Critical Theory and the Mangle of Digital Humanities." In Between Humanities and the Digital, edited by Patrik Svensson and David Theo Goldberg, 55–68. Boston: Massachusetts Institute of Technology.
Presner, Todd, Jeffrey Schnapp, and Peter Lunenfeld. 2009. Digital Humanities Manifesto 2.0. Accessed June 17, 2019. http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf.
Reid, Alex. 2014. "Digital Nonhumanities... Excerpt." January 11. Accessed June 17, 2019. https://profalexreid.com/2014/01/11/digital-nonhumanities-excerpt/.
Rhee, Jennifer. 2018. The Robotic Imaginary: The Human and the Price of Dehumanized Labor. Minneapolis: University of Minnesota Press. DOI: https://doi.org/10.5749/j.ctv62hh4x
Risam, Roopika. 2019. New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis, and Pedagogy. Evanston, IL: Northwestern University Press. DOI: https://doi.org/10.2307/j.ctv7tq4hg
DOI: https:// doi.org/10.2307/j.ctv7tq4hg https://www.newamerica.org/in-depth/americas-counterterrorism-wars/us-targeted-killing-program-yemen/ https://www.newamerica.org/in-depth/americas-counterterrorism-wars/us-targeted-killing-program-yemen/ https://www.newamerica.org/in-depth/americas-counterterrorism-wars/us-targeted-killing-program-yemen/ https://www.chronicle.com/article/HackingYacking-About/141311 https://doi.org/10.1215/9780822372813 https://doi.org/10.5038/1944-0472.7.4.6 https://doi.org/10.5038/1944-0472.7.4.6 http://www.washingtonpost.com/wp-dyn/articles/A38272-2005Mar15.html http://www.washingtonpost.com/wp-dyn/articles/A38272-2005Mar15.html https://www.gq.com/story/drone-uav-pilot-assassination http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf https://profalexreid.com/2014/01/11/digital-nonhumanities-excerpt/ https://doi.org/10.5749/j.ctv62hh4x https://doi.org/10.5749/j.ctv62hh4x https://doi.org/10.2307/j.ctv7tq4hg https://doi.org/10.2307/j.ctv7tq4hg Muthyala: Drones and Surveillance Cultures in a Global World Art. 18, page 49 of 51 Said, Edward. 1983. The World, the Text, and the Critic. Cambridge: Harvard University Press. Scahill, Jeremy, and Glenn Greenwald. 2014. “Death by Metadata: Jeremy Scahill and Glenn Greenwald Reveal NSA Role in Assassinations Overseas.” Democracy Now. February 10. Accessed June 17, 2019. https://www.democracynow. org/2014/2/10/death_by_metadata_jeremy_scahill_glenn. Schnepf, J. D. 2017. “Domestic Aerial Photography in the Era of Drone Warfare.” Modern Fiction Studies. Project Muse 63(2): 270–87. DOI: https://doi. org/10.1353/mfs.2017.0022 Schreibman, Susan, Ray Siemens, and John Unsworth. (eds.) 2004. A Companion to Digital Humanities. Oxford: Blackwell. Accessed June 17, 2019. http:// www.digitalhumanities.org/companion/. DOI: https://doi.org/10.1111/ b.9781405103213.2004.00026.x Shachtman, Noah. 2009. “Air Force to Unleash ‘Gorgon Stare’ on Squirting Insurgents.” Wired. February 19. Accessed June 17, 2019. http://www.wired. com/dangerroom/2009/02/gorgon-stare/. Siddiqui, Taha. 2012. “Taliban Jihad Literature: What’s read in Afghanistan is Printed in Pakistan.” The Express Tribune. August 12. Accessed June 17, 2019. http://tribune.com.pk/story/421356/taliban-jihad-literature-whats-read-in- afghanistan-is-printed-in-pakistan/. Solove, Daniel J. 2004. The Digital Person: Technology and Privacy in the Information Age. New York: New York University Press. Staples, William G. 2003. Everyday Surveillance: Vigilance and Visibility in Postmodern Life. New York: Bowman and Littlefield Publishers, Inc. Storm, Darlene. 2014. “Whistleblower: NSA Targets SIM Cards for Drone Strikes, ‘Death by Unreliable Metadata’.” Computerworld. February 10. Accessed June 17, 2019. https://www.computerworld.com/article/2475921/data-privacy/ whistleblower--nsa-targets-sim-cards-for-drone-strikes---death-by-unreliable- metadata-.html. Terras, Melissa, Julianne Nyhan, and Edward Vanhoutte. 2013. Defining Digital Humanities: A Reader. New York, NY: Ashgate. 
https://www.democracynow.org/2014/2/10/death_by_metadata_jeremy_scahill_glenn https://www.democracynow.org/2014/2/10/death_by_metadata_jeremy_scahill_glenn https://doi.org/10.1353/mfs.2017.0022 https://doi.org/10.1353/mfs.2017.0022 http://www.digitalhumanities.org/companion/ http://www.digitalhumanities.org/companion/ https://doi.org/10.1111/b.9781405103213.2004.00026.x https://doi.org/10.1111/b.9781405103213.2004.00026.x http://www.wired.com/dangerroom/2009/02/gorgon-stare/ http://www.wired.com/dangerroom/2009/02/gorgon-stare/ http://tribune.com.pk/story/421356/taliban-jihad-literature-whats-read-in-afghanistan-is-printed-in-pakistan/ http://tribune.com.pk/story/421356/taliban-jihad-literature-whats-read-in-afghanistan-is-printed-in-pakistan/ https://www.computerworld.com/article/2475921/data-privacy/whistleblower--nsa-targets-sim-cards-for-drone-strikes---death-by-unreliable-metadata-.html https://www.computerworld.com/article/2475921/data-privacy/whistleblower--nsa-targets-sim-cards-for-drone-strikes---death-by-unreliable-metadata-.html https://www.computerworld.com/article/2475921/data-privacy/whistleblower--nsa-targets-sim-cards-for-drone-strikes---death-by-unreliable-metadata-.html Muthyala: Drones and Surveillance Cultures in a Global WorldArt. 18, page 50 of 51 The Bureau of Investigative Journalism. 2019 “Drone Wars: The Full Data.” Accessed June 23. https://v1.thebureauinvestigates.com/category/projects/ drones/drones-graphs/. Turse, Nick. 2012. The Changing Face of Empire: Special Ops. Drones, Spies, Proxy Fighters, Secret Bases, and Cyberwarfare. Chicago: Haymarket Books. Underwood, Ted. 2019. Distant Horizons: Digital Evidence and Literary Change. Chicago: University of Chicago Press. DOI: https://doi.org/10.7208/ chicago/9780226612973.001.0001 Vergakis, Brock. 2013. “U.S. Launches Drone from Aircraft Carrier.” washingtonexaminer.com May 14. Accessed June 24, 2019. https://www. washingtonexaminer.com/us-launches-drone-from-aircraft-carrier. Walzer, Michael. 2004. Arguing about War. New Haven: Yale University Press. Watson, Julie. 2008. “Archeologists Digging up Pre-Columbian Sounds.” Los Angeles Times. July 6. Accessed June 17, 2019. http://articles.latimes.com/2008/jul/06/ news/adfg-sounds6. Weisgerber, Marcus. 2017. “The Pentagon’s New Algorithmic Warfare Cell Gets Its First Mission: Hunt ISIS.” Defense One. May 14. Accessed June 17, 2019. https:// www.defenseone.com/technology/2017/05/pentagons-new-algorithmic- warfare-cell-gets-its-first-mission-hunt-isis/137833/. Wolfgang, Ben. 2018. “Trump Outpacing Obama in Drone Strikes; 80 in First Year: Report.” The Washington Times. June 7. Accessed June 17, 2019. https://www. washingtontimes.com/news/2018/jun/7/donald-trump-outpacing-barack- obama-drone-strikes-/. Yehya, Naief. 2015. “The Drone: God’s Eye, Death Machine, Cultural Puzzle.” Culture Machine 16: 1–3. Zenko, Micah. 2013. “Reforming US Drone Strike Policies.” Council on Foreign Relations. Special Report 65: 1–41. 
work_hebtpqw3vffnpfo4wo7n5klapy ----

Digital Humanities Within a Global Context: Creating Borderlands of Localized Expression

Amy E. Earhart

Fudan Journal of the Humanities and Social Sciences, ISSN 1674-0750, DOI 10.1007/s40647-018-0224-0
ORIGINAL PAPER

Digital Humanities Within a Global Context: Creating Borderlands of Localized Expression

Amy E. Earhart, Department of English, Texas A&M University, 4227 TAMU, College Station, TX 77843-4227, USA (aearhart@tamu.edu)

Received: 19 February 2018 / Accepted: 22 March 2018
© Fudan University 2018

Abstract: As scholars have begun the digitization of the world's cultural materials, the understanding of what is to be digitized and how that digitization occurs remains narrowly imagined, with a distinct bias toward North American and European notions of culture, value and ownership. Humanists are well aware that cultural knowledge, aesthetic value and copyright/ownership are not monolithic, yet digital humanities work often expects the replication of narrow ideas of such. Drawing on the growing body of scholarship that situates the digital humanities in a broad global context, this paper points to areas of tension within the field and posits ways that digital humanities practitioners might resist such moves to homogenize the field. Working within the framework of border studies, the paper considers how working across national barriers might further digital humanities work. Finally, ideas of ownership and/or copyright are unique to country of origin and, as such, deserve careful attention. While open access is appealing in many digital humanities projects, it is not always appropriate, as work with indigenous cultural artifacts has revealed.

Keywords: Digital humanities · Global · Borderlands · Transnational

As scholars have begun the digitization of the world's cultural materials, the understanding of what is to be digitized and how that digitization occurs, of how we utilize technology, of infrastructures of academic digital humanities (dh), remains narrowly imagined, with a distinct bias toward North American and European notions of culture, value and ownership. Humanists are well aware that cultural knowledge, academic infrastructures and copyright/ownership are not monolithic, yet digital humanities disciplinary structures often expect the replication of narrow ideas of such. Katherine Hayles predicts an entanglement of codes within a global environment, noting that "As the worldview of code assumes comparable importance to the worldviews of speech and writing, the problematics of interaction between them grow more complex and entangled" (2010, 31). The multiplicity of codes as expressed within global environments brings a largely ignored complexity to digital humanities and code studies and necessitates scholarship to interpret and critique such codes. While digital humanities is global, those of us practicing digital humanities continue to work within, and to replicate, localized academic structures.
While we might have come to terms intellectually with the notion that our scholarship is looking outward, that we are increasingly called upon to view our work within a complex web of global academic conversations, individual academics remain caught within nationally bound structures of academia, making the notion of a globalized construction of scholarship that values disparate forms of digital humanities incredibly difficult. As digital humanists imagine the ways that our community of scholars across the world might engage, we have the opportunity to construct a collaborative environment that models the best of such interactions. Efforts are well underway. Models range from a big tent approach, an umbrella model that pulls together all such efforts, to a networked set of nodes. Yet, as global interaction among digital humanists grows, it has revealed tension regarding the way in which the digital humanities engage with each other. Rather than initiating a one-size-fits-all global model, we need to imagine a global digital humanities that lives in the borderlands, a place of connection and contradiction and, most importantly, a place that does not try to centralize itself.

Recognizing that monolithic models of digital humanities are unproductive, digital humanists have begun to discuss how we might create academic infrastructures, such as organizations, conferences and journals, that fully account for the diversity of practice. Early organizations such as GO::DH, Global Outlook::Digital Humanities, are leaders in the expansion of such infrastructure. Developed to "break down barriers that hinder communication and collaboration among researchers and students of the Digital Arts, Humanities, and Cultural Heritage sectors in high, mid, and low income economies" (GO::DH 2017), GO::DH has become a Special Interest Group (SIG) affiliated with the largest digital humanities organization in the world, the Alliance of Digital Humanities Organizations or ADHO. Work by members of GO::DH and others within ADHO has helped to make building "global digital humanities networks" one of the priorities of ADHO. ADHO has also been working to expand membership, constituent organizations and cultural and linguistic difference within their organization. Other co-partners of ADHO include centerNet: An International Network of Digital Humanities Centers, constructed as "an international network of digital humanities centers formed for cooperative and collaborative action to benefit digital humanities and allied fields in general, and centers as humanities cyberinfrastructure in particular." Emphasizing inclusivity, the organization views itself as a "big tent," extending a welcome to all who self-define as digital humanities. While centerNet is an international network
Originally the conference rotated between North American and Europe, but in order to encourage international participants the conference has begun to meet in wide ranging locations; it has moved from its original Canadian/US/Western Europe locations to greater parts of Europe and the Americas, such as Poland and Mexico. Created under the umbrella of ADHO, the organization includes The European Association for Digital Humanities (EADH); the Association for Computers and the Humanities (ACH), predominantly an Americas organization; Canadian Society for Digital Humanities/Société canadi- enne des humanités numériques (CSDH/SCHN); centerNet, Australasian Associa- tion for Digital Humanities (aaDH); Japanese Association for Digital Humanites (JADH); and Humanistica, L’association francophone des humanités numériques/digitales (Humanistica). Past conference themes have embraced a global digital humanities. The 2012 international digital humanities conference, held at the University of Hamburg, had the auspicious theme of Digital Diversity: Cultures, Languages and Methods. Australia’s hosting of the 2015 conference focused on a theme of Global Digital Humanities. The 2018 Digital Humanities Conference held in Mexico City asks for us to consider Bridges/Puentes. The conference is fairly unique among academic conferences in that it is attempting to pull together such a broad group of scholars. There is no other academic conference in the literature, for example, that has the long-term goal of global outreach and has made such strives toward building a global organization. Digital humanities journals are also focusing on the global digital humanities and have begun to publish papers that engage with the complex issues of how we might define digital humanities in the increasingly broad space and places in which the scholarship is created. Such efforts extend to journals affiliated with ADHO, including DSH: Digital Scholarship in the Humanities (formerly LLC: The Journal of Digital Scholarship in the Humanities), DHQ (Digital Humanities Quarterly) and Digital Studies/Le champ numérique which have featured global issues, such as collections titled ‘‘Digital Humanities Without Borders,’’ ‘‘Global Outlook::Digital Humanities: Global Digital Humanities Essay Prize,’’ both in Digital Studies/Le champ numérique, and papers that consider a broader global understanding of digital humanities, such as ‘‘Corpus-Based Studies of Translational Chinese in English–Chinese Translation’’ and ‘‘Aspect Marking in English and Chinese: Using the Lancaster Corpus of Mandarin Chinese for Contrastive Language Study,’’ both in DSH: Digital Scholarship in the Humanities. However, the data suggest that we still have a long way to go if we want to be a global organization. Melissa Terras was the first to focus attention on conference Digital Humanities Within a Global Context: Creating… 123 Author's personal copy representation, finding that the conference was attended overwhelmingly by scholars from the USA, Canada and the UK (see Fig. 1). Concerned about the lack of geodiversity of conference attendance, Terras has continued to track attendance, and her recent work suggests that digital humanities remains imagined as western located (see Fig. 2). Work by Roopika Risam, Alex Gil, Isabel Galina, Domenico Fiormont, Elika Ortega, Padmini Ray Murray, among other scholars, have called interpretations such as Fig. 
2 into question, suggesting that the digital humanities is centered in the Americas and Europe only in the Western imagination, a construct that ignores the broad scope of global digital humanities. Risam notes, ‘‘the distribution of DH centers suggests uneven development. The USA and, to a lesser extent, the UK and Canada appear the true centers of DH, while other countries comprise the peripheries’’ (2017, 378). Should we want to broaden the digital humanities to a globally representative field, then we must begin to not only reimagine boundaries, but to construct organizations which decentralize. Part of the difficulty is that the structures of the largest digital humanities organizations, such as ADHO, remain narrowly focused. A study of the conference authors from 2004 to 2013 shows that conference participation remains unequally distributed (see Fig. 3). Conference participation is largely formed by the perennial question of how to define the field, with some definitions driving limited globalized membership, so too might structural issues associated with the conference. centerNet and ADHO offer free and reduced cost memberships for joining their entities and, while waiving membership fees does encourage participation, the actual costs associated with attending the Digital Humanities conference, from airfare to lodging costs, remain high. Registration discounts occur by career stage, with staff and students receiving Fig. 1 Presenters at ACH/ALLC 2005 by Institution Country. Terras (2006). Please note that the Digital Humanities Conference was originally titled the ACH/ALLCH conference A. E. Earhart 123 Author's personal copy discounted rates, but the organization has not included registration differentiation by region, country or income, leaving those from low-economy counties facing a dramatic challenge. For example, at the 2016 digital humanities conference in Krakow participants from Poland reported that the registration costs of the Fig. 2 Quantifying Digital Humanities. Melissa Terras. Infographic: Quantifying Digital Humanities. 2012. Melissa Terras’ Blog. http://www.ucl.ac.uk/infostudies/melissa-terras/DigitalHumanitiesInfogra phic.pdf Accessed September 18, 2017 Fig. 3 Number of authors per region 2004–2013. Weingart and Eichmann-Kalwara (2017) Digital Humanities Within a Global Context: Creating… 123 Author's personal copy http://www.ucl.ac.uk/infostudies/melissa-terras/DigitalHumanitiesInfographic.pdf http://www.ucl.ac.uk/infostudies/melissa-terras/DigitalHumanitiesInfographic.pdf conference were equivalent to a month of salary for lecturers. Though the conference was in their home country, the cost was prohibitive. While some have floated the idea of income-based registration, to date the conference has not responded to a key structural issue that prohibits participation from a broader digital humanities community. The conference has taken positive steps to create a less exclusionary space by holding the 2015 conference in Australia and the 2018 conference in Mexico. Prompted by the 2011 formation of La Red de Humanidades Digitales (RedHD), the 2018 Mexico City conference will be ‘‘the first time that the conference will take place in Latin America & the global south.’’ The shift in locations for Digital Humanities signals an important moment in the history of the organization is largely due to the hard work of organizations like GO::DH and RedHD. However, there remain clear structural barriers to an inclusive global digital humanities. 
Algorithmic analysis of digital humanities’ structures points to continuing problems in developing a diverse global digital humanities. Scott Weingart’s analysis of the yearly ADHO conference has pushed digital humanities to think through how we are constituting ourselves through our conference and our field, revealing the ways that conference participation remains geographically located in the Americas and Europe. 1 Conference participation limitations also appear in our constituent journals which are likewise publishing articles predominantly clustered around scholars in the Americas and Europe. Telling is an analysis of Digital Humanities Quarterly: DHQ examining co-author networks in the journal from 2007 to 2014 which reveals that the networks remain squarely centered in the Americas, with very little representation beyond Europe (see Fig. 4). All of this suggests that digital humanities as understood through our organizational entities, digital humanities organizations, conferences and journals, desires to be global but remains merely the imagined global. The domination of the primary modes of disciplinary construction, journals and conferences by the Americas and Europe is a problem in that it is creating a field that runs counter to the described goals of global digital humanities, implying that no matter the imagined global digital humanities, a truly global understanding of an organization or a field is difficult to construct, perhaps even more difficult in the current age of nationalist tensions. There are numerous interventions underway to broaden our representation of global digital humanities, but we remain caught within tensions of an umbrella structure that enforces structures that are often not conducive to the larger representation of digital humanities. Digital humanities has struggled to articulate a global organization in large part because of originating tensions within the organization construction. Digital humanities, as a field, has struggled to articulate what is included within its rubric, a struggle that remains an open academic question. Tensions within the field have revolved around who’s in and who’s out, but in a localized context focused on, once again, the Americas and Europe. Reviewing the literature that attempts to define digital humanities reveals that geography has been ignored by scholarship until 1 See dh quantified for a list of scholars invested in collecting information of the community: http:// scottbot.net/dh-quantified. A. E. Earhart 123 Author's personal copy http://scottbot.net/dh-quantified http://scottbot.net/dh-quantified recent interventions. Such scholarly constructions of digital humanities which view digital humanities as naturalized within a European and Americas structure has led to current limitations of the field. As O’Donnell et al. make clear, our current representation of digital humanities moves along clear lines of demarcation, whether economic, linguistic or geographic (2016, 493). The centering of digital humanities in this manner has created an ‘‘unproductive dichotomy of center and periphery,’’ leading to a call for a resistance to such structures through a creation of a regional or local digital humanities (Gil and Ortega 2016, 23). For example, Alex Gil’s ‘‘Around DH in 80 Days’’ project resists the limited centering of digital humanities, instead revealing the diversity of global digital humanities projects (see Fig. 5). 
The diversification of digital humanities, the struggle to create an organizational entity that inclusively represents a global digital humanities, will continue to occur through ADHO and its affiliated conference and journals, but the organizational structures currently remain resistant to a more globally imagined digital humanities. Because of this, we might ask whether ADHO is actually the mechanism to bring about global digital humanities. As the organization has grown, there has been an almost de facto understanding that it should be the center for global dh. But the centering of digital humanities in an organization that has arisen out of western academic structures will, I argue, always struggle to imagine how to construct a truly representative field. A better question might be whether we can construct an alternative mechanism that accurately represents all the different ways that digital humanities is practiced in a global environment. The rejection of an umbrella or big tent organization in which to coalesce a global digital humanities is born out of an analysis of the way that geographic, economic, cultural and structural approaches to academic discipline impact our interactions in the larger digital humanities. During the research and writing of Traces of the Old, Uses of the New: The Emergence of Digital Humanities (2015) I came to understand that providing one definition of the digital humanities was dependent upon a stable infrastructure from which the practice developed. The definition of digital humanities within the Americas is dependent upon an academia that is increasingly defunded and deprofessionalized, driving a digital humanities that is interested in an entrepreneurially based startup model of digital humanities. This is not so for other localized digital humanities practices, yet dh organizations like ADHO continue to imagine digital humanities with a distinct bias toward North American and European notions of culture, value and ownership. O’Donnell et al. rightly argue that this view of digital humanities is predicated on viewing the development of a global digital humanities ‘‘as an opportunity for transferring Fig. 4 ‘‘Co-Author Network for Digital Humanities Quarterly: 2007–14.’’ de la Cruz et al. (2015) Digital Humanities Within a Global Context: Creating… 123 Author's personal copy knowledge, experience, and access to infrastructure from a developed North to an underdeveloped South’’ (2016, 496). Rejecting this, the authors call for an approach that ‘‘is far more about developing understanding than merging practice,’’ and they turn to ‘‘supra-networks that transcend national, linguistic, regional and economic boundaries’’ (2016, 496). I’d like to quibble with the use of networks as the way by which we should represent the interaction of the various global representations of digital humanities. The notion of an overarching system that is built from nodes, is not that different than how ADHO and its constituent conference imagines itself, a model that ignores the very real institutional and cultural divides that are always with us. In many ways, a supra-network is a slightly shifted replication of the long understood big tent digital humanities and, ultimately, a failed model. 
Digital humanities is an amorphous and fluid concept or practice, particularized in various disciplines, national contexts and even local environments, but the field is represented as a coherent body of practice by intact structures that include the annual digital humanities conference, the various global organizations that form ADHO, and even journals published by the various societies. The digital humanities, as represented by the yearly international conference, is a digital humanities which ignores the borders of practice that masks areas of dissension and normalizes the field to a particular form without contour. However, the center does not hold and recent conferences have featured ruptures, revealing the false constructedness of a coherent digital humanities. Structuring the global digital humanities as a ‘‘big tent’’ hides the way that such a representation seeks ‘‘sameness’’ in practice. A counternarrative that provides a more inclusive understanding of global digital humanities is one that turns to specificity. While some may see the segmentation of digital humanities as counterproductive, I argue Fig. 5 ‘‘Around DH in 80 Days.’’ Gil (2014) A. E. Earhart 123 Author's personal copy that digital humanities must be particularized because dh, as enacted, is so broad, diffuse and flexible that a generalized definition does not adequately address the various digital approaches currently in use nor how certain humanities fields are being altered by digital practice. A far more productive understanding of our collective histories is to identify the borders of practice and to look for disciplinary overlaps that benefit all partners. A specificity of global digital humanities’ practices is best understood in the framework of what Gloria Anzaldua has called the borderlands in her crucial work Borderlands|La Frontera (1987). Anzaldua’s framework allows us to examine the impact of cultural representations of digital humanities within larger frameworks of power, including the economic, cultural and power dynamics that impact the production of scholarship. While Anzaldua is writing prior to the digital turn and code studies scholarship, her work is prescient. Examining the code shifting of language, Anzaldua argues that language codes provide a way to examine the complexity of networked interfaces of communication and a way of understand how cultural identity is impacted by power dynamics of such code. Anzaldua’s focus on code switching, defined in her book as language switching or ‘‘The switching of ‘codes’ …from English to Castillian Spanish to the North Mexican dialect of Tex- Mex to a sprinkling of Nahuatl to a mixture of all of these,’’ produces great cultural upheaval. This ‘‘language of the Borderlands’’ is ever shift and changing and ‘‘There, at the juncture of cultures, languages cross-pollinate and are revitalized; they die and are born’’ (1987, Preface). While Anzaldua situates her discussion of borderlands in the geographic specificity of the Texas/Mexico border, her theorization of power between multiple cultural codes might be extended to our understanding of digital humanities. Roopika Risam echoes such an extension of code switching when she calls for DH accents, a recognition of the multiple languages, both ‘‘linguistic and computational’’ as the formation of dh(s) (2017, 381). 
To Risam, the multiple accents of digital humanities must be ‘‘understood in a broader ecology of ‘accents’ that inflect practices, whether geography, language, or discipline,’’ providing a model that makes sense of and values the broadness of digital humanities, rather than contains such diversity within a limited framework (2017, 382). Key to understanding the way that localized digital humanities interact within a global framework is to evaluate the contingent power structures. Anne Donadey notes, ‘‘Discrete fields of knowledge can be seen as being separated by disciplinary borders; the interdisciplinary and comparative areas where they meet and are brought together can be viewed as borderland zones in which new knowledge is created, sometimes remaining in the borderland, sometimes becoming institution- alized into a different field of knowledge with its own borders’’ (2007, 23–24). The importance of borders is not in the separation, though indeed that is in play, but the meeting points, which provide productive tensions that bring forth new knowledge. Focusing on resistance, as Donadey puts it, avoids the flattening of ‘‘the concept of borderlands that would erase its historical and cultural grounding by turning it into a disembodied metaphor’’ (2007, 23). The borderlands stand in opposition to big tent representations of cultural connection. To embrace a borderlands understanding of global digital humanities is to respect localized practices and to Digital Humanities Within a Global Context: Creating… 123 Author's personal copy embrace points of context rather than a homogenized centrality. As Anzaldua reminds us, ‘‘A borderland is a vague and undetermined place created by the emotional residue of an unnatural boundary. It is in a constant state of transition’’ (1987, 3). The continual renegotiation of points of connection is productive and ever shifting. Rather than attempting to stabilize such moments, border theory seeks fluidity and destabilization as a means of new knowledge production. Viewing the global digital humanities within a border theory model rather than a big tent or umbrella formulation, one journal or one conference, allows scholars to seek those points of contact while understanding how the power dynamics of digital humanities have come to create points of contention. Crucial to respecting the integrity of localized digital humanities is a careful examination of our assumptions about technology use in digital humanities projects. GO::DH has supported ‘‘minimal computing’’ approaches as a way to rethink the way that many western digital humanities projects center technology innovation. Based on discussions in 2014 with digital humanists in Cuba, those associated with GO::DH, led by Alex Gil, recognized that computing needs in various localized environments might benefit from what Ernesto Oroza calls the ‘‘architecture of Necessity’’ (Gil and Ortega 2016, 29). GO::DH has defined ‘‘minimal computing’’ as that which ‘‘simultaneously capture(s) the maintenance, refurbishing, and use of machines to do DH work out of necessity along with the use of new streamlined computing hardware like the Raspberry Pi or the Arduino micro controller to do DH work by choice. This dichotomy of choice versus necessity focuses the group on computing that is decidedly not high-performance and importantly not first-world desktop computing’’ (GO::DH 2017). 
While we continue to need to explore how technologies benefit our research questions, we cannot ignore more minimal computing approaches that are often the most innovative and expansive within our field. The bias toward highly robust, often expensive, technologically centered projects as the gold standard for dh also creates a centered field that actively ignores the work occurring in some parts of global digital humanities. To best move forward, we need to return to a multiplicity of approaches that allows for scholarship to recenter technology, and we must resist the creation of rigid borders of academic disciplinarity that effectively shuts down the possibilities of global digital humanities interchange. To proceed in a non-policed borderlands, we must resist a tyranny of technology. Frames for our community interaction must be fluid and non-centralized. They must be evolving. To enable the productive friction between communities, we might begin to see our fields as less about connective nodes and networks and more focused on transnational understandings of disconnecting nodes. Border theory expands our methodologies and our approaches, rejecting a narrow understanding of digital humanities. It allows us to rethink the way that our own scholarship has been colonized and limited, particularly through models of ownership. A tenet of digital humanities in the Americas, for example, has focused around issues regarding ownership of scholarship, with faculty increasingly asserting control over their own labor and their ability to disseminate it freely, as open access (oa) materials, to an audience apart from or in parallel with more traditional structures of academic publishing. Key to defining the digital humanities A. E. Earhart 123 Author's personal copy then is that our scholarship is increasingly public. Matthew Kirschenbaum notes that ‘‘Whatever else it might be then, the digital humanities today is about a scholarship (and a pedagogy) that is publicly visible in ways to which we are generally unaccustomed, a scholarship and pedagogy that’s bound up with infrastructure in ways that are deeper and more explicit than we are generally accustomed, a scholarship and pedagogy that is collaborative and depends on networks of people and that lives an active, 24/7 life online’’ (2012, 60). The public digital humanities and the accompanying push for open access are central to the way that many digital humanists situate their scholarship. However, to fully encompass all expressions of digital humanities, we must also think carefully about issues of ownership, which many in digital humanities have expressed in limited western contexts such as copyright. As we move toward a model of interchange and exchange of globalized digital scholarship, the understanding of ownership and open access must be carefully examined and complicated. The dominance of models of open access in the Americas has been critiqued by a growing number of scholars, with particular attention to this issue from scholars who work with indigenous communities and knowledges. Kim Christen, for example, has produced scholarship and innovative digital tools to address issues of ownership and openness that are centered on indigenous knowledge structures. Her work recognizes that the digital archiving process has deep roots in museum and library collections’ problematic pasts and that many indigenous communities’ have had their intellectual production exploited by colonizers. 
As Christen notes, ‘‘The colonial collecting project was a destructive mechanism by which Indigenous cultural materials were removed from commu- nities and detached from local knowledge systems’’ (2015, 2). In response, Christen has developed a content management system (CMS), Mukurtu, that allows for sophisticated control of the materials within the CMS, demarcating the viewing of digital objects through localized understandings of what should be seen and what should not be seen and forcing the user to understand that there are certain objects or ideas that are not open to all. 2 While Christen’s work explicitly targets indigenous groups, her thinking about what should be seen and what should not be seen models best practices that we must extend into our conception of the global digital humanities. At the 2017 Montreal Digital Humanities meeting the ‘‘Copyright, Digital Humanities, and Global Geographies of Knowledge’’ panel considered this important issue. The discussion of copyright practices in various countries during the panel revealed the very limited understanding of the topic within the larger collective who attended the conference. Isabel Galina Russell’s remarks focused on copyright in Latin America, with her particular expertise focused on Mexico. Galina Russell emphasized that ‘‘Latin America distinguishes itself from other regions of the world in that scientific information belongs to all’’ (2017). Recognizing that few for profit academic commercial publishers exist in Latin America, Galina Russell argues that ‘‘there is a 2 See Kimberly Christen. ‘‘On Not Looking: Economies of Visuality in Digital Museums’’ in The International Handbooks of Museum Studies: Museum Transformations, First Edition. Ed. Annie E. Coombes and Ruth B. Phillips. Oxford: John Wiley & Sons, Ltd. Oxford Press, 2015: 365–386. 365–3666. Digital Humanities Within a Global Context: Creating… 123 Author's personal copy generalized idea that knowledge produced in the university belongs to all, it is a common good provided to the country,’’ negating copyright and shifting ownership of academic production to the public (2017). This conception of ownership stands in stark contrast to the way that ownership has functioned within the types of structures set up by the western for profit academic publishers and that many dh scholars see as central to oa initiatives. In the same panel, Padmini Ray Murray discussed the copyright lawsuit brought against Shyam Singh, the owner of a small Indian shop producing course packs for students at a local university, who was sued by several leading academic presses. Murray points out that the case revealed the way that assumptions of copyright elided national boundaries and attempted to apply western understandings of ownership on scholarly work. At the same time that the lawsuit negated copyright rules of the Indian state, it also selectively ignored US and UK copyright rules with the desire to further enforce western ideas of ownership. In response to the supposed copyright violations, the lawsuit ‘‘sought to ban all course packs, including those that observe the US definition of fair use, i.e., excerpts comprising less than 10% of the whole text’’ (2017). At the same time the legal challenge ignored ‘‘Section 52 of the Indian Copyright Act \that[ permits ‘fair dealing’ with the purpose of research, as well as permitting any copyrighted work to be used for the purpose of educational instruction’’ (2017). 
Situating copyright law neither in Indian or the west, the lawsuit was written as nationless, boundary less, centered only on the effort to end the exchange of information. Both papers point to the complications of thinking about ownership and knowledge as equivalent forms across cultures and nations. While we might value open access in the digital humanities, not all producers of knowledge will accede to openness. Instead we must, once again, develop structures that see knowledge as culturally defined and controlled. By valuing the localized understanding of knowledge and knowledge production, we situate the global digital humanities within a productive nexus of borders. Instead of insisting that we encapsulate all practices of digital humanities within a big tent or a centralized structure, we should instead view ADHO and its conferences and journals as important, but not central, meeting spaces for digital humanists. Rather than seeing ADHO as the center, we should encourage a global digital humanities that works on the borderlands, with localized expressions of scholarship that reinvigorate through exchange. Rejecting the ‘‘dualistic thinking in the individual and collective consciousness’’ is a struggle, as Anzaldua argues, but it is the only way that we might move beyond binaries that are currently in place, whether technologically advanced/primitive, east/west, or low income/high income (1987, 422). Resisting the homogenization of scholarly methods, questions, outcomes, production and ownership is the only way to develop a truly robust global digital humanities. A. E. Earhart 123 Author's personal copy References Anzaldua, Gloria. 1987. Borderlands/La Frontera. San Francisco: Aunt Lute Book Company. Centernet: An International Network of Digital Humanities Centers. 2017. https://dhcenternet.org/about. Accessed 15 Aug 2017. Christen, Kimberly. 2015. Tribal Archives, Traditional Knowledge, and Local Contexts: Why the ‘s’ Ma Ers. Journal of Western Archives 6(1): 1–19. de la Cruz, Dulce Maria, Jake Kaupp, Max Kemman, Kristin Lewis, and Teh-Hn Yu. 2015. Mapping Cultures in the Big Tent: Multidisciplinary Networks in the Digital Humanities Quarterly. https:// jkaupp.github.io/DHQ/coursework/VisualizingDHQ_Final_Paper.pdf. Accessed 10 Aug 2017. DH2018: Mexico City. Dh 2018 (blog) 2018. https://dh2018.adho.org/en/. Accessed 10 Aug 2017. Donadey, Anne. 2007. Overlapping and Interlocking Frames for Humanities Literary Studies: Assia Djebar, Tsitsi Dangarembga. Gloria Anzaldua. College Literature 34(4): 22–42. Earhart, Amy E. 2015. Traces of the Old, Uses of the New: The Emergence of the Digital Literary Studies. Ann Arbor: University of Michigan Press. Galina Russell, Isabel. 2017. Presentation, Panel on Copyright, Digital Humanities, and Global Geographies of Knowledge. Presented at the Digital Humanities 2017, Montreal, Canada. Gil, Alex. 2014. Around DH in 80 Days. Around DH in 80 Days (blog). http://www.arounddh.org. Accessed 10 Aug 2017. Gil, Alex, and Elika Ortega. 2016. Global Outlooks in Digital Humanities: Multilingual Practices and Minimal Computing. In Doing Digital Humanities: Practice, Training, Research, ed. Constance Crompton, Richard J. Lane, and Ray Siemens, 22–34. London: Routledge. Global Outlook::Digital Humanities. 2017. http://www.globaloutlookdh.org. Accessed 10 Aug 2017. Hayles, Katherine. 2010. My Mother Was a Computer: Digital Subjects and Literary Texts. Chicago: University of Chicago Press. Kirschenbaum, Matthew. 2012. 
What is Digital Humanities and What's It Doing in English Departments? In Debates in the Digital Humanities, ed. Matthew Gold, 3–11. St. Paul: U Minnesota P.
Membership. ADHO (blog). 2018. https://adho.org/faq. Accessed 10 Aug 2017.
O'Donnell, Daniel Paul, Katherine L. Walter, Alex Gil, and Neil Fraistat. 2016. Only Connect: The Globalization of the Digital Humanities. In A New Companion to the Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth, 493–510. Malden, MA: Wiley Blackwell.
Pannapacker, William. 2009. The Brainstorm Blog: The Chronicle of Higher Education Online.
Ray Murray, Padmini. 2017. Presentation, Panel on Copyright, Digital Humanities, and Global Geographies of Knowledge. Presented at the Digital Humanities 2017, Montreal, Canada.
Risam, Roopika. 2017. Other Worlds, Other DHs: Notes towards a DH Accent. Digital Scholarship in the Humanities 32(2): 377–384.
SIGs: ADHO Special Interest Groups (SIGs). 2017. ADHO (blog). http://adho.org/sigs. Accessed 3 Nov 2017.
Terras, Melissa. 2006. Disciplined: Using Educational Studies to Analyse 'Humanities Computing'. Literary and Linguistic Computing 21(2): 229–246.
Terras, Melissa. 2011. Quantifying Digital Humanities. UCL Centre for Digital Humanities. http://blogs.ucl.ac.uk/dh/2012/01/20/infographic-quantifying-digital-humanities/. Accessed 5 Nov 2017.
Weingart, Scott B., and Nickoal Eichmann-Kalwara. 2017. What's Under the Big Tent? A Study of ADHO Conference Abstracts. Digital Studies/Le Champ Numerique 7: 6. https://doi.org/10.16995/dscn.284/.

Amy E. Earhart is an Associate Professor in the Department of English at Texas A&M University. She is the author of Traces of the Old, Uses of the New: The Emergence of Digital Literary Studies (2015) and co-editor of The American Literature Scholar in the Digital Age (2010). She is the author of various books and chapters in venues including Debates in Digital Humanities, Textual Cultures and the Humanities and the Digital, among others.

work_hf3rbisgfzaupmouglg6u4tjey ----

volume 7 issue 14/2018

THE RESEARCHER AS STORYTELLER
USING DIGITAL TOOLS FOR SEARCH AND STORYTELLING WITH AUDIO-VISUAL MATERIALS

Berber Hagedoorn, University of Groningen, Research Centre for Media and Journalism Studies, Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands, B.Hagedoorn@rug.nl

Sabrina Sauer, University of Groningen, Research Centre for Media and Journalism Studies, Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands, S.C.Sauer@rug.nl

Abstract: This article offers a first exploratory critique of digital tools' socio-technical affordances in terms of support for narrative creation by media researchers.
More specifically, we reflect on narrative creation processes of research, writing and story composition by Media Studies and Humanities scholars, as well as media professionals, working with crossmedia and audio-visual sources, and the pivotal ways in which digital tools inform these processes of search and storytelling. Our study proposes to add to the existing body of user-centred Digital Humanities research by presenting the insights of a cross-disciplinary user study. This involves, broadly speaking, researchers studying audio-visual materials in a co-creative design process, set to fine-tune and further develop a digital tool (technically based on linked open data) that supports audio-visual research through exploratory search. This article focuses on how 89 researchers – in both academic and professional research settings – use digital search technologies in their daily work practices to discover and explore (crossmedia, digital) audio-visual archival sources, especially when studying mediated and historical events. We focus on three user types: (1) Media Studies researchers; (2) Humanities researchers that use digitized audio-visual materials as a source for research; and (3) media professionals who need to retrieve materials for audio-visual text productions, including journalists, television/image researchers, documentalists, documentary filmmakers, digital storytellers, and media innovation experts. Our study primarily provides insights into the search, retrieval and narrative creation practices of these user groups. A user study such as this, which combines different qualitative methods (focus groups with co-creative design sessions, research diaries, questionnaires), first, affords fine-grained insights. Second, it demonstrates the relevance of closely considering practices and mechanisms conditioning narrative creation, including self-reflexive approaches. Third and finally, it informs conclusions about the role of digital tools in meaning-creation processes when working with audio-visual sources, and where interaction is pivotal.

Keywords: narratives, narrative creation, storytelling, exploratory search, media research, working with audio-visual sources (AV), user studies, Digital Humanities, archives, affordances of digital search tools, linked open data

This article presents results of an exploratory Digital Humanities study focused on researchers working with digitized audio-visual (AV) sources, particularly regarding cases of mediated and historical events.1 In this article, we reflect on narrative creation processes, specifically research, writing and story composition by Media Studies and Humanities scholars as well as media professionals, and the pivotal ways in which digital tools inform these processes of search and storytelling around crossmedia AV sources. Whilst our study is concerned with supporting media research from beginning to end, we take a particular interest in exploratory search2 for supporting the first – exploratory and initial – stages of doing research, because during the initiation of a search researchers "may be in most need of support".3 We argue that this is especially prevalent for researchers working with AV and crossmedia sources, due to the complex, dynamic and multifaceted nature of this data type.
Sonja de Leeuw has discussed the history and challenges for European television history since the dawn of its archival turn in the opening article of VIEW, arguing that "institutions and digital libraries are challenged to meet the needs of users, to construct new interfaces not only in-house but also through online platforms. This requires fresh conceptual thinking about topical relations and medium-specific curatorial approaches as well as user-led navigation and the production of meaning"4 (our emphasis). In this article we study how contemporary digital tools and platforms of cultural heritage institutions adapt and react to this challenge, in interaction with curatorial approaches and user perspectives. Here, we pay particular attention to research with AV sources via audio-visual archival institutions, and the impact on narrative creation around mediated events. This article analyses and questions the 'translation' of AV data on different platforms into the narratives that we, as researchers working with AV sources, can tell – and by doing so, informs the conclusions about the role of digital tools in meaning-creation processes.

The study's theoretical and methodological starting point is that narratives5 should be viewed in terms of their socio-technical context. Digital tools – used to search for, annotate, and analyse events – frame and afford the narratives that both media scholars and professionals as researchers can form around their research question. In their work, researchers study and integrate cultural and political meanings connected to media events.6 They delve into how said meanings – often disruptive, and long-term – are reproduced and made sense of via television and connected media platforms. In turn, we have studied how researchers search for narratives (cases) surrounding 'disruptive' events (such as natural disasters, terrorist attacks, and 'breaking news' marathons of disaster and terror) from a cultural-historical perspective, drawing upon archival and Linked Data materials from the Netherlands Institute of Sound and Vision and the digital search database and tool Media Suite (CLARIAH). Special attention is paid here to the functionalities of DIVE+, a Linked Data event-based browser, based on the simple event data model, where users can browse and explore different heritage collections simultaneously, which supports the creation of browsing narratives.

1 Nick Couldry, Andreas Hepp and Friedrich Krotz, Media Events in a Global Age, Routledge, 2009; Elihu Katz and Tamar Liebes, ''No More Peace!': How Disaster, Terror and War Have Upstaged Media Events,' International Journal of Communication 1, 2007, 157-166; Daniel Dayan and Elihu Katz, Media Events: The Live Broadcasting of History, Harvard University Press, 1992.
2 Gary Marchionini, 'Exploratory Search: From Finding to Understanding,' Communications of the ACM 49, 4, 2006, 41-46.
3 Gary Marchionini and Ryen White, 'Find What You Need, Understand What You Find,' International Journal of Human-Computer Interaction, 23, 3, 2007, 205-237.
4 Sonja De Leeuw, 'European Television History Online: History and Challenges,' VIEW: Journal of European History and Culture 1,1, 2012, 3-11.
5 Marie-Laure Ryan, ed., Narrative across Media: The Languages of Storytelling, U of Nebraska Press, 2004.
6 Elihu Katz and Tamar Liebes, ''No More Peace!': How Disaster, Terror and War Have Upstaged Media Events,' International Journal of Communication 1, 2007, 157-166; César Jiménez-Martínez, 'Integrative Disruption: The Rescue of the 33 Chilean Miners as a Live Media Event,' in Andrew Fox (ed.) Global Perspectives on Media Events in Contemporary Society, IGI Publishers, 2016, pp. 60-77.
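To give a concrete sense of what event-based browsing over linked open data involves, the sketch below shows the kind of SPARQL query a DIVE+-style browser might issue against collections described with the Simple Event Model. This is a hedged illustration rather than DIVE+'s actual code: the endpoint URL is a placeholder, and the properties shown (sem:hasActor, sem:hasPlace) are standard SEM vocabulary rather than a documented DIVE+ API.

```python
# Hedged sketch: event-centric exploration of a SEM-structured linked
# data endpoint. The endpoint URL is a placeholder, not DIVE+'s service.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.org/sparql"  # hypothetical endpoint

QUERY = """
PREFIX sem: <http://semanticweb.cs.vu.nl/2009/11/sem/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?event ?label ?actor ?place WHERE {
  ?event a sem:Event ;
         rdfs:label ?label .
  OPTIONAL { ?event sem:hasActor ?actor . }
  OPTIONAL { ?event sem:hasPlace ?place . }
  FILTER regex(?label, "flood", "i")  # e.g. exploring 'disruptive' events
}
LIMIT 25
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

# Each returned binding (event, actor, place) is a potential next step in
# a user's browsing narrative across heritage collections.
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```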
5 Marie-Laure Ryan, ed., Narrative across Media: The Languages of Storytelling, U of Nebraska Press, 2004.
6 Elihu Katz and Tamar Liebes, ‘‘No More Peace!’: How Disaster, Terror and War Have Upstaged Media Events,’ International Journal of Communication 1, 2007, 157-166; César Jiménez-Martínez, ‘Integrative Disruption: The Rescue of the 33 Chilean Miners as a Live Media Event,’ in Andrew Fox (ed.), Global Perspectives on Media Events in Contemporary Society, IGI Publishers, 2016, pp. 60-77.

seeks both technical and academic innovation. This study therefore takes a cross-disciplinary digital hermeneutics approach. By integrating digital technology for interpretation support, we provide insight into the roles of narratives in digital hermeneutics – the encounter of hermeneutics and web technology7 – and into how events (and in what form) help interpretation. Our theoretical and methodological framework connects ideas about user studies in Digital Humanities to our own user-centred design and mixed methodology: a co-creative design approach that includes focus groups, research diaries, and questionnaires with open questions, to learn about the role of narrative creation and exploratory search in media research practices. Furthermore, the framework brings prior work on media events and narratives into focus, in relation to our research on understanding user-technology interactions. Our analysis is focused on how researchers use and reflect on the use of exploratory search tools, and how exploratory search informs narrative creation practices. The collected data provides insights into how researchers search and explore digital audio-visual archives to form narratives. Through user studies, we were able, first, to focus on how researchers construct navigation paths via exploratory search, and, second, to evaluate the role of narratives in learning about historical mediated events and doing research into these events. In this process, DIVE+ (see §2 and Video 1) was also compared to other online search tools, such as Google Explore. Ultimately, studying how researchers work with AV can provide specific insights into the different perspectives that define the course and framing of mediated events, and our study offers a critique of digital tools’ socio-technical affordances in terms of support for search, retrieval and narrative creation by researchers working with AV materials.

Video 1. DIVE+: Explorative Search for Digital Humanities (https://www.youtube.com/watch?v=FI3MPiU9rjo).

Digital Humanities centres on humanities questions that are raised by and answered with digital tools. At the same time, the DH field interrogates the value and limitations of digital methods in Humanities disciplines. While it is important to understand how digital technologies can offer new avenues for Humanities research, it is equally essential to understand and interpret the ‘user side’ and sociology of Digital Humanities. Our overarching research question is concerned with how media researchers (scholars and professionals) appropriate search tools to ask and answer new questions, and apply digital methods when working with AV sources. To answer this question, we relate it to a concrete search practice and digital tool, and ask the sub-question: how does exploratory search support researchers in studying (disruptive) media events across media, and how are these events instilled with specific cultural or political meanings?
7 Chiel Van Den Akker, Susan Legêne, Marieke Van Erp, Lora Aroyo, Roxane Segers, Lourens Van Der Meij, Jacco Van Ossenbruggen, Guus Schreiber, Bob Wielinga, Johan Oomen and Geertje Jacobs, ‘Digital Hermeneutics: Agora and the Online Understanding of Cultural Heritage,’ WebSci 11, Koblenz, Germany, 2011.

As a result, we can consider the implications for how researchers interpret and negotiate AV sources and the affordances of digital tools in their own research practices. User studies observe technology use in practice, and can therefore show how users appropriate technologies.8 User studies can serve to evaluate technologies in UI/UX testing (i.e. User Interface Design and User Experience testing) and pre-conceived use cases.9 They may also help us understand how technologies are increasingly becoming part of disciplinary practices.10 Whilst previous user research in Digital Humanities concentrates on assessing how and why Digital Humanities benefits from studies into user needs and behaviour11 – on user requirement research12 and on participatory design research13 – our article proposes to add to this body of research by presenting insights of a cross-disciplinary user study that involves researchers studying AV materials, in an iterative co-creative design process14 set up to fine-tune and further develop a digital tool that supports audio-visual research through exploratory search. We employed a user-centred design methodology15 to analyse researchers’ engagement when using exploratory search and, more specifically, how users and technologies co-construct meaning and meaning-making practices. We studied how media researchers use digital search technologies in their daily work practices to discover and explore digital AV archival material. Our study includes three user types: (1) Media Studies researchers, who are generally more experienced in working with AV sources; (2) Humanities researchers who use AV materials as a source for research or are interested in doing so, with varying degrees of expertise; and (3) media professionals who need to retrieve AV materials for audio-visual text productions, such as television programmes, journalistic productions or other creative endeavours. In groups 1 and 2 we met with both university students (at advanced levels) and lecturers. Humanities researchers (group 2) include scholars with academic backgrounds such as history, international studies, digital humanities, communication studies, and languages and culture studies, whilst media professionals (group 3) include journalists, television/image researchers, documentalists, documentary filmmakers, digital storytellers, and media innovation experts. These user types are the foreseen end users of DIVE+ and the overarching Media Suite tool and database, because they create audio-visual narratives for their respective work purposes.
We set up co-creative design sessions (see §5) with 89 researchers in both academic and professional settings, across different cities and institutions in the Netherlands (group 1: 21 participants; group 2: 57 participants; group 3: 11 participants) to observe and reflect on how they interact with search tools to explore, access and retrieve digitized AV material for narrative creation and, in some cases, for creative re-use of this material in new audio-visual productions. From this micro-analysis, we extrapolate insights at the meso level: relating insights gained about user interactions with one exploratory search tool (DIVE+) to more overarching ideas about user-technology interactions, and to what such interactions imply about the role of digital tools in the Humanities and Media Studies.

8 Leslie Haddon, ‘Domestication Analysis, Objects of Study, and the Centrality of Technologies in Everyday Life,’ Canadian Journal of Communication 36, 2, 2011; Nelly Oudshoorn and Trevor Pinch, How Users Matter: The Co-construction of Users and Technology, MIT Press, 2003.
9 Claire Warwick, ‘Studying Users in Digital Humanities,’ Digital Humanities in Practice, Facet Publishing, 2012, pp. 1-21.
10 James Stewart and Robin Williams, ‘The Wrong Trousers? Beyond the Design Fallacy: Social Learning and the User,’ in Debra Howcroft and Eileen M. Trauth (eds.), Handbook of Critical Information Systems Research: Theory and Application, Edward Elgar, 2005, pp. 195-221.
11 Claire Warwick, ‘Studying Users in Digital Humanities,’ Digital Humanities in Practice, Facet Publishing, 2012, pp. 1-21.
12 Harriet E. Green and Patricia Lampron, ‘User Engagement with Digital Archives for Research and Teaching: A Case Study of Emblematica Online,’ portal: Libraries and the Academy, 17, 4, 2017, 759-775.
13 Max Kemman and Martijn Kleppe, ‘User Required? On the Value of User Research in the Digital Humanities,’ Selected Papers from the CLARIN 2014 Conference, October 24-25, 2014, Soesterberg, The Netherlands, 116, Linköping University Electronic Press, 2014.
14 Elisabeth B.-N. Sanders and Pieter Jan Stappers, ‘Co-Creation and the New Landscapes of Design,’ Co-Design, 4, 1, 2008, 5-18.
15 S.M. Zabed Ahmed, Cliff McKnight and Charles Oppenheim, ‘A User-Centred Design and Evaluation of IR Interfaces,’ Journal of Librarianship and Information Science, 38, 3, 2006, 157-172.

2 DIVE+

This research study is CLARIAH-centric, approached from the perspective of DIVE+. The latter is integrated in the national CLARIAH (Common Lab Research Infrastructure for the Arts and Humanities) research infrastructure in the Netherlands – as part of the CLARIAH Media Suite – and is aimed at providing researchers with access to digitized audio-visual data as well as tools for research and analysis. DIVE+ is an event-centric Linked Data digital collection browser, which offers intuitive, exploratory browsing and exploration of media events at different levels of detail.
It connects media objects (images or movies retrieved from cultural datasets), places (geographical or descriptive), actors (people or organizations) and concepts that are depicted in or associated with particular collection objects, to contextualize search paths into overarching narratives and timelines.16 This tool is the result of collaboration between computer scientists, Humanities scholars, cultural heritage professionals and interaction designers.17

Figure 1. DIVE+ supports creation, saving and sharing of explored connections between objects, persons and places in the form of so-called search narratives.18

Events are a central part of this data enrichment: giving context to objects in collections by linking them in events. DIVE+ builds on the results of DIVE by expanding this digital hermeneutics approach for interaction, interpretation and exploration of digital heritage via different and linked online collections, providing a basis for interpretation support in the searching and browsing of heritage objects, with semantic information from existing collections plus open Linked Data vocabularies.19 This browser offers event-driven exploration of digital heritage material, in which events are prominent building blocks in the creation of narrative backbones,20 and it links a variety of different media sources and collections. Whilst DIVE+ is continually updated, including through crowdsourcing, at the time of our research the browser contained entities from Delpher (scanned radio bulletins from the KB/National Library of the Netherlands), the Amsterdam Museum, the Tropenmuseum and the Netherlands Institute for Sound and Vision (news broadcasts of the Open Images collection).

16 In Media Suite version 4, the DIVE+ categories have been updated to Media Objects, People, Locations, and Concepts.
17 DIVE+ is a research project funded by the NLeSC and is a collaborative effort of Vrije Universiteit Amsterdam (Lora Aroyo, Victor de Boer, Oana Inel, Chiel van den Akker, Susan Legêne), the Netherlands Institute for Sound and Vision (Jaap Blom, Liliana Melgar, Johan Oomen), Frontwise (Werner Helmich), the University of Groningen (Berber Hagedoorn, Sabrina Sauer) and the Netherlands eScience Centre (Carlos Martinez Ortiz). It is also supported by CLARIAH and NWO. It was the winning submission of the LODLAM Challenge 2017 Grand Prize (International Summit for Linked Open Data in Libraries, Archives and Museums) in recognition of how DIVE+ demonstrates the social, cultural and technical impact of Linked Data.
18 Victor de Boer, Oana Inel, Lora Aroyo, Chiel van den Akker, Susan Legêne, Carlos Martinez, Werner Helmich, Berber Hagedoorn, Sabrina Sauer, Jaap Blom, Liliana Melgar and Johan Oomen, ‘DIVE+: Exploring Linked Integrated Data,’ Europeana Insight, September 2017, https://pro.europeana.eu/page/issue-7-lodlam#dive-exploring-integrated-linked-media.
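To make this event-centric Linked Data approach more tangible, the following minimal sketch (in Python, using the rdflib library) builds and queries a toy graph loosely modelled on the Simple Event Model that DIVE+ draws on. All entity names, the example.org namespace and the depictedBy property are our own illustrative assumptions, not actual DIVE+ data or its API.

from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, RDFS

# Simple Event Model (SEM) vocabulary; EX is a hypothetical local namespace.
SEM = Namespace("http://semanticweb.cs.vu.nl/2009/11/sem/")
EX = Namespace("http://example.org/dive/")

g = Graph()
g.bind("sem", SEM)

# An event node gives context to objects by linking actors, places and media.
flood = EX["event/watersnoodramp-1953"]
g.add((flood, RDF.type, SEM.Event))
g.add((flood, RDFS.label, Literal("Watersnoodramp (North Sea flood), 1953")))
g.add((flood, SEM.hasPlace, EX["place/zeeland"]))
g.add((flood, SEM.hasActor, EX["actor/red-cross"]))
g.add((flood, EX.depictedBy, EX["media/openimages-newsreel-123"]))  # hypothetical property

# Exploratory browsing approximated as a query: everything connected to the event.
query = """
SELECT ?p ?o WHERE { <http://example.org/dive/event/watersnoodramp-1953> ?p ?o . }
"""
for p, o in g.query(query):
    print(p, o)

In DIVE+ itself, such links are established through shared vocabularies across the connected collections (see Figure 2 below), so that a user who lands on one entity can pivot to related media objects, people, places and concepts without formulating a new keyword query.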
Our research study aids in answering the question of how such a browser – technically based on linked open data, supporting event-centric exploration and context analysis – can support a scholar/researcher from beginning to end, and this study therefore helps to improve DIVE+ (as part of the overarching Media Suite) as a browser. To do so, our research study draws upon the exploration of narratives (a narrative-centric approach) instead of other types of search (for instance, more traditional or document-centric approaches, such as faceted search). Moreover, this study addresses the purpose and usefulness of narratives for scholarly research.

Figure 2. DIVE+ Linked Data sources and vocabularies: establishing explorable links through shared vocabularies.

3 Exploratory Search: A Basis for Tool Criticism and Researching ‘Disruptive’ Media Events

Users’ ideas and practices with exploratory search and retrieval technologies can not only shape AV narratives and productions, but can also enhance the development of exploratory search tools. Our study contributes to ideas about tool criticism in Digital Humanities research. According to David Berry,21 the first wave of Digital Humanities focused on digitization and realizing technological infrastructures, whilst the second wave was generative, creating environments and tools to interact with data that is born digital. The third wave, which Berry refers to in terms of a third layer, should concentrate on “the underlying computationality of the forms held within a computational medium (...) to look at the digital component of the Digital Humanities in the light of its medium specificity, as a way of thinking about how media changes produce epistemic changes”.22 We advocate a research approach in which questions such as why specific data is collected, for what purpose, and within what context – the so-called politics of archiving – are addressed from a critical (Humanities) perspective. In line with Berry, we aim to understand (research) culture through digital technology and, even more specifically, the ways in which digital tools facilitate everyday research practices.23 We interrogate the underlying assumptions about how media researchers explore AV materials online.

19 DIVE+ Project Homepage, Beeld en Geluid, http://diveproject.beeldengeluid.nl.
20 Victor de Boer, Liliana Melgar, Oana Inel, Carlos Martinez Ortiz, Lora Aroyo and Johan Oomen, ‘Enriching Media Collections for Event-Based Exploration,’ 11th Metadata and Semantics Research Conference (MTSR 2017), Tallinn, Estonia. Best Paper Award; Victor de Boer, Johan Oomen, Oana Inel, Lora Aroyo, Elco van Staveren, Werner Helmich and Dennis de Beurs, ‘DIVE into the Event-Based Browsing of Linked Historical Media,’ Web Semantics: Science, Services and Agents on the World Wide Web, 35, 3, 2015, 152-158.
This is in line with Berry’s argument that one should understand culture through the use of (and through working with) digital technology, with a focus on how people use software in their everyday practices.24 Moreover, a reflection on the use of a digital search tool designed to afford both exploration and narrative creation allows us to draw user-validated conclusions about how this particular tool reshapes an understanding of what it means to explore and create narratives via digital tools. It may well turn out that the ways in which the tool designers translated ideas about exploring and narrativizing digital material do not match how the foreseen users understand exploration and narratives. We argue that exploratory search is crucial for Humanities researchers who draw upon audio-visual materials in their research. Recognizing relevant multi-platform sources and bringing these to attention – in an iterative fashion – greatly supports scholars in their research. Supporting researchers’ explorations is especially relevant in the case of scholars studying complex mediated and/or historical events. In the first place, this is because audio-visual, online and digital sources are in abundance, scattered across different platforms and changing daily in our contemporary landscape. Second, disruptive media events are difficult to interpret due to the challenges of grasping the immediate story. A media event is an event with a specific narrative that gives the event its meaning, and is in contemporary societies increasingly recognized as non-planned or disruptive. Disruptive media events,25 such as the sudden rise of populist politicians, terrorist attacks or environmental disasters, are shocking and unexpected, making them especially difficult to interpret. One can even argue that in today’s crossmedia landscape, disruption has become a marker of the way in which news narratives are continually told, circulated and shared across media, formatted as breaking news.26 This leads to problems for researchers who analyse how narratives construct different political, economic or cultural meanings around such events. Previous research argues that media events should always be viewed in relation to their wider political and socio-cultural contexts. Events, as they unfold in the media, may correspond to long-term social phenomena, and the way in which such events are constructed has particular connotations. Specific actors (newscasters, governments, institutions, political interest groups) use media events to build narratives in line with their own political, economic or cultural purposes – examples are stories of empathy, fear and change in relation to international media events.27 We argue that researchers, in turn, also build event narratives, and can therefore be said to be storytellers. Yet disruptive media events, such as live broadcasting marathons of disaster, terror, and war, have not yet been researched in the context of exploratory search strategies.

21 David M. Berry, ‘Introduction: Understanding the Digital Humanities,’ in David M. Berry (ed.), Understanding Digital Humanities, Palgrave Macmillan UK, 2012, pp. 1-20.
22 Ibid., p. 4.
23 Ibid., p. 5.
24 Ibid., p. 5.
25 Elihu Katz and Tamar Liebes, ‘‘No More Peace!’: How Disaster, Terror and War Have Upstaged Media Events,’ International Journal of Communication 1, 2007, 157-166.
26 Ingrid Volkmer, News in Public Memory: An International Study of Media Memories across Generations, Peter Lang, 2006; Daniel Dayan and Elihu Katz, Media Events: The Live Broadcasting of History, Harvard University Press, 1992.
27 César Jiménez-Martínez, ‘Integrative Disruption: The Rescue of the 33 Chilean Miners as a Live Media Event,’ in Andrew Fox (ed.), Global Perspectives on Media Events in Contemporary Society, IGI Publishers, 2016, pp. 60-77.

Searching for stories shapes stories.28 Prior research underlines the importance of visualizing, constructing and storing narratives during information navigation in order to contextualize retrieved materials.29 Our own research study further illuminates the role of media researchers as storytellers, and their processes of selection and interpretation when working with audio-visual sources and learning about mediated events, especially regarding search, retrieval and narrative creation.

4 Co-Creative User Sessions

Our case study approach combines grounded theory – which fosters an understanding of how researchers interpret and create narratives – with usability methodologies, such as work task evaluations. First of all, this allows us to draw conclusions about how search tools and digital technologies co-construct the researcher’s professional practice. Second, the data helps us probe the question of how the digitality of search and retrieval shapes the practice of media research and, by extension, creative and storytelling processes. The research takes an interdisciplinary approach: it combines insights from Media Studies as well as from Information Studies and Science and Technology Studies, and integrates ideas about narrative creation, search practices, and overarching notions about how users and technologies co-construct meaning.30 Therefore, the presented research does not necessarily focus on how Digital Humanities tools have an impact on researchers’ practices, but rather analyses how researchers make use of search tools. In our user study, we collected qualitative data to answer our main question; in keeping with our user-centred approach, we (A) observed how users used the search browser by giving them search tasks;31 (B) asked users for specific written and verbal feedback about their user experience (questionnaires with open questions and research diaries);32 and (C) collected user perspectives on the role of digital search technologies in Humanities research in the shape of user-generated posters. The user study observes media researchers as they use DIVE+ to explore media events, across three stages: (1) research question formulation; (2) DIVE+ use; and (3) comparative user evaluations of the DIVE+ browser against other online search tools such as Google Explore, resulting in specific search narratives. While interacting with the search browser, users were observed and asked to provide feedback on their search experience, talking aloud about their search journeys. They were subsequently asked to export the navigation paths that were generated in the DIVE+ browser and to provide written or verbal feedback on their experiences in terms of how DIVE+ supports narrative creation about historical events. This feedback was then discussed during a focus group session, in which we asked participants to reflect on their experiences.
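As a purely hypothetical illustration of the kind of data such a session yields, the sketch below logs a navigation path as an ordered, timestamped list of visited entities that can be exported for later reflection. The field names and the JSON export format are our own assumptions for illustration; they do not reproduce the actual DIVE+ export.

import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class PathStep:
    entity: str       # e.g. a media object, person, place or concept identifier
    entity_type: str  # "media_object" | "person" | "location" | "concept"
    visited_at: str

@dataclass
class NavigationPath:
    query: str
    steps: list = field(default_factory=list)

    def visit(self, entity: str, entity_type: str) -> None:
        # Record each entity the user opens while browsing.
        self.steps.append(PathStep(entity, entity_type,
                                   datetime.now(timezone.utc).isoformat()))

path = NavigationPath(query="watersnoodramp")
path.visit("event/watersnoodramp-1953", "concept")
path.visit("media/newsreel-zeeland", "media_object")
print(json.dumps(asdict(path), indent=2))  # an exportable 'search narrative'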
28 Sabrina Sauer, ‘Audiovisual Narrative Creation and Creative Retrieval: How Searching for a Story Shapes the Story,’ Journal of Science and Technology of the Arts, 9, 2, 2017, 37-46.
29 Berber Hagedoorn and Sabrina Sauer, ‘Getting the Bigger Picture: An Evaluation of Media Exploratory Search and Narrative Creation,’ paper, DHBenelux 2017 Conference, Utrecht University, Utrecht, 4 July 2017; Chiel Van Den Akker, Susan Legêne, Marieke Van Erp, Lora Aroyo, Roxane Segers, Lourens Van Der Meij, Jacco Van Ossenbruggen, Guus Schreiber, Bob Wielinga, Johan Oomen and Geertje Jacobs, ‘Digital Hermeneutics: Agora and the Online Understanding of Cultural Heritage,’ WebSci 11, Koblenz, Germany, 2011; Maartje Kruijt, Supporting Exploratory Search with Features, Visualizations, and Interface Design: A Theoretical Framework, MA thesis, University of Amsterdam, 2016; Sonja De Leeuw, ‘European Television History Online: History and Challenges,’ VIEW: Journal of European History and Culture 1, 1, 2012, 3-11.
30 Wiebe Bijker, Thomas Hughes and Trevor Pinch, The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology, MIT Press, 2012.
31 Barbara Wildemuth and Luanne Freund, ‘Assigning Search Tasks Designed to Elicit Exploratory Search Behaviors,’ in Proceedings of the Symposium on Human-Computer Interaction and Information Retrieval, ACM, 2012, p. 4.
32 Elaine G. Toms and Wendy Duff, ‘‘I spent 1,5 hours sifting through one large box’: Diaries as Information Behavior of the Archives User: Lessons Learned,’ Journal of the American Society for Information Science and Technology, 53, 14, 2002, 1232-1238.

5 Search Tasks

Users were introduced to the DIVE+ search browser and the overarching Media Suite, as well as to Google Explore and selected online audio-visual repositories, and were subsequently asked to perform a search task. Search tasks are “goal-oriented activities carried out using search systems”.33 We developed exploratory search tasks in line with recommendations for task design.34 This means we tailored tasks to research situations. An example task given to the users of the DIVE+ browser was:

Example task 1: Imagine that a media company is going to produce programmes about Jakarta, Beatrix (former Queen and now Princess of the Netherlands), Islam, or the Watersnoodramp (1953 North Sea flood). Your goal is to propose an interesting angle for one of the programmes.

Figure 3. Image of a heavily damaged house during the 1953 North Sea flood in Zeeland, the Netherlands. Source: Wikimedia Commons.

For an exploratory search task such as described in example task 1, with a specific focus on the keyword Watersnoodramp (referring to the 1953 North Sea flood, a natural disaster in the Netherlands with 1,836 casualties), this could result in an exploration path and search narrative as visualized in Video 2.

33 Barbara Wildemuth, Luanne Freund and Elaine G. Toms, ‘Untangling Search Task Complexity and Difficulty in the Context of Interactive Information Retrieval Studies,’ Journal of Documentation 70, 6, 2014, 1118-1140.
34 Pia Borlund, ‘A Study of the Use of Simulated Work Task Situations in Interactive Information Retrieval Evaluation: A Meta-Evaluation,’ Journal of Documentation 72, 3, 2016, 394-413.
Video 2. Exploration Path 1: Using DIVE+ (Media Suite) to search for watersnoodramp (North Sea flood) (https://www.youtube.com/watch?v=pXwzejOE57A).

Another example task, given to users to research a long-term media event, was:

Example task 2: Try looking for sources about the representation of the social acceptance of migrants, refugees and migration as a long-term event, and its impact on (Dutch) society. What research questions are sparked by what you find? How do the search affordances of the online repository/ies shape your research question and your understanding of the topic? Reflect on your own role as a storyteller, and how you think the tool you are using influences this role.

For an exploratory search task such as described in example task 2, with a specific focus on the keyword vluchteling (refugee), this could result in a navigation path and search journey such as visualized in Video 3.

Video 3. Exploration Path 2: Using DIVE+ (Media Suite) to search for ‘vluchteling’ (‘refugee’) (https://www.youtube.com/watch?v=S5eDn2UmGaE&feature=youtu.be&hd=1).

6 The Research Journey

Research into mediated events represented on multiple media platforms (including crossmedia or multi-platform audio-visual texts) can then take the following general steps when circling around a research question and specifying a research topic. This process of the research journey follows, in an iterative fashion, the steps of Explore – Refine – Analyse – Tool Criticism – Write – Disseminate. These are presented below in a model for grounded analysis, which answers our discussed need for hermeneutic approaches in Digital Humanities: to closely consider the practices and mechanisms conditioning narrative creation, and for researchers to include a self-reflexive approach.

Figure 4. The Research Journey: Explore – Refine – Analyse – Tool Criticism – Write – Disseminate.

6.1 Step 1. Explore

Exploring the topic to acquire contextual information about it (exploratory search, context acquisition):
a) searching for videos;
b) accessing academic databases to explore topics, and reading historical overviews and articles;
c) searching for AV material using faceted search – by names, date, genre (news, documentary, current events programmes) and broadcaster (a minimal sketch of such faceted filtering follows this list);
d) visiting archives physically to read newspapers from the days of the event, and the weeks/months after.
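As flagged in step (c) above, faceted search narrows a result set by predefined fields. The sketch below shows the basic mechanism; the records and field names are invented for illustration and do not reflect any actual archive schema.

from datetime import date

# Toy AV catalogue records (illustrative only).
records = [
    {"title": "Journaal: watersnood", "date": date(1953, 2, 2),
     "genre": "news", "broadcaster": "NTS"},
    {"title": "Herdenking watersnoodramp", "date": date(1978, 2, 1),
     "genre": "documentary", "broadcaster": "NOS"},
]

def facet_filter(items, **facets):
    """Keep records whose fields match every requested facet value."""
    return [r for r in items
            if all(r.get(k) == v for k, v in facets.items())]

# Narrow by genre and broadcaster, as a faceted interface would.
print(facet_filter(records, genre="news", broadcaster="NTS"))

Exploratory search, by contrast, does not presuppose such well-defined fields, which is precisely what makes it suited to the initial, open-ended stage of the research journey.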
Making a decision about which collections/archives are of interest to search has ramifications: are these collections accessible, in terms of (1) their location: does the researcher need to visit the collection/archive in person, or is there a digital point of access; (2) the materiality of the collection: is the collection retrievable, and in what material form (physical objects, digitized, or born digital); and (3) contextualization, such as accessible metadata and other forms of contextualization that give research value to the collection items.

6.2 Step 2. Refine

Refining ideas about the topic by sorting and relating sources. There are different ways to connect materials on paper (a researcher may use a mind map to draw out how sources relate) in order to piece together, primarily:
a) the sequence of events;
b) the different sources that are found (is it a primary source, is it a secondary source);
c) storylines: what do the sequences of events (described by the different sources) show in terms of a narrative, and what is the story that is being told? Is this a description of events, or an interpretation of said facts? In other words, how are the disruptive events translated into a story (a short or long-term narrative)?

When searching in collections, the researcher can refine the search, for instance by title and keywords of the event or implicated persons (to see whether data is available about the event), looking for (1) media content as close to the event in time as possible and (2) media content that discusses the event at a greater distance (for example, political talk shows aired well after the event rather than directly after it), in order to collect the discourses surrounding the event for analysis. For each collection, the researcher should apply source criticism. This includes questions such as: what media objects, subjects, places, and actors are part of the event, and what information is available to study the event? And what is the position of the retrieved object in the context of the larger collection it belongs to? Researchers therefore critically reflect on the role of provenance, novelty, and diversity of objects and collections.

6.3 Step 3. Analyse

After selecting a corpus, the researcher analyses this corpus to gain insight into processes of construction and manipulation of meanings: analysing the selected materials, looking specifically at how each item tells a story, or trying to piece together what is happening or has happened, per:
a) type(s) of material and medium: television broadcasts, radio broadcasts, online articles (when archived), newspaper articles, interviews, scholarly articles;
b) narrative discourse(s): how is the story about the event being told, and what are the central keywords used in the descriptions – because this helps create insight into the discourse(s) surrounding the event;
c) the stories/narratives told about the event: what stories are told, how are media trying to understand what is happening, and what do these narratives signify in terms of how we interpret media events?
d) integrating findings.

6.4 Steps 4, 5 and 6. Tool criticism, writing, and dissemination

Finally, the researcher integrates findings and writes these up.
During the writing process, including recording findings for dissemination, the researcher also demonstrates tool criticism – also in relation to the aforementioned step of source criticism – by explicitly reflecting on and demonstrating awareness of:
a) how the archive and search tools used constrain or shape the outcome of the research process. During the writing process, the researcher therefore also needs to have access to, or be able to gather, information about the selection and interpretation process of the tool and repository/database used;
b) how the research and dissemination practices of the researcher (contextualization, re-mix, re-use) could possibly add to further contextualization of cultural heritage objects;
c) and finally, during the writing up of findings for dissemination, the researcher can pay particular attention to how the research gives insight into how media create lucid narratives about events that are inherently complex and chaotic, as well as scattered.

This form of grounded analysis leaves room for scholars to discover unexpected insights, new narratives and discourses. Discovering a multitude of narratives around the event can be just as interesting, as it grants insight into the multi-interpretability of past events.

7 Analysis of Search Practices and Tool Criticism

7.1 Using exploratory search during research question formulation

Exploratory search tools are not used very often.35 Media Studies researchers did, however, indicate enjoyment at the freedom that exploratory search offered them, especially in terms of how it triggered research questions. For instance, Media Studies students with advanced experience in working with audio-visual sources and digital search tools (BA level 3 Research Seminar) seemed to associate a clear research question with rigorous and intent-heavy search, whereas exploratory search is regarded as more free-flowing, aiding them in learning about facts that they would not have learned about when using more traditional sources. Exploratory search in this way can help with further focusing or defining the scope of one’s research, and even with developing a research question:

“Exploratory search can result in new perspectives and approaches which in turn benefit the initial research” – Media Studies researcher [respondent no. 55]

Humanities researchers further indicated how the randomness of source selection opened up chances for researchers to find sources that other methods might not reveal. In particular, collections that offer the possibility to search Linked Data (related entities) from a singular entry point were considered to have the potential to illustrate context more than a historical account might provide. Contextual understanding is also central: respondents often indicated that exploratory search does not necessarily add to the actual research project, but to the understanding of the topic they are researching.
On the one hand, this seems to be valued quite highly; on the other hand, it does not seem to be a priority during research in general, as a group of Media Studies researchers concluded after collecting user perspectives on the role of digital search technologies in Humanities research in the shape of user-generated posters:

“Overall, we do believe that exploratory search is useful but perhaps to create a general understanding of the topic you are researching, rather than to find specific information that could answer your research question” – Media Studies researcher [respondent no. 56]

7.2 Serendipity

Exploratory search then seems to function more as a creative stimulus. Makri et al. have argued that digital information environments need to support serendipity strategies to allow users to “make mental space or draw on previous experiences”.36 In this context, the co-creative design sessions practically point to how exploratory search during research question formulation and information retrieval offers potential for serendipitous browsing. Serendipitous search encounters are generally characterized as fortuitous accidental findings that are the outcome of a creative act,37 which is either afforded by the personality type of the seeker (e.g. ‘super-encounterers’38 have prepared minds and are open to recognizing serendipitous findings) or by triggers embedded in the search system. Our user studies bring into view what organization and management theorist Miguel Pina e Cunha has described as follows: “[W]hile unexpected sources of knowledge are by definition impossible to locate (...) serendipitous discoveries may result from intentional exploratory search processes”.39 However, although finding new, unexpected narratives is important – in terms of new discoveries – discovering insights serendipitously is not a goal in itself. Rather, eliciting serendipity is, implicitly, part of the skillset of a researcher.

7.3 Angles for creative content

In this process, for media professionals specifically, the research question is translated into searching for an angle on a topic: from macro (the bigger idea or angle) to micro. The ‘angle’ is something that depends on the perceived audience of the programme or text the professionals are creating. For instance, an informative programme for a young target audience requires a different take on the Watersnoodramp (North Sea flood) disruptive event than a documentary for adults would. Exploration is guided by expectations about the audience and the researcher’s own domain knowledge: how much does the professional personally know about, and how much are they personally interested in, the topic? How much exploration is afforded, also in terms of time and budget?

35 Berber Hagedoorn and Sabrina Sauer, ‘Getting the Bigger Picture: An Evaluation of Media Exploratory Search and Narrative Creation,’ paper, DHBenelux 2017 Conference, Utrecht University, Utrecht, 4 July 2017.
36 Stephann Makri, Ann Blandford, Mel Woods, Sarah Sharples and Deborah Maxwell, ‘‘Making My Own Luck’: Serendipity Strategies and How to Support Them in Digital Information Environments,’ Journal of the Association for Information Science and Technology, 65, 11, 2014, 2179-2194.
Exploration is impacted by the professionals’ poetics,40 meaning the practices, conditions and unwritten rules of thumb guiding the selection and interpretation processes of media professionals working with different genres, programmes (for instance, television history programming) and target audiences,41 which in turn guide practices of creative retrieval as well.42 The institution of the archive, and the documentalists working there, need to be included here as agents of historical knowledge, as they also reveal such particular aims, strategies and conditions regarding the provision of access to, contextualization of, and circulation of AV sources.43 Our user studies further reveal how media professionals (journalists, television/image researchers, documentalists, documentary filmmakers, digital storytellers, and media innovation experts) often search Wikipedia and YouTube to find inspiration for an angle, while newspapers (in databases) are reviewed for more detailed information about and around a topic. Previously made productions are also revisited: what has already been made and searched for regarding the topic? Professionals’ search also includes various search tricks: the use of words that will lead to interesting material (such as the search term curiosa, a term that only expert users of the archive system would think of). Offering a browser that invites users to find inspiring and interesting material for a new angle on a topic becomes relevant for AV narratives that need original content, such as documentaries or the news.

37 Elaine G. Toms, ‘Serendipitous Information Retrieval,’ DELOS Workshop: Information Seeking, Searching and Querying in Digital Libraries, n.p., 2000.
38 Sanda Erdelez, ‘Information Encountering: It’s More Than Just Bumping into Information,’ Bulletin of the American Society for Information Science, Feb/March 1999, 25-29.
39 Miguel Pina e Cunha, ‘Serendipity: Why Some Organizations Are Luckier than Others,’ FEUNL Working Paper Series, Lisbon, 2005.
40 Berber Hagedoorn, ‘Collective Cultural Memory as a TV Guide: ‘Living’ History and Nostalgia on the Digital Television Platform,’ Acta Universitatis Sapientiae, Series Film and Media Studies 14, ‘Histories, Identities, Media,’ 2017, 71-94, p. 73; Berber Hagedoorn, ‘De poëtica van het verbeelden van geschiedenis op broadcast televisie,’ Journal for Media History/Tijdschrift voor Mediageschiedenis 20, 1, 2017, 78-114.
41 Berber Hagedoorn, Doing History, Creating Memory: Representing the Past in Documentary and Archive-Based Television Programmes within a Multi-Platform Landscape, doctoral dissertation, Faculty of Humanities, Utrecht University, the Netherlands, 2016, pp. 24-33.
42 Sabrina Sauer and Maarten de Rijke, ‘Seeking Serendipity: A Living Lab Approach to Understanding Creative Retrieval in Broadcast Media Production,’ in Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2016, 989-992.
43 Berber Hagedoorn and Bas Agterberg, ‘The End of the Television Archive as We Know It? The National Archive as an Agent of Historical Knowledge in the Convergence Era,’ Media and Communication, 4, 3, 2016, 162-175.
Sometimes the sheer amount of material is daunting; nevertheless, the media professionals feel that it needs to be browsed through. Digital tools offering better support for this are therefore met with a certain enthusiasm.

7.4 Making meaning, creating lucid narratives

All three user groups demonstrate deeper reflections about how the tools that they used for (re)search and retrieval inherently provide narrative elements. On an individual level, this is regarded as crucial in relation to the subjectivity of research. Users reflected further on how they – meaning the user as a researcher – are not the only influential factor regarding the research narrative they produce; rather, their tools and their own use of these tools also impact the way in which their research is shaped and narrated:

“Meaning is attributed to the way one searches and conducts research” – Media Studies researcher [respondent no. 58]

“The meaning is formed by the search tools you use and the way that you search” – Media Studies researcher [respondent no. 64]

“Real connections still have to be made in an old and traditional way... in the mind of the researcher” – Humanities researcher [respondent no. 14]

Subsequently, the resulting search or narrative path, which represents a mediated event as a (more or less) lucid narrative, is also not regarded as neutral: “Narrative is a framing tool that helps shape information” – Media Studies researcher [respondent no. 59] (our emphasis). Our research study offers practical examples of how exploratory search can, then, support interpretation and narrative creation around events, through the visualization of the navigation path. “Exploratory research lets you see connections and thus shows you the meaning of AV content” – Media Studies researcher [respondent no. 75]. “When AV content is put together and looked at as a set, it can become a part of a narrative with a variety of meanings” – Media Studies researcher [respondent no. 73].

Figure 5. Narrative creation in DIVE+: exploration path searching for keyword ‘vluchteling’ (‘refugee’) [screenshot from Video 3 above].

7.5 Trust versus hidden agendas

On an institutional and cultural-historical level, prior media research has argued that the transmission and portrayal of any event is necessarily dependent on the attitude or demeanour of the broadcasting institution.44 The large-scale comparative research of the European Television History Network has demonstrated how “[n]o event is value-free and neither is its mediation or interpretation. Historically, and across cultures and borders, values change”.45 However, this seems problematic when investigating and generating narratives in an exploratory search tool such as DIVE+. This is the case because currently, despite the fact that exploratory search and the visualization of the search path in DIVE+ can support narrative creation, researchers do not grasp how the tool mediates an attitude or demeanour.
Based on our studies, we argue that trust in the search engine, browser and archive is usually based on prior experience. Prior experience regarding search and retrieval determines the user’s expectations, their skills (for example, in investigating signposts, such as the About page, for clues on the politics of archiving) and therefore their attitude towards retrieving dependable search results. As a respondent describes: “Even a database has a hidden agenda (...) Can I trust the algorithm?” – Media professional [respondent no. 3] (our emphasis). It is also relevant to note how in DIVE+, a search for Watersnoodramp leads to material that is dated before the time of the flood. This is interesting to the user, because it triggers curiosity about what the browser suggests. The results encountered through exploratory search are regarded as directionless in the sense that their usefulness depends on the researcher and the project. The direction and value of the results are thus heavily dependent on the way in which they are used. In relation to our previous point that researchers currently do not grasp how the tool mediates attitude or demeanour: when it is difficult to gauge where materials and entities come from, it is problematic for the user to assess the usefulness of the source. In addition, crossmedia audio-visual sources change daily, and hence the same search sometimes brings forward different results due to removal of data from the database. Suggested improvements for the DIVE+ browser are then specifically directed towards adding more transparency about entities and relations:

“There [in another search tool] the data-triples were shown, the entities, the relation between them, these were explicitly shown. And that already gave me more inspiration ... where does this relation stem from, you could find out very quickly, by directly clicking on it ... and it was revealed that the birth place is Ghent (...) It’s not immediately clear at a glance what the link is between the entities when you look at your search results” – Media professional [respondent no. 3] (our emphasis)

“The experience was OK, but the interface is very cluttered. There is too much visible on the screen” – Media professional [respondent no. 7] (our emphasis)

7.6 Need-to-haves or nice-to-haves?

Media professionals describe fine-grained selection functionalities as the ‘need-to-haves’, especially to easily refine search results beyond entity categories: clear, well-defined search fields and more filter options, including per medium, to make a distinction between text, audio and video in search results. Respondents argue that when such need-to-haves are lacking, the functionalities offered for exploring and linking are only ‘nice-to-haves’. Professionals especially request more direct insights into in-depth relationships, which are now deemed too shallow:

“You will quickly find relations (connections) based on general search terms, but unfortunately, I did not find the depth of the relation between Beatrix and woningnood [housing shortage]” – Media professional [respondent no. 4] (our emphasis)

44 Paddy Scannell, Radio, Television and Modern Life: A Phenomenological Approach, Blackwell Publishing, 1996.
45 Jonathan Bignell and Andreas Fickers, A European Television History, Wiley-Blackwell, 2008.
“Very broad results, it’s often unclear why something is shown. You see few other relations between the results except the keyword Jakarta. I had expected a concept such as the ‘independence act’” – Media professional [respondent no. 6] (our emphasis)

“The dataset in the background is missing critical mass to deliver sufficient results” – Media professional [respondent no. 8]

Professionals argue that their expressed need to give users more control over search filters stems directly from the fact that, in their professional practice, they are used to using search interfaces with many, many search fields. “The useful thing about many search fields is that you can focus very nicely on where you start and end in the definition of the field” – Media professional [respondent no. 3]. Prior experience, again, is thus a key factor impacting the interpretation and selection experience.

8 Analysis of Search Narratives: The Steep Learning Curve

8.1 Exploration routes and meta-structures as narratives

The search engines most often used for exploratory search by our respondents were DIVE+ and Google Explore, the Google Trends explore functionality. While DIVE+ is designed for working with audio-visual sources, the layout of Google Explore was deemed more user-friendly and easier to navigate by our users. The learning curve of using DIVE+ made it less attractive for use from the outset, compared to Google Explore. This was made especially clear in our studies by respondent commentary about the difficulty of assessing how connections between entities are established by the tool, as well as about the unclear depth of the relations between entities (see also the commentary above by respondent no. 4, who did not find the depth of a particular relation). Across all user groups, respondents expressed how the DIVE+ platform and exploratory search can help in guiding the user, and even aid in raising new research questions. Platform functionalities and affordances can help steer or guide the researcher and at the same time can push them to formulate new questions. First, exploratory search is considered by our users to demand narrow research questions. “The added value is that you can determine (...) what your topic is going to be about based on the available research data” – Media Studies researcher [respondent no. 54]. Second, exploratory search is regarded as iterative. For example, one respondent (Media Studies researcher, respondent no. 57) described the process of exploratory search in DIVE/DIVE+ as a constant revision of the research question based on the retrieved results. Here, a search narrative is defined as a route which indicates different phases. This underscores the learning curve of exploratory search, and the different phases of narrative creation for the researcher: narrative creation as an exploration route. The users’ responses show that narratives in general, and research narratives in particular, are not fixed entities but fluid. The attached meanings are ever-changing, based on the conditions in which discourses are encountered and constructed via individuals or events. It is noteworthy that both exploratory search and narratives are classified by respondents as non-fixed.
Narratives are seen as composed of other narratives, in the sense that texts are constructed from other texts:

“[Narrative is] a way of framing information and events, that makes certain elements strange and normalizes others, creating something like a story” – Media Studies researcher [respondent no. 65]

“During the process of collecting data, the narrative might change, for media researchers might find information that changes their research question and primary focus” – Media Studies researcher [respondent no. 59]

Importantly, users indicate here how the practice of telling narratives is, as we saw earlier for the practice of searching, based in prior experiences: narratives are shaped by prior experiences. Research itself was not considered a narrative by all respondents. Humanities researchers, especially, emphatically considered research not a narrative: “I believe that the narrative metaphor does not really apply to my research, because I do not produce sequential data, but rather a meta-structure, which cannot be told as a story” – Humanities researcher [respondent no. 38] (our emphasis). Media professionals were most critical of whether the DIVE+ search path resulted in a narrative:

“The list of narratives is very helpful, but does not really yield a story. More like a storage of the search process” – Media professional [respondent no. 2] (our emphasis)

“I mainly found general information and a further search for a relationship with an event did not offer a satisfactory outcome” – Media professional [respondent no. 4] (our emphasis)

Subsequently, visualization of more in-depth relationships is requested by users as an improvement to the exploratory search browser.

8.2 Media users as storytellers in control?

Professionals also found that not every click should be saved in the exploration path, which points not only to giving the user more control over search functionalities like filters, as discussed above, but also more control over the lucid narrative that is generated (in the form of the exploration or search path), which can be exported offline and saved on the researcher’s own desktop:

“Ideally this functionality [saving the search log] will not simply save my entire click history, but will retain only relevant results” – Media professional [respondent no. 7] (our emphasis)

“It would be more useful if DIVE[+] did not save everything itself, but only on the request of the user” – Media professional [respondent no. 2] (our emphasis)

Media researchers (scholars and professionals) are, in fact, storytellers. Our research outlines how researchers build narratives, and makes the role of the researcher and of digital search tools in the construction of narratives explicit. This highlights the interpretative aspects of research: research is always being interpreted in certain (social) contexts. Practices of search, research and retrieval, too, frame a certain version of reality through the construction of a narrative.

“The researcher is framing the narrative by choosing which sources to use and not to use” – Media Studies researcher [respondent no. 61] (our emphasis)

“Media researchers acquire information from multiple searches and piece this information together in order to find similarities, patterns, and discrepancies. These are then put together in a storytelling format” – Media Studies researcher [respondent no. 71] (our emphasis)
Such skills, on the one hand, seem to be something that people in modern societies are more and more used to, and are actively developing: in our current association society,46 individuals function as experienced information hunters and gatherers47 who collect information from different platforms or databases into logical narratives for themselves. On the other hand, our research also indicates how these skills, as well as awareness of how such skills contribute to understanding in both learning and doing research, can be better supported.48

8.3 Towards synthesis

Across all user groups, user explorations underline the difficulty for users of creating narratives about media events, due to the fact that there is a learning curve when it comes to understanding how to inspect collections for metadata, how to compare collections, and even how to explore collections. The features and interface of DIVE+, especially, present a steep learning curve. Each of the tools in the Media Suite supports users in a particular way, but it is a challenge for users to synthesize found source materials into an overarching narrative. The ideal place for this synthesis would be the Media Suite’s workspace functionality, where a user can create a workspace for a particular (shared) project and collect and inspect bookmarked materials.

9 Reflection: The Researcher in a Split Position

In this study, we have argued how narrative creation occurs in the encounter and interaction of digital search apparatuses’ attitudes with those of the researcher. We have also pointed out differences between research fields in terms of prior skills in search and retrieval, and the expectations regarding search and retrieval that arise during the research journey. As we learned in our study, researchers themselves can also be made more aware of how, through their own search and research practices, they build narratives around events, and how this impacts the meaning-making process. Offering researchers the ability to explore and create lucid narratives about media events, including bringing relevant (multi-media and multi-platform) AV sources to their attention, therefore greatly supports their interpretative work. We argue that this is especially prevalent in the first exploratory search stage of typical media and humanities research.49 Exploratory search is crucial for researchers who draw upon media materials in their research, because audio-visual, online and digital sources are in abundance, scattered across different platforms, and change daily in the contemporary landscape. Supporting researchers’ explorations becomes even more important when scholars study disruptive media events via audio-visual sources, due to the complexity of the narrative and the audio-visual text’s representation – including re-presentation in digital heritage and memory institutions.

46 Marcel Broersma, ‘De associatie maatschappij: journalistiek stijl en de onthechte nieuwsconsument,’ inaugural lecture, Chair Journalistic Culture and Media, 17 March 2009.
47 Henry Jenkins, ‘Confronting the Challenges of a Participatory Culture (Part Six),’ Confessions of an Aca-Fan: The Official Weblog of Henry Jenkins, 26 October 2006.
48 We have therefore, based on our studies, improved the DIVE+ browser with support for audiovisual annotation (also video or media annotation), especially the option for users to manually add annotations to and in-between their exploratory search path(s).
disruptive media events via audio-visual sources, due to the complexity of the narrative and of the audio-visual text's representation – including its re-presentation in digital heritage and memory institutions.

49 Marc Bron, Jasmijn van Gorp and Maarten de Rijke, 'Media Studies Research in the Data-Driven Age: How Research Questions Evolve,' Journal of the Association for Information Science and Technology, 67, 7, 2015, 1535-1554; Chiel van den Akker, Susan Legêne, Marieke van Erp, Lora Aroyo, Roxane Segers, Lourens van der Meij, Jacco van Ossenbruggen, Guus Schreiber, Bob Wielinga, Johan Oomen and Geertje Jacobs, 'Digital Hermeneutics: Agora and the Online Understanding of Cultural Heritage Categories and Subject Descriptors,' WebSci 11, Koblenz, Germany, 2011, http://dl.acm.org/citation.cfm?id=2527039.

In today's association society, we thus find the media researcher in what could be described as a split position. On the one hand, there are important new opportunities and new types of questions that can be asked, which encourages the (re)use of television archives and European audio-visual heritage, promoting engagement with cultural memory on national and international levels. The increased access and more direct availability of high-quality material, with connected metadata and contextualization, make and keep AV material valuable for research. Digital tools offer significant research opportunities to identify useful data faster and over a longer research period. This also includes important multimedia and crossmedia perspectives, such as searching and linking various data sets from different collections via a single entry point.

On the other hand, there are also new challenges and new types of questions that should be asked. One challenge is that practices of crossmedia and transmedia storytelling – for instance television programme websites and social media platforms with relevant contextual information – are highly susceptible to change. Often there is no structural archiving of such contextual information, whether online (web archiving) or in printed and digital production documentation, that would preserve a complete memory of production. Furthermore, media literacy remains a considerable issue for the skill sets of both digital natives and non-digital natives. New critical questions to be asked concern the so-called politics of archiving. Audio-visual sources represent a construction and selection of our reality, and their (un)availability in a database is again a selection, with curators adding a further interpretative layer. In short, in the digital age, more people are part of the selection processes of the media representations we reuse and encounter as researchers.

Exploratory search can support researchers' explorations of difficult-to-interpret disruptive media events, potentially offering serendipitous browsing and discovery of event narratives, helping users to better assess the quality of sources. However, this serendipitous browsing needs to be anchored to the situated search practice of the researcher – thus creating a tool that affords both exploration and anchoring of narratives. Such opportunities of linked open data do, however, require a shift in search cultures.
It is therefore relevant to deconstruct how exploratory search and digital tools afford narrative creation, giving insight into the constructed quality and the key perspectives that define the course and framing of mediated events – and into how they shape narratives through technological affordances and constraints. Creating narratives whilst exploring adds understanding and creative insights to research and learning through audio-visual materials. This process also highlights the constructed nature of narratives in general, making users aware of their own storytelling practices. Based on prior experiences, respondents often expected to find exactly what they were looking for, but this is not what exploratory search offers: users thus had to open themselves up to new search learning curves and to expectation management. Across all user groups, exploratory search was understood as a kind of loose concept, or as research without a direction. If searching – especially in the early phases of research – produced unexpected results, it could already be regarded as exploratory, and successfully serendipitous. Moreover, the respondents stressed the importance of a 'refine step' in the research journey, when both research questions and search queries are revisited, repeated and revised. Opportunities of linked open data, then, seem to require a shift in search attitude or even in search cultures. Moreover, as results are interlinked across data types, platforms and databases, free association is supported by exploratory search. Links which redirect users from one source to another were also often associated with exploratory search, and with its functionality of making interconnections more visible. Once users were able to recognize the value of the lack of directionality and of the meandering through AV material that exploratory search offers, they loosened their expectations of finding what they wanted to find, and instead started to focus on the value of what they happened to find whilst roaming the archive – allowing for unexpected insights into topics.

We have used different methods to gain insight into users' search behaviour, contributing to an understanding of users' "non-purposive information practices"50 as well as to the development of digital tools. Reflecting on tool usage with researchers grounds the research in the professional, daily practice of the end user, and strives to embrace the complexity of Digital Humanities projects: balancing Humanities and Computer Science concerns.51 Digitization has changed the work practices of media scholars and media professionals, who in their research practices increasingly use digital archives to create media texts. This means that retrieving audio-visual material requires an in-depth knowledge of how to find sources digitally. Our studies show that it is in interaction that we can perhaps learn most, and most effectively, about this.

Acknowledgments

This research was supported by the Netherlands Institute for Sound and Vision in the context of Berber Hagedoorn as Sound and Vision Researcher in Residence in 2016-2017, and by the Netherlands Organisation for Scientific Research (NWO) under project number CI-14-25 as part of the MediaNow project. This research was also made possible by the CLARIAH-CORE project financed by NWO, with the Research Pilot Narrativizing Disruption.
The authors would like to thank the anonymous reviewers for their helpful comments and suggestions, and Hanne Stegeman for her research assistance during data categorization.

Biographies

Berber Hagedoorn (b.hagedoorn@rug.nl) is Assistant Professor of Media Studies at the University of Groningen. Her research interests revolve around audiovisual culture, creative reuse and storytelling across screens. She received the 2018 Europeana Research Grant award for digital humanities research into Europe's cultural heritage. Hagedoorn is the Vice-Chair of ECREA's Television Studies section (European Communication Research and Education Association) and organizes cooperation for European research and education into television's history and its future as a multi-platform storytelling practice. She has extensive experience in Media and Culture Studies and Digital Humanities through large-scale European and Dutch best-practice projects on digital heritage and cultural memory representation, including Europeana, VideoActive, EUscreen and CLARIAH. Hagedoorn has published in, amongst others, Continuum, Journal for Media History/Tijdschrift voor Mediageschiedenis, and Media and Communication; see also https://berberhagedoorn.wordpress.com.

Sabrina Sauer (s.c.sauer@rug.nl) is Assistant Professor of Media Studies at the University of Groningen, Research Centre for Media and Journalism Studies. She has a background in Media Studies and Science and Technology Studies, and trained as an actor prior to writing her dissertation about user-technology improvisations as a source for ICT innovation. Her current research focuses on data-driven creative processes, the agency of users and technological artefacts, exploratory search and algorithm development, and serendipity. Apart from that, she is keenly interested in Digital Humanities and in questions around digital materiality. Sauer has published in, amongst others, the Journal of Science and Technology of the Arts.

50 Edin Tabak, Information Cosmopolitics: An Actor-Network Theory Approach to Information Practices, Chandos Publishing, 2015.
51 Edin Tabak, 'A Hybrid Model for Managing DH Projects,' DH Quarterly, 11, 1, 2017, http://www.digitalhumanities.org/dhq/vol/11/1/000284/000284.html.

VIEW Journal of European Television History and Culture Vol. 7, 14, 2018
DOI: 10.18146/2213-0969.2018.jethc159
Publisher: Netherlands Institute for Sound and Vision in collaboration with Utrecht University, University of Luxembourg and Royal Holloway University of London.
Copyright: The text of this article has been published under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Netherlands License. This license does not apply to the media referenced in the article, which is subject to the individual rights owner's terms.

work_hfzicxtgw5bjdlonq4stxl2atu ---- Umanistica Digitale - ISSN:2532-8816 - n.4, 2019
Data Sharing, Holocaust Documentation and the Digital Humanities: Introducing the European Holocaust Research Infrastructure (EHRI)

DOI: http://doi.org/10.6092/issn.2532-8816/9036

Veerle Vanden Daelen
Kazerne Dossin, Memorial, Museum and Documentation Centre on Holocaust and Human Rights, Mechelen, Belgium
veerle.vandendaelen@kazernedossin.eu

Abstract. The European Holocaust Research Infrastructure (EHRI) started its work in October 2010 with financial support from the European Union. The project, which is currently in its second funding phase, continues to develop according to its mission: to support the Holocaust research community by building a digital infrastructure, facilitating human networks, and helping to connect Holocaust researchers and archives. EHRI provides online access to information about dispersed sources relating to the Holocaust through its Online Portal. Tools and methods are developed that enable researchers and archivists to work collaboratively with such sources and to explore new methodologies within the digital humanities. This contribution seeks to present the resources and services EHRI has to offer to the research community, with a special emphasis on the EHRI Portal.

The European Holocaust Research Infrastructure (EHRI) is a project launched in 2010 thanks to the support of the European Union. Among the project's objectives are the creation of a digital infrastructure to support the community of Holocaust scholars and the strengthening of networking between researchers and preservation institutions. Within EHRI, methods and working tools are developed with the aim of fostering collaborative work between researchers and archivists, and of exploring new methodologies for the Digital Humanities. Through its web portal (Online Portal), EHRI provides access to information on archival resources for the history of the Holocaust dispersed across numerous European and international archives. The services and resources available through the EHRI web portal are the main subject of this paper.

Introduction

The European Holocaust Research Infrastructure (EHRI) started its work in October 2010 with initial financial support from the European Union Seventh Framework Programme for four years. Thanks to continued EU support – EHRI is currently (2015-2019) a Horizon2020 EU-financed project with a total budget of almost eight million Euros – the project keeps on developing. The consortium in EHRI's second phase under H2020 consists of 24 partner institutions from 17 different countries and includes research institutions, libraries, archives, museums, memorial sites and e-science specialists.1 Apart from this core working group, EHRI equally relies on the support of many other individuals and organisations in the broad fields of Holocaust studies and digital humanities. EHRI is devoted to building a Holocaust research infrastructure that is sustained by its network and will have a reason to exist in its own right. The mission and main objective of EHRI is to support the Holocaust research community by building a digital infrastructure and facilitating human networks, and by helping to connect Holocaust researchers and archives.
EHRI provides online access to information about dispersed sources relating to the Holocaust through its Online Portal. Tools and methods are developed that enable researchers and archivists to work collaboratively with such sources and to explore new methodologies within the digital humanities. Apart from providing an online platform, EHRI also facilitates an extensive network of researchers, archivists and others, to increase cohesion and co-ordination among practitioners and to initiate new transnational and collaborative approaches to the study of the Holocaust. EHRI thereby seeks to overcome one of the hallmark challenges of Holocaust research: the wide dispersal of the archival source material across Europe and beyond (because of the geographical scope of the Holocaust, attempts to destroy the evidence, the migration of Holocaust survivors, etc.) and the concomitant fragmentation of Holocaust historiography into a multiplicity of documentation projects. By bringing together experts from different fields, and by building an innovative digital infrastructure supported by a large community, EHRI is a flagship project that showcases the opportunities for historical research in the digital age. With this presentation at the EHRI workshop "Data Sharing, Holocaust Documentation, Digital Humanities: Best Practices, Case Studies, Benefits", we would like to present the resources and services EHRI has to offer to the research community, with a special emphasis on the EHRI Portal.

1 NIOD, Institute for War, Holocaust and Genocide Studies (Amsterdam); Yad Vashem (Jerusalem); National Archives Belgium/CEGESOMA (Brussels); King's College (London); Institute for Contemporary History (Munich); Jewish Museum in Prague; DANS (Den Haag); Wiener Library (London); Vienna Wiesenthal Institute for Holocaust Studies; Jewish Historical Institute, ŻIH (Warsaw); Mémorial de la Shoah (Paris); International Tracing Service (Arolsen); United States Holocaust Memorial Museum, USHMM (Washington D.C.); Bundesarchiv (Berlin/Koblenz); Elie Wiesel National Institute for the Study of the Holocaust in Romania (Bucharest); Hungarian Jewish Archives (Budapest); Vilna Gaon State Jewish Museum (Vilnius); Dokumentačné stredisko holokaustu (Bratislava); Contemporary Jewish Documentation Center Foundation CDEC (Milan); The Jewish Museum of Greece (Athens); Ontotext (Sofia); INRIA (Le Chesnay); Stowarzyszenie Centrum Badań nad Zagładą Żydów (Warsaw); Kazerne Dossin: Memorial, Museum and Documentation Centre on Holocaust and Human Rights (Mechelen).

EHRI resources and training include: an Online Portal with information on Holocaust-related archival material held in institutions across Europe and beyond; Online Training in Holocaust Studies; Seminars and Workshops; a Fellowship Programme; Conferences; an Online Document Blog; Online Research Guides; and Tools and Methods for Digital History.

EHRI Fellowships, Online Courses, Research Guides, Document Blog & Workshop

The EHRI Fellowships support and stimulate Holocaust research by facilitating international access to key archives and collections as well as to archival and digital humanities knowhow. The fellowships are intended to support researchers, archivists, curators, digital humanists, and younger scholars (for information on past fellows and open calls, see https://ehri-project.eu/ehri-fellowship-call-2016-2018).
The EHRI online courses also address researchers, the general public and data managers/archivists (http://training.ehri-project.eu). There is, on the one hand, an unguided online course with six units from EHRI's first phase (https://training.ehri-project.eu/); an interactive tutored online course with six lessons is now also under development (https://ehri-project.eu/interactive-ehri-online-course-holocaust-studies), alongside a Bundesarchiv-written course on German Archivistics (Aktenkunde). Whereas in EHRI's first phase two Research Guides were published online (https://portal.ehri-project.eu/guides; e.g. on Theresienstadt, https://portal.ehri-project.eu/guides/terezin), EHRI is now exploring the options of its relatively new EHRI Document Blog (https://blog.ehri-project.eu/). The EHRI Document Blog provides a space to share ideas about Holocaust-related archival documents; it offers an innovative platform for the presentation, visualization, contextualization and interpretation of data and metadata, using digital tools. EHRI furthermore reaches out and explores new methodologies via workshops and methodological seminars. These include, for example, a seminar for conservationists working on Holocaust-related materials and workshops on specific topics.

The EHRI Portal

The EHRI Portal offers information on 57 countries, descriptions of 1,939 archival institutions across 51 countries, and 231,888 archival descriptions in 479 institutions (https://portal.ehri-project.eu/, July 17, 2017). The data in the portal are structured in a top-down fashion: from countries, with country reports on the history, the archival situation and the status of EHRI's research on the country; to an inventory of institutions which preserve Holocaust-relevant sources within these countries; to top-level collection descriptions (be they record groups, fonds, subfonds, collections or any other way in which the institution describing the sources structures them). EHRI's goal is to provide information on the archives, not to provide digital representations of all the archival materials. EHRI focuses on collection descriptions and is not
However, the contextualization, the merging of information on the sources across this many different countries and institutions is a tremendous help for researchers to identify their sources and it also allows the institutions preserving these sources to communicate to the international research community which sources they are holding, as quite often sources ended up in unexpected places. As such, EHRI is making sources visible in a systematic fashion in order to counteract the fragmentation of the sources. The project reveals interconnections (e.g. through a multilingual thesaurus with approx. 5470 terms; collation of authority files; relationships between originals and copies). The goal is to keep expand and enriching the online inventory of institutions and collections pertaining to the Holocaust in Europe, Israel and beyond, and to connect archives and users. The contextualization of the sources they preserve is indeed useful for the archives, as well as receiving potential expert user feedback. As such, the project is mutually useful for both researchers and archives, and the positive news on this ever-growing portal is that it has attracted in a relatively short time after its launch a high number of unique users, who make frequent use of this resource. Integration of Metadata into the EHRI Portal In EHRI’s first phase, the metadata in the portal were either bulk-imported by EHRI-IT or manually added by historians within the project. However, all data entry remained non- synchronized as even the bulk imports where one-time only harvests or imports. In EHRI’s second phase, the key factor of attention is on ensuring sustainable, meaning updatable, connections between the metadata providers and the project’s portal. Multiple Scenarios for Metadata Integration The integration of metadata into the EHRI portal can be done in various ways and can involve interaction with historians, archivists and IT/digital humanists within the project (as identified in the figures beneath). Typically, the historians and archivists (Work Package 9 - WP9 in the figure) will indicate which archives contain Holocaust-relevant collections. They will verify whether or not the institution already has descriptions of the sources or not. In the latter case, EHRI may decide to write collection descriptions itself or hire a local expert to do so. In the first case, with descriptions available, the first follow-up question is whether or not the 4 V. Vanden Daelen – Data Sharing, Holocaust Documentation and the Digital Humanities: Introducing the European Holocaust Research Infrastructure (EHRI) descriptions are digital and if so, in which format and in an exportable way or not. Samples of exports are provided to the EHRI IT of Work Package 10 (WP10) so that EHRI can assess which pathway would be possible to ingest information on the institution in case into the EHRI portal (hence the reference to 10 in the figure). Figure 1: Workflow between WP9 and WP10 regarding data imports into the EHRI infrastructure (2) (EHRI, D9.4 Resource Reports, Update April 2017). WP10 consequently verifies the sample export to evaluate whether or not the export is valid Encoded Archival Description or EAD and whether or not the institution has a Protocol for Metadata Harvesting (PMH) Endpoint. 
When both these questions get affirmative answers, establishing a connection between the institution and the EHRI project is a fairly straightforward endeavor, which only then entails the signing of a content-provider agreement (CPA) to ensure a sustainable connection to the EHRI project. Figure 2: Workflow between WP9 and WP10 regarding data imports into the EHRI infrastructure (2) (EHRI, D9.4 Resource Reports, Update April 2017) 5 Umanistica Digitale - ISSN:2532-8816 - n.4, 2019 The EHRI EAD Mapping Tool and the EAD Publishing Tool For those who send in sample data that did not provide valid EAD, the EHRI project has developed an EAD Mapping Tool. The EHRI Mapping Tool allows for the mapping of local metadata fields to the international EAD standard. The tool can be installed by the institution itself or by the EHRI IT and will take the metadata of the institution, map them to valid EAD and convert consequently all metadata passing through this mapping tool. If they consequently have OAI-PMH (Open Archives Initiative – Protocol for Metadata Harvesting), they are then ready to share their data with the EHRI project, as seen in the figure below for Collection Holding Institution A. In case there is no OAI-PMH for EHRI to harvest the metadata, EHRI can assist further by the installation of an EAD publishing tool (Resource Sync). The EHRI Metadata Publishing Tool has been created to help archives to publish their metadata in a sustainable way (allowing for semi-automatic updates of descriptions in the EHRI Portal). Here as well, the institution can install the tool itself or with the help of EHRI IT. As soon as the program is installed and the metadata are being stored on a for EHRI accessible place on the institution’s website, the sustainable connection to the project is a fact, as shown in the example of Collection Holding Institution B. Those able to provide valid EAD but without OAI-PMH can also install the EAD publishing tool in order to provide the EHRI project with their metadata in a sustainable way, as seen in the figure below for collection-holding institution C. Figure 3: EHRI Data Infrastructure (EHRI D10.1 and D10.2 Collection description publishing services) 6 V. Vanden Daelen – Data Sharing, Holocaust Documentation and the Digital Humanities: Introducing the European Holocaust Research Infrastructure (EHRI) EHRI Manual Data Entry and Follow-up In any case, the institutions preserving the documentation concerning the Holocaust are center stage in EHRI. It is not only those who have all the necessary knowhow and IT-tools available that are able to connect to the project. Also those institutions that are not yet having digital metadata or metadata in a format which would not be compatible with the use of the above- mentioned tools, are invited to share their metadata in the portal and make their metadata more openly available online. As already mentioned, where appropriate, archives can also be covered by manual surveying and manual data entry, either by EHRI staff or local experts, or by the collection-holding institution itself which in that case receives direct access to its own institution description in the EHRI portal and can – from there – add collection descriptions and child descriptions to its repository description. The screen shots below give an idea of how repositories, collections and child item descriptions can be created and updated within the EHRI portal. 
An extra aid behind the scenes is that every field from the ICA standards, which form the basis for the forms, is explained when one clicks on the field itself. Moreover, all metadata entered in the portal can be exported in valid EAD to the respective institutions, and as such EHRI opens possibilities for further use of the data beyond the project itself.

Figure 4: Screen shots from the EHRI portal admin site

Open Source and Data Sharing Beyond the Project

Because EHRI wishes for the metadata to be published not only in its own portal, but equally on the portal or website of the collection-holding institution or on other projects' websites, EHRI has developed tools that allow for this: the mapping and publishing tools work equally well for institutions publishing on their own websites as for sharing their data with other projects. The same is true for those institutions whose data are manually added to the portal: EHRI can export the data back to the institutions. With some assistance from a web designer, or a basic explanation of how to create a website yourself, the institution can further communicate about the data via its own channels. Those without a website can opt for a minimum scenario by integrating a link to their repository and its holdings in the EHRI portal into their email signature and spreading the news that way. EHRI furthermore provides tutorials and a helpdesk for each of the explained pathways for bringing metadata into the EHRI portal. So, altogether, the EHRI portal and the open-source EHRI tools help archives not only to join the EHRI project, but equally to publish their own data themselves and to exchange data with other archives, memorials, projects and portals.

Figure 5: ADEMP (internal EHRI figure by Mike Priddy)

To stay informed about EHRI's activities and products, there are multiple options: the EHRI project website, which includes links to all above-named products (https://www.ehri-project.eu), the EHRI Facebook page (https://www.facebook.com/EHRIproject/), the EHRI newsletter (https://ehri-project.eu/ehri-newsletter) and the possibility to follow EHRI on Twitter (@EHRIproject).

Acknowledgments

Projects like EHRI are a group effort. The author would like to thank all her colleagues who contribute or have contributed to the EHRI project.

work_hjazlkhc3zfnjfhz6iakdscd7e ---- Mapping Meaning: learnings from indigenous mapping technology for Australia's digital humanities mapping infrastructure

Bill Pascoe, 2020

I acknowledge the Algonquin Anishnaabeg People and Nation where this conference is hosted, and Awabakal and Worimi people, land and waters where I'm writing. I pay my respects to Elders past, present and emerging.
Introduction

The Time Layered Cultural Map (TLCMap)1 digital humanities mapping infrastructure is for everyone, but the inspiration, conception and development of it has always had Aboriginal and Torres Strait Islander mapping at its heart. If Australian culture is world famous for anything, it is the world's oldest living culture, a culture for which connection to country is of vital importance. Many years ago, when a simple desire took shape to make it possible for people to add cultural layers to maps that other people could find, it was unthinkable without first considering Aboriginal and Torres Strait Islander culture and mapping technology. Indigenous views on country and its representation have factored into the software architecture and vision from the beginning. The transformational effect that the Colonial Frontier Massacres project has had on Australian culture was a catalyst, sparking recognition of the important role digital humanities maps can play in the lives of Australians, and it played a role in the truth-telling process of reconciliation. Five of the main projects in TLCMap are focused on Aboriginal and Torres Strait Islander culture, and they both acknowledge history and celebrate living culture. These projects come to TLCMap already as collaborations with Aboriginal and Torres Strait Islander people, and indigenous Australians are employed in TLCMap software development and research.

Apology

The pandemic disruption has delayed some things anticipated to have been complete by now, and it wasn't until the last week that I was sure I could contribute to DH2020, and so was not able to update the abstract by the deadline. This paper may differ slightly from the abstract.

Maps and Translation

Because this is an international audience I will make some points with reference to Anmatyerre artist Clifford Possum Tjapaltjarri's Warlugulong,2 a seminal work of writing/art/mapping in the internationally recognised style of western desert 'dot painting'. There are many art styles and story and song genres, traditional and contemporary, across more than 200 Aboriginal and Torres Strait Islander languages and peoples in Australia, but this one illustrates many points well. Please note that there is a controversial history in Aboriginal and Torres Strait Islander art over misappropriation, theft, secret knowledge, exploitation and intellectual property. This particular piece was made specifically for public viewing.

Clifford Possum Tjapaltjarri's Warlugulong demonstrates how indigenous ontologies and ethics can be translated across cultures. This work is a landmark masterpiece in Australian and indigenous art. It is both traditional, in using traditional symbolic systems to represent Tjukurrpa, and contemporary, in using the western convention of oil on rectangular canvas, the use of the dot technique, and its 'abstract' aesthetic. It is also a map and a text, with elements of nine Tjukurrpa relating to places and navigation that can be read, if you learn how to read.

1 TLCMap, http://tlcmap.org, is a mapping platform of interoperable digital humanities mapping systems, with development on new and existing systems, initiated through an ARC grant, Project ID: LE190100019.
2 Tjapaltjarri, Clifford Possum Warlugulong 1977, (Anmatyerre) National Gallery of Australia https://artsearch.nga.gov.au/detail.cfm?IRN=167409
Some of the things we can learn about how to do mapping, especially 'deep mapping', from indigenous mapping technology, through this include:

• Country is an organising principle for navigating knowledge.
• A map can exist in many media, not just a 2D grid of longitude and latitude. It can exist as a story, song, dance, painting, etc.
• The meaning of a place is across many layers and through its connection to other places in an intersecting mesh.
• Mapping is personal and social. Each place and part of a story is the responsibility of an individual. If anyone wants to hear the whole story they must travel to see that person and learn from them. Being connected to a place, understanding, and holding its story, means you are important to the longevity of culture, of the meaning of that place. When we look at a map we see our part in a greater whole and where we stand in relation to the world. Understanding the stories associated with places enhances our personal connection to where we grew up and where we live and work. Broadly defined, culture is shared experience. By learning the meaning of the places we inhabit we are connected to our past and our future, and to generations before and to come.
• Having learned the meaning of a place through a map (painting, sand, words, etc), the meaning of that place is evident when next we see the land and water features, or the buildings. A map is a tool for teaching us 'how to read country'. The land or place itself then means the lessons of the story. More broadly, places tell the story of our history, and being in them, and remembering being in them, is a mnemonic for that history. This is identity forming – how we came to be who we are where we are.
• There is much more that could be discussed – many more complex details in indigenous mapping technology, and diversity across the continent, such as the use of what is called 'redundancy' in information theory, polysemy, rhetoric, mnemonics, and relationships to seasons, land management and law – but there isn't space here.

What I can learn by analogy from Clifford Possum Tjapaltjarri's map is far from the experience of Anmatyerre and Warlpiri people living and growing up within their own culture. What I mean to do here is illustrate an act of 'translation'. In any translation something is lost and something is added. There is no one-to-one 'mapping' of meaning across cultures or individuals. Aileen Moreton-Robinson describes this as 'incommensurable' and an ongoing process:

"This must be theorised in a way which allows for incommensurable difference between the situatedness of the Indigenous people in a colonizing settler society such as Australia and those who have come here. Indigenous and non-Indigenous peoples are situated in relation to (post)colonization in radically different ways - ways that cannot be made into sameness. There may well be spaces in Australia that could be described as postcolonial but these are not spaces inhabited by indigenous people. It may be more useful, therefore, to conceptualise the current condition not as postcolonial but as postcolonizing with the associations of ongoing process which that implies."3

Incommensurability doesn't mean there is no understanding. Language exists both because we don't understand each other and because we can understand each other better. This works initially by relating (mapping) new things to things already within our ken, and as we proceed our ken adjusts and changes.
The Warlugulong painting is an illustration of how indigenous knowledges can be translated into something non-indigenous people can begin to understand. It has contributed greatly towards international recognition of and respect for the sophistication of Aboriginal and Torres Strait Islander culture. Gary Foley points out that (paraphrasing) if you want to help indigenous people, teach yourself, then teach your own people.4 It is a sad irony that in a country famous for an indigenous culture in which the connection to and meaning of country is of central importance, most of us living here don't know much at all about the places we live. Raewyn Connell suggests a way to avoid the history of objectification in academic indigenous study is to 'learn from' instead of 'learn about'.5 If we are to do that, learnings such as those from Tjapaltjarri, Foley, Moreton-Robinson and many others must be built into TLCMap system architecture. It is incumbent on me then, as system architect, to take the responsibility to 'learn from' to heart. TLCMap, as a national digital humanities mapping infrastructure, has a role to play in enabling people to teach people to read the meaning of places in Australia.

Decolonising Software Development

As Aileen Moreton-Robinson points out, 'postcolonising' or 'decolonising' is a process. It's not as if we will ever arrive at an 'uncolonised' state, since the future cannot be disentangled from the past. TLCMap involves a complicated mix of activity. 'Decolonising' is a term often used in terms of archives and collections, but TLCMap is more about 'agency'. It is a platform enabling people to do research in spatiotemporal humanities that may produce archives, and to work with the meaning of place – it isn't a map but ways to do mapping. We have to consider what agency is involved, who designs that agency, with what assumptions, who has that agency and who can be affected by it. We could look at three ways in which decolonisation might occur in TLCMap software systems:

• Content – where indigenous 'content' is put into existing 'colonial' IT systems.
• Bricolage – where existing systems are turned to other purposes.
• From the first – where the needs or world view/concepts/metaphors etc. of indigenous people drive technological development from the beginning, without limiting possibility to already existing capabilities.

In practice these abstractions aren't mutually exclusive, and most situations involve something of all these approaches.

3 p30, Moreton-Robinson, A. (2003). 'I Still Call Australia Home: Indigenous Belonging and Place in a White Postcolonising Society'. In Ahmed, S., Castañeda, C., Fortier, A. & Sheller, M. (eds) Uprootings/Regroundings: Questions of Home and Migration Oxford & New York: Berg
4 Foley, Gary 'Advice for white Indigenous activists in Australia' and Foley, Gary 'Educate YOURSELF, then educate the people'
5 Connell, Raewyn Southern Theory: the global dynamics of knowledge in social science Crows Nest: Allen & Unwin, 2007
The naming of places is itself an exercise of power, in stating what exists, and by omission, what does not, and in what language places are named. We provide a means for users to contribute place names. This is more at the ‘content’ end of the spectrum. None the less, this simple addition of functionality means anyone has an opportunity to intervene in the ‘authoritative’ government naming of places, including indigenous people, or researchers in consultation with indigenous people. Awareness of something being there can do something to counteract cultural blindness which factors into government and commercial decisions over land and water use. Other unexpected uses also arise, where we turn the gazetteer to various other ends simply because it is ready to hand. There are some indigenous place names with meanings that have become uncertain in places colonised for a long time. The quick and easy availability of the GHAP means we can quickly obtain maps and information that can help inquire into the meaning of the prefix ‘Coo’ in many south east Queensland placenames. Search results can be exported in open interoperable formats for visualisation, analysis and layering. As research contributions are made, the GHAP will be an increasingly valuable resource for people not only looking for a specific place, but simply wondering “What’s here?” to learn about both indigenous and non-indigenous history and meaning of place. Problems On The Way Unfortunately the pandemic has put us months behind in some cases, which is significant in a 1 year undertaking. The lock down has meant that trips to country that were to be a crucial part of the Ngadjeri Heritage project were cancelled, for example. One common difficulty highlighted by researchers in a recent discussion was the need to re-consult to obtain permission to do new things with information provided earlier such as to putting it on the web. TLCMap Projects TLCMap is an infrastructure or platform of interoperable tools and it involves a suite of projects to drive requirements and development and to demonstrate usefulness. The following are projects that have a specific focus on Aboriginal and Torres Strait Islander culture and history. At the InASA 2018: Unsettling Australia conference, Waanyi woman Josephine Davey, with her companions Ostiane Massiani and Kate Van Wezel movingly expressed her disappointment at the majority of the papers being about the history of violence towards Aboriginal and Torres Strait Islander country and culture which created the impression it was inevitable the same thing happen in her country.6 By contrast she was present to speak about how a ranger program was helping people travel great distances to access important traditional sites. This is a critique echoed in Walter & Suina’s critique of deficit based quantitative indigenous research, and elsewhere.7 The following TLCMap projects from across Australia include both history and traditional knowledge, and both acknowledge the bad and celebrate the good. 
6 Davey, Josephine (Waanyi) InASA 2018: Unsettling Australia Conference 3/12/2018 - 5/12/2018 7 Walter, Maggie & Suina, Michele (2019) Indigenous data, indigenous methodologies and indigenous data sovereignty, International Journal of Social Research Methodology, 22:3, 233-243, DOI: 10.1080/13645579.2018.1531228 Mapping Meaning 5 Colonial Frontier Massacres8 Contact: Dr Bill Pascoe, Prof Lyndall Ryan https://c21ch.newcastle.edu.au/colonialmassacres/ This project maps colonial frontier massacres in Australia from 1780 to 1930. Ngadjuri Heritage Mapping Contact: Dr Julie Nichols, Prof Ning Gu This project is a collaboration between Ngadjuri people, particularly Quenten Agius, and University of South Australia staff, particularly Prof Ning Gu and Dr Julie Nichols. This project aims to improve best practice for digital mapping of indigenous heritage including virtual reality, panoramas and 3D architectural modelling. Journey Ways Contact: Dr Francesca Robinson, Prof Paul Arthur This project is a collaboration with Dr Noel Nannup (Nyoongar), Prof Paul Arthur and Dr Francesca Robinson, in consultation with Aboriginal people across WA. It is based on research that went into the ‘Great Journeys’ booklet9 and making this available in digital form. It describes the Aboriginal perspectives and stories that relate to major roads across Western Australia, which often follow traditional routes, and which have become further storied with historical use. It delves also into deep time, showing how stories relate to events of thousands of years ago according to geological time. NSW Aborigines Protection/Welfare Board 1883-1969: A History Contact: Prof Victoria Haskins and Prof John Maynard https://www.newcastle.edu.au/research-and-innovation/centre/purai/history-of-nsw-aborigines- protectionwelfare-board-1883-1969 This project provides a web interface and map into a research collection of Aboriginal Protection/Welfare Board sites in NSW and interviews with and photographs of people about their personal experiences with them. This project is lead by indigenous academic/s, employs indigenous research assistants, and presents Aboriginal perspectives. Aboriginal historians on this project are Prof John Maynard (UON), Dr Lawrence Bamblett (ANU), Dr Lorina Barker (UNE), Dr Ray Kelly (UoN) and Prof Jaky Troy (USyd) and indigenous PhD student, Ms Ashlen Francisco. Endangered Languages Map Data Contact: AProf Mark Harvey This project aims to consolidate and archive an overview of information about indigenous languages, particularly endangered languages in Australia in a way that can be accessed by others. Care has been taken to ensure that only information that can be made public is included in the open archive. This is part of long term work with speakers of endangered languages. OzSpace Contact: AProf Bill Palmer This linguistics project looks at how spatial relations and orientation is conceptualised and spoken about in Australian indigenous languages. It has two main parts, one is a database of languages with information describing the spatial and orientation features of languages, providing an overview. The other is visualisation tools, in particular tools that attempt to illustrate how space and orientation works in that language. 
8 Ryan, Lyndall et al Colonial Frontier Massacres v3.0 C21CH, University of Newcastle, Australia, 2020 https://c21ch.newcastle.edu.au/colonialmassacres/
9 Robertson, Francesca; Nannup, Noel; Barrow, Jason Great Journeys Undertaken By Aboriginal People In Ancient Times in Western Australia Batchelor: Batchelor Institute

Bibliography

Ara Irititja Ara Irititja Aboriginal Corporation (AIAC), 2019 https://www.irititja.com/
Arthur, Bill and Morphy, Frances Macquarie Atlas of Indigenous Australia Sydney: Macquarie Dictionary Publishers, 2019
Aveling, Nado (2013) 'Don't talk about what you don't know': on (not) conducting research with/in Indigenous contexts, Critical Studies in Education, 54:2, 203-214, DOI: 10.1080/17508487.2012.724021
Bardon, Geoffrey and Bardon, James Papunya: A Place Made After The Story, The Beginnings of the Western Desert Painting Movement Carlton: Miegunyah Press, 2007
Barwick, Linda; Marett, Allan; Blythe, Joe; Walsh, Michael Arriving, Digging, Performing, Returning: An Exercise in Rich Interpretation of a djanba Song Text in Moyle, R. M. (Ed.), Oceanic Encounters: Festschrift for Mervyn McLean. Auckland: Research in Anthropology and Linguistics Monographs. p13-24
Brody, Hugh Maps and Dreams Vancouver: Douglas & McIntyre, 1981
Burarrwanga, Laklak and family Welcome To Country Sydney: Allen & Unwin, 2013
Cane, Scott First Footprints: The Epic Story of the First Australians Sydney: Allen & Unwin, 2013
Coller, Matt Temporal Earth http://collection.temporalearth.net/pages/loadFile.html
Connell, Raewyn Southern Theory: the global dynamics of knowledge in social science Crows Nest: Allen & Unwin, 2007
Cotter, Maria; Boyd, Bill; Gardiner, Jane Heritage Landscapes: Understanding Place and Communities Lismore: Southern Cross University Press, 2001
Dargin, Peter Aboriginal Fisheries of the Darling-Barwon Rivers Dubbo: Brewarrina Historical Society, 1976
Dixon, RMW and Duwell, Martin The Honey Ant Men's Love Song and other Aboriginal Song Poems St Lucia: University of Queensland Press, 1994
Elder, Bruce Blood on the Wattle Sydney: New Holland Publishers, 1988
Fesl, Eve Mumewa Conned! St Lucia: University of Queensland Press, 1993
Foley, Gary The Koori History Website http://www.kooriweb.org/
Foley, Gary 'Advice for white Indigenous activists in Australia', The Juice Media, posted Sep 5, 2010, https://www.youtube.com/watch?v=uEGsBV9VGTQ. Filmed during the public discussion forum 'Deactivating Colonialism / Decolonising Activism' convened by Clare Land at MAYSAR (Melbourne Aboriginal Youth, Sport and Recreation), Fitzroy: August 31st, 2010.
Foley, Gary 'Educate YOURSELF, then educate the people', The Juice Media, posted Sep 5, 2010, https://www.youtube.com/watch?v=Iw8YVBbQgNg. Filmed during the public discussion forum 'Deactivating Colonialism / Decolonising Activism' convened by Clare Land at MAYSAR (Melbourne Aboriginal Youth, Sport and Recreation), Fitzroy: August 31st, 2010
Foley, Gary 'Gary Foley 1981 ABC-TV interview', posted Apr 29, 2019 https://www.youtube.com/watch?v=W0vpOp26PqA
Gammage, Bill The Biggest Estate on Earth: How Aborigines Made Australia Crows Nest: Allen & Unwin, 2012
Huijser, Henk, and Brooke Collins-Gearing. 'Representing Indigenous Stories in the Cinema: Between Collaboration and Appropriation.' International Journal of Diversity in Organisations, Communities and Nations, 2007. https://www.academia.edu/2093466/Representing_Indigenous_Stories_in_the_Cinema_Between_Collaboration_and_Appropriation
Hendery, Rachel; Dousset, Laurent Alfred; McConvell, Patrick; Simoff, Simeon J Waves of words: mapping and modelling the history of Australia's Pacific ties ARC Funded Project 2018-2020
HuNI Deakin University https://huni.net.au/
InASA 2018: Unsettling Australia Conference 3/12/2018 - 5/12/2018 https://iash.uq.edu.au/event/session/1464
Johnson, Ian Heurist https://heuristnetwork.org/
Kelly, Lynne The Memory Code Sydney: Allen & Unwin, 2016
Kerkhove, Ray Aboriginal Camp Sites of Greater Brisbane Brisbane: Boolarong Press, 2015
Kerkhove, Ray with support and collaborations of Kabi Kabi traditional owners including Lyndon Davis, Kerry Jones, Arnold Jones, and others Kabi Kabi Sites and History of the Legendary Mount Coolum (Sunshine Coast, Qld) National Reconciliation Week, 2018
Kerkhove, Ray 'Aboriginal Camps As Urban Foundations? Evidence from southern Queensland' in Aboriginal History Vol 42 2018
Koch, Harold and Hercus, Luise (eds) Aboriginal Placenames: Naming and Re-naming the Australian Landscape Canberra: ANU E Press and Aboriginal History Incorporated, 2009
Leavy, Brett Virtual Songlines Brisbane: Bilbie Labs https://www.virtualsonglines.org/brett-leavy
Maynard, John Who's Traditional Land? The Wollotuka Institute, University of Newcastle, 2015 https://www.newcastle.edu.au/__data/assets/pdf_file/0009/41868/Research-document_John-Maynard_whose-land.pdf
Mathew, John Two Representative Tribes Of Queensland London: T. Fisher Unwin, 1910
Moreton-Robinson, A. (2003). 'I Still Call Australia Home: Indigenous Belonging and Place in a White Postcolonising Society'. In Ahmed, S., Castañeda, C., Fortier, A. & Sheller, M. (eds) Uprootings/Regroundings: Questions of Home and Migration Oxford & New York: Berg, pp.23-40
Moreton-Robinson, Aileen (ed) Sovereign Subjects Crows Nest: Allen & Unwin, 2007
Muecke, Stephen. 'Australian Indigenous Philosophy.'
CLCWeb: Comparative Literature and Culture 13.2 (2011)
Napaljarri, Peggy Rockman and Cataldi, Lee Warlpiri Dreamings and Histories Pymble: Harper Collins, 1994
Neale, Margo (ed) Songlines: Tracking The Seven Sisters Canberra: ACT National Museum of Australia Press, 2017
Needham, W.J. Burragurra Revisited Fyshwick: CanPrint, 2019
Nunn, Patrick D and Reid, Nicholas 'Aboriginal Memories Of Inundation Of The Australian Coast Dating from More than 7000 Years Ago' in Australian Geographer, 47:1, 11-47, DOI: 10.1080/00049182.2015.1077539
Pascoe, Bruce Convincing Ground: Learning to Love Your Country Aboriginal Studies Press, 2007
Pascoe, Bruce Dark Emu: Aboriginal Australia and the Birth of Agriculture Broome: Magabala Books, 2018
Phillips, Sandra and Verhoeven, Deb 'How Do We Live Together Without Killing Each Other?' Indigenous and Feminist Perspectives on Relationality, doi:10.1093/ccc/tcaa007
Recogito Austrian Institute of Technology, Exeter University, Humboldt Institute for Internet and Society, The Open University and University of London https://recogito.pelagios.org/
Robertson, Francesca; Nannup, Noel; Barrow, Jason Great Journeys Undertaken By Aboriginal People In Ancient Times in Western Australia Batchelor: Batchelor Institute
Ryan, Lyndall et al Colonial Frontier Massacres v3.0 C21CH, University of Newcastle, Australia, 2020 https://c21ch.newcastle.edu.au/colonialmassacres/
Ryan, Lyndall & Lydon, Jane Remembering The Myall Creek Massacre Sydney: New South, 2018
Ryan, Lyndall Tasmanian Aborigines: A History Since 1803 Sydney: Allen & Unwin, 2012
Smith, Linda Tuhiwai Decolonizing Methodologies: Research and Indigenous Peoples Dunedin: University of Otago Press, 1999
Steele, J.G. Aboriginal Pathways in Southeast Queensland and the Richmond River St Lucia: University of Queensland, 1983
Sutton, Peter 'Traditional Cartography in Australia' chapters 9 and 10 in Woodward, David and Lewis, G. Malcolm (eds) The History Of Cartography Volume Two, Book Three: Cartography in the traditional African, American, Arctic, Australian, and Pacific Societies Chicago: University of Chicago Press, 1998
Time Layered Cultural Map http://tlcmap.org/
Tjapaltjarri, Clifford Possum Warlugulong 1977, (Anmatyerre) National Gallery of Australia https://artsearch.nga.gov.au/detail.cfm?IRN=167409
Waanyi/Garawa Rangers Centre For Aboriginal Economic Policy Research Canberra: ANU College of Arts & Social Sciences
Walsh, Michael 'A Polytropical Approach to the "Floating Pelican" Song: An Exercise in Rich Interpretation of a Murriny Patha (Northern Australia) Song' in Australian Journal of Linguistics Vol. 30, No 1, Jan, 2010, pp117-130
Walter, Maggie & Suina, Michele (2019) Indigenous data, indigenous methodologies and indigenous data sovereignty, International Journal of Social Research Methodology, 22:3, 233-243, DOI: 10.1080/13645579.2018.1531228

work_hknhv6sruzfkvebqoevkjlbc7a ---- University of Winchester Research Portal: Welcome to the University of Winchester's Institutional Repository which showcases the excellent research undertaken across the University. The Repository enables open access to outputs where permitted, and full citation details where restrictions apply, making our research accessible worldwide through a searchable, browse-able database. New items are being added all the time. For further information about the Repository, please contact repository@winchester.ac.uk

work_hlfsrxlkmngjlbtwngyk2q2qjm ---- IJAHC

Title: "Remote Locations: Early Scottish Scenic Films and Geo-databases"
Authors: Maria A. Vélez-Serna and John Caughie
Email of corresponding author: Maria Vélez-Serna
Biographical notes: Maria A. Vélez-Serna is a research assistant with the Early Cinema in Scotland project. Her PhD, at Glasgow, was about the emergence of the distribution trade, and she has also worked on Colombian cinema history. She has published articles in Particip@tions, Post Script, and the edited collection Performing New Media (John Libbey, 2014).
John Caughie is Emeritus Professor at Glasgow University, and the Principal Investigator on the Early Cinema in Scotland project. He was a founding member of Film and Television Studies at Glasgow University from 1978. From 1999 to 2005, he was Dean of the Faculty of Arts, and from 2009 to 2011, he was Director of Arts Lab. Before stepping down in 2013, he had been a contributor and editor to Screen, the leading international journal in film and television studies, for over 30 years, and was co-editor, with Charlotte Brunsdon, of the Oxford University Press series Oxford Television Studies.
Abstract: In the field of cinema history, an increased interest in social experience and context has challenged the centrality of the film and the primacy of textual analysis. The 'Early Cinema in Scotland, 1896-1927' research project takes a contextual approach, using geo-database tools to facilitate collaboration.
This article shows how spatially-enabled methods can also be mobilized to bring issues of representation back into a cinema history project. We argue that, when the films have not survived, their geographical descriptors as recorded by trade-press reviews and catalogues offer new avenues of analysis. The article argues that foregrounding location as a significant element in the film corpus creates a new point of interconnection between film text and context. The juxtapositions and divergences between the spatial patterns of film production and cinema exhibition are connected to pre-cinematic traditions of representation. The spatial distribution also sheds light on the differences between films made for local and international consumption, reflecting on Scotland’s position in relation to discourses of modernity. Keywords: spatial historiography, new cinema history, early cinema, Scottish cinema, cinematic cartography, Geographic Information Systems, geo-database 2 Remote Locations: Early Scottish Scenic Films and Geo-databases As in many other humanities disciplines, spatial approaches have been gaining ground in film studies and cinema history, and increased attention to social and spatial contexts has challenged the centrality of the film text in current cinema historiography. This spatial turn in cinema studies is an encounter between humanistic and scientific disciplines, and the tensions between their approaches are as productive as the collaborations. The use of geo-databases as a research method plays a key role in this development. This article discusses some of the strategies developed by the Early Cinema in Scotland research team to address questions about textual representations within the conceptual and practical framework of an empirically-minded and spatially-aware cinema history project. A study of early non-fiction films from Scotland illustrates the value of location data to interrogate textual patterns even in the absence of texts, offering a way to engage with a filmography in which the films themselves have mostly been lost. Furthermore, the films can be analyzed through cartographic and database practices that foreground layering and connectivity, revealing relationships with other cultural artifacts and with different datasets. The context for this work is an interdisciplinary research project involving five researchers, and so the collaborative dimension of GIS methods is very valuable. The implementation of database tables and relationships has followed the evolving needs and interests of the researchers, leading to productive conversations about our definitions and methods. The place of the film text within the project has been a recurring question, as the core research agenda situates our work in the territory of “New Cinema History,” an outlook that borrows its methodologies from social and cultural historians in a cumulative effort to produce “a social geography of cinema.”1 Funded by the United Kingdom’s Arts and Humanities Research Council, the Early Cinema in Scotland project set out to address three questions: 1. What are the distinctive features of the early development of cinema and cinema-going in Scotland? 2. Given the well-documented popularity of cinema-going in Scotland in the period, what were the factors that inhibited the development of a sustainable feature film production capacity? 3. 
How does research on the circulation and reception of cinema in Scotland in the early years of the twentieth century add to wider debates about “the popularization of modernity and the modernization of popularity?”2 These questions will be addressed not only in the context of the economic, social and cultural history of Scotland in the early years of the last century, but in the wider context of a comparative understanding of early cinema outside the major production centres of the US and Europe: that is to say, in small countries, in minor regions, and in rural and small-town communities. This attention to institutional and social aspects places the project alongside a growing number of empirical studies of exhibition and cinemagoing, informed by an interest in what Robert C. Allen calls “the spatiality of the experience of cinema.”3 In contrast with classical theories of spectatorship and reception, empirical studies suggest that “for most audiences for most of the history of cinema, their primary relationship with ‘the cinema’ has not been with individual movies-as-artefacts or as texts, but with the social experience of cinema- going.”4 In this interdisciplinary scholarship the film text is no longer at the centre.5 A purely textual approach, in particular one that looked at “Scottish films” only, would thus be an impoverished representation of the Scottish relationship with cinema. In brief, what we find is that early cinema in Scotland was characterized by a legendary enthusiasm on the part of the 3 audience which, in turn, was catered for by a strong exhibition sector. What we do not find is that this enthusiasm for cinemagoing fostered a consistent or sustainable production sector, or stimulated indigenously-produced Scottish feature films.6 Beyond our interest in these dimensions of institutional configuration and social experience, looking beyond the text was also a pragmatic decision for our project, since only a small fraction of the films made in Scotland before the transition to sound have survived. Even if we wanted to conduct textual analysis, lateral approaches were required to address the broader questions about experience, representation, and modernity. In this article we explore Franco Moretti’s notion of “distant reading” as a model for an even more distanced approach to films, a remote reading, mediated and contextualized through their spatial attributes.7 What we share here are provisional insights from this exploratory process of bringing textual analysis back into the fold through mapping, and reflections on the analytic practices it enables. The conceptual interest in the spatiality of the cinema experience advocated by New Cinema History has sometimes found a methodological correlate in the use of Geographic Information Systems (GIS). As Julia Hallam and Les Roberts have argued, geo-database tools present two significant advantages for projects engaged in a spatial historiography of audiovisual media. Firstly, GIS visualization is organized in layers, and this enables certain ways of navigating, reading, and analyzing sources, in a synchronic layering of temporalities with critical potential. 
Second, geo-databases turn location data into a connecting point, bringing together disparate datasets that pertain to the same places.8 The mash-up map as scholarly tool is a crude but effective realization of geographer Doreen Massey’s idea of relational space as the dimension where historical trajectories are “thrown together” by happenstance.9 As Deb Verhoeven and Colin Arrowsmith argue, ‘[s]imply recognizing that film industries generate data with a temporal and spatial element enables the building of connections that can reveal previously obscure influences and relationships.’10 This relational potential is particularly valuable for historians working on topics, regions or periods that are less well documented, and it invites transnational and comparative approaches. While effective dataset integration is still an unrealized ambition in cinema history, building compatible data structures is a key step towards that aim.11 The first step for the Early Cinema project was to set up a relational MySQL database with GIS data imported from preliminary work carried out using QuantumGIS and PostgreSQL. The data fields and attributes have been defined in dialogue with other international projects, while retaining some local specificity. A common denominator of most cinema history projects involving databases is the centrality of the cinema venue. This is the case of “Going to the Show,” the website developed by the State Library of North Carolina under Robert C. Allen’s guidance, which documents the development of cinema exhibition in forty-five towns using fire insurance maps and newspaper sources.12 Jeff Klenotic’s work on New Hampshire exhibition history also uses venues as the primary marker, offering a sophisticated range of analytical categories on top of demographic and other base maps, and championing GIS as an exploration tool that accommodates “history from below” through grounded visualization.13 The Australian Cinemas Map, coupled with the Cinema Audiences in Australia database, has taken this analysis a step further, questioning the stability of the notion of venue itself, and reformulating it as a series of events linked to a point in space.14 Projects like these, and several others in development around the world, suggest that geo-spatial tools are becoming a standard component of research projects looking at the histories of cinema exhibition and reception, 4 embraced as a way to link up and contextualize the growing range of sources that cinema historians now employ. Like the projects mentioned above, the Early Cinema in Scotland database design placed geographical locations, rather than film titles, as the main integrating point and the relevant attribute for visualization. Film titles would only acquire a geographical attribute by virtue of being screened at one of these places. However, as the filmography grew, it became apparent that there were many films that had significant Scottish elements, but which may never have been screened in Scotland, thus limiting the usefulness of an exhibition-led cinematic geography. One of the distinctive aspects emerging from our research was the disparity between endogenous and exogenous representations of Scotland, as the prevalence of Scottish themes and settings in international productions far outstripped local output. 
While only one silent feature made in Scotland survives, the amateur drama Mairi: The Romance of a Highland Maiden (Andrew Paterson, 1912), a review of the British and American trade journals Bioscope, Moving Picture World and Motion Picture News produced at least 119 feature films released between 1908 and 1927 with Scottish settings and stories.15 The popularity of Scottish literature throughout the world in the nineteenth century is key to this anomaly: the works of Walter Scott are staples for film adaptations by European companies before World War 1, and after World War 1 historical romances of Mary Queen of Scots, Rob Roy, Bonnie Prince Charlie, and Young Lochinvar are part of the diet of global cinema. As the author of a 1945 film survey for the Edinburgh Film Guild put it, If the Waverley novels are now read less frequently, it is because their qualities are the very stuff of cinema, which can translate the romantic scene and stirring tale in a modern idiom of swift, sharp beauty keyed to the tenser spirit of the age. Where former generations found romance in Scott the present generation finds it in the cinema.16 These literary traditions, as Moretti and others have argued, had a geographic dimension, with the Highlands functioning, in Scott’s historical novels and in popular legend, as a frontier territory that allows travellers to journey into the past, setting in motion the narrative wheels of the genre as well as its anthropological impulses.17 If we were to explore the continuities in the grammar and the tropes of cinematic landscape from pre-modern and romantic representational forms, we would need to understand these spatial patterns, and therefore our analytical tools— that is, the geo-database—needed to facilitate this. After Moretti’s influential Atlas of the European Novel, a growing body of work on the spatiality of literature has continued exploring the relationships between fictional and topographic space. The best examples challenge both empiricist and dematerialized conceptions of space and place, bringing GIS practice into dialogue with the discourses and approaches developed within the humanities, and showing how, like maps, narratives produce forms of spatial understanding. Mapping the spaces of narrative fiction was also the initial point of contact between geography and film studies. However, as Peta Mitchell and Jane Stadler have noted, “literary geography and film geography are distinct traditions within geography, each with its own histories and assumptions.”18 In the same essay, which refers to the Cultural Atlas of Australia project, Stadler and Mitchell go on to outline their proposal for an intermedial geocritical method, combining the strengths of different disciplines’ spatial turns to examine how “[c]ultural narratives not only mediate and represent space, place, and location, but [are] themselves mediated representational spaces.” The Cultural Atlas of Australia, consequently, surveys narrative space across novels, plays, and films, providing a model for a critical 5 cybercartographic method that pays attention to the multiple perspectives and imaginative geographies of fiction.19 Mitchell and Stadler’s geocritical practice, by drawing on a variety of datasets and utilizing cartographic tools, connects Maltby’s exhortation for a “social geography of cinema” with the more text-centred directions of the spatial turn in film studies. 
These textual strands have sought to understand how films invent and signify spaces, in works like Charlotte Brunsdon’s London in Cinema, recognizing a mutually creative relationship.20 Closer to the pragmatic motivations of a geo-database platform, the notion of “cinematic cartography” actually involves mapping, while challenging any positivist associations that the practice may evoke. In their introduction to a dedicated issue of the journal of the British Cartographic Society, Sébastien Caquard and D. R. Fraser Taylor explained that this approach turns the implicit connections between cartographic practice and film into a mode of analysis, one that “acknowledges the importance of cartography as an objective and scientifically based discipline, as well as the importance of conveying different forms of emotions and sensations about places through cinematographic language.”21 Cinematic cartographies add another layer of complexity due to the unstable relationship between the profilmic and the diegetic space—that is to say, between location and setting.22 As Brunsdon points out, cinematic geographies are complicated by the fact that cinema “is, in one sense, constituted through the production of spaces. And these cinematic spaces are produced through the manipulation of other spaces and processes.”23 Mapping diegetic locations is a practice rooted in the text-centred literary tradition. On the other hand, mapping shooting locations has become an extremely popular practice—for tourism offices around the world, as well as independent enthusiasts. This distinction was adopted with the creation of two separate database attributes, so that our filmography could document both the setting and the shooting location of a film, if known. With these two location fields, the filmography became spatially- enabled. This means that we can now potentially map the films alongside the other entities in the database, and study the overlaps and divergences between their spatial arrangements. While the divergences and alleged identities between fictional settings and shooting locations deserve more detailed attention in future research, the rest of this paper focuses on non-fiction films, using the geo-database’s layering abilities to explore the inter-medial and inter-textual connections that underpin representation strategies in the silent period. While there is already a significant body of work on the relationships between fictional and topographic places in literature and narrative cinema, there are fewer examples of this approach that engage with documentary or non-fiction cinema.24 Salient amongst them is the Liverpool: City in Film project, which geo-referenced more than 1700 film and video items including everything from newsreels to amateur productions, spanning five decades of urban change in a provincial city. Mapping this large corpus with GIS tools allowed the researchers to examine how different film genres engaged with the city, finding “a series of overlapping mosaics of the city’s urban landscape” in which “specific production practices construct and project different spatial perceptions of the city.”25 This suggests that the geocritical approaches that have developed in relation to fictional geographies can still be necessary when looking at non-fiction films, as they offer their own spatial discourses and contribute to the production of social and cultural space, rather than simply bearing witness to it. 
Perhaps, riding on the continuing influence of an indexical paradigm, the relationship between a place and its representation in non-fiction is taken for granted. However, as a selective and fragmentary view of the world, and as an accumulation of intelligible discourses, non-fiction 6 films construct narratives of place adapted to different functions. One of the dominant forms of discourse during the early period is the travel film.26 Before the emergence of the term documentary, and its association with a more self-conscious rhetoric of realism, early cinema placed as much stress on the medium’s evidentiary value as in its imaginative possibilities. What Tom Gunning has called the “encyclopedic ambition” of early cinema promised to bring all the world to viewers in metropolitan centres.27 The travel film or “scenic” was thus one of the first film genres to emerge, and it took pride of place in the programmes of early travelling exhibitors, and then as part of the varied assemblies of films shown in nickelodeons and picture houses. As late as 1913, out of the more than 600 films released in the UK in a month, almost ten per cent were catalogued as travel films or scenic films.28 While their length was significantly below the mean, the sustained production of short travel films, mainly by British and European companies, ensured the survival of the “varied programme” that exhibitors believed audiences wanted.29 The travel film, as Ivo Blom points out in his study of the work of filmmaker Anton Noggerath in Iceland, draws on the popularity of travel writing in the eighteenth and nineteenth centuries.30 Like Iceland, Scotland was a favored topic for early modern travel writers, with the Highlands figuring as an accessible wilderness, a margin of Europe and of the British Empire that could be reached by train. Furthermore, the European Grand Tour that was fashionable for the British aristocracy and aspirant bourgeoisie had become too dangerous in the tumultuous conditions of the Revolutionary and Napoleonic Wars. The legitimacy of the Scottish Highlands as an alternative Grand Tour, as well as the pacification of the area a hundred years after the last Jacobite rebellion, were confirmed when Queen Victoria established a private residence at Balmoral in 1852. Landscape painting, by JMW Turner, for example, for Scott’s “Poetical Works” in 1831, and in particular the very popular work of Sir Edwin Landseer, had consolidated the alliance between visual style, literary representations and ideological constructions of the Highlands connected to an aesthetics of the sublime and a rugged exoticism. Lantern lecturers had access to photographic sets such as those produced by George Washington Wilson, a native of Banffshire who attracted both royal patronage and international acclaim for his artistic and technically skilled views, available commercially as single and stereoscopic prints from the 1860s.31 The geographical interest of Wilson’s work, as Charles Withers has argued, needs to be understood against a background of “historical and literary associations [that] drew tourists and artists both” to particular locations such as Loch Lomond, the Trossachs and Glencoe.32 This long history of visual and descriptive representation is engaged again in early non-fiction films about Scotland. 
Our database, which is still growing and does not claim to be comprehensive, includes at the moment eighty-five travel, educational, and interest films shot in Scotland and offered to the British trade by production companies of various origins and nationalities. Almost half of these were described as scenics, and include titles like A Holiday in the Highlands (Barker, 1919), Mountains and Glens of Arran (H&B, 1915), and The Bonnie Isle of Skye (Kineto, 1913). On a discursive level drawn largely from the trade press, the titles and descriptions suggest a continuity between pre-modern and Romantic literary traditions and the emergent conventions of cinematic landscapes. Thus, for instance, the Bioscope review for The Bonnie Isle of Skye talks of the “romantic and mystical beauty” of the Western Islands, and the invocation of “Caledonia, stern and wild” (from Scott’s Lay of the Last Minstrel) appears in the trade descriptions of both Scottish Scenery (1914) and Prince Charlie’s Country and the Western Highlands (1914).33 Practically, however, the corpus of films on which these continuities can be established is severely incomplete; like most productions of the nitrate era, the majority of the films is lost. 7 This creates a different challenge for our attempt to engage with the filmography on a textual level. Trying to study how these films conveyed representations of Scotland, without being able to see most of them, requires a new approach, and spatial tools can offer some answers. To borrow Moretti’s influential idea, setting and location are two elements that can be read “distantly.”34 Using the British trade journal The Bioscope, we collected the descriptions of Scottish-themed non-fiction films offered for UK distribution every week. These descriptions, while typically embellished and often equivocal, do, in the majority of cases, name locations. It is one of the interesting inflections of reading distantly or remotely through the trade press that the locations that are identified are those that are already known, that are already “mapped” on the tourist agenda and can be invoked in the selling of the film: the Spean Gorge, the waterfalls of the River Clyde, Loch Katrine. This plotting of locations, if framed effectively, gives us some foothold for an investigation of meaning-making strategies in early film representations of Scotland, and allows us to compare their geographical patterns to those in other texts and to situate them in relationship to a broader context. We are not simply reading landscape off the film, but off an imagined map, a “branded” landscape, drawn from nineteenth-century tours and tour guides, that pre-exists the film. While this is very much still work in progress, some of the findings start to show the potential of this geo-database treatment for addressing textual questions. In the last section of this paper, we discuss a corpus of thirty-nine non-fiction films made in Scotland between 1910 and 1927, and advertised in The Bioscope. A quarter of these films mention the Highlands in their title. The trade journal descriptions name seventy-five locations in total, which have been mapped manually. This exercise allows us to understand these films in relation not only to other films, but, importantly, to other dimensions of our research: demographic data, exhibition venues, and the locations of other topical and fictional films. 
At the core of this analysis is a very simple methodology: using Quantum GIS, we layer various types of data, from the topographic and demographic profiles to the places named in scenic and local topical films. Appropriate use of transparency and labelling allows us to explore overlapping data points and test hypotheses quickly and iteratively. Given the diversity of the primary sources, this is of necessity a work of bricolage, bringing together different time-scales and levels of accuracy. The overlapping temporalities marked in Figure 1 reflect the limits of the sources: Census dates, trade journal runs, and archival holdings. The problematic way in which spatial visualization seems to conflate time is a well-rehearsed discussion amongst digital humanists.35 As an exploratory tool, however, we retain the generative power of the “mash-up” map, with the caveat that a fuller historical explanation would demand a closer breakdown of the layers, their relationships, and the longitudinal changes within each dataset. 8 Figure 1: Locations of scenic films and local topical films compared to geographical distribution of cinema venues. Historical boundary data: Scottish Civil Parishes 1890 (digitized from Black’s Atlas), via EDINA Census Support. Census data: Southall, H.R., Gilbert, D.R. and Gregory, I., Great Britain Historical Database: Census Statistics, Demography, 1841-1931 [computer file]. Colchester, Essex: UK Data Archive [distributor], January 1998. SN: 3707, http://dx.doi.org/10.5255/UKDA-SN-3707-1. To begin with the most general observation, mapping the locations of these scenic films against population density – as per the 1911 Scottish census – reveals a sharp divergence. As the scenic films gravitate towards the Western and Central Highlands, there is a preference for sparsely populated areas. While Edinburgh and Glasgow are sometimes mentioned, they tend to appear as points of departure for a scenic voyage rather than as “scenes” in themselves. The River Clyde, which runs through Glasgow and whose shipbuilding industry produced over twenty per cent of the world’s mercantile ships (by tonnage) during its boom years at the turn of the century, is represented in three of the scenic films.36 However, the picturesque waterfalls to the east of the manufacturing area and the open estuary to the west are privileged over the cranes and molten steel at the centre of the industry. It was not until the Documentary Movement between the 1930s and the 1950s that industrial Scotland would be pictured heroically. Deleted: [Figure 1 here.]¶ 9 The preference for sparsely populated locales has another corollary in the minimal overlap these films have with the geography of the expansion of cinema. Put simply, most of the places depicted did not have a cinema; the films were not meant to be shown there. While itinerant non-theatrical exhibition was common in rural Scotland, and so it is not impossible that films were shown somewhere in the vicinity, there is a sharp distinction between films intended for national and international distribution and the extended practice of local topical filmmaking. There is no mention in the Oban Times, for example, of two scenics or interest films, Highland Games at Oban and Dunoon (Kineto, 1911) and Oban on Regatta Day (Kineto, 1913), being exhibited in the area. While they may or may not have been screened there, they were made by a major UK production company, aimed at an international rather than a local audience, and they did not attract local attention. 
The local film has been defined by Stephen Bottomore as one that expects “considerable overlap between the people appearing in the film and those who watch it.”37 These local topicals were crowd films: a practice initiated by travelling exhibitors, and adapted later by cinema managers needing to add the irresistible attraction of seeing yourself on screen to their programmes. Whether they nominally documented a gala day, parade, or news event, the camera was always turned on the audience, as this would guarantee their attendance at the show.38 Very few managers and operators had the skills and equipment to shoot and develop local topicals, so they were mostly commissioned from newsreel agents based in Glasgow or Edinburgh. It is thus not surprising that their geographical distribution favors the central belt of Scotland, which was both densely populated and very well provided with cinemas. Although we do not have time to develop the argument here, while it is part of the definition of the local topical film that it be familiar, everyday and recognisably local, it is part of the definition of the scenic film that it be, in some sense, exotic, removed from the everyday, and taking its significance from an already imagined space. Away from the heavy industry and the booming centres of population, most of the places filmed as scenic were connected to the railway or the ferry system – exotic but accessible. In part due to the material determinants of access, cinematic tourism echoed the geographical preferences of earlier tourist narratives. The falls of the Clyde, Loch Lomond and the Trossachs, and parts of Stirlingshire and Perthshire were as popular with filmmakers as they had been with literary visitors in the eighteenth and nineteenth centuries. In her Recollections of a Tour Made in Scotland, A. D. 1803, Dorothy Wordsworth recounts a meandering circuit starting in the Lake District, following the Clyde Valley and taking William and Dorothy Wordsworth, and, for part of the journey, Samuel Taylor Coleridge to the West and Central Highlands, ranging from Glen Coe in the North to the Gaelic-speaking areas of the Trossachs and Loch Katrine just thirty miles North of Glasgow. Their tour ends in the scenic area of the Borders, south of Edinburgh, where they are escorted by Sir Walter Scott. Drawn to waterfalls and gorges, Dorothy Wordsworth’s descriptions expect and evoke the sublime in the bleak landscape. While a fuller discussion of the overlaps and divergences between literary and cinematic tours is the subject of a different article, the simple exercise of mapping and juxtaposing different categories from the existing records, and layering cartographic data from different texts, starts to reveal how forms of cinematic discourse and modes of address are constructed by relation to space and place. Scotland’s complicated position in relation to modernity emerges in the contradictions between endogenous and exogenous forms of representation. Annie Morgan James argues, in her essay on Scottish landscapes in post-war cinema, that “the Highlands as cultural artefact define Scottishness, and in cinema the perpetual landscaping of Scotland intensifies the rurality of this stateless nation.”39 This is, however, only true of outward-facing forms of representation, 10 intended for an international rather than a local market. The rurality and grandeur of the Highlands is itself a discursive product: the production of an image of Scotland for a world imaginary. 
The fact that the geographic markers used in this analysis are taken solely from the trade descriptions of the films reminds us that this is advertising material. Its function is not to provide a shot-by-shot list of locations, but to sell the place and the journey, making explicit and implicit connections with existing horizons of expectations. The strongest imaginary at play in this commoditized Scottish geography is the Highlands as a vaguely defined, but powerfully symbolic territory, a European border with wildness and pre-modernity. The Highlands remain in these films, and in many feature films from the period, as an obstinate example of imprecise geography. As both literary and cinematic cartographers have shown, the geographies of fiction are often imprecise (as compared to the co-ordinate data expected by GIS software), and even when place names are given, the relationship between a place in a novel or narrative and that place in the world is complicated. Researchers working in the "Literary Atlas of Europe" project describe the uncertainty introduced by literary geographies as "a combination of subjectivity, vagueness and ambiguity (caused by the conceptualisation of literary places) on the one hand, and averaging, completeness and continuousness (resulting through the acquisition method of those literary objects) on the other hand."40 In other words, it is difficult to create appropriate literary maps because places in literature are either imprecise or made up, while conventional cartography expects precise coordinates and sharp boundaries.
A similar contradiction emerges in relation to film, with significant differences. Maurice Tourneur's The White Heather (1919), for example, featuring a wreck off the coast of the Scottish Highlands, and commended in Bioscope for the vividness and accuracy of its "British atmosphere," was filmed in Los Angeles Harbor. While narrative setting may be as defined or uncertain as in literature, the uncertainty regarding shooting location is only a contingent one. The indexical root of photographic representation means that there is always a very precise location, although we might not know what it was. From an empiricist perspective, therefore, the imprecision of this geography is merely a technical problem: it is possible to envision an image-recognition algorithm that matched the Highland landscape views to their co-ordinates, or an archival trove with the shooting diaries of all the camera operators involved. It is almost certainly more productive, however, to think through this imprecision and to work with it rather than strive to eliminate it.
The tension between the perceived finality of a point on a map, and the fluidity of socially produced space, is a well-known point of contention, but also a creative force for humanities scholars working with digital methods. In the field of cinema history, a similar voltaic arc can be sparked between more text-centred and/or theoretical approaches, and the empirical and archival work that has challenged previous generalizations. The collaborative, data-sharing, linking and layering abilities of digital tools encourage exploratory, mash-up methodologies rather than competitive monotheism. In the context of the Early Cinema in Scotland project, an uncomplicated geo-database structure has enabled and encouraged us to engage with textual aspects as well as social and institutional issues.
It allows one researcher's work with demographics and exhibition history to interact with another's investigation of film locations and literary precedents, or to help understand production patterns as both discursively and materially determined. Thus, multiple, possibly contradictory stories can be woven into new forms of historical narrative that do not erase difference or seek synthesis. Rather, they retain some of the imprecision and messiness of the social and cultural world sharpened and held in tension with a methodical and critical engagement with technology.
Acknowledgements
The Early Cinema in Scotland research project is funded by a grant from the Arts and Humanities Research Council (UK), AH/1020535/1. The historical geography elements of this work use boundary material that is copyright of EDINA, University of Edinburgh, and is based on data provided through EDINA Census Support with the support of the ESRC and JISC. Census tables were obtained from the Great Britain Historical Database through the UK Data Archive.
End Notes
1 Richard Maltby, "New Cinema Histories," in Explorations in New Cinema History: Approaches and Case Studies, eds. Richard Maltby, Daniel Biltereyst, and Philippe Meers (Oxford: Wiley-Blackwell, 2011), 28.
2 Francesco Casetti, "Filmic Experience," Screen 50, no. 1 (2009): 58.
3 Robert C. Allen, "Getting to 'Going to the Show'," New Review of Film and Television Studies 8, no. 3 (2010): 268.
4 Richard Maltby, "How can Cinema History Matter More?" Screening the Past 22 (2007), accessed December 15, 2014, http://tlweb.latrobe.edu.au/humanities/screeningthepast/22/board-richard-maltby.html.
5 Richard Altman, "Whither Film Studies (in a Post-film Studies World)?" Cinema Journal 49, no. 1 (2009): 134.
6 Trevor Griffiths, The Cinema and Cinemagoing in Scotland, 1896-1950 (Edinburgh: Edinburgh University Press, 2012), 279.
7 Franco Moretti, Distant Reading (London: Verso, 2013).
8 Julia Hallam and Les Roberts, "Mapping, Memory and the City: Archives, Databases and Film Historiography," European Journal of Cultural Studies 14, no. 3 (2011): 368-369.
9 Doreen Massey, For Space (London: Sage, 2005), 151.
10 Deb Verhoeven and Colin Arrowsmith, "Mapping the Ill-disciplined? Spatial Analyses and Historical Change in the Postwar Film Industry," in Locating the Moving Image, eds. Julia Hallam and Les Roberts (Bloomington: Indiana University Press, 2014), 107.
11 Karel Dibbets, "Cinema Context and the Genes of Film History," New Review of Film and Television Studies 8, no. 3 (2011). See also the website "Cinema Context," accessed January 7, 2015, http://www.cinemacontext.nl.
12 The University of North Carolina, "Going to the Show," accessed January 7, 2015, http://docsouth.unc.edu/gtts/.
13 Jeffrey Klenotic, "Putting Cinema History on the Map: Using GIS to Explore the Spatiality of Cinema," in Explorations in New Cinema History: Approaches and Case Studies, eds. Richard Maltby, Daniel Biltereyst, and Philippe Meers (Oxford: Wiley-Blackwell, 2011), 66. Klenotic's project is online at "Mapping Movies," accessed May 18, 2015, http://mappingmovies.unh.edu/maps.
14 Deb Verhoeven, "What is a Cinema? Death, Closure and the Database," in Watching Films, eds. Karina Aveyard and Albert Moran (Bristol: Intellect, 2013), 33-51. See also the database at "Cinema Audiences in Australia," accessed January 4, 2015, http://caarp.flinders.edu.au/home.
15 Bioscope was consulted on microfilm at the National Library of Scotland. For the American trade journals, our research was immensely facilitated by their availability via the "Media History Digital Library," accessed January 7, 2015, http://mediahistoryproject.org/.
16 Norman Wilson, Presenting Scotland: A Film Survey (Edinburgh: Edinburgh Film Guild, 1945), 8. It is worth noting, however, the Scott novel that has never been adapted for cinema is Waverley itself.
17 Franco Moretti, Atlas of the European Novel (London: Verso, 1998), 37-38.
18 Peta Mitchell and Jane Stadler, "Redrawing the Map: An Interdisciplinary Approach to Australian Cultural narratives," in Geocritical Explorations: Space, Place and Mapping in Literary and Cultural Studies, ed. Robert T. Tally (New York: Palgrave Macmillan, 2011), 53.
19 "Cultural Atlas of Australia," accessed January 5, 2015, http://www.australian-cultural-atlas.info/CAA/index.php.
20 Charlotte Brunsdon, London in Cinema: The Cinematic City since 1945 (London: BFI, 2007).
21 Sébastien Caquard and D. R. Fraser Taylor, "What is Cinematic Cartography?" The Cartographic Journal 46, no. 1 (2009): 7.
22 Mitchell and Stadler, "Redrawing the Map," 58.
23 Brunsdon, London in Cinema, 7.
24 Or with non-fiction writing, for that matter. Ian Gregory's research on historical travel writing and tourist guidebooks of the Lake District, which has geo-referenced 80 texts to explore how the region has been represented, is a pioneering example. See "Lakeland Geo-text Explorer," accessed January 5, 2015, http://www.lancaster.ac.uk/fass/projects/spatialhum/geotext/.
25 Julia Hallam, "Mapping the 'City' Film 1930-1980," in Locating the Moving Image, eds. Julia Hallam and Les Roberts (Bloomington: Indiana University Press, 2014), 177. See the project database at "Mapping the City in Film," accessed May 18, 2015, https://www.liv.ac.uk/architecture/research/cava/cityfilm/.
26 Tom Gunning, "'The Whole World Within Reach': Travel Images Without Borders," in Virtual Voyages: Cinema and Travel, ed. Jeffrey Ruoff (Durham: Duke University Press, 2006), 25.
27 Tom Gunning, "Early Cinema as Global Cinema: The Encyclopedic Ambition," in Early Cinema and the "National," eds. Richard Abel, Giorgio Bertellini, and Rob King (New Barnet: John Libbey, 2008).
28 "The Cinema Film Register," The Cinema and Property Gazette, April 2, 1913, 61-66.
29 Ian Christie and John Sedgwick, "'Fumbling Towards Some New Form of Art?': The Changing Composition of Film Programmes in Britain, 1908-1914," in Film 1900: Technology, Perception, Culture, eds. Annemone Ligensa and Klaus Kreimeier (New Barnet: John Libbey, 2009), 159.
30 Ivo Blom, "The First Cameraman in Iceland: Travel Films and Travel Literature," in Picture Perfect: Landscape, Place and Travel in British Cinema before 1930, eds. Laraine Porter and Briony Dixon (Exeter: Exeter Press, 2007), 68.
31 Charles Withers, "Picturing Highland Landscapes: George Washington Wilson and the Photography of the Scottish Highlands," Landscape Research 19, no. 2 (1994): 73.
32 Withers, "Picturing Highland landscapes," 71.
33 "The Pick of the Programmes: What we Think of Them," Bioscope, October 9, 1913 and March 26, 1914.
34 Moretti, Distant Reading, 67.
35 In cinema history specifically, this "flattening" of sequential events was one of the objections offered by Robert C. Allen against Ben Singer's account of Manhattan nickelodeons. See Robert C. Allen, "Manhattan Myopia; Or, Oh! Iowa!" Cinema Journal 35, no. 3 (1996): 77.
36 Neil K. Buxton, "The Scottish Shipbuilding Industry Between the Wars: A Comparative Study," Business History 10, no. 2 (1968): 119.
37 Stephen Bottomore, "From the Factory Gate to the 'Home Talent' Drama: An International Overview of Local Films in the Silent Era," in The Lost World of Mitchell and Kenyon: Edwardian Britain on Film, eds. Vanessa Toulmin, Simon Popple, and Patrick Russell (London: BFI, 2004), 33.
38 See, for instance, from the Scottish Screen Archive's collection, Arrival at Whitehart Hotel, Campbeltown (1914) http://ssa.nls.uk/film/0795, or many of the films made by Mitchell and Kenyon for travelling exhibitors in the North of England, such as Preston Egg Rolling (1901) http://player.bfi.org.uk/film/watch-preston-egg-rolling-c1901-1901/, accessed January 5, 2015.
39 Annie Morgan-James, "Enchanted Places, Land and Sea, and Wilderness: Scottish Highland Landscape and Identity in Cinema," in Representing the Rural, eds. Catherine Fowler and Gillian Helfield (Detroit: Wayne State University Press, 2006), 186.
40 Anne-Kathrin Reuschel and Lorenz Hurni, "Mapping Literature: Visualisation of Spatial Uncertainty in Fiction," The Cartographic Journal 48, no. 4 (2011): 298.

work_hm4ljhe2mnbtratt3c3gw756dm ---- Text Mining at an Institution with Limited Financial Resources
D-Lib Magazine, July/August 2016, Volume 22, Number 7/8
Drew E. VandeCreek, Northern Illinois University Libraries, drew@niu.edu
DOI: 10.1045/july2016-vandecreek
(This Opinion piece presents the opinions of the authors. It does not necessarily reflect the views of D-Lib Magazine, its publisher, the Corporation for National Research Initiatives, or the D-Lib Alliance.)
Abstract
The digital humanities are now coming to the attention of a growing number of scholars and librarians, including many at medium-sized and small institutions that lack significant financial resources. Should these individuals seek to explore text mining, one of the digital humanities core activities, they are likely to confront the fact that their library cannot afford the typical expensive database products that contain large volumes of materials suitable for analysis. In this opinion piece, I suggest that vendors would benefit from increasing their customer base by offering potential users the opportunity to purchase discrete portions of data sets individually. This approach may prove practicable for libraries able to muster relatively modest sums for the purchase of single items. It also may represent a new source of revenue for vendors, or at least an opportunity to build trust and goodwill in the digital humanities community.
The Problem
The digital humanities' increasing prominence in academic life, marked by such things as the advertisements seeking applications for new positions and calls for papers, has brought it to the attention of a large number of humanities scholars, librarians and administrators not employed at the larger institutions that have heretofore often led the field's development. Many have expressed an interest in the field. These individuals often do not have access to as many financial resources as the field's leaders often enjoy.
This shortfall makes itself apparent in any number of ways: the lack of a technical infrastructure robust enough to support many types of digital humanities work; a lack of information technology professionals that understand, appreciate and can support the work; and an inability to attend professional development workshops at other institutions. Another potential problem to be faced by this new group of practitioners at non-elite institutions with limited resources will arise when they undertake text mining, one of the digital humanities' core activities, and confront the expense of acquiring a corpus of data to mine. In this article I discuss the problem, and propose a partial solution which, while far from ideal, could allow these practitioners to begin.
Text Mining: the Cost of Getting Started
I attended the University of Michigan's "Beyond CTRL+F: Text Mining Across the Disciplines 2016" workshop on February 1, 2016. I want to thank the University of Michigan Libraries for organizing and hosting the event. I enjoyed it. It must have taken a great deal of work. When the workshop first came to my attention, I noticed that participants could attend at no charge. This was too good to be true. Working at a state university in the bankrupt state of Illinois, I of course have access to no financial support for professional development activities. I happily drove to Ann Arbor and stayed overnight at my own expense, then took part in the workshop. Without the free-admission policy, I might not have gone to the event. The workshop began with a session devoted to "finding your corpus." This seemed reasonable. No one can perform text mining until they have some text. The session featured representatives of several vendors of subscription products providing access to large amounts of textual materials: ProQuest, JSTOR, Gale, Alexander Street Press (full disclosure — I edited an online product for Alexander Street Press and have cashed their checks) and several others. It dawned on me that the no-charge policy resulted, of course, from these vendors' sponsorship of the event. As sponsors, they enjoyed the opportunity to pitch their products to members of a captive audience who had expressed an interest in text mining. Vendor representatives described how scholars and students might use their products for text-mining projects. They presented an impressive set of resources, but they did emphasize that library users were not simply to bring up one of their databases and begin to download the very large bodies of text they wanted to use. Vendors of online library resources typically offer their products for subscription with the proviso that library patrons not use them too much. From a vendor's point of view, a database user might download a very large amount of text and then turn around and put it on the web for free use. Thus, they monitor their product's use, and terminate access if they detect that a patron is downloading too much material. Vendor representatives at the Ctrl+F event explained that their policies direct prospective text miners to use their products to discover potentially suitable text materials, then submit a request for a specific corpus, which they will then prepare and deliver for an extra fee in the range of $500-$1,000. This made something very apparent to me: text mining is in many cases only practicable at its intended scale at institutions commanding the financial resources necessary to 1) subscribe to these products, and 2) go on to pay the additional fee.
Of course Open Access entities like HathiTrust make text materials available at the scale required for text mining activities at no cost, but it is important to recognize that vendors of subscription-based products like those discussed at the Ctrl+F event also represent a major source of text materials that scholars will likely find very attractive. I noticed that a significant number of scholars employed at institutions well outside the vendors' target audience of university libraries with budgets allowing them to purchase or subscribe to high-cost digital resources in the humanities attended the "Beyond Ctrl+F" event. Those with whom I conversed often emphasized that they were happy to attend such an introductory-level event hosted by a major institution of high reputation. It offered an opportunity to get oriented in the field, to get started in the work. I suspect that a number of these individuals must have reached the same conclusion that I did: "I can only do this if I can find text available at no charge. I must direct my research toward questions that can be answered by reference to free-use data alone."
My Experience
I attended the Ctrl+F event as a digital humanities professional responsible for the encouragement and support of activities like text mining at my university. I am also a scholar of nineteenth and early twentieth century American intellectual and political history. I am interested in language and rhetoric in American political development. More specifically, I am interested in how Americans have talked about the federal government. What did they have to say about its scope of activity? How might Americans have understood what it did, or did not do? What language did they use to argue for more, or less, government involvement in the American economy and society? Did their language reflect the influence of major intellectual traditions like liberalism and republicanism in political thought, or perhaps romanticism and sensibility in literature and culture? I turned to speeches and debates in Congress as a good source of arguments for and against specific state activities. This led me to the Congressional Record, a very large set of text that is available in a searchable text format from several sources. The Library of Congress' A Century of Lawmaking for a New Nation web site provides free access to full-text versions of the Congressional Record beginning with the year 1995. I needed access to full-text versions of the record from the nineteenth century. This led me to ProQuest Congressional, a subscription product providing a variety of Congressional materials. Unfortunately, my university library's subscription to ProQuest Congressional did not include materials from the Congressional Record before 1985. When our Acquisitions Department contacted ProQuest to inquire about the matter they learned that we might purchase the back file materials for the nineteenth-century Congressional Record for a one-time payment of approximately $25,000. This was an all-or-nothing proposition: purchase the entire back file, or purchase nothing. ProQuest's price was a complete non-starter at my financially strapped university. I asked librarians at several institutions with large library budgets if they might acquire materials for me, in effect providing an inter-library loan, but found that vendors' contracts restrict use to individuals defined as members of an individual institution's user community.
I attempted to resolve my problem by asking vendors if they would sell me my preferred chunk of data by itself (the Congressional Record, 1873-1896), rather than an entire database product or back file, at a more reasonable price. ProQuest declined to negotiate, but Hein Online (another vendor of digitized government documents) agreed. I bought, at my own expense, the text of the Congressional Record for the period 1873-1896 for a price I could accept. I now have it available for research. Upon completing this transaction, I discovered that the University of North Texas Libraries, which present a digitized version of the entire Congressional Record, would provide me with their uncorrected text data at no charge. I thank the University of North Texas Libraries for the use of their data, and recommend them to other students and scholars. Their collections include a large number of digitized Texas newspapers, as well as records of the Federal Communication Commission. However, like other not-for-profit providers of text data, North Texas offered uncorrected copy. With two versions of the same data in hand, I may have an opportunity to compare the results they produce in text-mining work. In any event, corrected text is clearly more useful than uncorrected materials.
The Vendors' Perspective
As I pondered the situation, I tried to take ProQuest's point of view. I understand that most library vendors are private concerns and need to make a profit for their investors. Their representatives sell that product in order to earn a living. Nevertheless, the Congressional Record is a government publication available at no charge in libraries and other depositories of federal materials. How could ProQuest charge so much for the use of it? I imagined that from ProQuest's perspective, they are not selling access to a government publication in the public domain. They are selling access to a value-added version of it: a digitized, full-text searchable version of the materials available in an online format. Their costs include funds devoted to the initial digitization of materials originally published in an analog format; the markup and other technical work required to prepare the text for use with a search engine; the storage and preservation of the materials on a technical infrastructure requiring maintenance and upgrades; and the online service of the digital materials themselves, again on an infrastructure requiring maintenance and regular upgrades. Of these costs, those devoted to digitization itself deserve specific discussion. Many librarians and humanities scholars have taken some part in the digitization of materials at some point in their career. Experience with the process reveals that the various software products that convert type-set, analog materials to a digital format are far from foolproof. They often produce enough errors to compromise the materials' usefulness, at least to some degree. This is especially true of older materials, in which ink has often faded and pages have yellowed with age. In my experience nineteenth-century materials digitized from an analog format usually have a very high error rate. I examined a small sample of ProQuest's Congressional Record materials, which they courteously provided me. It contained a very small number of scanning errors, significantly fewer than those found in the portion of the UNT data that I reviewed, and about the same as the Hein materials.
I tentatively determined that in my case vendors provide access to better text than that available for free. If a researcher were to attempt to bring the Open Source data up to the quality of the ProQuest materials, s/he would have to find a way to fix many of the errors in it, most likely by using a script that finds and replaces common scanning errors in a document. In my experience most humanities scholars and students cannot write search and replace scripts, nor do they know how to find them online, ready to use, and implement them in ways that many technologists and programmers do. I certainly do not. Most libraries and medium-sized and smaller institutions with limited resources lack access to this type of technical expertise. Thus, when Hein and ProQuest charge fees for materials in the public domain, they charge for access to more accurate digitized text.
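For what it is worth, a script of the kind just described can be quite short. The sketch below uses Python's standard re module; the correction patterns and file names are purely hypothetical illustrations of common OCR confusions, not corrections derived from the ProQuest, Hein, or UNT data discussed here:

```python
import re

# Illustrative OCR confusions only; a real list would be built by inspecting the corpus.
CORRECTIONS = [
    (r"\btbe\b", "the"),                 # 'h' misread as 'b'
    (r"\bOongress\b", "Congress"),       # 'C' misread as 'O'
    (r"\bgovernrnent\b", "government"),  # 'm' misread as 'rn'
]

def clean(text: str) -> str:
    """Apply each find-and-replace pattern to the text in turn."""
    for pattern, replacement in CORRECTIONS:
        text = re.sub(pattern, replacement, text)
    return text

# Hypothetical input and output file names.
with open("congressional_record_1873.txt", encoding="utf-8") as src:
    cleaned = clean(src.read())
with open("congressional_record_1873_clean.txt", "w", encoding="utf-8") as out:
    out.write(cleaned)
```

The hard part, of course, is not the script itself but compiling a reliable list of patterns without introducing new errors, which is one reason this kind of work tends to require exactly the technical expertise that under-resourced institutions lack.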
Preservica executives changed their position, instituting a transparent, online pricing policy and devising versions of their product priced to suit more modest budgets. I want to suggest that vendors of large sets of humanities text materials do the same.

My Recommendation

I suggest that vendors of library database products recognize that they can contribute to future scholarship, ease a major, obvious inequity in the field, and, perhaps, find a new source of revenue by making chunks of text data available for sale on an à la carte basis. In many cases, this would require them to offer libraries that do not subscribe to their products a free trial period so that researchers might identify materials of interest. It would also require the additional administrative work involved in processing a larger number of transactions for smaller amounts of funds than those to which they are accustomed. I understand that vendors will raise these objections, but I believe they should investigate this potential sales model in a systematic fashion and determine whether they can earn profits with it.

I submit that vendors would not need to understand this approach as a charity measure. I suspect that purveyors of large, online humanities text databases may well confront a situation similar to the one the Digital POWRR team perceived in Preservica's case. Once they have sold their products to the limited number of institutions able to afford them, where do they find growth? Of course they can grow by introducing new products, but do they not want to find revenue growth in legacy products as well?

Representatives of a number of vendors may reply to this observation by noting that they price their products on the basis of an institution's number of full-time enrolled students, or offer access to a limited number of simultaneous logins, measures that can help a smaller institution. This is not enough. It may prove to be a benefit to smaller institutions to some degree, but it is only a partial measure. It certainly does not help cases like mine — a large institution lacking the budget to buy even these versions of products — and there are many such institutions. If vendors do not recognize and respond to the market made up of medium-sized and smaller institutions of lesser financial means, I fear that they will make a powerful contribution to the perpetuation of the existing situation: students and scholars at the wealthiest colleges and universities can do text-mining work with access to very large collections of suitable materials, while others may never find their corpus. Those vendors will also, in my estimation, leave money on the table. Even if they cannot earn any profit from this type of sale, it may be worthwhile for them to sell materials at a modest loss in order to earn the trust and goodwill of the scholars, librarians, and other practitioners populating the digital humanities. I ask vendors to consider the above proposition, and digital humanists and librarians at institutions of all sizes and financial conditions to raise these issues associated with access to their materials with vendors' sales representatives.

Acknowledgements

The author thanks Jim Millhorn of Northern Illinois University Libraries and Alix Keener of the University of Michigan Libraries for help in gathering information for this article.

About the Author

Drew E. VandeCreek is Director of Digital Scholarship and Co-Director of the Digital Convergence Lab at Northern Illinois University Libraries. He holds a Ph.D.
in American History from the University of Virginia. He has secured funding for and directed the development of a number of online resources exploring nineteenth-century American history, available from the University Libraries Digital Collections.

Copyright © 2016 Drew E. VandeCreek

work_hnpz6ksevfbafby5h5rdw6mhdq ---- Microsoft Word - WorkingDH_WHKChun_LMRhody.docx

[Note: The following is the full text of an essay published in differences 25.1 (2014) as part of a special issue entitled In the Shadows of the Digital Humanities edited by Ellen Rooney and Elizabeth Weed. Duke UP's publishing agreements allow authors to post the final version of their own work, but not using the publisher's PDF. The essay as you see it here is thus a standard PDF distinct from that created by Duke UP. Subscribers, of course, can also read it in the press's published form direct from the Duke UP site. Other than accidentals of formatting and pagination this text should not differ significantly from the published one. If there are discrepancies they are likely the result of final copy edits and the exchange between the differences style guide and our standardized format. This article is copyright © 2014 Duke University Press.]

Citation: Volume 25, Number 1, doi 10.1215/10407391-2419985. © 2014 by Brown University and differences: A Journal of Feminist Cultural Studies.

Working the Digital Humanities: Uncovering Shadows between the Dark and the Light

Wendy Hui Kyong Chun and Lisa Marie Rhody

The following is an exchange between the two authors in response to a paper given by Chun at the "Dark Side of the Digital Humanities" panel at the 2013 Modern Language Association (mla) Annual Convention. This panel, designed to provoke controversy and debate, succeeded in doing so. However, in order to create a more rigorous conversation focused on the many issues raised and elided and on the possibilities and limitations of digital humanities as they currently exist, we have produced this collaborative text. Common themes in Rhody's and Chun's responses are: the need to frame digital humanities within larger changes to university funding and structure, the importance of engaging with uncertainty and the ways in which digital humanities can elucidate "shadows" in the archive, and the need for and difficulty of creating alliances across diverse disciplines. We hope that this text provokes more ruminations on the future of the university (rather than simply on the humanities) and leads to more wary, creative, and fruitful engagements with digital technologies that are increasingly shaping the ways and means by which we think.

Part 1
The Digital Humanities: A Case of Cruel Optimism? (Chun)

What follows is the talk given by Wendy Chun on January 4, 2013, at the mla convention in Boston. It focuses on a paradox between the institutional hype surrounding DH and the material work conditions that frequently support it (adjunct/soft money positions, the constant drive to raise funds, the lack of scholarly recognition of DH work for promotions). Chun calls for scholars across all fields to work together to create a university that is fair and just for all involved (teachers, students, researchers). She also urges us to find value in what is often discarded as "useless" in order to take on the really hard problems that face us.
I want to start by thanking Richard Grusin for organizing this roundtable. I'm excited to be a part of it. I also want to start by warning you that we've been asked to be provocative, so I'll use my eight minutes here today to provoke: to agitate and perhaps aggravate, excite and perhaps incite.

For today, I want to propose that the dark side of the digital humanities is its bright side, its alleged promise—its alleged promise to save the humanities by making them and their graduates relevant, by giving their graduates technical skills that will allow them to thrive in a difficult and precarious job market. Speaking partly as a former engineer, this promise strikes me as bull: knowing gis (geographic information systems) or basic statistics or basic scripting (or even server-side scripting) is not going to make English majors competitive with engineers or cs (computer science) geeks trained here or increasingly abroad. (*Straight-up programming jobs are becoming increasingly less lucrative.*)

But let me be clear: my critique is not directed at DH per se. DH projects have extended and renewed the humanities and revealed that the kinds of critical thinking (close textual analysis) in which the humanities have always been engaged are, and have always been, central to crafting technology and society. DH projects such as Feminist Dialogues in Technology, a distributed online cooperative course that will be taught in fifteen universities across the globe, and other similar courses that use technology not simply to disseminate but also to cooperatively rethink and regenerate education on a global scale—these projects are central. In addition, the humanities should play a big role in big data, not simply because we're good at pattern recognition (because we can read narratives embedded in data) but also, and more importantly, because we can see what big data ignores. We can see the ways in which so many big data projects, by restricting themselves to certain databases and terms, shine a flashlight under a streetlamp.

I also want to stress that my sympathetic critique is not aimed at the humanities, but at the general euphoria surrounding technology and education. That is, it takes aim at the larger project of rewriting political and pedagogical problems into technological ones, into problems that technology can fix. This rewriting ranges from the idea that moocs (massive open online courses), rather than a serious public commitment to education, can solve the problem of the spiraling costs of education (moocs that enroll but don't graduate; moocs that miss the point of what we do, for when lectures work, they work because they create communities, because they are, to use Benedict Anderson's phrase, "extraordinary mass ceremonies") to the blind embrace of technical skills. To put it as plainly as possible: there are a lot of unemployed engineers out there, from forty-something assembly programmers in Silicon Valley to young kids graduating from community colleges with cs degrees and no jobs. Also, there's a huge gap between industrial skills and university training. Every good engineer has to be retaught how to program; every film graduate, retaught to make films.

My main argument is this: the vapid embrace of the digital is a form of what Lauren Berlant has called "cruel optimism." Berlant argues, "[A] relation of cruel optimism exists when something you desire is actually an obstacle to your flourishing" (1).
She emphasizes that optimistic relations are not inherently cruel, but become so when "the object that draws your attachment actively impedes the aim that brought you to it initially." Crucially, this attachment is doubly cruel "insofar as the very pleasures of being inside a relation have become sustaining regardless of the content of the relation, such that a person or world finds itself bound to a situation of profound threat that is, at the same time, profoundly confirming" (2).

So, the blind embrace of DH (*think here of Stanley Fish's "The Old Order Changeth"*) allows us to believe that this time (once again) graduate students will get jobs. It allows us to believe that the problem facing our students and our profession is a lack of technical savvy rather than an economic system that undermines the future of our students. As Berlant points out, the hardest thing about cruel optimism is that, even as it destroys us in the long term, it sustains us in the short term. DH allows us to tread water: to survive, if not thrive. (*Think here of the ways in which so many DH projects and jobs depend on soft money and the ways in which DH projects are often—and very unfairly—not counted toward tenure or promotion.*) It allows us to sustain ourselves and to justify our existence in an academy that is increasingly a sinking ship.

The humanities are sinking—if they are—not because of their earlier embrace of theory or multiculturalism, but because they have capitulated to a bureaucratic technocratic logic. They have conceded to a logic, an enframing (*to use Heidegger's term*), that has made publishing a question of quantity rather than quality, so that we spew forth mpus or minimum publishable units; a logic, an enframing, that can make teaching a burden rather than a mission, so that professors and students are increasingly at odds; a logic, an enframing, that has divided the profession and made us our own worst enemies, so that those who have jobs for life deny jobs to others—others who have often accomplished more than they (than we) have.

The academy is a sinking ship—if it is—because it sinks our students into debt, and this debt, generated by this optimistic belief that a university degree automatically guarantees a job, is what both sustains and kills us. This residual belief/hope stems from another time, when most of us couldn't go to university, another time, when young adults with degrees received good jobs not necessarily because of what they learned, but because of the society in which they lived.

Now, if the bright side of the digital humanities is the dark side, let me suggest that the dark side—what is now considered to be the dark side—may be where we need to be. The dark side, after all, is the side of passion. The dark side, or what has been made dark, is what all that bright talk has been turning away from (critical theory, critical race studies—all that fabulous work that #TransformDH is doing). This dark side also entails taking on our fears and biases to create deeper collaborations with the sciences and engineering. It entails forging joint (frictional and sometimes fractious) coalitions to take on problems such as education, global change, and so on. It means realizing that the humanities don't have a lock on creative or critical thinking and that research in the sciences can be as useless as research in the humanities—and that this is a good thing. It's called basic research.
It also entails realizing that what's most interesting about the digital in general is perhaps not what has been touted as its promise, but rather, what's been discarded or decried as its trash. (*Think here of all those failed DH tools, which have still opened up new directions.*) It entails realizing that what's most interesting is what has been discarded or decried as inhuman: rampant publicity, anonymity, the ways in which the Internet vexes the relationship between public and private, the ways it compromises our autonomy and involves us with others and other machines in ways we don't entirely know and control. (*Think here of the constant and promiscuous exchange of information that drives the Internet, something that is usually hidden from us.*) As Natalia Cecire has argued, DH is best when it takes on the humanities, as well as the digital. Maybe, just maybe, by taking on the inhumanities, we'll transform the digital as well. Thank you.

The sections in asterisks are either points implied in my visuals or in the talk, which I have elaborated upon in this written version.

Part 2
The Digital Humanities as Chiaroscuro (Rhody)

Taking as a point of departure your thoughtful inversion of the "bright" and "dark" sides of the digital humanities, I want to begin by revisiting the origin of those terms as they were born out of rhetoric surrounding the 2009 mla Annual Convention, when academic and popular news outlets seemed first to recognize digital humanities scholarship and, in turn, to celebrate it against a dreary backdrop of economic recession and university restructuring. Most frequently, such language refers to William Pannapacker's Chronicle of Higher Education blog post on December 28, 2009, in which he writes:

Amid all the doom and gloom of the 2009 mla Convention, one field seems to be alive and well: the digital humanities. More than that: Among all the contending subfields, the digital humanities seem like the first "next big thing" in a long time, because the implications of digital technology affect every field. I think we are now realizing that resistance is futile. One convention attendee complained that this mla seems more like a conference on technology than one on literature. I saw the complaint on Twitter. ("MLA")

Of course, Pannapacker's relationship to digital humanities has changed since his first post. In a later Chronicle blog entry regarding the 2012 mla Annual Convention, Pannapacker walked back his earlier characterization of the digital humanities, explaining: "I regret that my claim about DH as the nbt—which I meant in a serious way—has become a basis for a rhetoric that presents it as some passing fad that most faculty members can dismiss or even block when DH'ers come up for tenure" ("Come-to-DH"). Unfortunately for the public's perception of digital humanities, the provocativeness of Pannapacker's earlier rhetoric continues to receive much more attention than the retractions he has written since.

In 2009, though, Pannapacker was reacting to the "doom and gloom" with which a December 17 New York Times article set the stage for the mla Annual Convention by citing dismal job prospects for PhD graduates. The Times article begins with a sobering statistic: "faculty positions will decline 37 percent, the biggest drop since the group began tracking its job listings 35 years ago" (Lewin).
Pannapacker, though, wasn’t the first one who called digital humanities a “bright spot.” That person was Laura Mandell, in her post on the Armstrong Institute for Interactive Media Studies (aims) blog on January 13, 2010, just following the conference: “Digital Humanities made the news: these panels were considered to be the one bright spot amid ‘the doom and gloom’ of a fallen economy, a severely depressed job market, and the specter of university-restructuring that will inevitably limit the scope and sway of departments of English and other literatures and languages” (“Digital”). In neither her aims post nor in her mla paper does Mandell support a “vapid embrace of the digital” or champion digital humanities as a solution to the sense of doom and gloom in the academy. Rather, in both, Mandell candidly and openly contends with one of the greatest challenges to digital humanities work: collaboration. The “brightness” surrounding digital humanities at the 2009 MLA convention was based on the observation that DH and media studies panels drew such high attendance because they focused on long-standing, unresolved issues not just for digital humanities but for the study of literature and language at large. For example, in Mandell’s session, “Links and Kinks in the Chain: Collaboration in the Digital Humanities”—a session presided over by Tanya Clement (University of Maryland, College Park) and that also included Jason B. Jones (Central Connecticut State University), Bethany Nowviskie (Neatline, University of Virginia), Timothy Powell (Ojibwe Archives, University of Pennsylvania), and Jason Rhody (National Endowment for the Humanities [NEH])—presenters addressed the challenges and cautious optimism that scholarly collaboration in the context of digital humanities projects requires.1 Liz Losh’s reflections on the panel recall a perceived consensus that collaboration is hard enough that one might be tempted to write it off as a fool’s errand, as Nowviskie’s tongue-in-cheek use of an image titled “The Ministry of Silly Walks” (borrowed from a Monty Python skit) implied. But neither Nowviskie’s nor Mandell’s point was to stop trying; quite the opposite, their message was that collaboration takes hard work, patience, revisions to existing assumptions about academic status, and a willingness to compromise when the stakes feel high. As Mandell recalls in her post: “[M]y deep sense of it is that we came to some conclusions (provisional, of course). Digital   7   Humanists, we decided, are concerned to protect the openness of collaboration and intellectual equality of participants in various projects while insuring the professional benefits for those contributors whose positions within academia are not equal (grad students, salaried employees, professors)” (“Digital”). That is a tall order, especially because digital humanities scholarship unsettles deeply rooted institutional beliefs about how humanists do research. If the digital humanities in 2009 seemed “bright,” it was in large part because it refocused collective attention around issues that vexed not just digital humanists but their inter-/ trans-/ multi-disciplinary peers, those Julia Flanders is noted for having called “hybrid scholars,” a term not limited to digital humanists. 
Furthermore, across the twenty-seven sessions at the conference that might be considered digital humanities or media studies related, most addressed, at least in a tangential way, issues related to working across institutional barriers.2 In other words, the bright optimism of 2009 for digital humanists was not that of economic recovery, employment solutions, and technological determinism, but of consensus building and renewed attention to long-standing institutional barriers. One takeaway from the 2009 mla panels is also a collective sense of strangeness in claiming "digital humanities" as a name when it draws together such a diversity of humanities scholars with so many different research agendas under a common title—an unease that, perhaps, may be attributed to the chosen theme of the Digital Humanities 2011 conference, "Big Tent Digital Humanities." What the four years since the "Links and Kinks" panel have proven is that its participants were right: collaboration, digital scholarship, and intellectual equality are really hard, and no, we haven't come up with solutions to those challenges yet.

Reorienting the bright side/dark side debate away from the provocativeness of its media hype and back toward the spirit of creating consensus around long-standing humanities concerns, I would like to suggest that the "dark side" of digital humanities is that we are still struggling with issues that we began calling attention to even earlier than 2009: effectively collaborating within and between disciplines, institutions, and national boundaries; reorienting a deeply entrenched academic class structure; recovering archival silences; and building a freer, more open scholarly discourse. Consequently, a distorted narrative that touts digital humanities as a "bright hope" for overcoming institutional, social, cultural, and economic challenges has actually made it harder for digital humanities to continue acting as a galvanizing force among hybrid scholar peers and to keep the focus on shared interests because such rhetoric falsely positions digital humanities and the "rest" of humanities as if they're in opposition to one another.

DH and Technological Determinism

Moving beyond the "bright/dark" dichotomy is in part complicated by the popular complaint first levied against digital humanities at the 2009 mla conference that "resistance is futile" and that the convention seemed to be more about technology than literature (see Pannapacker, "mla," above). Setting aside the problematic opposition between "technology" and "literature" that Pannapacker's unnamed source makes, the early euphoria over digital humanities that you call attention to in your talk is frequently linked to a sense that digital humanists have fallen victim to a pervasive technological determinism. The rhetoric of technological determinism, however, more often comes from those who consciously position themselves as digital humanities skeptics—which is in stark contrast to how early adopters in the humanities approached technology. In 1998, early technology adopters like Dan Cohen, Neil Fraistat, Alan Liu, Allen Renear, Roy Rosenzweig, Susan Schreibman, Martha Nell Smith, John Unsworth, and others didn't encourage students to learn html (HyperText Markup Language), sgml (Standard Generalized Markup Language), or tei (Text Encoding Initiative) so they could get jobs. They did it, in large part, so students could understand the precarious opportunity that the World Wide Web afforded scholarly production and communication.
Open, shared standards could ensure a freer exchange of ideas than proprietary standards, and students developed webpages to meet multiple browser specifications so that they could more fully appreciate how delicate, how rewarding, and how uncertain publishing on the Web could be in an environment where Netscape and Microsoft Internet Explorer sought to corner the market on Web browsing.3 Reading lists and bibliographies in those early courses drew heavily from the textual studies scholarship of other early adopters such as Johanna Drucker, Jerome McGann, Morris Eaves, and Joseph Viscomi, whose work had likewise long considered the material economies of knowledge production in both print and digital media.

Consider the cautious optimism that characterizes Roy Rosenzweig and Dan Cohen's 2005 Introduction to Digital History, which begins with a chapter titled "Promises and Perils of Digital History":

We obviously believe that we gain something from doing digital history, making use of the new computer-based technologies. Yet although we are wary of the conclusions of techno-skeptics, we are not entirely enthusiastic about the views of the cyber-enthusiasts either. Rather, we believe that we need to critically and soberly assess where computer networks and digital media are and aren't useful for historians—a category that we define broadly to include amateur enthusiasts, research scholars, museum curators, documentary filmmakers, historical society administrators, classroom teachers, and history students at all levels [. . .]. Doing digital history well entails being aware of technology's advantages and disadvantages, and how to maximize the former while minimizing the latter. (18)

In other words, digital history, and by extension digital humanities, grew out of a thoughtful and reflective awareness of technology's potential, as well as its dangers, and not a "vapid embrace of the digital." Moreover, the earliest convergence between scholars of disparate humanities backgrounds coalesced most effectively and openly in resistance to naive technological determinism.

Anxiety, however, creeps into conversations about digital humanities with phrases like "soon it won't be the digital humanities [. . .] it will just be the humanities." Used often enough that citing every occasion would be impossible, such a phrase demonstrates and fuels a fear that methods attributed to digital humanities will soon be the only viable methods in the field, and that's simply not true. And yet, unless there is a core contingent of faculty who continue to distribute their work in typed manuscripts and consult print indexes of periodicals that I don't know about, everyone is already a digital humanist insofar as it is a condition of contemporary research that we must ask questions about the values, technologies, and economies that organize and redistribute scholarly communication—and that is and always has been a fundamental concern within the field of digital humanities since before it adopted that moniker and was called merely "humanities computing."4

DH and moocs

Related to concerns over technological determinism is an indictment that digital humanities has given way to a "vapid embrace of the digital" as exemplified by universities' recent love affair with moocs. You describe the moocification of higher education very well as the desire to "rewrit[e] political and pedagogical problems into technological ones, into problems that technology can fix.
This rewriting ranges from the idea that moocs, rather than a serious public commitment to education, can solve the problem of the spiraling cost of education [. . .] to the blind embrace of technological skills." Digital humanists who have dared to tread on this issue most often do so with highly qualified claims that higher education, too, requires change. For example, Edward Ayers's article in the Chronicle, "A More-Radical Online Revolution," contends that if an effective online course is possible, it is only so when the course reorients its relationship to what knowledge production and learning really are. He points out that technology won't solve the problem, but learning to teach better with technology might help. Those two arguments are not the same. The latter acknowledges that we have to make fundamental changes in the way we approach learning in higher education—changes that most institutions celebrating and embracing moocs are unwilling to commit to by investing in human labor. In solidarity with Ayers's cautious optimism are those like Cathy Davidson, who has often made the point that moocs are popular with university administrators because they are the least disruptive to education models that find their roots in the industrial revolution—and conversely this is why most digital humanists oppose them.

DH and Funding

Another challenge presented by the specter of media attention to the field of digital humanities has been the perception that it draws on large sums of money otherwise inaccessible to the rest of humanities researchers. Encapsulating the "cruel optimism" you identify as described by Lauren Berlant, hopeful academic administrations may once have seen digital humanities research as having access to seemingly limitless pools of money—an assumption that creates department and college resentments. But there's a reality check that needs to happen, both on the part of hopeful administrations and on the part of frustrated scholars: funding overall is scarce. Period. Humanists are not in competition with digital humanists for funding: humanists are in competition with everyone for more funding. For example, since 2010, the National Endowment for the Humanities (neh) budget has been reduced by 17 percent. In its Appropriations Request for Fiscal Year 2014, the neh lists the 2012 Office of Digital Humanities (odh) actual budget at $4,143,000. In other words, odh—the neh division charged with funding digital research in the humanities—controls the smallest budget of any division in the agency, by a margin of $9 to $10 million (National Endowment 13; see table 1 at the end of this article).

Since most grants from odh are institutional grants as opposed to individual grants (such as fellowships or summer stipends), a substantial portion of each odh award is absorbed by the sponsoring institution in order to offset "indirect costs." When digital humanities centers and their institutions send out celebratory announcements about how they just received a grant for a digital humanities project for x number of dollars, only a fraction of that money actually goes to directly support the project in question. Anywhere between 25 and 55 percent of digital humanities grant funds are absorbed by the institution to "offset" what are also referred to as facilities and administrative—f&a—costs, or overhead.
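To make the arithmetic concrete, consider a minimal sketch that assumes the simplest common arrangement, in which the negotiated f&a rate is applied to direct costs. The award amount and rates below are hypothetical, and actual agreements vary in both the rate and the cost base to which it is applied.

def award_breakdown(total_award, fa_rate):
    # Assumes the F&A rate is applied to direct costs, so that
    # total = direct + (fa_rate * direct). Real agreements vary.
    direct = total_award / (1 + fa_rate)
    indirect = total_award - direct
    return direct, indirect

# A hypothetical $100,000 award under two illustrative rates.
for rate in (0.25, 0.55):
    direct, indirect = award_breakdown(100_000, rate)
    print(f"F&A rate {rate:.0%}: ${direct:,.0f} direct, ${indirect:,.0f} indirect")

However the rate is reckoned against a particular cost base, the point stands: the portion of the headline award that actually reaches the project is visibly smaller than the announcement suggests.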
Indirect cost rates are usually negotiated once each year between the individual academic institutions and a larger federal agency (think Department of Defense, Environmental Protection Agency, National Institutes of Health, National Aeronautics and Space Administration, or Department of the Navy), and they are presumably used to support lab environments for stem-related disciplines (science, technology, engineering, and mathematics). Whatever the negotiated cost rate at each institution, that same rate is then applied to all other grant recipients from the same institution who receive federal funds, regardless of discipline. While specialized maintenance personnel, clean rooms, security, and hazard insurance might be necessary to offset costs to the institution to support a stem-related research project, it is unclear to what extent digital humanities projects benefit from these funds. Thus, while institutions are excited to promote, publicize, and even support digital humanities grant applications (bright side), that publicity simultaneously casts long shadows, obscuring from public view the reality that the actual dollar amount that goes directly to support DH projects is significantly reduced. If we really wanted to get serious about exploring the shadows of digital humanities research, we might begin by asking probative questions about where those indirect costs go and how they are used.

In fact, as Christopher Newfield points out in "Ending the Budget Wars: Funding the Humanities during a Crisis in Higher Education," more of us humanists should be engaging in a healthy scrutiny of our institutions' budgets. Newfield points out that academic administrations have been milking humanities departments for quite a long time without clear indication of where income from humanities general education courses actually goes:

First we must understand that though the humanities in general and literary studies in particular are poor and struggling, we are not naturally poor and struggling. We are not on a permanent austerity budget because we don't have the intrinsic earning power of the science and engineering fields and aren't fit enough to survive in the modern university. I suggest, on the basis of a case study, that the humanities fields are poor and struggling because they are being milked like cash cows by their university administrations. The money that departments generate through teaching enrollments that the humanists do not spend on their almost completely unfunded research is routinely skimmed and sent elsewhere in the university. As the current university funding model continues to unravel, the humanities' survival as national fields will depend on changing it. (271)

Lack of clarity about where money absorbed by academic institutions as indirect costs ends up is linked to a much wider concern about whether or not humanities departments really should be as poor and struggling as they are. Here is an opportunity in which we could use the so-called celebrity status of digital humanities to cast new light on the accounting, budgeting, and administrating of humanities colleges in general, to the benefit of faculty and researchers regardless of their research methods.

DH and Collaboration

The topic of money, however, returns us to the complicated constellation of issues that accompany collaboration.
Barriers to collaboration, as Mandell, Nowviskie, Powell, Jones, and Rhody discussed in 2009, are less a matter of fear or bias against collaborating with the sciences or engineering than they might have been in the past. As it turns out, though, collaboration across institutional boundaries is hard because financing it is surprisingly complex and often insufficient. In 2009, the Digging into Data Challenge announced its first slate of awardees. Combining the funds and efforts of four granting agencies (jisc [Joint Information Systems Committee], neh, nsf [National Science Foundation], and sshrc [Social Sciences and Humanities Research Council]), Digging into Data grants focused on culling resources, emphasizing collaboration, and privileging interdisciplinary research efforts—all valuable and laudable goals. In a follow-up report (unfortunately named) One Culture: Computationally Intensive Research in the Humanities and Social Sciences: A Report on the Experiences of First Respondents to the Digging into Data Challenge, however, participants identify four significant challenges to their work: funding, time, communication, and data (Williford and Henry). In other words, just about everything it takes to collaborate presents challenges.

The question is, though, what have we been able to do to change this? How well have we articulated these issues to those who don't call themselves digital humanists in ways that make us come together to advocate for better funding for all kinds of humanities research, rather than constantly competing with one another to grab a bigger piece of a disappearing pie? The frustrating part in all of this is that we know collaboration is hard. We want to bridge communities within the humanities, across to social science and stem disciplines, and even across international, cultural, and economic divides. Unless we really set to work on deeper issues like revising budgets, asking pointed questions about indirect cost rates, and figuring out how to communicate across disciplines, share data, and organize our collective time, four years from now we will still be asking the same questions.

DH and Labor

Finally, there are other "shadows" in the academy where digital humanists have been hard at work. While no one in the digital humanities really believes that technical skills alone will prepare anyone for a job, important work by digital humanists has helped reshape the discourse around labor and employment in academia. For example, Tanya Clement and Dave Lester's neh-funded white paper "Off the Tracks: Laying New Lines for Digital Humanities Scholars" brought together digital humanities practitioners to consider career trajectories for humanities PhDs employed to do academic work in nontenure, often contingent university positions. Similarly, groups such as DH Commons, an initiative supported by a coalition of digital humanities centers called centerNet, put those interested in technology and the humanities in contact with other digital humanities practitioners through shared interests and needs. "Alt-Academy," a MediaCommons project, invites, publishes, and fosters dialogue about the opportunities and risks of working in academic posts other than traditional tenure-track jobs.

While none of these projects could be credited with "finding jobs" for PhDs, per se, they are demonstrations of the ways digital humanities practitioners have made academic labor a central issue to the field.
Worth noting: all of these projects have come to fruition since 2009 and in response to concerns about labor issues, recognition, and credit in a stratified academic class structure. And yet, none of these approaches on their own are solutions. There are still more people in digital humanities who are in contingent, nontenure-track positions than there are in tenure-track posts. A heavy reliance on soft funding continues to fuel an academic class structure in which divisions persist between tenure-track and contract faculty and staff—divisions that seem to be reinscribed along lines of gender and race difference. As long as these divisions of labor remain unsatisfactorily addressed, it promises to dim the light of a field that espouses the value of "intellectual equality" (Mandell). Even though recent efforts by the Scholarly Communication Institute (sci) (an Andrew W. Mellon Foundation–supported initiative) have not answered long-standing questions of contingent academic labor and placement of recent PhDs in the humanities, efforts to survey current alternative academic (alt-ac) professionals and to build a network of digital humanities graduate programs through the Praxis Network constitute important steps toward addressing these widely acknowledged problems across a spectrum of humanities disciplines. As a field, digital humanities has not promised direct avenues to tenure-track jobs or even alt-ac ones; however, digital humanities is a community of practice that, born out of an era of decreasing tenure-track job openings and rhetoric about the humanities in crisis, has worked publicly to raise awareness and improve dialogue that identifies, recognizes, and rewards intellectual work by scholars operating outside traditional tenure-track placements.

DH Silences and Shadows

I agree that what is truly bright about the digital humanities is that it has drawn from passion in its critical, creative, and innovative approaches to persistent humanities questions. For example, I look at the work of Lauren Klein, whose 2012 mla paper was one of four that addressed the archival silences caused by slavery. Klein's paper responded directly to Alan Liu's call to "reinscribe cultural criticism at the center of digital humanities work" ("Where Is?"). Her computational methods explore the silent presence of James Hemings in the archived letters of Thomas Jefferson:

To be quite certain, the ghost of James Hemings means enough. But what we can do is examine the contours that his shadow casts on the Jefferson archive, and ask ourselves what is illuminated and what remains concealed. In the case of the life—and death—of James Hemings, even as we consider the information disclosed to us through Jefferson's correspondence, and the conversations they record—we realize just how little about the life of James Hemings we will ever truly know. ("Report")

Klein proposes one possible way in which we might integrate race, gender, and postcolonial theory with computer learning to develop methodologies for performing research in bias-laden archives, whereby we can expose and address absences. Still, while we have become more adept at engaging critical theory and computation in our scholarship, we have spent little of that effort constructing an inclusive, multivalent, diverse, and self-conscious archive of our own field as it has grown and changed.
The shadows and variegated terrain of the digital humanities, this odd collection of "hybrid scholars," is much more complicated, as one might expect, than the bright/dark binary by which it is too often characterized. Recovering the histories of DH has proven difficult. Jacqueline Wernimont made this point famously well in a paper she delivered at DH2013 and in a forthcoming article in Digital Humanities Quarterly (dhq). Wernimont explains that characterizing any particular project as feminist is difficult to do: "The challenges arise not from a lack of feminist engagement in digital humanities work, quite the opposite is true, but rather in the difficulty tracing political, ideological, and theoretical commitments in work that involves so many layers of production." Put simply: the systems and networks from which DH projects arise are wickedly complex. Perhaps a bit more contentiously: the complexity of those networks has enabled narratives of digital humanities to evolve that elide feminist work that has been foundational to the field. Wernimont's claim runs contrary to the impulse to address through provocation the sobering challenges that confront the digital humanities. Rather than claiming that "no feminist work has been done in DH," Wernimont engages productively with the multifaceted work conditions that have led to our understanding of the field.

As you suggest at the tail end of your talk, we often claim to "celebrate failures," but it is unclear to what extent we follow through on that intent. Despite John Unsworth's 1997 insistence in "Documenting the Reinvention of Text: The Importance of Failure" that we make embracing failure a disciplinary value, we very rarely do it. Consequently, we have riddled our discipline's own archive with silences about our work process, our labor practices, our funding models, our collaborative challenges, and even our critical theory. As a result, we have allowed the false light of a thriving field alive with job opportunities, research successes, and technological determinism to seep into those holes. In other words, we have not done what we as humanists should know better than to do: we have not told our own story faithfully.

Even so, recent events have demonstrated important steps to improving transparency in digital humanities. This summer at the DH2013 conference, Quinn Dombrowski did what few scholars are willing or bold enough to do. She exposed a project's failure in a talk titled "Whatever Happened to Project Bamboo?" Dombrowski recounted the challenges faced by an Andrew W. Mellon–funded cyberinfrastructure project between 2008 and 2012. Tellingly, when you go to the project's website, there is no discussion of what happened to it—whether or not it met its goals, or why, or even what institutions participated in it. There is a "documentation wiki" where visitors might review the archived project files, an "issue tracker," and a "code repository." There is even a link to the "archive" copy of the website as it existed during its funding cycle. That is it. In the face of this silence, Dombrowski provided a voice for what might be seen as the project's failure, to begin hashing through the difficulties of collaboration and the dangers of assuming what humanists want before asking them. Dombrowski's paper was welcomed by the community and celebrated as a necessary contribution to our scholarly communication practices.
Significantly, many DH projects, particularly those that receive federal funding, do have outlets for discussing their processes, management, and decisions; however, these scholarly and reflective documents are often published in places where those starting out in digital humanities are unlikely to find them. White papers, grant narratives, and project histories—informally published scholarship called gray literature—discuss significant aspects of digital humanities research, such as rationales for staffing decisions, technology choices, and even the critical theories that are foundational to a project's development. Still, gray literature is often stored or published on funders' websites or in institutional repositories. Occasionally, though less frequently, white papers may be published on a project's website. Since these publications reside outside a humanist's usual research purview, they are less likely to be found or used by scholars new to the field.

In her essay "Let the Grant Do the Talking," Sheila Brennan suggests that wider circulation of these materials would prove an important contribution to scholarship: "One way to present digital humanities work could be to let grant proposals and related reports or white papers do some of the talking for us, because those forms of writing already provide intellectual rationales behind digital projects and illustrate the theory in practice." Brennan continues by explaining that grant proposals are often heavily scrutinized by peer reviewers and provide detailed surveys of existing resources. Most federal funders require white papers that reflect upon the nature of the work performed during the grant when the grant period is over, all of which are made available to the public. While the nature of the writing differs from what one might find in a typical journal article, grant proposals and white papers address general humanities audiences. That means a body of scholarly writing already exists that addresses the history, composition, and development of a sizeable portion of digital humanities work. The challenge resides in making this writing more visible to a broader humanities audience.

Although we still have work to do to continue filling in the archival silences of digital humanities, I believe that it is a project worth the work involved. Eschewing the impulse to draw stark contrasts between digital humanities and the rest of the humanities, choosing instead to delve into the complex social, economic, and institutional pressures that a "technological euphoria" obscures, represents a promising way ahead for humanists—digital and otherwise.

Part 3
Shadows in the Archive (Chun)

First, thank you for an excellent and insightful response, for the ways you historicize the "bright side" rhetoric, take on the challenges of funding, and elaborate on what you find to be DH's dark side: your points about the silences about DH's work process, its labor practices, funding models, collaborative challenges, and critical theory are all profound. Further, your move from bright/dark to shadows is inspiring. By elaborating on the work done by early adopters and younger scholars, you show how digital humanists do not engage in a "vapid embrace of the digital." You show that the technological determinists, rather than the practicing digital humanists, are the detractors (and I would also insert here supporters).
Indeed, if any group would know the ways in which the digital humanities do not guarantee everything they are hyped to do, it is those who have for many years worked under the rubric of "humanities computing." As Liu has so pointedly argued, they have been viewed for years as servants rather than masters ("Where Is"). They know intimately the precariousness of soft money projects, the difficulty of being granted tenure for preparing rather than interpreting texts, and the ways in which teaching students markup languages hardly guarantees them jobs. For all these reasons, the "bright side" rhetoric is truly baffling—unless, of course, one considers the institutional framework within which the digital humanities has been embraced. As you point out, it has not given institutions the access to the limitless pools of money they once hoped for, but it has given them access to indirect cost recovery—something that very few humanities projects provide.5 It also gives them a link to the future. As William Gibson, who coined the term "cyberspace" before he had ever used a computer, once quipped, "[T]he future is already here—it's just not evenly distributed."

The cruel optimism I describe is thus a "vapid embrace of the digital" writ large, rather than simply an embrace of the digital humanities. One need only think back to the mid-1990s, when the Internet became a mass medium after its backbone was sold to private corporations, and to the rhetoric that surrounded it as the solution to all our problems, from racial discrimination to inequalities in the capitalist marketplace, from government oversight to the barriers of physical location. And as you note, this embrace is most pointed among those on the outside: soon after most Americans were on the Internet, the television commercials declaring the Internet the great equalizer disappeared. Stanley Fish's "The Old Order Changeth" compares DH to theory, stating, "[O]nce again, as in the early theory days, a new language is confidently and prophetically spoken by those in the know, while those who are not are made to feel ignorant, passed by, left behind, old."

Yet, your discussion of what you see as the dark side—that, because of DHers' silences, "[W]e have allowed the false light of a thriving field alive with job opportunities, research successes, and technological determinism to seep into those holes"—made me revisit Berlant again and in particular her insistence that cruel optimism is doubly cruel because it allows us to be "bound to a situation of profound threat that is, at the same time, profoundly confirming" (2). It is the confirmation—the modes of survival—that generate pleasure and make cruel optimism so cruel. Also, as Berlant emphasizes, optimism is not stupid or simple, for "often the risk of attachment taken in its throes manifests an intelligence beyond rational calculation" (2). Given the institutional structures under which we work, I find your call for DHers to tell their own story faithfully to be incredibly important and, I think also, incredibly difficult.

Rather than focus on DH, though, I want to return to the broadness of my initial analysis and your response. I was serious when I stated that my comments were not directed toward DH per se, but rather toward the technological euphoria surrounding the digital, a euphoria that makes political problems into ones that technology can solve.
Here, I think the problem we face is not the "crisis in the humanities" or the divide between humanists and digital humanists, but rather the defunding of universities, a defunding to which universities have responded badly. I remember a former administrator at Brown once saying: "[W]e are in the business of two things: teaching and research. Both lose money." His point was that viewing research simply as a way to generate revenue ("indirect costs") overlooks the costs of doing "big" research; his point was also that the university was in the business not of making money, but of educating folk. Grasping for ever-diminishing sums of grant money to keep universities going—a grasping that also entails a vast expenditure in start-up funds, costs for facilities, and so on, arguably available to only a small number of already elite universities—is a way to tread water for a while but is unsustainable.

We see the unsustainability of this clearly in the recent euphoria around moocs, which are not, as you point out, embraced by the DH community even as they are increasingly defining DH in the minds of many. They are sexy in a way that Zotero is not and Bamboo was not. moocs are attractive for many reasons, not least in terms of their promise (and I want to stress here that it is only a promise—and that promises and threats, as Derrida has argued, have the same structure) to alleviate the costs of getting a college degree. But why and how have we gotten here? And would students such as my younger self, educated in Canada in the 1980s, have found moocs so attractive? As I stressed at the mla, the problem is debt: the level of student debt is unsustainable, as are the ways universities are approaching the problem of debt by acquiring more of it (a problem, I realize, that affects most institutions and businesses in the era of neoliberalism). The problem is also the strained relationship between education and employment. To repeat a few paragraphs from that talk:

The humanities are sinking—if they are—not because of their earlier embrace of theory or multiculturalism, but because they have capitulated to a bureaucratic technocratic logic. They have conceded to a logic, an enframing (*to use Heidegger's term*), that has made publishing a question of quantity rather than quality, so that we spew forth mpus or minimum publishable units; a logic, an enframing, that can make teaching a burden rather than a mission, so that professors and students are increasingly at odds; a logic, an enframing, that has divided the profession and made us our own worst enemies, so that those who have jobs for life deny jobs to others—others who have often accomplished more than they (than we) have. The academy is a sinking ship—if it is—because it sinks our students into debt, and this debt, generated by this optimistic belief that a university degree automatically guarantees a job, is what both sustains and kills us. This residual belief/hope stems from another time, when most of us couldn't go to university, another time, when young adults with degrees received good jobs not necessarily because of what they learned, but because of the society in which they lived.

We—and I mean this "we" broadly—have not been good at explaining the difference between being educated and getting a job. A college degree does not guarantee a job; if it did in the past, it was because of demographics and discrimination (in the broadest sense of the term).
One thing we can do is to explain to students this difference and to tell them that they need to put the same effort into getting a job that they did into getting into college. To help them, we have not only to alert them to internships and job fairs but also to encourage them to take risks, to expand the courses they take in university, and to view challenging courses as rewarding. I cannot emphasize how much I learned—even unintentionally—from doing both systems design engineering and English literature as an undergraduate: combined, they opened up new paths of thinking and analyzing with which I'm still grappling. Another thing we can do is address, as you so rightly underscore, how the university spends money.

Most importantly, we need to take on detractors of higher education not by conceding to the rhetoric of "employability," but by arguing that the good (rather than goods) of the university comes from what lies outside of immediate applicability: basic research that no industrial research center would engage in, the cultivation of critical practices and thinking that make us better users and producers of digital technologies and better citizens. I want to emphasize that this entails building a broad coalition across all disciplines within the university. The sciences can not only be as useless as the humanities, they can also be as invested in remaining silent and bathing in the false glow of employability and success as some in the DH. As I mentioned in the mla talk, there are students who graduate from the sciences and cannot find jobs; the sciences are creative and critical; the sciences, of all the disciplines, are most threatened by moocs. We need to build coalitions, rather than let some disciplines be portrayed as "in crisis," so that ours, we hope, can remain unscathed. To live by the rhetoric of usefulness and practicality—of technological efficiency—is also to die by it. Think of the endlessness of debates around global climate change, debates that are so endless in part because the probabilistic nature of science can never match its sure rhetoric.

What I also want to emphasize is that these coalitions will be fractious. There will be no consensus, but, inspired by the work of Anna Tsing, I see friction as grounding, not detracting from, political action. These coalitions are also necessary to take on challenges facing the world today, such as the rise of big data. Again, not because they are inherently practical, but rather, because they can take on the large questions raised by it, such as: given that almost any correlation can be found, what is the relationship between correlation and causality? between what's empirically observable and what's true?

I want to end by thinking again of Berlant's call for "ambient citizenship" as a response to cruel optimism and Lauren Klein's really brilliant work, which you cite and which I—along with my coeditors Tara McPherson and Patrick Jagoda—am honored to publish as part of a special issue of American Literature on new media and American literature ("Image"). Berlant ends Cruel Optimism by asking to what extent attending to ambient noise could create forms of affective attachment that can displace those that are cruelly optimistic. These small gestures would attend to noises and daily gestures that surround us rather than to dramatic gestures that too quickly become the site of new promises (although she does acknowledge that ambient citizenship resonates disturbingly with George W. Bush's desire to "get rid of the filter").
Ambient citizenship would mean attending to things like teaching: teaching, which is often accomplished not by simply relaying information (this is the MOOC model), but through careful attention to the noises in and dynamics of the classroom. I also wonder how this notion of ambient citizenship can be linked to Klein’s remarkable work discovering the contours of James Hemings in the letters of Thomas Jefferson. Jefferson, as Klein notes, was meticulous about documentation and was very much aware of leaving an archive for history. Searching for “information” about Hemings, his former slave and chef, though, is extremely difficult, and reducing the lives of slaves to lists and accounts—to the signals that remain—is unethical. Drawing from the work of Saidiya Hartman and Stephen Best, Klein uses DH tools to trace the ghost, the lingering presence, of Hemings. She uses these tools to draw out the complexity of relations between individuals across social groups. Resisting the logic and ethic of recovery, she makes the unrecorded story of Hemings “expand with meaning and motion.” She also, even as she uses these tools, critiques visualization as “the answer,” linking the logic of visualization to Jefferson’s uses of it to justify slavery. Klein’s work epitomizes how DH can be used to grapple with the impossible, rather than simply usher in the possible. I think that her work—and some other work in DH—by refusing the light and the dark, reveals the ways in which the work done by the union of the digital and the humanities (a union that is not new, but rich in history) will not be in the clearing (to refer to Heidegger), but rather, as you suggest, in the shadows.

Table 1. FY 2014 Appropriation Request ($ in thousands). Source: NEH.gov

                                FY 2012      FY 2013      FY 2014
                                Approp.      Estimate     Request
  Bridging Cultures             $3,494       $3,515       $9,000
  Education Programs            13,179       13,260       13,250
  Federal/State Partnership     40,435       40,683       43,432
  Preservation and Access       15,176       15,269       15,750
  Public Programs               13,404       13,486       14,000
  Research Programs             14,502       14,591       15,435
  Digital Humanities            4,143        4,168        4,450
  We the People                 2,995        3,013        —
  Program Development           499          502          500
  Subtotal                      107,827      108,487      115,817
  Challenge Grants              8,537        8,408        8,850
  Treasury Funds                2,381        2,396        2,400
  Subtotal                      10,738       10,804       11,250
  Administration                27,456       27,624       27,398
  Total                         $146,021     $146,915*    $154,465

* This column reflects FY 2013 annualized funding, including a 0.612% increase as provided by the FY 2013 Continuing Appropriations Resolution, P.L. 112-175.

WENDY HUI KYONG CHUN is Professor and Chair of Modern Culture and Media at Brown University. She has studied both systems design engineering and English literature, which she combines and mutates in her current work on digital media.
She is the author of Programmed Visions: Software and Memory (Massachusetts Institute of Technology Press, 2011) and Control and Freedom: Power and Paranoia in the Age of Fiber Optics (Massachusetts Institute of Technology Press, 2006). She is working on a monograph titled “Habitual New Media.”

LISA MARIE RHODY is Research Assistant Professor at the Roy Rosenzweig Center for History and New Media at George Mason University. Her research employs advanced computational methods such as topic modeling to revise existing theories of ekphrasis—poetry to, for, and about the visual arts. She is editor of the Journal of Digital Humanities and project manager for the Institute of Museum and Library Services’ (IMLS) signature conference, WebWise.

Notes

1. See “Links and Kinks in the Chain: Collaboration in the Digital Humanities” for an abstract of the 2009 MLA Convention panel.
2. For a list of the twenty-seven digital humanities and media studies sessions presented at the 2009 MLA Convention, see Sample.
3. At the time, much media attention was devoted to the United States v. Microsoft Corporation antitrust case initiated in 1998 and settled by the United States Department of Justice in 2001, which created a backdrop for ensuing conversations about open standards in humanities computing.
4. See John Unsworth’s talk, “What Is Humanities Computing and What Is Not?” for more along these lines.
5. Indirect cost recovery started during World War II and the era of Big Science: the government agreed to pay for the physical infrastructure needed for funded projects; private grant agencies—still a large source of funding for the humanities, often in the form of fellowships—routinely refuse to pay for these offsets.

Works Cited

Anderson, Benedict. Imagined Communities: Reflections on the Origin and the Spread of Nationalism. London: Verso, 1983.

Ayers, Edward L. “A More-Radical Online Revolution.” Chronicle of Higher Education 4 Feb. 2013. http://chronicle.com/article/A-More-Radical-Online/136915/.

Berlant, Lauren. Cruel Optimism. Durham: Duke UP, 2011.

Brennan, Sheila. “Let the Grant Do the Talking.” Journal of Digital Humanities 1.4 (Fall 2012). http://journalofdigitalhumanities.org/1-4/let-the-grant-do-the-talking-by-sheila-brennan/ (accessed 26 July 2013).

Cecire, Natalia. “Theory and the Virtues of Digital Humanities.” Introduction. Journal of Digital Humanities 1.1 (Winter 2011). http://journalofdigitalhumanities.org/1-1/introduction-theory-and-the-virtues-of-digital-humanities-by-natalia-cecire/ (accessed 26 July 2013).

Clement, Tanya, and Dave Lester. “Off the Tracks: Laying New Lines for Digital Humanities Scholars.” http://mith.umd.edu/wp-content/uploads/whitepaper_offthetracks.pdf (accessed 26 July 2013).

Davidson, Cathy.
“Humanities 2.0: Promise, Perils, Predictions.” PMLA 123.3 (2008): 707–17.

Dombrowski, Quinn. “Whatever Happened to Project Bamboo?” Conference paper. DH2013 Conference. 19 July 2013. University of Nebraska–Lincoln.

Fish, Stanley. “The Old Order Changeth.” New York Times 26 Dec. 2011. http://opinionator.blogs.nytimes.com/2011/12/26/the-old-order-changeth/.

Flanders, Julia. “The Productive Unease of 21st-Century Digital Scholarship.” Digital Humanities Quarterly 3.3 (2009). http://www.digitalhumanities.org/dhq/vol/3/3/000055/000055.html.

Gibson, William. “The Science in Science Fiction.” Talk of the Nation. NPR 30 Nov. 1999. http://www.npr.org/templates/story/story.php?storyId=1067220.

Klein, Lauren F. “The Image of Absence: Archival Silence, Data Visualization, and James Hemings.” American Literature and New Media. Spec. issue of American Literature 85.4 (Dec. 2013): 661–68.

---. “A Report Has Come Here.” Lauren F. Klein (blog). 9 Jan. 2013. http://lmc.gatech.edu/~lklein7/2012/01/09/a-report-has-come-here-social-network-analysis-in-the-papers-of-thomas-jefferson/.

Lewin, Tamar. “At Colleges, Humanities Job Outlook Gets Bleaker.” New York Times 18 Dec. 2009.

“Links and Kinks in the Chain: Collaboration in the Digital Humanities.” Panel. Modern Languages Association Program Archive 29 Dec. 2009. http://www.mla.org/conv_listings_detail?prog_id=490&year=2009.

Liu, Alan. “Digital Humanities and Academic Change.” English Language Notes 47 (Spring 2009): 17–35. EBSCOhost (accessed 10 Dec. 2009).

---. “Where Is Cultural Criticism in the Digital Humanities?” Alan Liu. Webpage. http://liu.english.ucsb.edu/where-is-cultural-criticism-in-the-digital-humanities/ (accessed 27 July 2013).

Losh, Liz. “The Ministry of Silly Walks.” Virtualpolitik 29 Dec. 2009. http://networkedblogs.com/p22905895.

Mandell, Laura. “Digital Humanities: The Bright Spot.” AIMS 13 Jan. 2010. http://aims.muohio.edu/2010/01/13/digital-humanities-the-bright-spot/.

National Endowment for the Humanities Appropriations Request for Fiscal Year 2014. Washington, DC: National Endowment for the Humanities, 2013. http://www.neh.gov/files/neh_request_fy2014.pdf (accessed 26 July 2013).

Newfield, Christopher. “Ending the Budget Wars: Funding the Humanities during a Crisis in Higher Education.” Profession 1 (2009): 270–84. http://www.mlajournals.org/doi/pdf/10.1632/prof.2009.2009.1.270 (accessed 26 July 2013).

Pannapacker, William. “The MLA and the Digital Humanities.” Chronicle of Higher Education 28 Dec. 2009. http://chronicle.com/blogPost/The-MLAthe-Digital/19468/.

---. “Pannapacker at MLA: The Come-to-DH Moment.” Chronicle of Higher Education 7 Jan. 2012. http://chronicle.com/blogs/brainstorm/pannapacker-at-the-mla-2–the-come-to-dh-moment/42811.

Rosenzweig, Roy, and Dan Cohen. Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web. Philadelphia: U of Pennsylvania P, 2005.

Sample, Mark.
“Digital Humanities Sessions at the 2009 MLA.” Sample Reality (blog). 15 Nov. 2009. http://www.samplereality.com/2009/11/15/digital-humanities-sessions-at-the-2009-mla/.

Tsing, Anna L. Friction: An Ethnography of Global Connection. New Jersey: Princeton UP, 2005.

Unsworth, John. “Documenting the Reinvention of Text: The Importance of Failure.” Journal of Electronic Publishing 3.2 (Dec. 1997). http://dx.doi.org/10.3998/3336451.0003.201 (accessed 26 July 2013).

---. “What Is Humanities Computing, and What Is Not?” http://computerphilologie.tu-darmstadt.de/jg02/unsworth.html (accessed 26 July 2013).

Wernimont, Jacqueline. “Not (Re)Covering Feminist Methods in Digital Humanities.” Jacqueline Wernimont (blog). 19 July 2013. http://jwernimont.wordpress.com/2013/07/19/not-recovering-feminist-methods-in-digital-humanities.

Williford, Christa, and Charles Henry. One Culture: Computationally Intensive Research in the Humanities and Social Sciences: A Report on the Experiences of First Respondents to the Digging into Data Challenge. Washington, DC: CLIR, 2012. http://www.clir.org/pubs/reports/pub151 (accessed 26 July 2013).

----

In, Out, Across, With: Collaborative Education and Digital Humanities (Job Talk for Scholars’ Lab)

MAR 2ND, 2017

I’ve accepted a new position as the Head of Graduate Programs in the Scholars’ Lab, and I’ll be transitioning into that role over the next few weeks! As a part of the interview process, we had to give a job talk. While putting together this presentation, I was lucky enough to have past examples to work from (as you’ll be able to tell, if you check out this past job talk by Amanda Visconti). Since my new position will involve helping graduate students through the process of applying for positions like these, it only feels right that I should post my own job talk as well as a few words on the thinking that went into it. Blemishes, jokes, and all, hopefully these materials will help someone in the future find a way in, just as the example of others did for me. And if you’re looking for more, Visconti has a great list of other examples linked from her more recent job talk for the Scholars’ Lab.

For the presentation, I was asked to respond to this prompt:

What does a student (from undergraduate to doctoral levels) need to learn or experience in order to add “DH” to his or her skill set? Is that an end or a means of graduate education? Can short-term digital assignments in discipline-specific courses go beyond “teaching with technology”? Why not refer everyone to online tutorials? Are there risks for doctoral students or the untenured in undertaking digital projects? Drawing on your own experience, and offering examples or demonstrations of digital research projects, pedagogical approaches, or initiatives or organizations that you admire, make a case for a vision of collaborative education in advanced digital scholarship in the arts and humanities.

I felt that each question could be a presentation all its own, and I had strong opinions about each one. Dealing with all of them seemed like a tall order.
I decided to spend the presentation close reading and deconstructing that first sentence, taking apart the idea that education and/or digital humanities could be thought of in terms of lists of skills at all. Along the way, my plan was to dip into the other questions as able, but I also assumed that I would have plenty of time during the interview day to give my thoughts on them. I also wanted to try to give as honest a sense as possible of the way I approach teaching and mentoring. For me, it’s all about people and giving them the care that they need. In conveying that, I hoped, I would give the sort of vision the prompt was asking for. I also tried to sprinkle references to the past and present of the Scholars’ Lab programs to ground the content of the talk. When I mention potential career options in the body of the talk, I am talking about specific alumni who came through the fellowship programs. And when I mention graduate fellows potentially publishing on their work with the Twitter API, well, that’s not hypothetical either. So below find the lightly edited text of the talk I gave at the Scholars’ Lab - “In, Out, Across, With: Collaborative Education and Digital Humanities.” I’ve only substantively modified one piece - swapping out one example for another.

And a final note on delivery: I have heard plenty of people argue over whether it is better to read a written talk or deliver one from notes. My own sense is that the latter is far more common for digital humanities talks. I have seen both fantastic read talks and amazing extemporaneous performances, just as I have seen terrible versions of each. My own approach is, increasingly, to write a talk but deliver that talk more or less from memory. In this case, I had a pretty long commute to work, so I recorded myself reading the talk and listened to it a lot to get the ideas in my head. When I gave the presentation, I had the written version in front of me for reference, but I was mostly moving through my own sense of how it all fit together in real time (and trying to avoid looking at the paper). My hope is that this gave me the best of both worlds and resulted in a structured but engaging performance. Your mileage may vary!

In, Out, Across, With: Collaborative Education and Digital Humanities

It’s always a treat to be able to talk with the members of the UVA Library community, and I am very grateful to be here. For those of you that don’t know me, I am Brandon Walsh, Mellon Digital Humanities Fellow and Visiting Assistant Professor of English at Washington and Lee University. The last time I was here, I gave a talk that had almost exclusively animal memes for slides. I can’t promise the same robust Internet culture in this talk, but talk to me after and I can hook you up. I swear I’ve still got it.

In the spirit of Amanda Visconti, the resources that went into this talk (and a number of foundational materials on the subject) can all be found in a Zotero collection at the above link. I’ll name check any that are especially relevant, but hopefully this set of materials will allow the thoughts in the talk to flower outwards for any who are interested in seeing its origins and echoes in the work of others. And a final prefatory note: no person works, thinks or learns alone, so here are the names of the people in my talk whose thinking I touch upon as well as just some – but not all – of my colleagues at W&L who collaborate on the projects I mention.
Top tier consists of people I cite or mention, second tier is for institutions or publications important to discussion, and final tier is for direct collaborators on this work. Today I want to talk to you about how best to champion the people involved in collaborative education in digital research. I especially want to talk about students. And when I mention “students” throughout this talk, I will mostly be speaking in the context of graduate students. But most of what I discuss will be broadly applicable to all newcomers to digital research. My talk is an exhortation to find ways to elevate the voices of people in positions like these to be contributors to professional and institutional conversations from day one and to empower them to define the methods and the outcomes of the digital humanities that we teach. This means taking seriously the messy, fraught, and emotional process of guiding students through digital humanities methods, research, and careers. It means advocating for the legibility of this digital work as a key component of their professional development. And it means enmeshing these voices in the broader network around them, the local context that they draw upon for support and that they can enrich in turn. I believe it is the mission of the Head of Graduate Programs to build up this community and facilitate these networks, to incorporate those who might feel like outsiders to the work that we do. Doing so enriches and enlivens our communities and builds a better and more diverse research and teaching agenda. This talk is titled “In, Out, Across, With: Collaborative Education and Digital Humanities,” and I’ll really be focusing on the prepositions of my title as a metaphor for the nature of this sort of position. I see this role as one of connection and relation. The talk runs about 24 minutes, so we should have plenty of time to talk. When discussing digital humanities education, it is tempting to first and foremost discuss what, exactly, it is that you will be teaching. What should the students walk away knowing? To some extent, just as there is more than one way to make breakfast, you could devise numerous baseline curricula. This is what we came up with at Washington and Lee for students in our undergraduate digital humanities fellowship program. We tried to hit a number of kinds of skills that a practicing digital humanist might need. It’s by no means exhaustive, but the list is a way to start. We don’t expect one person to come away knowing everything, so instead we aim for students to have an introduction to a wide variety of technologies by the end of a semester or year. They’ll encounter some technologies applicable to project management, some to front-end design, as well as a variety of programming concepts broadly applicable to a variety of situations. Lists like this give some targets to hit. But still, even as someone who helped put this list together, it makes me worry a bit. I can imagine younger me being afraid of it! It’s easy for us to forget what it was like to be new, to be a beginner, to be learning for the first time, but I’d like to return us to that frame of thinking. I think we should approach lists like these with care, because they can be intimidating for the newcomer. So in my talk today I want to argue against lists of skills as ways of thinking. I don’t mean to suggest that programs need no curriculum, nor do I mean to suggest that no skills are necessary to be a digital humanist. 
But I would caution against focusing too much on the skills that one should have at the end of a program, particularly when talking about people who haven’t yet begun to learn. I would wager that many people on the outside looking in think of DH in the same way: it’s a big list of unknowns. I’d like to get away from that. Templates like this are important for developing courses, fellowship, and degree-granting programs, but I worry that the goodwill in them might all too easily seem like a form of gatekeeping to a new student. It is easy to imagine telling a student that “you have to learn GitHub before you can work on this project.” It’s just a short jump from this to a likely student response - “ah sorry - I don’t know that yet.” And from there I can all too easily imagine the common refrain that you hear from students of all levels - “If I can’t get that, then it’s because I’m not a technology person.” From there - “Digital humanities must not be for me.” Instead of building our curricula out of as-yet-unknown tool chains, I want to float, today, a vision of DH education as an introduction to a series of professional practices. Lists of skills might be ends, but I fear they might foreclose beginnings. Instead, I will float something more in line with that of the Scholarly Communication Institute (held here at UVA for a time), which outlined what they saw as the needs of graduate and professional students in the digital age. I’ll particularly draw upon their first point here (last of my slides with tons of text, I swear): graduate students need training in “collaborative modes of knowledge production and sharing.” I want to think about teaching DH as introducing a process of discovery that collapses hierarchies between expert and newcomer: that’s a way to start. This sort of framing offers digital humanities not as a series of methods one does or does not know, but, rather, as a process that a group can engage in together. Do they learn methods and skills in the process? Of course! Anyone who has taken part in the sort of collaborative group projects undertaken by the Scholars’ Lab comes away knowing more than they came in with. But I want to continue thinking about process and, in particular, how that process can be more inclusive and more engaging. By empowering students to choose what they want to learn and how they want to learn it, we can help to expand the reach of our work and better serve our students as mentors and collaborators. There are a few different ways in which I see this as taking place, and they’ll form the roadmap for the rest of the talk. Apologies - this looks like the sort of slide you would get at a business retreat. All the same - we need to adapt and develop new professional opportunities for our students at the same time that we plan flexible outcomes for our educational programs. These approaches are meant to serve increasingly diverse professional needs in a changing job market, and they need to be matched by deepening support at the institutional level. So to begin. One of our jobs as mentors is to encourage students to seek out professionally legible opportunities early on in their careers, and as shapers of educational programs we can go further and create new possibilities for them. At W&L, we have been collaborating with the Scholars’ Lab to bring UVA graduate students to teach short-form workshops on digital research in W&L classrooms.
Funded opportunities like this one can help students professionalize in new ways and in new contexts while paying it forward to the nearby community. A similar initiative at W&L that I’ve been working on has our own library faculty and undergraduate fellows visiting local high schools to speak with advanced AP computer science students about how their own programming work can apply to humanities disciplines. I’m happy to talk more about these in Q&A. We also have our student collaborators present at conferences, both on their own work and on work they have done with faculty members, both independently and as co-presenters. Here is Abdur, one of our undergraduate Mellon DH fellows, talking about the writing he does for his thesis and how it is enriched by and different from the writing he does in digital humanities contexts at the Bucknell Digital Scholarship Conference last fall. While this sort of thing is standard for graduate students, it’s pretty powerful for an undergraduate to present on research in this way. Learning that it’s OK to fail in public can be deeply empowering, and opportunities like these encourage our students to think about themselves as valuable contributors to ongoing conversations long before they might otherwise feel comfortable doing so. But teaching opportunities and conferences are not the only ways to get student voices out there. I think there are ways of engaging student voices earlier, at home, in ways that can fit more situations. We can encourage students to engage in professional conversations by developing flexible outcomes in which we are equal participants. One approach to this with which I have been experimenting is group writing, which I think is undervalued as a taught skill and possible approach to DH pedagogy. An example: when a history faculty member at W&L approached the library (and by extension, me) for support in supplementing an extant history course with a component about digital text analysis, we could have agreed to offer a series of one-off workshops and be done with it. Instead, this faculty member – Professor Sarah Horowitz – and I decided to collaborate on a more extensive project together, producing Introduction to Text Analysis: A Coursebook. The idea was to put the materials for the workshops together ahead of time, in collaboration, and to narrativize them into a set of lessons that would persist beyond a single semester as a kind of publication. The pedagogical labor that we put into reshaping her course could become, in some sense, professionally legible as a series of course modules that others could use beyond the term. So for the book, we co-authored a series of units on text analysis and gave feedback on each other’s work, editing and reviewing as well as reconfiguring them for the context of the course. Professor Horowitz provided more of the discipline-specific material that I could not, and I provided the materials more specific to the theories and methods of text analysis. Neither one of us could have written the book without the other. Professor Horowitz was, in effect, a student in this moment. She was also a teacher and researcher. She was learning at the same time that she produced original scholarly contributions. Even as we worked together, for me this collaborative writing project was also a pedagogical experiment that drew upon the examples of Robin DeRosa, Shawn Graham, and Cathy Davidson, in particular. 
Davidson taught a graduate course on “21st Century Literacies” where each of her students wrote a chapter that was then collected and published as an open-access book. For us as for Davidson, the process of knowing, the process of uncovering is something that happens together. In public. And it’s documented so that others can benefit. Our teaching labor could become visible and professionally legible, as could the labor that Professor Horowitz put into learning new research skills. As she adapts and tries out ideas, and as we coalesce them into a whole, the writing product is both the means and the end of an introduction to digital humanities. Professor Horowitz also wanted to learn technical skills herself, and she learned quite a lot through the writing process. Rather than sitting through lectures or being directed to online tutorials by me, I thought she would learn better by engaging with and shaping the material directly. Her course and my materials would be better for it, as she would be helping to bind my lectures and workshops to her course material. The process would also require her to engage with a list of technologies for digital publishing. Beyond the text analysis materials and concepts, the process exposed her to a lot of technologies: command line, Markdown, Git for version control, GitHub for project management. In the process of writing this document, in fact, she covered most of the same curriculum as our undergraduate DH fellows. She’s learning these things as we work together to produce course materials, but, importantly, the technical skills aren’t the focus of the work together. It’s a writing project! Rather than presenting the skills as ends in themselves, they were the means by which we were publishing a thing. They were immediately useful. And I think displacing the technology is helpful: it means that the outcomes and parameters for success are not based in the technology itself but, rather, in the thinking about and use of those methods. We also used a particular platform that allowed Professor Horowitz to engage with these technologies in a light way so that they would not overwhelm our work – I’m happy to discuss more in the time after if you’re interested. This to say: the outcomes of such collaborative educations can be shaped to a variety of different settings and types of students. Take another model, CUNY’s Graduate Center Digital Fellows program, whose students develop open tutorials on digital tools. Learning from this example, rather than simply direct students or colleagues towards online tutorials like these, why not have them write their own documents, legible for their own positions, that synthesize and remix the materials that they already have found? The learning process becomes something productive in this framing. I can imagine, for example, directing collaboratively authored materials by students like these towards something like The Programming Historian. If you’re not familiar, The Programming Historian offers a variety of lessons on digital humanities methods, and they only require an outline as a pitch to their editorial team, not a whole written publication ready to go. Your graduate students could, say, work with the Twitter API over the course of a semester, blog about the research outcomes, and then pitch a tutorial to The Programming Historian on the API as a result of their work. It’s much easier to motivate yourselves to write something if you know that the publication has already been accepted. 
Obviously such acceptance is not a given, but working towards a goal like this can offer student researchers something to aim for. Their instructors could co-author these materials, even, so that everyone has skin in the game. This model changes the shape of what collaborative education can look like: its duration and its results. You don’t need a whole fellowship year. You could, in a reasonably short amount of time, tinker and play, and produce a substantial blog post, an article pitch, or a Library Research Guide (more on that in a moment). As Jeff Jarvis has said, “we need to move students up the education chain.” And trust me - the irony of quoting a piece titled “Lectures are Bullshit” during a lecture to you is not lost on me. But stay with me. Collaborative writing projects on DH topics are flexible enough to fit the many contexts for the kind of educational work that we do. After all, no one needs or values the same outcomes, and these shared and individual goals need to be worked out in conversation with the students themselves early on. Articulating these desires in a frank, written, and collaborative mode early on (in the genre of the project charter) can help the program directors to better shape the work to fit the needs of the students. But I also want to suggest that collaborative writing projects can be useful end products as well as launching pads, as they can fit the shape of many careers. After all, students come to digital humanities for a variety of different reasons. Some might be aiming to bolster a research portfolio on the path to a traditional academic career. Others might be deeply concerned about the likelihood of attaining such a position and be looking for other career options. Others still might instead be colleagues interested in expanding their research portfolio or skillset but unable to commit to a whole year of work on top of their current obligations. Writing projects could speak to all these situations. I see someone in charge of shaping graduate programs as needing to speak to these diverse needs. This person is both a steward of where students currently are – the goals and objectives they might currently have – as well as of where they might go – the potential lives they might (or might not!) lead. After all, graduate school, like undergraduate, is an enormously stressful time of personal and professional exploration. If we think simply about a student’s professional development as a process of finding a job, we overlook the real spaces in which help might be most desired. Frequently, those needs are the anxieties, stresses, and pressures of refashioning yourself as a professional. We should not be in the business of creating CV lines or providing lists of qualifications alone. We should focus on creating strong, well-adjusted professionals by developing ethical programs that guide them into the professional world by caring for them as people. In the graduate context, this involves helping students deal with the academic job market in particular. To me in its best form, this means helping students to look at their academic futures and see proliferating possibilities instead of a narrow and uncertain route to a single job, to paraphrase the work of Katina Rogers. A sprinkler rather than a pipeline, in her metaphor.
As Rogers’s work, in particular, has shown, recent graduate students increasingly feel that, while they experienced strong expectations that they would continue in the professoriate, they received inadequate preparation for the many different careers they might actually go on to have. The Praxis Program and the Praxis Network are good examples of how to position digital humanities education as answers to these issues. Fellowship opportunities like these must be robust enough that they can offer experiences and outcomes beyond the purely technical, so that a project manager from one fellowship year can graduate with an MA and go into industry in a similar role, just as a PhD student aiming to be a developer might go on to something entirely different. And the people working these programs must be prepared for the messy labor of helping students to realize that these are satisfactory, laudable professional goals. It should be clear that this sort of personal and professional support is the work of more than just one person. One of the strengths of a digital humanities center embedded in a library like this one at UVA is that fellows have the readymade potential to brush up against a variety of career options that become revealed when peeking outside of their disciplinary silos: digital humanities developers and project manager positions, sure, but also metadata specialists, archivists, and more. I think this kind of cross-pollination should be encouraged: library faculty and staff have a lot to offer student fellows and vice versa. Developing these relationships brings the fellows further into the kinds of work done in the library and introduces them to careers that, while they might require further study to obtain, could be real options. To my mind the best fellowship programs are those fully aware of their institutional context and those that both leverage and augment the resources around them as they are able. We have been working hard on this at W&L. We are starting to institute a series of workshops led by the undergraduate fellows in consultation with the administrators of the fellowship program. The idea is that past fellows lead workshops for later cohorts on the technology they have learned, some of which we selectively open to the broader library faculty and staff. The process helps to solidify the student’s training – no better way to learn than to teach – but it also helps to expand the student community by retaining fellows as committed members. It also helps to fill out a student’s portfolio with a CV-ready line of teaching experience. This process also aims to build our own capacity within the library by distributing skills among a wider array of students, faculty, and staff. After all, student fellows and librarians have much they could learn from one another. I see the Head of Graduate Programs as facilitating such collaborations, as connecting the interested student with the engaged faculty/staff/librarian collaborator, inside their institution or beyond. But we must not forget that we are asking students and junior faculty to do risky things by developing these new interests, by spending time and energy on digital projects, let alone presenting and writing on them in professional contexts. The biggest risk is that we ask them to do so without supporting them adequately. All the technical training in the world means little if that work is illegible and irrelevant to your colleagues or committee.
In the words of Kathleen Fitzpatrick, we ask these students to “do the risky thing,” but we must “make sure that someone’s got their back.” I see the Head of Graduate Programs as the key in coordinating, fostering, and providing such care. Students and junior faculty need support – for technical implementation, sure – but they also need advocates – people who can vouch for the quality of their work and campaign on their behalf in the face of committees and faculty who might be otherwise unable to see the value of their work. Some of this can come from the library, from people able to put this work in the context of guidelines for the evaluation of digital scholarship. But some of this support and advocacy has to come from within their home departments. The question is really how to build up that support from the outside in. And that’s a long, slow process that occurs by making meaningful connections and through outreach programs. At W&L, we have worked to develop an incentive grant program, where we incentivize faculty members who might be new to digital humanities or otherwise skeptical to experiment with incorporating a digital project into their course. The result is a slow burn – we get maybe one or two new faculty each term trying something out. That might seem small, but it’s something, particularly at a small liberal arts college. This kind of slow evangelizing is key in helping the work done by digital humanists to be legible to everyone. Students and junior faculty need advocates for their work in and out of the library and their home departments, and the person in this position is tasked with overseeing such outreach. So, to return to the opening motif, lists of skillsets certainly have their place as we bring new people into the ever-expanding field: they’re necessary. They reflect a philosophy and a vision, and they’re the basis of growing real initiatives. But it’s the job of the Head of Graduate Programs to make sure that we never lose sight of the people and relationships behind them. Foremost, then, I see the Head of Graduate Programs as someone who takes the lists, documents, and curricula that I have discussed and connects them to the people that serve them and that they are meant to speak to. This person is one who builds relationships, who navigates the prepositions of my title. It’s the job of such a person to blast the boundary between “you’re in” and “you’re out” so that the tech-averse or shy student can find a seat at the table. This is someone who makes sure that the work of the fellows is represented across institutions and in their own departments. This person makes sure the fellows are well positioned professionally. This person builds up people and embeds them in networks where they can flourish. Their job is never to forget what it’s like to be the person trying to learn. Their job is to hear “I’m not a tech person” and answer “not yet, but you could be! and I know just the people to help. Let’s learn together.”

----

Big Data and Digital Humanities

Jochen Tiepmar
It may even show the reader that they might be working on a task that can be considered as Big Data even though the data set itself is comparatively small. As such, this paper should not be seen as a concrete solution to specific problems but as a general overview that is based on several years of practical research experience. This paper also argues that interoperability is one of the most prominent Big Data issues in text oriented digital humanities. Jochen Tiepmar Leipzig University, Institute for Computer Science, � jtiepmar@informatik.uni-leipzig.de Archives of Data Science, Series A (Online First) DOI 10.5445/KSP/1000087327/01 KIT Scientific Publishing ISSN 2363-9881 Vol. 5, No. 1, 2018 mailto:jtiepmar@informatik.uni-leipzig.de 2 Jochen Tiepmar 1 Introduction Defining the term Big Data is not trivial. The most obvious defining factor is the size of a data set, but this property can not be applied universally and depends on the domain context as well as data type specific properties or measurements. For instance, text volume can be measured as number of tokens/documents or byte. While a token or document count can often result in impressive and seemingly large numbers, the corresponding bytes are often not in an area that can be considered as large. Yet certain text mining analyses – like citation analysis – and use case specific circumstances – like a real time requirement – may result in workflows that are already too calculation expensive for technically small data sets. IBM suggests the 4 Vs, data specific properties to help describe the Big Data relevance of a problem. These Vs are Volume, Veracity, Velocity and Variety. 1.1 Volume Volume is the most obvious aspect of Big Data and describes the size of a data set. The bigger a data set is, the more effort is required to process, share or store it. Especially medical applications like analysis of MRI images and simulations like weather models or particle systems can create and require large amounts of data. The increasing amount of digital and sometimes publicly available sensory information that is collected – for a vast number of examples, see works about Smart Cities or Internet of Things – will probably increase the need for solutions for size-related problems. Usually, a data set is not characterized as a Big Data problem if smaller than at least 1 Terabyte, and since current standard database systems and hard drives are able to store and manage several terabytes of data without any major issues, most Big Data Volume problems deal with memory and not disk space. Information that is stored in memory can be accessed faster than that in disk drives, but it is lost when the system is shut down. Therefore, disk space is usually used to store, manage, and archive data sets while memory is usually used for more dynamic, analytical tasks. Memory is currently also more expensive – and, therefore, more limited – than disk space, which means that the memory requirements that Big Data and Digital Humanities 3 qualify as a Big Data problem are usually lower than disk-space requirements. An arbitrarily chosen estimated border value could be 100 Gigabytes. In the context of text-oriented digital humanities, volume can also be used to refer to more information-related aspects like the number of tokens, sentences, or documents, as it is usually done for text corpora. Information-related size statistics can quickly result in seemingly big and impressive numbers while the required disk space stays relatively small. 
In the context of this analysis, Volume with a capitalized letter V refers to disk or memory space. Table 1 illustrates this relationship for some of the biggest data sets (Deutsches Textarchiv (DTA), Geyken et al (2011); Textgrid, Neuroth et al (2011)) that were collected in the context of this work. The disk space is calculated based on the uncompressed data set that is available for download and usually includes additional markup, which implies that the actual text data Volume is usually smaller. The number of documents and tokens is calculated based on the data set. The document number is the number of individual files, and the tokens were delimited by the characters ="<.>()[]{},:;, tab, newline, and whitespace. Textgrid provides multiple documents as part of one XML file, namely the with several TEI documents. These documents were separated into individual files. The token and document count can differ from the official project statistics, because they include the XML markup. This is intentional, since the point is to illustrate the relation between the number of words in a set of files and their hard disk space and, for this comparison, it is more correct to include the markup as tokens as it also influences the file sizes. Table 1: Text corpus statistics vs. hard disk space. Text Corpus Documents Tokens Disk Space DTA 2,435 211,185,949 1.3 GB Textgrid 91,149 232,567,480 1.8 GB PBC1 831 289,651,896 1.9 GB As Table 1 shows, the required disk space for text data is quite small even for comparatively big data sets. Problematic file sizes can usually only occur for 1 Parallel Bible Corpus (PBC), Mayer and Cysouw (2014) 4 Jochen Tiepmar text data sets that include optical scans of the document pages, which shall not be considered as text data but as image data. The English Wikipedia can be considered as one of the largest online text collections. Yet, according to its own statistics,2 as of February 2013, the size of the XML file containing only the current pages, no user or talk pages, was 42,987,293,445 bytes uncompressed (43 GB). It can be stated that storing and managing text data is not a Volume problem with respect to disk size. The data size is also not problematic with respect to setups that are designed to work in memory. At the time of writing, the current prices for 64 GB RAM based on Amazon.com range from 541.95e3 to 1,071.00e,4 which might be too expensive to consider this as standard hardware, but this is probably far from problematic for a project that is designed with the requirement of managing a Wikipedia-size text collection in memory. It must be emphasized that this is not a phenomenon that occurs because the amount of data is still small, and, therefore, can be expected to change in the near future. Instead, it can be considered as a constant characteristic of the practical use of text data. Data sets in this context correspond to individual document collections that tend to include documents that share a certain set of properties like a specific author, language, time period, or any kind of thematic relation. Das Deutsche Textarchiv only includes German literature covering a relatively limited time frame, and the Parallel Bible Corpus only includes Bible translations. Even if a data set includes a wide array of parameter configurations it can always be distinguished from other data sets by its specific properties. It is highly unlikely that the trend for this kind of data is headed toward centralization. 
This characteristic is especially important in text analysis because, in order to research specific effects, it is important to eliminate the impact of unrelated variables. A token frequency trend analysis usually requires a monolingual text corpus to avoid effects like the German feminine noun article "die" being counted as the English verb "to die". Even in more inclusive use cases like a global digital library, it can be counter-productive not to limit the content to books and to include – for instance – Twitter data or collected forum discussions. Therefore, it can be stated that the relatively small disk or memory size required to manage only the text data is not and will not be a Big Data-related problem, because of the purpose and characteristics of this kind of data. It is unlikely that the size of document collections is an issue that cannot be solved using present-day standard hardware.

If text content is considered primary data, then external annotations and annotated text content can be considered secondary data. Annotated text content can include information about Part-of-Speech tags, named entities, editorial notes, and much more. External annotations can include citation links or linked resources like audio or image snippets attached to specific text passages. Secondary data does not have to be associated with the original text and can also occur as word frequency tables, topic models, co-occurrence & collocation data, or in the form of any other analytical result format. Secondary data in the context of text is usually the result of automated analytical processes or manual editing. In particular, the amount of information that is added by automated analytical processes can significantly increase the Volume of a data set. The amount of this kind of data depends on the analytical processes that are done and the results that are produced. A representative overview of this kind of data would require an unreasonable amount of work and provide little to no value, because the results for the individual projects would be project-specific and could not be compared.

The Wortschatz project (Quasthoff and Richter (2005)) at Leipzig University generates a lot of annotation data and word statistics based on several sentence lists collected from online resources. The sentence lists can be considered the primary data, while everything else – including indices for the primary data – can be considered secondary data. Table 2 shows the relation between the Volumes of primary and secondary data based on the three samples deu_mixed_2011, deu_news_2011 and deu_newscrawl_2011. The information was compiled based on information given by a server administrator with direct access to the databases.

Table 2: Primary vs. secondary data Volume (Wortschatz).

  Data Set            Primary Data (Bytes)   Secondary Data (Bytes)
  deu_mixed_2011      37,270,576,048         517,020,294,364
  deu_news_2011       3,672,898,564          59,421,534,187
  deu_newscrawl_2011  3,735,178,336          222,879,231,073

The values in the table are not comparable to each other because each data set includes different sets of database tables. This is not an issue because the purpose is only to illustrate that secondary data tends to be of more Volume than primary data.
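The following toy sketch illustrates why annotation inflates Volume so quickly: the same sentence is stored once as raw text and once wrapped in token-level markup. The tag names and POS labels here are illustrative assumptions, not the markup of any particular project.

```python
# One German sentence, raw vs. wrapped in token-level markup with
# illustrative part-of-speech labels (loosely following the STTS tagset).
sentence = "Die Würde des Menschen ist unantastbar"
pos_tags = ["ART", "NN", "ART", "NN", "VAFIN", "ADJD"]

annotated = " ".join(
    f'<w pos="{pos}">{token}</w>'
    for token, pos in zip(sentence.split(), pos_tags)
)

raw_bytes = len(sentence.encode("utf-8"))
annotated_bytes = len(annotated.encode("utf-8"))
print(raw_bytes, annotated_bytes, round(annotated_bytes / raw_bytes, 1))
# A single annotation layer already multiplies the byte size; every further
# layer (lemmata, named entities, citation links, ...) adds more on top.
```

Multiplied over hundreds of millions of tokens and several annotation layers, ratios like the ones shown in Table 2 are not surprising.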
Combined with the trend for increased interoperability and with research infrastructures that may store and provide annotations that would have been considered temporary data in project-specific workflows, it may even be possible that exponential Volume growth occurs in the near future because of further annotations that are based on or caused by existing annotations. It can be stated that secondary data itself can qualify as a Volume problem because text annotation can increase the amount of meta information that is attached to any piece of text data without limit, and therefore the Volume can be inflated indefinitely. Estimating whether or not this would result in Big Data-sized document collections would be speculation. Yet this work proposes that it is unlikely that future document collections will include every piece of annotated information in their documents, because doing so makes the documents harder to read and the pieces of information may even contradict each other. It is more likely and reasonable that text passage references are used to link annotation results to text passages and between external services.

1.2 Variety

Variety is about the different types and formats of data sets. Types include broader distinctions like audio, video, or sensory data and also different file types for each media type, like mp3, wav, and flac for audio files. Since the context of this work is text-oriented digital humanities, the types of data are already relatively limited but still include many file types – like tex, txt, xml, doc, csv, pdf, and many more – with specific characteristics. Other layers of complexity in Variety are differences in markup formats for a specific file type – like different XML schemas – and a vast number of workflows and access methods for data. This indicates that the Big Data issue Variety is similar to the increasing need for interoperability that is described in Section 2 and is very relevant in the context of text-oriented digital humanities.

1.3 Velocity

Velocity describes the processing speed and is especially significant because it has a direct impact on the end-user experience, while the other issues are generally only problematic for the service provider. For instance, a navigation system that calculates the best route based on sensory information about the current traffic would not be usable if this calculation required several hours of processing time. More academic use cases are workflows that include a lot of experimental parameter permutation or the creation of domain-specific training data sets for neural networks and machine learning. A very common way to increase the processing speed of a workflow or algorithm is to parallelize it by dividing it into subsets of problems that are independently solved by different threads or computers in a network cluster and then combining their results. Parallelization of algorithms is an issue that is far from trivial and in some cases may be counter-productive or even impossible to implement, because certain workflows cannot be divided into independent subproblems. Specific tasks in the text-oriented digital humanities – for example, citation analysis – can be parallelized and provide interesting research questions with regard to Velocity.
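As a minimal sketch of the divide-and-combine idea described above, the following parallelizes a token frequency count across documents, one worker per subset of files. This is the easy case, in which the subproblems are fully independent; the file names are placeholders.

```python
from collections import Counter
from multiprocessing import Pool

def count_tokens(path):
    """Independent subproblem: token frequencies for one document."""
    with open(path, encoding="utf-8") as f:
        return Counter(f.read().split())

def parallel_frequencies(paths, workers=4):
    """Divide the corpus, solve each part independently, combine."""
    with Pool(workers) as pool:
        partial_counts = pool.map(count_tokens, paths)
    total = Counter()
    for counts in partial_counts:
        total.update(counts)  # the combination step
    return total

if __name__ == "__main__":
    paths = ["doc1.txt", "doc2.txt", "doc3.txt"]  # placeholder file names
    print(parallel_frequencies(paths).most_common(5))
```

A task like citation analysis is harder to split this way, because detecting a reuse relation may require comparing passages across the document boundaries that this sketch uses as its unit of division.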
1.4 Veracity

Veracity refers to the quality and trustworthiness of data and is especially relevant in the context of sensory data, where it can be a complex problem to distinguish between a correctly measured anomaly and a malfunction of a sensor. This can result in reduced efficiency and in financial losses, as described in Dienst and Beseler (2016). Optical Character Recognition (OCR) can be considered a complex Veracity-related problem in the context of text-oriented digital humanities. This observation is supported by the conclusions of Chaudhuri et al (2017). Nuances that distinguish certain letters can be hard for a computer to interpret correctly. Since OCR often has to work with documents that were not created digitally, problems like handwriting and unwanted image artefacts have to be considered. Even a comparatively high accuracy of 95% implies that every 20th character was guessed wrongly, which correlates to six mistakes in this sentence.
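A quick way to build intuition for such error rates is to simulate them. The sketch below corrupts each character with probability 1 - accuracy; it is a deliberately crude model (real OCR errors cluster around visually similar glyphs), so treat it as an illustration only.

```python
import random

def simulate_ocr(text, accuracy=0.95, seed=1):
    """Replace each character with a random letter with prob. 1 - accuracy."""
    rng = random.Random(seed)
    return "".join(
        rng.choice("abcdefghijklmnopqrstuvwxyz") if rng.random() > accuracy
        else ch
        for ch in text
    )

sentence = ("Even a comparatively high accuracy of 95% implies that "
            "every 20th character was guessed wrongly.")
garbled = simulate_ocr(sentence)
errors = sum(a != b for a, b in zip(sentence, garbled))
print(garbled)
print(f"{errors} corrupted characters out of {len(sentence)}")
```

For a sentence of roughly 100 characters, about five come out wrong on average, which is already enough to break exact string matching and token counts downstream.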
They use existing or newly created technologies to provide project-specific solutions for their project-specific data sets, including the use of publicly available tools like source code repositories (Perseus) as well as hand-crafted solutions (Das Deutsche Textarchiv, Parallel Bible Corpus). Tool reuse can be complicated because of domain-specific circumstances. For instance, it is not unusual to use a whitespace-based word tokenizer for Latin-script languages, which cannot be applied to Chinese texts. It may also be the case that individual tasks in a workflow are considered easier to solve with an improvised script than by investing the effort to evaluate already existing solutions. The result is a set of workflows that consist of an increasingly large set of hand-crafted, project-specific programs. The general consequence is a heterogeneity of technical solutions, which makes it even harder for future researchers to find the tool combinations that are potentially useful for a given research problem. This issue is well known in the digital humanities community, as evidenced by the increasing popularity of digital infrastructures and archival projects like CLARIN (Hinrichs and Krauwer (2014)) and Das Digitale Archiv NRW (Thaller (2013)). With the increasing familiarity, acceptance, generality, and usability of existing tools and frameworks, this variety of (potentially redundant) workflows will probably decrease over time. Source code repositories like Github are already an established technical basis for collaborative text-editing workflows (see https://github.com/PerseusDL or https://github.com/tillgrallert/digital-muqtabas), and mentions of natural language processing tools like the Part-of-Speech Tagger from the Stanford Natural Language Group (commonly referred to as the Stanford Tagger, Manning et al (2014)) rarely require further explanation. Yet, due to domain- and context-specific requirements, and also the fact that tool implementers are often motivated to try out and provide new solutions with their individual sets of advantages and disadvantages, this workflow variety will probably evolve but never completely disappear; for examples, see the justifications for the toolkits that are offered by almost every natural language processing group. It is unlikely that a complicated field like the text-oriented digital humanities, with its vast variety of research questions and potentially incompatible parameter configurations, can be covered by a comprehensive “Jack of all trades” kind of solution. It can also be argued that this would not be a desirable scenario, since a variety of solutions can be expected to be more flexible and to promote improvements by innovation. Even established tools and workflows can be expected to change over time due to updates and technical improvements or complete paradigm shifts, like the currently emerging trend for workflow parallelization.

2.2 Data type & markup Variety

It can be counter-productive not to use established text-markup formats, because the specification of a project-specific and equally capable format requires significantly more effort than the reuse of an existing one. Additionally, since formats like TEI/XML and DocBook already provide comprehensive sets of domain-specific features, it is hard to find acceptance of, and curiosity about, new text markup formats in the research and tool development communities.
It is more likely that future researchers will be trained in established markup formats and will use or extend these for their purposes, as, for example, described in Kalvesmaki (2015). Tool compatibility increases the value of a published data set, and therefore it can be expected that this aspect will develop toward more interoperable data sets in established formats without further external intervention.

2.3 Data availability & access Variety

Access to data sets in the text-oriented digital humanities is generally provided through project-specific websites and solutions, including zipped data dumps (e.g. Textgrid (Neuroth et al (2011)), German Political Speeches (Barbaresi (2012))), source code repositories (e.g. Digital Muqtabas (Grallert (2016)), Perseus), and website-specific catalogues or search forms (e.g. Das Deutsche Textarchiv, Parallel Bible Corpus). No widely accepted universal interface for text data exists. The argument can be made that such a solution could not be implemented earlier because an application-independent reference & retrieval system for text data did not exist. Text data retrieval systems like archives or website catalogues are not designed to be reusable, because they are not meant to provide the basis for other systems but instead a context-specific way to retrieve data. For example, the search catalogue that serves the data from the Parallel Bible Corpus is not designed to also be able to serve the data from Das Deutsche Textarchiv. Therefore, the data references can be expected to be incompatible with other projects. Application-independent reference systems like ISBN (Griffiths (2015)) or DOI (Paskin (2010)) provide reusable identifiers for text resources but do not serve data in any way. They refer to the electronic resource as a whole, which typically correlates to one file or document, while the Canonical Text Service (CTS) protocol (Smith (2009)) extends this principle to individual text passages; a CTS URN such as urn:cts:greekLit:tlg0012.tlg001:1.1-1.10, for instance, identifies a passage range in the Iliad independently of any specific application. This aspect has good potential for improvement. Text referencing and retrieval systems can be combined to provide access to data in an application-independent way, as is already done for complete resources as soon as a reference system like ISBN is integrated into a data archive. Adapting this principle to text passages and combining it with a retrieval web service – as is done with the CTS implementation described in Tiepmar (2018) – can significantly increase interoperability across projects.

3 Conclusion

In summary, it can be stated that Big Data is a complex issue, especially when it is considered in a broad domain like the digital humanities, even if it is restricted to the text-oriented areas of this field. This paper argues that the trivial assumption that Big Data requires large data sets is not necessarily correct in this context, and that other aspects, especially the issue of interoperability, may be more relevant. It also shows that focusing only on volume-related data aspects may result in overlooking a significant number of potentially interesting use cases. Interoperability is further divided into three aspects, and it is shown that one of them – data availability & access – shows huge potential for significant improvements.
This paper lists numerous practically relevant research problems that can be considered Big Data without requiring large data sets, and in the process provides useful starting points and arguments for interested researchers who want to work in this area.

Acknowledgements

Part of this work was funded by the German Federal Ministry of Education and Research within the project ScaDS Dresden/Leipzig (BMBF 01IS14014B).

References

Barbaresi A (2012) German political speeches – corpus and visualization (2nd release). In: Poster Session of the German Linguistic Society, Special Interest Group on Computational Linguistics (DGfS-CL), German Linguistic Society, Special Interest Group on Computational Linguistics (DGfS) / Open Archive of Human and Society Sciences (HAL), Frankfurt (Germany) / Paris (France), URL https://halshs.archives-ouvertes.fr/halshs-00677928

Chaudhuri A, Mandaviya K, Badelia P, Ghosh SK (2017) Optical Character Recognition Systems for Different Languages with Soft Computing. Springer International Publishing, Cham (Switzerland). DOI 10.1007/978-3-319-50252-6

Dienst S, Beseler J (2016) Automatic Anomaly Detection in Offshore Wind SCADA Data. In: WindEurope Summit Conference 2016, University of Leipzig / Global Tech I Offshore Wind GmbH, Leipzig / Hamburg (Germany), URL https://windeurope.org/summit2016/conference/submit-an-abstract/pdf/626738292593.pdf

Geyken A, Haaf S, Jurish B, Schulz M, Steinmann J, Thomas C, Wiegand F (2011) Das Deutsche Textarchiv: Vom historischen Korpus zum aktiven Archiv. In: Digitale Wissenschaft – Stand und Entwicklung digital vernetzter Forschung in Deutschland, Schomburg S, Leggewie C, Lobin H, Puschmann C (eds), Marketing des Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz), Cologne (Germany), p. 157–161, URL https://hbz.opus.hbz-nrw.de/frontdoor/index/index/docId/206

Grallert T (2016) Digital Muqtabas: An open, collaborative, and scholarly digital edition of Muhammad Kurd Ali's early Arabic periodical Majallat al-Muqtabas (1906–1917/18). URL https://github.com/tillgrallert/digital-muqtabas

Griffiths S (2015) ISBN: A History. NISO's Information Standards Quarterly, Summer & Fall 2015 27(2):46–48, URL https://groups.niso.org/publications/isq/v27no2-3/Griffiths/

Hinrichs E, Krauwer S (2014) The CLARIN research infrastructure: Resources and tools for eHumanities scholars. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), Calzolari N (Conference Chair), Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S (eds), European Language Resources Association (ELRA), Reykjavik (Iceland), p. 1525–1531, URL http://www.lrec-conf.org/proceedings/lrec2014/index.html

Kalvesmaki J (2015) Three Ways to Enhance the Interoperability of Cross-References in TEI XML. Symposium on Cultural Heritage Markup, Washington, DC (USA), vol. 16, DOI 10.4242/BalisageVol16.Kalvesmaki01

Manning CD, Surdeanu M, Bauer J, Finkel J, Bethard SJ, McClosky D (2014) The Stanford CoreNLP Natural Language Processing Toolkit. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Baltimore, MD (USA), p. 55–60, DOI 10.3115/v1/P14-5010, URL http://aclweb.org/anthology/P14-5010
Mayer T, Cysouw M (2014) Creating a massively parallel Bible corpus. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), Calzolari N (Conference Chair), Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S (eds), European Language Resources Association (ELRA), Reykjavik (Iceland), p. 3158–3163, URL http://www.lrec-conf.org/proceedings/lrec2014/index.html

Neuroth H, Lohmeier F, Smith KM (2011) TextGrid – Virtual Research Environment for the Humanities. International Journal of Digital Curation 6(2):222–231, University of Edinburgh Library Learning Services, Edinburgh (UK), DOI 10.2218/ijdc.v6i2.198

Oxford Dictionary (2016) Definition of interoperability in English: Interoperability. In: Oxford Dictionaries, Oxford University Press, URL https://en.oxforddictionaries.com/definition/interoperability

Paskin N (2010) Digital Object Identifier (DOI®) System. Tertius Ltd., Oxford (UK), URL http://www.doi.org/overview/080625DOI-ELIS-Paskin.pdf

Quasthoff U, Richter M (2005) Projekt Deutscher Wortschatz. Babylonia 15(3):33–35, Babylonia / Fondazione Lingue e Culture, Bellinzona / Comano (Switzerland), URL http://babylonia.ch/de/archiv/anni-precedenti/2005/nummer-3-05/projekt-deutscher-wortschatz/

Smith DA, Rydberg-Cox JA, Crane G (2000) The Perseus Project: a digital library for the humanities. Literary and Linguistic Computing 15(1):15–25, DOI 10.1093/llc/15.1.15

Smith DN (2009) Citation in classical studies. Digital Humanities Quarterly 3(1), The Alliance of Digital Humanities Organizations (ADHO), URL http://www.digitalhumanities.org/dhq/vol/3/1/index.html

Thaller M (2013) Das Digitale Archiv NRW in der Praxis – Eine Softwarelösung zur digitalen Langzeitarchivierung. Kölner Beiträge zu einer geisteswissenschaftlichen Fachinformatik, Band 5, Verlag Dr. Kovač, Hamburg

Tiepmar J (2018) Implementation and Evaluation of the Canonical Text Services Protocol as Part of a Research Infrastructure in the Digital Humanities. PhD thesis, Leipzig University / Leipzig University Library, Leipzig, URL http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa2-212926
work_hslmqi4fi5dmlgmhf6b7537ugq ----

(Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440

The Consequences of Framing Digital Humanities Tools as Easy to Use

Paige Morgan
p.morgan@miami.edu
ORCID: 0000-0001-8076-7356

ABSTRACT

This article examines the recurring ways in which some of the most popular DH tools are presented as easy to use. It argues that attempts to couch powerful tools in what is often false familiarity directly undermine the goal of encouraging scholarly innovation and risk taking. The consequences of framing digital tools as either easy or more difficult shape the relationship between librarians and the students and faculty whose research they support, and, more broadly, the role and viability of libraries as spaces devoted to skill acquisition.

Keywords: infrastructure, digital humanities, DH tools, DH pedagogy

A digital humanities librarian provides consultations to researchers who are developing or struggling with DH projects. Frequently, these consultations begin with the researcher apologizing and explaining to the librarian their poor aptitude for digital humanities. In many cases, these researchers' prior experience includes a referral to one or more digital humanities tools that have been branded as user-friendly/easy to use. At first, it can look as though this phenomenon is chiefly the result of language and rhetoric used to frame various DH tools — a component influenced by the software industry's move towards graphical user interfaces and marketing software for everyone to use, whether in the workplace or at home, regardless of gender, age, or other factors that affect digital tools. That language remains the article's primary focus. However, the issue is not simply tool-framing language. The taglines and framing in tool documentation are the most visible and stable form, as opposed to more ephemeral instances of language in LibGuides, promotional materials, and workshops and conversations at conferences. Researchers are encountering and struggling with an approach to DH growth and expansion that substantially relies on marketing aspects of DH research as easy.
In other words, this article explores the way that our framing for DH tools and resources shapes researchers' emotions and expectations. Sociologist Susan Leigh Star examined “the work behind the work” in scientific research contexts, meaning “the countless, taken-for-granted and often dismissed practices of assistants, technicians, and students that made scientific breakthroughs possible” (Timmermans 2016, 1). The infrastructure set-up for digital humanities, and the pressures that it places on students, serve as a parallel area of hidden work that can be illuminated. Despite the presence of “easiness” rhetoric in multiple contexts, tool presentation language is often the most concrete example that is available for analysis. Tool presentation language is the material that constitutes users' introduction to the tool — usually the front page of a website, the about page, and any promotional videos — the materials that create a tool's reputation. Instead of residing in a particular tool, or the tool creators' choices, this is a problem within the design of the larger field of the digital humanities, a problem that can remain largely invisible. Recent efforts in library and DH scholarship have focused on illuminating work in digital humanities that tends to go unseen (Shirazi 2016); by unpacking the challenges around tool framing, one can lay the ground for working with them more effectively.

Defining Easiness

Ease of use is one of the most desirable characteristics for any given tool — rivaled only in popularity by the quality of being free. It is not merely a digital humanities fascination — developers have been pursuing the creation of user-friendly graphical interfaces since the late 1960s. That pursuit has its own complex and continuing history, bound up in corporate rivalry and the outsized influence of certain tech leaders, such as Steve Jobs and his fascination with skeuomorphic design. As the tech industry has exerted influence on DH in many ways, it is unsurprising that DH tools have emulated this aspect of tech design. Easiness can seem like an obvious goal for DH support practitioners and tool developers; it goes hand in hand with efforts to democratize the field and make learning and research opportunities more available, regardless of whether institutions have existing and active DH programs. The easier it is to do DH, the more people will try it out — an appealing prospect at a time when humanities departments are looking for ways of asserting their continuing relevance, reinventing themselves in response to cultural shifts, and working to demonstrate that they provide students with job-ready skills. Easiness is attractive in part because it is powerful. The availability of easy-to-use tools shapes DH support infrastructure and affects how DH is incorporated into the classroom, in terms of how much time is needed to show students how to configure a tool and begin using it. For individual scholars developing projects, perceived ease or difficulty can be a deciding factor if there are multiple tools from which to choose and may determine whether the scholar decides to pursue the project at all.
Transitioning to digital from conventional printed scholarship includes an adjustment to iterating through multiple stages, and may involve multiple, modular outputs, such as datasets, websites, and processing workflows (Brown et al. 2009, par. 7). A project's technical ambitiousness and its scholarly ambitiousness will intersect with each other. Depending on a scholar or team's prior experience, the impacts of this intersection may be hard to predict (Brown et al. 2009, par. 6). The problem of unpredictable challenges is complicated further by the pressure researchers face to show their deliverables to colleagues who may be less accustomed to the ups and downs of iteration, but are still called to evaluate it, either for promotion or degree completion. While guidelines and articles from major disciplinary organizations (Modern Language Association 2012; Presner 2012; American Historical Association 2015) discussing the evaluation of digital scholarship acknowledge the iterative nature of digital work, it is harder for such guidelines to prepare colleagues for evaluating mid-stage outputs with aesthetics that may not match the sophistication of the various commercial websites that individuals encounter every day. All these factors contribute to making “easy” tools compelling. Despite its considerable dazzle, easiness is an abstract and intangible quality; the promise of easiness, or an easy-to-use tool, is that some process (whether display, formatting, organization, or analysis) can be accomplished with minimal difficulty, confusion, or extra labor. When such processes are simplified, researchers feel more able to focus their learning on what they perceive as most relevant to their research question and intellectual work. In digital humanities, and in the context of technology generally, easiness is most likely to be associated with tools that are classified as “out-of-the-box,” meaning that they do not require configuration or modification to work, or “off-the-shelf,” meaning that they are standardized, rather than customized, and intended for general audiences to be able to use. Because easiness is abstract, it can be taken as synonymous with other qualities, like speed (cf. various statements about accomplishing a process or analysis with “one click”). Though the variants on “easy” are common in tool branding, terms like “fast” and “simple” are regular alternatives. For many tools, it would be more accurate to say that they make a given process not easy, but easier than an alternative. Easiness is subjective — what is easy for one user may not be for another. It is important to understand that easiness is subjective because it is situated and dependent upon other factors. These factors include the particular nature of the material being worked with (i.e., whether the material is text or image-based), and its condition (i.e., whether a dataset has been examined and normalized), as well as the availability (or lack) of training or experience that provides a user with relevant contextual knowledge. However, researchers may not see this situatedness clearly.
Finally, because easiness is both powerful and subjective, it is value-laden; and it carries a backlash for individuals who expect to find a process or tool to be easy yet discover the opposite. The backlash comes in part from researchers' inexperience with the various interdependencies and situatedness of easiness — many of which are complexities of technological, academic, and library systems and infrastructure. Ideally, a researcher pushes past the backlash, and over time they gain familiarity and experience that help them make choices about their research project or their career with greater autonomy. Part of the reason that claims about easiness have such weight is that they inevitably tell us stories about the available infrastructure and its condition — whether or not there are opportunities to learn a particular skill (e.g., a coding language), and how legible and genuine those opportunities appear to the audience for whom the tool is intended. As a result, scrutinizing easiness rhetoric can be helpful for librarians and administrators who are trying to get a clearer sense of their patrons' needs, or who want to think more critically about the type of support they are providing.

Examples of Easiness Framing

Easiness has become sufficiently important that in digital humanities LibGuides and tool bibliographies, it may be the first or second characteristic mentioned for any tool listed. A typical description might consist of one or two sentences explaining “[Tool] is free and easy to use and allows you to [process/visualize/analyze content].” This sort of description echoes the taglines and catchphrases associated with various tools. Besides Omeka and Scalar, there is Stanford's Palladio (“Visualize complex historical data with ease.”), the Knight Lab's TimelineJS (“Easy-to-make, beautiful timelines”) and JuxtaposeJS (“Easy-to-make frame comparisons”), and CartoDB (“Maps for the web, made easy” – while this is no longer CartoDB's official catchphrase, it is still widely visible in search results). Although qualities such as access, sustainability, and portability are significant concerns in DH, in examining LibGuides and other DH tool roundups, one sees that they are referenced far less often than whether a tool will be easy. The guide authors try to succinctly articulate what each tool is meant to do; what processes it speeds up, facilitates, or makes easier; and the language that is used to present its capabilities and its value to potential users. In order to get a concrete sense of how this language appears, and the promises and assertions that tool framing makes, this article will examine three tools developed specifically for DH use within the last ten years. The point of this examination is not to critique or accuse the tools – they are merely the most concrete and available examples of a more widespread ephemeral phenomenon that shows up not only in written contexts, but also in workshops, webinars, and casual conversations.

Omeka.net

Omeka was released by the Center for History and New Media at George Mason University in 2008, and it is intended for an audience of users in the galleries, libraries, archives, and museums (GLAM) sector, as well as anyone else wanting to build exhibits and collections online.
It allows for the creation of multiple collections of items with metadata structured according to disciplinary or institutional schemas and standards. Users have the ability to follow widespread practices that will make their data interoperable, adjust those schemas to a local house style, or do a bit of each as needed. The sort of functionality that Omeka makes possible is available in software developed for the GLAM community, but is often priced at an institutional level that puts it out of reach of individuals and the smallest institutions. This sort of software may be available as open source and may require experienced tech support personnel to manage the back-end setup and ongoing maintenance. Since the initial release, the Omeka development team has worked to improve the tool's functionality and accessibility, both through the Omeka.net subscription service and by making it available as a “one-click install” through Internet service providers like Reclaim Hosting. Omeka's contributions are remarkable, though hard to explain succinctly for audiences who are unfamiliar with the existing software contexts. Dan Cohen summarized it as “WordPress for your exhibits and collections” at the original release, aiming at a description that would make it easy for people to describe the tool to others. Up until September 2017, Omeka.net featured a prominent tagline: “your online exhibit is one click away.” In its website redesign that tagline was replaced by a less exuberant description: “Getting started is easy with Omeka with our hosted service.” The Omeka.org website continues marketing Omeka via Cohen's original WordPress reference under the heading “Simple to use”: “Our ‘five-minute setup’ makes launching an online exhibition as easy as starting a blog. No code knowledge required.” This rhetoric isn't precisely mismatched, because Omeka does indeed allow users to start adding items and metadata right away. For those already versed in metadata standards and best practices, the main learning curve will involve getting accustomed to the interface. However, many digital humanists coming from departments such as English and History are unlikely to have received this training, and as such, face an additional and substantial learning curve, because there is more to a good Omeka exhibit than simply getting content onto the web. The Omeka.net documentation acknowledges this challenge in its Getting Started section, where it recommends that users plan out their content before building an Omeka website and refers them to Cohen & Rosenzweig's Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web. The Omeka.org documentation goes further, recommending that users sketch out wireframes of their site prior to building it. Both versions of Omeka encourage new users to explore the showcases of existing Omeka sites. But while Omeka may make building an exhibit as easy as blogging on a technical level, its framing is easily misunderstood by users who fail to anticipate the complex intellectual work required to produce a site that is ready to share publicly.
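To make that hidden planning work concrete, here is a minimal sketch of item-level metadata planning of the kind the Omeka documentation recommends. The Dublin Core element names are standard, but the items, the required-field policy, and the date check are invented for illustration and are not part of Omeka or its documentation.

```python
# Illustrative only: a pre-import sanity check for exhibit items described
# with Dublin Core fields. The items and the local policy are hypothetical.
items = [
    {"Title": "Letter to A. Smith", "Creator": "J. Doe",
     "Date": "1893-04-02", "Rights": "Public domain"},
    {"Title": "Photograph of Main Street", "Creator": "",
     "Date": "circa 1900", "Rights": "Public domain"},
]

REQUIRED = ("Title", "Creator", "Date", "Rights")

for i, item in enumerate(items):
    missing = [field for field in REQUIRED if not item.get(field)]
    if missing:
        print(f"Item {i}: missing {', '.join(missing)}")
    # "circa 1900" is legitimate scholarship, but it will not sort or query
    # cleanly; deciding how to record it is exactly the kind of intellectual
    # work that "your exhibit is one click away" leaves out.
    if not item["Date"][:4].isdigit():
        print(f"Item {i}: non-standard date value '{item['Date']}'")
```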
Scalar

Scalar is the creation of the Alliance for Networking and Visual Culture (ANVC) in association with Vectors Journal and the Institute for Multimedia Literacy at the University of Southern California. An open beta version was released in spring 2013, and the current version, Scalar 2.0, was released in late 2015. ANVC presents their work as “explor[ing] new forms of scholarly publishing aimed at easing the current economic crisis faced by many university presses while also serving as a model for media-rich digital publication,” and describes Scalar as a “key part” of this process, facilitating collaboration and material sharing between libraries, archives, scholarly societies, and presses (ANVC: About the Alliance n.d.). These partnerships have resulted in one of Scalar's most unique features: the ability to add images and videos from organizations like the Shoah Foundation and the Internet Archive to a Scalar site by performing a keyword search, selecting results with a checkbox, and clicking a button to import them, along with any associated metadata. This entire process (including the optional step of editing individual item metadata) can be performed within the Scalar user interface. Once imported, users can select from a few different layouts available via a dropdown menu in order to emphasize text or media, or split the emphasis between the two (Scalar: Selecting a Page's Default View, n.d.). The other feature that especially distinguishes Scalar from other CMSs is the structural freedom that it grants users. Where blogging platforms like Blogger, WordPress, and Dreamwidth structure content chronologically, Scalar has no default organizational structure. Instead, it allows users to create pages, which can be combined into paths, annotated, tagged, or used as tags for other content. This gives them multiple options for creating non-linear, nested, radial, recursive, and intersecting narratives. Configuring these choices is accomplished primarily through a Relationships menu at the bottom of each page created, below the main text input window. The actual, final steps of creating an organic structure through a combination of selecting objects and dragging and dropping them within a GUI require far fewer steps in Scalar than they would in any other environment, and are further enhanced by the fact that Scalar includes options to show visual representations of the structure (Path View, Tag View). However, this structural freedom is also the aspect of Scalar that requires the most careful advance planning from users in order to avoid producing a tangle of disconnected, disparate files. As such, its organizational freedom is simultaneously the feature that most complicates Scalar's self-presentation of easiness; a schematic sketch of this kind of structure, and of how easily pages can fall out of it, follows below. Like Omeka, Scalar articulates its claim of easiness through a comparison to blogging (“...if you can post to a blog, you can use Scalar”), pointing to the similarities of the WYSIWYG interface in its text input window and those used by WordPress and other blogging platforms.
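Before turning to Scalar's own framing, here is the promised sketch of the structural planning problem. It is purely illustrative: the page and path names are invented, and this is not Scalar's internal data model, only a schematic of the kind of structure an author must keep in their head.

```python
# Hypothetical site structure: pages grouped into ordered paths and
# non-hierarchical tags, in the manner of Scalar's building blocks.
pages = {"intro", "map", "poem", "audio-essay", "credits", "stray-note"}

paths = {
    "main-route": ["intro", "map", "poem", "credits"],
    "media-route": ["intro", "audio-essay", "credits"],
}
tags = {"sound": {"audio-essay"}, "place": {"map", "poem"}}

# With no default structure, any page that belongs to no path and no tag is
# effectively unreachable: the "tangle of disconnected, disparate files."
reachable = {p for sequence in paths.values() for p in sequence}
reachable |= {p for group in tags.values() for p in group}
print("Orphaned pages:", pages - reachable)  # -> {'stray-note'}
```

Keeping a map like this outside the tool, and checking it as the site grows, is the sort of unglamorous planning work that the framing of easiness leaves invisible.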
The trailer also connects itself to the activity of blogging by emphasizing the simplicity with which authors can work with a wide range of media types — not just how easy it is to “import media directly without cutting and pasting code” but also combining different types of media, such as “tagging poems with videofiles or tagging images with audiofiles.” What the trailer wants to convey is that any media type the user could imagine — from images and text to maps and source code — can be juxtaposed within a Scalar book, all without requiring the book's author to have any knowledge of markup language. This emphasis on diverse media formats is coupled throughout the trailer with statements about Scalar's ability to handle quantity — not only in terms of media, but also that Scalar makes it “easy to work with multiple authors because each author's contributions are tracked and all versions preserved.” As the trailer ends, the narrator reiterates that despite the wide variety of options available (visualizations, paths, annotations, etc.), “all these objects are designed to work together to make it easier for you to create objects to think with — the thinking is still up to you.” As was the case with Omeka, Scalar's claims aren't untrue – it does offer unique functionality that simplifies and streamlines the processes of juxtaposing media and crafting non-linear narratives, and it does so in a way that saves considerable technical labor. In emphasizing its most innovative functionalities, however, Scalar's framing underemphasizes that these functionalities come with their own particular workload. The more complex a narrative structure is, and the more material it contains, the more important it is to have experience managing data with workflows, strict file naming practices, and/or data dictionaries. Without such practices, or a site structure that has been carefully determined in advance, users are more likely to end up with a tangled mess rather than the sophisticated site that they had hoped for. Likewise, Scalar's documentation raises the question of what tool managers tell users to prepare them for the work of developing site structure. Scalar's presentation materials focus on the ease with which Scalar can keep track of multiple users – however, this focus tends to obscure the social decision making that will almost certainly be required; likewise, the emphasis on the freedom to show many different objects skirts around the reality that producing a good site is often a case of learning what not to show in order to keep the narrative streamlined and compelling, rather than simply showing a great quantity of objects.

DH Box

DHBox (http://www.dhbox.org) is currently in development at the CUNY Graduate Center. As the newest of the tools that I have examined in this piece, DHBox is an indication that easy tool rhetoric is still being used. DHBox uses containers to create remote environments in the cloud that are already configured for several popular and powerful DH tools, including IPython, RStudio, WordPress, and Mallet. Containers allow programs to run in virtual environments that are identical, rather than risking the possibility that some users' settings and configurations will generate errors.
Using pre-configured container environments can substantially cut down on the set-up time before students can get started actually using tools. The streamlined setup enables students to work with complex tools like Mallet and the NLTK on their own laptops without needing a physical computer lab, or requiring the instructor to consult or negotiate with campus IT personnel. DHBox makes a few prominent claims about its easiness. A brief statement centered on its front page explains that “setting up an environment for digital humanities computational work can be time-consuming and difficult. DH Box addresses this problem by streamlining installation processes and providing a digital humanities laboratory in the cloud through simple sign-in via a web browser.” The “About” page reiterates that DHBox allows a cloud laboratory to be deployed “quickly and easily” from any computer with an internet connection, promising a device-agnostic lab ready to go in minutes.
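For readers unfamiliar with containers, the following sketch shows what a pre-configured environment looks like in practice. It is not DH Box's actual setup: it uses the Docker SDK for Python and a public Jupyter notebook image purely as an illustration, and it assumes a local Docker daemon is running.

```python
import docker  # Docker SDK for Python (pip install docker)

client = docker.from_env()

# Everyone who runs this receives an identical, pre-configured environment,
# regardless of what is or is not installed on their own machine. The image
# name and port are illustrative choices, not DH Box's configuration.
container = client.containers.run(
    "jupyter/base-notebook",
    ports={"8888/tcp": 8888},
    detach=True,
)
print(container.short_id, container.status)
```

The appeal for instructors is clear: one tested environment replaces many differently configured laptops. The trade-off, as the discussion below suggests, is that the environment being ready says nothing about whether its users are.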
For researchers who are already overburdened, this is an understandable rational economic choice. Users are also looking for tools that give them the ability to fully realize their imaginations, and to produce something new and dramatically different from what non- DH methods allow. This output could be new because it is a highly visual digital exhibit, or because it features non-linear narratives or juxtapositions of strikingly different media, or because it makes it possible for an entire graduate seminar to have access to sophisticated analytical tools like RStudio and Mallet. Users may likewise be looking for tools that allow them to explore a particular method in depth, and achieve mastery, especially within a given period of time, i.e., one semester-long course (Goldstone 2016). Finally, though this is rarely made directly explicit by the tool presentations themselves, users want stability, and to feel that any effort that they make in a tool will be rewarded and worthwhile, rather than failing (Terras 2014a; Terras 2014b). This is most evident in language that gestures towards the tool’s output. Sometimes this is conveyed by promising speed (an exhibit that is one click away) and sometimes by (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 15 promising complexity. Scalar’s creators understand that “important topics require time and sustained attention to be fully explored,” and work to convey to authors that with Scalar, they will be able to create a Scalar book that is worthy of committed attention from readers. While digital humanists may want to avoid spending time acquiring extraneous knowledge, they are drawn to the field because they are willing to make an investment — but they want that investment to “provide a satisfying moment of completion” (Brown 2009, par. 10) or move them closer to being able to declare the project finished (Kirschenbaum 2009, par. 1). In light of these needs, we might ask whether easiness is a quality that digital humanities tool creators should pursue. In “Blunt Instrumentalism: On Tools and Methods,” Dennis Tenen (2016) argues in favor of caution around easiness in DH research, because prioritizing it often comes at the expense of understanding the critical inner workings of analytical tools. Overreliance on out-of-the-box tools can result in researchers confusing the tools themselves with methodologies (117), and the end result is that the scholarship is less finely-grained and rigorous. The best kinds of tools, according to Tenen, are “the ones we make ourselves” – though he acknowledges the formidable labor involved in producing, marketing, and maintaining such tools, especially when working within academic contexts. Tenen characterizes a preference for easiness as a sort of intellectual laziness or lazy thinking, when more attention to method is warranted (118). In some cases, this critique is highly applicable; in others, it fails to take in to account that the preference for easiness is influenced by a lack of infrastructure – and that some tools, like DH Box, are intended specifically to solve the common infrastructure problem of a lack of physical space. Out-of-the-box tools, which (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 16 might be better characterized as “entry-level” DH tools, are arguably fulfilling a community need. 
But whose role and responsibility is it to guide new users through those tools and into the more complex understanding of methodologies that might develop as users become more familiar with them? How libraries fit into DH infrastructure growth Whether identified as “digital humanities” or previous terms like “humanities computing” or “technological humanities,” librarians and scholars have been using tools in research contexts for a long time. The current wave of DH seems to have begun around ten years ago, kicked off in part by the creation and release of affordable and user-friendly tools like Omeka, as well as CHNM’s Zotero citation manager. William Pannapacker’s 2009 pronouncement in the Chronicle of Higher Education that DH seemed like “the first ‘next big thing’ in a long time,” was disputed by digital humanists for whom the field was nothing new — still, Pannapacker’s observation reflected the start of a rise in DH-focused hiring. While the quantity of available new DH-focused positions was overstated in some cases (Risam 2013), there has been demonstrable growth in certain sectors. In 2010, there were two searches for Digital Humanities Librarian jobs, and that number has risen steadily since, with twenty-eight job searches for librarians or similarly titled library-based, front-facing positions (such as Digital Scholarship Coordinator, Digital Scholarship Lead) in both 2015 and 2016 — an indication that libraries are actively working to increase their direct involvement with DH (Morgan and Williams 2015). (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 17 As the field of digital humanities and the number of roles associated with it have grown, various concerns and questions have arisen about how to effectively build infrastructure and support systems that are both productive and scalable. Many of these discussions focus on the roles that libraries and librarians play — whether in supporting DH as a service, being the driving force or an active collaborator in DH growth, or providing much needed guidance for archiving and maintaining digital scholarly work. As projects and tools have been created and aged and sometimes disappeared, the larger DH community has begun to be more aware of the importance of sustainability (Davis 2016). Furthermore, in enterprise-level software and hardware provision, librarians have far more expertise and experience than traditional academic personnel. However, this pressure to achieve success and provide expertise risks becoming unsustainable for libraries themselves, while simultaneously failing to fully acknowledge the contributions that they have made to DH growth. There are several excellent articles and essays discussing the opportunities and challenges that libraries face as they develop involvement and support strategies for digital humanities and digital scholarship. In this instance, I want to focus on the challenges that out-of-the-box, easy-to-use tools seem to have the potential to ameliorate, if not solve completely. These include the tendency to assign librarians or coordinators ample amounts of responsibility for creating digital humanities successes without giving them the necessary authority to do so (Posner 2013, 47), a lack of training opportunities (Posner 2013, 46), and a tendency to award credit for achievements to faculty, rather than library collaborators (Posner 2013, 48). 
These hurdles are further complicated by the sheer variety of requests that occur, many of (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 18 which include requests for time-consuming and non-extensible customization (Vinopal and McCormick 2013, 28). Libraries and librarians are under pressure to produce demonstrable results; to have learned enough from “intensive development for boutique projects” to provide the scalable support that scholars need, often as inexpensively as possible (Maron and Pickle 2014, 30); and to have a reproducible model that can be clearly articulated to stakeholders, and adapted as needed over time. Easy-to-use tools can help with many of these challenges. Because they are branded as entry-level tools, and have documentation, they are positioned to allow librarians to be more hands-off, relieving them of the responsibility for success. If librarians are more hands-off, they are less likely to go uncredited for their work; and if the tools can offer the right balance of restrictions and customization, then the library is absolved of that burden as well. The 2011 ARL SPEC Kit for Digital Humanities survey found that 48% of libraries characterized their digital humanities services as offered on an “ad hoc” basis (Bryson et. al. 2011, 23) — sometimes described as a “service-and-support” model, where projects are initiated by faculty who approach the library with ideas (Posner 2013; Muñoz 2013). An alternate approach is the skunkworks or library incubator model (see Muñoz 2013; Nowviskie 2013), where the library develops DH projects in which it plays a leadership role and allows students and faculty opportunities to be involved. The ad hoc or service-and-support model can be problematic because relatively few members of the campus community have access to it.The skunkworks/incubator model depends on the library having the startup expertise it needs to develop and execute good projects that are compelling to faculty and students, and that provide them with (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 19 opportunities to develop the experience and skills that they see as useful. Even when an incubator can successfully create opportunities that draw faculty and students in, access can be fairly limited. Both of these models have risks in terms of sustainability and scalability. A third model has emerged, one that is more scalable and sustainable — let’s call it “lightweight-service-and-support.” This model may include one or more dedicated personnel, i.e. a DH librarian or specifically DH programmer, but it is resource- conservative, and cautious about providing too much one-to-one guidance that would be unfair to other support seekers, because such guidance would not scale, and would quickly constitute a significant/unsustainable time commitment for the librarian or team. The lightweight-service-and-support model relies heavily on easy-to-use tools, which offer researchers several options while still scaling well to a library’s support capacity. The tools’ user community, documentation, and their popularity (which can result in how-to videos and example projects) helps to lessen the amount of training, management, and outreach that librarians need to do. 
This model looks very similar to the second tier of support that Vinopal and McCormick (2013) explain how the supported tools “should offer a fixed set of templates, so users can pick the format, style, or functionality that best meets their needs … If services at this level are well- designed and supported, a majority of scholars could rely on these sustainable alternatives to one-off solutions” (32). Vandegrift and Varner likewise gesture towards this model when they provide a concise formula for how libraries should conceptualize their DH offerings: “the goal is to have the fewest tools to support that meet the most needs” (2013, 71). Lightweight-service-and-support need not be the only tier of the (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 20 model as Vinopal and McCormick’s four-tiered model makes clear; however, in the absence of resources for higher tiers to develop potentially ground-breaking and grant- winning projects, lightweight-service-and-support can still serve a wide range of community members. Establishing practices and models that can help make DH in libraries sustainable and scalable is important work that can and will help libraries continue evolving along with scholarly disciplines. But are the practices that are scalable and sustainable for libraries equally sustainable and scalable for the faculty and students who look to the library for DH opportunities? DH as scalable and nonscalable To explain further, anthropologist Anna Lowenhaupt Tsing defines scalability as the ability to expand without having to rethink or transform the underlying basic elements. She examines scalability as a specific approach to design — one that has allowed for both the precision of the factory and the computer; and she argues that scalability is so ubiquitous and powerful that it stops us from noticing the aspects of the world that are not scalable. To push back against this suppressive impulse, Tsing’s nonscalability theory is to allow us to see “how scalability uses articulations with nonscalable forms, even as it denies or erases them” (Tsing 2012, 506). Scalability prioritizes and values precision-nested fit — and it is the driving force behind much of our current infrastructure. The goals of nonscalability theory are to focus on perceiving the heterogeneous and nonscalable forms and understand that they, too, have roles to play in growth. At the heart of nonscalability theory is the question of how we look at, (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 21 and how we handle, the idea of diversity — specifically, the diversity of objects that do not fit within the precision-nested growth structures of scalability. Diversity, argues Tsing, isn’t simply different — it can contain the potential for transformative change. Rawson and Muñoz (2016) adapt Tsing’s theoretical framework to unpack and examine their work “cleaning” data in the NYPL’s “What’s On the Menu?” archive, featuring over one hundred years of menus from restaurants, cafés, hotels, and other dining establishments. They argue that the concept of “data cleaning” and the use of the phrase “data cleaning” obscure the complex and heterogeneous details of the process as well as the degree to which it is high-stakes critical work with far-reaching effects that can impact the value of research findings. 
To reduce that process to “data cleaning” is to misunderstand a highly nonscalable process as a scalable one. Rawson and Muñoz set out to “clean” and normalize the data of different dishes and food items within the collection. Although the NYPL had arranged the menus in the collection to be interchangeable objects within the catalog, and although menus have a common overall format (i.e., food items with prices, grouped according to particular meals or particular sections of meals), each menu showed considerable variation. Some of this variety was straightforward to normalize (e.g., fifteen variant listings for potatoes au gratin). To clean this data would be to make it scalable — to allow users to query the entire archive of menus to understand when, where, and how potatoes au gratin appeared, and get an accurate answer. However, as they worked to clean the data so that it would help answer research questions about the effect of wartime food rationing on menus or the changing boundaries of what constituted a dish over time, Rawson and Muñoz began to understand that reducing variants to a single value was “not a self- (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 22 contained problem, but rather an issue that required returning to [their] research questions and investigating the foods themselves.” The individual menu items’ heterogeneity was central to answering the research questions, and what was needed was not to make each food item scalable, but instead to create a dataset that would be compatible with the NYPL archive and illuminate (and allow users to interact with) the nonscalable heterogeneous aspects of the menu contents. Becoming aware of the pressures of scalability can be difficult even for experienced digital humanists. Rawson and Muñoz explain that when they began “cleaning” their data, they saw their main challenge and goal as “processing enough values quickly enough to ‘get on with it’” (page). The characteristics associated with scalability — speed, simplicity, and unimpeded growth — have considerable overlap with the characteristics associated with easiness. The tools we use — whether we are their creators or their consumers — are not immune to the pressure to be scalable. Tsing’s theory of nonscalability, which Rawson and Muñoz have shown to have considerable implications for how we conceive of our goals when working with data, is equally relevant to both DH projects and to the infrastructure that we build for people who are working on them. DH projects are nonscalable. This means that they are particularly nonscalable with various out-of-the-box tools (not only Omeka and Scalar) because as Tsing explains, scalability is the “ability to expand without distorting the framework” (Tsing 2012, 523). Tools designed to present and process data may appear or present themselves as though they come with that framework in place. Omeka has items and item types with metadata categories; Scalar has pages, paths, and tags — but these components are building blocks, and a highly incomplete framework, if they (Accepted Manuscript) Version of Record at https://www.tandfonline.com/doi/full/10.1080/10691316.2018.1480440 CUL: Easy Tools Submission: 23 can be said to be a framework at all. And this is precisely as it should be — they are there to be distorted, or, rather, to be transformed, as researchers’ projects take shape. 
When tools present themselves as easy, quick, and simple, they are promising the user that working with them will be scalable. And when those of us who are in the position of introducing those tools reiterate and reinforce that presentation, we are likewise telling researchers that they should expect scalability and strive for it, despite the fact that they are engaging in an eminently nonscalable process. We are encouraging them to imagine the complex diversity of their material without preparing them for the transformative process that including it will require. Instead of helping them learn to see heterogeneity, and find effective ways of interacting with it, by training them to expect easiness we are leaving an empty space in their preparation, and that space is as likely as not to end up filled with a conviction of their own inadequacy. The consequence is not only this emotional plunge. Out-of-the-box tools may successfully circumvent technical work, but in doing so, they may also bypass the thought process of imagining a research question and its answers beyond the constraints and affordances of a single tool. This can impact the depth and richness of the answer to the research question, as well as the project’s long-term sustainability. Thinking beyond the capabilities of a particular tool can also be an opportunity for researchers to utilize their existing disciplinary expertise in making decisions about data categories and relationships between materials, and, in the process, gain much-needed confidence for future experimentation, allowing them to work with less dependence upon librarians or other support personnel.

Possible avenues for intervention

The ways that “easiness” rhetoric can shape tool users’ expectations and experiences are a challenge. This challenge intersects with a related problem, namely, that the community of practice in DH is still grappling with how best to incorporate data modeling. A data model defines the objects or entries that a database (or really any data presentation system, including content management systems) contains, and it sets out the rules for how different pieces of data are connected with each other. Entries may also carry additional data that modifies them: a data model about individuals might include their nationality, for example, and, depending on the focus of the database, one part of the model might specifically define how to record complexities around nationality, such as individuals who are born in one country to parents who are citizens of another country (a minimal sketch of such a model follows below). Effectively incorporating data modeling involves articulating the questions and complexities that accompany it in humanities contexts, and it involves the work of disseminating that articulation and/or training DHers to understand their work with various tools as data modeling. Posner has previously noted that “humanists have a very different way of engaging with evidence than most scientists or social scientists” (Posner 2015). For example, close reading is more likely to work towards describing a specific pattern within a text and tracing it from its start to its end point. The focus of many traditional humanities scholarly essays is identifying and elucidating one or a small number of objects which are unique.
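To make the definition of a data model given above concrete, here is a minimal sketch of what such a model for individuals might look like. The field names and the nationality example are hypothetical, following the scenario described above rather than any particular tool’s schema.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical data model for a database of individuals (see the
# definition above). The "rules" live in the types: every Person may
# carry several nationality records, so that complex cases (born in one
# country to parents who are citizens of another) can be expressed
# rather than flattened into a single value.
@dataclass
class Nationality:
    country: str
    basis: str  # e.g. "birthplace" or "parental citizenship"

@dataclass
class Person:
    name: str
    nationalities: List[Nationality] = field(default_factory=list)

p = Person("Example Person",
           [Nationality("France", "birthplace"),
            Nationality("Italy", "parental citizenship")])
```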
To use Tsing here: humanities research is much more focused on illuminating and celebrating nonscalability; thus, it is no surprise that humanists have, even within the DH community, hesitated about invoking the idea of “data” in relation to their work. However, organizing data is what allows researchers to produce scholarship (Posner 2015). When the Omeka documentation suggests that users should plan their site before beginning to use the tool, it is obliquely suggesting that scholars need to develop a data model that allows an Omeka site to be driven by a more complex principle than “let me show you all my stuff.” Scalar users face the same challenge, perhaps even more so, since in Scalar the capacity for non-linear and intersecting paths, plus the ability to display both text-focused and media-focused pages, means that scholars could conceivably be working with two interlocking data models: one for their narrative and one for their non-narrative content. And this need applies to other DH tools as well, including several of the tools available through DHBox. Data modeling is not easy work, but helping students understand how it fits into the process of working with so-called “easy” tools would be one way of preparing them better. This example (and potential impact) of data modeling underscores that the problems created by easy-tool rhetoric cannot simply be attributed to the tool creators and the teams that designed and wrote their publicity materials. If our libguides and workshop promotional materials draw on the same tool presentation that emphasizes easiness, then we are also using easiness rhetoric just as the tool makers are. Who has the responsibility and capacity to intervene in this situation? What kind of intervention is appropriate? While tool creators bear some responsibility, there is, in most cases, a gap between the authors of a tool’s presentation site and its readers. Librarians who are mentoring students and faculty who are learning new tools, or who are in charge of designing and maintaining a local infrastructure system, are positioned to fill that gap because they are usually closer to the learners than the tool creators are. Given humanists’ uncertainty around thinking of their materials as data (Keener 2015, par. 33), librarians and instructors offering basic tool trainings are more likely to be successful because they can have conversations that go both ways in consulting contexts. Our models for DH development and support in libraries need to consider not only what tools to provide, but also how those tools’ capabilities and reputation shape infrastructure, and how we can design around the tools’ rhetoric in response. In “On Nonscalability,” Tsing points out several examples in which scalability has been achieved in part through a reliance on disciplined labor. One example that she uses is that of sugar cane cutters in Puerto Rico in the 1950s. The workers had a limited time frame in which to work, and their working conditions were crowded and dangerous, especially because of the sharp machetes that each worker used. The result was that “workers were forced to use their full energy and attention to cut in synchrony and avoid injury” (Tsing 2012, 512).
By disciplining themselves to learn the skill of synchronous cutting, they solved the company’s problem, and transformed themselves from nonscalable individuals into a scalable work force. Disciplined labor can be created whenever a powerful entity (a factory, a corporation, or even a library) identifies an infrastructural problem that it then leaves to less powerful individuals to solve by changing themselves in some way. The creation of disciplined labor is not necessarily malicious. In the context of library infrastructure for DH tools, the problem is the nonscalability of individual DH projects versus the scalable support that we offer in the form of entry-level tools. Because the tools present themselves as easy to use, it is easier for libraries (and departments) to decide that only minimal training is needed, and that the rest can be left to the students themselves. The students become disciplined laborers because they see DH tool facility as leading both to greater prestige and to jobs. Even when tools represent genuine advances in terms of what is possible, the potential for problems exists. Scalar, Omeka, DHBox, and numerous other tools that can be used for DH make it possible for researchers to produce scholarly objects that would otherwise not have been possible without months or sometimes years of training. DHBox takes three tremendous difficulties (money, space, staff) and transforms them into a different difficulty (an individual user’s knowledge of servers and the command line). Scalar and Omeka take the challenge of needing knowledge of databases, HTML, and CSS and transform it into the need for a user to understand how to develop an effective data model. All three tools are beneficial to the larger community of practice of digital humanities, and yet all three can be problematic as well, because, through the combination of the way that libraries use them in building DH infrastructure and the way that the tools present themselves, they shift tremendous responsibility for success directly onto the individual user and that user’s capacity to pick up wide-ranging (and not always easily accessible) knowledge on the fly. The resulting phenomenon is a form of what economist Jacob Hacker (2008) has identified as “risk shift.” Hacker identifies risk shift by tracing changes in frameworks for economic protection (including banking, income, healthcare, and retirement). Risk shift is the phenomenon by which support provided by larger corporate and social entities (employers, insurance companies, banks) is withdrawn, and responsibility for preventing risks is placed on individual families. While Hacker’s research traces this phenomenon through the larger American employment system, sociologist Tressie McMillan Cottom’s recent book Lower Ed: The Troubling Rise of For-Profit Colleges in the New Economy argues that the same risk shift can be seen in the higher education system, as credential costs that used to be supported by federal grants have shifted more onto students.
A certain reliance on DH tools marketed as “easy to use” creates a similar risk shift for our students and faculty learning to use them, including librarians who are working with limited amounts of time to pick up DH skills and experience. There is no simple solution to the problems that can be created by “easiness” rhetoric. Certainly, the answer is not that the tools featuring it are bad and that we should stop using them. Nor is it for us to take the reverse approach and brand the tools as ultra-challenging, suitable only for hardcore data nerds (a problematic approach that has surfaced in DH in the past, in debates about hacking vs. yacking; see Cecire 2012; Nowviskie 2016). Training and dialogue specifically focused on data modeling throughout the community would be very helpful, but it will take time for that to happen. If it does, it will be well augmented by a more complex understanding among DH infrastructure providers (whether in libraries, centers, or departments) of what scalability means with regard to DH. Among other things, this more complex understanding might involve scrutinizing what needs tools are meeting, especially as those needs are framed by the tools’ marketing and self-presentation, and considering how those needs might shape infrastructure. One specific aspect of this might involve looking at the differences between what tool presentation leads users to think they need (i.e., lots of different types of media) and the contextual knowledge that more experienced digital humanists know they need (including naming conventions, data models, etc.). This does not mean that libraries necessarily have to dramatically increase their DH infrastructure investment or expend substantially more resources: if we are alert, deliberate, and proactive, it is possible to build infrastructure that is scalable both for libraries and for our users.

Conclusion

When researchers embarking on a digital humanities project look for the right tool, the perceived easiness of that tool is an important consideration. Tools that can provide an easy-to-use experience are becoming an important part of library infrastructure for DH because they seem to require less support and labor from library personnel involved in introducing DH methodologies to students and faculty. However, tools branded as “easy to use” can create a backlash in which users’ research stalls and they blame themselves when a particular tool turns out to be more difficult than they expected. This article has sought to better understand the challenges presented by easy-tool rhetoric for DH service providers by examining the presentation and documentation of three digital humanities tools. This examination revealed that, though the tools have made valuable contributions that substantially simplify certain technical aspects of producing websites and multimedia objects, the rhetoric of their presentation tends to elide the vital and challenging critical thinking that users must do while using the tools. This elision underscores key competencies, such as data modeling, that the larger digital humanities community is only just beginning to grapple with. Libraries have an important role to play in helping tool users develop the knowledge that will avoid the backlash of easy tools.
[Many thanks to Yvonne Lam for invaluable conversations throughout the development of this essay; and to Alex Gil, Yvonne Lam, Emily McGinn, Roopika Risam, and Rachel Shaw for feedback on earlier versions.]

References

“Alliance for Networking Visual Culture.” n.d. http://scalar.usc.edu/

“Alliance for Networking Visual Culture » About The Alliance.” n.d. http://scalar.usc.edu/about/

American Historical Association. 2015. “Guidelines for the Professional Evaluation of Digital Scholarship by Historians.” American Historical Association. June. https://www.historians.org/teaching-and-learning/digital-history-resources/evaluation-of-digital-scholarship-in-history/guidelines-for-the-professional-evaluation-of-digital-scholarship-by-historians

Brown, Susan, Patricia Clements, Isobel Grundy, Stan Ruecker, Jeffery Antoniuk, and Sharon Balazs. 2009. “Published Yet Never Done: The Tension Between Projection and Completion in Digital Humanities Research.” Digital Humanities Quarterly 3(2). http://www.digitalhumanities.org/dhq/vol/3/2/000040/000040.html

Bryson, Tim, Miriam Posner, Alain St. Pierre, and Stewart Varner. 2011. Digital Humanities (SPEC Kit 326). http://publications.arl.org/Digital-Humanities-SPEC-Kit-326/

Cecire, Natalia. 2012. “When Digital Humanities Was in Vogue.” Journal of Digital Humanities 1(1). http://journalofdigitalhumanities.org/1-1/when-digital-humanities-was-in-vogue-by-natalia-cecire/

Cohen, Dan. 2008. “Introducing Omeka.” Dan Cohen (blog), February 20. http://www.dancohen.org/2008/02/20/introducing-omeka/

Cottom, Tressie McMillan. 2017. Lower Ed: The Troubling Rise of For-Profit Colleges in the New Economy. The New Press.

Davis, Robin Camille. 2016. “Die Hard: The Impossible, Absolutely Essential Task of Saving the Web for Scholars.” Presentation, Eastern New York Association of College & Research Libraries Meeting, May 23. http://academicworks.cuny.edu/jj_pubs/76/

“DH Box.” n.d. http://dhbox.org/

“DH Box: About.” n.d. http://dhbox.org/about

Goldstone, Andrew. 2016. “Teaching Quantitative Methods: What Makes It Hard (in Literary Studies).” Pre-print (forthcoming in Debates in the Digital Humanities 2018). https://doi.org/10.7282/T3G44SKG

Hacker, Jacob. 2008. The Great Risk Shift: The New Economic Insecurity and the Decline of the American Dream. Oxford, New York: Oxford University Press.

Keener, Alix. 2015. “The Arrival Fallacy: Collaborative Research Relationships in the Digital Humanities.” Digital Humanities Quarterly 9(2). http://www.digitalhumanities.org/dhq/vol/9/2/000213/000213.html

Kirschenbaum, Matthew G. 2009. “Done: Finishing Projects in the Digital Humanities.” Digital Humanities Quarterly 3(2). http://www.digitalhumanities.org/dhq/vol/3/2/000037/000037.html

Maron, Nancy L., and Sarah Pickle. 2014. “Sustaining the Digital Humanities: Host Institution Support beyond the Start-Up Phase.” Ithaka S+R. https://digital.library.unt.edu/ark:/67531/metadc463533/m2/1/high_res_d/SR_Supporting_Digital_Humanities_20140618f.pdf

Modern Language Association. 2012.
“Guidelines for Evaluating Work in Digital Humanities and Digital Media.” Modern Language Association. January. https://www.mla.org/About-Us/Governance/Committees/Committee-Listings/Professional-Issues/Committee-on-Information-Technology/Guidelines-for-Evaluating-Work-in-Digital-Humanities-and-Digital-Media

Morgan, Paige, and Helene Williams. 2016. “The Expansion and Development of DH/DS Librarian Roles: A Preliminary Look at the Data.” Presentation, Digital Libraries Federation Forum 2016. https://osf.io/vu22f/

Munoz, Trevor. 2013. “In Service? A Further Provocation on Digital Humanities Research in Libraries.” dh+lib. June 19. http://acrl.ala.org/dh/2013/06/19/in-service-a-further-provocation-on-digital-humanities-research-in-libraries/

Nowviskie, Bethany. 2013. “Skunks in the Library: A Path to Production for Scholarly R&D.” Journal of Library Administration 53(1): 53–66. doi:10.1080/01930826.2013.756698

———. 2016. “On the Origin of ‘Hack’ and ‘Yack.’” In Debates in the Digital Humanities, edited by Lauren F. Klein and Matthew K. Gold. University of Minnesota Press. http://dhdebates.gc.cuny.edu/debates/text/58

“Omeka.Net.” n.d. http://www.omeka.net/

“Omeka.Net: About.” n.d. http://info.omeka.net/about/

Pannapacker, William. 2009. “The MLA and the Digital Humanities.” HASTAC (blog). December 30. https://www.hastac.org/blogs/nancyholliman/2009/12/30/mla-and-digital-humanities

Posner, Miriam. 2013. “No Half Measures: Overcoming Common Challenges to Doing Digital Humanities in the Library.” Journal of Library Administration 53(1): 43–52. doi:10.1080/01930826.2013.756694

Posner, Miriam. 2015. “Humanities Data: A Necessary Contradiction.” Miriam Posner’s Blog. June 25. http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/

Presner, Todd. 2012. “How to Evaluate Digital Scholarship.” Journal of Digital Humanities 1(4). http://journalofdigitalhumanities.org/1-4/how-to-evaluate-digital-scholarship-by-todd-presner/

Rawson, Katie, and Trevor Muñoz. 2016. “Against Cleaning.” Curating Menus (blog), July 6. http://www.curatingmenus.org/articles/against-cleaning/

Risam, Roopika. 2013. “Where Have All the DH Jobs Gone?” Roopika Risam (blog), September 15. http://roopikarisam.com/uncategorized/where-have-all-the-dh-jobs-gone/

“Scalar 1 User’s Guide: Creative Use of Structure.” n.d. Scalar 1 User’s Guide. http://scalar.usc.edu/works/guide/creative-use-of-structure

“Scalar 1 User’s Guide: Selecting a Page’s Default View.” n.d. Scalar 1 User’s Guide. http://scalar.usc.edu/works/guide/selecting-a-pages-default-view

Shirazi, Roxanne. 2016. “Conditions of (In)Visibility: Cultivating a Documentary Impulse in the Digital Humanities.” Invisible Work in the Digital Humanities Symposium, Florida State University, November 17-18. https://www.youtube.com/watch?v=28LIvujbrS8

Tenen, Dennis. 2016. “Blunt Instrumentalism: On Tools and Methods.” In Debates in the Digital Humanities 2016, edited by Lauren F. Klein and Matthew K. Gold. University of Minnesota Press.

Terras, Melissa. 2014a. “A Decade in Digital Humanities.” Melissa Terras’ Blog (blog), May 27. http://melissaterras.blogspot.com/2014/05/inaugural-lecture-decade-in-digital.html

———. 2014b.
“Reuse of Digitised Content: Chasing an Orphan Work Through the UK’s New Copyright Licensing Scheme.” Melissa Terras’ Blog (blog). February 4. http://melissaterras.blogspot.com/2014/10/reuse-of-digitised-content-4-chasing.html

Timmermans, Stefan. 2016. “Introduction: Working with Leigh Star.” In Boundary Objects and Beyond, edited by Geoffrey C. Bowker, Stefan Timmermans, Adele E. Clarke, and Ellen Balka. Cambridge, Massachusetts: The MIT Press.

Tsing, Anna Lowenhaupt. 2012. “On Nonscalability: The Living World Is Not Amenable to Precision-Nested Scales.” Common Knowledge 18(3): 505–524. doi:10.1215/0961754X-1630424

Vandegrift, Micah, and Stewart Varner. 2013. “Evolving in Common: Creating Mutually Supportive Relationships Between Libraries and the Digital Humanities.” Journal of Library Administration 53(1): 67–78. doi:10.1080/01930826.2013.756699

Vinopal, Jennifer, and Monica McCormick. 2013. “Supporting Digital Scholarship in Research Libraries: Scalability and Sustainability.” Journal of Library Administration 53(1): 27–42. doi:10.1080/01930826.2013.756689

work_hucex7asuzhu5esayctslwqdm4 ----

Digital Humanities 2010. The Importance of Pedagogy: Towards a Companion to Teaching Digital Humanities. Hirsch, Brett D. (University of Western Australia); Timney, Meagan (University of Victoria).

The growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn the sorts of “transferable skills” and “applied computing” that digital humanities offers (Jessop 2005), and the desire of practitioners to consolidate and validate their research and methods.
We propose a volume, Teaching Digital Humanities: Principles, Practices, and Politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. We plan to structure the volume according to the four critical questions educators should consider, as emphasized recently by Mary Breunig, namely:

- What knowledge is of most worth?
- By what means shall we determine what we teach?
- In what ways shall we teach it?
- Toward what purpose?

In addition to these questions, we are mindful of Henry A. Giroux’s argument that “to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak” (45). Consequently, we will encourage submissions to the volume that address these wider concerns.

References

Breunig, Mary (2006). ‘Radical Pedagogy as Praxis’. Radical Pedagogy. http://radicalpedagogy.icaap.org/content/issue8_1/breunig.html

Giroux, Henry A. (1994). ‘Rethinking the Boundaries of Educational Discourse: Modernism, Postmodernism, and Feminism’. Margins in the Classroom: Teaching Literature. Myrsiades, Kostas, Myrsiades, Linda S. (eds.). Minneapolis: University of Minnesota Press, pp. 1-51.

Schreibman, Susan, Siemens, Ray, Unsworth, John (eds.) (2004). A Companion to Digital Humanities. Malden: Blackwell.

Jessop, Martyn (2005). ‘Teaching, Learning and Research in Final Year Humanities Computing Student Projects’. Literary and Linguistic Computing 20.3 (2005): 295-311.

McCarty, Willard, Kirschenbaum, Matthew (2003). ‘Institutional Models for Humanities Computing’. Literary and Linguistic Computing 18.4 (2003): 465-89.

Unsworth et al. (2006). Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences. New York: American Council of Learned Societies.

work_hx7uxdq4obarpkoubgihxbdala ----

Ancient Spaces and Possible Futures: Classical Geography in the Digital Humanities

Chiara Palladino, «FuturoClassico» n. 4, 2018, pp. 149-177. ISSN 2465-0951. © 2018 Centro Interuniversitario di Ricerca di Studi sulla Tradizione.

Introduction[1]

In a famous contribution that appeared in the «Digital Humanities Quarterly»[2] in 2009, Tom Elliott, one of the leading exponents
of “digital classical geography,” reflected on how the landscape of his studies would have changed by 2017.[3] He imagined having at his disposal an immense mapping system that could show him, in a few moves, the geographic coordinates of all his bibliographic references; being able to model a Roman-era itinerary automatically on a map, and to explore specific sections of it, tracing them back to the manuscripts that transmitted the text and to their respective editions; visualizing different options for mapping the described journey depending on the connections traced between the various localized areas; carrying out comparative analyses in a few steps, tracking down similar passages in other sources, comparing the sequence and the network of places and identifying where they overlapped and what the differences consisted in, for example in the spelling of the toponyms in the various manuscript witnesses; extending this analysis to the expression of distances, in order to analyze their internal coherence, their agreement with other sources, and so on. «We expect», he wrote, «that in 2017 the geo-computational revolution currently underway, having intersected with trends in computing and in society, will contribute to changing in a significant way how research is thought about and disseminated».[4]

[1] To claim to summarize in these few pages the entire panorama of “digital classical geography” would be not only extremely pretentious, but also very naive. On the basis of my own, entirely debatable, personal judgment, I have chosen some recent projects that seem to me representative, with the aim of giving a general idea of the situation in this field, addressed to an audience of classicists of a more traditional stamp. This selection necessarily entails partiality, for which the responsibility is entirely mine. The meticulously curated site Ancient World Online (AWOL) provides a certainly broader and more complete overview of all the initiatives concerning ancient and modern geography in digital research: http://ancientworldonline.blogspot.de/2012/09/roundup-of-resources-on-ancient.html. For the willing reader who wishes to explore the topics treated here through more general readings, I refer to a small glossary of the technical terms of the Digital Humanities currently being published: https://github.com/ChiaraPalladino/TuftsDCC/wiki/DH-words-vademecum.

[2] A discussion of the origins, history, and trends of the so-called Digital Humanities goes beyond the space and aims of this discussion. A basic orientation is provided by the dated, but still solid, S. Schreibman, R. Siemens, J. Unsworth, Companion to Digital Humanities, Blackwell, Oxford 2004. See also M. Dacos-P. Mounier, Humanités numériques. État des lieux et positionnement de la recherche française dans le contexte international, OpenEditions: Institut Français, Marseilles 2014, and G. Bodard, S. Mahony, Digital Research in the Study of Classical Antiquity, Routledge, London 2016. In the field of classical philology, I point to the recent contribution by T. Koentges, Classical text and the digital revolution, «The Amphora Issue» XLIII, 2, 2015, pp. 31-46, with useful bibliography.

[3] T. Elliott-S. Gillies, Digital Geography and Classics, «Digital Humanities Quarterly» III, 1, 2009, http://digitalhumanities.org/dhq/vol/3/1/000031/000031.html.

[4] «We envision a 2017 in which the geo-computing revolution, now underway, has intersected with other computational and societal trends to effect major changes in the way humanist scholars work, publish and teach» (ibid.).
Elliott added, however: «For humanists, the tasks of traditional research will remain largely unchanged: the discovery, organization, and analysis of primary and secondary sources, with the ultimate aim of communicating and disseminating results and information for the use and education of others». Thus, without denying the necessary broadening of horizons brought about by the undeniable challenge posed by the new technologies, he keeps insisting that the method of scientific research consists, and will continue to consist, in a series of steps of hypothesis and verification, solidly grounded in factual evidence and always practiced with the utmost intellectual honesty. «What we expect to change», Elliott continued, «is the coming into force of a much more broadly collaborative way of working, in which a far greater percentage of working time is spent on analysis and professional communication, all supported by a pervasive, always-on network. Much of the solitary and tedious work of text mining,[5] bibliographic research, and information management will be handled by computational tools, but we (emphasis mine) will become more responsible for the quality and effectiveness of our work by virtue of how we disseminate the results of our research. […] The information that the machines will be able to return will be drawn from a “global pastiche” of digital archives and publication mechanisms, which will cover virtually every new academic publication, as well as digital reproductions of much of the printed, graphic, and audio production in circulation today».[6]

[5] In computing parlance, mechanical analysis (text mining or data mining) is a process that consists in deriving and extracting, from a text or an unstructured corpus of texts, types of information expressed in a more or less systematic way, which machines can be programmed to recognize automatically: for example, the identification of proper names classified as persons and places, specific syntactic constructions, bibliographic references, glosses, terms in other languages, or elements of a given lexical domain (e.g. military jargon or the sphere of feeling). In historical research, Natural Language Processing (NLP) is currently one of the most fashionable methods of text mining: for a simple introduction see M. Piotrowski, Natural Language Processing for Historical Texts, Morgan & Claypool, San Rafael 2012. See also A. Kao-S.R. Poteet, Natural Language Processing and Text Mining, Springer, London 2007. It must be said, however obvious, that text mining in no way exempts the student of a text from analyzing and reflecting on what he reads: on the contrary, its purpose is precisely to increase the working time devoted to such creative operations, in which the machine cannot substitute for the human being, by relieving the burden of the more mechanical ones, such as gathering and copying out the elements of interest by hand. It does not exempt from, much less eliminate, the need for close reading, for interpretation, and above all for checking the data extracted by the machine against their original context (in other words, the essential methodological steps of any serious research).
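To give a concrete sense of the kind of text mining described in note 5, here is a minimal sketch using the open-source spaCy library on English text. The sample sentence is invented for illustration, and, as the article goes on to explain, accuracy on historical languages would be far lower than on modern ones.

```python
# A minimal named-entity extraction sketch (cf. note 5), using the
# open-source spaCy library on an English sentence invented for the
# example. The pretrained model must be installed beforehand.
import spacy

nlp = spacy.load("en_core_web_sm")  # small pretrained English model

text = ("Alexander, son of Philip, marched from Pella "
        "and later founded Alexandria in Egypt.")

doc = nlp(text)
for ent in doc.ents:
    # ent.label_ is e.g. PERSON for people, GPE for geopolitical places
    print(ent.text, ent.label_)
```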
This interpretation of the future, in part deliberately utopian, arose from the innovations of a decade, the first of the 2000s, in which geolocation and the introduction of highly advanced mapping techniques effectively revolutionized our way of interpreting, and orienting ourselves within, the space around us. Elliott was thus echoing the impact on the Digital Humanities of the so-called spatial turn,[7] a movement of thought that began between the end of the 1800s and the 1960s, but was truly unified on the intellectual level only in the 1970s, when it took on the characteristics of an intellectual tendency that in those years was rediscovering the importance of space as an entity useful for understanding various aspects of the world: the dynamics of power and of the economy, religious symbolism, territoriality. At its basis is the idea that space cannot be reduced to simple natural geography, but is shaped, produced by society,[8] as the expression of society in its various aspects. The corollary of this claim is, obviously, that the description of space does not answer to objective principles,[9] but is rather fragmented, complex, composed of multiple languages and factors. This “revolution” had a great impact on literary criticism, and stimulated the launch of extremely important theoretical reflections based on interpretations of space and time in narrative universes.[10] But an even greater impact was registered in the technologies of navigation: the decisive importance assumed by space in modern society contributed to the creation, in the 1960s, of the Geographic Information System (GIS), and to its inevitable extension, in the 1990s, to historical and archaeological research.

[6] «For Humanists, general research tasks will remain largely unchanged: the discovery, organization and analysis of primary and secondary materials with the goal of communicating and disseminating results and information for the use and education of others. But we expect to see a more broadly collaborative regime in which a far greater percentage of work time is spent in analysis and professional communication, all underpinned by a pervasive, always-on network. Much of the tedious and solitary work of text mining, bibliographic research and information management will be handled by computational agents, but we will become more responsible for the quality and effectiveness of that work, by virtue of how we publish our research results. [...] The information offered us in return will be drawn from a global pastiche of digital repositories and publication mechanisms, surfacing virtually all new academic publication, as well as digital proxies for much of the printed, graphic and audio works now for sale, in circulation or on exhibit in one or more first-world, brick-and-mortar bookstores, libraries or museums» (Elliott-Gillies, Digital Geography and Classics cit.).

[7] The most complete survey of the impact of the “spatial turn” on the historical and art-historical disciplines is provided by J. Guldi, Spatial Humanities. What is the Spatial Turn?, Scholar’s Lab-University of Virginia Library, http://spatial.scholarslab.org/spatial-turn/. See also B. Warf-S. Arias (ed. by), The Spatial Turn: Interdisciplinary Perspectives, Routledge, London 2014.
This innovation was, one might say, the first case of systematic intermixture of technology and historical research in the modern era. GIS not only provided resources of capital importance for improving the techniques of analysis and discovery, but also helped to introduce unprecedented questions of method: the encounter-clash between technological disciplines, characterized by precision and by the geometric certainty of their data, and historical disciplines, by definition based on partial and unstructured information, favored a critical approach, strongly oriented towards identifying and overcoming limits through a continuous improvement of the methods of both.[11] I shall return in conclusion to the possible outcomes and the still unexplored potential of this process. By the time of Elliott’s contribution, this combination of computing, geography, and the historical disciplines had already established the fundamental technologies and methodological steps for the large-scale mapping of geographic references in the sources, through their extraction and their insertion into a network of spatial coordinates. In the early 2000s, Google experimented with the large-scale mapping of hundreds of thousands of texts whose scans were freely available in the Google Books archive: the result was the presence of sections, within the consultation pages, where one could visualize the map derived not only from the geographic references present in the text, but also from the so-called metadata (edition, place of printing, origin of the author, etc.). Using the same principles, though on a smaller scale, Perseus, the largest Open Source archive of ancient texts,[12] has carried out parsing on some translations of the great geographic texts of classical antiquity, automatically mapping and indexing all the references to known places contained in them. As a result of this process, texts such as Pausanias’ Description of Greece offer an index, sortable alphabetically or by frequency, a map of the places mentioned in the text or in individual sections, and the dataset of all the references complete with name, geographic coordinates, and various other useful data.

[8] H. Lefebvre, La Production de l’espace, «L’Homme et la Société» 31-32, 1974, pp. 15-32.

[9] Whence the prolific tendency to call into question cartography as an objective means of describing space: M. Monmonier, How to Lie with Maps, University Press, Chicago 2014; R. Kitchin-M. Dodge, Rethinking maps, «Progress in Human Geography» XXXI, 3, June 2007, pp. 331-344.

[10] Among which it is at least worth citing M.M. Bachtin, Estetica e romanzo, trad. it. Einaudi, Torino 2001; F. Moretti, Atlante del Romanzo Europeo: 1800-1900, Einaudi, Torino 1997; D.J. Bodenhamer, J. Corrigan, T.M. Harris (ed. by), Deep Maps and Spatial Narratives, Indiana University Press, Bloomington 2015; B. Westphal, Geocriticism: Real and Fictional Spaces, Palgrave Macmillan, New York 2011.

[11] F.J. Harvey, A Primer of GIS: Fundamental Geographic and Cartographic Concepts, Guilford Press, New York 2008; D.J. Bodenhamer, J. Corrigan, T.M. Harris, The Spatial Humanities: GIS and the Future of Humanities Scholarship, Indiana University Press, Bloomington-Indianapolis 2010.

[12] G.R. Crane, Perseus Digital Library, 1992-. Accessed 5/03/2018: http://www.perseus.tufts.edu/hopper/.
http://www.perseus.tufts.edu/hopper/ Spazi antichi e futuri possibili 155 possibilità di attingere, cliccando sul testo o sugli indici analitici, alle voci di autorità bibliografiche presenti online, come dizio- nari, enciclopedie e atlanti. Ovviamente, questo processo non è affatto esente da errori e approssimazioni13. Le moderne tecnologie di Named Entity Recog- nition (cfr. infra) consentono un margine di accuratezza elevato solo lavorando sulle lingue moderne, e dunque sulle traduzioni. Effettuare un simile lavoro sulle lingue storiche (il greco e il latino, ma anche l’arabo classico o il sanscrito), implica ben altro sforzo; inoltre, sottoporre a mappatura riferimenti a luoghi anti- chi è concettualmente più problematico, a causa della parzialità delle informazioni in nostro possesso. Tornerò fra poco su questo argomento. La tecnologia di queste operazioni di mappatura, però, è basata su principi metodologici sostanzialmente generali, il cui perfezionamento dipende dalla frequenza e dalla scala delle loro applicazioni14. Tali principi si riassumono in tre passaggi fondamentali: identificazione, disambiguazione e catalogazione. 13 Lo faceva già notare Elliott a proposito della scarsa qualità dell’esito nel caso dei luoghi antichi, nell’esperimento di Google Books (Elliott-Gillies, Di- gital Geography and Classics cit.). A proposito dei limiti e delle ancora non sfruttate opportunità di organizzazione semantica e disambiguazione dei rife- rimenti geografici nei testi antichi si veda anche A. Babeu et al., Named Entity Identification and Cyberinfrastructure, «Research and Advanced Technology for Digital Libraries». Lecture Notes in Computer Science, presentato alla In- ternational Conference on Theory and Practice of Digital Libraries, Springer, Berlin-Heidelberg 2007, pp. 259-270. 14 Il Natural Language Processing applicato all’inglese moderno è giunto sostanzialmente allo stato dell’arte, essendo questa una lingua praticata oggi da milioni di parlanti, la lingua stessa dell’informatica e (dato non trascurabi- le) dell’economia, su cui la scala dei dati utilizzabili è composta da miliardi di testi; il caso del cinese moderno, per quanto meno noto in Occidente, è simile. Lingue più circoscritte nella loro applicabilità e meno adoperate, sono anche molto meno frequentemente sottoposte a processi analoghi. È il caso anche di alcune lingue moderne, come il farsi; alcune lingue antiche, tuttavia, si trova- no nella posizione di contribuire a un sostanziale miglioramento delle tecno- logie, essendo attestate su una scala potenzialmente amplissima e in corpora “chiusi”, dunque linguisticamente consolidati: si pensi alla sola Patrologia La- Chiara Palladino 156 I primi due passaggi fanno parte di un settore dei linguaggi di programmazione, e sono correlati l’uno all’altro: nell’insieme, essi prendono il nome di Named Entity Recognition and Classification15. L’identificazione, o recognition, consiste nel met- tere a punto un breve codice (o, più precisamente, script) che sot- topone il documento ad analisi sistematica (o parsing), al fine di ricercare le stringhe di testo contenenti specifiche categorie, ad es. luoghi, persone, organizzazioni, espressioni cronologiche, etc. Tali codici si basano spesso sull’esistenza di librerie preesistenti, atte a fornire alla macchina una lista indicativa dei nomi da riconoscere e disambiguare. 
More complex scripts can go so far as to combine the most elementary steps, such as the extraction of all strings of text beginning with a capital letter, with additional functions, for example distinguishing capitals due to punctuation from those of proper names.[16] The step following identification is, obviously, disambiguation. It is not enough to identify a toponym, for example Alexandria, just as it is not enough to identify an Alexander as a personal name. One must then associate that name with an identity, in such a way that it is not confused with others. This process must take place by matching the name in question to unique identifying categories, for example (though, note well, not necessarily) geographic coordinates; for personal names one can draw on the techniques typical of prosopography, such as date of birth or death, family relations, geographic origin, and so on. In this way we will know that Alexander is Alexander called the Great, king of Macedonia, son of Philip, born at Pella in 356 BC, and that Alexandria is Alexandria in Egypt, located near the Nile Delta, close to the settlement of Pharos, at latitude 31.1982456667 and longitude 29.9079146667, also called al-Iskandarīya during the Ottoman era. In the computational domain, disambiguation is also the first step of cataloguing, to which corresponds the assignment of a unique, stable identifier, semantically without particular meaning but recognizable as such by machines. In jargon, this identifier is called a URI (Uniform Resource Identifier), and it has the appearance of a web address with which a unique numeric identifier is associated. Naturally, the first problem of disambiguation is that adequate canonical references are needed, dictionaries or atlases, which provide the authority for assigning an exact identification to that name.[17] When this work is done on the Web rather than on paper, it is essential that such references be precisely verifiable online: this entails the need to make printed atlases usable in the digital context.

[15] An introduction to the topic in D. Nadeau-S. Sekine, A survey of named entity recognition and classification, «Linguisticae Investigationes» XXX, 1, 2007, pp. 3-26.

[16] A particularly innovative method consists in using English as a “bridge language”, through the alignment of the original text with its translation. M. Berti, The Digital Fragmenta Historicorum Graecorum and the Ancient Greek-Latin Dynamic Lexicon, in F. Mambrini, M. Passarotti, C. Sporleder (ed. by), Proceedings of the Workshop on Corpus-Based Research in the Humanities (CRH), 10 December 2015 Warsaw, Poland, Institute of Computer Science-Polish Academy of Sciences, Warszawa 2015, pp. 117-123.
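As a minimal sketch of what cataloguing the disambiguated Alexandria might look like as structured data, the record below combines the identifying categories listed in the text with a stable URI. The URI shown is only hypothetical, following the general pattern of stable web identifiers described above; the numeric value is illustrative and not a verified entry.

```python
# Hypothetical record for the disambiguated Alexandria, combining the
# identifying categories mentioned in the text with a stable URI.
# The URI below illustrates the pattern only; the ID is invented.
alexandria = {
    "uri": "https://pleiades.stoa.org/places/000000",  # illustrative ID
    "name": "Alexandria (Egypt)",
    "latitude": 31.1982456667,
    "longitude": 29.9079146667,
    "near": ["Nile Delta", "Pharos"],
    "attested_names": [{"name": "al-Iskandariya", "period": "Ottoman"}],
}
```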
Thus, in 2000, one of the most modern and up-to-date atlases of the Greek and Roman world, the so-called Barrington Atlas,[18] was made accessible online for the first time: also published in print in September 2000 by Richard Talbert and Thomas Elliott,[19] it included the mapping of places by means of modern geolocation systems.[20] The data collected in this way form the backbone of an online database which today, after various phases of expansion, represents the broadest and most important reference resource for all studies, digital and otherwise, on the geography of the ancient world. This is the collaborative database Pleiades, which to date gathers about 40,000 ancient places and their locations.[21]

[17] I shall not dwell on the vast and complex subject of atlases and databases of archaeological orientation, which were the pioneering projects of the field and still represent a very high standard to aspire to. My interest here is to concentrate on the still problematic aspects of obtaining the same precision in the field of primary sources, above all textual ones, on the ancient world.

[18] Although not the entire content of the Barrington Atlas is freely accessible through Pleiades, it is still to be appreciated that the essential data of an expensive print publication have been put online and made not only readable but usable, without the payment of further tolls. It is to be hoped that, in the future, these data can be enriched with cross-references to less recent encyclopedias of enormous importance for classical studies, such as the Realencyclopädie der Altertumswissenschaft, whose progressive digitization in Open Access mode is now underway. The fact that this is happening on a platform provided by Wikipedia, a resource at which our learned colleagues often turn up their noses, in no way diminishes its importance; if anything, it amplifies it (https://de.wikisource.org/wiki/Paulys_Realencyclop%C3%A4die_der_classischen_Altertumswissenschaft).

[19] R.J.A. Talbert-R.S. Bagnall, Barrington Atlas of the Greek and Roman World, University Press, Princeton 2000.

[20] The data corpus of the Barrington Atlas is today maintained and updated by the Ancient World Mapping Center of the University of North Carolina at Chapel Hill (http://awmc.unc.edu/), which uses it for further applications, such as the web service Antiquity à la Carte (R. Horne, AWMC: Antiquity À-La-Carte, 2012-. Accessed 10/02/2018: http://awmc.unc.edu/awmc/applications/alacarte/), which allows different types of geographic information to be combined, such as the road network of the imperial period, the aqueducts, the provincial boundaries, and of course the coordinates of the main urban areas.

[21] R. Bagnall et al., Pleiades: A Community-Built Gazetteer and Graph of Ancient Places, 2006-. Accessed 2/02/2018: http://pleiades.stoa.org.

On the conceptual level, the project’s principal merit has been to establish the need to redefine the key notions of “place” and “space” in the study of the ancient world and in the era of GIS, opening the way to reflections on the ambiguity of the concept of the geographic coordinate for a space as dynamic and ambiguous as
https://de.wikisource.org/wiki/Paulys_Realencyclop%C3%A4die_der_classischen_Altertumswissenschaft https://de.wikisource.org/wiki/Paulys_Realencyclop%C3%A4die_der_classischen_Altertumswissenschaft https://de.wikisource.org/wiki/Paulys_Realencyclop%C3%A4die_der_classischen_Altertumswissenschaft http://awmc.unc.edu/ http://awmc.unc.edu/awmc/applications/alacarte/ http://awmc.unc.edu/awmc/applications/alacarte/ http://pleiades.stoa.org/ Spazi antichi e futuri possibili 159 quello antico, e a una nuova definizione del concetto di luogo sulla base di specifiche categorie culturali, più che cartografiche. Nello studio del mondo antico, infatti, non è raro che un luogo attestato non possa essere ricondotto a coordinate spaziali, e di certo non con la precisione richiesta dai sistemi moderni: questo, tuttavia, non lo rende meno importante. L’idea alla base di Pleiades è, quindi, quella di considerare un luogo non come un toponimo cui corrispondono coordinate oggettive, ma come una “entità culturale” a cui sono connesse una serie di caratteristiche, di cui le coordinate sono solo una, e non necessariamente la principale. Altre caratteristiche riconosciute sono la categoria del luogo in questione, la sua definizione politica, il suo periodo o periodi di attestazione, i suoi nomi attestati nel corso del tempo, le sue connessioni con altre entità spaziali o sociali (fiumi, mari, porti, vie, tribù, individui, popoli, edifici pubblici, siti archeolo- gici…). A queste caratteristiche strutturali si affiancano quelle più prettamente bibliografiche, come ad esempio i riferimenti a dizio- nari e atlanti, ovvero ad altre risorse, come i database epigrafici o i cataloghi museali e delle soprintendenze, ma anche le raccolte di immagini pubbliche, come Flickr.com. Il risultato è la possibi- lità, per l’utente, di accedere analiticamente a una enorme varietà di riferimenti aggiuntivi e in continua crescita, spesso con com- pleta libertà di utilizzo e pubblicazione dei dati di partenza22. 22 Pleiades è un database “collaborativo”: a parte i redattori principali e i responsabili del progetto, i suoi aggiornamenti e arricchimenti sono intera- mente dovuti al libero contributo di studiosi volontari e ricercatori, che con- tribuiscono a perfezionarne i riferimenti, a disambiguare e a correggere, non- ché ad aggiungere informazioni. Un recente esempio è l’aggiunta delle atte- stazioni in arabo di oltre 5000 toponimi presenti nel database, nell’ambito del progetto CALCS, cfr. V. Vitale, Pelagios-Cross-Cultural After-Life of Classical Sites (CALCS), 2016. Consultato il 5/03/2018: https://research.sas.ac.uk/search/ research-project/152/pelagios-cross-cultural-after-life-of-classical-sites-(calcs)/). Tale opportunità di partecipazione, semplice e diretta, e specificamente pensata per gli studiosi, elimina a monte la giustificazione autoassolutoria della pre- senza, inevitabile, di errori, e attribuisce invece a chi li individua la responsa- bilità (e il merito) di correggerli, nella speranza che un valente ricercatore sia meglio informato, ma altrettanto motivato, di un utente di Wikipedia. https://research.sas.ac.uk/search/research-project/152/pelagios-cross-cultural-after-life-of-classical-sites-(calcs)/ https://research.sas.ac.uk/search/research-project/152/pelagios-cross-cultural-after-life-of-classical-sites-(calcs)/ Chiara Palladino 160 Fig. 1. Una voce in Pleiades. 
While Pleiades, born as a general database of the ancient world, is moving towards an expansion beyond the boundaries of so-called classical antiquity, the Digital Atlas of the Roman Empire, also conceived in 2000 by Johan Åhlfeldt of the University of Lund,[23] aims to collect and, where possible, map every aspect of the geography of the Empire, including Roman milestones, the databases of Coptic and Christian churches, amphitheaters, aqueducts, and so on, and provides a semantically categorized mapping of them. Also worth mentioning is the Trismegistos project of the University of Leuven,[24] already an authority of reference in the field of papyrology and epigraphy, which is launching an analytical cataloguing of the spatial information supplied by the primary sources already gathered in the database.[25] The texts most interesting from the point of view of geographic information have been subjected to Named Entity Recognition, and the extracted information verified, catalogued, and disambiguated manually, and where possible correlated with already existing references. As a result of this operation, it is now possible to obtain very complete data on the geographic references relating to North Africa in many of the textual sources preserved in the database, such as, for example, the Vicarello Goblets or the Antonine Itinerary.

Exploring ancient space

The cataloguing and mapping of the ancient world is, obviously, the preliminary step of any more refined analysis, and its publication in the digital context allows its continuous and critical refinement in the light of new studies.

[23] J. Åhlfeldt, Digital Atlas of the Roman Empire (DARE), 2015-2017. Accessed 10/02/2018: http://dare.ht.lu.se/.

[24] M. Depauw et al., Trismegistos. Accessed 10/02/2018: https://www.trismegistos.org/. See also M. Depauw-T. Gheldof, Trismegistos: An Interdisciplinary Platform for Ancient World Texts and Related Information, in Ł. Bolikowski et al. (ed. by), Theory and Practice of Digital Libraries-TPDL 2013 Selected Workshops, Springer, Cham 2013, pp. 40-52.

[25] The initiative goes by the name of Trismegistos Places, and can be consulted at http://www.trismegistos.org/geo/.

But the “spatial turn” has also stimulated research approaches aimed at exploring geography understood as dynamic, lived space, challenging the concepts of “static” cartographic representation inherent in the processes of geolocation: it is the very concept of mapping that implies the need to cross the boundaries of the modes of representation of a single medium, in order to arrive at a passage “from text to map” that allows all the information to be given its full value: not only the flat representation of the places mentioned in a source, but also the semantic and dynamic relations inherent in the perception of space and
Exploring ancient space

The cataloguing and mapping of the ancient world is, of course, only the preliminary step of any more refined analysis, and its publication in the digital context allows for its continuous, critical refinement in the light of new research. But the 'spatial turn' has also stimulated research approaches aimed at exploring geography understood as dynamic, lived space, challenging the concepts of 'static' cartographic representation inherent in geolocation processes: it is the very concept of mapping that implies the need to cross the boundaries of the representational modes of a single medium, so as to arrive at a passage 'from text to map' which allows all the information to be exploited, that is, not only the flat representation of the places mentioned in a source, but also the semantic and dynamic relations inherent in the perception of space and the orientation in the landscape described there,26 their social and cultural implications, and their changes over time.27 In recent years the discussion has been enriched by the greater emphasis placed on the conceptual discrepancy between the representation of space in premodern societies, by definition 'non-cartographic',28 and modern mapping methods, to which correspond a very different navigation technique and hence a very different spatial perception. For this reason, 'representation on a map' can only ever be partial, if understood simply within the limits of modern GIS standards.

One of the first experiments in this direction was the digitisation of the Tabula Peutingeriana carried out, once again, by Talbert and Elliott as a corollary to the Barrington Atlas.29 The scan of the image, suitably segmented, was associated with a set of legends, symbols and classification marks that can be overlaid on the image itself in the reading interface. The map is thus accompanied by semantic indications, which provide an analytical classification of its various geographical, historical and conceptual components. Moreover, every place indicated, together with the characteristics associated with it on the map, is linked to a 'modern' reference in the Barrington Atlas; the result is the creation of a database that does not only take account of contemporary geographical references, but derives its essential information from the semantics of the map itself.

The need to understand movement in the ancient world in its dynamic aspects lies instead at the basis of the Orbis project, curated by Stanford University.30 Orbis, whose connectivity map is largely based on archaeological data, offers a simulation of the modes of travel in the Roman Empire in the second century of our era: through an online interface, it is possible to set a series of conditions, chosen among the factors most notoriously determining the modes of travel in antiquity (time of year, type of route, means of transport, etc.), and through a combination of simulation models it provides the costs, times and variables of the chosen route. This is not a reconstruction based on primary sources, but a mathematical and probabilistic model, which requires a precise research context and precise investigative questions in order to be effective.31

26 On the conceptual implications of the passage from the written to the visual medium, see at least Ø. Eide, Media Boundaries and Conceptual Modelling, Palgrave Macmillan UK, London 2015.
27 E. Barker et al. (ed. by), New Worlds from Old Texts: Revisiting Ancient Space and Place, University Press, Oxford 2016.
28 Although superseded in some respects, Pietro Janni's book remains to this day the principal reference on the question of spatial navigation in premodern societies: P. Janni, La mappa e il periplo: cartografia antica e spazio odologico, G. Bretschneider, Roma 1984.
29 T. Elliott, Constructing a Digital Edition for the Peutinger Map, in R.J.A. Talbert-R.W. Unger (ed. by), Cartography in antiquity and the Middle Ages. Fresh perspectives, new methods, Brill, Leiden-Boston 2008, pp. 99-110; R.J.A. Talbert, Rome's World: The Peutinger Map Reconsidered, Cambridge University Press, Cambridge-New York 2010.
30 W. Scheidel et al., ORBIS: The Stanford Geospatial Network Model of the Roman World, 2014-. Accessed 10/02/2018: http://orbis.stanford.edu/.
31 W. Scheidel, Orbis: The Stanford Geospatial Network Model of the Roman World, SSRN Scholarly Paper, Social Science Research Network, Rochester-NY 2015.
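The connectivity logic behind a model like Orbis can be suggested with a toy example. The sites and edge costs below are invented and are not Orbis data; the real model derives its weights from much richer archaeological and environmental evidence and varies them by season and means of transport.

```python
import networkx as nx  # pip install networkx

# Toy connectivity graph: each edge carries a travel cost in days.
G = nx.Graph()
G.add_edge("Roma", "Ostia", days=0.5)
G.add_edge("Ostia", "Carthago", days=3.0)  # sea route
G.add_edge("Roma", "Capua", days=4.0)      # land route
G.add_edge("Capua", "Carthago", days=9.0)

route = nx.shortest_path(G, "Roma", "Carthago", weight="days")
cost = nx.shortest_path_length(G, "Roma", "Carthago", weight="days")
print(" -> ".join(route), f"({cost} days)")
```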
An analogous attempt at analysing 'dynamic', lived space, this time through the analysis of primary sources, was made in 2014 with the Hestia project.32 It consists of an integrated reading interface built on Herodotus' Histories, both in Greek and in English. The initial view offers an overview, zoomable and explorable, of all the places mentioned in the text, together with statistics of their frequency and density. The analytical view offers a series of juxtaposed windows, consisting of the text itself, subdivided by book, chapter and paragraph according to the usual editorial system; the map of the places mentioned in the selected passage; a timeline describing its narrative progression; and a colour code indicating the places mentioned according to their frequency. By clicking on a place it is possible to view all the passages of the Histories in which it appears, as well as to visualise in a new window the map of the spatial entities to which that particular place is connected in the course of the narrative, and the number of times it connects to them.

In the course of the analytical research on the text, the question also arose of how Herodotus' conception of space could be explored through means of visual representation: it soon became clear that representing Herodotus' work in cartographic terms was not sufficient to understand its historical and narrative implications. There was, in other words, a semantic problem which made it necessary to go beyond the Cartesian representation of space. It was therefore decided to make use of the principle of connectivity, or network theory, starting from the methodological premise that the representation of space in linguistic form has its load-bearing structure in the creation of semantic relations between entities.33 In the case of Hestia, network theory was applied in order to investigate the richness of the text beyond the two-dimensional representation of the map, and partly precisely in order to break free of the constraints it imposes. Through text mining techniques, all the geographical references were extracted from the text, and those characterised by co-occurrence in the same portion of text (measured according to a paragraph-unit principle) were related to one another on the basis of four criteria of a linguistic kind, derived essentially from the type of verbal form appearing in the clause in which they occurred (positioning, movement, dynamism, transformation).34

32 E. Barker et al., Hestia: Herodotus Encoded Space-Text-Imaging Archive, 2014. Accessed 10/02/2018: http://hestia.open.ac.uk/.
33 Network theory, characterised from the outset by strong spatial implications, was introduced into literary criticism by Franco Moretti's celebrated essay Graphs, Maps, Trees: Abstract Models for Literary History (Verso, London-New York 2005). Moretti demonstrated there the importance of reading a text through methods of distant reading, in order to obtain information that is often not visible through progressive reading: the most important outcome of distant reading is precisely the extraction of information on the macroscopic relations of the characters in a narrative universe, which can be represented through various forms of mapping, not necessarily cartographic ones.
34 S. Bouzarovski-E. Barker, Between East and West: movements and transformations in Herodotean topology, in Barker et al. (ed. by), New Worlds from Old Texts cit., pp. 155-180.
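In outline, the co-occurrence step just described can be sketched as follows. This is a deliberately small example: the 'paragraphs' are invented stand-ins for the paragraph units of the Histories, and the actual Hestia pipeline further qualifies each relation by the four verbal criteria mentioned above.

```python
from collections import Counter
from itertools import combinations

# Each inner list stands for the place references found in one paragraph unit.
paragraphs = [
    ["Athens", "Sardis", "Susa"],
    ["Athens", "Sardis"],
    ["Sparta", "Athens"],
]

edges = Counter()
for places in paragraphs:
    # Every pair of places co-occurring in the same paragraph unit
    # creates, or strengthens, an edge in the network.
    for a, b in combinations(sorted(set(places)), 2):
        edges[(a, b)] += 1

for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```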
This generated not only the network of connections, which can be examined in the reading interface, but also a database of relations which can be visualised independently of the map, where priority can be given to the various indices of connectivity, or to frequency, rather than to geographical positioning.

Fig. 2. An image of the Hestia reading interface.

There is, however, another problem that needs to be underlined. Everything that was done with Hestia could be produced with relative accuracy only on the English translation. At the time the project was developed, it was unthinkable to carry out such a refined analysis on a text in ancient Greek, at least not with the funding of a project lasting little more than a year.35 This brings us to another fundamental theme, namely the creation of data starting from the primary sources: in other words, the question of digital editions.

Reading ancient space

The digital medium offers, at previously unthinkable levels, the possibility of collecting, organising and making usable the most disparate categories of information. Spatial information is part of the type of data that can be collected and represented on this medium, working directly on the source language: in other words, the digital medium can contribute to creating new types of editions of ancient texts, in which the spatial datum is also adequately valorised. In the field of digital publishing, markup languages have long been used to identify information of particular interest. The outcome of this operation is above all the possibility of automatically indexing that information, with very diverse levels of precision and opportunities for analysis.36 This modality was chosen for the markup of the citation systems in the digital edition of Athenaeus' Deipnosophists, the Digital Athenaeus.37

35 Compare the results obtained by Trismegistos Places with much more generous funding, timeframes and staff. It is well known that, in academic circles, it is not 'elegant' to speak of economic resources: yet it is precisely these resources that have allowed Trismegistos to achieve results on a scale of complexity comparable to Hestia, while working on texts rigorously in the original language.
36 The markup language par excellence, in the case of texts, is XML, or Extensible Markup Language (https://www.w3.org/XML/). In its sub-schemas explicitly created for the edition of complex texts, TEI and EpiDoc (http://www.tei-c.org/, http://epidoc.sourceforge.net/), it is used as the standard language for the publishing and archiving of digital texts, whether created from scratch or resulting from the transfer of already existing sources to a digital medium.
37 M. Berti, Digital Athenaeus. A digital edition of the Deipnosophists of Athenaeus of Naucratis. Accessed 05/03/2018: http://www.digitalathenaeus.org.
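The gain in machine-actionability brought by markup can be illustrated with a small sketch. The TEI fragment below is invented, but it follows the common TEI/EpiDoc practice of pointing place names to gazetteer URIs (the identifiers are illustrative); once the references are marked up, an index can be derived automatically.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
# An invented TEI fragment with place names pointing to gazetteer entries.
xml = b"""<p xmlns="http://www.tei-c.org/ns/1.0">
  He sailed from <placeName ref="https://pleiades.stoa.org/places/579885">Athens</placeName>
  to <placeName ref="https://pleiades.stoa.org/places/599999">Samos</placeName>.
</p>"""

root = ET.fromstring(xml)
# Marked-up information can be indexed automatically:
index = {el.text: el.get("ref") for el in root.iter(TEI_NS + "placeName")}
print(index)
```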
Alongside the text, already available in Georg Kaibel's edition, the digital version of the Index Scriptorum was added, together with the Dialogi Personae curated by August Meineke and by Kaibel himself, and the very recent Index of authors, texts and persons from Douglas Olson's new edition: every cited passage is supplied with a concordance between Kaibel's citation system and that of Casaubon's pages, still familiar to many readers today. The result is one single large index, which not only gathers the concordances, but semantically classifies every author, allows the referenced passage to be traced, and analytically connects all the information pertaining to it. Above all, the index fully exploits the potential of the new digital medium, in that it extends to the recognition and analysis of the passages of Athenaeus' text identified as quotations of other authors: these passages have been specially marked up within the text of the Deipnosophists itself. The result is not only an index fully usable for the most disparate research purposes, but also the possibility of various distant-reading operations, such as the extraction of all the citations of given names, or categories of authors, or literary works, subdivided according to the places in which they appear throughout the work.38

38 M. Berti et al., Documenting Homeric Text-Reuse in the Deipnosophistae of Athenaeus of Naucratis, «Bulletin of the Institute of Classical Studies» LXIX, 2, 2016, pp. 121-139.

Fig. 3. An example of the applications of the digital indices of the Digital Athenaeus: visualising all the citations of Homer as an author and their frequency classified by book.

A small project by comparison, but one focused specifically on literary geography, has recently been promoted by the University of Zagreb, within the digital collection and archiving of the Latin texts of Croatian Renaissance authors known as CroALA (Croatiae Auctores Latini).39 The project has led to the creation of an Index Locorum designed to collect geographical references according to a classification system created specifically for so-called literary geography, which presents peculiar problematic aspects that do not necessarily lend themselves to a simple univocal classification: evidently, a poetic place may not necessarily be real, but neither is it entirely fictitious; it may be the projection of a real entity into literature and the imagination (for example, Olympus); it may have changed denomination, and even coordinates, over time; moreover, it may be real and yet not belong to terrestrial geography (think of the Moon). The methodological premise of the CroALA Index Locorum was to re-discuss the very principle of the 'unique identifier', on the basis of the essential consideration that a name referring to a place does not always inherit its characteristics.

39 N. Jovanović et al., CroALa: Croatiae Auctores Latini, 2009-2014. Accessed 05/03/2018: http://croala.ffzg.unizg.hr/.
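The point about identifiers can be made concrete with a small data model (a sketch for illustration, not the actual CroALA model): a place reference carries its own semantics and only optionally resolves to coordinates or to a single 'real' referent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class PlaceReference:
    """A name as used in a text, kept deliberately distinct from the
    (possibly absent) geographic entity it evokes."""
    name: str
    category: str                          # e.g. "real", "mythical", "celestial"
    gazetteer_uri: Optional[str] = None
    coordinates: Optional[Tuple[float, float]] = None

refs = [
    PlaceReference("Olympus", "mythical"),  # projection of a real mountain into myth
    PlaceReference("Luna", "celestial"),    # real, but not part of terrestrial geography
    PlaceReference("Roma", "real",
                   gazetteer_uri="https://pleiades.stoa.org/places/423025",
                   coordinates=(41.89, 12.48)),
]
for ref in refs:
    print(ref.name, "->", ref.gazetteer_uri or "no single geographic referent")
```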
The result, accessible through the archive of the Index Locorum,40 is a set of interconnected categories of indices, which ultimately refer back to the original context, where the classification system adopted follows an articulated data model, and where complex identifiers have been created, univocal yet extremely flexible and manipulable, with the aim of relating the different entities to the (concrete and 'poetic') reality to which they refer.

An alternative way of collecting spatial data from the primary source is so-called stand-off annotation, which dispenses with markup languages and is often carried out through easy-to-use web interfaces. One of the most widely used services today, in the study of ancient sources, is Recogito (http://recogito.pelagios.org/), a tool developed within Pelagios,41 a project aiming to create a central infrastructure for the resources and initiatives relating to the broadly understood concept of 'space' in the premodern world, with the goal of fostering the interconnection of research projects and archives through the standards of Linked Open Data.42

In order to integrate the information coming from secondary sources, such as atlases and online databases, with data coming from the primary sources, Pelagios has given users the possibility of collecting and exploring such information critically, while at the same time contributing to the creation of new data: Recogito is an easily accessible tool, through which the user can create a profile, upload a text or a map, decide on its sharing criteria and annotate strings of text containing relevant information, with particular attention, of course, to geographical references. In the case of toponyms, it is also possible to carry out semi-automatic disambiguation, drawing on online databases such as Pleiades or the Digital Atlas of the Roman Empire. Making use of the principle of stand-off annotation, Recogito makes it possible to work directly on the source while storing the annotations elsewhere, in such a way that they are immediately available in the form of datasets in various formats, without having to pass through the extraction of the marked-up information, as in the case of texts in XML.

40 N. Jovanović, Croala-Pelagios: CITE Semantic Annotations for Place References in Croatian Latin Texts. XQuery, 2017. Accessed 05/03/2018: https://github.com/nevenjovanovic/croala-pelagios.
41 L. Isaksen et al., Pelagios Commons: Linking the Places of Our Past, 2015-. Accessed 10/02/2018: http://commons.pelagios.org/.
42 Linked Open Data is the most important technology of the so-called Semantic Web (T. Berners-Lee, «Linked Data», 2006: https://www.w3.org/DesignIssues/LinkedData.html). In short, it is a set of standards and technologies to adhere to in order to make one's online content accessible, usable and connectable to other content characterised by some kind of semantic affinity. For some years now, many databases and digital archives have made their resources compatible with the Linked Open Data standards, in order to increase their potential for access and use, and to further foster a transversal research methodology, which draws unquestionable benefit from the possibility of accessing diversified content through the use of a common semantic vocabulary.
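Stand-off annotation can be pictured as follows: the source text remains untouched, while each annotation records a character span and whatever information is attached to it. The structure below is a simplified, invented example, loosely inspired by the W3C Web Annotation model on which tools like Recogito build.

```python
import json

text = "Inde Carthaginem navigavit."  # the source itself is never modified

annotation = {
    "target": {"start": 5, "end": 16},  # character span of "Carthaginem"
    "body": [
        {"type": "place"},
        {"type": "identification",
         "value": "https://pleiades.stoa.org/places/314921"},
    ],
    "creator": "annotator@example.org",  # placeholder
}

# The annotation lives outside the text and can be shipped as a dataset as-is.
print(text[annotation["target"]["start"]:annotation["target"]["end"]])
print(json.dumps(annotation, indent=2))
```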
Moreover, this system makes the information created by the user directly accessible in the form of Linked Open Data: this implies not only greater visibility of the information created, but also an increase in the chances of that information being used by others, for incorporation into new research projects or archives.

Fig. 4. The annotation panel in Recogito.

Naturally, what the machine cannot tell the user is what or how to annotate. The classification of spatial information, especially in textual sources, is anything but immediate, and must necessarily respond to considered methodological criteria: annotation was chosen as a criterion for collecting information precisely because of its flexibility as an exploratory process. In general, semantic annotation, which makes it possible to associate, within a Web environment, information of various kinds with an entity found in a source, can move in two directions: the first, by referring to an already existing and complex classification, preferably in the form of an ontology;43 the second, by creating a system from scratch according to criteria emerging from the study of the source itself, where the available ontologies and atlases do not meet the researcher's needs (think, for example, of works halfway between natural and fantastic geography, such as the Satyricon). Both of these paths can be pursued independently, and they respond to different aims in the process of analysing the primary source: the one more general and comparative, being focused on classification in relation to a semantic reference 'vocabulary'; the other more concentrated on the source and on the particular spatial conception of its author. Annotation thus takes shape as something more than a simple process of classification: it is a way of enriching a spatial reference to an almost unlimited level of depth, well beyond simple 'place-tagging'.44

43 The concept of ontology is a loan from philosophy, where the word denotes the 'study of being': in computer science it has come to indicate the practices of representing knowledge in organised forms expressible in machine-readable languages, through the definition of a formal structure composed of entities and the relations between them. In the Semantic Web, ontologies are used to specify 'standard' conceptual vocabularies useful for the classification of phenomena: T. Gruber, Ontology (Computer Science), in L. Liu-M. Tamer Öszu (ed. by), Encyclopedia of database systems, Springer, Boston 2009. Spatial and spatio-temporal ontologies are generally used for the classification of phenomena relating to geography and chronology in the Semantic Web: they can provide an important starting point for the disambiguation and annotation of spatial references within ancient sources. See in this regard the important GeoLat project (F. Ciotti et al., TEI, ontologies, linked open data: geolat and beyond, «Journal of the Text Encoding Initiative» 8, 2015, pp. 1-20).
44 From this kind of 'expanded commentary' come the most important potentialities for future digital publishing. See in this regard at least R. Afferni et al., ... but what should I put in a digital apparatus? A not-so-obvious choice. New types of digital scholarly editions, in P. Boot et al. (ed. by), Advances In Digital Scholarly Editing. Papers Presented At The Dixit Conferences in The Hague, Cologne and Antwerp, Sidestone Press, Cologne 2017, pp. 141-143.
Conclusion

In the article cited at the opening, Elliott hoped for a future in which spatial information would be accurately marked up and verified, in scholarly editions of texts, through systems allowing its rapid indexing and mapping. Elliott likewise hoped that the enormous mass of unmarked texts could instead be amenable to fully automatic processing, so as to achieve appreciable results, and that disambiguation could be carried out computationally thanks to improvements in the technologies. It can certainly be said that, with respect to this vision, there is still a long way to go. Nevertheless, in the roughly ten years that followed, the encounter-clash between the historical disciplines and the digital world has opened up new questions and new methodological problems.

First of all, it is necessary to recognise that working with machines calls the methods of historical research into question again: not because they are fallacious, but precisely because their transfer to a different medium forces a redefinition of certain key concepts; the machine, in fact, requires a precision that leaves no room for vagueness, and compels one to follow a rigorous method based on factual evidence rather than on speculation. This means, in our case, that it is necessary to define precisely, and in advance, the concepts of geography, of place, of spatial entity, so that it is univocally understood what the object of our analysis is, and so that the machine can take on much of the mechanical work of collecting information, previously entrusted to the goodwill of the researcher.

However, the historical disciplines involve problems that put excessively mechanistic approaches in crisis. We have seen that the requirements of the analysis of ancient and literary geography, based on data that are by definition qualitative, fit poorly with the overly constrictive mechanisms of GIS, which is the quantitative approach par excellence. We are therefore moving towards something which, while adopting some approaches of GIS, is other: a 'digital geography' that gives value to specifically historical-cultural aspects. This approach has been defined by Elliott as un-GIS, that is, something that does not attribute absolute importance to the purely quantitative concept of geographic coordinates, but has the flexibility necessary to include the data coming from humanistic enquiry.
The same can be said of unique identifiers, or URIs, which work for machines but do not have the same semantic richness as natural languages, and are often called into question by the lack of context and by the ambiguity typical of historical research. This entails the need to create identification systems that are semantically richer and more flexible than simple numerical indicators.45

From this situation, however, there can emerge not only new problems, but also new approaches.46 I shall confine myself here to proposing a few, which should already have emerged in the course of this brief contribution. The first is the possibility, never before encountered at such a level of complexity, of creating semantically refined and articulated indices, always connected to the original context and potentially linked to an infinite quantity of additional information; the opportunities offered by the digital medium in this sense can be exploited for the creation of one of the desiderata of classical geography, a lexicon, or updated critical dictionary, of ancient geography.47 Text mining methods allow the work of information extraction to be automated, but the digital medium can also lend itself to representing the editorially complex character of a work of this kind, which evidently requires deeply analytical work on the part of a variety of specialised figures.

45 Great progress in this sense has been made in the bibliographic field, with the introduction of the CTS (Canonical Text Service). D.N. Smith-C.W. Blackwell, Four URLs, Limitless Apps: Separation of Concerns in the Homer Multitext Architecture, CHS White Papers: https://chs.harvard.edu/CHS/article/display/4846.
46 One might ask whether humanists are really compelled, so to speak, actively to 'convert' to the criteria and methods of the digital world, and raise the obvious objection that such a conversion may entail, in the near future, the loss of information or of approaches conceivable only in the world of print. The first reply to such an objection is that this change of medium is an inevitable revolution: and those who have had philological training, by virtue of a greater awareness of the mechanisms underlying the modes of diffusion and production of knowledge, have sufficient tools to understand the scope of this change. But in the absence of such conviction (cultural changes necessarily generate resistance), it is preferable that it be the humanists who define the paradigms with which to carry out serious research, rather than leaving this to others, be they publishing houses or software producers, whose regard for and competence in everything concerning the humanities, and especially the historical disciplines, are sadly notorious. The change is under way, and it is unstoppable: not even an apocalypse of the World Wide Web, which many predict with a certain complacency, could ever undo the process of production and diffusion of information in digital form, which has nothing to do with the existence of the Internet. Perhaps, then, it would be better for those directly concerned to dictate the modalities by which this process should take place, precisely in order to avoid as far as possible the risk of losing information and methods. Otherwise, it will be the 'information giants' who define how and where this situation will evolve, and at whose expense.
But the process of spatial analysis also entails an enrichment of the notion of critical edition that goes well beyond the concept of 'index': spatial narration as a whole can be seen as a linguistic and cognitive system functional to navigation, the cohesive product of a civilisation, and it thus lends itself to innovative methods of analysis and representation.48

Finally, the principal merit of the Digital Humanities is that of having concretely re-established that interdisciplinarity which has long been lost in these disciplines: the almost exclusive attention to the so-called classical cultures has often entailed the involuntary loss of the general context, that of the ancient and premodern world, in which the civilisations under study interact with others, which exist with equal dignity and are bearers of other ways of seeing the world. In the digital domain, the disciplines of Greek and Roman antiquity have certainly led the way, but numerous others are making headway, enriching the methodologies already in place and helping to lay the premises for a 'global' vision of antiquity. Among the numerous projects active today, I should mention the immense archive of bibliographic, prosopographic and geographic resources developed for Syriac in the Syriaca project,49 the creation of a refined atlas deeply connected to the literary sources for the Islamic world,50 and the growing refinement of the technologies of data mining, named entity recognition and network visualization currently under way for Chinese.51 It is to be hoped, then, that in the future digital research may contribute to recreating an image of the ancient and premodern world that restores its full cultural and historical complexity.

47 The need for a critical dictionary of ancient geography has already been highlighted by D. Marcotte, Les Géographes Grecs, tome I, introduction générale, Pseudo-Scymnos. Circuit de la Terre, texte établi et traduit par D. Marcotte, vol. I, Les Belles Lettres, Paris 2000.
48 M. Thiering-K. Geus (ed. by), Features of Common Sense Geography: Implicit Knowledge Structures in Ancient Geographical Texts, Lit Verlag, Berlin-Münster-Wien-Zürich-London 2014.
49 T.A. Carlson et al., Syriaca.Org: The Syriac Reference Portal. Accessed 05/03/2018: http://syriaca.org/.
50 M. Seydi-M. Romanov, Al-Ṯurayyā Project, 2013-. Accessed 10/02/2018: https://althurayya.github.io/.
51 See for example H. De Weerdt, Information, Territory, and Networks: The Crisis and Maintenance of Empire in Song China, Harvard University Asia Center, Cambridge 2015.

Abstract. This paper summarizes the current situation of Ancient Geography within the larger context of the Digital Humanities.
It proposes an overview of the most important achievements and initiatives in the digital analysis of spatial sources, emphasizing their innovative approaches in research, but also considering issues in the difficult relationship between machine-based collection of data and the traditional means of investigation of the Humanities. In conclusion, it proposes a set of promising methods and strategies to be pursued for the future of the spatial analysis of premodern sources.

Keywords. Ancient Geography, Digital Humanities, Digital Editions, GIS, Network Theory, GeoHumanities, Digital Libraries.

Chiara Palladino
Furman University, Classics Department
chiara.palladino@furman.edu

work_i2eulfvrwjaw7oy7fpt5zfmkwi ---- A landscape of data – working with digital resources within and beyond DARIAH
Even if researchers in these disciplines published their data in European repositories and archives, this data is often hard to find, access, or reuse. Even where there is an increased awareness of the need for and benefit of sharing resources within the disciplines of the arts and humanities, much remains to be done to make it an integral part of everyday research practice.

The sharing of resources is an inherently complex phenomenon that involves many different actors and is influenced by many factors. Challenges at the level of the data itself are well summarised by the FAIR principles, which comprise stable identifiers; rich, broadly disseminated metadata; and widely adopted formats, vocabularies and protocols (Wilkinson et al. 2016). These requirements need to be supported by an appropriate technical infrastructure: (a) stable repositories for depositing and publication of the data; (b) means for broad dissemination of metadata, most notably the Open Archives Initiative's Protocol for Metadata Harvesting (OAI-PMH) in combination with large-scale aggregators; (c) an authentication and authorisation infrastructure (AAI), allowing for fine-grained handling of permissions; and (d) interoperability between tools, i.e., support for established formats and availability of well-defined APIs and import/export functionality to ensure permeability and an easy data flow within the research process. These technical requirements need to be underpinned by policy measures: promotion of standards and permissive intellectual property rights (IPR) for research, complemented by clear licensing. It is also important to establish academic gratification for the creation and publication of research data and software, as well as to appreciate its value as research output and enable a proper academic contribution. The latter point is particularly crucial: while the other aspects could be considered as, primarily, enabling factors, the gratification aspect constitutes a strong incentive for researchers to willingly share their work.

All of these measures need to be accompanied by appropriate training and outreach campaigns, raising awareness and ensuring the transfer of this kind of knowledge. Scholars, students and the interested public alike need to have the opportunity to acquaint themselves with digital methods, technologies, formats and best practices. Ideally, this should take place in intensive, small-scale, hands-on settings which focus on individual aspects, complemented by up-to-date online training material, comprehensive documentation, and opportunities for on-demand personal consultations with experts.

The sharing of resources should not be seen as a mere handover of data, but rather as an integral aspect of working with digital resources, interwoven with all the various stages of the research data lifecycle, from creation and curation to dissemination of digital resources for reuse and knowledge acquisition. It naturally affects and is affected by all stakeholders in the research area. While the decision of individual scholars to share the resources they created is the conditio sine qua non, it is crucial to embed the resource in a fruitful, supportive broader environment that ensures all the above-mentioned enabling factors. The traditional institutional context might be the home organisation of the scholar, but given the global challenge to increase the accessibility of research data, the issue at stake cannot be addressed by individual institutions anymore and requires joint efforts on many levels, involving entities from individual research groups up to European and global institutions.
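To make the role of OAI-PMH concrete, the following minimal sketch harvests Dublin Core metadata from a repository endpoint. The endpoint URL is a placeholder rather than a specific DARIAH service, and the sickle library is used merely as one common Python client for the protocol.

```python
from sickle import Sickle  # pip install sickle

# Any OAI-PMH-compliant repository exposes its metadata at a base URL;
# the URL below is a placeholder for such an endpoint.
harvester = Sickle("https://repository.example.org/oai")

# ListRecords streams all records in the requested metadata format;
# 'oai_dc' (simple Dublin Core) must be supported by every endpoint.
for record in harvester.ListRecords(metadataPrefix="oai_dc"):
    metadata = record.metadata  # dict mapping Dublin Core fields to value lists
    print(metadata.get("identifier"), metadata.get("title"))
```

Aggregators such as CLARIN's Virtual Language Observatory apply exactly this mechanism at scale, which is why exposing a standard-compliant endpoint is such an effective dissemination step for an individual repository.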
The traditional institutional context might be the home organisation of the scholar, but given the global challenge to increase the accessibility of research data, the issue at stake cannot be addressed by individual institutions 114 International Journal of Digital Humanities (2019) 1:113–131 anymore and requires joint efforts on many levels, involving entities from the individ- ual research groups up to European and global institutions. Research infrastructure consortia feature a multi-layered structure, ranging from topic-specific working groups and national consortia to the governing bodies on a European level. They are in an ideal position to tackle these multifaceted challenges. Not only do they represent their respective community, but they are also an integral part of it, possessing a deep understanding of research practices in the field. This article gives an overview of the ongoing developments and reflects on the current discourse within and beyond the DARIAH research infrastructure. It is struc- tured as follows: First, we present the DARIAH initiative in detail, including the reasons for its initiation and its unique position in the European context. We then shift our focus to describe different national chapters of DARIAH and their take on dealing with (born-)digital research data collections in a heterogeneous research environment. By helping to moderate the change of scientific practices in the humanities, we aim to make it easier to integrate digital and technical aspects into research workflows in disciplines that were previously rather ‘untechnical’. Some remarks on our work towards a con- vergence of social and technical aspects of this endeavour will conclude the article. 2 DARIAH – A digital and distributed infrastructure for the arts and humanities A research infrastructure can serve as the basis for offering services and resources for the sharing and management of data and for the management of associated legal and organisational issues. Developing such a sustainable research infrastructure, which integrates existing resources, tools and services to broaden the possibilities of a truly open science, and promotes the acceptance of digitally-enabled approaches is also the raison d’être of the DARIAH initiative. DARIAH is short for Digital Research Infrastructure for the Arts and Humanities. This pan-European organisation aims at enabling and supporting digital research methods and teaching across the arts and humanities (DARIAH 2018). DARIAH- EU, as the umbrella organisation is called, was founded in the framework of the European Strategic Forum for Research Infrastructures (ESFRI) and first appeared on the ESFRI roadmap in 2006 as one of six projects for the humanities and social sciences (European Roadmap for Research Infrastructures 2006: 33). Within the ESFRI, the legal form of European Research Infrastructure Consortium (ERIC) has been developed to enable the funded European research alliances to operate on a stable, long-term basis. After a long preparation phase, the DARIAH-ERIC was established by the European Commission in August 2014. To date, 17 countries–– Austria, Belgium, Croatia, Cyprus, Denmark, France, Germany, Greece, Ireland, Italy, Luxembourg, Malta, Poland, Portugal, The Netherlands, Serbia and Slovenia––have become DARIAH members, and the list of cooperating partners in these and other countries is growing. Six further candidate countries are expected to become members by 2020. 
In practice, DARIAH is a vivid marketplace of ideas and know-how, where people from different countries and disciplines can meet and collaborate, help and learn from each other. It addresses the aforementioned challenges in many different ways. Mainly through its individual partners, DARIAH provides the necessary basic technical infrastructure and specialised tooling to underpin the whole research process; be it virtual research environments (VRE) for co-creation and publication, repositories for long-term preservation and publication of research data, general publication platforms, or generic project-management solutions allowing efficient communication in highly distributed collaboration setups. Around these technical efforts, DARIAH also organises numerous training and outreach events to raise awareness and transfer practical skills in digital methods to the scholarly community.

On the European level, DARIAH uses its unique position and capacity to push forward the necessary policy work that makes the handling and especially the sharing of research resources easier. It propagates the utilisation of standards to address the problem that large parts of the produced research data are neither visible nor reusable (legally or technically). This is why DARIAH engages in the Open Science Policy Platform (OSPP) (Edmond 2018). In the framework of the ongoing project DESIR (DARIAH ERIC Sustainability Refined, see CORDIS 2018), DARIAH has identified six dimensions of sustainability that it seeks to strengthen: dissemination, growth, technology, robustness, trust and education. Up until the projected end of DESIR in December 2019, we will see international workshops and other types of dissemination events to initiate collaborations and further educational work, and the existing services will be enhanced with a focus on entity-based search, scholarly content management, visualisation and text-analytic services. Furthermore, DARIAH collaborates with other SSH infrastructures such as CESSDA (Consortium of European Social Science Data Archives, see CESSDA 2018) and CLARIN (Common Language Resources and Technology Infrastructure, see CLARIN 2018), and with the emerging research software engineering community. The aim is to find a common understanding of how to sustain research software, to address specific challenges of research infrastructures, and to develop a unified technical reference (Kalman et al. 2018). It is a declared task in the DARIAH Strategic Action Plan, released in November 2017, to help develop sustainability models for Digital Humanities (DH) projects and their data collections, especially to ensure the longevity of such projects after the direct funding period has run out (DARIAH 2017).

In the future, DARIAH aims to work towards a more resilient, robust setup of the technical infrastructure, making datasets and services more independent from individual providers through stronger cooperation between partners of the consortium and with e-Infrastructures like EGI (EGI 2018), EOSC (European Commission 2017) or EUDAT (EUDAT 2018), which offer basic generic services. With concentrated expertise both on infrastructural aspects and on actual research in the Digital Humanities, DARIAH can act as a broker and mediate between the needs of individual research projects and the large-scale technical solutions offered by e-Infrastructures.
Several initiatives were started to lay the technical and organisational groundwork for such collaboration between DARIAH and related e-Infrastructures. For instance, the EGI DARIAH Competence Centre (Harmsen et al. 2015) helped with pilot projects like Storing and Accessing DARIAH contents on EGI (Wandl-Vogt et al. 2017) to analyse, distinguish and meet DARIAH requirements within the EGI infrastructure. The EOSC-hub initiative, which consolidates and integrates access mechanisms to e-Infrastructure resources, recently initiated its DARIAH Thematic Service (Dumouchel 2017) to strengthen the collaboration. Through institutions that are active in both CLARIN and DARIAH, there is cooperation with EUDAT, with particular regard to topics related to preservation and access to long-term storage resources.

3 National Flavours of DARIAH

In this section, we give an overview of different approaches and national flavours of DARIAH that are working with and sharing a wide variety of data and services through software and tools as well as accompanying learning and teaching material. We present three different examples of DARIAH member countries that demonstrate how national activities contribute to the overall goals. A crucial characteristic of the DARIAH research infrastructure is its distributed nature as a federated network where most of the services are not offered by a central instance, but through the contributions of individual partners. There are various ways in which DH research communities, their data, and their supporting infrastructures are embedded in the national research landscapes.

3.1 DARIAH in Austria

3.1.1 National consortium CLARIAH-AT

Right from the start, the national group of research infrastructures in the humanities was set up as one joint organisational structure comprising both CLARIN and DARIAH (Ďurčo and Mörth 2014). This approach proved to be very efficient and successful. Interestingly enough, dynamics aiming at a higher degree of interaction and cooperation can also be seen in other countries: in the Netherlands, the two infrastructures run one big national project; in Denmark and France, the coordination of both RIs is placed with the same person or institution; in Germany, talks on greater interaction are ongoing; and in other countries similar tendencies can be discerned.

The Austrian Centre for Digital Humanities at the Austrian Academy of Sciences (ACDH-OeAW 2015) is the coordinating national institution for both research infrastructures. The centre was founded with the intention to foster the change towards digital paradigms in the humanities and pursues a dual agenda of conducting digitally enabled research and providing technical expertise and support to the research communities at the Academy and in the Austrian research landscape. ACDH-OeAW is not the only player in Austria offering services for the digital humanities community. In CLARIAH-AT, the national group of institutions involved in the two European Research Infrastructure Consortia CLARIN and DARIAH, 14 partner institutions work together to provide a common framework to improve the efficiency of dealing with research data. In 2015, numerous partners of the consortium contributed to a national strategy for Digital Humanities in Austria (Alram et al. 2015).
One of the central goals of this strategy, which was fleshed out at the request of the then Ministry for Science, Research and Economy, was the creation of infrastructures to guarantee the long-term preservation of research data. One of the measures proposed in the strategy to achieve this goal was the establishment of a national repository federation to ensure long-term access to hosted research data by exchanging expertise, sharing technologies, and interlinking repository resources. The long-term goal is to reach an agreement between the individual partners of the federation, making sure that partners would step in with their repositories as fall-back options in case one of the participating repositories ceases to exist. Implementation of the measures is part of the agenda of the CLARIAH-AT consortium for the upcoming three-year period.

3.1.2 Data services – One-stop shop for DH projects

In the following, we highlight one specific institution, the ACDH-OeAW, to exemplify how local centres support their respective communities, contributing their share to the common cause. ACDH-OeAW strives to cover the whole research process: project planning, data modelling, data curation and processing, digitisation, application development, service hosting and especially long-term preservation of data. All of this is accompanied by personal consulting and support for individual research endeavours and knowledge transfer, as well as outreach activities promoting the use of digital methods in the various fields of the humanities.

Stable, reliable, long-term preservation of research data being an essential precondition for the sharing of resources, the ACDH-OeAW runs a repository called ARCHE (A Resource Centre for the HumanitiEs) (ARCHE 2017) as one of its core services, offering stable hosting of digital research data, in particular for the Austrian humanities community. ARCHE welcomes data from all researchers at the Austrian Academy of Sciences, but also from other institutions in and outside the country. While its predecessor, CLARIN Centre Vienna / Language Resources Portal, was dedicated to digital language resources, ARCHE is open to a broader range of disciplines. ARCHE is mainly meant to preserve resources related to Austria, which would include resources that were collected or created in Austria, or involve a geographical area or historical period of interest to Austrian scholars. The collection policy details the types of data the repository is ready to accept and store. ARCHE has been awarded CLARIN B centre status and is certified under the CoreTrustSeal (CoreTrustSeal 2018), formerly the Data Seal of Approval.

Secure and robust long-term preservation of data hinges on many factors. Next to the technical level (bitstream preservation), a host of data-related aspects (metadata, established formats) and the institutional setting are to be considered. ARCHE explicitly states which formats it recommends and accepts for depositing. The categories are 'preferred' and 'accepted'. Preferred formats are expected to be stable and usable in the long term. Accepted formats are considered less reliable in the long term and are converted to one of the preferred formats during the ingest process, with both formats being stored. The preservation plan, which is currently being developed, will describe the workflow for format monitoring and migration, so as to ensure that data is preserved if formats become obsolete.
ARCHE pursues the principles of Open Access and Open Data. It encourages data depositors to use open licences like CC-BY and CC-BY-SA, to adhere to the rules of good scientific practice, and to apply the FAIR Data Principles. The repository itself supports the FAIR principles in various ways. Not only does it make the data findable by offering search and browse functionalities, it also makes it available for harvesting through third-party aggregators, such as CLARIN's metadata catalogue, the Virtual Language Observatory (VLO) (Van Uytvanck et al. 2010), by means of publishing metadata via OAI-PMH. It makes the data accessible by assigning persistent identifiers, and interoperable by promoting the use of recommended formats and offering direct access to the data and metadata for both human and machine interaction. And, finally, all of these measures contribute to the reuse of the data.

In addition to ACDH-OeAW, two other participating institutions have been providing stable hosting and publishing solutions for research data: the Centre for Information Modelling – ACDH at the University of Graz, running the repository GAMS (Stigler and Steiner 2014), and the University of Vienna, with the PHAIDRA repository (Budroni and Höckner 2010). All three repositories build on Fedora Commons (Fedora 2018), GAMS being an integrated system which comes with a specialised ingest tool and a Text Encoding Initiative (TEI) based publication framework. The common technical framework is a good basis for establishing a repository federation, where data could be transferred to and hosted by one of the other partners in case one of the services were to shut down.

Although the sustainable preservation of data is an indispensable part of up-to-date research data management, a number of other components are required to cover the whole range of workflow steps in digitally working projects. We refer specifically to tools for the automatic processing of data and also to solutions supporting the manual collaborative creation and curation of born-digital data (commonly referred to as virtual research environments). Confronted with a multitude of projects with at times very individual needs, ACDH-OeAW adopted a pragmatic approach, trying to use what is there and to provide the missing pieces. In practice this means, e.g., that data encountered in projects encoded in MS Word or Excel files are converted to formats better suited to the long term, like TEI or the Simple Knowledge Organisation System (SKOS); a conversion of this kind is sketched below. Yet, in other cases, we develop project-specific web-based applications with custom-tailored data models, which allow the project teams to create and curate data collaboratively. While this may seem inefficient, we increasingly witness consolidation tendencies and economies of scale: as the colleagues supporting the projects gain more experience with generic frameworks, new applications can be developed with considerably less effort, and new functionalities required by new projects can be re-integrated back into the common code-base.

For ACDH-OeAW, knowledge transfer and outreach are central pillars of the DH strategy. The team organises numerous training activities, most notably the two event series ACDH Lectures and ACDH ToolGallery.
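As an illustration of the kind of conversion mentioned above, the following sketch turns a simple tabular vocabulary into SKOS using the rdflib library. The file name, column names and namespace are hypothetical, standing in for a project-specific spreadsheet export.

```python
import csv
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, SKOS

EX = Namespace("https://vocabs.example.org/")  # placeholder namespace

g = Graph()
g.bind("skos", SKOS)

# Hypothetical input: one concept per row, with "id,label_en" columns.
with open("vocabulary.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        concept = URIRef(EX[row["id"]])
        g.add((concept, RDF.type, SKOS.Concept))
        g.add((concept, SKOS.prefLabel, Literal(row["label_en"], lang="en")))

# Turtle is one of the long-term-friendly serialisations of the result.
g.serialize("vocabulary.ttl", format="turtle")
```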
The latter is a one-day format in which various practical tools are presented, combining a theoretical introduction on a given topic with a hands-on session, giving participants a chance to try out a particular tool with the support of a qualified expert. ACDH-OeAW also runs the platform Digital Humanities Austria (DHA 2015), which is the main national dissemination channel for DH in Austria; it is used to announce events and features a comprehensive exhibition of DH projects and a DH bibliography, which serves as an entry point for humanities scholars to delve into DH. An essential part of the community-building efforts is the annual DHA conference, which was organised by ACDH-OeAW in the first three years before starting to move to other Austrian cities: in 2017, the conference was organised by the Research Centre Digital Humanities at the University of Innsbruck. Part of the institute's strong commitment to training and education is also the provision of two specialised services for the DH community: #dariahTeach (DARIAH-TEACH 2017), an e-learning platform for DH teaching material, and the DH Course Registry (DH-registry 2017), an online catalogue providing an overview of DH-related curricula in Europe, collaboratively maintained by CLARIN and DARIAH.

3.2 DARIAH in Germany

3.2.1 National consortium – DARIAH-DE

DARIAH-DE is the German national contribution to DARIAH. It currently consists of a consortium of 19 partners, comprising universities, academies of sciences and independent research institutions, libraries, data centres, a non-governmental organization (NGO) and a commercial partner (DARIAH-DE 2018h). Now in its third project phase, DARIAH-DE receives funding from the German Federal Ministry of Education and Research. The project's current focus is the preparation of the operational phase in 2019, aimed at providing a permanent infrastructure for the arts and humanities in Germany, a process which DARIAH-DE and CLARIN-D are jointly advancing in close collaboration with the ministry, the academies of sciences and disciplinary stakeholders (Forschungsinfrastrukturen für die Geisteswissenschaften 2018).

The heterogeneous nature of the DARIAH-DE consortium enables the research project to address the multi-faceted challenges for research infrastructures. Two pillars of DARIAH-DE are its tight integration with research and with teaching through its partners. Dedicated work packages focus on quantitative data analysis, visualisation and annotation, with both focal points addressed in each. Another work package researches the impact and reach of DH in the humanities community, while a strong collaboration with CLARIAH-AT under the umbrella of #dariahTeach focussed on curricular, educational and training materials on a wide variety of topics.

The third main aspect is the provision and operation of the technological infrastructure: from basic components such as servers, monitoring and user support, through collaboration solutions and development toolchains, to the layer of scholarly services. For these, DARIAH-DE's infrastructure partners, such as data and computing centres and libraries, provide existing and well-established components and services. This includes an authentication and authorisation infrastructure (AAI) that is part of the worldwide authentication network built by higher education and research institutions.
Over the course of the DARIAH-DE project, the focus has been on tight collaboration between the developers embedded in their fields and the service providers operating the services, and sustainability solutions have been developed to secure the long-term operation of this infrastructure.

Finally, the pillar most relevant to the present article is dedicated to the processing and storing of research data, for which several tools and services are offered. Building on the TextGrid project, DARIAH-DE has continued the development of the TextGrid Repository, which focuses on critical digital scholarly editions and is optimised for XML-TEI encoded data, to build the DARIAH-DE Repository (cf. DARIAH-DE 2018g). The operation of the repository is institutionalised through the Humanities Data Centre (HDC), a joint venture of the Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen (GWDG) and Göttingen State and University Library (SUB). Both institutions thus ensure the sustainability of all data stored in the repository. This repository is one component of the Data Federation Architecture (DFA; see Gradl and Henrich 2016 for an overview of the underlying concepts and Fig. 1 for the underlying workflow) offered by DARIAH-DE to manage research data.

3.2.2 Data services – A federation architecture

The DFA consists of the DARIAH-DE Repository, the Collection Registry, the Generic Search and the Data Modelling Environment (DME). All components (services and applications) of the DFA are designed to interact with one another. They can be used all together or as standalone services, depending on the individual needs of the researcher.

The DARIAH-DE Repository (DARIAH-DE 2018f) is a digital long-term archive for research data from the humanities and cultural sciences, enabling researchers to store and publish data in a secure and sustainable manner. At the entry point, the DARIAH-DE Publikator (DARIAH-DE 2018e) offers a user-friendly web interface for data management, description and ingest into the repository. The storage backend is divided into two areas: a restricted private storage area and a public area. All preparation for publication is done in the private storage area via the Publikator and involves three simple steps: first, a collection needs to be created; second, all associated data belonging to the collection has to be uploaded; and, finally, all data has to be described by metadata. The repository uses the Dublin Core Simple metadata standard (cf. Dublin Core Metadata Initiative 2013) for the description of data; only a few fields, such as licence information, are mandatory (see the sketch below). Furthermore, persistent identifiers for stable referencing are provided through the publication process: the collections as well as all associated objects receive individual Digital Object Identifiers (DOIs). A dedicated PID-Service, part of the DFA, assigns unique identifiers and registers them with the DataCite DOI network. Once published, all data is publicly available.

After publication, an optional but highly recommended step is the registration of the collection in the Collection Registry (DARIAH-DE 2018a). The Collection Registry enables researchers to make their published data even more visible and understandable and, therefore, more accessible. A draft entry with the metadata already mentioned is automatically created during the publication process and stored in the Collection Registry for further enrichment.
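As a rough illustration of the description step, the snippet below assembles a Dublin Core Simple record in Python. All field values, including the DOI, are hypothetical, and the wrapper element is simplified; in practice the metadata is entered through the Publikator web interface rather than produced by hand.

```python
# Minimal sketch of a Dublin Core Simple description for a collection.
# All values are hypothetical; "record" is a simplified wrapper element.
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

record = ET.Element("record")
for element, value in [
    ("title", "Example Letter Collection"),       # descriptive fields
    ("creator", "Doe, Jane"),
    ("rights", "CC-BY 4.0"),                      # licence: a mandatory field
    ("identifier", "doi:10.1234/example"),        # hypothetical DOI
]:
    ET.SubElement(record, f"{{{DC}}}{element}").text = value

print(ET.tostring(record, encoding="unicode"))
```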
Fig. 1 DARIAH-DE Data Federation Architecture, Tobias Gradl (updated version from Gradl et al. 2015, used with permission)

For this enrichment, a dedicated metadata model for the enhanced description of collections and associated data is provided: the DARIAH Collection Description Data Model, DCDDM (see DARIAH-DE 2017), based on the Dublin Core Collection Description Application Profile (Dublin Core Metadata Initiative 2007). Once the collection is registered, all data is searchable via the DARIAH Generic Search interface. Due to the modular design of DARIAH's Data Federation Architecture, all kinds of metadata, including metadata that describe data published outside the DARIAH-DE DFA, can be registered and made accessible for the Generic Search. Information on how to access the data can be provided, including specifications of interfaces and APIs. This covers data that originate in digital form, but also non-digital data or collections of objects.

The design of the Generic Search (DARIAH-DE 2018c) is aimed at providing researchers in the Digital Humanities with an individually adjustable search facility for their research needs. The myCollections functionality enables them to compile their own query by preselecting sources from the Collection Registry, and to store and share these selections with research colleagues. This allows researchers to precisely query predefined metadata sets. Custom collections can be added at any time via the Collection Registry interface to enlarge the data set underlying their query. The Generic Search is accessible without registration and allows a combination of different search strategies and dynamic adjustment of the enquiry's granularity, e.g., by adjusting the faceted classification or the number of included collections.

If collections with different metadata schemes need to be integrated into the DFA, the Data Modelling Environment (DME; DARIAH-DE 2018b), as a further component, allows a user-friendly, web-based mapping and association of metadata fields; a schematic example of such a mapping follows at the end of this subsection. The web interface enables researchers to make their knowledge of the semantic description of their collections explicit. This bottom-up approach allows for more flexibility when including additional external sources, without enforcing explicit standards. This is especially important for the arts and humanities disciplines, with their variety of perspectives on collections, terminology and data models.

Besides the Data Federation Architecture, which is designed for research data management purposes across all disciplines within the arts and humanities, DARIAH-DE also offers tools and services that are used in specific project contexts or are related to specific research methods. There are general services for collaborative work and project management, allowing collaboration across locations. Furthermore, tools for annotating, analysing and visualising data are provided. A prominent example is the Geo-Browser (DARIAH-DE 2018d), which supports the analysis of space-time relations in data and collections of source material, facilitating their representation and visualisation as geographic spatial relations at corresponding points in time and in sequences.

Additionally, a virtual research environment (VRE), especially designed for the creation of digital editions based on XML/TEI, offers open source tools and services to collaboratively edit and generate research data. The VRE TextGrid (TextGrid 2018) enables the editing, storing and publishing of data for scholars in the humanities in a protected environment.
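To make the mapping idea behind the DME concrete, the following schematic sketch translates records from a hypothetical project-specific scheme into Dublin Core keys. The DME itself is a web application; this dictionary-based crosswalk, with invented field names, only illustrates the principle.

```python
# Schematic crosswalk (hypothetical field names, not the actual DME):
# map a project-specific metadata scheme onto Dublin Core elements.
PROJECT_TO_DC = {
    "Titel": "title",
    "VerfasserIn": "creator",
    "Entstehungsjahr": "date",
}

def map_record(record: dict) -> dict:
    """Translate a project-specific record into Dublin Core keys,
    keeping unmapped fields so no information is silently dropped."""
    mapped = {}
    for field, value in record.items():
        mapped[PROJECT_TO_DC.get(field, f"x-unmapped-{field}")] = value
    return mapped

print(map_record({"Titel": "Brief an Goethe", "Entstehungsjahr": "1808"}))
```

The bottom-up character of the approach shows here: the mapping is declared per collection, rather than forcing every data provider into one fixed schema in advance.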
DARIAH-DE is not only a digital research infrastructure, but also a social infrastructure. It fosters the exchange of experiences and expertise and offers a variety of communication and training facilities, like user meetings, issue-specific workshops with hands-on sessions, and regular events on the theme of Digital Humanities, spanning a broad range of topics. The information supply of DARIAH-DE is continuously being enhanced and provided through multiple channels and platforms, e.g. through a Digital Humanities blog (DHdBlog), a Twitter account with current news, a YouTube channel (DHd-Kanal) with tutorials, a "Doing Digital Humanities" bibliography, as well as many publications and presentations created during the seven years of project lifetime so far.

DARIAH-DE creates a network of digital humanities services, expertise and communities to support research and cooperation in the humanities and cultural sciences, and promotes open access sharing of digital resources.

3.3 DARIAH in France

3.3.1 National consortium – DARIAH-FR

The CNRS (Centre National de la Recherche Scientifique – National Centre for Scientific Research) is a public organisation under the responsibility of the French Ministry of Education and Research. The CNRS, in connection with universities, has implemented an ecosystem aiming to cover the entire lifecycle of the production of scientific data and publications in the Humanities and Social Sciences. This ecosystem is based on the following infrastructures: Open Editions (2018), CCSD (Centre pour la Communication Scientifique Directe 2018), PERSEE (Portail de diffusion de publications scientifiques) and the TGIR Huma-Num (Très Grande Infrastructure de Recherche Huma-Num 2018).

Huma-Num coordinates the participation in DARIAH and CLARIN of the above-mentioned organisations, as well as of other potential contributors, such as Huma-Num's national consortia (see below). It is also involved in other European and international projects like OPERAS (OPERAS 2018). Huma-Num is an infrastructure that aims to facilitate the digital turn in the Humanities and Social Sciences and is part of the national ESFRI roadmap, which is in turn aligned with the European Union's ESFRI framework. This offers good prospects for recurrent funding.

To perform these missions, Huma-Num's organisation is based on both a human and a technological layer. It funds "groups of people", called consortia, working on common areas of interest (e.g., similar scientific objects), and also provides a technological infrastructure, offering a variety of platforms and tools to process, preserve and disseminate digital research data. The main idea of a consortium is to organise a multidisciplinary collective dialogue within research communities by bringing together different types of actors (researchers, technical staff, etc.) from different institutions, with the aim of creating synergies. In return, a consortium is expected to provide technological (or scientific) good practices and to produce corpora, new standards, and tools.

Furthermore, Huma-Num provides a technological infrastructure on a national scale, based on a large network of partners. Technically, the infrastructure itself is hosted in a big data centre built by and for physicists. A long-term preservation facility from another data centre (CINES – Centre Informatique National de l'Enseignement Supérieur) is also utilised.
In addition, a group of correspondents in the "Maison des Sciences de l'Homme" network (MSH Network 2018) all over France is in charge of relaying information about Huma-Num's services and tools.

3.3.2 Data services throughout the data lifecycle

Huma-Num provides tools and services for each step in the research data lifecycle. It coordinates the production of digital data, while offering a variety of platforms and tools to process, preserve and disseminate the data. It also provides research projects with a range of utilities to facilitate the interoperability of various types of digital raw data and metadata (see Fig. 2).

Fig. 2 DARIAH-FR's Services for Data, Huma-Num

More specifically for digital collections, the aim is to foster the exchange and dissemination of metadata, and of the data itself, via standardised tools and lasting, open formats. These tools, developed by Huma-Num, are all based on semantic web technologies, chosen mainly for their self-descriptive features and for the enrichment opportunities they enable. All our resources are, therefore, fully compatible with Linked Open Data (LOD).

Three services have been designed and developed by Huma-Num to process, store and display research data, while preparing them for re-use and long-term preservation; to put it another way, the aim is to provide a chain of tools to make data FAIR. These complementary services embrace the research data lifecycle and are designed to meet the needs arising from it: they constitute a coherent chain of research data tools. While they interact smoothly with one another, they are also open to external tools using the same technologies.

The scientific objective is to promote data sharing so that other researchers, communities, or disciplines can reuse the data, including from an interdisciplinary perspective and in different ways. A map, for example, may become a scientific object that reflects both the point of view of a geographer and that of a historian. More generally, the principles and methods of the Semantic Web (RDF, SPARQL, SKOS, OWL), on which these services rely, enable data to be documented or re-documented for various uses without confining them to inaccessible silos; a small sketch at the end of this subsection illustrates the idea. Another important point is to make the storage of data independent of the device used to disseminate the data. A further objective is to prevent the loss of data by preparing their long-term preservation. Documenting the use of appropriate formats, which are the basis of data interoperability, greatly facilitates the archiving process.

The workflow implemented by Huma-Num has been built on interoperability, fostering the exchange and dissemination of metadata, and of the data themselves, via standardised tools and lasting, open formats. Huma-Num uses different technologies for cold, warm and hot data. While the technology used for hot data is quite classical, for warm data Huma-Num has established a mesh of distributed storage all over France (currently nine nodes), encapsulating different storage technologies. Thus, backup and versioning can be performed on any node. Furthermore, the data centre where Huma-Num's infrastructure is hosted provides backup on tapes for cold data.
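Returning to the semantic-web principle mentioned above, the sketch below illustrates it in miniature with rdflib: the same map, described once with Dublin Core terms, can be queried from the perspective of different disciplines without locking it into one silo. The resource URI and values are hypothetical, and this is a generic illustration, not a Huma-Num service.

```python
# Minimal sketch (hypothetical URIs and values): one resource, several uses.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
map_uri = URIRef("https://example.org/data/map-1808")
g.add((map_uri, DCTERMS.title, Literal("Carte de l'Empire francais")))
g.add((map_uri, DCTERMS.subject, Literal("geography")))
g.add((map_uri, DCTERMS.subject, Literal("history")))

# A SPARQL query re-documents the data for a new use without changing it:
# here, the historian's view of the same record.
results = g.query("""
    SELECT ?title WHERE {
        ?map <http://purl.org/dc/terms/subject> "history" ;
             <http://purl.org/dc/terms/title> ?title .
    }
""")
for row in results:
    print(row.title)
```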
Huma-Num already provides a long-term preservation service based on the facility of CINES (Centre Informatique National de l'Enseignement Supérieur 2018), a national computing centre for higher education which is responsible for the permanent archiving of scientific data in France. This is much more than the bit preservation achieved with the above-mentioned technologies. A long-term preservation project means that one needs to organise the data with a view to reuse by someone who did not participate in its creation, which presupposes a lot of curation. In addition, the data should be expressed in a format accepted by the partner, and additional information has to be provided to document the context of data production, metadata, etc. Huma-Num accompanies these projects by acting as a go-between linking data producers, CINES, archivists and other actors.

After this detailed description of three national landscapes, we now shift our focus to the ongoing efforts towards convergence on the European level, in light of the heterogeneity of research data collections, formats, tools and services.

4 Convergence of tools, methods and collections

It was always the vision of DARIAH to enable the DH research community to reuse and build on existing solutions developed in and by the community. This includes both the social and the technical aspects of the convergence from individual solutions to a distributed infrastructure. The social aspect builds on the idea of an Open Marketplace, which enables the sharing and reviewing of existing services and solutions. On the technical side, DARIAH has identified the need to address the sustainability of the software that provides some of the core parts of any digital infrastructure. In the following section, we describe how both are being addressed.

4.1 The open marketplace

The idea of developing DARIAH 'as a social marketplace for services' (Blanke et al. 2011) dates back to the preparatory phase of the DARIAH initiative. The long-term goal is to provide an Open Marketplace platform, planned as an easy-entry place where scholars can find solutions for the digital aspects of their daily research work, such as software, tools, (born-)digital data sets, repositories, services, and learning and teaching material. The Marketplace targets all researchers from the broader SSH, not just those scholars who would regard themselves as digital humanists.

Various approaches have been taken in the past to provide collections and registries with similar goals. The most important difference between such approaches and the DARIAH Marketplace is that the latter will contextualise the tools and services offered with user feedback, user stories, links to training material, showcases, contact addresses and ratings. It is going to be actively curated and sustained by the DARIAH community. The idea is not that these solutions would be produced by DARIAH itself, but that the Marketplace creates visibility for them to help researchers do their work (DARIAH 2017) (Fig. 3).

There have been previous attempts at providing an active, community-backed registry of digital tools and services. While most of them did not live up to their expectations (for a prominent example cf. Dombrowski 2014), one can still learn from them and reuse their highly curated data. Such an attempt was undertaken within the framework of the H2020 project "Humanities at Scale", coordinated by the DARIAH-ERIC.
Building on TERESAH, the "Tools E-Registry for E-Social science, Arts and Humanities" originally developed within the FP7 project "Digital Services Infrastructure for Social Sciences and Humanities" (DASISH) until 2014, a demonstrator for a central registry with distributed data sources was created (Engelhardt et al. 2017). While the DARIAH Marketplace is still taking shape, it is the declared goal not just to add another list-based overview of digital tools, but to assemble and highlight DH knowledge. The platform will create a place addressing and involving the entire research community and also, eventually, the public and industry (bearing in mind EOSC and EU access policy guidelines for research infrastructures).

4.2 Sustainability of tools and software

The social aspect of the Marketplace is built on the idea of sharing and reviewing existing services and solutions. In the case of software, which provides some of the core technical parts of any digital infrastructure, DARIAH has identified the need to address its sustainability problems (cf. Thiel 2017).

At present, the construction of sustainable infrastructures is carried out through grant-based research projects, which brings a number of problems. Software built to address specific research questions is often developed in an ad-hoc manner. This is not helped by the fact that software is not yet generally accepted as creditable research output in and of itself. Without a recognition of the value of software as a form of research, the individual researcher's willingness to invest additional time into improving the software in ways that do not directly affect the output will be minimal.

Fig. 3 Illustrative sketch of DARIAH Open Marketplace

The requirement to provide data management plans as part of H2020 grants, which is also implemented by national and other funders, means that source code is identified as a digital resource in need of preservation. To address this, the UK's Software Sustainability Institute developed a solution to create a Software Management Plan through DMPonline (Software Sustainability Institute 2018), and GitHub and Zenodo have joined forces to offer a simple way to publish GitHub releases in Zenodo, making software releases citable through DOIs (GitHub 2016). Archiving code is the first step in ensuring the availability for future re-use and the reproducibility of research output generated with that software. The second step is making sure that the code can be processed and executed when needed, which goes beyond classical practices of data curation (cf. Katz et al. 2016 for a discussion of the topic).

In our context, two problems are most relevant. For reproducibility of results, access to the entire exact build environment is required; it must, therefore, be referenced in the archived software in a machine-readable format (a minimal sketch of such a record follows below). For re-use of the software, adaptability to the constantly changing reality of information technology, such as changes to external libraries and dependencies, becomes relevant. As technology progresses, so do research questions, and new applications not envisioned during the original development can emerge (cf. Harms and Grabowski 2011). For a future researcher to be able to actually adapt a given software product, sufficient documentation and code legibility must exist.
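The sketch below shows one generic way, assuming a Python project and not any specific DARIAH tool, to record the exact build environment in machine-readable form alongside archived code; the output file name is an arbitrary choice.

```python
# Minimal sketch (generic, not a DARIAH tool): snapshot the interpreter,
# operating system and pinned dependencies so a build can be reproduced.
import json
import platform
import subprocess
import sys

def snapshot_environment(path: str = "environment.json") -> None:
    """Dump interpreter version, platform and frozen dependencies to JSON."""
    frozen = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    meta = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "dependencies": frozen,  # exact versions, e.g. "lxml==4.9.3"
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(meta, f, indent=2)

if __name__ == "__main__":
    snapshot_environment()
```

Committing such a record next to a tagged release (which the GitHub/Zenodo integration then makes citable via a DOI) gives a future user both a reference point for rebuilding the environment and a citation target.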
While research thrives on innovative solutions with fast-paced development progress, the requirements for long-term software maintainability run directly contrary to this (see Hettrick 2016, Chapter 3, for a more detailed discussion). This is a particular problem for infrastructures striving to sustain software developed within projects as services. To be able to do so, the infrastructure providers must make a judgement on the expected and unexpected costs that long-term software maintenance will incur. This can only be done if the software is of sufficiently good quality. To address this, infrastructures are developing guidelines and best practices for developers. At the same time, existing quality measures, such as ISO standards, can serve as one frame of reference (see e.g. Buddenbohm et al. 2017), while Doorn et al. (2016) suggest establishing an independent certification, modelled on the Data Seal of Approval, now CoreTrustSeal (CoreTrustSeal 2018).

For an infrastructure to provide a valuable service to the scholarly community, the reliability and trustworthiness of the services offered are a fundamental prerequisite. By improving the quality of the software and making this transparent to the end user of the technology through the Open Marketplace platform, DARIAH strives to address both. In particular, work on a general Technical Reference (Moranville et al. 2018) was started through DESIR as a baseline for new development, and the Marketplace will improve the findability and discoverability of research software. The combination of the two supports and builds upon known recommendations for research software (Jiménez et al. 2017).

5 Conclusion

We have summarised ongoing developments and reflected on current discussions within the research infrastructure DARIAH and within some of DARIAH's member states, which are creating and integrating solutions for the challenges of heterogeneous research data, tools and services in the arts and humanities. We highlighted that the focus of DARIAH is not simply the digitized analogue material of galleries, libraries, archives, and museums. As (digital) research produces born-digital materials (e.g. datasets, tools, software) which have to be managed, DARIAH's collection of data is much broader. The challenges, issues and factors of the heterogeneity of (born-)digital research data that DARIAH aims to address only become apparent in large international infrastructures willing to integrate heterogeneous research practices, data formats, tools and services from the wide range of DH disciplines. This article provided insights into this process, both on the European and the national level, and reflected on discussions and solutions in the broader DARIAH network. These discussions include the many factors and challenges that influence the sharing of resources in the arts and humanities.

The DARIAH research infrastructure seeks to support the scholarly community in numerous ways to enable and foster the work with, and sharing of, digital resources. This includes the need to look at activities on the European and national levels, as exemplified by the three examples from member countries, which also showcase the variety in the setups of the national consortia. In order to support communities in reusing distributed existing resources in a coherent manner, a coordinated multi-faceted strategy is paramount.
It has to involve technological provisions for robust services as well as sustainable software plans, and work at the policy level promoting the use of standards and permissive licensing, all accompanied by training and outreach activities to raise awareness and convey practical skills in digital methods. DARIAH also acknowledges its position in the general landscape of existing initiatives, infrastructures and projects, and strives to promote exchange and leverage synergies with them. In addition to the collaborations with initiatives of the SSH communities like CESSDA, CLARIN, EUROPEANA and OpenAIRE, the cooperation with e-Infrastructures like EGI, EOSC or EUDAT is being intensified and expanded.

A central goal of this pan-European endeavour is to enable, promote, and simplify the discovery of and access to the wealth of available (born-)digital resources in line with the FAIR principles. In order to achieve this, DARIAH has started developing a curated, community-driven discovery platform, the DARIAH Open Marketplace. Once released, it will serve researchers and broader audiences in finding data sets, tools and services that are applicable and reusable in their daily research. The key to success is to involve the communities, and in this regard the Marketplace has a pivotal role for the future.

References

ACDH-OeAW (2015). Austrian Centre for Digital Humanities at the Austrian Academy of Sciences. Retrieved from https://www.oeaw.ac.at/acdh/. Accessed 26 Feb 2018.
Alram, M., Benda, Ch., Ďurčo, M., Mörth, K., Wentker, S., Wissik, T., Budin, G., et al. (2015). DH-AUSTRIA-STRATEGIE. Sieben Leitlinien für die Zukunft der digitalen Geisteswissenschaften in Österreich. Wien. https://doi.org/10.1553/DH-AUSTRIA-STRATEGIE-2015.
ARCHE (2017). A Resource Centre for the HumanitiEs. Retrieved from https://arche.acdh.oeaw.ac.at/. Accessed 26 Feb 2018.
Blanke, T., Bryant, M., Hedges, M., Aschenbrenner, A., & Priddy, M. (2011). Preparing DARIAH. In IEEE 7th International Conference on E-Science (pp. 158–165). IEEE Digital Library: Stockholm. https://doi.org/10.1109/eScience.2011.30.
Buddenbohm, S., Matoni, M., Schmunk, S., & Thiel, C. (2017). Quality assessment for the sustainable provision of software components and digital research infrastructures for the arts and humanities. Bibliothek Forschung und Praxis, 41(2), 231–241. https://doi.org/10.1515/bfp-2017-0024.
Budroni, P., & Höckner, M. (2010). Phaidra, a repository project of the University of Vienna. In iPRES 2010, 7th International Conference on Preservation of Digital Objects, Vienna.
CCSD (Centre pour la Communication Scientifique Directe) (2018). A center which offers a set of services for the management of open archives. Retrieved from https://www.ccsd.cnrs.fr. Accessed 26 Feb 2018.
CESSDA (2018). About CESSDA. Retrieved from https://www.cessda.eu/About. Accessed 26 Feb 2018.
CINES (Centre Informatique National de l'Enseignement Supérieur) (2018). Digital archiving solutions for long term preservation. Retrieved from https://www.cines.fr/en/long-term-preservation.
CLARIN (2018). CLARIN in a Nutshell. Retrieved from https://www.clarin.eu/content/clarin-in-a-nutshell. Accessed 26 Feb 2018.
CORDIS (2018). DARIAH ERIC Sustainability Refined. Retrieved from https://cordis.europa.eu/project/rcn/207190_en.html. Accessed 26 Feb 2018.
CoreTrustSeal (2018). CoreTrustSeal Data Repository Certification. Retrieved from https://www.coretrustseal.org/. Accessed 26 Feb 2018.
DARIAH (2017). 2020: 25 Key Actions for a Stronger DARIAH by 2020. Retrieved from https://www.dariah.eu/wp-content/uploads/2017/02/DARIAH_STRAPL_v06112017.pdf. Accessed 26 Feb 2018.
DARIAH (2018). DARIAH in a Nutshell. Retrieved from https://www.dariah.eu/about/dariah-in-nutshell/. Accessed 26 Feb 2018.
DARIAH-DE (2017). DARIAH Collection Description Data Model DCDDM. Retrieved from https://github.com/DARIAH-DE/DCDDM. Accessed 26 Feb 2018.
DARIAH-DE (2018a). DARIAH-DE Collection Registry. Retrieved from https://colreg.de.dariah.eu. Accessed 26 Feb 2018.
DARIAH-DE (2018b). DARIAH-DE: Data Modelling Environment. Retrieved from https://dme.de.dariah.eu/dme. Accessed 26 Feb 2018.
DARIAH-DE (2018c). DARIAH-DE Generic Search. Retrieved from https://search.de.dariah.eu/search/. Accessed 26 Feb 2018.
DARIAH-DE (2018d). DARIAH-DE Geo-Browser. Retrieved from https://geobrowser.de.dariah.eu/. Accessed 26 Feb 2018.
DARIAH-DE (2018e). DARIAH-DE Publikator. Retrieved from https://repository.de.dariah.eu/publikator. Accessed 26 Feb 2018.
DARIAH-DE (2018f). DARIAH-DE Repository. Retrieved from https://de.dariah.eu/repository. Accessed 26 Feb 2018.
DARIAH-DE (2018g). Data Federation Architecture Technical Documentation. Retrieved from https://repository.de.dariah.eu/doc/services/. Accessed 26 Feb 2018.
DARIAH-DE (2018h). Der DARIAH-DE Forschungsverbund. Retrieved from https://de.dariah.eu/der-forschungsverbund. Accessed 26 Feb 2018.
DARIAH-TEACH (2017). dariahTeach. Retrieved from https://teach.dariah.eu/. Accessed 26 Feb 2018.
DHA (2015). Digital Humanities Austria. Retrieved from http://digital-humanities.at/. Accessed 26 Feb 2018.
DH-registry (2017). DH Course Registry. Retrieved from https://registries.clarin-dariah.eu/courses/. Accessed 26 Feb 2018.
Dombrowski, Q. (2014). What ever happened to Project Bamboo? Literary and Linguistic Computing, 29(3), 326–339. https://doi.org/10.1093/llc/fqu026.
Doorn, P., Aerts, P., & Lusher, S. (2016). Research software at the heart of discovery. DANS & NLeSC. Retrieved from https://www.esciencecenter.nl/pdf/Software_Sustainability_DANS_NLeSC_2016.pdf. Accessed 26 Feb 2018.
Dublin Core Metadata Initiative (2007). Dublin Core Collection Description Application Profile. Retrieved from http://dublincore.org/groups/collections/collection-application-profile/. Accessed 26 Feb 2018.
Dublin Core Metadata Initiative (2013). Dublin Core metadata element set, version 1.1: Reference description. Retrieved from http://www.dublincore.org/documents/dces/. Accessed 26 Feb 2018.
Dumouchel, S. (2017). How the notion of access guides the organization of a European research infrastructure: the example of DARIAH. Retrieved from https://dh2017.adho.org/abstracts/088/088.pdf. Accessed 17 May 2018.
Ďurčo, M., & Mörth, K. (2014). CLARIN-DARIAH.AT – Weaving the network. In 9th Language Technologies Conference. Information Society – IS 2014, Ljubljana, Slovenia, pp. 14–18.
Edmond, J. (2018). Untangling Barriers: Director Jennifer Edmond on DARIAH's Commitment to Open Science. Retrieved from https://www.dariah.eu/?p=1997. Accessed 26 Feb 2018.
EGI (2018). EGI: advanced computing for research. Retrieved from https://www.egi.eu/about/. Accessed 26 Feb 2018.
Engelhardt, C., Leone, C., & Moranville, Y. (2017). Distributed Metadata Schema and Demonstrator for Open Humanities Methods. Research report, Göttingen State and University Library; DARIAH. Available at https://hal.archives-ouvertes.fr/hal-01637051v1.
EUDAT (2018). What is EUDAT? Retrieved from https://www.eudat.eu/what-eudat. Accessed 26 Feb 2018.
European Commission (2017). EOSC Declaration. Retrieved from https://ec.europa.eu/research/openscience/pdf/eosc_declaration.pdf. Accessed 26 Feb 2018.
European Commission (2018). Commission Recommendation of 25.4.2018 on access to and preservation of scientific information. Retrieved from http://ec.europa.eu/newsroom/dae/document.cfm?doc_id=51636. Accessed 26 Feb 2018.
European Roadmap for Research Infrastructures (2006). Report 2006. Luxembourg: Office for Official Publications of the European Communities. Retrieved from https://ec.europa.eu/research/infrastructures/pdf/esfri/esfri_roadmap/roadmap_2006/esfri_roadmap_2006_en.pdf. Accessed 26 Feb 2018.
Fedora (2018). Fedora Repository. Retrieved from http://fedorarepository.org/. Accessed 26 Feb 2018.
Forschungsinfrastrukturen für die Geisteswissenschaften (2018). Wissenschaftsgeleitete Forschungsinfrastrukturen für die Geistes- und Kulturwissenschaften in Deutschland. Retrieved from https://www.forschungsinfrastrukturen.de/. Accessed 26 Feb 2018.
GitHub (2016). Making Your Code Citable. Retrieved from https://guides.github.com/activities/citable-code/. Accessed 26 Feb 2018.
Gradl, T., & Henrich, A. (2016). Die DARIAH-DE-Föderationsarchitektur – Datenintegration im Spannungsfeld forschungsspezifischer und domänenübergreifender Anforderungen. Bibliothek Forschung und Praxis, 40(2), 222–228. https://doi.org/10.1515/bfp-2016-0027.
Gradl, T., Henrich, A., & Plutte, C. (2015). Heterogene Daten in den Digital Humanities: Eine Architektur zur forschungsorientierten Föderation von Kollektionen. In Baum, C., & Stäcker, T. (Eds.), Grenzen und Möglichkeiten der Digital Humanities. Zeitschrift für digitale Geisteswissenschaften, 1. https://doi.org/10.17175/sb001_020.
Harms, P., & Grabowski, J. (2011). Usability of generic software in e-research infrastructures. Journal of the Chicago Colloquium on Digital Humanities and Computer Science, 1(3), 1–18. http://resolver.sub.uni-goettingen.de/purl?gs-1/9238.
Harmsen, H., Kalman, T., & Wandl-Vogt, E. (2015). DARIAH meets EGI. Inspired newsletter, Issue 19. Retrieved from https://www.egi.eu/news-and-media/newsletters/Inspired_Issue_19/dariah.html. Accessed 26 Feb 2018.
Hettrick, S. (2016). Research Software Sustainability: Report on a Knowledge Exchange Workshop. Retrieved from https://www.esciencecenter.nl/pdf/Research_Software_Sustainability_Report_on_KE_Workshop_Feb_2016_FINAL.PDF. Accessed 26 Feb 2018.
Jiménez, R. C., Kuzak, M., Alhamdoosh, M., et al. (2017). Four simple recommendations to encourage best practices in research software [version 1]. F1000Research, 6:876. https://doi.org/10.12688/f1000research.11407.1.
Kalman, T., Thiel, C., Van Uytvanck, D., & Moranville, Y. (2018). Sustainable Research Software – Managing a Common Problem of SSH Infrastructures. Digital Infrastructures for Research 2018, Lisbon, Portugal. Retrieved from https://indico.egi.eu/indico/event/3973/session/22/contribution/111.
Katz, D. S., Niemeyer, K. E., Smith, A. M., Anderson, W. L., Boettiger, C., Hinsen, K., & Hooft, R. (2016). Software vs. data in the context of citation. PeerJ Preprints, 4. https://doi.org/10.7287/peerj.preprints.2630v1.
Moranville, Y., Rodzis, M., & Thiel, C. (2018). DARIAH Technical Reference. Retrieved from